API Fundamentals - Authentication, Endpoints, Keys
Understand how to authenticate to Azure AI Services, work with endpoints, manage API keys securely, and design applications that consume these REST APIs reliably.
🧒 Simple Explanation (ELI5)
When you call an Azure AI Service, you are making an HTTP request to a specific web address (endpoint) and proving you have permission by including a secret key (like a password). The service checks your key, processes your data, and returns a response. Different service regions have different endpoints, but they work the same way. You always need three things: the endpoint URL, your API key, and the data you want to analyze.
🔧 Why do we need it?
- Endpoints and keys are how Azure enforces who can access the service and prevents random people from using your resources.
- Endpoints are region-specific for performance (lower latency when you call a service in your region).
- Different Azure AI Services can have different authentication schemes and response formats.
- Developers need standard patterns to securely store, rotate, and use keys without hardcoding them.
🌍 Real-world Analogy
An endpoint is your local bank branch address. An API key is your debit card. Every transaction (API call) requires both: you go to the right branch and show your card. If you lose your card, the bank cancels it and gives you a new one. Never leave your card lying around or write it on a piece of paper.
⚙️ How it works (Technical)
When you create an Azure AI Service resource, Azure generates an endpoint URL (e.g., https://myservice.cognitiveservices.azure.com/) and two API keys (primary and secondary, so that one can be regenerated while the other stays in use). Each HTTP request carries a credential in a header: the key in Ocp-Apim-Subscription-Key, or a bearer token in Authorization when token-based authentication is used. The service validates the credential, checks the request quota, and processes the data. Responses are typically JSON. API versioning allows services to evolve without breaking older applications.
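The two header styles described above can be sketched as a small helper. This is a minimal illustration, not an official client: the function name build_auth_headers is hypothetical, and it assumes the Authorization header carries a bearer token, which is the usual form for token-based access.

```python
from typing import Optional


def build_auth_headers(api_key: Optional[str] = None,
                       bearer_token: Optional[str] = None) -> dict:
    """Return request headers for exactly one of the two auth styles."""
    # Reject "neither" and "both": a request should carry a single credential.
    if bool(api_key) == bool(bearer_token):
        raise ValueError("Provide exactly one of api_key or bearer_token")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Key-based access: the key travels in a dedicated header.
        headers["Ocp-Apim-Subscription-Key"] = api_key.strip()
    else:
        # Token-based access (assumed bearer scheme).
        headers["Authorization"] = f"Bearer {bearer_token.strip()}"
    return headers
```

Stripping whitespace guards against a key pasted with a trailing newline, which otherwise tends to show up as a confusing 401.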
⌨️ Commands / Syntax
import os
import requests

# Hypothetical variable name; set AZURE_AI_KEY in your environment before running.
api_key = os.environ["AZURE_AI_KEY"]
endpoint = "https://myservice.cognitiveservices.azure.com/vision/v3.2/analyze"
headers = {
    "Ocp-Apim-Subscription-Key": api_key,
    "Content-Type": "application/json"
}
payload = {
    "url": "https://example.com/image.jpg"
}
response = requests.post(
    endpoint + "?visualFeatures=Description,Tags,Objects",
    headers=headers,
    json=payload,
    timeout=20
)
if response.status_code == 200:
    body = response.json()
    # Fall back to an empty caption when the service returns none.
    captions = body.get("description", {}).get("captions") or [{}]
    print("caption:", captions[0].get("text"))
elif response.status_code in (401, 403):
    print("Auth failure: check key, endpoint, RBAC/network rules")
elif response.status_code == 429:
    print("Rate limited: apply retry with exponential backoff + jitter")
else:
    print("Unhandled:", response.status_code, response.text)
💼 Example (Real-world Use Case)
A content moderation platform needs to analyze thousands of user-uploaded images hourly. The team stores the Vision Service endpoint and key in Azure Key Vault, which their backend application retrieves securely at startup. Every image is sent to the analyze endpoint; the service returns content flags. If rate limits are hit, the application queues requests and retries with exponential backoff.
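The retry behaviour in this example can be sketched as follows. It is a minimal sketch under stated assumptions: send is a hypothetical zero-argument callable that returns an object with status_code and headers (such as a requests.Response), the "full jitter" variant of exponential backoff is one common choice rather than the only one, and treating 5xx responses as retryable alongside 429 is an assumption.

```python
import random
import time

RETRYABLE = (429, 500, 502, 503, 504)  # assumption: 5xx treated as transient


def backoff_delays(max_retries=5, base=1.0, cap=30.0, rng=random.random):
    """Yield 'full jitter' delays: uniform in [0, min(cap, base * 2**attempt))."""
    for attempt in range(max_retries):
        yield rng() * min(cap, base * (2 ** attempt))


def call_with_retry(send, max_retries=5, sleep=time.sleep, rng=random.random):
    """Call send() until it returns a non-retryable status or retries run out."""
    response = send()
    for delay in backoff_delays(max_retries, rng=rng):
        if response.status_code not in RETRYABLE:
            return response
        # Prefer the server's hint when it sends a numeric Retry-After header.
        retry_after = response.headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = max(delay, float(retry_after))
        sleep(delay)
        response = send()
    # Retries exhausted: hand back the last response so the caller can queue the work.
    return response
```

Jitter spreads retries from many workers over time, so a burst of 429 responses does not turn into a synchronized retry storm.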
🧪 Hands-on
- Create an Azure AI Service resource and note its endpoint URL and both API keys.
- Store the endpoint and key in environment variables (never hardcode).
- Use PowerShell or curl to make a test HTTP request to the endpoint, including your key in the header.
- Parse the JSON response to understand the service's data format.
- Practice rotating API keys without downtime (switch applications to the secondary key, regenerate the primary, switch applications to the new primary, then regenerate the secondary).
Always use Azure Key Vault to store API keys. Never commit them to source control. Use managed identities when possible instead of keys.
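For the environment-variable step, a loader that fails early on missing settings might look like the sketch below. The variable names AZURE_AI_ENDPOINT and AZURE_AI_KEY and both helper functions are hypothetical; use whatever names your deployment defines.

```python
import os


def load_settings(env=os.environ):
    """Read endpoint and key from the environment, failing early if missing."""
    names = ("AZURE_AI_ENDPOINT", "AZURE_AI_KEY")  # hypothetical variable names
    missing = [n for n in names if not env.get(n, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    endpoint = env["AZURE_AI_ENDPOINT"].strip().rstrip("/")
    if not endpoint.startswith("https://"):
        raise RuntimeError("Endpoint must use https://")
    return {"endpoint": endpoint, "key": env["AZURE_AI_KEY"].strip()}


def mask(secret, visible=4):
    """Return a log-safe form of a secret, showing only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Logging mask(settings["key"]) instead of the key itself lets you confirm which key an instance loaded without leaking it.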
Try It Yourself
- Create an Azure AI Service and retrieve its keys from the Azure Portal.
- Regenerate one key and verify the old key no longer works.
- Use tools like Postman to make an API request and inspect the response headers for rate-limit information.
- Compare endpoint URLs across different regions to understand the naming pattern.
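Before regenerating real keys, the rotation order can be rehearsed against an in-memory stand-in. Everything here is a toy model: FakeResource and rotate_without_downtime are hypothetical names and make no calls to Azure. The sketch only shows why the application must move to the other key before a key is regenerated.

```python
import secrets


class FakeResource:
    """In-memory stand-in for a resource with two independently regenerable keys."""

    def __init__(self):
        self.keys = {"primary": secrets.token_hex(16),
                     "secondary": secrets.token_hex(16)}

    def regenerate(self, name):
        # Regenerating replaces the key, so the old value stops working.
        self.keys[name] = secrets.token_hex(16)
        return self.keys[name]

    def accepts(self, key):
        return key in self.keys.values()


def rotate_without_downtime(resource, app):
    """Rotate both keys; return whether the app's key was valid after each step."""
    checks = []
    app["key"] = resource.keys["secondary"]       # 1. move callers to the secondary key
    checks.append(resource.accepts(app["key"]))
    new_primary = resource.regenerate("primary")  # 2. old primary stops working here
    checks.append(resource.accepts(app["key"]))   #    callers are on secondary, still valid
    app["key"] = new_primary                      # 3. move callers to the new primary
    resource.regenerate("secondary")              # 4. retire the old secondary
    checks.append(resource.accepts(app["key"]))
    return checks
```

If step 2 ran before step 1, the application would still be holding the old primary key and every call would fail until its configuration caught up.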
🧠 Debugging Scenario
Failure: Your application receives "401 Unauthorized" when calling the Vision API even though the endpoint and key look correct.
- Verify the API key is for the correct service (Vision key for Vision API, not Language key).
- Check that the key has not been rotated recently without updating your application configuration.
- Ensure the key is being sent in the correct header (Ocp-Apim-Subscription-Key for most services).
- Confirm the endpoint URL is correct for the region where you created the resource.
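- Keys do not expire with age, but a key stops working once it has been regenerated or if key-based (local) authentication has been disabled on the resource; confirm the current key in the portal and regenerate it if in doubt.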
- If the request is signed correctly but still fails, inspect network restrictions (private endpoint/firewall) and check for clock skew in signed-token flows.
- For 429 bursts, implement idempotency keys, queue non-urgent work, and use adaptive client throttling to avoid retry storms.
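Several of the checks above can be run locally before a request is even sent. The sketch below is a hypothetical helper (preflight_issues is not part of any SDK) that covers only what can be verified client-side; it cannot tell whether a key belongs to the right service or region.

```python
from urllib.parse import urlparse


def preflight_issues(endpoint, headers):
    """Return a list of likely causes of a 401 that can be checked before sending."""
    issues = []
    parsed = urlparse(endpoint)
    if parsed.scheme != "https" or not parsed.netloc:
        issues.append("endpoint is not a full https:// URL")
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    key = lowered.get("ocp-apim-subscription-key")
    token = lowered.get("authorization")
    if key is None and token is None:
        issues.append("no Ocp-Apim-Subscription-Key or Authorization header")
    if key is not None:
        if not key:
            issues.append("key header is empty")
        elif key != key.strip():
            issues.append("key has leading or trailing whitespace")
    return issues
```

An empty list means the request is well-formed on the client side, so the next suspects are a rotated key, a key from a different resource, or network rules.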
🎯 Interview Questions
Beginner
Endpoint URL, API key, and request payload (the data to analyze).
It allows key rotation without downtime. Applications keep using one key while you regenerate the other; you then switch them to the new key and regenerate the first.
No, never. Store them in environment variables, Azure Key Vault, or pass them through CI/CD secrets.
An endpoint URL is the base address of an Azure AI Service, e.g., https://myservice.cognitiveservices.azure.com/. Different regions have different endpoints for performance.
In headers, usually Ocp-Apim-Subscription-Key for Azure Cognitive Services or Authorization for some services.
Intermediate
Store keys in Azure Key Vault with RBAC access controls. Applications retrieve keys at startup or on demand. Rotate keys on a fixed schedule (for example, quarterly): generate a new key, update applications, then regenerate the old key to invalidate it.
The request fails with 401 Unauthorized because the key does not match the service type.
Catch 401/403 errors separately from transient errors. For 401, log the error, alert ops, and fail fast (key rotation issue). For transient errors, retry with exponential backoff.
Different API versions are exposed under different URL paths (e.g., /v3.1/ vs /v3.2/). Pin the version in your URL so that your application keeps working unchanged when new versions are released.
Update Key Vault with the new key while applications still hold the old key in their cache. Applications poll for updated keys or use a short cache TTL. Once all instances are using the new key, retire the old one.
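The split between authentication failures and transient failures described in these answers can be written down as a small policy table. This is a sketch: the Action names are hypothetical, and treating 408 and 5xx as transient and every other 4xx as a caller error is an assumption that should be tuned per service.

```python
from enum import Enum


class Action(Enum):
    OK = "ok"
    FAIL_FAST = "fail_fast"  # alert operators; retrying will not help
    RETRY = "retry"          # transient; retry with exponential backoff
    REJECT = "reject"        # caller error; fix the request instead of retrying


def classify(status_code):
    """Map an HTTP status code to the handling policy for that class of error."""
    if 200 <= status_code < 300:
        return Action.OK
    if status_code in (401, 403):
        return Action.FAIL_FAST
    if status_code in (408, 429) or 500 <= status_code < 600:
        return Action.RETRY
    return Action.REJECT
```

Keeping the policy in one function makes it easy to log and alert on FAIL_FAST separately from the retry path.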
Scenario-based
Never give them the raw key. Instead: (1) Assign RBAC role for Key Vault access, (2) set up their local environment to authenticate via Azure CLI, (3) show them how to retrieve the key from Key Vault at runtime in code. Keys are used by the application, not individual developers.
Immediately regenerate the key, which invalidates the leaked value. Review the resource's usage metrics and logs for unexpected calls made while the key was exposed. Rewrite git history and force-push to remove the key from the repository, keeping in mind that existing clones and forks may still contain it. Set up pre-commit hooks to detect secrets.
Check if the service has a quota/throttling limit (403 can mean quota exceeded). Look at Application Insights or logs for which endpoints are failing. Verify if keys are being rotated by multiple deployment instances simultaneously (race condition).
Centralize all keys in Key Vault. Each microservice retrieves its key from Key Vault at startup with a short TTL. Rotation happens in Key Vault; services pull the updated key on next refresh. No downtime needed.
Check Application Insights for calling patterns: are requests coming from expected IPs/services? Look for 401s (failed auth attempts). Review recent deployments. If malicious, regenerate API keys immediately and check firewall rules for restrictions.
🌐 Real-world Usage
Most production systems store API endpoints and keys in Key Vault, retrieve them securely at startup, and cache them in memory with a refresh TTL. Error handling distinguishes between auth failures (retrying will not help until the credential is fixed) and transient failures (retry with backoff). Monitoring tracks key usage patterns for anomaly detection.
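The in-memory cache with a refresh TTL can be sketched like this. CachedSecret is a hypothetical class, and fetch stands in for whatever retrieves the secret (for example, a Key Vault lookup); the clock is injectable only to make the behaviour easy to test.

```python
import time


class CachedSecret:
    """Cache a secret in memory and re-fetch it once the TTL has elapsed."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch              # hypothetical callable returning the secret
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = float("-inf")  # forces a fetch on first use

    def get(self):
        """Return the cached secret, re-fetching it if the TTL has expired."""
        if self._clock() >= self._expires_at:
            self.refresh()
        return self._value

    def refresh(self):
        """Force a re-fetch, e.g. after a 401 that suggests the key was rotated."""
        self._value = self._fetch()
        self._expires_at = self._clock() + self._ttl
        return self._value
```

A short TTL bounds how long an instance can keep using a rotated key, and calling refresh() on a 401 shortens that window further.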
📝 Summary
Azure AI Services use endpoint URLs and API keys for authentication. Secure practices include storing keys in Key Vault, rotating them regularly, and using error handling to distinguish authentication failures from transient errors.