Troubleshooting
Systematic troubleshooting for common security issues: access denied, permission problems, authentication failures, and secret access errors.
Troubleshooting Framework (5-Step Method)
5-Step Troubleshooting Process
- Understand the error: Read the exact error message
- Check RBAC: Does the user/app have the right role at the right scope?
- Check network: Is traffic reaching the service?
- Check authentication: Is the user/app correctly authenticated?
- Check audit logs: What does Azure indicate happened?
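As a rough illustration of the first step, the sketch below maps the wording of an error message to the check that is usually worth running first. The function name `first_check` and its keyword list are assumptions made for this example; they are not part of any Azure tool, and real error text varies.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an Azure tool): suggest which check to run first
# based on keywords in the error text. The keyword list is illustrative only.
first_check() {
  local msg
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"timed out"*|*timeout*)
      echo "network" ;;
    *"sign-in was blocked"*)
      echo "authentication" ;;
    *"authorization failed"*|*"access denied"*|*unauthorized*|*"(403)"*)
      echo "rbac" ;;
    *)
      echo "audit-logs" ;;
  esac
}

first_check "KeyVault.GetSecret failed: Access denied (403)"
```

Timeouts are tested first because a connection that never reaches the service cannot produce a meaningful authorization result.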
Case 1: "Access Denied" on Resource
Error
Authorization failed for action 'Microsoft.Compute/virtualMachines/start/action'
Diagnosis
# Check the user's role assignments that apply to this VM, including inherited ones
az role assignment list \
--assignee user@contoso.com \
--scope /subscriptions/{subId}/resourceGroups/myRG/providers/Microsoft.Compute/virtualMachines/myVM \
--include-inherited
# List the operations the VM resource type exposes (what a role could grant, not what this user has)
az provider operation show \
--namespace Microsoft.Compute \
--query "resourceTypes[?name=='virtualMachines'].operations[].name"
# View activity log for the failure
az monitor activity-log list \
--caller user@contoso.com \
--status Failed \
--query "[0:5]"
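When reading the role assignment output, keep in mind that an assignment applies to its own scope and to everything beneath it. The sketch below shows that prefix rule in isolation; `scope_covers` is a hypothetical helper name, and the subscription ID `0000` is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: does a role assignment at scope $1 apply to resource $2?
# Azure RBAC assignments are inherited by child scopes, so an assignment applies
# when its scope equals the resource ID or is a path prefix of it.
scope_covers() {
  local assigned target
  assigned=$(printf '%s' "${1%/}" | tr '[:upper:]' '[:lower:]')
  target=$(printf '%s' "${2%/}" | tr '[:upper:]' '[:lower:]')
  case "$target" in
    "$assigned"|"$assigned"/*) return 0 ;;
    *) return 1 ;;
  esac
}

vm="/subscriptions/0000/resourceGroups/myRG/providers/Microsoft.Compute/virtualMachines/myVM"

# An assignment on a different resource group does not reach this VM.
scope_covers "/subscriptions/0000/resourceGroups/otherRG" "$vm" || echo "otherRG assignment does not apply"
```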
Common Causes & Fixes
- Cause: User has "Reader" but needs "Virtual Machine Contributor"
Fix: az role assignment create --assignee user@contoso.com --role "Virtual Machine Contributor" --scope /subscriptions/{subId}/resourceGroups/myRG
- Cause: Role assigned at the wrong scope (a different resource group)
Fix: Assign the role at the correct resource group or resource scope
- Cause: User has a role, but it does not include the "start" action
Fix: Assign a built-in role that includes the action, or create a custom role with the required actions
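To see why a role may not cover the denied action, it can help to compare the action in the error message against the action patterns in the role definition. The sketch below does that with plain string matching; `denied_action` and `action_matches` are hypothetical helper names, and the wildcard handling is a simplification of how Azure evaluates role definitions (it ignores NotActions, for example).

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading an "Authorization failed" message.

# Pull the denied action out of: ... for action '<action>'
denied_action() {
  printf '%s\n' "$1" | sed -n "s/.*action '\([^']*\)'.*/\1/p"
}

# Succeed if an RBAC action pattern (which may contain * wildcards) covers the
# given action. Comparison is case-insensitive. NotActions are not considered.
action_matches() {
  local pattern action
  pattern=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  action=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  case "$action" in
    $pattern) return 0 ;;   # unquoted, so * acts as a wildcard
    *) return 1 ;;
  esac
}

err="Authorization failed for action 'Microsoft.Compute/virtualMachines/start/action'"
action=$(denied_action "$err")

# Reader grants "*/read", which does not cover the start action.
action_matches "*/read" "$action" || echo "Reader does not allow $action"
```

The action patterns for a real role can be listed with `az role definition list --name "Reader" --query "[].permissions[].actions"` and passed to `action_matches` one at a time.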
Case 2: "Authentication Failed" on Key Vault
Error
KeyVault.GetSecret failed: Access denied (403)
Diagnosis
# Check if app has managed identity
az webapp identity show --name myAppService --resource-group myRG
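# Verify RBAC role assignments on the vault (vaults on the legacy model use access policies instead)
az role assignment list \
--scope "$(az keyvault show --name myKeyVault --query id --output tsv)" \
--query "[].{roleName:roleDefinitionName, principalName:principalName}"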
# Test direct secret retrieval (runs as the signed-in CLI user, not as the app)
az keyvault secret show \
--vault-name myKeyVault \
--name mySecret \
--query value
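# Check the activity log for the vault (control-plane operations only;
# secret reads are recorded in the vault's diagnostic logs, if enabled)
az monitor activity-log list \
--resource-id /subscriptions/{subId}/resourceGroups/myRG/providers/Microsoft.KeyVault/vaults/myKeyVault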
Common Causes & Fixes
- Cause: No managed identity on the app
Fix: az webapp identity assign --name myAppService --resource-group myRG
- Cause: Managed identity exists but has no Key Vault access
Fix: Grant the "Key Vault Secrets User" RBAC role to the managed identity (read-only; "Key Vault Secrets Officer" also allows writes and deletes)
- Cause: Vault uses legacy access policies (not RBAC)
Fix: Add an access policy: az keyvault set-policy --name myKeyVault --object-id {principalId} --secret-permissions get list
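Which fix applies depends on the vault's permission model. The sketch below prints, without running it, the grant command for each model; `kv_grant_command` is a hypothetical helper, and the read-only "Key Vault Secrets User" role is an assumption about what the app needs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) the grant command that matches the
# vault's permission model.
#   $1 = vault name
#   $2 = principal (object) ID of the app's managed identity
#   $3 = "true" if the vault uses Azure RBAC, anything else for access policies
kv_grant_command() {
  local vault="$1" principal="$2" rbac="$3"
  if [ "$rbac" = "true" ]; then
    printf 'az role assignment create --assignee-object-id %s --assignee-principal-type ServicePrincipal --role "Key Vault Secrets User" --scope "$(az keyvault show --name %s --query id --output tsv)"\n' "$principal" "$vault"
  else
    printf 'az keyvault set-policy --name %s --object-id %s --secret-permissions get list\n' "$vault" "$principal"
  fi
}

# Example wiring (needs a logged-in Azure CLI, so it is left commented out):
# rbac=$(az keyvault show --name myKeyVault --query properties.enableRbacAuthorization --output tsv)
# kv_grant_command myKeyVault "{principalId}" "$rbac"
```

Printing the command instead of executing it leaves room to review the scope and role before anything changes.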
Case 3: "User Cannot Sign In" (MFA Locked Out)
Error
Your sign-in was blocked. Please see your admin.
Common Causes & Fixes
- Cause: User lost their phone (MFA device)
Fix: Admin resets MFA: Azure AD Portal → User → Authentication Methods → Delete authenticator
- Cause: Conditional Access policy blocks the user (unmanaged device, unusual location)
Fix: User signs in from a managed device or trusted location, or admin temporarily excludes the user from the policy
- Cause: Account locked (too many failed attempts)
Fix: Wait for the lockout period to expire (the duration depends on the tenant's lockout settings), or have an admin unlock the account in the Portal
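The sign-in logs attach an error code to each failed attempt, which helps tell these causes apart. The sketch below maps three such codes to the causes listed above; `signin_cause` is a hypothetical helper, and the mapping is a starting point that should be confirmed against Microsoft's error code reference.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a sign-in error code to one of the causes above.
# Accepts either "53003" or "AADSTS53003". Only three codes are covered.
signin_cause() {
  local code="${1#AADSTS}"
  case "$code" in
    50053) echo "account locked" ;;
    53003) echo "blocked by Conditional Access" ;;
    50076) echo "MFA required" ;;
    *)     echo "unknown: look up the code in the sign-in log details" ;;
  esac
}

signin_cause "AADSTS53003"
```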
Case 4: "Connection Timeout" to Key Vault from Private Network
Error
ConnectTimeout: Connection to https://myKeyVault.vault.azure.net:443 timed out
Diagnosis Steps
# Check whether public access is disabled and whether a private endpoint exists
az keyvault show --name myKeyVault --resource-group myRG \
--query "{publicNetworkAccess:properties.publicNetworkAccess, privateEndpoints:properties.privateEndpointConnections[].id}"
# Test connectivity from the VM (SSH/RDP into the VM first)
curl -v https://myKeyVault.vault.azure.net/version
# Check the NSG on the VM subnet: is outbound 443 allowed?
az network nsg rule list --resource-group myRG --nsg-name myNSG \
--query "[?destinationPortRange=='443' || destinationPortRange=='*']"
# Check whether the private endpoint DNS name resolves
nslookup myKeyVault.vault.azure.net
# Check firewall/route tables
az network route-table show --name myRouteTable --resource-group myRG
Common Causes & Fixes
- Cause: Key Vault has public access disabled, but no private endpoint
Fix: Create a private endpoint for Key Vault in the VNet
- Cause: NSG blocks outbound 443 traffic
Fix: Add an NSG rule allowing outbound 443 to the AzureKeyVault service tag (or to the private endpoint IP)
- Cause: DNS resolution failing for private endpoint
Fix: Check that the private DNS zone is linked to the VNet and verify DNS forwarders
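When interpreting the nslookup result, the useful question is whether the vault name resolved to a private address. With a working private endpoint and DNS zone link, a lookup from inside the VNet typically returns an address from the VNet range. If it returns a public address, DNS is the likely culprit. The sketch below checks an IPv4 address against the RFC 1918 ranges; `is_private_ipv4` is a hypothetical helper, and it assumes the VNet uses RFC 1918 address space.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if an IPv4 address is in an RFC 1918 private
# range (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
is_private_ipv4() {
  local a rest b
  a=${1%%.*}        # first octet
  rest=${1#*.}
  b=${rest%%.*}     # second octet
  case "$a" in
    10)  return 0 ;;
    172) [ "$b" -ge 16 ] && [ "$b" -le 31 ] ;;
    192) [ "$b" -eq 168 ] ;;
    *)   return 1 ;;
  esac
}

# Example with a literal address; in practice, pass the address nslookup returned.
if is_private_ipv4 "10.0.1.5"; then
  echo "resolves to a private address: DNS for the private endpoint looks right"
else
  echo "resolves to a public address: check the private DNS zone link"
fi
```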
Summary & Quick Reference
| Error Pattern | First Thing to Check | Command |
|---|---|---|
| "Access Denied" / "Unauthorized" | RBAC role assignments | az role assignment list --assignee {user/app} |
| "KeyVault 403" / "Access denied" | Managed identity + Vault access | az role assignment list --scope {vaultId} |
| "Sign-in blocked" | MFA status / Conditional Access | Azure AD Portal (no CLI equivalent) |
| "Connection Timeout" to service | NSG rules + DNS resolution | az network nsg rule list / nslookup |
Interview Questions
Q (Beginner): User reports "Access Denied" when trying to start a VM. Where do you start troubleshooting?
A: Check if user has "Virtual Machine Contributor" or higher role at the resource group or VM scope using az role assignment list.
Q (Beginner): App can't retrieve secrets from Key Vault (403). What's missing?
A: App likely has no managed identity, or managed identity lacks Key Vault permissions. Verify managed identity exists and is granted RBAC role or access policy.
Q (Intermediate): Walk through the 5-step troubleshooting method for "VM can't reach Key Vault."
A: 1) Understand error (connection timeout vs 403) 2) Check network (NSG allows 443?) 3) Check DNS (private endpoint DNS resolves?) 4) Check auth (managed identity has role?) 5) Check logs (activity log shows what happened?).
Q (Intermediate): MFA is blocking user access. User lost phone. What's your fix?
A: Admin goes to Azure AD Portal → User → Authentication Methods → Remove lost phone authenticator. User re-registers MFA device next login.
Q (Scenario): Developer reports: "My app can't connect to SQL database. Works on my laptop, fails in Azure." Troubleshoot this systematically.
A: 1) Check error message (auth vs connection?) 2) RBAC: Does app's managed identity have SQL access? 3) Network: NSG allows 1433 from app subnet? 4) Auth: Is app using correct credentials/connection string from Key Vault? 5) Logs: Check SQL logs for denied connection attempts. Most likely NSG or identity issue.