Troubleshooting

Systematic troubleshooting for common security issues: access denied, permission problems, authentication failures, and secret access errors.

Troubleshooting Framework (5-Step Method)

5-Step Troubleshooting Process
  1. Understand the error: read the exact error message.
  2. Check RBAC: does the user or app have the right role at the right scope?
  3. Check network: is traffic reaching the service?
  4. Check authentication: is the user or app correctly authenticated?
  5. Check audit logs: what does Azure record as having happened?

Case 1: "Access Denied" on Resource

Error

Authorization failed for action 'Microsoft.Compute/virtualMachines/start/action'

Diagnosis

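# Check user's role assignments on this resource, including those
# inherited from the resource group or subscription.
# (--scope already names the resource group, so --resource-group is omitted.)
az role assignment list \
 --assignee user@contoso.com \
 --include-inherited \
 --scope /subscriptions/{subId}/resourceGroups/myRG/providers/Microsoft.Compute/virtualMachines/myVM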

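# List the operations that exist for virtual machines, then compare them
# with the actions granted by the user's role ("Reader" is an example)
az provider operation show \
 --namespace Microsoft.Compute \
 --query "resourceTypes[?name=='virtualMachines'].operations[].name"
az role definition list --name "Reader" --query "[].permissions[].actions[]"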

# View activity log for the failure
az monitor activity-log list \
 --caller user@contoso.com \
 --status Failed \
 --query "[0:5]"

Common Causes & Fixes

  • Cause: User has "Reader" but needs "Virtual Machine Contributor"
    Fix: az role assignment create --assignee user@contoso.com --role "Virtual Machine Contributor" --scope /subscriptions/{subId}/resourceGroups/myRG
  • Cause: Role assigned at the wrong scope (for example, a different resource group)
    Fix: Assign the role at the correct resource group or resource scope
  • Cause: User has a role, but it doesn't include the "start" action
    Fix: Assign a built-in role that includes the action, or create a custom role with the required actions
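
The last two causes come down to string matching: an assignment applies only when its scope is the target resource or one of its ancestors, and a role permits an operation only when one of its action patterns matches it. The sketch below illustrates both checks in POSIX shell. The function names scope_covers and action_allowed are hypothetical helpers, not Azure CLI commands, and the sketch ignores NotActions, deny assignments, and management-group scopes, so treat it as an aid for reading CLI output rather than a full evaluation of Azure RBAC.

```shell
# scope_covers ASSIGNED_SCOPE TARGET_ID
# Succeeds when TARGET_ID equals ASSIGNED_SCOPE or sits beneath it.
# Resource IDs are compared case-insensitively.
scope_covers() (
  assigned=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  target=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  assigned=${assigned%/}            # tolerate a trailing slash
  case "$target" in
    "$assigned"|"$assigned"/*) exit 0 ;;
    *) exit 1 ;;
  esac
)

# action_allowed ACTION
# Reads role action patterns on stdin (one per line, '*' as wildcard)
# and succeeds when at least one pattern matches ACTION.
action_allowed() (
  want=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  while IFS= read -r pattern || [ -n "$pattern" ]; do
    pattern=$(printf '%s' "$pattern" | tr '[:upper:]' '[:lower:]')
    [ -n "$pattern" ] || continue
    case "$want" in
      $pattern) exit 0 ;;           # unquoted on purpose: treat as a glob
    esac
  done
  exit 1
)
```

For example, piping the output of az role definition list --name "Reader" --query "[].permissions[].actions[]" --output tsv into action_allowed 'Microsoft.Compute/virtualMachines/start/action' returns non-zero, because Reader grants only */read.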

Case 2: "Access Denied (403)" on Key Vault

Error

KeyVault.GetSecret failed: Access denied (403)

Diagnosis

# Check if app has managed identity
az webapp identity show --name myAppService --resource-group myRG

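# Verify Key Vault RBAC role assignments (for vaults that use access policies,
# inspect properties.accessPolicies in the output of az keyvault show instead)
az role assignment list \
 --scope "$(az keyvault show --name myKeyVault --query id --output tsv)" \
 --include-inherited \
 --query "[].{roleName:roleDefinitionName, principalName:principalName}"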

# Test direct secret retrieval (runs as your signed-in CLI identity, not the app's)
az keyvault secret show \
 --vault-name myKeyVault \
 --name mySecret \
 --query value

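# Check control-plane activity on the vault. Secret reads are data-plane
# operations and appear only in Key Vault diagnostic logs, if enabled.
az monitor activity-log list \
 --resource-id "$(az keyvault show --name myKeyVault --query id --output tsv)"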

Common Causes & Fixes

  • Cause: No managed identity on app
    Fix: az webapp identity assign --name myAppService --resource-group myRG
  • Cause: Managed identity exists, but has no Key Vault access
    Fix: Grant the "Key Vault Secrets User" RBAC role (read-only access to secret values) to the managed identity
  • Cause: Vault uses legacy access policies (not RBAC) and has none for the identity
    Fix: Add access policy: az keyvault set-policy --name myKeyVault --object-id {principalId} --secret-permissions get list
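
The three causes above form a small decision tree: no identity, then RBAC vault versus access-policy vault. A minimal sketch of that decision follows. kv_fix_for is a hypothetical helper name, and its two inputs are assumed to come from the CLI queries shown in the note after the block. It reports only which fix applies and does not verify that a role or policy already exists.

```shell
# kv_fix_for PRINCIPAL_ID RBAC_ENABLED
# PRINCIPAL_ID: the app's managed identity object ID, empty if it has none.
# RBAC_ENABLED: the vault's enableRbacAuthorization value; anything other
#               than "true" (any case) is treated as access-policy mode.
kv_fix_for() (
  principal_id=$1
  rbac_enabled=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  if [ -z "$principal_id" ]; then
    echo "assign a managed identity to the app"
  elif [ "$rbac_enabled" = "true" ]; then
    echo "grant an RBAC role on the vault to $principal_id"
  else
    echo "add an access policy for $principal_id"
  fi
)
```

The inputs could be gathered with az webapp identity show --name myAppService --resource-group myRG --query principalId --output tsv and az keyvault show --name myKeyVault --query properties.enableRbacAuthorization --output tsv, then passed as the first and second arguments.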

Case 3: "User Cannot Sign In" (MFA Locked Out)

Error

Your sign-in was blocked. Please see your admin.

Common Causes & Fixes

  • Cause: User lost phone (MFA device)
    Fix: Admin resets MFA: Microsoft Entra admin center (formerly the Azure AD portal) → Users → select the user → Authentication methods → delete the lost authenticator
  • Cause: Conditional Access policy blocks user (non-managed device, unusual location)
    Fix: User uses managed device or trusted location, or admin exempts user temporarily
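  • Cause: Account locked (too many failed attempts)
    Fix: Wait for the lockout to expire (Microsoft Entra smart lockout starts at about one minute by default and lengthens after repeated failures), or have the password reset by the user through self-service password reset or by an admin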

Case 4: "Connection Timeout" to Key Vault from Private Network

Error

ConnectTimeout: Connection to https://myKeyVault.vault.azure.net:443 timed out

Diagnosis

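# Check whether public access is disabled and whether a private endpoint exists
az keyvault show --name myKeyVault --resource-group myRG \
 --query "{publicNetworkAccess:properties.publicNetworkAccess, privateEndpoints:properties.privateEndpointConnections[].privateEndpoint.id}"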

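# Test connectivity from VM
# SSH/RDP into VM, then run the command below. Any HTTP response (even 401 or 403)
# means the network path works; a timeout points to network or DNS.
curl -v --max-time 10 https://myKeyVault.vault.azure.net/version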

# Check NSG on VM subnet: is outbound 443 allowed?
az network nsg rule list --resource-group myRG --nsg-name myNSG \
 --query "[?direction=='Outbound' && (destinationPortRange=='443' || destinationPortRange=='*')]"

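# Check if private endpoint DNS resolves. From inside the VNet, expect a private IP
# reached through a CNAME to myKeyVault.privatelink.vaultcore.azure.net
nslookup myKeyVault.vault.azure.net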

# Check firewall/route tables
az network route-table show --name myRouteTable --resource-group myRG

Common Causes & Fixes

  • Cause: Key Vault has public access disabled, but no private endpoint
    Fix: Create private endpoint for Key Vault in VNet
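  • Cause: NSG blocks outbound 443 traffic
    Fix: Add NSG rule allowing outbound 443 to the AzureKeyVault service tag (or to the private endpoint's IP)
  • Cause: DNS resolution failing for private endpoint
    Fix: Check that the privatelink.vaultcore.azure.net private DNS zone is linked to the VNet, and verify DNS forwarders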
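
For the DNS cause, one quick signal is whether the vault's hostname resolves to a private address from inside the VNet. The sketch below checks an IPv4 address against the RFC 1918 private ranges. is_private_ipv4 is a hypothetical helper, it assumes the input is already a well-formed IPv4 address, and it assumes the private endpoint's subnet uses RFC 1918 address space, which is typical but not guaranteed.

```shell
# is_private_ipv4 ADDRESS
# Succeeds when ADDRESS falls in 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16.
is_private_ipv4() {
  case "$1" in
    10.*) return 0 ;;
    192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}
```

On a Linux VM, the address could be taken from the first field of getent hosts myKeyVault.vault.azure.net. A public address returned from inside the VNet suggests the private DNS zone is not linked or the forwarders are bypassing it.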

Summary & Quick Reference

Error Pattern | First Thing to Check | Command
"Access Denied" / "Unauthorized" | RBAC role assignments | az role assignment list --assignee {user/app}
"KeyVault 403" / "Access denied" | Managed identity + vault access | az role assignment list --scope {vaultId}
"Sign-in blocked" | MFA status / Conditional Access | Microsoft Entra admin center (formerly Azure AD portal; no az CLI equivalent)
"Connection Timeout" to service | NSG rules + DNS resolution | az network nsg rule list / nslookup
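
The quick reference can also be read as a lookup from error text to first check. A minimal sketch of that lookup follows. first_check is a hypothetical helper, and its patterns are assumptions based only on the four sample errors in this guide, so real error messages may need more patterns. Timeouts are tested first because the timeout message in Case 4 also contains the vault name.

```shell
# first_check "ERROR MESSAGE"
# Prints the area to investigate first, following the quick reference.
first_check() (
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"timed out"*|*timeout*)
      echo "network: NSG rules and DNS resolution" ;;
    *keyvault*403*|*keyvault*"access denied"*)
      echo "key vault: managed identity and vault access" ;;
    *sign-in*blocked*)
      echo "sign-in: MFA status and Conditional Access" ;;
    *"access denied"*|*unauthorized*|*"authorization failed"*)
      echo "rbac: role assignments" ;;
    *)
      echo "unknown: read the full error and check the logs" ;;
  esac
)
```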

Interview Questions

Q (Beginner): User reports "Access Denied" when trying to start a VM. Where do you start troubleshooting?
A: Check if user has "Virtual Machine Contributor" or higher role at the resource group or VM scope using az role assignment list.
Q (Beginner): App can't retrieve secrets from Key Vault (403). What's missing?
A: App likely has no managed identity, or managed identity lacks Key Vault permissions. Verify managed identity exists and is granted RBAC role or access policy.
Q (Intermediate): Walk through the 5-step troubleshooting method for "VM can't reach Key Vault."
A: 1) Understand error (connection timeout vs 403) 2) Check network (NSG allows 443?) 3) Check DNS (private endpoint DNS resolves?) 4) Check auth (managed identity has role?) 5) Check logs (activity log shows what happened?).
Q (Intermediate): MFA is blocking user access. User lost phone. What's your fix?
A: Admin goes to the Microsoft Entra admin center (formerly the Azure AD portal) → Users → select the user → Authentication methods → remove the lost phone's authenticator. The user re-registers an MFA device at the next sign-in.
Q (Scenario): Developer reports: "My app can't connect to SQL database. Works on my laptop, fails in Azure." Troubleshoot this systematically.
A: 1) Check error message (auth vs connection?) 2) RBAC: Does app's managed identity have SQL access? 3) Network: NSG allows 1433 from app subnet? 4) Auth: Is app using correct credentials/connection string from Key Vault? 5) Logs: Check SQL logs for denied connection attempts. Most likely NSG or identity issue.