AKS Architecture

Understand the internal architecture of AKS — the managed control plane, the _MC_ resource group, worker node infrastructure, identity model, and how all Azure components connect.

Unlike self-managed Kubernetes, where control plane nodes are VMs you can SSH into, AKS control plane components run on Azure's infrastructure and are opaque to you. You cannot SSH into the API server, view etcd data directly, or modify control plane configurations beyond what AKS exposes via Azure APIs.
🧒 Simple Explanation (ELI5)
Think of AKS like a managed office building:
- Azure is the building management company — they handle the foundation, elevators, electrical systems, security cameras, and fire safety (the control plane).
- You rent floors (node pools) and decide what to put inside — desks, meeting rooms, kitchens (your pods and workloads).
- The building has a hidden maintenance area you can see but shouldn't touch (the _MC_ resource group) — that's where the HVAC units, generators, and plumbing live.
- The building reception (API server) routes people to the right floor. Visitors (kubectl commands) check in there first.
You never need to fix the elevators. Azure does that. But you need to know the building layout to use it effectively.
🔧 Technical Explanation
The Two Halves of AKS
Every AKS cluster is split into two distinct planes:
| Plane | Managed By | Where It Lives | Components |
|---|---|---|---|
| Control Plane | Azure | Azure-managed infrastructure (invisible to you) | API Server, etcd, Scheduler, Controller Manager, Cloud Controller Manager |
| Data Plane | You | Your Azure subscription (visible VMs) | Worker nodes (VMSS), kubelet, kube-proxy, container runtime, your pods |
Control Plane (Azure-Managed)
When you create an AKS cluster, Azure provisions a control plane that you never see as a resource in your subscription. It's multi-tenant infrastructure hosted by Azure:
- API Server — The front door. All kubectl commands, kubelet heartbeats, and controller actions go through here. Exposed via a public endpoint (or private, if configured).
- etcd — The cluster's state database. Stores all Kubernetes objects (pods, services, secrets, configmaps). Azure manages backups, encryption, and replication. You have zero access to etcd directly.
- Scheduler — Assigns pods to nodes based on resource requests, affinity rules, and taints/tolerations.
- Controller Manager — Runs reconciliation loops (ReplicaSet controller, Deployment controller, etc.).
- Cloud Controller Manager — Kubernetes' interface to Azure. Creates Azure Load Balancers when you create a LoadBalancer Service, provisions Azure Disks for PersistentVolumeClaims, and manages node lifecycle.
Data Plane (Your Subscription)
Worker nodes are Azure VMs created inside your subscription. They run as Virtual Machine Scale Sets (VMSS), managed by AKS but visible to you.
The _MC_ Resource Group
When you create an AKS cluster, Azure automatically creates a second resource group with the naming convention MC_{resource-group}_{cluster-name}_{region}. This contains all the infrastructure AKS needs:
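The naming convention can be sketched in shell. The values below are hypothetical examples, matching the ones used throughout this chapter:

```shell
# Hypothetical example values — substitute your own resource group, cluster, and region
RG="rg-dev"
CLUSTER="dev-cluster"
LOCATION="eastus"

# AKS derives the node resource group name as MC_{resource-group}_{cluster-name}_{region}
NODE_RG="MC_${RG}_${CLUSTER}_${LOCATION}"
echo "$NODE_RG"   # MC_rg-dev_dev-cluster_eastus
```

In practice, don't derive the name yourself (it can be overridden at cluster creation); query it with `az aks show ... --query nodeResourceGroup -o tsv`.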
| Resource | Purpose | Can You Modify It? |
|---|---|---|
| Virtual Machine Scale Set (VMSS) | Worker nodes for each node pool | Don't modify directly — use az aks nodepool |
| Load Balancer | Routes external traffic to services (type LoadBalancer) | Managed by cloud controller — don't edit |
| Public IP Address(es) | Outbound/inbound IPs for the load balancer | Can be pre-provisioned and passed to AKS |
| Network Security Group (NSG) | Firewall rules for nodes | Adding custom rules is possible but risky |
| Route Table | Pod networking routes (kubenet mode) | Managed by AKS — don't modify |
| Virtual Network / Subnet | Network for nodes (if AKS-created, not BYO VNet) | Use BYO VNet for production |
| Managed Disks | OS disks for worker VMs, persistent volume disks | Managed by VMSS lifecycle |
AKS's controllers continuously reconcile the state of resources in the _MC_ group. If you manually delete a VMSS, change NSG rules, or modify the load balancer, AKS may revert your changes, or worse, the cluster may break. Always use az aks commands or Kubernetes APIs to make changes.
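For example, to change the node count, scale the node pool through AKS rather than touching the VMSS. A sketch, using the hypothetical cluster names from this chapter:

```shell
# Wrong: scaling the VMSS directly. AKS's view of the node pool goes stale,
# and the reconciler may revert the change:
# az vmss scale --resource-group MC_rg-dev_dev-cluster_eastus \
#   --name aks-agentpool-12345678-vmss --new-capacity 5

# Right: scale through the AKS API, so AKS updates the VMSS and its own state together
az aks nodepool scale \
  --resource-group rg-dev \
  --cluster-name dev-cluster \
  --name agentpool \
  --node-count 5
```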
Identity Model
AKS uses multiple identities to interact with Azure resources:
| Identity | What It Does | Type |
|---|---|---|
| Cluster Managed Identity | AKS uses this to manage Azure resources (load balancers, public IPs, VMSS). This is the "cluster identity." | System-assigned or user-assigned Managed Identity |
| Kubelet Identity | Used by kubelet on each node to pull images from ACR and access Azure resources. Separate from the cluster identity. | User-assigned Managed Identity |
| Azure AD Integration | Maps Azure AD users/groups to Kubernetes RBAC. Users authenticate with az login and kubectl uses the Azure AD token. | Azure AD (Entra ID) |
| Workload Identity | Pods get their own Azure AD identity to access Azure services (Key Vault, Storage) without storing credentials. | Federated credential on Managed Identity |
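A minimal sketch of wiring up Workload Identity. All names here (identity, namespace, service account) are hypothetical, and the cluster is assumed to have been created or updated with `--enable-oidc-issuer --enable-workload-identity`:

```shell
# 1. Create a managed identity for the workload (name is hypothetical)
az identity create --resource-group rg-dev --name my-app-identity

# 2. Get the cluster's OIDC issuer URL
ISSUER=$(az aks show -g rg-dev -n dev-cluster \
  --query oidcIssuerProfile.issuerUrl -o tsv)

# 3. Federate the identity to a Kubernetes service account
#    (namespace and service account name are hypothetical)
az identity federated-credential create \
  --name my-app-federation \
  --identity-name my-app-identity \
  --resource-group rg-dev \
  --issuer "$ISSUER" \
  --subject "system:serviceaccount:my-namespace:my-app-sa" \
  --audiences api://AzureADTokenExchange
```

The Kubernetes service account then carries the `azure.workload.identity/client-id` annotation, and pods labeled `azure.workload.identity/use: "true"` can exchange their projected token for an Azure AD token — no stored credentials.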
API Server Access: Public vs Private
| Mode | API Server Endpoint | Who Can Reach It | Use Case |
|---|---|---|---|
| Public (default) | Public IP with FQDN (*.hcp.&lt;region&gt;.azmk8s.io) | Anyone on the internet (filtered by authorized IP ranges if set) | Dev/test clusters, small teams |
| Private | Private IP in your VNet via Private Link | Only from within the VNet or peered networks (VPN/ExpressRoute) | Production, compliance-required environments |
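A quick way to tell which mode a kubeconfig points at is the server URL: private clusters resolve through a `privatelink` DNS zone, public ones through `hcp`. The snippet below builds a throwaway kubeconfig with a hypothetical FQDN rather than reading your real `~/.kube/config`:

```shell
# Build a minimal kubeconfig with a hypothetical public AKS FQDN
cat > /tmp/demo-kubeconfig <<'EOF'
apiVersion: v1
clusters:
- cluster:
    server: https://dev-cluster-rg-dev-abc123.hcp.eastus.azmk8s.io:443
  name: dev-cluster
EOF

# Private clusters use *.privatelink.<region>.azmk8s.io;
# public clusters use *.hcp.<region>.azmk8s.io
SERVER=$(grep 'server:' /tmp/demo-kubeconfig | awk '{print $2}')
case "$SERVER" in
  *privatelink*) echo "private endpoint" ;;
  *)             echo "public endpoint" ;;
esac
```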
📊 Full AKS Architecture
(Diagram: kubectl → API Server (Azure-managed) → kubelet (on worker node) → Pod)
⌨️ Hands-on
# Examine your AKS cluster details
az aks show --resource-group rg-dev --name dev-cluster -o json | jq '{
name: .name,
kubernetesVersion: .kubernetesVersion,
provisioningState: .provisioningState,
powerState: .powerState.code,
fqdn: .fqdn,
nodeResourceGroup: .nodeResourceGroup,
networkPlugin: .networkProfile.networkPlugin,
serviceCidr: .networkProfile.serviceCidr,
dnsServiceIP: .networkProfile.dnsServiceIP,
identity: .identity.type
}'
# Example output:
# {
# "name": "dev-cluster",
# "kubernetesVersion": "1.29.2",
# "provisioningState": "Succeeded",
# "powerState": "Running",
# "fqdn": "dev-cluster-rg-dev-abc123.hcp.eastus.azmk8s.io",
# "nodeResourceGroup": "MC_rg-dev_dev-cluster_eastus",
# "networkPlugin": "azure",
# "serviceCidr": "10.0.0.0/16",
# "dnsServiceIP": "10.0.0.10",
# "identity": "SystemAssigned"
# }
# Explore the _MC_ resource group — see what AKS created
az resource list --resource-group MC_rg-dev_dev-cluster_eastus -o table

# Example output:
# Name                                 Type                                        Location
# ----------------------------------   -----------------------------------------   --------
# aks-agentpool-12345678-vmss          Microsoft.Compute/virtualMachineScaleSets   eastus
# kubernetes                           Microsoft.Network/loadBalancers             eastus
# aks-agentpool-12345678-nsg           Microsoft.Network/networkSecurityGroups     eastus
# aks-agentpool-12345678-routetable    Microsoft.Network/routeTables               eastus
# 6f4e1234-5678-abcd-ef01-23456789ab   Microsoft.Network/publicIPAddresses         eastus
# Examine your worker nodes from the Kubernetes side
kubectl get nodes -o wide

# Example output:
# NAME                                STATUS   ROLES    AGE   VERSION   INTERNAL-IP   OS-IMAGE
# aks-agentpool-12345678-vmss000000   Ready    <none>   5d    v1.29.2   10.224.0.4    Ubuntu 22.04.3 LTS
# aks-agentpool-12345678-vmss000001   Ready    <none>   5d    v1.29.2   10.224.0.5    Ubuntu 22.04.3 LTS
# aks-agentpool-12345678-vmss000002   Ready    <none>   5d    v1.29.2   10.224.0.6    Ubuntu 22.04.3 LTS

# Notice: nodes have NO "master" or "control-plane" role — the control plane is invisible
# Inspect a node's labels — see AKS-specific metadata
kubectl describe node aks-agentpool-12345678-vmss000000 | grep -A 20 "Labels:"

# Key AKS labels you'll see:
# kubernetes.azure.com/agentpool=agentpool
# kubernetes.azure.com/cluster=MC_rg-dev_dev-cluster_eastus
# kubernetes.azure.com/os-sku=Ubuntu
# kubernetes.azure.com/role=agent
# node.kubernetes.io/instance-type=Standard_D2s_v5
# topology.kubernetes.io/region=eastus
# topology.kubernetes.io/zone=eastus-1
# Check the cluster and kubelet identities
az aks show --resource-group rg-dev --name dev-cluster \
--query "{clusterIdentity:identity.type, kubeletIdentity:identityProfile.kubeletidentity.objectId}" -o json
# Inspect what system pods are running (AKS-managed components)
kubectl get pods -n kube-system -o wide
# You'll see pods like:
# coredns-* (DNS resolution)
# coredns-autoscaler-* (scales CoreDNS with cluster size)
# kube-proxy-* (per-node networking)
# metrics-server-* (resource metrics)
# cloud-node-manager-* (Azure cloud integration per node)
# csi-azuredisk-node-* (Azure Disk CSI driver)
# csi-azurefile-node-* (Azure File CSI driver)
The kube-system namespace in AKS contains both standard Kubernetes components (CoreDNS, kube-proxy) and AKS-specific add-ons (Azure CSI drivers, cloud-node-manager). Never delete pods in this namespace — AKS manages their lifecycle.
🐛 Debugging Scenarios
Scenario 1: "kubectl commands hang or timeout"
You run kubectl get pods and it hangs for 30 seconds, then returns Unable to connect to the server: net/http: request canceled.
# Step 1: Verify the API server endpoint resolves
nslookup $(az aks show -g rg-dev -n dev-cluster --query fqdn -o tsv)

# Step 2: Check if API server is reachable
curl -k https://$(az aks show -g rg-dev -n dev-cluster --query fqdn -o tsv):443/healthz
# Expected: "ok"

# Step 3: If using authorized IP ranges, check your current IP is allowed
az aks show -g rg-dev -n dev-cluster --query apiServerAccessProfile.authorizedIpRanges
# If your IP isn't listed, add it:
MY_IP=$(curl -s https://ifconfig.me)
az aks update -g rg-dev -n dev-cluster --api-server-authorized-ip-ranges "$MY_IP/32"

# Step 4: If private cluster — verify VPN/ExpressRoute connectivity
az aks show -g rg-dev -n dev-cluster --query apiServerAccessProfile.enablePrivateCluster
# If true, you must be inside the VNet or connected via VPN

# Step 5: Check for Azure service health issues
# Azure Portal → Service Health → check AKS in your region
Scenario 2: "A node shows NotReady status"
# Step 1: Check node status
kubectl get nodes
# NAME                             STATUS     ROLES    AGE   VERSION
# aks-agentpool-12345-vmss000002   NotReady   <none>   5d    v1.29.2

# Step 2: Describe the node to find the reason
kubectl describe node aks-agentpool-12345-vmss000002 | grep -A 5 "Conditions:"
# Look for: KubeletNotReady, NetworkUnavailable, MemoryPressure, DiskPressure

# Step 3: Check node events
kubectl get events --field-selector involvedObject.name=aks-agentpool-12345-vmss000002

# Step 4: Check VMSS instance health from Azure side
az vmss list-instances --resource-group MC_rg-dev_dev-cluster_eastus \
  --name aks-agentpool-12345-vmss -o table

# Step 5: If the node is truly unhealthy, cordon and drain it
kubectl cordon aks-agentpool-12345-vmss000002
kubectl drain aks-agentpool-12345-vmss000002 --ignore-daemonsets --delete-emptydir-data

# Step 6: Delete the bad VMSS instance — AKS will auto-replace it
az vmss delete-instances --resource-group MC_rg-dev_dev-cluster_eastus \
  --name aks-agentpool-12345-vmss --instance-ids 2
Scenario 3: "kubectl works but newly created LoadBalancer service has no external IP"
# Step 1: Check service status
kubectl get svc my-service
# EXTERNAL-IP shows <pending>

# Step 2: Describe the service for events
kubectl describe svc my-service | grep -A 10 "Events:"
# Look for: "Error syncing load balancer" or "EnsureLoadBalancer failed"

# Step 3: Check if the cluster identity has permission to create LB
# The cluster's managed identity needs "Network Contributor" on the _MC_ RG
az role assignment list --assignee $(az aks show -g rg-dev -n dev-cluster \
  --query identity.principalId -o tsv) \
  --scope "/subscriptions/$(az account show --query id -o tsv)" -o table

# Step 4: Check Azure activity log for failures
az monitor activity-log list --resource-group MC_rg-dev_dev-cluster_eastus \
  --status Failed --offset 1h -o table
🎯 Interview Questions
Beginner
Q: What are the two main planes of an AKS cluster, and who manages each?

The control plane (managed by Azure — API server, etcd, scheduler, controller manager) and the data plane (worker nodes managed by you, running as Azure VM Scale Sets in your subscription). The control plane is invisible in your subscription; the data plane resources are visible in the _MC_ resource group.
Q: What is the _MC_ resource group and what does it contain?

The _MC_ (managed cluster) resource group is an auto-created resource group named MC_{rg}_{cluster}_{region}. It contains the infrastructure AKS needs: VM Scale Sets (worker nodes), load balancers, public IPs, NSGs, route tables, and managed disks. Azure manages this group — you should not manually modify resources in it, as AKS may revert changes or break.
Q: Who manages etcd in AKS, and can you access it?

Azure manages etcd entirely. It's hosted on Azure's infrastructure, automatically backed up, encrypted at rest, replicated for HA, and scaled with the cluster. You have zero direct access to etcd — no SSH, no etcdctl, no manual backups. This is one of the biggest operational wins of managed Kubernetes.
Q: What container runtime does AKS use?

AKS uses containerd as the container runtime. Docker (dockershim) was removed in Kubernetes 1.24, and AKS migrated to containerd before that. containerd is lighter, more secure, and the industry standard. You interact with it through Kubernetes APIs — no need for docker commands on nodes.
Q: Can you get shell access to AKS worker nodes?

Yes, but it's not common practice. You can use kubectl debug node/<node-name> -it --image=mcr.microsoft.com/cbl-mariner/busybox:2.0 to get a debug pod on the node, or use az aks command invoke to run commands on nodes without setting up SSH. Direct SSH requires configuring SSH keys and is usually only needed for deep troubleshooting.
Intermediate
Q: What is a private AKS cluster, and when would you use one?

A private AKS cluster exposes the API server via a private IP address in your VNet (using Azure Private Link), instead of a public endpoint. This means kubectl commands can only come from within the VNet or connected networks (VPN, ExpressRoute, peered VNets). Use this for: production workloads with compliance requirements, sensitive data processing, defense-in-depth security posture, or when corporate policy prohibits public Kubernetes API endpoints.
Q: What identities does an AKS cluster use, and why are they separate?

AKS uses at least two: 1) Cluster identity (system-assigned or user-assigned managed identity) — used by the AKS resource provider to manage Azure infrastructure (create LBs, manage VMSS, update route tables). 2) Kubelet identity (user-assigned) — used by kubelet on each node to pull images from ACR and access node-level Azure resources. The separation follows least-privilege: kubelet doesn't need infrastructure-level permissions, and the cluster identity doesn't need image pull access.
Q: How does authentication work when Azure AD integration is enabled?

With Azure AD integration enabled: 1) User runs az aks get-credentials which configures kubeconfig with an Azure AD auth provider. 2) When kubectl runs a command, it triggers Azure AD login (browser popup or device code flow). 3) Azure AD issues a token. 4) kubectl sends the token to the AKS API server. 5) API server validates the token with Azure AD. 6) Kubernetes RBAC checks if the user's AD group has the required ClusterRole/Role. This replaces client certificate auth with enterprise identity.
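For non-interactive use (CI, scripts), the interactive login in that flow can be swapped for an existing Azure CLI session via kubelogin. A sketch, assuming kubelogin is installed and you are already signed in with az login:

```shell
# Fetch credentials; with Azure AD integration the kubeconfig gets an AAD exec plugin
az aks get-credentials --resource-group rg-dev --name dev-cluster

# Rewrite the kubeconfig to reuse the Azure CLI's token instead of prompting interactively
kubelogin convert-kubeconfig -l azurecli

# Subsequent kubectl calls authenticate with your existing az login token
kubectl get nodes
```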
Q: Compare kubenet and Azure CNI networking.

Kubenet: Nodes get IPs from the VNet subnet, but pods get IPs from a separate, internal CIDR. Inter-node pod traffic uses user-defined routes (UDR). Simpler, uses fewer VNet IPs, but pods aren't directly routable from outside the cluster. Azure CNI: Every pod gets an IP directly from the VNet subnet. Pods are first-class VNet citizens — directly routable, can use NSGs, integrate with Azure services natively. Uses more IPs but offers better networking integration. Azure CNI Overlay is a middle ground — pod IPs from an overlay, but with better performance than kubenet.
Q: How does AKS handle OS patching and node image updates?

AKS provides multiple mechanisms: 1) Node image upgrades: az aks nodepool upgrade --node-image-only applies the latest patched OS image to nodes (rolling restart). 2) Auto-upgrade channels: Configure the cluster to automatically apply node image updates on a schedule. 3) Unattended upgrades: Security patches are applied nightly to running nodes without reboot. Patches requiring a reboot are flagged — use kured (Kubernetes Reboot Daemon) to safely drain and reboot nodes automatically. Best practice: enable auto node image upgrade + kured for production.
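The first two mechanisms can be sketched as follows, reusing this chapter's hypothetical cluster and pool names:

```shell
# Opt the cluster into automatic node image upgrades
# (channel values include: none, patch, stable, rapid, node-image)
az aks update --resource-group rg-dev --name dev-cluster \
  --auto-upgrade-channel node-image

# One-off: roll the latest patched node image onto a pool right now
az aks nodepool upgrade --resource-group rg-dev --cluster-name dev-cluster \
  --name agentpool --node-image-only
```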
Scenario-Based
Q: The API server is slow or unreachable and you suspect a control plane problem. What do you do?

Since the control plane is Azure-managed, you cannot fix it directly. 1) Check Azure Service Health for AKS incidents in your region. 2) Check Azure Status page. 3) If using Standard/Premium tier, you have an SLA — file a support ticket for expedited response. 4) Check if it's actually a networking issue on your side: test API server connectivity from a different network. 5) Check if authorized IP ranges are blocking you. 6) If intermittent, it may be transient — retry kubectl commands with --request-timeout=60s. 7) For future prevention: use the Standard/Premium tier for financially-backed SLA and consider multi-region clusters for critical workloads.
Q: A newly added node never joins the cluster. How do you troubleshoot?

1) Check VMSS instance provisioning state: az vmss list-instances. 2) Check if the node can reach the API server — NSG rules, UDR configuration, private DNS resolution (for private clusters). 3) Check kubelet logs on the node: kubectl debug node/<node> -- journalctl -u kubelet. 4) Common causes: NSG blocking port 443 outbound to API server, DNS resolution failure for the API server FQDN, exhausted subnet IP addresses, node image incompatible with the K8s version, or the VMSS extension (CSE) failed during provisioning. 5) Check VMSS extension status in Azure Portal for bootstrap errors.
Q: How would you design kubectl and CI/CD access for a cluster that must have no public API endpoint?

Create a private AKS cluster: az aks create --enable-private-cluster. Architecture: 1) API server gets a private IP via Azure Private Link. 2) Deploy a jump box VM or Azure Bastion in the same VNet for kubectl access. 3) For CI/CD: use az aks command invoke (runs kubectl commands through Azure API without direct API server access), or deploy self-hosted GitHub/Azure DevOps agents inside the VNet. 4) For developers: set up Azure VPN Gateway or point-to-site VPN. 5) Configure Private DNS Zone for the cluster's FQDN to resolve inside the VNet. 6) Peer VNets if multiple teams need access from different networks.
Q: Someone manually modified or deleted resources in the _MC_ resource group and the cluster is misbehaving. How do you recover?

1) First, assess damage: kubectl get nodes and kubectl get pods -A to see what's still working. 2) Run az aks update -g rg-dev -n dev-cluster with the same parameters — AKS will attempt to reconcile and recreate missing resources. 3) If the load balancer was deleted, recreate a Service of type LoadBalancer — the cloud controller will create a new one. 4) If VMSS was modified, scale the node pool down and back up: az aks nodepool scale --node-count 0 then scale back. 5) In worst case, if the cluster is unrecoverable, create a new cluster and restore workloads from your GitOps repo. Prevention: Set Azure resource locks on the _MC_ group and use Azure Policy to deny manual modifications.
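The prevention step can be sketched with a resource lock (group name hypothetical). Note that a delete lock can also block legitimate AKS operations that delete resources, such as scale-downs, so validate this in a non-production cluster first:

```shell
# Apply a delete lock on the node resource group to block accidental manual deletes
az lock create \
  --name protect-mc-group \
  --lock-type CanNotDelete \
  --resource-group MC_rg-dev_dev-cluster_eastus
```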
Q: Design a security architecture for an AKS platform in a regulated environment.

Network: Private AKS cluster + Azure Firewall for egress filtering + NSG on subnets + network policies (Calico or Azure NPM). Identity: Azure AD integration with Conditional Access policies + Workload Identity for pod-to-Azure-service auth (no stored secrets). Secrets: Azure Key Vault with CSI driver (secrets never in etcd). Monitoring: Container Insights + Defender for Containers + diagnostic logs to Log Analytics. Compliance: Azure Policy for AKS (enforce pod security standards, require resource limits). Infrastructure: Availability Zones, Standard SLA tier, BYO VNet with pre-defined subnets, customer-managed encryption keys for OS disks.
🏗️ Production Reference Architecture
Here's a real-world AKS architecture used by production teams. Understand each layer and how the components connect:
(Diagram: traffic path: global load balancer + WAF → regional L7 load balancer → in-cluster ingress → your app; delivery path: CI build + test → image store → in-cluster GitOps → deployed workloads)
Architecture Decision Record
| Decision | Choice | Why |
|---|---|---|
| Networking | Azure CNI + BYO VNet | Pods need direct VNet IPs for private endpoint access to Azure SQL, Redis, Storage |
| Ingress | App Gateway (AGIC) + NGINX | AGIC for WAF & SSL offload, NGINX for path-based routing & rate limiting inside the cluster |
| Identity | Workload Identity | No secrets in the cluster — pods federate to Managed Identity for Azure resource access |
| Secrets | Key Vault + CSI driver | Centralized secret management, automatic rotation, audit logging |
| CI/CD | GitHub Actions → Flux v2 GitOps | CI builds images; Flux reconciles desired state from Git — no direct kubectl in CI |
| Monitoring | Managed Prometheus + Grafana | Azure-managed, no infra overhead, native AKS integration, pre-built dashboards |
| Egress | Azure Firewall + UDR | All outbound traffic inspected, FQDN-based filtering, compliance requirement |
| Cluster access | Private cluster + VPN | API server not internet-exposed, developers use P2S VPN, CI uses az aks command invoke |
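The "CI uses az aks command invoke" decision looks like this in practice. A sketch with this chapter's hypothetical names; the command runs in a transient pod inside the cluster, so the caller needs only Azure RBAC permissions, not network line-of-sight to the API server:

```shell
# Run a kubectl command against a private cluster through the Azure management plane
az aks command invoke \
  --resource-group rg-dev \
  --name dev-cluster \
  --command "kubectl get pods -n kube-system"
```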
You don't need all this on day one. Start with a public cluster + NGINX Ingress + ACR. Add private cluster, Firewall, and GitOps as your security and compliance requirements grow. The architecture above is where mature teams end up — it's a target, not a starting point.
🌍 Real-World Use Case
A large enterprise bank deployed AKS with a private cluster architecture:
- Architecture: Private AKS cluster in a hub-spoke VNet topology. The hub VNet contains Azure Firewall, VPN Gateway, and Azure Bastion. AKS is in a spoke VNet peered to the hub.
- API Access: Developers connect via point-to-site VPN. CI/CD pipelines run on self-hosted Azure DevOps agents deployed inside the spoke VNet. No public API server endpoint exists.
- Identity: Azure AD integration with Conditional Access — kubectl only works from compliant, corporate-managed devices. Pod-level access to Key Vault and Storage uses Workload Identity.
- Result: Passed SOC 2 Type II audit with zero findings related to Kubernetes. Zero unauthorized API access attempts (because the endpoint isn't public). Full audit trail via Azure AD + Kubernetes audit logs.
📝 Summary
- AKS has two halves: Azure-managed control plane (invisible) and your data plane (worker nodes in the _MC_ resource group)
- The _MC_ resource group contains VMSS, load balancers, NSGs, route tables, and public IPs — don't modify it manually
- AKS uses managed identities (cluster identity + kubelet identity) instead of service principals
- Azure AD integration maps enterprise identities to Kubernetes RBAC
- Private clusters expose the API server only within your VNet via Private Link
- System pods in kube-system include Azure-specific components like CSI drivers and cloud-node-manager