Basics Lesson 2 of 14

AKS Architecture

Understand the internal architecture of AKS — the managed control plane, the _MC_ resource group, worker node infrastructure, identity model, and how all Azure components connect.

🧒 Simple Explanation (ELI5)

Think of AKS like a managed office building:

You never need to fix the elevators. Azure does that. But you need to know the building layout to use it effectively.

🔧 Technical Explanation

The Two Halves of AKS

Every AKS cluster is split into two distinct planes:

| Plane | Managed By | Where It Lives | Components |
| --- | --- | --- | --- |
| Control Plane | Azure | Azure-managed infrastructure (invisible to you) | API Server, etcd, Scheduler, Controller Manager, Cloud Controller Manager |
| Data Plane | You | Your Azure subscription (visible VMs) | Worker nodes (VMSS), kubelet, kube-proxy, container runtime, your pods |

Control Plane (Azure-Managed)

When you create an AKS cluster, Azure provisions a control plane that you never see as a resource in your subscription. It's multi-tenant infrastructure hosted and operated by Azure.

Key Distinction

Unlike self-managed K8s where master nodes are VMs you can SSH into, AKS control plane components run on Azure's infrastructure and are completely opaque. You cannot SSH into the API server, view etcd data directly, or modify control plane configurations beyond what AKS exposes via Azure APIs.

Data Plane (Your Subscription)

Worker nodes are Azure VMs created inside your subscription. They run as Virtual Machine Scale Sets (VMSS), managed by AKS but visible to you.

The _MC_ Resource Group

When you create an AKS cluster, Azure automatically creates a second resource group with the naming convention MC_{resource-group}_{cluster-name}_{region}. This contains all the infrastructure AKS needs:

| Resource | Purpose | Can You Modify It? |
| --- | --- | --- |
| Virtual Machine Scale Set (VMSS) | Worker nodes for each node pool | Don't modify directly — use az aks nodepool |
| Load Balancer | Routes external traffic to services (type LoadBalancer) | Managed by cloud controller — don't edit |
| Public IP Address(es) | Outbound/inbound IPs for the load balancer | Can be pre-provisioned and passed to AKS |
| Network Security Group (NSG) | Firewall rules for nodes | Adding custom rules is possible but risky |
| Route Table | Pod networking routes (kubenet mode) | Managed by AKS — don't modify |
| Virtual Network / Subnet | Network for nodes (if AKS-created, not BYO VNet) | Use BYO VNet for production |
| Managed Disks | OS disks for worker VMs, persistent volume disks | Managed by VMSS lifecycle |
⚠️ Never manually modify resources in the _MC_ resource group

AKS's controllers continuously reconcile the state of resources in the _MC_ group. If you manually delete a VMSS, change NSG rules, or modify the load balancer, AKS may revert your changes, or worse, the cluster may break. Always use az aks commands or Kubernetes APIs to make changes.
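The naming convention can be illustrated with a tiny helper. This is a sketch, assuming default AKS naming; Azure can truncate very long names, so the authoritative value is always `az aks show --query nodeResourceGroup -o tsv`:

```shell
#!/usr/bin/env bash
# Sketch: derive the default _MC_ (node) resource group name from its parts.
# Assumption: default AKS naming; very long names may be truncated by Azure.
mc_resource_group() {
  local rg="$1" cluster="$2" region="$3"
  printf 'MC_%s_%s_%s\n' "$rg" "$cluster" "$region"
}

mc_resource_group "rg-myapp" "mycluster" "eastus"
# Prints: MC_rg-myapp_mycluster_eastus
```

You can also override this name at creation time with `--node-resource-group`, which many teams use to fit their naming standards.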

Identity Model

AKS uses multiple identities to interact with Azure resources:

| Identity | What It Does | Type |
| --- | --- | --- |
| Cluster Managed Identity | AKS uses this to manage Azure resources (load balancers, public IPs, VMSS). This is the "cluster identity." | System-assigned or user-assigned Managed Identity |
| Kubelet Identity | Used by kubelet on each node to pull images from ACR and access Azure resources. Separate from the cluster identity. | User-assigned Managed Identity |
| Azure AD Integration | Maps Azure AD users/groups to Kubernetes RBAC. Users authenticate with az login and kubectl uses the Azure AD token. | Azure AD (Entra ID) |
| Workload Identity | Pods get their own Azure AD identity to access Azure services (Key Vault, Storage) without storing credentials. | Federated credential on Managed Identity |
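One place these identities matter day to day is ACR image pulls. Here is a hedged sketch of wiring the kubelet identity to a registry — all names are placeholders, and commands are echoed as a dry run so you can review them before executing:

```shell
#!/usr/bin/env bash
# Sketch: grant the kubelet identity pull access to an ACR.
# Placeholder names; commands are echoed as a dry run (set DRY_RUN=0 to execute).
RG="rg-dev"; CLUSTER="dev-cluster"; ACR="myregistry"
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Recommended: let AKS create the AcrPull role assignment for the
# kubelet identity itself.
run az aks update -g "$RG" -n "$CLUSTER" --attach-acr "$ACR"

# Manual equivalent: look up the kubelet identity, then assign AcrPull
# on the registry's resource ID.
run az aks show -g "$RG" -n "$CLUSTER" \
  --query identityProfile.kubeletidentity.objectId -o tsv
run az role assignment create --assignee "<kubelet-objectId>" \
  --role AcrPull --scope "<acr-resource-id>"
```

The `--attach-acr` path is preferred because AKS keeps the role assignment reconciled for you.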

API Server Access: Public vs Private

| Mode | API Server Endpoint | Who Can Reach It | Use Case |
| --- | --- | --- | --- |
| Public (default) | Public IP with FQDN (*.hcp.region.azmk8s.io) | Anyone on the internet (filtered by authorized IP ranges if set) | Dev/test clusters, small teams |
| Private | Private IP in your VNet via Private Link | Only from within the VNet or peered networks (VPN/ExpressRoute) | Production, compliance-required environments |
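To make the table concrete, here is a sketch of the creation-time flags behind each mode. Commands are printed rather than executed, and the group/cluster names and CIDR are placeholders:

```shell
#!/usr/bin/env bash
# Sketch: az flags that control API server exposure.
# Dry run: each command is printed, not executed; names are placeholders.
emit() { echo "+ $*"; }

# Public endpoint, locked down to an allow-list of CIDRs:
emit az aks create -g rg-dev -n dev-cluster \
  --api-server-authorized-ip-ranges "203.0.113.0/24"

# Private endpoint: API server reachable only from inside the VNet
# (or peered / VPN-connected networks):
emit az aks create -g rg-dev -n dev-cluster --enable-private-cluster
```

Note that you cannot simply flip an existing public cluster to private; plan the access mode up front.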

📊 Full AKS Architecture

AKS Cluster Architecture

- Azure-Managed Control Plane
  - API Server (public or private endpoint)
  - etcd (managed, encrypted, backed up)
  - Scheduler
  - Controller Manager
  - Cloud Controller Manager
- (kubectl / kubelet traffic flows between the two planes over HTTPS)
- Your Subscription — _MC_ Resource Group
  - VMSS (System Node Pool)
  - VMSS (User Node Pool)
  - Azure Load Balancer
  - Public IP(s)
  - NSG + Route Table
  - VNet / Subnet

Communication Flow: kubectl → Pod

Developer (kubectl) → HTTPS → API Server (Azure-managed) → HTTPS → kubelet (on worker node) → CRI → containerd → Pod

Resource Group Layout

- rg-myapp (Your RG): AKS Cluster Resource, plus ACR, Key Vault, and App Gateway (if used)
- The AKS resource auto-creates MC_rg-myapp_mycluster_eastus, containing: VMSS (nodes), Load Balancer, NSG, Route Table, Managed Disks, Public IPs

⌨️ Hands-on

```bash
# Examine your AKS cluster details
az aks show --resource-group rg-dev --name dev-cluster -o json | jq '{
  name: .name,
  kubernetesVersion: .kubernetesVersion,
  provisioningState: .provisioningState,
  powerState: .powerState.code,
  fqdn: .fqdn,
  nodeResourceGroup: .nodeResourceGroup,
  networkPlugin: .networkProfile.networkPlugin,
  serviceCidr: .networkProfile.serviceCidr,
  dnsServiceIP: .networkProfile.dnsServiceIP,
  identity: .identity.type
}'

# Example output:
# {
#   "name": "dev-cluster",
#   "kubernetesVersion": "1.29.2",
#   "provisioningState": "Succeeded",
#   "powerState": "Running",
#   "fqdn": "dev-cluster-rg-dev-abc123.hcp.eastus.azmk8s.io",
#   "nodeResourceGroup": "MC_rg-dev_dev-cluster_eastus",
#   "networkPlugin": "azure",
#   "serviceCidr": "10.0.0.0/16",
#   "dnsServiceIP": "10.0.0.10",
#   "identity": "SystemAssigned"
# }
```
```bash
# Explore the _MC_ resource group — see what AKS created
az resource list --resource-group MC_rg-dev_dev-cluster_eastus -o table

# Example output:
# Name                                Type                                       Location
# ----------------------------------  -----------------------------------------  --------
# aks-agentpool-12345678-vmss         Microsoft.Compute/virtualMachineScaleSets  eastus
# kubernetes                          Microsoft.Network/loadBalancers            eastus
# aks-agentpool-12345678-nsg          Microsoft.Network/networkSecurityGroups    eastus
# aks-agentpool-12345678-routetable   Microsoft.Network/routeTables              eastus
# 6f4e1234-5678-abcd-ef01-23456789ab  Microsoft.Network/publicIPAddresses        eastus
```
```bash
# Examine your worker nodes from the Kubernetes side
kubectl get nodes -o wide

# Example output:
# NAME                                STATUS   ROLES    AGE   VERSION   INTERNAL-IP   OS-IMAGE
# aks-agentpool-12345678-vmss000000   Ready    <none>   5d    v1.29.2   10.224.0.4    Ubuntu 22.04.3 LTS
# aks-agentpool-12345678-vmss000001   Ready    <none>   5d    v1.29.2   10.224.0.5    Ubuntu 22.04.3 LTS
# aks-agentpool-12345678-vmss000002   Ready    <none>   5d    v1.29.2   10.224.0.6    Ubuntu 22.04.3 LTS

# Notice: nodes have NO "master" or "control-plane" role — the control plane is invisible
```
```bash
# Inspect a node's labels — see AKS-specific metadata
kubectl describe node aks-agentpool-12345678-vmss000000 | grep -A 20 "Labels:"

# Key AKS labels you'll see:
#   kubernetes.azure.com/agentpool=agentpool
#   kubernetes.azure.com/cluster=MC_rg-dev_dev-cluster_eastus
#   kubernetes.azure.com/os-sku=Ubuntu
#   kubernetes.azure.com/role=agent
#   node.kubernetes.io/instance-type=Standard_D2s_v5
#   topology.kubernetes.io/region=eastus
#   topology.kubernetes.io/zone=eastus-1
```
```bash
# Check the cluster and kubelet identities
az aks show --resource-group rg-dev --name dev-cluster \
  --query "{clusterIdentity:identity.type, kubeletIdentity:identityProfile.kubeletidentity.objectId}" -o json

# Inspect what system pods are running (AKS-managed components)
kubectl get pods -n kube-system -o wide

# You'll see pods like:
#   coredns-*               (DNS resolution)
#   coredns-autoscaler-*    (scales CoreDNS with cluster size)
#   kube-proxy-*            (per-node networking)
#   metrics-server-*        (resource metrics)
#   cloud-node-manager-*    (Azure cloud integration per node)
#   csi-azuredisk-node-*    (Azure Disk CSI driver)
#   csi-azurefile-node-*    (Azure File CSI driver)
```
💡 kube-system namespace

The kube-system namespace in AKS contains both standard Kubernetes components (CoreDNS, kube-proxy) and AKS-specific add-ons (Azure CSI drivers, cloud-node-manager). Never delete pods in this namespace — AKS manages their lifecycle.

🐛 Debugging Scenarios

Scenario 1: "kubectl commands hang or timeout"

You run kubectl get pods and it hangs for 30 seconds, then returns Unable to connect to the server: net/http: request canceled.

```bash
# Step 1: Verify the API server endpoint resolves
nslookup $(az aks show -g rg-dev -n dev-cluster --query fqdn -o tsv)

# Step 2: Check if API server is reachable
curl -k https://$(az aks show -g rg-dev -n dev-cluster --query fqdn -o tsv):443/healthz
# Expected: "ok"

# Step 3: If using authorized IP ranges, check your current IP is allowed
az aks show -g rg-dev -n dev-cluster --query apiServerAccessProfile.authorizedIpRanges
# If your IP isn't listed, add it (note: this REPLACES the whole list,
# so include any existing ranges too):
MY_IP=$(curl -s https://ifconfig.me)
az aks update -g rg-dev -n dev-cluster --api-server-authorized-ip-ranges "$MY_IP/32"

# Step 4: If private cluster — verify VPN/ExpressRoute connectivity
az aks show -g rg-dev -n dev-cluster --query apiServerAccessProfile.enablePrivateCluster
# If true, you must be inside the VNet or connected via VPN

# Step 5: Check for Azure service health issues
# Azure Portal → Service Health → check AKS in your region
```

Scenario 2: "A node shows NotReady status"

```bash
# Step 1: Check node status
kubectl get nodes
# NAME                              STATUS     ROLES   AGE   VERSION
# aks-agentpool-12345-vmss000002    NotReady   <none>  5d    v1.29.2

# Step 2: Describe the node to find the reason
kubectl describe node aks-agentpool-12345-vmss000002 | grep -A 5 "Conditions:"
# Look for: KubeletNotReady, NetworkUnavailable, MemoryPressure, DiskPressure

# Step 3: Check node events
kubectl get events --field-selector involvedObject.name=aks-agentpool-12345-vmss000002

# Step 4: Check VMSS instance health from Azure side
az vmss list-instances --resource-group MC_rg-dev_dev-cluster_eastus \
  --name aks-agentpool-12345-vmss -o table

# Step 5: If the node is truly unhealthy, cordon and drain it
kubectl cordon aks-agentpool-12345-vmss000002
kubectl drain aks-agentpool-12345-vmss000002 --ignore-daemonsets --delete-emptydir-data

# Step 6: Delete the bad VMSS instance — AKS will auto-replace it
az vmss delete-instances --resource-group MC_rg-dev_dev-cluster_eastus \
  --name aks-agentpool-12345-vmss --instance-ids 2
```

Scenario 3: "kubectl works but newly created LoadBalancer service has no external IP"

```bash
# Step 1: Check service status
kubectl get svc my-service
# EXTERNAL-IP shows <pending>

# Step 2: Describe the service for events
kubectl describe svc my-service | grep -A 10 "Events:"
# Look for: "Error syncing load balancer" or "EnsureLoadBalancer failed"

# Step 3: Check the cluster identity's permissions
# By default AKS grants the cluster identity Contributor on the _MC_ RG;
# a BYO VNet or a public IP in another resource group needs an explicit
# "Network Contributor" assignment there.
az role assignment list --assignee $(az aks show -g rg-dev -n dev-cluster \
  --query identity.principalId -o tsv) --scope "/subscriptions/$(az account show --query id -o tsv)" -o table

# Step 4: Check Azure activity log for failures
az monitor activity-log list --resource-group MC_rg-dev_dev-cluster_eastus \
  --status Failed --offset 1h -o table
```

🎯 Interview Questions

Beginner

Q: What are the two main components of an AKS cluster?

The control plane (managed by Azure — API server, etcd, scheduler, controller manager) and the data plane (worker nodes managed by you, running as Azure VM Scale Sets in your subscription). The control plane is invisible in your subscription; the data plane resources are visible in the _MC_ resource group.

Q: What is the _MC_ resource group in AKS?

The _MC_ (managed cluster) resource group is an auto-created resource group named MC_{rg}_{cluster}_{region}. It contains the infrastructure AKS needs: VM Scale Sets (worker nodes), load balancers, public IPs, NSGs, route tables, and managed disks. Azure manages this group — you should not manually modify resources in it, as AKS may revert changes or break.

Q: Who manages etcd in AKS?

Azure manages etcd entirely. It's hosted on Azure's infrastructure, automatically backed up, encrypted at rest, replicated for HA, and scaled with the cluster. You have zero direct access to etcd — no SSH, no etcdctl, no manual backups. This is one of the biggest operational wins of managed Kubernetes.

Q: What container runtime does AKS use?

AKS uses containerd as the container runtime. Docker (dockershim) was removed in Kubernetes 1.24, and AKS migrated to containerd before that. containerd is lighter, more secure, and the industry standard. You interact with it through Kubernetes APIs — no need for docker commands on nodes.

Q: Can you SSH into AKS worker nodes?

Yes, but it's not common practice. You can use kubectl debug node/<node-name> -it --image=mcr.microsoft.com/cbl-mariner/busybox:2.0 to get a debug pod on the node, or use az aks command invoke to run commands on nodes without setting up SSH. Direct SSH requires configuring SSH keys and is usually only needed for deep troubleshooting.

Intermediate

Q: What is a private AKS cluster and when would you use one?

A private AKS cluster exposes the API server via a private IP address in your VNet (using Azure Private Link), instead of a public endpoint. This means kubectl commands can only come from within the VNet or connected networks (VPN, ExpressRoute, peered VNets). Use this for: production workloads with compliance requirements, sensitive data processing, defense-in-depth security posture, or when corporate policy prohibits public Kubernetes API endpoints.

Q: What types of managed identities does AKS use and why are there multiple?

AKS uses at least two: 1) Cluster identity (system-assigned or user-assigned managed identity) — used by the AKS resource provider to manage Azure infrastructure (create LBs, manage VMSS, update route tables). 2) Kubelet identity (user-assigned) — used by kubelet on each node to pull images from ACR and access node-level Azure resources. The separation follows least-privilege: kubelet doesn't need infrastructure-level permissions, and the cluster identity doesn't need image pull access.

Q: How does Azure AD authentication work with AKS?

With Azure AD integration enabled: 1) User runs az aks get-credentials which configures kubeconfig with an Azure AD auth provider. 2) When kubectl runs a command, it triggers Azure AD login (browser popup or device code flow). 3) Azure AD issues a token. 4) kubectl sends the token to the AKS API server. 5) API server validates the token with Azure AD. 6) Kubernetes RBAC checks if the user's AD group has the required ClusterRole/Role. This replaces client certificate auth with enterprise identity.

Q: What is the difference between kubenet and Azure CNI networking in AKS?

Kubenet: Nodes get IPs from the VNet subnet, but pods get IPs from a separate, internal CIDR. Inter-node pod traffic uses user-defined routes (UDR). Simpler, uses fewer VNet IPs, but pods aren't directly routable from outside the cluster. Azure CNI: Every pod gets an IP directly from the VNet subnet. Pods are first-class VNet citizens — directly routable, can use NSGs, integrate with Azure services natively. Uses more IPs but offers better networking integration. Azure CNI Overlay is a middle ground — pod IPs from an overlay, but with better performance than kubenet.
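To feel the IP cost of Azure CNI in numbers, here is a rough subnet-sizing sketch. The rule of thumb (one IP per node, one per potential pod, plus headroom for a surge node during upgrades) is an approximation; verify against current Azure sizing guidance for your configuration:

```shell
#!/usr/bin/env bash
# Sketch: rough Azure CNI subnet sizing (approximation, not authoritative).
# Rule of thumb: each node consumes 1 IP itself plus max_pods IPs for pods,
# and upgrades temporarily add surge nodes.
cni_ips_needed() {
  local nodes="$1" max_pods="$2" surge="${3:-1}"
  echo $(( (nodes + surge) * (max_pods + 1) ))
}

cni_ips_needed 3 30   # 3 nodes, default 30 pods/node, 1 surge node
# Prints: 124
```

Compare that with kubenet, where the VNet subnet only needs one IP per node; the pod CIDR lives outside the VNet entirely.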

Q: How does AKS handle node OS patching and security updates?

AKS provides multiple mechanisms: 1) Node image upgrades: az aks nodepool upgrade --node-image-only applies the latest patched OS image to nodes (rolling restart). 2) Auto-upgrade channels: Configure the cluster to automatically apply node image updates on a schedule. 3) Unattended upgrades: Security patches are applied nightly to running nodes without reboot. Patches requiring a reboot are flagged — use kured (Kubernetes Reboot Daemon) to safely drain and reboot nodes automatically. Best practice: enable auto node image upgrade + kured for production.
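The node-image workflow described above can be sketched as a dry run (commands are printed, not executed; the cluster and pool names are placeholders):

```shell
#!/usr/bin/env bash
# Sketch: node image patching workflow, printed as a dry run.
# Placeholder names; drop the emit wrapper to run for real.
emit() { echo "+ $*"; }

# 1. Check whether a newer node image is available for the pool:
emit az aks nodepool get-upgrades -g rg-dev --cluster-name dev-cluster \
  --nodepool-name agentpool

# 2. Roll the pool onto the latest patched OS image (rolling restart):
emit az aks nodepool upgrade -g rg-dev --cluster-name dev-cluster \
  --name agentpool --node-image-only
```

For hands-off operation, pair this with an auto-upgrade channel so AKS applies node image updates on its own schedule.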

Scenario-Based

Q: The AKS control plane is experiencing intermittent errors. Your team cannot access the API server. What do you do?

Since the control plane is Azure-managed, you cannot fix it directly. 1) Check Azure Service Health for AKS incidents in your region. 2) Check Azure Status page. 3) If using Standard/Premium tier, you have an SLA — file a support ticket for expedited response. 4) Check if it's actually a networking issue on your side: test API server connectivity from a different network. 5) Check if authorized IP ranges are blocking you. 6) If intermittent, it may be transient — retry kubectl commands with --request-timeout=60s. 7) For future prevention: use the Standard/Premium tier for financially-backed SLA and consider multi-region clusters for critical workloads.

Q: A new node in your AKS cluster isn't joining the cluster. kubectl get nodes doesn't show it, but VMSS shows the instance is running. What do you investigate?

1) Check VMSS instance provisioning state: az vmss list-instances. 2) Check if the node can reach the API server — NSG rules, UDR configuration, private DNS resolution (for private clusters). 3) Check kubelet logs on the node: kubectl debug node/<node> -it --image=busybox -- chroot /host journalctl -u kubelet (the debug pod chroots into the host filesystem to read the host's journal). 4) Common causes: NSG blocking port 443 outbound to API server, DNS resolution failure for the API server FQDN, exhausted subnet IP addresses, node image incompatible with the K8s version, or the VMSS extension (CSE) failed during provisioning. 5) Check VMSS extension status in Azure Portal for bootstrap errors.

Q: Your security team says all Kubernetes clusters must have private API endpoints. How do you architect this with AKS?

Create a private AKS cluster: az aks create --enable-private-cluster. Architecture: 1) API server gets a private IP via Azure Private Link. 2) Deploy a jump box VM or Azure Bastion in the same VNet for kubectl access. 3) For CI/CD: use az aks command invoke (runs kubectl commands through Azure API without direct API server access), or deploy self-hosted GitHub/Azure DevOps agents inside the VNet. 4) For developers: set up Azure VPN Gateway or point-to-site VPN. 5) Configure Private DNS Zone for the cluster's FQDN to resolve inside the VNet. 6) Peer VNets if multiple teams need access from different networks.

Q: Your team accidentally deleted resources in the _MC_ resource group. The cluster is behaving erratically. How do you recover?

1) First, assess damage: kubectl get nodes and kubectl get pods -A to see what's still working. 2) Run az aks update -g rg-dev -n dev-cluster with the same parameters — AKS will attempt to reconcile and recreate missing resources. 3) If the load balancer was deleted, recreate a Service of type LoadBalancer — the cloud controller will create a new one. 4) If VMSS was modified, scale the node pool down and back up: az aks nodepool scale --node-count 0 then scale back. 5) In worst case, if the cluster is unrecoverable, create a new cluster and restore workloads from your GitOps repo. Prevention: Set Azure resource locks on the _MC_ group and use Azure Policy to deny manual modifications.

Q: You need to design an AKS architecture for a financial services company that requires SOC 2 compliance. What Azure-specific components do you include?

Network: Private AKS cluster + Azure Firewall for egress filtering + NSG on subnets + network policies (Calico or Azure NPM). Identity: Azure AD integration with Conditional Access policies + Workload Identity for pod-to-Azure-service auth (no stored secrets). Secrets: Azure Key Vault with CSI driver (secrets never in etcd). Monitoring: Container Insights + Defender for Containers + diagnostic logs to Log Analytics. Compliance: Azure Policy for AKS (enforce pod security standards, require resource limits). Infrastructure: Availability Zones, Standard SLA tier, BYO VNet with pre-defined subnets, customer-managed encryption keys for OS disks.

🏗️ Production Reference Architecture

Here's a real-world AKS architecture used by production teams. Understand each layer and how the components connect:

AKS Production Architecture — Hub-Spoke Model

- Hub VNet
  - Azure Firewall (egress filtering)
  - VPN Gateway / ExpressRoute
  - Azure Bastion (secure SSH/RDP)
  - Private DNS Zones
- (connected to the spoke via VNet Peering)
- Spoke VNet — AKS
  - Subnet: AKS Nodes (Azure CNI)
  - Subnet: AKS Internal LB
  - Subnet: App Gateway + WAF
  - Subnet: Private Endpoints

End-to-End Request Flow

User → HTTPS → Azure Front Door (global LB + WAF) → App Gateway (regional L7 LB) → NGINX Ingress (in-cluster) → Pod (your app)

CI/CD + GitOps Flow

Developer Push → GitHub Actions (build + test) → ACR (image store) → Flux v2 (in-cluster GitOps) → AKS Pods (deployed)

Architecture Decision Record

| Decision | Choice | Why |
| --- | --- | --- |
| Networking | Azure CNI + BYO VNet | Pods need direct VNet IPs for private endpoint access to Azure SQL, Redis, Storage |
| Ingress | App Gateway (AGIC) + NGINX | AGIC for WAF & SSL offload, NGINX for path-based routing & rate limiting inside the cluster |
| Identity | Workload Identity | No secrets in the cluster — pods federate to Managed Identity for Azure resource access |
| Secrets | Key Vault + CSI driver | Centralized secret management, automatic rotation, audit logging |
| CI/CD | GitHub Actions → Flux v2 GitOps | CI builds images; Flux reconciles desired state from Git — no direct kubectl in CI |
| Monitoring | Managed Prometheus + Grafana | Azure-managed, no infra overhead, native AKS integration, pre-built dashboards |
| Egress | Azure Firewall + UDR | All outbound traffic inspected, FQDN-based filtering, compliance requirement |
| Cluster access | Private cluster + VPN | API server not internet-exposed, developers use P2S VPN, CI uses az aks command invoke |
💡 Start Simple, Grow Into This

You don't need all this on day one. Start with a public cluster + NGINX Ingress + ACR. Add private cluster, Firewall, and GitOps as your security and compliance requirements grow. The architecture above is where mature teams end up — it's a target, not a starting point.

🌍 Real-World Use Case

A large enterprise bank deployed AKS with a private cluster architecture.

📝 Summary
