This course assumes you've completed the Kubernetes course. We won't re-explain pods, deployments, services, or kubectl basics — we focus on what's Azure-specific.
What is AKS?
Azure Kubernetes Service — Microsoft's fully managed Kubernetes offering. Understand what Azure handles for you, what you still own, and when AKS is the right choice.
🧒 Simple Explanation (ELI5)
Imagine you want to host a big party. You have two options:
- Build your own venue (self-managed Kubernetes): You buy the land, build the building, install plumbing, electricity, security cameras, hire bouncers, manage the parking lot — then you can start planning the party.
- Rent a hotel ballroom (AKS): The hotel already has the building, plumbing, electricity, security, and parking. You just show up, choose the room size, decorate it, and focus on the party itself.
AKS is the hotel ballroom. Azure builds and maintains the Kubernetes infrastructure (the control plane, upgrades, patching, high availability). You focus on what matters — deploying and running your applications.
🔧 Technical Explanation
Azure Kubernetes Service (AKS) is a managed container orchestration service that runs on Azure. It reduces the complexity of Kubernetes operations by offloading the control plane management to Azure.
What Azure Manages For You
| Component | Who Manages It | Details |
|---|---|---|
| Control Plane (API server, scheduler, controller manager) | Azure | Multi-tenant, highly available, auto-patched |
| etcd | Azure | Managed, backed up, encrypted — you never touch it |
| Kubernetes Upgrades | Azure (you trigger) | az aks upgrade — Azure handles the rolling upgrade |
| Control Plane Scaling | Azure | Auto-scales based on cluster size — no action needed |
| Certificate Rotation | Azure | Automatic rotation of internal certificates |
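The "you trigger, Azure executes" upgrade flow can be sketched with the CLI. The resource group and cluster names (`rg-dev`, `dev-cluster`) and the target version are placeholders:

```shell
# See which Kubernetes versions this cluster can move to
az aks get-upgrades --resource-group rg-dev --name dev-cluster -o table

# Trigger the upgrade — Azure cordons, drains, and replaces nodes
# one at a time while keeping the control plane available
az aks upgrade \
  --resource-group rg-dev \
  --name dev-cluster \
  --kubernetes-version 1.29.2
```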
What You Manage
| Component | Your Responsibility |
|---|---|
| Worker Nodes | Choose VM sizes, manage node pools, handle OS updates (or enable auto-upgrade) |
| Workloads | Deploy, scale, and manage your pods, deployments, services |
| Networking | Configure VNets, NSGs, load balancers, ingress controllers |
| Security | RBAC, pod security, network policies, secret management |
| Monitoring | Set up Container Insights, Prometheus, alerts |
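Two of the responsibilities above (node OS updates and monitoring) can be delegated back to Azure with one command each — a sketch, with `rg-dev`/`dev-cluster` as placeholder names:

```shell
# Opt in to automatic patch-level Kubernetes upgrades for the cluster
az aks update --resource-group rg-dev --name dev-cluster \
  --auto-upgrade-channel patch

# Enable the Container Insights monitoring add-on
az aks enable-addons --resource-group rg-dev --name dev-cluster \
  --addons monitoring
```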
AKS Cost Model
AKS does not charge for the Kubernetes control plane. You only pay for the worker node VMs, storage, networking, and any add-ons. This is a major cost advantage over self-managed clusters where you'd provision and pay for master node VMs.
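Since the bill comes entirely from worker nodes and their attachments, a quick cost sanity check is to list the node pools and their VM sizes (cluster names are placeholders):

```shell
# Show what you're actually paying for: pool name, VM size, node count
az aks nodepool list --resource-group rg-dev --cluster-name dev-cluster \
  --query "[].{name:name, vmSize:vmSize, count:count, mode:mode}" -o table
```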
AKS vs the Alternatives
| Feature | AKS (Azure) | EKS (AWS) | GKE (Google Cloud) | Self-Managed K8s |
|---|---|---|---|---|
| Control Plane Cost | Free (Free tier); $0.10/hr for Standard/Premium | $0.10/hr (~$73/mo) | $0.10/hr (free tier credit covers one cluster) | You pay for master VMs |
| Control Plane Management | Fully managed | Fully managed | Fully managed | You manage everything |
| Upgrade Complexity | One command | One command | One command (Autopilot: auto) | Manual, risky process |
| Azure AD Integration | Native | N/A | N/A | Manual setup |
| Deep Azure Integration | ACR, Key Vault, Monitor, VNet, AD | ECR, Secrets Manager, VPC | Artifact Registry, Secret Manager | Manual for all |
| Learning Curve | Moderate | Moderate | Lowest (Autopilot) | Highest |
When to Use AKS vs Other Azure Compute
| Service | Best For | Choose Over AKS When |
|---|---|---|
| AKS | Complex microservices, full K8s feature set, multi-container workloads | — |
| Azure Container Apps | Event-driven microservices, Dapr, simpler ops | You don't need full K8s control, want serverless scaling |
| Azure App Service | Web apps, APIs, simple container hosting | Single-container apps, no orchestration needed |
| Azure Container Instances | One-off jobs, burst compute, sidecar containers | Short-lived tasks, no long-running services |
| Azure Functions | Event-driven serverless code | Simple event handlers, per-execution billing desired |
🌉 Coming from Kubernetes? Here's What Changes
If you completed the Kubernetes course, you already know pods, deployments, services, and kubectl. Here's exactly what's different on AKS:
| What You Know (K8s) | What Changes (AKS) | Impact |
|---|---|---|
| kubeadm init to create a cluster | az aks create — one command, no master node setup | No more bootstrapping control planes |
| You manage etcd backups | Azure handles etcd — you never touch it | One less thing to worry about (and break) |
| Service type LoadBalancer needs MetalLB or NodePort | Azure auto-creates an Azure Load Balancer with a public IP | Services get real IPs instantly |
| PersistentVolumes need manual provisioning | AKS auto-provisions Azure Disks and Azure Files via StorageClasses | Just create a PVC — Azure creates the disk |
| Docker images stored locally or on Docker Hub | Azure Container Registry (ACR) with --attach-acr | Private registry, integrated auth, cloud builds |
| RBAC with static kubeconfig certs | Azure AD integration — kubectl triggers browser login | Enterprise SSO, MFA, conditional access |
| kubectl commands are the same | kubectl commands are the same! | Your muscle memory still works |
Everything you learned about pods, deployments, services, configmaps, secrets, and kubectl works exactly the same on AKS. AKS is just Kubernetes with an Azure service layer underneath. The K8s API is identical — only the infrastructure provisioning is different.
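Two of the changes in the table above are worth seeing concretely — attaching ACR and dynamic storage provisioning. A sketch, where `myregistry` is a placeholder ACR name:

```shell
# Attach an existing ACR so nodes can pull images without imagePullSecrets
az aks update --resource-group rg-dev --name dev-cluster \
  --attach-acr myregistry

# List the StorageClasses AKS pre-installs for dynamic provisioning —
# creating a PVC against one of these makes Azure create the disk
kubectl get storageclass
# Typical entries include managed-csi (Azure Disk) and azurefile-csi (Azure Files)
```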
📊 AKS Managed Boundary
⌨️ Hands-on
These commands assume you have the Azure CLI installed and an AKS cluster running. If you don't have one yet, the Cluster Creation lesson covers setup step by step.
# Login to Azure (if not already logged in)
az login

# List all AKS clusters in your subscription
az aks list -o table

# Example output:
# Name          Location  ResourceGroup  KubernetesVersion  ProvisioningState
# ------------  --------  -------------  -----------------  -----------------
# dev-cluster   eastus    rg-dev         1.29.2             Succeeded
# prod-cluster  westus2   rg-prod        1.28.5             Succeeded

# Get detailed info about a specific cluster
az aks show --resource-group rg-dev --name dev-cluster -o table

# Key fields to look for:
# - kubernetesVersion: current K8s version
# - provisioningState: should be "Succeeded"
# - powerState: Running or Stopped
# - fqdn: API server fully qualified domain name

# Get credentials to connect kubectl to your AKS cluster
az aks get-credentials --resource-group rg-dev --name dev-cluster

# Verify connection
kubectl cluster-info

# Example output:
# Kubernetes control plane is running at https://dev-cluster-rg-dev-abc123.hcp.eastus.azmk8s.io:443
# CoreDNS is running at https://dev-cluster-rg-dev-abc123.hcp.eastus.azmk8s.io:443/api/v1/...

# Check the Kubernetes version running on the server
# (the old --short flag was removed in kubectl 1.28)
kubectl version

# List the nodes — these are your worker VMs
kubectl get nodes -o wide

# Check which AKS versions are available in your region
az aks get-versions --location eastus -o table

# Example output:
# KubernetesVersion    Upgrades
# -------------------  -------------------------
# 1.30.0               None available
# 1.29.2               1.30.0
# 1.28.5               1.29.2
# 1.27.9               1.28.5
Notice the API server URL ends with .azmk8s.io. This is Azure's managed domain for AKS API servers. You never provision or manage this endpoint — Azure handles DNS, TLS certificates, and load balancing for it.
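You can confirm this yourself by pulling the FQDN out of the cluster metadata and resolving it — a sketch with placeholder names:

```shell
# Fetch the managed API server FQDN and confirm Azure's DNS resolves it
FQDN=$(az aks show --resource-group rg-dev --name dev-cluster \
  --query fqdn -o tsv)
nslookup "$FQDN"
```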
🐛 Debugging Scenarios
Scenario 1: "I can't connect to my AKS cluster"
You run kubectl get pods and get: Unable to connect to the server: dial tcp: lookup ... no such host
# Step 1: Check if you have the right context
kubectl config current-context
# Expected: "dev-cluster". If wrong or empty, continue:
# Step 2: Re-fetch credentials
az aks get-credentials --resource-group rg-dev --name dev-cluster --overwrite-existing
# Step 3: Check if the cluster is actually running
az aks show --resource-group rg-dev --name dev-cluster --query powerState
# If output is { "code": "Stopped" } — the cluster is stopped!
az aks start --resource-group rg-dev --name dev-cluster
# Step 4: Verify your kubeconfig file
kubectl config view --minify
# Check that the server URL matches the cluster's FQDN
# Step 5: If using a private cluster, ensure VPN/ExpressRoute is connected
# Private clusters don't expose a public API endpoint
az aks show --resource-group rg-dev --name dev-cluster \
--query apiServerAccessProfile.enablePrivateCluster
Scenario 2: "az aks list returns empty but I know I have a cluster"
# Step 1: Verify you're logged into the right subscription
az account show --query "{name:name, id:id}" -o table
# Step 2: List all subscriptions and switch if needed
az account list -o table
az account set --subscription "correct-subscription-id"
# Step 3: Now try again
az aks list -o table
# Step 4: If still empty, check if the cluster was deleted
# Check Activity Log in Azure Portal → Resource Group → Activity Log
🎯 Interview Questions
Beginner
What is AKS?
AKS is a managed Kubernetes service on Azure. It offloads the control plane (API server, etcd, scheduler, controller manager) to Azure so you don't have to provision, manage, or patch master nodes. You only manage the worker nodes and your application workloads. The control plane is free — you pay only for the compute resources (VMs) running your worker nodes.
What is the difference between self-managed and managed Kubernetes?
Self-managed: You install Kubernetes yourself (kubeadm, kops, or from scratch). You manage the control plane VMs, etcd backups, certificate rotation, upgrades, and high availability. This requires deep K8s expertise and significant operational overhead.
Managed (AKS): Azure handles all control plane operations. You trigger upgrades but Azure executes them. etcd is automatically backed up and encrypted. The control plane is multi-tenant and highly available. You focus on deploying workloads, not operating Kubernetes itself.
Do you pay for the AKS control plane?
No. The standard AKS control plane is free. You only pay for the worker node VMs, managed disks, networking (load balancers, public IPs), and any Azure add-ons you enable. There is an optional paid Uptime SLA tier (AKS Standard/Premium) for financially backed availability guarantees on the control plane, but the base control plane itself is free.
How does AKS compare to EKS?
Both are managed Kubernetes services, but key differences: Cost: AKS control plane is free; EKS charges $0.10/hour (~$73/month) per cluster. Identity: AKS integrates natively with Azure AD for RBAC; EKS uses IAM roles. Networking: AKS supports Azure CNI and kubenet; EKS uses VPC CNI. Ecosystem: AKS tightly integrates with Azure Monitor, ACR, Key Vault; EKS integrates with CloudWatch, ECR, Secrets Manager. Choose based on your existing cloud investments.
What does the az aks get-credentials command do?
It downloads the cluster's kubeconfig and merges it into your local ~/.kube/config file. This configures kubectl to communicate with the AKS cluster's API server. It sets the cluster endpoint, authentication credentials (typically via Azure AD or client certificates), and sets the current context. Without this, kubectl has no idea how to reach your AKS cluster.
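The kubeconfig merge is easy to verify after fetching credentials — a sketch with placeholder cluster names:

```shell
az aks get-credentials --resource-group rg-dev --name dev-cluster

# The new context becomes current, alongside any existing contexts
kubectl config get-contexts

# Show the API server endpoint the merged context points at
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
```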
Intermediate
When is AKS the wrong choice?
AKS is overkill when: 1) You have a single container or simple web app — use Azure App Service or Container Apps instead. 2) You need fully serverless event-driven processing — use Azure Functions. 3) Short-lived batch jobs — consider Azure Container Instances. 4) Your team lacks Kubernetes expertise and the workload doesn't justify the learning curve. 5) Strict compliance requires complete control over the control plane — consider self-managed or Azure Dedicated Host clusters.
What SLA does AKS offer?
AKS has three tiers: Free tier: No SLA on the control plane (best-effort availability, still very reliable). Standard tier: Financially backed 99.95% SLA for the API server (or 99.99% with Availability Zones). Premium tier: Includes Standard SLA plus long-term support (LTS) versions and additional features. The SLA covers the control plane only — worker node availability depends on your VM configuration and Availability Zones.
How do Availability Zones work with AKS?
When you create an AKS cluster with Availability Zones enabled, worker nodes are distributed across zones (e.g., Zone 1, 2, 3 in a region). If one zone fails, nodes in other zones keep running. The AKS control plane is automatically zone-redundant. To use this effectively: spread node pools across zones, use zone-redundant storage (ZRS disks), and configure pod topology spread constraints so pods aren't all in one zone.
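Spreading a node pool across zones and checking the result can be sketched like this (pool and cluster names are placeholders; the region must support Availability Zones):

```shell
# Create a node pool spread across three zones
az aks nodepool add --resource-group rg-prod --cluster-name prod-cluster \
  --name zonedpool --node-count 3 --zones 1 2 3

# Verify which zone each node landed in via its topology label
kubectl get nodes -L topology.kubernetes.io/zone
```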
What happens when Azure updates or patches the control plane?
Azure patches and updates the control plane transparently. You may experience brief API server unavailability (seconds, not minutes) during maintenance windows. With the Standard SLA tier, Azure guarantees 99.95% uptime. You can configure Planned Maintenance Windows to control when non-urgent updates happen (e.g., weekends only). Critical security patches may be applied outside your window.
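A planned maintenance window like the one described above can be configured with a single command — a sketch with placeholder names:

```shell
# Restrict non-urgent maintenance to early Saturday mornings
az aks maintenanceconfiguration add \
  --resource-group rg-prod --cluster-name prod-cluster \
  --name default --weekday Saturday --start-hour 2
```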
Can AKS run on-premises or in air-gapped environments?
Not directly. AKS is a cloud-managed service that requires connectivity to Azure's control plane. For disconnected or edge scenarios, consider: Azure Arc-enabled Kubernetes (connect on-premises or edge clusters to Azure), AKS on Azure Stack HCI (run AKS on your own infrastructure), or AKS Edge Essentials (lightweight K8s on edge devices). These allow Kubernetes management with limited Azure connectivity.
Scenario-Based
A 3-person startup asks whether to self-manage Kubernetes or use AKS. What do you advise?
AKS, without question. With 3 developers, you cannot afford to dedicate anyone to Kubernetes operations full-time. Self-managed K8s requires: provisioning an HA control plane (3 master nodes minimum), managing etcd backups, handling certificate rotation, orchestrating upgrades, patching OS vulnerabilities on master nodes. That's easily a full-time job. AKS eliminates all of this with a free managed control plane. Your team focuses on application development instead of infrastructure. The only trade-off is less control over control plane configuration — which a 3-person team doesn't need anyway.
When would you pick AKS over Azure Container Apps?
Container Apps is excellent for event-driven microservices with variable traffic, but AKS is better when you need: 1) Custom networking (VNet integration, network policies, custom CNI). 2) Stateful workloads (StatefulSets, persistent volumes). 3) Advanced scheduling (node affinity, taints/tolerations, GPU nodes). 4) Full ecosystem compatibility (service meshes, custom operators, Helm charts). 5) Fine-grained cost control (reserved instances, spot node pools). Container Apps abstracts Kubernetes away — great if you don't need it, but limiting when you do.
How would you plan a production upgrade from Kubernetes 1.27 to 1.28?
1) Check available versions: az aks get-upgrades --resource-group rg-prod --name prod-cluster. 2) Read the AKS release notes for breaking changes between 1.27 → 1.28. 3) Test in a non-prod cluster first: create a dev cluster on 1.28, deploy your workloads, run integration tests. 4) Check deprecated APIs: use a tool such as kubent or the kubepug kubectl plugin to find deprecated API versions in your manifests. 5) Schedule a maintenance window. 6) Upgrade the control plane first (az aks upgrade --kubernetes-version 1.28.5 --control-plane-only), then the node pools. 7) Monitor the upgrade, verify node rollout completes. 8) Validate all workloads are healthy post-upgrade.
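The upgrade steps above can be sketched as commands (cluster names and node pool name are placeholders; kubent is a third-party tool, github.com/doitintl/kube-no-trouble):

```shell
# 1) What can we upgrade to?
az aks get-upgrades --resource-group rg-prod --name prod-cluster -o table

# 4) Scan live workloads for APIs removed in the target version
kubent

# 6) Upgrade the control plane first, then each node pool
az aks upgrade --resource-group rg-prod --name prod-cluster \
  --kubernetes-version 1.28.5 --control-plane-only
az aks nodepool upgrade --resource-group rg-prod --cluster-name prod-cluster \
  --name nodepool1 --kubernetes-version 1.28.5
```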
Your company runs clusters on both AKS and EKS. How do you keep tooling consistent?
Standardize at the Kubernetes layer, not the cloud layer. Use: 1) Helm charts for application deployment (works on both). 2) GitOps with ArgoCD or Flux (cloud-agnostic). 3) Terraform modules for infrastructure (separate modules for AKS and EKS, common interface). 4) OPA/Gatekeeper for policy enforcement (same policies, both clusters). 5) Prometheus + Grafana for monitoring (cloud-agnostic). 6) Azure Arc to manage all clusters from one pane. The goal is: apps are portable (Helm + YAML), infrastructure is cloud-specific but Terraform-managed.
How would you cut costs on a dev cluster that's only used during business hours?
Use the AKS Start/Stop feature: az aks stop --resource-group rg-dev --name dev-cluster at 7 PM, az aks start at 8 AM. Automate with Azure Automation runbooks or a Logic App on a schedule. This deallocates all nodes and stops billing for compute. The cluster metadata and configuration are preserved. Additionally: use spot node pools for non-critical workloads (up to 90% savings), use smaller VM sizes (Standard_B2s for dev), and set up the cluster autoscaler with aggressive scale-down (--scale-down-delay-after-add=5m).
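The stop/start and spot-pool ideas above look like this in practice — a sketch with placeholder names; in real use the stop/start pair would run on a schedule, not back to back:

```shell
# Deallocate all nodes at night, bring them back in the morning
az aks stop --resource-group rg-dev --name dev-cluster
az aks start --resource-group rg-dev --name dev-cluster

# Add a spot node pool for interruptible workloads
# (--spot-max-price -1 means "pay up to the current on-demand price")
az aks nodepool add --resource-group rg-dev --cluster-name dev-cluster \
  --name spotpool --priority Spot --eviction-policy Delete \
  --spot-max-price -1 --node-count 1
```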
🌍 Real-World Use Case
A fintech startup with 8 engineers was running Kubernetes on 3 self-managed Azure VMs:
- Before AKS: One engineer spent 40% of their time managing the control plane — etcd backups, certificate rotation, K8s upgrades (which once caused a 4-hour outage), and OS patching. Cost: 3 master VMs + 3 worker VMs = 6 VMs total.
- After AKS: Control plane is free and managed. That engineer now builds features full-time. Cost: 3 worker node VMs only (50% compute cost reduction). Upgrades are one command with zero-downtime rolling updates. etcd has never been a concern since.
- Bonus: They enabled cluster autoscaler and spot node pools for batch workloads, saving an additional 35% on compute. Total infrastructure cost dropped by ~65%.
📝 Summary
- AKS is Azure's managed Kubernetes service — Azure owns the control plane, you own the workloads
- The control plane (API server, etcd, scheduler) is free and fully managed
- You pay only for worker node VMs, storage, and networking
- AKS integrates deeply with Azure AD, ACR, Key Vault, and Azure Monitor
- Use AKS for microservices at scale; consider Container Apps or App Service for simpler workloads
- Start/Stop feature saves money on dev clusters; spot node pools save on non-critical workloads