Node Pools
System vs user node pools, VM size selection, spot instances for massive savings, taints, labels, and multi-pool strategies for production workloads.
🧒 Simple Explanation (ELI5)
Think of node pools like different teams in a company:
- The System pool is the IT department — they keep the lights on, manage the network, run the mail server. You need them running at all times. (These run CoreDNS, kube-proxy, metrics-server.)
- The User pool is a product team — they work on the actual product. You can add more people when there's a big project, and scale down during quiet periods.
- You might have a GPU pool for AI/ML engineers — expensive specialists who only come in when there's model training work.
- A Spot pool is like hiring temp workers at a huge discount — they're cheap but can be sent home with no notice if the company needs the desks for someone else.
You wouldn't put the IT team and the factory workers in the same room. Similarly, you separate Kubernetes system components from your application workloads into different node pools.
🔧 Technical Explanation
System vs User Node Pools
Every AKS cluster requires at least one system node pool. System node pools run critical Kubernetes components.
| Aspect | System Node Pool | User Node Pool |
|---|---|---|
| Purpose | Runs Kubernetes system pods (CoreDNS, metrics-server, kube-proxy, CSI drivers) | Runs your application workloads |
| Required? | Yes — at least one system pool must exist | Optional (but recommended for production) |
| Default taint | CriticalAddonsOnly=true:NoSchedule (recommended — AKS doesn't apply it automatically; add it with --node-taints) | None |
| Minimum nodes | 1 (dev) or 3 (production with zones) | 0 (can scale to zero) |
| Scale to zero? | No — cluster breaks without system pods | Yes — great for cost savings |
| VM size recommendation | Standard_D2s_v5 (2 vCPU, 8 GB) | Depends on workload |
VM Sizes for Different Workloads
| Workload Type | Recommended VM Series | Example SKU | Specs | ~Monthly Cost |
|---|---|---|---|---|
| System pool | Dsv5 (general purpose) | Standard_D2s_v5 | 2 vCPU, 8 GB RAM | ~$70 |
| Web APIs / microservices | Dsv5 (general purpose) | Standard_D4s_v5 | 4 vCPU, 16 GB RAM | ~$140 |
| Memory-intensive (caching, search) | Esv5 (memory optimized) | Standard_E4s_v5 | 4 vCPU, 32 GB RAM | ~$185 |
| CPU-intensive (batch, encoding) | Fsv2 (compute optimized) | Standard_F8s_v2 | 8 vCPU, 16 GB RAM | ~$245 |
| AI/ML training | NC-series (GPU) | Standard_NC6s_v3 | 6 vCPU, 112 GB RAM, 1× V100 | ~$2,200 |
| Dev/test | Bs-series (burstable) | Standard_B2s | 2 vCPU, 4 GB RAM | ~$30 |
| Windows workloads | Dsv5 (general purpose) | Standard_D4s_v5 | 4 vCPU, 16 GB RAM | ~$180 (Windows license) |
Spot Node Pools
Spot instances use Azure's spare compute capacity at up to 90% discount. The catch: Azure can evict your nodes with 30 seconds notice when it needs the capacity back.
| Setting | Description |
|---|---|
| --priority Spot | Creates a spot node pool (discounted VMs) |
| --eviction-policy Delete | Evicted VMs are deleted (recommended). Alternative: Deallocate (preserves the OS disk) |
| --spot-max-price -1 | Pay the current market price (recommended). Or set a cap, e.g. --spot-max-price 0.05 |
Only run workloads on spot nodes that can tolerate interruption: batch jobs, CI/CD runners, stateless workers, dev/test, data processing. Never run your production API or database on spot nodes. AKS applies a kubernetes.azure.com/scalesetpriority:spot taint — your pods need a matching toleration.
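The savings math is straightforward. A back-of-the-envelope sketch — the price and discount figures below are illustrative assumptions, not quotes (actual spot prices float with regional demand):

```bash
# Rough spot-vs-on-demand comparison (illustrative numbers only)
on_demand=140   # ~$/month for Standard_D4s_v5, from the table above
discount=70     # spot discounts commonly land in the 60-90% range
spot=$(( on_demand * (100 - discount) / 100 ))
echo "spot: ~\$${spot}/month vs ~\$${on_demand}/month on-demand"
```

At a 70% discount the D4s_v5 drops from ~$140 to ~$42 a month; at 90% it would be ~$14 — which is why batch fleets on spot are so much cheaper.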
Node Pool Scaling Options
| Method | How | Use Case |
|---|---|---|
| Manual scaling | az aks nodepool scale --node-count 5 | Known capacity needs, planned events |
| Cluster autoscaler | --enable-cluster-autoscaler --min-count 2 --max-count 10 | Variable traffic, auto-adjust to demand |
| Scale to zero | --min-count 0 (user pools only) | GPU/spot pools that aren't always needed |
Taints and Labels
Taints and labels on node pools control which pods schedule where:
- Labels — Key-value metadata on nodes. Pods use nodeSelector or nodeAffinity to target specific node pools.
- Taints — Repel pods unless they have a matching toleration. Used to reserve node pools for specific workloads.
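Putting both mechanisms together: a pod that should land on a reserved pool needs a nodeSelector for the pool's label and a toleration for its taint. A sketch, assuming a GPU pool labeled agentpool=gpupool and tainted sku=gpu:NoSchedule (the names used in the Hands-on section):

```yaml
# Pod spec fragment (illustrative pool/taint names)
spec:
  nodeSelector:
    agentpool: gpupool      # label match — steer the pod to the GPU pool
  tolerations:
    - key: "sku"            # taint match — allow the pod onto the tainted pool
      operator: "Equal"
      value: "gpu"
      effect: "NoSchedule"
```

Without the toleration the pod can never schedule there; without the nodeSelector it merely may schedule elsewhere.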
OS SKU Options
| OS SKU | Description | When to Use |
|---|---|---|
| Ubuntu | Default Linux OS for AKS nodes. Battle-tested, broad compatibility. | Default choice for most workloads |
| AzureLinux | Microsoft's Linux distro (formerly CBL-Mariner). Smaller image, faster boot, more secure. | Performance-sensitive or security-hardened clusters |
| Windows2022 | Windows Server node pool. Runs Windows containers. | .NET Framework apps, Windows-only workloads |
Max Pods Per Node
The --max-pods setting determines how many pods a single node can run:
- kubenet default: 110
- Azure CNI default: 30 (because each pod consumes a VNet IP)
- Recommended for production: 110 with Azure CNI Overlay, or 50-110 with Azure CNI (plan subnet size accordingly)
- This value is set at node pool creation and cannot be changed later — you must create a new node pool to change it
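Because the value is fixed at creation, subnet sizing has to be planned up front with traditional Azure CNI. A quick sizing sketch (node and pod counts are illustrative):

```bash
# Estimate VNet IPs needed by a pool under traditional (non-overlay) Azure CNI:
# each node consumes 1 IP itself plus max-pods pre-allocated pod IPs.
nodes=8        # planned maximum node count (autoscaler --max-count)
max_pods=110   # the pool's --max-pods setting
ips_needed=$(( nodes * (max_pods + 1) ))
echo "IPs needed: $ips_needed"

# Smallest common subnet that fits (Azure reserves 5 IPs per subnet)
for prefix in 24 23 22 21 20; do
  usable=$(( (1 << (32 - prefix)) - 5 ))
  if [ "$usable" -ge "$ips_needed" ]; then
    echo "plan at least a /$prefix subnet ($usable usable IPs)"
    break
  fi
done
```

Leave headroom beyond this estimate: upgrades add surge nodes, and a full subnet blocks scale-out.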
📊 Multi-Pool Architecture
(Diagram: system, app, spot, and GPU pools side by side — pods bound for the GPU pool carry toleration: gpu=true and nodeSelector: pool=gpu.)
⌨️ Hands-on
List Existing Node Pools
# List all node pools in your cluster
az aks nodepool list --resource-group rg-dev --cluster-name dev-cluster -o table

# Example output:
# Name       OsType    VmSize           Count    Mode      OrchestratorVersion
# ---------- --------  ---------------  -------  --------  -------------------
# agentpool  Linux     Standard_D2s_v5  2        System    1.29.2
Add a User Node Pool
# Add a user pool for application workloads
az aks nodepool add \
  --resource-group rg-dev \
  --cluster-name dev-cluster \
  --name apppool \
  --mode User \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --max-pods 110 \
  --zones 1 2 3 \
  --labels workload=app environment=dev \
  --enable-cluster-autoscaler \
  --min-count 2 \
  --max-count 8 \
  --os-sku AzureLinux

# Verify the pool was added
az aks nodepool list -g rg-dev --cluster-name dev-cluster -o table
kubectl get nodes -l agentpool=apppool
Add a Spot Node Pool
# Add a spot pool for batch/CI workloads (up to 90% cheaper)
az aks nodepool add \
  --resource-group rg-dev \
  --cluster-name dev-cluster \
  --name spotpool \
  --mode User \
  --priority Spot \
  --eviction-policy Delete \
  --spot-max-price -1 \
  --node-count 2 \
  --node-vm-size Standard_D4s_v5 \
  --max-pods 110 \
  --labels workload=batch priority=spot \
  --node-taints "kubernetes.azure.com/scalesetpriority=spot:NoSchedule" \
  --enable-cluster-autoscaler \
  --min-count 0 \
  --max-count 10

# AKS automatically adds the spot taint, but adding it explicitly ensures clarity
# To schedule pods on spot nodes, add this toleration to your pod spec:
# tolerations:
# - key: "kubernetes.azure.com/scalesetpriority"
#   operator: "Equal"
#   value: "spot"
#   effect: "NoSchedule"
Add a GPU Node Pool
# Add a GPU pool for ML workloads (scale to zero when not training)
az aks nodepool add \
  --resource-group rg-dev \
  --cluster-name dev-cluster \
  --name gpupool \
  --mode User \
  --node-count 0 \
  --node-vm-size Standard_NC6s_v3 \
  --node-taints "sku=gpu:NoSchedule" \
  --labels workload=ml accelerator=nvidia \
  --enable-cluster-autoscaler \
  --min-count 0 \
  --max-count 3

# When a pod with a matching toleration + GPU resource request appears,
# the autoscaler spins up a GPU node. When the job finishes, it scales back to 0.
Scale a Node Pool
# Manual scale — set exact node count
az aks nodepool scale \
  --resource-group rg-dev \
  --cluster-name dev-cluster \
  --name apppool \
  --node-count 5

# Update autoscaler limits
az aks nodepool update \
  --resource-group rg-dev \
  --cluster-name dev-cluster \
  --name apppool \
  --update-cluster-autoscaler \
  --min-count 3 \
  --max-count 15

# Disable autoscaler (switch to manual)
az aks nodepool update \
  --resource-group rg-dev \
  --cluster-name dev-cluster \
  --name apppool \
  --disable-cluster-autoscaler
Inspect Nodes and Labels
# List nodes with their pool, VM size, and zone
kubectl get nodes -o custom-columns=\
NAME:.metadata.name,\
POOL:.metadata.labels.agentpool,\
VM:.metadata.labels.node\\.kubernetes\\.io/instance-type,\
ZONE:.metadata.labels.topology\\.kubernetes\\.io/zone,\
STATUS:.status.conditions[-1].type

# Example output:
# NAME                            POOL       VM               ZONE      STATUS
# aks-agentpool-12345-vmss000000  agentpool  Standard_D2s_v5  eastus-1  Ready
# aks-agentpool-12345-vmss000001  agentpool  Standard_D2s_v5  eastus-2  Ready
# aks-apppool-67890-vmss000000    apppool    Standard_D4s_v5  eastus-1  Ready
# aks-apppool-67890-vmss000001    apppool    Standard_D4s_v5  eastus-2  Ready
# aks-apppool-67890-vmss000002    apppool    Standard_D4s_v5  eastus-3  Ready

# Check taints on a node
kubectl describe node aks-agentpool-12345-vmss000000 | grep -A 3 "Taints:"
# Taints: CriticalAddonsOnly=true:NoSchedule

# List all labels on a specific node pool's nodes
kubectl get nodes -l agentpool=spotpool --show-labels
Deploy a Pod to a Specific Node Pool
# deploy-to-apppool.yaml — target the app pool using nodeSelector
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-api
spec:
replicas: 3
selector:
matchLabels:
app: web-api
template:
metadata:
labels:
app: web-api
spec:
nodeSelector:
agentpool: apppool # targets the apppool node pool
containers:
- name: web-api
image: myacr.azurecr.io/web-api:v1.2
resources:
requests:
cpu: 500m
memory: 512Mi
limits:
cpu: "1"
memory: 1Gi
# batch-job-on-spot.yaml — schedule job on spot nodes with toleration
apiVersion: batch/v1
kind: Job
metadata:
name: data-processing
spec:
template:
spec:
nodeSelector:
agentpool: spotpool # target spot pool
tolerations:
- key: "kubernetes.azure.com/scalesetpriority"
operator: "Equal"
value: "spot"
effect: "NoSchedule"
containers:
- name: processor
image: myacr.azurecr.io/data-processor:v2.0
resources:
requests:
cpu: "2"
memory: 4Gi
restartPolicy: OnFailure
Upgrade a Node Pool
# Upgrade a specific node pool to a new K8s version
az aks nodepool upgrade \
  --resource-group rg-dev \
  --cluster-name dev-cluster \
  --name apppool \
  --kubernetes-version 1.30.0

# Upgrade node image only (no K8s version change — just OS patches)
az aks nodepool upgrade \
  --resource-group rg-dev \
  --cluster-name dev-cluster \
  --name apppool \
  --node-image-only

# Check current node image version
az aks nodepool show -g rg-dev --cluster-name dev-cluster -n apppool \
  --query nodeImageVersion -o tsv
🐛 Debugging Scenarios
Scenario 1: "Pods stuck in Pending — no matching node pool"
# Step 1: Check the pod events
kubectl describe pod <pod-name> | grep -A 10 "Events:"
# Look for: "0/5 nodes are available: 3 node(s) had untolerated taint..."
# Step 2: Check what nodeSelector/tolerations the pod requires
kubectl get pod <pod-name> -o jsonpath='{.spec.nodeSelector}'
kubectl get pod <pod-name> -o jsonpath='{.spec.tolerations}'
# Step 3: Check what taints exist on nodes
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
# Example output:
# NAME TAINTS
# aks-agentpool-...-vmss000000 [map[effect:NoSchedule key:CriticalAddonsOnly value:true]]
# aks-apppool-...-vmss000000 <none>
# Step 4: Common causes:
# - Pod has nodeSelector for a pool that scaled to zero → wait for autoscaler
# - Pod needs GPU but targets wrong pool → fix nodeSelector
# - All user pools are tainted and pod has no toleration → add toleration
# - Pod requests more CPU/memory than any node can provide → use larger VM size
# Step 5: If the autoscaler should scale up but doesn't, check its status.
# On AKS the autoscaler runs on the managed control plane (there's no pod
# to get logs from), but it publishes a status ConfigMap:
kubectl -n kube-system get configmap cluster-autoscaler-status -o yaml
Scenario 2: "Spot node was evicted — pods rescheduled but some data was lost"
# Step 1: Confirm eviction happened
kubectl get events --sort-by='.lastTimestamp' | grep -i "evict\|preempt\|spot"

# Step 2: Check which nodes were affected
kubectl get nodes -l kubernetes.azure.com/scalesetpriority=spot -o wide
# Evicted nodes will be gone; replacements may already be provisioning

# Step 3: Check pod status — pods should reschedule on surviving nodes
kubectl get pods -o wide | grep -v Running

# Step 4: Data loss root cause — spot pods used emptyDir or local volumes
# Fix: Use PersistentVolumeClaims with Azure Disks or Azure Files
# These survive node evictions because the data is on Azure storage, not the VM

# Step 5: Add a PodDisruptionBudget to ensure minimum availability
# apiVersion: policy/v1
# kind: PodDisruptionBudget
# metadata:
#   name: processor-pdb
# spec:
#   minAvailable: 1
#   selector:
#     matchLabels:
#       app: data-processor

# Step 6: Ensure your workload handles SIGTERM gracefully
# Spot evictions send SIGTERM → 30 second grace period → SIGKILL
# Your app should checkpoint or save state within those 30 seconds
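For Step 6, a minimal sketch of the graceful-termination pattern in shell — the worker below checkpoints when SIGTERM arrives, the same signal a spot eviction (or a kubelet drain) delivers first. The checkpoint path and timings are purely illustrative:

```bash
# Hypothetical worker that saves progress on SIGTERM, then exits cleanly
CHECKPOINT="${TMPDIR:-/tmp}/spot-checkpoint"
rm -f "$CHECKPOINT"

worker() {
  local processed=0
  # On SIGTERM: persist state within the grace period, then exit 0
  trap 'echo "$processed" > "$CHECKPOINT"; exit 0' TERM
  while :; do
    processed=$((processed + 1))   # stand-in for real batch work
    sleep 0.1
  done
}

worker &            # run in the background so we can signal it
pid=$!
sleep 0.5
kill -TERM "$pid"   # simulate the eviction notice
wait "$pid"
echo "checkpoint: $(cat "$CHECKPOINT") items processed"
```

On restart, the replacement pod reads the checkpoint and resumes instead of starting over.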
Scenario 3: "Node pool add fails with 'insufficient quota'"
# Step 1: Check your current quota usage
az vm list-usage --location eastus -o table | grep -i "total\|standard D\|standard NC"
# Example output:
# CurrentValue Limit Name
# 12 20 Total Regional vCPUs
# 8 20 Standard DSv5 Family vCPUs
# 0 0 Standard NCSv3 Family vCPUs ← GPU quota is 0!
# Step 2: Request a quota increase
# Azure Portal → Subscriptions → Usage + Quotas → Request Increase
# Or via CLI (requires the Azure CLI 'quota' extension):
az quota create --resource-name "StandardNCSv3Family" \
--scope "/subscriptions/{sub-id}/providers/Microsoft.Compute/locations/eastus" \
--limit-object value=12
# Step 3: For GPU VMs, quota increases may take 1-2 business days
# Workaround: try a different region with available capacity
# Step 4: Verify quota was increased before retrying
az vm list-usage --location eastus -o table | grep "NC"
Scenario 4: "Pods scheduled on system pool despite having a user pool"
# Step 1: Check if the system pool has the CriticalAddonsOnly taint
kubectl describe node aks-agentpool-12345-vmss000000 | grep -A 3 "Taints:"
# If it shows "Taints: <none>" — the system pool isn't tainted

# Step 2: The default pool created by az aks create is mode=System, but AKS
# does not add the CriticalAddonsOnly taint automatically — without it,
# all pods (including yours) can schedule on system nodes

# Step 3: To enforce separation, make sure you have both pools:
az aks nodepool list -g rg-dev --cluster-name dev-cluster -o table
# If only one pool exists, add a user pool (see Hands-on section above)

# Step 4: Add the taint to the system pool manually
az aks nodepool update \
  --resource-group rg-dev \
  --cluster-name dev-cluster \
  --name agentpool \
  --node-taints "CriticalAddonsOnly=true:NoSchedule"

# Step 5: Verify new pods land on the user pool
kubectl get pods -o wide
# Note: NoSchedule only blocks new scheduling — pods already running on
# system nodes stay put. Restart them (kubectl rollout restart) so they
# reschedule onto the user pool.
🎯 Interview Questions
Beginner
Q: What is a node pool in AKS?
A node pool is a group of nodes (Azure VMs) with identical configuration — same VM size, OS, and Kubernetes version. Each node pool maps to a VM Scale Set in the node resource group (MC_*). AKS clusters can have multiple node pools with different configurations, allowing you to run different workload types on optimized hardware.
Q: What's the difference between system and user node pools?
System pools run Kubernetes system components (CoreDNS, metrics-server, kube-proxy, CSI drivers). At least one system pool must exist, and they can't scale to zero. Best practice is to give them the CriticalAddonsOnly=true:NoSchedule taint so application pods can't schedule there (AKS system pods tolerate it). User pools run your application workloads. They're optional, can scale to zero, and have no default taints. Best practice: separate system and user pools to prevent application resource contention from affecting system stability.
Q: What are spot node pools, and which workloads belong on them?
Spot node pools use Azure's spare compute capacity at up to 90% discount. Tradeoff: Azure can evict these nodes with 30 seconds' notice when it needs the capacity. Use for: batch processing, CI/CD runners, dev/test environments, data pipelines, ML training with checkpointing, and any workload that can tolerate interruption. Never use for: production APIs, databases, or any workload that can't handle sudden termination.
Q: Can an AKS node pool scale to zero?
Yes. User node pools can be configured with --min-count 0 when cluster autoscaler is enabled. This is especially useful for GPU or spot pools that aren't always needed. When a pod requests resources that match the pool (via nodeSelector or toleration), the autoscaler spins up a node. When the workload completes and no pods need the pool, it scales back to zero. System pools can never scale to zero.
Q: What mechanisms control which node pool a pod is scheduled on?
Three mechanisms: 1) nodeSelector — simple key-value match: nodeSelector: { agentpool: apppool }. 2) Taints + Tolerations — nodes repel pods unless they have a matching toleration. Used for system pool separation and spot/GPU pool isolation. 3) Node Affinity — more expressive rules (preferred vs required, multiple conditions). In practice, most teams use nodeSelector + taints for pool targeting.
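Since nodeAffinity is the least familiar of the three, here's a sketch of the affinity form of agentpool: apppool with an added soft zone preference — the label values are illustrative:

```yaml
# Pod spec fragment: hard pool requirement + soft zone preference
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: agentpool
              operator: In
              values: ["apppool"]
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values: ["eastus-1"]
```

The required term behaves like nodeSelector; the preferred term only biases the scheduler when multiple nodes qualify.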
Intermediate
Q: Why can't --max-pods be changed on an existing node pool, and what do you do instead?
The max-pods setting determines the IP allocation and routing configuration for each node at creation time. With Azure CNI, each pod pre-allocates a VNet IP — changing max-pods would require re-IPing all pods and reconfiguring VMSS networking, which isn't safe to do in-place. To change max-pods, create a new node pool with the desired setting, cordon + drain the old pool, and delete it. This is a design constraint of how Azure CNI allocates IPs.
Q: Under what conditions does the cluster autoscaler scale a node pool up or down?
Scale up: When pods are Pending because no node has enough resources (CPU/memory) to schedule them. The autoscaler simulates adding a node and checks if the pending pods would fit. Scale down: When a node's utilization (requested resources / allocatable) drops below ~50% (default) for 10+ minutes, and all pods on that node can be moved elsewhere. Nodes with local storage, pods without controllers, or pods with restrictive PDBs won't be scaled down. Scale-down is conservative to avoid thrashing.
Q: Walk through what happens when Azure evicts a spot node.
Azure sends a 30-second eviction notice. The node is drained: kubelet sends SIGTERM to all pod containers, waits for the grace period (default 30s), then SIGKILL. If eviction-policy=Delete, the VM is deleted entirely. The pods' controller (Deployment, Job, etc.) detects the pod termination and creates replacement pods — the scheduler places them on available non-spot nodes or surviving spot nodes. In-memory data and emptyDir volumes are lost. PersistentVolumes backed by Azure Disks survive and re-attach.
Q: When would you use a Windows node pool, and what constraints come with it?
Windows node pools are for running Windows containers — typically legacy .NET Framework applications that can't run on Linux. Key constraints: Windows pools can only be user pools (the system pool must be Linux), they cost more (Windows Server license included in VM pricing), have fewer AKS features (no Azure Linux, limited network policies), and have slower node startup. If your .NET app targets .NET 6+ (or later), containerize it on Linux instead — it's cheaper, faster, and has better AKS support.
Q: How does AKS upgrade a node pool without downtime?
AKS performs rolling upgrades by default: 1) A new node with the new version is added (surge node). 2) An old node is cordoned (no new pods). 3) Pods are drained (evicted) from the old node. 4) The old node is deleted. This repeats for each node. Configure max surge: --max-surge 1 (one node at a time, conservative) or --max-surge 33% (faster but uses more temporary resources). Ensure PDBs allow at least one pod to be evicted. The process is automatic — you just trigger the upgrade command.
Scenario-Based
Q: Design the node pools for a platform with a customer-facing web app, nightly batch jobs, and occasional ML training.
Four pools: 1) System pool: 2× Standard_D2s_v5, mode=System, always-on for K8s system pods. 2) App pool: Standard_D4s_v5, autoscaler min=2 max=6, mode=User — runs the web app with nodeSelector. 3) Spot pool: Standard_D4s_v5, priority=Spot, autoscaler min=0 max=10 — runs batch jobs with tolerations. 4) GPU pool: Standard_NC6s_v3, autoscaler min=0 max=2, taint sku=gpu:NoSchedule — ML training only, scales to zero when no training jobs. Total cost optimization: web app on reliable VMs, batch on cheap spot, GPU only when needed.
Q: Pods are Pending with "0/5 nodes are available: 2 node(s) had untolerated taint, 3 node(s) didn't match node selector." How do you debug it?
The message tells you: 2 system nodes are blocked by the CriticalAddonsOnly taint (correct behavior — app pods shouldn't go there). 3 other nodes exist but don't match the pod's nodeSelector or affinity rule. Fix: Check the pod's nodeSelector: kubectl get pod <pod> -o jsonpath='{.spec.nodeSelector}'. It probably targets a pool that either doesn't exist yet (scale-to-zero), has wrong labels, or was deleted. Either: a) Fix the deployment's nodeSelector to target an existing pool, b) Create the expected pool with matching labels, or c) If using the autoscaler at min=0, wait for scale-up (check autoscaler status for errors).
Q: Your spot nodes keep getting evicted during business hours, disrupting batch jobs. How do you reduce the impact?
1) Schedule batch jobs outside peak hours (nights/weekends) using CronJobs. 2) Use multiple spot VM sizes: create multiple spot pools with different SKUs — eviction risk is per-SKU, so diversity reduces correlated evictions. 3) Implement job checkpointing so interrupted jobs resume from the last checkpoint. 4) Consider a mixed strategy: critical batch runs on regular user pool nodes, non-critical on spot. 5) Set --spot-max-price slightly above the average market price to reduce price-based evictions (still far cheaper than on-demand). 6) Use a different region with more spare capacity during those hours.
Q: A cluster built with the Azure CNI default of max-pods 30 has run out of pod capacity. How do you fix it?
Since max-pods cannot be changed on an existing pool: 1) Create a new node pool with --max-pods 110: az aks nodepool add --name newpool --max-pods 110. 2) Cordon the old pool's nodes: kubectl cordon <old-node>. 3) Drain pods from the old nodes: kubectl drain <old-node> --ignore-daemonsets --delete-emptydir-data. 4) Delete the old pool: az aks nodepool delete --name <old-pool>. Now 3 nodes × 110 max-pods = 330 pod capacity — more than enough. Lesson learned: set max-pods deliberately at cluster creation. The Azure CNI default of 30 is too low for most production clusters.
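Expressed as CLI steps, the replacement flow from the answer above might look like the sketch below. Pool names (oldpool, newpool) are placeholders, and note that a cluster's only system pool can't be deleted this way:

```bash
# 1) Create the replacement pool with the desired --max-pods
az aks nodepool add \
  --resource-group rg-dev \
  --cluster-name dev-cluster \
  --name newpool \
  --mode User \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --max-pods 110

# 2) Cordon and drain every node in the old pool
for node in $(kubectl get nodes -l agentpool=oldpool -o name); do
  kubectl cordon "$node"
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
done

# 3) Delete the old pool once it's empty
az aks nodepool delete \
  --resource-group rg-dev \
  --cluster-name dev-cluster \
  --name oldpool
```

Draining node by node keeps the workload available as pods reschedule onto the new pool.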
Q: You're asked to cut the cluster's compute spend significantly. What levers do you have?
1) Spot pools for non-critical workloads: Move batch processing, CI runners, and background workers to spot (saves up to 90% on those VMs). 2) Right-size VMs: Check actual CPU/memory usage with kubectl top nodes and Container Insights. If nodes average 30% utilization, downsize VMs or reduce count. 3) Autoscaler: Enable cluster autoscaler on all user pools — scale down during off-peak automatically. 4) Scale-to-zero: GPU and specialty pools should scale to 0 when idle. 5) Reserved instances: For the baseline node count that always runs, purchase 1-year Azure Reserved Instances (save 30-40%). 6) Stop dev/staging clusters after hours. Combined, these typically achieve 40-60% savings.
🌍 Real-World Use Case
A media streaming company optimized their AKS cluster with a multi-pool strategy:
- Before: Single node pool of 20× Standard_D8s_v5 ($2,800/month each). All workloads mixed together. Monthly cost: $56,000. Average utilization: 35%.
- After (4 pools):
- System pool: 3× Standard_D2s_v5 for system pods — $210/month
- API pool: 5× Standard_D4s_v5 (autoscaler 3-8) for application pods — $700/month
- Encoding pool: 0-10× Standard_F8s_v2 Spot for video encoding — ~$250/month (90% discount)
- ML pool: 0-2× Standard_NC6s_v3 for recommendation engine training — $0-4,400/month (only when training)
- Result: Monthly cost dropped from $56,000 to ~$12,000 (79% reduction). Encoding throughput actually increased because F-series VMs are compute-optimized for that workload. ML training only runs twice a week, scaling the GPU pool from 0 to 2 for 6 hours then back to 0.
📝 Summary
- System pools run Kubernetes components; user pools run your workloads — always separate them in production
- Choose VM sizes based on workload type: general purpose (D-series), memory-optimized (E-series), compute (F-series), GPU (NC-series)
- Spot pools offer up to 90% savings for interruptible workloads — never use for production APIs or databases
- Use taints + nodeSelector to control pod placement across pools
- Cluster autoscaler scales node pools automatically; user pools can scale to zero
- --max-pods is set at creation and cannot be changed — default to 110 for production
- Multiple specialized pools beat one large generic pool for both cost and performance