VM Scale Sets
VM Scale Sets (VMSS) let you deploy and manage a group of identical, auto-scaling VMs behind a load balancer — giving you elastic capacity for variable workloads.
Simple Explanation
Imagine hiring 2 workers for normal load and automatically bringing in 8 more when it gets busy — then sending them home when the rush ends. VM Scale Sets do exactly that for VMs.
When to Use VMSS
- Stateless application tiers where any instance can serve any request (web frontends, API gateways, batch processors).
- When you need VM-level control but also want automatic scaling.
- Large-scale HPC or rendering jobs needing burst capacity.
- Custom base images with specific software that can't run on PaaS.
If your app can run on App Service, autoscaling there is simpler: there is no OS to manage and no VM boot to wait for. Use VMSS when you need full OS control at scale. New VMSS instances typically take 2–5 minutes to boot, whereas App Service scales out in seconds.
How VMSS Works
Commands
# Create a scale set
az vmss create \
  --resource-group rg-app \
  --name vmss-web \
  --image Ubuntu2204 \
  --vm-sku Standard_D2s_v3 \
  --instance-count 2 \
  --admin-username azureuser \
  --generate-ssh-keys \
  --upgrade-policy-mode automatic

# Define autoscale (2–10 instances based on CPU)
az monitor autoscale create \
  --resource-group rg-app \
  --resource vmss-web \
  --resource-type Microsoft.Compute/virtualMachineScaleSets \
  --name autoscale-web \
  --min-count 2 --max-count 10 --count 2

# Add scale-out rule (CPU > 70% → add 2)
az monitor autoscale rule create \
  --resource-group rg-app \
  --autoscale-name autoscale-web \
  --scale out 2 \
  --condition "Percentage CPU > 70 avg 5m"

# List instances
az vmss list-instances --resource-group rg-app --name vmss-web --output table

# Manual scale
az vmss scale --resource-group rg-app --name vmss-web --new-capacity 5
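The commands above define only a scale-out rule, so the set would grow under load but never shrink on its own. A minimal sketch of the matching scale-in rule follows, reusing rg-app and autoscale-web from above. The 30% threshold and the one-instance step are illustrative choices, and the `AZ` variable and the `add_scale_in_rule` function name are hypothetical conveniences (not part of the Azure CLI) that let the command be swapped out for a dry run.

```shell
#!/bin/sh
# Hypothetical knob: set AZ=echo to print the arguments instead of calling Azure.
AZ="${AZ:-az}"

# Remove 1 instance when average CPU stays below 30% over 5 minutes.
add_scale_in_rule() {
  "$AZ" monitor autoscale rule create \
    --resource-group rg-app \
    --autoscale-name autoscale-web \
    --scale in 1 \
    --condition "Percentage CPU < 30 avg 5m"
}
```

Call `add_scale_in_rule` once, after the autoscale setting exists. Scaling in by a smaller step than scaling out is a common way to shed capacity cautiously.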
Hands-on
- Create a VMSS with 2 instances (Standard_B2s for cost).
- Attach an autoscale policy: scale out at CPU > 70%, scale in at CPU < 30%.
- Generate load with the stress tool and watch auto-scaling trigger.
- Check the load balancer backends to see new instances register.
- Manually scale in and verify instance count drops.
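While the load test runs, one way to watch the scale-out happen is to poll the instance count. The sketch below assumes the rg-app and vmss-web names from the Commands section. `get_capacity` and `wait_for_capacity` are hypothetical helper names, and the retry count and delay defaults are arbitrary.

```shell
#!/bin/sh
# Current instance count of the scale set (requires az login).
get_capacity() {
  az vmss show --resource-group rg-app --name vmss-web \
    --query "sku.capacity" --output tsv
}

# Poll until the instance count reaches TARGET.
# Usage: wait_for_capacity TARGET [TRIES] [DELAY_SECONDS]
# Returns 0 once the target is reached, 1 if the tries run out.
wait_for_capacity() {
  target=$1
  tries=${2:-30}
  delay=${3:-20}
  i=0
  while [ "$i" -lt "$tries" ]; do
    current=$(get_capacity)
    echo "instances: $current"
    if [ "$current" -ge "$target" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_capacity 4` polls for up to ten minutes and stops as soon as four or more instances exist.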
Debugging Scenario
Issue: New instances are not getting traffic after scale-out.
- Check load balancer health probes — is the app returning 200 on the probe path?
- Check if instances are in Succeeded provisioning state: az vmss list-instances.
- Review the custom script extension or cloud-init logs if the app isn't starting on new instances.
- Verify the NSG allows the probe port (typically 80 or 443) from the AzureLoadBalancer service tag; health probes originate from 168.63.129.16.
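The provisioning-state check above can be narrowed to just the problem instances. This is a small sketch: `list_unhealthy` is a hypothetical helper name, and it assumes the input is tab-separated `instanceId` and `provisioningState` pairs, which is what the query in the usage comment should produce.

```shell
#!/bin/sh
# Reads "instanceId<TAB>provisioningState" lines on stdin and prints only
# the instances whose state is not Succeeded.
list_unhealthy() {
  awk -F '\t' '$2 != "Succeeded" { print $1 ": " $2 }'
}

# Usage against a real scale set (requires az login):
#   az vmss list-instances --resource-group rg-app --name vmss-web \
#     --query "[].[instanceId, provisioningState]" --output tsv | list_unhealthy
```

An empty result means every instance provisioned successfully, which points the investigation toward the health probe, the app startup, or the NSG instead.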
Interview Questions
Beginner
A group of identical VMs managed together that can auto-scale in/out based on demand metrics or schedules.
You can set the minimum to 0 (the set scales out from zero on demand) or to 1 or more. A minimum of 0 is useful for batch jobs; keep the minimum at 2 or more for production web apps.
Scenario-based
Use scheduled autoscale: scale out to 10 instances at 11:45 AM, scale in to 2 at 2:00 PM. Combine with metric autoscale as a safety net for unpredictable spikes.
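A scheduled window like this can be sketched as a recurring autoscale profile. The example assumes the autoscale-web setting from the Commands section. The profile name lunch-rush, the weekday-only recurrence, and the time zone are illustrative assumptions, and the `AZ` variable and function name are hypothetical conveniences for dry runs. Check the flags against `az monitor autoscale profile create --help` for your CLI version before relying on them.

```shell
#!/bin/sh
# Hypothetical knob: set AZ=echo to print the arguments instead of calling Azure.
AZ="${AZ:-az}"

# Hold the set at 10 instances on weekdays from 11:45 to 14:00.
# Outside this window the default profile (2 to 10 instances, CPU rules) applies.
add_lunch_profile() {
  "$AZ" monitor autoscale profile create \
    --resource-group rg-app \
    --autoscale-name autoscale-web \
    --name lunch-rush \
    --min-count 10 --max-count 10 --count 10 \
    --recurrence week mon tue wed thu fri \
    --start 11:45 --end 14:00 \
    --timezone "Pacific Standard Time"
}
```

Because the default profile takes over when the window closes, the metric rules bring the count back toward 2 on their own after 2:00 PM.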
New instances may still be bootstrapping (cloud-init running, app starting). Point the load balancer health probe at an endpoint that returns 200 only once the app has fully started, so new instances receive traffic only when they are ready.
Summary
VMSS provides elastic VM capacity with automatic scaling and integrated load balancing. Use it for stateless, VM-level workloads that need dynamic capacity. For PaaS-compatible workloads, App Service with autoscale is simpler.