Compute: Lesson 2 of 16

VM Scale Sets

VM Scale Sets (VMSS) let you deploy and manage a group of identical, auto-scaling VMs behind a load balancer — giving you elastic capacity for variable workloads.

Simple Explanation

Imagine hiring 2 workers for normal load and automatically bringing in 8 more when it gets busy — then sending them home when the rush ends. VM Scale Sets do exactly that for VMs.

When to Use VMSS

Stateless application tiers where any instance can serve any request (web frontends, API gateways, batch processors).
When you need VM-level control but also want automatic scaling.
Large-scale HPC or rendering jobs needing burst capacity.
Custom base images with specific software that can't run on PaaS.

VMSS vs App Service Auto-scaling

If your app can run on App Service, autoscaling there is simpler: there is no OS to manage, and new instances typically come online faster than a VM can boot. Use VMSS when you need full OS control at scale, and plan for new VMSS instances taking roughly 2–5 minutes to boot before they can serve traffic.

How VMSS Works

VMSS Auto-scaling Flow

Trigger: CPU > 70%, a schedule, or a custom metric.
Scale Out: add instances, all built from the same image.
Scale In: when CPU < 30%, remove instances, then observe a cooldown period.
Load Balancer: distributes traffic, runs health probes, and drains instances on scale-in.
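
The flow above can be sketched as a small shell function. This only illustrates the threshold logic and is not how Azure's autoscale engine is implemented: the function name `next_capacity`, the scale-in step of 1, and the 2 to 10 instance bounds are assumptions chosen to mirror the example figures in this lesson, and the sketch ignores cooldown periods and metric averaging windows.

```shell
# next_capacity CPU_PERCENT CURRENT_COUNT [MIN] [MAX]
# Prints the instance count a simple threshold policy would move to.
# Hypothetical helper; CPU_PERCENT must be a whole number.
# The body runs in a subshell "( )" so its variables stay local.
next_capacity() (
  cpu="$1"
  current="$2"
  min="${3:-2}"    # assumed lower bound
  max="${4:-10}"   # assumed upper bound

  if [ "$cpu" -gt 70 ]; then
    target=$((current + 2))   # scale out: add 2 instances
  elif [ "$cpu" -lt 30 ]; then
    target=$((current - 1))   # scale in: remove 1 instance (assumed step)
  else
    target="$current"         # inside the 30-70% band: no change
  fi

  # Clamp the result to the configured bounds.
  if [ "$target" -gt "$max" ]; then target="$max"; fi
  if [ "$target" -lt "$min" ]; then target="$min"; fi

  echo "$target"
)
```

For example, `next_capacity 85 9` prints 10 rather than 11, because the result is capped at the maximum.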

Commands

Azure CLI

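# Create a scale set. Recent CLI versions default to Flexible orchestration,
# so the orchestration mode is set explicitly here.
az vmss create \
  --resource-group rg-app \
  --name vmss-web \
  --image Ubuntu2204 \
  --orchestration-mode Uniform \
  --vm-sku Standard_D2s_v3 \
  --instance-count 2 \
  --admin-username azureuser \
  --generate-ssh-keys \
  --upgrade-policy-mode automatic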

# Define autoscale (2–10 instances based on CPU)
az monitor autoscale create \
  --resource-group rg-app \
  --resource vmss-web \
  --resource-type Microsoft.Compute/virtualMachineScaleSets \
  --name autoscale-web \
  --min-count 2 --max-count 10 --count 2

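# Add scale-out rule (average CPU > 70% over 5 minutes → add 2 instances)
az monitor autoscale rule create \
  --resource-group rg-app \
  --autoscale-name autoscale-web \
  --scale out 2 \
  --condition "Percentage CPU > 70 avg 5m"

# Add the matching scale-in rule (average CPU < 30% over 5 minutes → remove 1 instance);
# without it, the scale set grows under load but never shrinks
az monitor autoscale rule create \
  --resource-group rg-app \
  --autoscale-name autoscale-web \
  --scale in 1 \
  --condition "Percentage CPU < 30 avg 5m"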

# List instances
az vmss list-instances --resource-group rg-app --name vmss-web --output table

# Manual scale
az vmss scale --resource-group rg-app --name vmss-web --new-capacity 5

Hands-on

Create a VMSS with 2 instances (Standard_B2s for cost).
Attach an autoscale policy: scale out at CPU > 70%, scale in at CPU < 30%.
Generate load with a tool such as stress and watch autoscaling trigger.
Check the load balancer backends to see new instances register.
Manually scale in and verify instance count drops.
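
While working through these steps, it can help to poll the instance count from a second terminal. The sketch below assumes the `rg-app` and `vmss-web` names used in the Commands section; the helper names `current_capacity` and `watch_capacity`, the poll count, and the interval are hypothetical choices, not part of the Azure CLI.

```shell
# current_capacity RESOURCE_GROUP VMSS_NAME
# Prints the scale set's current instance count (its sku.capacity).
current_capacity() {
  az vmss show \
    --resource-group "$1" \
    --name "$2" \
    --query sku.capacity \
    --output tsv
}

# watch_capacity RESOURCE_GROUP VMSS_NAME [POLLS] [INTERVAL_SECONDS]
# Prints a timestamped capacity reading POLLS times.
# The body runs in a subshell "( )" so its variables stay local.
watch_capacity() (
  polls="${3:-20}"
  interval="${4:-30}"
  i=1
  while [ "$i" -le "$polls" ]; do
    printf '%s capacity=%s\n' "$(date +%T)" "$(current_capacity "$1" "$2")"
    sleep "$interval"
    i=$((i + 1))
  done
)

# To generate CPU load on an instance (run over SSH on the VM itself):
#   sudo apt-get install -y stress
#   stress --cpu 2 --timeout 600
```

Running `watch_capacity rg-app vmss-web` while the load test is active should show the count rise once the scale-out rule's averaging window has elapsed, and fall again some time after the load stops.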

Debugging Scenario

Issue: New instances are not getting traffic after scale-out.

Check load balancer health probes — is the app returning 200 on the probe path?
Check that every instance is in the Succeeded provisioning state: az vmss list-instances.
Review the custom script extension or cloud-init logs if the app isn't starting on new instances.
Verify that no NSG rule blocks the probe port (whichever port the probe is configured for, commonly 80 or 443) from the AzureLoadBalancer service tag; probes originate from 168.63.129.16.
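
The first two checks can be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and the load balancer name `vmss-webLB` in the usage note is a guess at the name `az vmss create` generates by default, so confirm it with `az network lb list` before relying on it.

```shell
# unhealthy_instances RESOURCE_GROUP VMSS_NAME
# Prints the IDs of instances whose provisioning state is not "Succeeded".
# Empty output means every instance provisioned successfully.
unhealthy_instances() {
  az vmss list-instances \
    --resource-group "$1" \
    --name "$2" \
    --query "[?provisioningState!='Succeeded'].instanceId" \
    --output tsv
}

# list_probes RESOURCE_GROUP LOAD_BALANCER_NAME
# Shows each health probe's protocol, port, and request path so they can be
# compared with what the application actually serves.
list_probes() {
  az network lb probe list \
    --resource-group "$1" \
    --lb-name "$2" \
    --query "[].{name:name, protocol:protocol, port:port, path:requestPath}" \
    --output table
}
```

For example, `unhealthy_instances rg-app vmss-web` followed by `list_probes rg-app vmss-webLB` narrows the problem to either provisioning or probing before you dig into instance logs.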

Interview Questions

Beginner

What is a VM Scale Set?

A group of identical VMs managed together that can auto-scale in/out based on demand metrics or schedules.

What is the minimum instance count for VMSS?

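The autoscale minimum can be 0 or any higher value. A minimum of 0 suits batch jobs, but with no instances running there are no host metrics such as CPU to trigger on, so scaling out from 0 needs a schedule or an external metric such as queue length. Keep the minimum at 2 or more for production web apps so that losing one instance does not take the app down.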

Scenario-based

Traffic spikes every day at noon. How do you configure VMSS?

Use scheduled autoscale: scale out to 10 instances at 11:45 AM, scale in to 2 at 2:00 PM. Combine with metric autoscale as a safety net for unpredictable spikes.
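
A recurring profile along these lines could be added to an existing autoscale setting with `az monitor autoscale profile create`. Treat this as a sketch: the function name, the profile name `noon-peak`, the weekday-only recurrence, and the time zone are assumptions, and the HH:MM format for `--start` and `--end` should be confirmed against `az monitor autoscale profile create --help` for your CLI version.

```shell
# create_noon_profile RESOURCE_GROUP AUTOSCALE_NAME
# Hypothetical helper: pins capacity at 10 instances on weekdays
# between 11:45 and 14:00 in the given time zone.
create_noon_profile() {
  az monitor autoscale profile create \
    --resource-group "$1" \
    --autoscale-name "$2" \
    --name noon-peak \
    --count 10 \
    --recurrence week mon tue wed thu fri \
    --start 11:45 \
    --end 14:00 \
    --timezone "Pacific Standard Time"
}
```

With the names from the Commands section, this would be called as `create_noon_profile rg-app autoscale-web`. Outside the scheduled window, the setting's default profile and its CPU rules continue to apply.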

VMSS is scaling out, but response time is still high. Why?

New instances may still be bootstrapping (cloud-init running, app starting). Point the load balancer health probe at an endpoint that returns 200 only once the app is fully initialized, so new instances receive traffic only when they are ready.
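
To see how long a new instance really takes to become ready, you could poll its health endpoint until it answers 200. The sketch below assumes the app exposes some HTTP health URL; the helper names, the retry count, and the delay are hypothetical.

```shell
# probe_status URL
# Prints the HTTP status code returned by URL (000 if the connection fails).
probe_status() {
  curl --silent --output /dev/null --max-time 5 \
    --write-out '%{http_code}' "$1"
}

# wait_until_ready URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until it returns 200 or the attempts run out.
# The body runs in a subshell "( )", so "exit" only ends the function
# and its variables stay local.
wait_until_ready() (
  url="$1"
  attempts="${2:-30}"
  delay="${3:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if [ "$(probe_status "$url")" = "200" ]; then
      echo "ready after $i attempt(s)"
      exit 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempts" >&2
  exit 1
)
```

If the measured time is much longer than the interval before the load balancer starts sending traffic, the probe is probably passing before the app is fully initialized.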

Summary

VMSS provides elastic VM capacity with automatic scaling and integrated load balancing. Use it for stateless, VM-level workloads that need dynamic capacity. For PaaS-compatible workloads, App Service autoscale is simpler.
