Advanced | Resilience

High Availability and Scaling

Build resilient systems with Multi-AZ deployment, Auto Scaling, and load balancing, so the application tolerates component failures and traffic spikes.

What Is It? (ELI5)

High availability means your app stays alive even when one part fails. Scaling means your app can handle more users when traffic grows.

Why Do We Need It?

How It Works (Technical)

Users -> ALB -> Auto Scaling Group (AZ-a + AZ-b)
If AZ-a fails, the ALB health checks mark its targets unhealthy, the ALB stops sending requests to them, and traffic continues to AZ-b.
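
The flow above can be sketched with the AWS CLI. This is a minimal sketch under stated assumptions, not a definitive setup: the function name `create_multi_az_asg`, the launch template name `web-template`, the size limits, and the subnet and target group arguments are all hypothetical placeholders. It assumes the launch template, the two subnets (one per AZ), and the ALB target group already exist.

```shell
# Sketch: create an Auto Scaling group that spans two subnets (one per AZ)
# and registers its instances with an ALB target group.
# All arguments and the template name are placeholders (assumptions).
create_multi_az_asg() {
  asg_name="$1"          # e.g. web-asg
  subnet_a="$2"          # subnet ID in the first AZ
  subnet_b="$3"          # subnet ID in the second AZ
  target_group_arn="$4"  # ARN of the ALB target group

  aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name "$asg_name" \
    --launch-template "LaunchTemplateName=web-template,Version=\$Latest" \
    --min-size 2 --max-size 6 --desired-capacity 2 \
    --vpc-zone-identifier "${subnet_a},${subnet_b}" \
    --target-group-arns "$target_group_arn" \
    --health-check-type ELB \
    --health-check-grace-period 120
}
```

Listing both subnets in `--vpc-zone-identifier` is what spreads instances across the two AZs. With `--health-check-type ELB`, the group also replaces instances that fail the load balancer's health checks, not only those that fail EC2 status checks.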

Hands-on

# View auto scaling groups
aws autoscaling describe-auto-scaling-groups --query "AutoScalingGroups[].AutoScalingGroupName" --output table

# Set desired capacity
aws autoscaling set-desired-capacity --auto-scaling-group-name web-asg --desired-capacity 4
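
A desired capacity outside the group's MinSize..MaxSize range is rejected by the API, so it can help to check the bounds first. The sketch below is one way to do that; the helper names `capacity_in_range` and `set_capacity_checked` are hypothetical and not part of the AWS CLI.

```shell
# Sketch: succeed only if desired lies within min..max (inclusive).
capacity_in_range() {
  desired="$1"; min="$2"; max="$3"
  [ "$desired" -ge "$min" ] && [ "$desired" -le "$max" ]
}

# Sketch: read the group's bounds, then set the desired capacity
# only when the requested value is allowed.
set_capacity_checked() {
  asg_name="$1"; desired="$2"

  # Text output is "MinSize<TAB>MaxSize" for the named group.
  bounds=$(aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$asg_name" \
    --query "AutoScalingGroups[0].[MinSize,MaxSize]" --output text)

  # Split the two numbers into $1 (min) and $2 (max).
  set -- $bounds
  if [ "$#" -ne 2 ]; then
    echo "could not read MinSize/MaxSize for $asg_name" >&2
    return 1
  fi

  if capacity_in_range "$desired" "$1" "$2"; then
    aws autoscaling set-desired-capacity \
      --auto-scaling-group-name "$asg_name" --desired-capacity "$desired"
  else
    echo "desired capacity $desired is outside $1..$2 for $asg_name" >&2
    return 1
  fi
}
```

For example, `set_capacity_checked web-asg 4` only calls `set-desired-capacity` when 4 is within the group's configured limits.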

Debugging Scenario

Problem

The Auto Scaling group is not scaling out during peak traffic.
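
A first pass at narrowing this down might look like the sketch below. It checks two common causes (no scaling policy attached, or the group already at MaxSize) and otherwise points at alarms and activity history. The helper names `explain_no_scale_out` and `diagnose_asg` are hypothetical, and the list of causes is an assumption for illustration, not an exhaustive checklist.

```shell
# Sketch: classify common reasons a group does not scale out.
# Inputs: current desired capacity, MaxSize, number of scaling policies.
explain_no_scale_out() {
  desired="$1"; max="$2"; policy_count="$3"
  if [ "$policy_count" -eq 0 ]; then
    echo "no scaling policy attached"
  elif [ "$desired" -ge "$max" ]; then
    echo "group is already at MaxSize"
  else
    echo "check alarm state and recent scaling activities"
  fi
}

# Sketch: gather the inputs for one group and print the verdict.
diagnose_asg() {
  asg_name="$1"
  caps=$(aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$asg_name" \
    --query "AutoScalingGroups[0].[DesiredCapacity,MaxSize]" --output text)
  policies=$(aws autoscaling describe-policies \
    --auto-scaling-group-name "$asg_name" \
    --query "length(ScalingPolicies)" --output text)

  # Split "DesiredCapacity<TAB>MaxSize" into $1 and $2.
  set -- $caps
  explain_no_scale_out "$1" "$2" "$policies"
}
```

If the verdict points at activity history, `aws autoscaling describe-scaling-activities --auto-scaling-group-name web-asg --max-items 5` lists the most recent scaling attempts together with their status messages.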

Interview Questions

Beginner: What is Multi-AZ deployment?
Deploying resources across multiple Availability Zones in one Region, so that the failure of a single zone does not take the application down.
Intermediate: ALB vs NLB?
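ALB works at Layer 7 and routes HTTP/HTTPS requests by host, path, or header; NLB works at Layer 4 and forwards TCP/UDP traffic with very high throughput and low latency.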
Scenario: One AZ goes down during Black Friday. How should the system behave?
The ALB routes traffic only to targets in the healthy AZs, the ASG launches replacement capacity there, and user impact stays minimal.
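
For the scenario answer to hold, the surviving AZs need enough capacity while replacements are still launching. The arithmetic can be sketched as below. The function name `instances_per_az` is hypothetical, and the model is a simplification: it assumes load is spread evenly and that exactly one AZ is lost.

```shell
# Sketch: instances to run in each AZ so that losing one AZ still leaves
# enough capacity for peak load (integer ceiling division).
#   peak      - instances needed to serve peak traffic
#   az_count  - number of AZs the group spans (must be >= 2)
instances_per_az() {
  peak="$1"; az_count="$2"
  if [ "$az_count" -lt 2 ]; then
    echo "need at least 2 AZs" >&2
    return 1
  fi
  surviving=$((az_count - 1))
  # Ceiling of peak / surviving without floating point.
  echo $(( (peak + surviving - 1) / surviving ))
}
```

`instances_per_az 12 3` prints 6, which means 18 instances in total across three AZs. With only two AZs the same peak needs 12 per AZ, so spreading across more zones reduces the headroom required.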

Real-world Usage

Retail platforms combine ALB, ASG, and Multi-AZ managed databases to keep checkout available during sale spikes.

Summary