Advanced · Lesson 9 of 12

Azure vs AWS vs GCP: Platform Comparison

Choose the right cloud provider. Understand service mapping, pricing, strengths, and when to use each. Learn multi-cloud architecture patterns.

Simple Explanation (ELI5)

AWS, GCP, and Azure are like three grocery stores. AWS is the biggest and oldest (most selection, steepest learning curve). GCP is smaller but known for AI/analytics and simplicity. Azure is strong in enterprise/Windows environments. Each store shelves its items in different aisles, charges differently, and has its own strengths. Many enterprises shop at more than one.

Why Multi-Cloud Matters

Core Service Mapping

1. Compute Services

| Use Case | AWS | GCP | Azure |
| --- | --- | --- | --- |
| VMs (long-running) | EC2 | Compute Engine | Virtual Machines |
| App hosting (managed) | Elastic Beanstalk | App Engine | App Service |
| Containers (serverless) | Fargate | Cloud Run | Container Instances |
| Serverless functions | Lambda | Cloud Functions | Functions |
| Kubernetes (managed) | EKS | GKE | AKS |
| Batch processing | Batch | Dataflow / Dataproc | Batch |

2. Storage Services

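| Use Case | AWS | GCP | Azure |
| --- | --- | --- | --- |
| Object storage (S3-like) | S3 | Cloud Storage | Blob Storage |
| Block storage | EBS | Persistent Disk | Disk Storage |
| File storage (NFS) | EFS / FSx | Filestore | Azure Files |
| Data backup | AWS Backup | Backup and DR Service | Azure Backup |
| Cold archive | S3 Glacier | Cloud Storage (Archive class) | Blob Storage (Archive tier) |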

3. Databases

| Use Case | AWS | GCP | Azure |
| --- | --- | --- | --- |
| Relational SQL | RDS (MySQL/Postgres) | Cloud SQL | SQL Database |
| NoSQL key-value | DynamoDB | Firestore / Datastore | Cosmos DB (Table API) |
| NoSQL document | DocumentDB (MongoDB-compatible) | Firestore (document-native) | Cosmos DB (MongoDB API) |
| Distributed SQL | Aurora | Cloud Spanner | Cosmos DB (SQL) |
| Big data warehouse | Redshift | BigQuery | Synapse Analytics |
| In-memory cache | ElastiCache (Redis/Memcached) | Memorystore (Redis) | Azure Cache for Redis |

4. Networking

| Use Case | AWS | GCP | Azure |
| --- | --- | --- | --- |
| Virtual network | VPC | VPC | Virtual Network |
| Load balancer | ELB / ALB / NLB | HTTP(S) LB / Network LB | Load Balancer / Application Gateway |
| DNS | Route 53 | Cloud DNS | Azure DNS |
| CDN | CloudFront | Cloud CDN | Front Door |
| VPN / Private link | Site-to-Site VPN / PrivateLink | Cloud VPN / Private Service Connect | VPN Gateway / Private Link |

5. Security & IAM

| Use Case | AWS | GCP | Azure |
| --- | --- | --- | --- |
| Identity management | IAM (policy-based) | IAM (role-based) | RBAC (role-based, AD-native) |
| Audit logging | CloudTrail | Cloud Audit Logs | Azure Monitor / Activity Log |
| Secrets management | Secrets Manager | Secret Manager | Key Vault |
| Key management | KMS | Cloud KMS | Key Vault |

6. Data & Analytics

| Use Case | AWS | GCP | Azure |
| --- | --- | --- | --- |
| Data warehouse | Redshift | BigQuery (best-in-class) | Synapse Analytics |
| Data pipeline / ETL | AWS Glue | Dataflow (best), Dataproc | Data Factory |
| Machine Learning | SageMaker | Vertex AI (best for ML) | Azure Machine Learning |
| Analytics | Athena | BigQuery | Synapse Analytics |

Detailed Comparison: Strengths & Weaknesses

AWS (Largest Market Share, Most Services)

Strengths
  • Largest ecosystem (200+ services) and the longest track record (since 2006).
  • Broad global footprint (31 regions and 99 Availability Zones at the time of writing).
  • Enterprise adoption: many companies already have AWS skills, tooling, and contracts in place.
  • EC2: the reference point for VMs, with hundreds of instance types.
  • Mature ecosystem: tools, Stack Overflow answers, hiring pool.
Weaknesses
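  • Complex: 200+ services mean a steep learning curve, and it is hard to know which one to use.
  • Expensive: list prices tend to be higher. GCP is often quoted as roughly 20-35% cheaper for a comparable workload, though the gap depends on instance family, region, and commitments.
  • IAM: policy-based (verbose JSON) and harder to reason about than GCP's role-based model.
  • Data analytics: BigQuery often outperforms Redshift on ad hoc analytical queries at similar cost; the size of the gap varies widely by workload.
  • Networking overhead: VPCs, subnets, and route tables all have to be designed explicitly.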

GCP (Data & AI Leader, Simplest Developer Experience)

Strengths
  • BigQuery: Best-in-class data warehouse. Scans terabytes in seconds.
  • Vertex AI: Best ML platform. Pre-trained models, AutoML, easy to use.
  • Cloud Run: Among the simplest serverless offerings. Deploy a container in seconds.
  • Pricing: Often quoted as roughly 20-35% cheaper than AWS for the same workload (per-second billing, sustained-use discounts); verify against your own usage.
  • Developer experience: Console UI is intuitive. CLI is clean (gcloud).
  • Global load balancer: One anycast IP fronts backends in multiple regions. On AWS, CloudFront is a CDN for HTTP traffic rather than a global L7 load balancer.
Weaknesses
  • Smaller market: Fewer jobs, smaller ecosystem. Rarely the first choice for enterprises.
  • Regional coverage: About 40 regions (not zones), and service availability varies by region, so confirm coverage where your users are.
  • Less mature: Fewer managed services than AWS. Some services still feel "beta."
  • Windows/Enterprise: Not preferred by Windows-heavy companies (Azure's domain).
  • Java ecosystem: Tooling and examples skew toward Python and Go; Java teams may find less first-class material.

Azure (Enterprise & Hybrid Leader)

Strengths
  • Windows-native: Best-in-class for Windows Server, SQL Server, .NET (enterprise standards).
  • Hybrid: Azure Stack runs Azure services on-prem. AWS (Outposts) and GCP (Google Distributed Cloud) have counterparts, but hybrid is Azure's most established strength.
  • Enterprise integration: Works seamlessly with Active Directory, Microsoft 365.
  • Azure DevOps: A mature, integrated CI/CD suite, comparable in scope to GitHub Actions plus AWS CodePipeline.
  • Pricing: Azure Hybrid Benefit lets you bring existing Windows Server and SQL Server licenses (BYOL), which lowers the cost of Microsoft workloads.
  • Support: Microsoft support is strong for enterprises. SLAs are negotiable.
Weaknesses
  • Linux-first shops: Linux runs fine on Azure, but tooling, documentation, and defaults lean toward the Microsoft ecosystem.
  • Data analytics: Synapse is good but often slower or more expensive than BigQuery.
  • Complexity: Naming conventions are verbose (Virtual Networks vs VPC). The learning curve is steep.
  • Kubernetes: AKS works, but GKE has a better multi-cluster story.
  • Cost surprises: Blob Storage pricing is complex (tiers, transactions, egress), so unexpected bills are easy to run up.

Decision Matrix: Which Cloud to Choose?

| Scenario | Best Choice | Why |
| --- | --- | --- |
| Startup, no vendor preference | GCP | Cheaper, simpler; Cloud Run is a quick path to production for small teams. |
| Fortune 500, Windows-heavy, M365 integration | Azure | Built for enterprise. Hybrid support. Active Directory native. |
| Large data warehouse / BI analytics | GCP (BigQuery) | Serverless and typically much faster for ad hoc analytical queries. |
| Machine learning / deep learning | GCP (Vertex AI) | Best AutoML, pre-trained models, hosted notebooks. |
| Kubernetes-first company | GCP (GKE) | GKE is the most Kubernetes-native implementation. |
| Gaming, IoT, real-time analytics | AWS | DynamoDB, Kinesis, massive instance selection. |
| Hybrid on-prem + cloud | Azure | Most established on-prem offering (Azure Stack). |
| Lowest latency for global users | AWS | Broad region footprint + CloudFront as global CDN. |
| Cost-optimized, no preference | GCP | Often cheaper (this lesson's estimate: 25-35%). Sustained-use discounts. Spot (formerly Preemptible) VMs. |
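The matrix can be read as an ordered list of rules. The sketch below is one possible encoding, not a definitive selector: the `Workload` flags, the rule order, and the fallback are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Workload:
    """Hypothetical flags describing a workload; the field names are illustrative."""
    windows_heavy: bool = False
    hybrid_on_prem: bool = False
    analytics_heavy: bool = False
    ml_heavy: bool = False
    kubernetes_first: bool = False
    realtime_or_iot: bool = False


def recommend_cloud(w: Workload) -> Tuple[str, str]:
    """Return (provider, reason) for the first matching rule.

    The rule order is an assumption: organisational constraints
    (Windows, hybrid) are checked before technical preferences.
    """
    rules = [
        (w.windows_heavy, "Azure", "Windows, Active Directory, and M365 integration"),
        (w.hybrid_on_prem, "Azure", "hybrid on-prem plus cloud"),
        (w.analytics_heavy, "GCP", "BigQuery for warehouse and BI workloads"),
        (w.ml_heavy, "GCP", "Vertex AI for ML workloads"),
        (w.kubernetes_first, "GCP", "GKE for Kubernetes-first teams"),
        (w.realtime_or_iot, "AWS", "DynamoDB, Kinesis, wide instance selection"),
    ]
    for matches, provider, reason in rules:
        if matches:
            return provider, reason
    # No flag set: the matrix suggests GCP for teams with no preference.
    return "GCP", "cost-sensitive default when there is no vendor preference"


if __name__ == "__main__":
    print(recommend_cloud(Workload(analytics_heavy=True)))
```

Real decisions weigh several rows at once (existing contracts, team skills, compliance), so treat the first-match rule as a starting point for discussion.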

Multi-Cloud Architecture Patterns

Pattern 1: Multi-Cloud for Resilience

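Deploy the critical app on both AWS and GCP. Use DNS failover (for example, Route 53 health checks) to route to whichever cloud is healthy: if AWS us-east-1 fails, traffic shifts to GCP. Cost: roughly double the infrastructure spend, plus cross-cloud data replication. Benefit: higher availability than a single provider; 99.999% uptime (5 nines) becomes a realistic target, not a guarantee.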
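The failover decision itself is simple to model. The following is a minimal simulation of health-check-driven failover, not the Route 53 API: `Endpoint`, `FailoverRouter`, and the threshold of three consecutive failed probes are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str                      # hypothetical label, e.g. "aws-us-east-1"
    consecutive_failures: int = 0  # failed probes since the last success


@dataclass
class FailoverRouter:
    primary: Endpoint
    secondary: Endpoint
    failure_threshold: int = 3     # assumed: probes that must fail in a row

    def record_probe(self, endpoint: Endpoint, ok: bool) -> None:
        """Reset the counter on success, increment it on failure."""
        endpoint.consecutive_failures = 0 if ok else endpoint.consecutive_failures + 1

    def is_healthy(self, endpoint: Endpoint) -> bool:
        return endpoint.consecutive_failures < self.failure_threshold

    def resolve(self) -> str:
        """Return the endpoint a DNS answer would point at right now."""
        if self.is_healthy(self.primary):
            return self.primary.name
        if self.is_healthy(self.secondary):
            return self.secondary.name
        # Both look unhealthy: fail open to the primary instead of returning nothing.
        return self.primary.name


if __name__ == "__main__":
    router = FailoverRouter(Endpoint("aws-us-east-1"), Endpoint("gcp-us-central1"))
    for _ in range(3):
        router.record_probe(router.primary, ok=False)
    print(router.resolve())
```

Requiring several consecutive failures avoids flapping on a single dropped probe, at the cost of slower detection.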

Pattern 2: Cloud-Specific Best Tool

Use GCP BigQuery for analytics (best warehouse), AWS RDS for the transactional DB (mature), and Azure for Windows workloads. Move data between clouds via pub/sub (Google Cloud Pub/Sub <-> AWS SNS). Each team owns its cloud.
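A cross-cloud pub/sub bridge usually needs two things: a provider-neutral message envelope and duplicate suppression, because managed brokers typically deliver at least once. The sketch below models that with plain Python and no cloud SDK; `make_envelope`, `Bridge`, and the envelope fields are hypothetical names, and `publish` stands in for whatever client call the destination broker requires.

```python
import json
import uuid
from typing import Callable, Set


def make_envelope(source: str, event_type: str, payload: dict) -> str:
    """Wrap a payload in a neutral JSON envelope with a unique id."""
    return json.dumps({
        "id": str(uuid.uuid4()),
        "source": source,        # cloud that produced the event, e.g. "gcp"
        "type": event_type,
        "payload": payload,
    })


class Bridge:
    """Forwards envelopes to one destination cloud, skipping duplicates and loops."""

    def __init__(self, publish: Callable[[str], None], destination: str) -> None:
        self._publish = publish          # assumed: wraps the destination broker's client
        self._destination = destination
        self._seen: Set[str] = set()     # unbounded here; a real bridge would expire ids

    def forward(self, raw: str) -> bool:
        message = json.loads(raw)
        if message["source"] == self._destination:
            return False                 # originated on the destination side: avoid a loop
        if message["id"] in self._seen:
            return False                 # redelivery of a message already forwarded
        self._seen.add(message["id"])
        self._publish(raw)
        return True


if __name__ == "__main__":
    outbox = []
    bridge = Bridge(outbox.append, destination="aws")
    event = make_envelope("gcp", "order.created", {"order_id": 42})
    print(bridge.forward(event), bridge.forward(event))
```

The loop check matters once bridges run in both directions; without it, a message can bounce between the two brokers indefinitely.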

Pattern 3: Multi-Cloud for Vendor Lock-in Avoidance

Use Kubernetes on all clouds (GKE, EKS, AKS) and run the same app everywhere. Avoid cloud-specific services (for example, use a portable SQL engine such as PostgreSQL rather than BigQuery). This costs more, because you give up each cloud's best features, but gains flexibility.
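In application code, avoiding lock-in usually means depending on a small interface you own rather than on a provider SDK. A minimal sketch follows, assuming a hypothetical `ObjectStore` interface; `InMemoryStore` is a stand-in backend, and real adapters for S3, Cloud Storage, or Blob Storage would implement the same two methods.

```python
from typing import Dict, Protocol


class ObjectStore(Protocol):
    """The only storage surface the application is allowed to use."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in backend for tests and local runs."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def save_report(store: ObjectStore, report_id: str, body: str) -> str:
    """Application logic: knows about ObjectStore, not about any cloud."""
    key = f"reports/{report_id}.txt"
    store.put(key, body.encode("utf-8"))
    return key


if __name__ == "__main__":
    store = InMemoryStore()
    key = save_report(store, "2024-q1", "revenue up")
    print(key, store.get(key))
```

The trade-off named in the pattern shows up here too: the interface exposes only what every provider supports, so provider-specific features stay out of reach.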

Interview Questions

Beginner

What are the three major cloud providers?

AWS (largest, roughly 33% market share), Microsoft Azure (enterprise, roughly 23%), and Google Cloud (data/AI focus, roughly 11%); exact shares vary by quarter and analyst. All three are viable. AWS has the most services, GCP is often the cheapest, and Azure is the best fit for Windows/enterprise.

What is the AWS equivalent of Google Cloud Run?

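AWS Fargate (AWS App Runner is a closer match for simple HTTP services). Both Fargate and Cloud Run are serverless container platforms. Cloud Run is simpler; Fargate is more configurable but more complex, since it runs under ECS or EKS. For simple HTTP services, Cloud Run is easier.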

Why would you use multiple cloud providers?

To avoid vendor lock-in, to improve resilience (if one cloud fails), and to use the best tool for each job (BigQuery on GCP, DynamoDB on AWS). Most enterprises use 2-3 clouds.

Intermediate

You need a NoSQL database. Compare DynamoDB, Firestore, and Cosmos DB.

DynamoDB (AWS): Key-value and document, scales to millions of requests per second, expensive at scale. Firestore (GCP): Document-native, good queries, cheaper. Cosmos DB (Azure): Multi-model, expensive, but global multi-region writes are built in. For startups: Firestore. For scale: DynamoDB. For Microsoft shops: Cosmos DB.

Why is BigQuery better than Redshift?

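BigQuery is fully serverless (no infrastructure to manage). Redshift has traditionally required provisioned clusters that you size and tune (a serverless option now exists). BigQuery scans terabytes in seconds by combining columnar storage on Google's Colossus file system with the Dremel query engine. The same ad hoc query on an undersized Redshift cluster can take minutes. Cost is roughly comparable, so for exploratory analytics BigQuery is usually the faster and lower-maintenance choice; the size of the gap depends on the workload.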

How do you migrate from AWS to GCP?

1. Map services (EC2 → Compute Engine, RDS → Cloud SQL, S3 → Cloud Storage).
2. Transfer data with Database Migration Service (databases), Storage Transfer Service (S3 objects), or Dataflow (custom pipelines).
3. Rewrite code that depends on cloud-specific services (for example, replace S3 SDK calls with Cloud Storage client calls).
4. Run both clouds in parallel for 1-2 months and test.
5. Switch DNS to GCP.

Most migrations take 3-6 months.
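The service-mapping step lends itself to a small script run against a service inventory. This is a sketch only: the `AWS_TO_GCP` table copies pairs from the mapping tables in this lesson, and `plan_migration` is a hypothetical helper, not part of any migration tool.

```python
from typing import Dict, Iterable, List, Tuple

# Pairs taken from this lesson's service mapping tables; extend as needed.
AWS_TO_GCP: Dict[str, str] = {
    "EC2": "Compute Engine",
    "RDS": "Cloud SQL",
    "S3": "Cloud Storage",
    "Lambda": "Cloud Functions",
    "EKS": "GKE",
    "DynamoDB": "Firestore / Datastore",
    "Redshift": "BigQuery",
}


def plan_migration(inventory: Iterable[str]) -> Tuple[Dict[str, str], List[str]]:
    """Split an AWS service inventory into mapped targets and services needing review."""
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for service in inventory:
        target = AWS_TO_GCP.get(service)
        if target is None:
            unmapped.append(service)   # no direct equivalent listed: review by hand
        else:
            mapped[service] = target
    return mapped, unmapped


if __name__ == "__main__":
    mapped, unmapped = plan_migration(["EC2", "S3", "Kinesis"])
    print(mapped)
    print("needs manual review:", unmapped)
```

The unmapped list is the useful output: those services are where most of the rewrite effort in step 3 tends to land.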

Scenario-based

Your startup uses AWS but BigQuery queries run 100x faster on GCP. Migrate?

Migrate analytics to GCP BigQuery and keep operational systems on AWS. Use Change Data Capture (CDC) or ELT to stream data from AWS to BigQuery. Cost (illustrative): $5k of engineering plus $500/month of multi-cloud overhead, such as egress and pipeline upkeep. Benefit: analytics 100x faster means faster insights and better product decisions. ROI: positive in 2-3 months under these assumptions.
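The incremental-load idea behind such a pipeline can be sketched with a high-watermark poll, a simpler stand-in for log-based CDC. Everything here is an assumption for illustration: two in-memory SQLite databases play the roles of the AWS source and the warehouse, and the `orders` table and its columns are hypothetical.

```python
import sqlite3
from typing import List, Tuple

Row = Tuple[int, float, int]  # (id, amount, updated_at)


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with the hypothetical orders table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER)"
    )
    return db


def extract_changes(source: sqlite3.Connection, watermark: int) -> Tuple[List[Row], int]:
    """Return rows modified after the watermark, plus the new watermark.

    Caveat: a strict '>' comparison can miss rows that share the watermark's
    timestamp but were written later; log-based CDC avoids that problem.
    """
    rows = source.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    new_watermark = rows[-1][2] if rows else watermark
    return rows, new_watermark


def load_changes(warehouse: sqlite3.Connection, rows: List[Row]) -> None:
    """Replace rows by primary key so that replaying a batch is harmless."""
    warehouse.executemany(
        "INSERT OR REPLACE INTO orders (id, amount, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    warehouse.commit()


if __name__ == "__main__":
    source, warehouse = make_db(), make_db()
    source.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(1, 10.0, 100), (2, 20.0, 200)])
    rows, watermark = extract_changes(source, 0)
    load_changes(warehouse, rows)
    print(len(rows), watermark)
```

Each run moves only the rows changed since the last one, which keeps cross-cloud egress proportional to the change rate, not the table size.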

You need 99.999% uptime (5 nines). Design it.

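Active-Active on multiple clouds: AWS + GCP + Azure. Each cloud runs the full app behind its own front end (Google Cloud Load Balancing, AWS CloudFront, Azure Front Door), with external DNS routing between them. Replicate data across clouds (3-way replication). If one cloud fails, the other two keep serving. Cost: roughly 3x infrastructure. For scale: a single provider at 99.95% allows about 4.4 hours of downtime per year, while 99.999% allows about 5.3 minutes. If failures were fully independent, even two 99.95% clouds would be down together for only about 8 seconds per year, but shared dependencies (DNS, deployments, data replication) dominate in practice, so five nines is a design target, not a guarantee.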
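Downtime figures for an availability target are easy to sanity-check. The sketch below assumes a 365-day year and, for the combined case, fully independent failures, which real multi-cloud setups only approximate; the function names are illustrative.

```python
from typing import Iterable

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 (non-leap year)


def downtime_seconds_per_year(unavailability: float) -> float:
    """Expected downtime per year for a given fraction of time unavailable."""
    return unavailability * SECONDS_PER_YEAR


def combined_unavailability(availabilities: Iterable[float]) -> float:
    """Probability that ALL replicas are down at once, assuming independence."""
    result = 1.0
    for availability in availabilities:
        result *= 1.0 - availability
    return result


if __name__ == "__main__":
    single = downtime_seconds_per_year(1 - 0.9995)
    five_nines = downtime_seconds_per_year(1 - 0.99999)
    two_clouds = downtime_seconds_per_year(combined_unavailability([0.9995, 0.9995]))
    print(f"one cloud at 99.95%: {single / 3600:.2f} hours/year")
    print(f"five nines budget:   {five_nines / 60:.2f} minutes/year")
    print(f"two independent:     {two_clouds:.2f} seconds/year")
```

Run as a script, this gives about 4.4 hours for one cloud at 99.95%, about 5.3 minutes for the five-nines budget, and about 7.9 seconds for two independent clouds. The last figure is a theoretical floor, since correlated failures are what usually break the independence assumption.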

Real-world Scenarios

Scenario 1: Enterprise Multi-Cloud

Fortune 500 company: 60% of workloads on Azure (Windows/SQL Server), 30% on AWS (legacy systems), 10% on GCP (data science). Each division owns its cloud choice. A central team sets up Azure ExpressRoute, AWS Direct Connect, and Google Cloud Interconnect for private connectivity between clouds. Data flows via pub/sub bridges. Traffic stays off the public internet, which is more secure and usually cheaper than internet egress.

Scenario 2: Startup Multi-Cloud for Resilience

Early-stage SaaS: Runs on GCP for cost efficiency, but the service is critical to customers, so a single cloud is a risk. Deploy a standby on AWS: GCP active (DNS weight 95%), AWS standby (weight 5%). A monthly failover test, plus the 5% of real users who always hit AWS, verifies that the standby works. If GCP fails, flip the weight to 100% AWS. Users switch over within the DNS record's TTL (about a minute with a 60-second TTL), not instantly.
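Weighted DNS routing can be simulated in a few lines to see how a 95/5 split behaves and what flipping the weights does. This is a model of the idea, not a DNS client: `pick_backend` and `fail_over` are hypothetical helpers, and weights are relative, as in typical weighted routing policies.

```python
import random
from typing import Dict


def pick_backend(weights: Dict[str, int], rng: random.Random) -> str:
    """Choose a backend with probability proportional to its weight."""
    names = [name for name, weight in weights.items() if weight > 0]
    if not names:
        raise ValueError("no backend has a positive weight")
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def fail_over(weights: Dict[str, int], failed: str) -> Dict[str, int]:
    """Return new weights with the failed backend set to zero."""
    return {name: (0 if name == failed else weight) for name, weight in weights.items()}


if __name__ == "__main__":
    rng = random.Random(0)                 # seeded so runs are repeatable
    weights = {"gcp": 95, "aws": 5}
    sample = [pick_backend(weights, rng) for _ in range(10_000)]
    print("share served by aws:", sample.count("aws") / len(sample))

    after_failure = fail_over(weights, "gcp")
    print(after_failure, pick_backend(after_failure, rng))
```

Resolvers cache answers for the record's TTL, so a weight change reaches users gradually; a short TTL on the weighted records keeps that window small.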

Scenario 3: Cloud-Specific Best Tool

Data company: GCP (BigQuery for BI), AWS (RDS for transactional data), Azure (Power BI for dashboards). Data flow: transactional data → AWS RDS → nightly extract → GCP BigQuery → Azure Power BI. Each team owns its cloud. Total cost in this example: $15k/month (vs an estimated $30k if everything ran on one cloud).

Summary

No "best" cloud; choose based on your use case: