Cosmos DB
Azure Cosmos DB is a globally distributed, multi-model NoSQL database with SLA-backed single-digit millisecond latency (under 10 ms at the 99th percentile for point reads and writes), automatic scaling, and five consistency levels.
Simple Explanation
Cosmos DB is a database that lives everywhere at once. You write data in one region and Cosmos DB replicates it to every region you add to the account, so users in Tokyo and New York both get <10 ms reads from a local copy.
When to Use Cosmos DB
- Globally distributed apps where users are geographically spread (e-commerce, gaming, social).
- Applications requiring guaranteed single-digit millisecond latency at any scale.
- Schema-flexible data where document structure evolves over time.
- Systems that need to process millions of requests per second.
APIs (Data Models)
| API | Data Model | Migrate From |
|---|---|---|
| NoSQL (Core/SQL) | JSON documents, SQL-like queries | New apps or DocumentDB |
| MongoDB | BSON documents | MongoDB Atlas or self-hosted |
| Cassandra | Column-family (CQL) | Apache Cassandra |
| Gremlin | Graph (vertices + edges) | Neo4j, TinkerPop |
| Table | Key-value (like Table Storage) | Azure Table Storage (with better SLA) |
| PostgreSQL | Distributed Postgres (Citus) | PostgreSQL scale-out |
Consistency Levels
Cosmos DB lets you choose the trade-off between consistency and latency/availability. Stronger = slower but more accurate; weaker = faster but may read slightly stale data.
| Level | Guarantee | Use When |
|---|---|---|
| Strong | Always reads the latest write | Financial transactions, inventory |
| Bounded Staleness | Reads lag by X versions or T seconds | Near-realtime with controlled staleness |
| Session (default) | Within a session, reads own writes | User-facing apps (recommended default) |
| Consistent Prefix | Reads never see out-of-order updates | Social feeds, timelines |
| Eventual | No ordering guarantee, highest availability | Like counters, IoT telemetry |
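The difference between Session and Eventual can be illustrated with a toy model. This is a minimal sketch, not how Cosmos DB is implemented: `ToyStore`, `Replica`, and the integer session token are hypothetical names, and the store has just one primary and one lagging replica. The idea it borrows from the real service is that a session read carries a token describing the client's latest write, and is only served by a replica that has caught up to that token.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    version: int = 0
    data: dict = field(default_factory=dict)


@dataclass
class ToyStore:
    """Hypothetical two-replica store used only to illustrate the guarantees."""
    primary: Replica = field(default_factory=Replica)
    lagging: Replica = field(default_factory=Replica)

    def write(self, key, value) -> int:
        """Apply a write to the primary only and return its version as a session token."""
        self.primary.version += 1
        self.primary.data[key] = value
        return self.primary.version

    def replicate(self) -> None:
        """Bring the lagging replica up to date with the primary."""
        self.lagging.version = self.primary.version
        self.lagging.data = dict(self.primary.data)

    def read_eventual(self, key):
        """Eventual: read whatever the nearest (lagging) replica currently has."""
        return self.lagging.data.get(key)

    def read_session(self, key, session_token: int):
        """Session: only use a replica that has seen the caller's own writes."""
        caught_up = self.lagging.version >= session_token
        replica = self.lagging if caught_up else self.primary
        return replica.data.get(key)


store = ToyStore()
token = store.write("cart", ["book"])
print(store.read_eventual("cart"))        # prints None: replica has not caught up yet
print(store.read_session("cart", token))  # prints ['book']: the token forces a fresh replica
```

Once `store.replicate()` runs, both read paths return the same value, which is the sense in which Eventual converges over time.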
Partitioning
Cosmos DB scales horizontally by distributing data across partitions. The Partition Key determines how data is split. Choosing the wrong partition key is the #1 cause of performance problems.
Choose a key with high cardinality (many unique values) and even distribution of requests. For user data: use userId. Avoid "hot" partition keys like status codes (e.g., active/inactive) that funnel all writes to 2 partitions.
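The effect of cardinality on distribution can be sketched with a small simulation. The hash function, the partition count of 4, and the 90/10 split between active and inactive are assumptions made for illustration; Cosmos DB uses its own hashing and manages physical partitions itself.

```python
import hashlib
from collections import Counter

PARTITIONS = 4  # hypothetical count, chosen only for this illustration


def partition_for(key_value: str, partitions: int = PARTITIONS) -> int:
    """Map a partition key value to a partition with a stable hash (illustrative only)."""
    digest = hashlib.md5(key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def busiest_share(key_values: list[str]) -> float:
    """Fraction of all writes that land on the single busiest partition."""
    counts = Counter(partition_for(k) for k in key_values)
    return max(counts.values()) / len(key_values)


# 10,000 writes keyed by userId (high cardinality) vs. status (two values).
by_user = [f"user-{i}" for i in range(10_000)]
by_status = ["active" if i % 10 else "inactive" for i in range(10_000)]

print(f"userId: busiest partition takes {busiest_share(by_user):.0%} of writes")
print(f"status: busiest partition takes {busiest_share(by_status):.0%} of writes")
```

With userId the busiest partition should receive roughly a quarter of the writes, while with status at least 90% of writes pile onto one partition no matter how many partitions exist.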
Request Units (RUs)
Cosmos DB billing is based on Request Units (RUs), a normalized measure of CPU, memory, and IOPS. A point read of a 1 KB item costs 1 RU. Writes cost more (about 5 RU for a 1 KB item). Configure throughput (RU/s) manually or use autoscale.
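A back-of-the-envelope throughput estimate follows directly from those per-operation costs. The sketch below assumes the rough figures quoted above (1 RU per 1 KB point read, about 5 RU per write), a 400 RU/s floor, and 100 RU/s provisioning steps; the function names and the example workload are hypothetical, and real charges vary with item size, indexing policy, and query shape.

```python
import math

# Rough per-operation costs from the text; treat them as estimates only.
RU_PER_POINT_READ = 1
RU_PER_WRITE = 5


def required_ru_per_second(reads_per_sec: int, writes_per_sec: int) -> int:
    """Estimate the RU/s consumed by a steady mix of point reads and writes."""
    return reads_per_sec * RU_PER_POINT_READ + writes_per_sec * RU_PER_WRITE


def provisioned_ru(required: int, increment: int = 100, minimum: int = 400) -> int:
    """Round the estimate up to the next provisioning step (assumed 100 RU/s, floor 400)."""
    return max(minimum, math.ceil(required / increment) * increment)


# Hypothetical workload: 500 point reads/s and 130 writes/s of ~1 KB items.
needed = required_ru_per_second(reads_per_sec=500, writes_per_sec=130)
print(needed)                  # prints 1150
print(provisioned_ru(needed))  # prints 1200
```

An estimate like this is only a starting point; the request charge reported for each operation is the figure to trust once real traffic exists.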
Commands
# Create a Cosmos DB account (NoSQL API)
az cosmosdb create \
  --resource-group rg-database \
  --name mycosmosdb-account \
  --kind GlobalDocumentDB \
  --locations regionName=eastus failoverPriority=0 isZoneRedundant=False \
  --default-consistency-level Session
# Add a second read region (global distribution); pass --locations once per region
az cosmosdb update \
  --resource-group rg-database \
  --name mycosmosdb-account \
  --locations regionName=eastus failoverPriority=0 isZoneRedundant=False \
  --locations regionName=westeurope failoverPriority=1 isZoneRedundant=False
# Create a database
az cosmosdb sql database create \
  --account-name mycosmosdb-account \
  --resource-group rg-database \
  --name AppDB
# Create a container with partition key and fixed (manual) throughput
az cosmosdb sql container create \
  --account-name mycosmosdb-account \
  --resource-group rg-database \
  --database-name AppDB \
  --name UserProfiles \
  --partition-key-path "/userId" \
  --throughput 400
# Switch the container from fixed throughput to autoscale
az cosmosdb sql container throughput migrate \
  --account-name mycosmosdb-account \
  --resource-group rg-database \
  --database-name AppDB \
  --name UserProfiles \
  --throughput-type autoscale
# Set the autoscale maximum (scales between 10% and 100% of this value)
az cosmosdb sql container throughput update \
  --account-name mycosmosdb-account \
  --resource-group rg-database \
  --database-name AppDB \
  --name UserProfiles \
  --max-throughput 4000
Hands-on
- Create a Cosmos DB account with NoSQL API in a single region.
- Create a database and container with /userId as partition key.
- Use Data Explorer (in the Portal) to insert and query JSON documents.
- Add a second region and observe replication in the Replicate data globally panel.
- Try changing the consistency level to Eventual and discuss the trade-offs.
Debugging Scenario
Issue: Queries are performing 100x worse than expected.
- Enable Diagnostics in Portal → check QueryMetrics in query response headers.
- A cross-partition query (no partition key in filter) scans all partitions, which is extremely expensive on large containers. Add partition key to all WHERE clauses.
- Check if RU/s is throttled (429 Too Many Requests) — increase throughput or enable autoscale.
- Review partition distribution: a hot partition causes latency spikes even with sufficient total RU/s.
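The hot-partition point in the last step can be checked with simple arithmetic. This sketch assumes provisioned RU/s is divided evenly across physical partitions; the function name and the demand figures are hypothetical.

```python
def throttled_partitions(total_ru: int, demand_by_partition: list[float]) -> list[int]:
    """Return indexes of partitions whose demand exceeds their even share of total_ru."""
    per_partition_budget = total_ru / len(demand_by_partition)
    return [
        index
        for index, demand in enumerate(demand_by_partition)
        if demand > per_partition_budget
    ]


total_provisioned = 10_000            # RU/s provisioned on the container
demand = [4_000, 300, 300, 200, 200]  # RU/s requested per partition (5,000 in total)

print(sum(demand) <= total_provisioned)                 # prints True: total demand fits
print(throttled_partitions(total_provisioned, demand))  # prints [0]: partition 0 exceeds its 2,000 RU/s share
```

Total demand here is only half of what is provisioned, yet partition 0 would still see 429 responses, which is why raising RU/s alone often does not fix a skewed partition key.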
Interview Questions
Beginner
A normalized unit representing cost (CPU+memory+IO) of a Cosmos DB operation. A 1 KB point read = 1 RU. You provision a budget of RU/s, and exceeding it triggers throttling (429).
Intermediate
Uneven distribution creates "hot partitions" — one partition takes all the load while others sit idle. This limits throughput to the capacity of that single partition, even if total RU/s is high.
Session: reads within your session always reflect your own writes (default, recommended for most apps). Eventual: no ordering guarantee — reads may see stale data even from your own writes. Eventual is fastest but inconsistent across sessions.
Scenario-based
Cosmos DB with global distribution, Eventual consistency (leaderboard can tolerate slight staleness), partition key by region or game ID, autoscale throughput to handle bursts. Cache hot leaderboard pages in Redis.
Summary
Cosmos DB is built for global-scale, high-throughput NoSQL workloads. The multi-API support simplifies migration from MongoDB, Cassandra, and others. The main decisions are: API choice, partition key (most critical for performance), and consistency level (Session is the right default).