Fundamentals, Scalability, Databases, Caching, Load Balancing, Message Queues, Security, Case Studies — system design interview mastery.
🏗️
Fundamentals
CORE
concepts/cap-theorem.md
# ── CAP Theorem ──
A distributed system can guarantee at most 2 of 3:
C - Consistency: Every read returns the most recent write
A - Availability: Every request receives a response (non-error)
P - Partition Tolerance: System operates despite network splits
In practice, you MUST choose P (networks fail), so choose:
┌──────────────────┬───────────────┬───────────────────┐
│ CP System        │ AP System     │ CA System         │
├──────────────────┼───────────────┼───────────────────┤
│ HBase            │ DynamoDB      │ Single-node RDBMS │
│ MongoDB (config) │ Cassandra     │ (not distributed) │
│ Redis Cluster    │ CouchDB       │                   │
│ ZooKeeper        │ Eureka        │                   │
└──────────────────┴───────────────┴───────────────────┘
# ── BASE vs ACID ──
ACID (relational DBs):
- Atomicity: Transaction all-or-nothing
- Consistency: Valid state to valid state
- Isolation: Concurrent txns don't interfere
- Durability: Committed data survives crashes
BASE (NoSQL / distributed):
- Basically Available: The system responds to every request, though possibly with stale data
- Soft State: Data may change without input (replication)
- Eventually Consistent: All replicas converge over time
Always start with requirements and estimations. Ask clarifying questions: read-heavy or write-heavy? Expected scale? Consistency requirements? These determine your architecture choices.
📈
Scalability
GROWTH
architecture/scaling.md
# ── Vertical vs Horizontal Scaling ──
VERTICAL (Scale Up):
+ Simpler (no code changes)
- Hardware limits (max RAM, CPU)
- Single point of failure
- Expensive at high end
HORIZONTAL (Scale Out):
+ Nearly unlimited capacity
+ Fault tolerance (redundancy)
- More complex (consistency, coordination)
- Requires stateless services
# ── Load Balancing ──
Client → Load Balancer → [Server 1]
                       → [Server 2]
                       → [Server 3]
Algorithms:
- Round Robin: Sequential rotation
- Weighted Round Robin: By server capacity
- Least Connections: Route to least busy
- IP Hash: Sticky sessions by client IP
- Least Latency: Route to fastest response
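The two most common algorithms above can be sketched in a few lines of plain JavaScript. This is an illustrative in-memory model, not how a real balancer is built — NGINX and HAProxy implement these in C:

```javascript
// Round robin: rotate through servers in order
function makeRoundRobin(servers) {
  let i = 0;
  return () => servers[i++ % servers.length];
}

// Least connections: pick the server with the fewest active connections
// (server objects and the activeConnections field are illustrative)
function leastConnections(servers) {
  return servers.reduce((best, s) =>
    s.activeConnections < best.activeConnections ? s : best);
}

const next = makeRoundRobin(['10.0.1.1', '10.0.1.2', '10.0.1.3']);
next(); // '10.0.1.1'
next(); // '10.0.1.2'

leastConnections([
  { host: '10.0.1.1', activeConnections: 12 },
  { host: '10.0.1.2', activeConnections: 3 },
  { host: '10.0.1.3', activeConnections: 7 },
]); // → the 10.0.1.2 entry
```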
# ── Scaling the Database ──
READ REPLICAS:
Master (write) → Replica 1 (read)
               → Replica 2 (read)
               → Replica 3 (read)
SHARDING (Horizontal Partitioning):
Users 1-1M  → Shard 1 (DB)
Users 1M-2M → Shard 2 (DB)
Users 2M-3M → Shard 3 (DB)
Sharding Strategies:
- Hash-based: hash(user_id) % num_shards
- Range-based: A-M → shard 1, N-Z → shard 2
- Directory-based: Lookup service maps keys to shards
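Hash-based routing can be sketched in a few lines. The djb2 string hash here is purely illustrative — production systems use hashes like MurmurHash or xxHash, and often consistent hashing, because plain modulo remaps almost every key when the shard count changes:

```javascript
// Simple string hash (djb2 variant) — illustrative stand-in only
function hashKey(key) {
  let h = 5381;
  for (const ch of String(key)) h = ((h * 33) ^ ch.charCodeAt(0)) >>> 0;
  return h;
}

// Hash-based sharding: hash(key) % numShards
function shardFor(key, numShards) {
  return hashKey(key) % numShards;
}

shardFor('user:42', 4); // same key → same shard, every time
// Resharding problem: shardFor('user:42', 5) likely lands elsewhere —
// changing numShards remaps most keys (motivation for consistent hashing)
```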
Monolith vs Microservices

| Aspect | Monolith | Microservices |
|---|---|---|
| Deployment | One unit | Independent services |
| Scaling | Scale whole app | Scale per service |
| Complexity | Low (initially) | High (ops) |
| Tech Stack | Single | Polyglot |
| Team Size | Small | Large (per service) |
| Failure Impact | Full outage | Partial |
| Communication | In-process | Network (API/msg) |
| Best For | MVP, small teams | Large orgs, scale |
API Gateway

| Feature | Description |
|---|---|
| Routing | Route requests to services |
| Auth | Centralize authentication |
| Rate Limiting | Per-client/per-service limits |
| SSL Termination | Handle TLS at gateway |
| Load Balancing | Distribute across instances |
| Caching | Cache frequent responses |
| Logging | Central request logging |
| Circuit Breaker | Fail fast on downstream errors |
| Examples | Kong, AWS API GW, NGINX, Envoy |
architecture/microservices.yaml
# ── Microservices Architecture (High-level) ──
# Client → CDN → API Gateway → Auth Service
#                            → User Service
#                            → Order Service
#                            → Payment Service
#                            → Notification Service
#
# Communication:
# - Sync: REST/gRPC between services
# - Async: Message Queue (Kafka, RabbitMQ) for events
#
# Data: Database per service (no shared DB)
# Discovery: Service registry (Consul, Eureka) or DNS
# Config: Centralized (Consul, Spring Cloud Config)
# Monitoring: Metrics + logs + traces (Prometheus, ELK, Jaeger)
⚠️
Start monolith, extract when needed. Only move to microservices when you have clear team boundaries, scaling needs per service, or independent deployment requirements. Premature microservices add massive operational overhead.
🗄️
Databases
STORAGE
SQL vs NoSQL

| Aspect | SQL | NoSQL |
|---|---|---|
| Schema | Fixed (structured) | Flexible (dynamic) |
| Query | SQL language | API-specific |
| Scaling | Vertical (sharding hard) | Horizontal (built-in) |
| Joins | Native joins | Usually manual |
| ACID | Full ACID | Eventual consistency |
| Best For | Complex queries, relations | Simple access, huge scale |
| Examples | PostgreSQL, MySQL | MongoDB, Cassandra, DynamoDB |
NoSQL Types

| Type | Examples | Use Case |
|---|---|---|
| Document | MongoDB, CouchDB | Content management, user profiles |
| Key-Value | Redis, DynamoDB | Sessions, caching, shopping cart |
| Column | Cassandra, HBase | Time-series, analytics, IoT |
| Graph | Neo4j, Amazon Neptune | Social networks, recommendations |
| Search | Elasticsearch | Full-text search, logging |
database/schema-design.sql
-- ── SQL: Denormalization for Read Performance ──
-- Normalized (write-optimized):
-- users(id, name, email)
-- posts(id, user_id, title, body)
-- comments(id, post_id, user_id, text)
-- Denormalized (read-optimized, for feed):
CREATE TABLE feed (
  id              BIGINT PRIMARY KEY,
  user_id         BIGINT,
  username        VARCHAR(50),    -- denormalized from users
  user_avatar_url TEXT,           -- denormalized from users
  title           VARCHAR(255),
  body            TEXT,
  like_count      INT DEFAULT 0,  -- pre-computed counter
  comment_count   INT DEFAULT 0,
  created_at      TIMESTAMP,
  INDEX idx_user_id (user_id),
  INDEX idx_created_at (created_at)
);
-- ── Indexing Strategy ──
-- Single column
CREATE INDEX idx_email ON users(email);
-- Composite (order matters!)
CREATE INDEX idx_user_created ON posts(user_id, created_at DESC);
-- Covering index (all columns in index)
CREATE INDEX idx_covering ON orders(user_id, status, total_amount);
-- Partial index
CREATE INDEX idx_active ON users(email) WHERE is_active = true;
-- Unique index
CREATE UNIQUE INDEX idx_unique_email ON users(email);
Choose your database based on the query pattern, not just the data. If you need complex joins and ACID, use SQL. If you need horizontal scale and flexible schema, use NoSQL. Many systems use both: SQL for transactions, NoSQL for caching and large-scale reads.
⚡
Caching
PERFORMANCE
caching/strategies.md
# ── Cache Strategies ──
CACHE-ASIDE (Lazy Loading):
1. App checks cache
2. Cache miss → fetch from DB → store in cache → return
3. Cache hit → return from cache
+ Simple; data is only as stale as the TTL allows
- First request is slow (cold start)
READ-THROUGH:
1. App asks cache
2. Cache miss → cache fetches from DB → stores → returns
+ App doesn't know about the DB
- Data might be stale
WRITE-THROUGH:
1. App writes to cache
2. Cache writes to DB (synchronously)
+ Data always consistent
- Slower writes
WRITE-BEHIND (Write-Back):
1. App writes to cache (fast return)
2. Cache writes to DB asynchronously
+ Very fast writes
- Data loss on cache failure
WRITE-AROUND:
1. App writes to DB directly (cache is not updated)
2. Next read causes cache miss → refresh
+ Keeps rarely-read data out of the cache
- Reads are slower until cache refreshes
caching/redis-examples.js
// ── Redis Cache Implementation (node-redis v4 API) ──
const { createClient } = require('redis');
const client = createClient({ url: 'redis://localhost:6379' });
// await client.connect(); // connect once at app startup

async function getUser(userId) {
  // 1. Check cache
  const cached = await client.get(`user:${userId}`);
  if (cached) return JSON.parse(cached);
  // 2. Cache miss → fetch from DB
  const user = await db.users.findById(userId);
  // 3. Store in cache with TTL (5 min)
  await client.setEx(`user:${userId}`, 300, JSON.stringify(user));
  return user;
}

// ── Cache Invalidation Strategies ──
// Time-based (TTL)
await client.setEx('key', 3600, value); // expire in 1 hour

// Event-based (write invalidation)
async function updateUser(userId, data) {
  await db.users.update(userId, data);
  await client.del(`user:${userId}`); // invalidate cache
}

// Version-based (bump the key version instead of deleting)
await client.set(`user:${userId}:v2`, JSON.stringify(user));

// ── Common Cache Patterns ──
// Rate limiter (fixed window)
async function rateLimit(userId, limit = 100, window = 60) {
  const key = `rate:${userId}`;
  const current = await client.incr(key);
  if (current === 1) await client.expire(key, window);
  return current <= limit;
}

// Distributed lock (SET with NX + expiry)
async function acquireLock(key, ttl = 10) {
  const result = await client.set(key, 'locked', { PX: ttl * 1000, NX: true });
  return result === 'OK';
}
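The rate limiter above uses a fixed window, which allows a burst at window boundaries. A sliding window avoids that by tracking individual request timestamps. The core algorithm is shown here in memory; with Redis you would typically keep the timestamps in a sorted set per user (ZADD, then ZREMRANGEBYSCORE to evict old entries, then ZCARD to count). Names and parameters are illustrative:

```javascript
// Sliding-window rate limiter (in-memory sketch of the algorithm)
const hits = new Map(); // userId → array of request timestamps (ms)

function allowRequest(userId, limit, windowMs, now = Date.now()) {
  // Drop timestamps that have slid out of the window
  const times = (hits.get(userId) || []).filter(t => t > now - windowMs);
  if (times.length >= limit) {
    hits.set(userId, times);
    return false; // over the limit for this window
  }
  times.push(now);
  hits.set(userId, times);
  return true;
}
```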
Cache Placement

| Level | Speed | Example |
|---|---|---|
| Browser cache | Fastest | Cache-Control headers |
| CDN cache | Very fast | CloudFront, Cloudflare |
| Load balancer | Fast | Nginx proxy_cache |
| Application cache | Fast | Redis, Memcached |
| Database cache | Moderate | InnoDB buffer pool (MySQL query cache was removed in 8.0) |
| Disk cache | Slow | Page cache, OS buffer |
Redis vs Memcached

| Feature | Redis | Memcached |
|---|---|---|
| Data structures | Strings, lists, sets, hashes, zsets | Strings only |
| Persistence | RDB + AOF | No (memory only) |
| Replication | Master-replica | No |
| Clustering | Redis Cluster | Client-side sharding |
| Pub/Sub | Yes | No |
| Lua scripting | Yes | No |
| Max memory | TB+ | GB range |
| Use case | General purpose caching | Simple caching |
🚫
Cache invalidation is one of the hardest problems in CS. Always set a TTL as a safety net. Use event-based invalidation for write-heavy data. Cache only what's frequently accessed and expensive to compute.
⚖️
Load Balancing & CDN
DISTRIBUTION
architecture/load-balancer.conf
# ── Nginx Load Balancer Configuration ──
upstream backend_servers {
    least_conn;                       # least connections algorithm
    server 10.0.1.1:8080 weight=3;    # higher weight = more traffic
    server 10.0.1.2:8080 weight=2;
    server 10.0.1.3:8080 backup;      # only used when others fail
    # Health check (NGINX Plus, commercial version)
    # health_check interval=10s fails=3 passes=2;
}

# Rate limiting (limit_req_zone must be declared in the http {} context)
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

server {
    listen 80;
    server_name example.com;

    location /api/ {
        limit_req zone=api burst=20 nodelay;
        proxy_pass http://backend_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        # Timeout settings
        proxy_connect_timeout 5s;
        proxy_read_timeout 30s;
    }

    # Static files (served directly, no proxy)
    location /static/ {
        alias /var/www/static/;
        expires 30d;
        add_header Cache-Control "public, immutable";
    }
}
Load Balancer Types

| Type | Layer | Examples | Use Case |
|---|---|---|---|
| L4 (Transport) | TCP/UDP | LVS, HAProxy | Raw performance |
| L7 (Application) | HTTP | Nginx, ALB, Envoy | URL routing, headers |
| DNS | DNS resolution | Route 53, Cloudflare | Geographic routing |
| Global | Any | Cloudflare, Akamai | Multi-region |
CDN (Content Delivery Network)

| Feature | Description |
|---|---|
| Edge caching | Cache content at POPs worldwide |
| Static assets | JS, CSS, images, fonts |
| DDoS protection | Absorb and filter attacks |
| SSL termination | TLS at edge, reduce server load |
| Origin shield | Reduce origin requests |
| Providers | Cloudflare, AWS CloudFront, Fastly |
💡
Use CDN for ALL static assets (images, JS, CSS, fonts). This reduces origin server load by 60-80% and dramatically improves latency for global users. Fingerprint asset filenames with content hashes and serve them with long-lived, immutable Cache-Control headers.
📨
Message Queues
MESSAGING
# ── Messaging Patterns ──
PUB/SUB:
Producer → [Topic] → Consumer A (notifications)
                   → Consumer B (analytics)
                   → Consumer C (audit log)
# Each subscriber receives EVERY message

COMPETING CONSUMERS:
Producer → [Queue] → Consumer A (worker 1) → DB
                   → Consumer B (worker 2) → DB
                   → Consumer C (worker 3) → DB
# Each message goes to ONE consumer

REQUEST-REPLY:
Client → [Request Queue] → Worker → [Reply Queue] → Client
# Use correlation_id to match request/response

DEAD LETTER QUEUE:
Main Queue → Consumer (fails repeatedly) → DLQ → Alert + manual processing
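The request-reply correlation step can be sketched with in-memory callbacks standing in for the two queues (with RabbitMQ you would use the `reply_to` and `correlation_id` message properties instead; the worker and helper names here are illustrative):

```javascript
// Request-reply matched by correlation id (in-memory sketch)
const pending = new Map(); // correlationId → resolve callback

function sendRequest(worker, payload) {
  const correlationId = Math.random().toString(36).slice(2);
  return new Promise(resolve => {
    pending.set(correlationId, resolve);
    // "publish" to the request queue; second arg is the reply-queue consumer
    worker({ correlationId, payload }, reply => {
      // match the reply back to its original request
      const resolveFn = pending.get(reply.correlationId);
      pending.delete(reply.correlationId);
      resolveFn(reply.result);
    });
  });
}

// Example worker: squares numbers and echoes the correlation id
const squareWorker = (msg, publishReply) =>
  publishReply({ correlationId: msg.correlationId, result: msg.payload ** 2 });

// sendRequest(squareWorker, 7) resolves to 49
```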
💡
Use Kafka for event streaming (order events, user activity logs, real-time analytics). Use RabbitMQ/SQS for task queues (email, PDF generation, image processing). Kafka preserves ordering within a partition and supports replay.
🔒
Security
SECURITY
security/auth-flow.md
# ── Authentication Flow (JWT) ──
1. Client → POST /login (username + password)
2. Server validates credentials
3. Server returns JWT (access + refresh tokens)
4. Client stores tokens (httpOnly cookie or secure storage)
5. Client sends JWT in Authorization: Bearer <token>
6. Server validates JWT signature
7. Return protected resource
JWT Structure:
Header: {"alg": "HS256", "typ": "JWT"}
Payload: {"sub": "user_123", "role": "admin", "exp": 1700000000}
Signature: HMAC-SHA256(base64(header) + "." + base64(payload), secret)
# ── OAuth 2.0 Flow (Authorization Code) ──
1. Client redirects to: Auth Server /authorize?response_type=code
2. User authenticates & authorizes
3. Auth Server redirects with: ?code=AUTH_CODE
4. Client exchanges code: POST /token (code → access_token + refresh_token)
5. Client uses access_token to call APIs
6. Refresh token used when access_token expires
Security Best Practices

| Practice | Description |
|---|---|
| HTTPS everywhere | TLS 1.3, HSTS headers |
| Password hashing | bcrypt/argon2 (never MD5/SHA1) |
| JWT | Short expiry, httpOnly cookies, RS256 |
| Rate limiting | Prevent brute force attacks |
| Input validation | Whitelist, parameterized queries |
| CORS | Restrict origins, not * wildcard |
| Secrets | Env vars, vault, never in code |
| Dependency scanning | npm audit, Snyk |
| Least privilege | Minimal permissions per service |
| Logging | Audit logs, no sensitive data in logs |
Security Headers

| Header | Purpose |
|---|---|
| Content-Security-Policy | Prevent XSS, restrict sources |
| X-Frame-Options | Prevent clickjacking |
| X-Content-Type-Options | Prevent MIME sniffing |
| Strict-Transport-Security | Force HTTPS |
| X-XSS-Protection | XSS filter (legacy) |
| Access-Control-Allow-Origin | CORS policy |
| Referrer-Policy | Control referrer info |
📖
Case Studies
DESIGN
case-studies/twitter-design.md
# ── Design Twitter / News Feed ──
Requirements:
- Post tweets (500M/day, 5,800/sec)
- Home feed (fan-out to followers)
- View timeline (chronological)
- Search tweets
Estimation:
- 300M users, 100M DAU
- Average 200 followers per user
- Tweet: 200 bytes avg
- Storage: 500M * 200B * 365 ≈ 36 TB/year
Architecture:
┌─────────┐    ┌──────────┐    ┌─────────────┐
│ Client  │───→│  API GW  │───→│  Tweet Svc  │
└─────────┘    └──────────┘    └──────┬──────┘
                                      │
                      ┌───────────────┘
                      ↓
        ┌──────────────┐    ┌──────────────┐
        │  Write Path  │    │  Read Path   │
        │  Tweet → DB  │    │  Feed Cache  │
        │ Fan-out Queue│    │   (Redis)    │
        └──────────────┘    └──────────────┘
Fan-out Strategies:
1. Fan-out on WRITE: Push tweet to all followers' feeds
   + Read is fast (O(1))
   - Write is slow for celebrities (millions of followers)
2. Fan-out on READ: Pull tweets from followed users
   + Write is fast
   - Read is slow (need to merge many feeds)
3. Hybrid: Fan-out on write for normal users,
   fan-out on read for celebrities (>1M followers)
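The hybrid decision reduces to a threshold check at publish time (a sketch; the 1M cutoff follows the strategy above, and the author object shape is illustrative):

```javascript
// Hybrid fan-out: choose a strategy per author at tweet-publish time
const CELEBRITY_THRESHOLD = 1_000_000;

function fanoutStrategy(author) {
  // Normal users: push the tweet into each follower's cached feed on write.
  // Celebrities: skip the push; followers pull and merge their tweets on read.
  return author.followerCount > CELEBRITY_THRESHOLD ? 'on-read' : 'on-write';
}

// On read, a follower's home feed is their precomputed feed cache
// merged with recent tweets pulled from any followed celebrities.
```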
Components:
- Tweet Service: Create/store tweets (PostgreSQL + Redis cache)
- Fan-out Service: Distribute tweets to follower feeds (Kafka)
- Feed Service: Compose home feed (Redis sorted sets)
- User Service: Profile, followers list
- Search Service: Elasticsearch for full-text search
- Notification Service: Email/push on events
case-studies/url-shortener.md
# ── Design URL Shortener (tinyurl.com) ──
Requirements:
- Shorten long URLs
- Redirect short → long
- Custom aliases (optional)
- Analytics (click tracking)
- High availability
Estimation:
- 100M URLs stored
- 100:1 read:write ratio
- 500K new URLs/day
- 50M redirects/day
Key Generation:
1. Counter + Base62 encoding (0-9, a-z, A-Z)
   ID 1234567 → "5BAN" (base62, digit order 0-9A-Za-z)
   + Short, sortable
   - Single point of failure (counter service)
2. Hash (MD5/SHA-256) → first 7 chars
   + Distributed generation
   - Collisions possible (needs check-and-retry)
3. Pre-generated random strings
   + No coordination needed at request time
   - Wastes some IDs
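Strategy 1 is a straight base conversion. A minimal sketch, using the 0-9A-Za-z digit ordering (other orderings of the 62 characters are equally valid and simply produce different strings):

```javascript
const ALPHABET =
  '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';

// Convert an auto-increment ID into a short base62 code
function toBase62(id) {
  if (id === 0) return '0';
  let code = '';
  while (id > 0) {
    code = ALPHABET[id % 62] + code; // least-significant digit first
    id = Math.floor(id / 62);
  }
  return code;
}

toBase62(1234567); // '5BAN'
```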
Architecture:
┌────────┐    ┌──────────┐    ┌──────────┐
│ Client │──→│  API GW  │──→│ URL Svc  │
└────────┘    └──────────┘    └────┬─────┘
                                   │
                     ┌─────────────┤
                     ↓             ↓
               ┌──────────┐  ┌──────────┐
               │  Cache   │  │ Database │
               │ (Redis)  │  │  (SQL)   │
               └──────────┘  └──────────┘
Database Schema:
urls:
  id         BIGINT PRIMARY KEY AUTO_INCREMENT,
  short_code VARCHAR(7) UNIQUE,
  long_url   TEXT NOT NULL,
  user_id    BIGINT,
  created_at TIMESTAMP,
  expires_at TIMESTAMP NULL
clicks:
  id         BIGINT PRIMARY KEY AUTO_INCREMENT,
  short_code VARCHAR(7),
  ip_address VARCHAR(45),
  user_agent TEXT,
  referrer   TEXT,
  clicked_at TIMESTAMP
Cache: short_code → long_url (Redis, TTL: 24h)
Analytics: Click events → Kafka → Analytics DB
💡
System design interviews typically cover: URL shortener, pastebin, Twitter/news feed, chat system, Google Maps, Netflix, Uber, rate limiter, key-value store, web crawler, notification system. Practice 5-6 of these thoroughly.
🎯
Interview Q&A
PREP
Q: Explain the CAP theorem.
A: In a distributed system, you can guarantee only 2 of 3: Consistency, Availability, Partition Tolerance. Since network partitions are inevitable, you choose between CP (consistent but unavailable during partitions — HBase, ZooKeeper) or AP (available but may return stale data — DynamoDB, Cassandra).

Q: SQL vs NoSQL — when to use which?
A: SQL for structured data with relationships, complex queries, and ACID requirements (financial, inventory). NoSQL for massive scale, flexible schema, simple access patterns, and high write throughput (social feeds, IoT, session storage). Many systems use both: SQL for transactions, NoSQL for caching and analytics.

Q: How does a load balancer work?
A: A load balancer distributes incoming traffic across multiple servers using algorithms (round-robin, least connections, IP hash). L4 operates at TCP/UDP level (fast). L7 operates at HTTP level (can route by URL, headers). Health checks remove unhealthy servers. It also handles SSL termination and rate limiting.

Q: What is eventual consistency?
A: In an AP system, replicas may temporarily diverge but will eventually converge. Example: you post a tweet, your followers might not see it immediately but will within seconds. Achieved via read repair, hinted handoff, anti-entropy. Contrast with strong consistency (every read returns the latest write).

Q: Design a rate limiter.
A: Token Bucket: fixed-rate token generation, allows bursts. Leaky Bucket: fixed-rate processing, smooth output. Fixed Window: count requests per time window (edge problem at boundaries). Sliding Window: weighted count across windows (Redis + sorted sets). Implement at the API gateway with Redis as the counter store.

Q: What is database sharding?
A: Horizontal partitioning — splitting data across multiple database instances by a shard key. Hash-based (hash(key) % N), range-based (A-M shard 1, N-Z shard 2). Challenges: cross-shard queries, resharding, hot spots. Solve with consistent hashing (virtual nodes) and join-free denormalized data.

Q: How to handle server failures?
A: Redundancy: run multiple instances behind a load balancer. Circuit breaker: stop calling failing services. Retry with exponential backoff. Health checks: auto-remove unhealthy servers. Graceful degradation: return cached/default data. Failover: hot standby or auto-promote replica. Multi-region: active-active or active-passive.

Q: Microservices vs Monolith?
A: Monolith: simpler to develop, deploy, debug. Better for small teams and MVPs. Microservices: independent deployment, scaling per service, polyglot. Better for large teams and mature products. Migrate when team size > 10 or services have different scaling needs. Extract services iteratively using the Strangler Fig pattern.
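The token-bucket variant from the rate-limiter answer above can be sketched in a few lines (capacity and refill rate are illustrative parameters; timestamps are injected so the sketch is deterministic and testable):

```javascript
// Token bucket: tokens refill at a fixed rate; each request spends one.
// Allows short bursts up to `capacity` while enforcing an average rate.
function makeTokenBucket(capacity, refillPerSec) {
  let tokens = capacity;
  let last = 0; // seconds
  return function allow(nowSec) {
    tokens = Math.min(capacity, tokens + (nowSec - last) * refillPerSec);
    last = nowSec;
    if (tokens >= 1) { tokens -= 1; return true; }
    return false;
  };
}

const allow = makeTokenBucket(2, 1); // burst of 2, refills 1 token/sec
allow(0); // true
allow(0); // true  (burst consumed)
allow(0); // false (bucket empty)
allow(1); // true  (1 token refilled after 1s)
```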
💡
Key topics to master: CAP theorem, consistent hashing, load balancing, database sharding, caching strategies, message queues, CDN, API gateway, microservices, system design process, and common case studies (Twitter, URL shortener, chat, rate limiter).