System Design Interview Prep Roadmap: Complete 8-Week Study Plan
In this tutorial, you'll learn about the System Design Interview Prep Roadmap. We cover key concepts, practical examples, and best practices.
A complete 8-week system design interview roadmap covering fundamentals, distributed systems patterns, real-world architecture studies, whiteboard practice, and mock interview preparation.
What You'll Learn
You will learn a structured approach to designing large-scale systems, master the common patterns used in system design interviews, understand trade-offs between architectures, and gain confidence through mock interview practice with real-world scenarios.
Why It Matters
System design interviews are often the hardest part of the technical interview process for senior engineering roles. They test your ability to think at scale, make architectural trade-offs, communicate clearly, and handle ambiguity. Unlike coding interviews, there is no single correct answer; interviewers evaluate your process, reasoning, and depth of knowledge.
Real-World Use
The DodaZIP and Doda Browser engineering teams use the same design patterns covered in this roadmap. When Doda Browser needed to build a real-time sync feature across 10 million devices, the engineering lead sketched a solution using the same methodology taught here: start with requirements, estimate scale, design the data model, then iterate on the architecture.
Your Learning Path
flowchart LR
A[Week 1-2: Fundamentals] --> B[Week 3-4: Core Patterns]
B --> C[Week 5-6: Real Systems]
C --> D[Week 7: Mock Practice]
D --> E[Week 8: Final Review]
A --> F{You Are Here}
style F fill:#f90,color:#fff
Prerequisites: 2+ years of software engineering experience. Familiarity with databases, APIs, caching, and basic networking concepts. This roadmap assumes you can already code at a senior engineer level.
Week 1-2: Fundamentals
Scalability Concepts
| Concept | Definition | Key Question |
|---|---|---|
| Vertical scaling | Add more power to a single machine | When is this still viable? |
| Horizontal scaling | Add more machines | How do you handle state? |
| Load balancing | Distribute traffic across servers | Which algorithm? Round-robin, least connections? |
| Caching | Store frequently accessed data in fast storage | What is the eviction policy? |
| CDN | Distribute static assets geographically | What should NOT go on a CDN? |
| Database sharding | Split data across multiple databases | What is the shard key? |
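To make the shard-key question concrete, here is a minimal sketch of hash-based shard routing. The shard count, the `shard_for` helper, and the user IDs are illustrative assumptions, not part of any particular database.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for(shard_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to a shard index using a stable hash."""
    # Python's built-in hash() is randomized per process for strings,
    # so use a stable digest to get the same shard on every server.
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


for user_id in ["user-1", "user-2", "user-3"]:
    print(user_id, "-> shard", shard_for(user_id))
```

One trade-off worth raising in an interview: with plain modulo routing, changing the shard count remaps most keys. Consistent hashing is the usual way to reduce that data movement.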
Estimation Skills
Example: Estimate Twitter timeline storage
Daily active users: 300M
Tweets per active user per day: 2
Total tweets per day: 600M
Average tweet size: 280 bytes (text) + ~100 bytes (metadata) = 380 bytes, rounded up to 400 bytes
Daily storage: 600M * 400 bytes = 240 GB/day
Yearly storage: 240 GB * 365 ≈ 87.6 TB/year
With 3x replication: ~263 TB/year
This tells us we need distributed storage from day 1.
Expected behavior: Running these estimations before designing gives you concrete constraints. If the numbers are too large for a single database, you know you need sharding or partitioning.
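The same back-of-the-envelope estimate can be written as a few lines of arithmetic. This is a sketch that uses the numbers from the example above; the constant names are assumptions, and decimal units (1 GB = 10^9 bytes) are used throughout.

```python
DAILY_ACTIVE_USERS = 300_000_000
TWEETS_PER_USER_PER_DAY = 2
BYTES_PER_TWEET = 400      # ~280 bytes of text + ~100 bytes of metadata, rounded up
REPLICATION_FACTOR = 3

GB = 10**9
TB = 10**12

tweets_per_day = DAILY_ACTIVE_USERS * TWEETS_PER_USER_PER_DAY
daily_bytes = tweets_per_day * BYTES_PER_TWEET
yearly_bytes = daily_bytes * 365
replicated_yearly_bytes = yearly_bytes * REPLICATION_FACTOR

print(f"Tweets per day:      {tweets_per_day / 1e6:.0f}M")
print(f"Daily storage:       {daily_bytes / GB:.0f} GB")
print(f"Yearly storage:      {yearly_bytes / TB:.1f} TB")
print(f"With 3x replication: {replicated_yearly_bytes / TB:.1f} TB")
```

Keeping the inputs as named constants makes it easy to redo the estimate when the interviewer changes an assumption, such as doubling the user count.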
Key Trade-offs to Know
| Trade-off | Description |
|---|---|
| Consistency vs Availability | CAP theorem: partitions are unavoidable in a distributed system, so during a partition you must choose between consistency and availability |
| Read vs Write optimization | Denormalize for reads, normalize for writes |
| Latency vs Throughput | Batch for throughput, stream for latency |
| Synchronous vs Asynchronous | Sync is simpler but blocks the caller; async is more complex but scales better |
| SQL vs NoSQL | SQL for complex queries, NoSQL for horizontal scale |
Week 3-4: Core Design Patterns
Load Balancer Pattern
Client -> DNS -> Load Balancer -> Web Servers
                                    -> Cache (Redis)
                                    -> Database (Primary/Replica)
| Layer | Technology | Purpose |
|---|---|---|
| DNS | Route53, CloudDNS | Geographic routing |
| L4 LB | HAProxy, NLBs | TCP load balancing |
| L7 LB | ALB, Envoy, Nginx | HTTP routing, SSL termination |
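Whichever layer does the balancing, it needs a rule for picking the next server. The sketch below contrasts two common rules, round-robin and least connections. The class names and server names are hypothetical and do not reflect the API of any load balancer listed above.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Pick the server with the fewest in-flight requests."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a request finishes.
        self.active[server] -= 1


servers = ["web-1", "web-2", "web-3"]  # hypothetical server names

rr = RoundRobinBalancer(servers)
print([rr.pick() for _ in range(4)])  # rotation wraps back to web-1

lc = LeastConnectionsBalancer(servers)
first = lc.pick()    # web-1 (all tied)
second = lc.pick()   # web-2
lc.release(first)    # web-1 finishes early
third = lc.pick()    # web-1 again: it now has the fewest connections
print(first, second, third)
```

Round-robin is stateless and works well when requests cost about the same; least connections adapts better when some requests are much slower than others.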
Database Replication
flowchart LR
A[Application] -->|writes| B[Primary DB]
B -->|async replication| C[Replica 1]
B -->|async replication| D[Replica 2]
B -->|async replication| E[Replica 3]
A -->|reads| C
A -->|reads| D
A -->|reads| E
style B fill:#4a9,color:#fff
style C fill:#49a,color:#fff
style D fill:#49a,color:#fff
style E fill:#49a,color:#fff
Expected behavior: Writes go to the primary. Reads are distributed across replicas. If the primary fails, one replica is promoted. Read capacity grows roughly in proportion to replica count, but because replication is asynchronous, a replica can briefly serve stale data.
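The routing rule described above can be sketched in a few lines. `ReplicatedDatabaseRouter` and the node names are hypothetical; a real failover would also pick the most up-to-date replica and re-point the remaining replicas, which this sketch skips.

```python
import itertools


class ReplicatedDatabaseRouter:
    """Route writes to the primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self._next_replica = itertools.cycle(self.replicas)

    def route(self, operation):
        if operation == "write":
            return self.primary
        if not self.replicas:
            return self.primary  # no replicas left: the primary serves reads too
        return next(self._next_replica)

    def promote_replica(self):
        """Simulate failover by promoting the first replica to primary."""
        if not self.replicas:
            raise RuntimeError("no replica available to promote")
        self.primary = self.replicas.pop(0)
        self._next_replica = itertools.cycle(self.replicas)
        return self.primary


router = ReplicatedDatabaseRouter("primary", ["replica-1", "replica-2", "replica-3"])
print(router.route("write"))                     # primary
print([router.route("read") for _ in range(4)])  # cycles through the replicas
router.promote_replica()                         # the primary has failed
print(router.route("write"))                     # replica-1 now takes writes
```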
Caching Strategies
# Cache-aside pattern (lazy loading)
import json

import redis

cache = redis.Redis(host="localhost", port=6379)

# NOTE: `database` is a placeholder for your data-access layer. It is assumed
# to return a JSON-serializable dict, or None when the user does not exist.

def get_user_profile(user_id):
    """Fetch user profile with cache-aside pattern."""
    # Try cache first
    cached = cache.get(f"user:{user_id}")
    if cached is not None:
        return json.loads(cached)

    # Cache miss: query database
    profile = database.query(
        "SELECT * FROM users WHERE id = %s",
        (user_id,)
    )
    if profile:
        # Store in cache with TTL
        cache.setex(
            f"user:{user_id}",
            3600,  # 1 hour TTL
            json.dumps(profile)
        )
    return profile
Expected behavior: On first request, cache misses and the database is queried. Subsequent requests hit the cache. After 1 hour, the cache entry expires and the next request refreshes it.
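To see that behavior without a Redis server, here is a self-contained sketch that swaps in an in-memory stand-in. `TTLCache`, `FAKE_DB`, and `update_user_name` are hypothetical names introduced for illustration. It also shows one step the read path does not cover: invalidating the cache entry when the underlying row changes.

```python
import time


class TTLCache:
    """Tiny in-memory cache with per-key expiry (stand-in for Redis)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._store[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key):
        self._store.pop(key, None)


FAKE_DB = {42: {"id": 42, "name": "Ada"}}  # hypothetical users table
db_reads = 0
cache = TTLCache()


def get_user_profile(user_id):
    global db_reads
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    db_reads += 1  # cache miss: go to the database
    profile = FAKE_DB.get(user_id)
    if profile is not None:
        cache.setex(key, 3600, profile)
    return profile


def update_user_name(user_id, name):
    FAKE_DB[user_id] = {"id": user_id, "name": name}
    cache.delete(f"user:{user_id}")  # invalidate so readers do not see stale data


get_user_profile(42)
get_user_profile(42)
print("DB reads after two lookups:", db_reads)  # 1: the second lookup hit the cache
update_user_name(42, "Grace")
print(get_user_profile(42)["name"], db_reads)   # Grace 2
```

Deleting the key on write, rather than updating it, keeps the write path simple: the next read repopulates the cache from the database.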
Rate Limiting
import time
from collections import defaultdict


class SlidingWindowRateLimiter:
    """Rate limiter using a sliding window log (one timestamp per request)."""

    def __init__(self, max_requests=100, window_seconds=60):
        self.max_requests = max_requests
        self.window = window_seconds
        self.requests = defaultdict(list)

    def allow_request(self, client_id):
        now = time.time()
        window_start = now - self.window
        # Remove expired entries
        self.requests[client_id] = [
            t for t in self.requests[client_id]
            if t > window_start
        ]
        # Check limit
        if len(self.requests[client_id]) >= self.max_requests:
            return False
        self.requests[client_id].append(now)
        return True


limiter = SlidingWindowRateLimiter(max_requests=5, window_seconds=10)
for i in range(8):
    allowed = limiter.allow_request("user-42")
    print(f"Request {i+1}: {'Allowed' if allowed else 'Rate limited'}")
    time.sleep(1)
Expected output:
Request 1: Allowed
Request 2: Allowed
Request 3: Allowed
Request 4: Allowed
Request 5: Allowed
Request 6: Rate limited
Request 7: Rate limited
Request 8: Rate limited
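The limiter above stores one timestamp per accepted request, so its memory use grows with the request rate. A common lower-memory alternative is the sliding window counter, which keeps only two counters per client and estimates the rolling total. The sketch below is one possible version of that technique, not a drop-in replacement; the injectable `clock` parameter is an assumption added so the demo runs without sleeping.

```python
import time


class SlidingWindowCounter:
    """Approximate a sliding window with two fixed-window counters per client."""

    def __init__(self, max_requests=100, window_seconds=60, clock=time.time):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        self.state = {}  # client_id -> (window_index, current_count, previous_count)

    def allow_request(self, client_id):
        now = self.clock()
        index = int(now // self.window)
        saved_index, current, previous = self.state.get(client_id, (index, 0, 0))

        if index == saved_index + 1:
            previous, current = current, 0  # rolled into the next window
        elif index > saved_index + 1:
            previous, current = 0, 0        # idle for more than a full window

        # Weight the previous window by how much of it still overlaps
        # the sliding window that ends at `now`.
        elapsed_fraction = (now % self.window) / self.window
        estimated = previous * (1 - elapsed_fraction) + current

        allowed = estimated < self.max_requests
        if allowed:
            current += 1
        self.state[client_id] = (index, current, previous)
        return allowed


fake_now = [0.0]  # fake clock so the demo is instant and repeatable
limiter = SlidingWindowCounter(max_requests=5, window_seconds=10, clock=lambda: fake_now[0])
results = []
for second in range(8):
    fake_now[0] = float(second)
    results.append(limiter.allow_request("user-42"))
print(results)  # five True values, then three False
```

The estimate assumes requests in the previous window were evenly spread, so it can be slightly off at window boundaries. That inaccuracy is the price of constant memory per client.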
Week 5-6: Study Real Systems
System Design Practice Problems
| Problem | Key Concepts | Sample Question |
|---|---|---|
| URL Shortener | Hashing, Base62, key generation | "Design TinyURL" |
| Chat System | WebSockets, presence, persistence | "Design WhatsApp" |
| News Feed | Fan-out, pull vs push, ranking | "Design Facebook Feed" |
| Video Streaming | Chunking, CDN, adaptive bitrate | "Design YouTube" |
| Ride Sharing | Geospatial indexing, matching | "Design Uber" |
| E-commerce | Inventory, cart, order processing | "Design Amazon" |
How to Approach a Design Problem
Step 1: Clarify Requirements (5 min)
- Functional: What should the system do?
- Non-functional: Scale, latency, availability, consistency
- Constraints: Budget, timeline, team size
Step 2: High-Level Design (10 min)
- Draw the main components
- Show data flow
- Identify the core API endpoints
Step 3: Deep Dive (15 min)
- Data model (schema, storage choice)
- Key component details
- Bottlenecks and trade-offs
Step 4: Scale (10 min)
- Where do you add more servers?
- How do you handle 10x traffic?
- What breaks first?
Week 7: Mock Interviews
Practice with real system design prompts under timed conditions.
Mock Prompt: Design a real-time leaderboard for a gaming platform
Requirements:
- 10M daily active users
- Scores update in real-time during matches
- Leaderboard shows top 100 players globally
- Also need friends leaderboard
- Latency: < 500ms for leaderboard view
Expected approach:
1. Use Redis sorted sets for real-time ranking
2. ZADD for score updates (O(log N) per update)
3. ZREVRANGE for top 100 (O(log N + M), where M is the number of entries returned)
4. Shard by game region for write scalability, then merge each shard's top 100 to build the global view
5. Read replicas for friends leaderboard queries
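The ranking logic in this approach can be prototyped without a Redis server. The class below is a pure-Python stand-in for a sorted set, with hypothetical method and player names. It mirrors the semantics of ZADD and ZREVRANGE but not their performance, because it scans every score on each query.

```python
import heapq


class InMemoryLeaderboard:
    """Pure-Python stand-in for a Redis sorted set (illustration only)."""

    def __init__(self):
        self.scores = {}  # player_id -> score

    def add_score(self, player_id, score):
        # Plays the role of ZADD: set or overwrite a player's score.
        self.scores[player_id] = score

    def top(self, n):
        # Plays the role of ZREVRANGE 0 n-1 WITHSCORES.
        # This scans all players; Redis answers in O(log N + n).
        return heapq.nlargest(n, self.scores.items(), key=lambda item: item[1])

    def friends_top(self, friend_ids, n):
        # Friends leaderboard: rank only the given players.
        friends = [(p, self.scores[p]) for p in friend_ids if p in self.scores]
        return heapq.nlargest(n, friends, key=lambda item: item[1])


board = InMemoryLeaderboard()
for player, score in [("ana", 1200), ("bo", 950), ("cy", 1430), ("dee", 700)]:
    board.add_score(player, score)
board.add_score("bo", 1500)  # score update during a match

print(board.top(3))                           # [('bo', 1500), ('cy', 1430), ('ana', 1200)]
print(board.friends_top(["ana", "dee"], 10))  # [('ana', 1200), ('dee', 700)]
```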
Common System Design Interview Mistakes
1. Jumping to a Solution Without Requirements
Starting with "we need a load balancer" before clarifying the problem's constraints leads to over-engineering or wrong architecture. Always spend the first 5 minutes on requirements.
2. Ignoring Non-Functional Requirements
Designing for 100 users when the question asks for 100 million means your solution misses the point. Confirm numbers: DAU, write volume, read volume, latency requirements.
3. Not Discussing Trade-offs
Every design choice has trade-offs. Not acknowledging them shows shallow understanding. When you choose a technology, explain what you are giving up: "I chose Cassandra over Postgres because write scalability matters more here than complex queries."
4. Going Too Deep Too Early
Explaining sharding internals before drawing the high-level architecture wastes time. Start top-down: first show the box diagram, then dive into details.
5. Forgetting About Failure Modes
Systems fail. Database replicas lag. Caches evict. Networks partition. An architecture that assumes everything works perfectly is not a production design. Discuss: what happens when Redis goes down? How do you handle a database failover?
6. Not Using the Whiteboard Effectively
A messy, unlabeled diagram confuses the interviewer. Use clear labels, consistent symbols, and show data flow with arrows. Practice drawing clean diagrams.
7. Ignoring the "So What" Question
After explaining your design, the interviewer will ask follow-ups to test depth. Common follow-ups: "How do you handle 10x more traffic?", "What if this component fails?", "How do you reduce latency by 50%?"
Practice Questions
1. What is the CAP theorem and how does it apply to system design? CAP theorem states a distributed system can guarantee at most two of: Consistency (every read gets the latest write), Availability (every request gets a response), and Partition Tolerance (system works despite network failures). In practice, you choose between CP (Consistency + Partition) and AP (Availability + Partition).
2. What is the difference between vertical and horizontal scaling? Vertical scaling adds resources (CPU, RAM) to a single machine. Horizontal scaling adds more machines. Vertical has an upper limit and creates a single point of failure. Horizontal is theoretically infinite but requires handling distributed state.
3. When would you choose a NoSQL database over a SQL database? Choose NoSQL for: horizontal scalability, high write throughput, flexible schema, and simple query patterns. Choose SQL for: complex joins, transactions, strict consistency, and well-defined schema.
4. What is the cache-aside pattern and when should you use it? Cache-aside loads data into cache on demand (lazy loading). On read: check cache, on miss query DB and populate cache. Simple to implement, handles cache failures gracefully (system still works without cache). Best for read-heavy workloads with tolerable staleness.
5. Challenge: Design a distributed rate limiter that can handle 1 million requests per second across multiple data centers. Consider accuracy, latency, and fault tolerance.
Mini Project: Design a URL Shortener
Design a complete URL shortening service similar to TinyURL. Your design should include:
- API design for creating and resolving short URLs
- Data model (relational or NoSQL)
- Hash function and collision strategy (Base62 encoding with unique ID)
- Cache layer for frequently accessed URLs
- Database sharding strategy for scaling to billions of URLs
- Analytics tracking for click counts and referrer data
- A written explanation of all trade-offs made
# Conceptual implementation of the core logic
import itertools
import string

BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase

# Placeholder ID source so the example runs locally. In production, use a
# distributed unique ID service (Snowflake, etc.).
_id_counter = itertools.count(1_000_000)


def get_next_id():
    """Return the next unique integer (single-process stand-in)."""
    return next(_id_counter)


def encode_base62(num):
    """Convert a number to Base62 string."""
    if num == 0:
        return BASE62[0]
    result = []
    while num > 0:
        result.append(BASE62[num % 62])
        num //= 62
    return "".join(reversed(result))


def generate_short_key(url, user_id):
    """Generate a short key using a unique ID (from a distributed ID generator)."""
    # url and user_id do not influence the key; they would be stored alongside it.
    unique_id = get_next_id()
    return encode_base62(unique_id)


# Example
short_key = generate_short_key("https://example.com/long-url", "user_42")
print(f"Short URL: https://short.ly/{short_key}")
Expected behavior: Every URL gets a unique short key. The key is 7 characters or fewer for billions of URLs. The resolver looks up the key in a distributed cache (Redis), falls back to the database, increments the click counter asynchronously, and redirects with a 302.
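The resolve path described above can be sketched as follows. Plain dictionaries and a list stand in for the database, Redis, and the asynchronous click queue; `URL_TABLE`, `url_cache`, `pending_clicks`, and `resolve` are hypothetical names. The block also checks the claim about 7-character keys by computing the size of the key space.

```python
import string

BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase


def decode_base62(key):
    """Convert a Base62 string back to its integer ID."""
    num = 0
    for char in key:
        num = num * 62 + BASE62.index(char)
    return num


# 7 characters cover 62**7 distinct keys (about 3.5 trillion).
print(f"7-character key space: {62**7:,}")

URL_TABLE = {"1C": "https://example.com/long-url"}  # hypothetical database rows
url_cache = {}       # stand-in for Redis
pending_clicks = []  # stand-in for an asynchronous click-event queue


def resolve(short_key):
    """Return (status_code, location) for a short key."""
    long_url = url_cache.get(short_key)
    if long_url is None:
        long_url = URL_TABLE.get(short_key)  # cache miss: fall back to the database
        if long_url is None:
            return 404, None
        url_cache[short_key] = long_url
    pending_clicks.append(short_key)  # counted later by a background worker
    return 302, long_url


print(resolve("1C"))        # (302, 'https://example.com/long-url')
print(resolve("nope"))      # (404, None)
print(decode_base62("1C"))  # 100
```

A 302 keeps every click flowing through the service, which the analytics requirement depends on. A 301 would let browsers cache the redirect and skip the counter.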
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.