
System Design Interview Prep Roadmap — Complete 8-Week Study Plan

DodaTech Updated 2026-06-22 10 min read

In this tutorial, you'll learn about the System Design Interview Prep Roadmap. We cover key concepts, practical examples, and best practices.

A complete 8-week system design interview roadmap covering fundamentals, distributed systems patterns, real-world architecture studies, whiteboard practice, and mock interview preparation.

What You'll Learn

You will learn a structured approach to designing large-scale systems, master the common patterns used in system design interviews, understand trade-offs between architectures, and gain confidence through mock interview practice with real-world scenarios.

Why It Matters

System design interviews are the hardest part of the technical interview process for senior engineering roles. They test your ability to think at scale, make architectural trade-offs, communicate clearly, and handle ambiguity. Unlike coding interviews, there is no single correct answer -- interviewers evaluate your process, reasoning, and depth of knowledge.

Real-World Use

The DodaZIP and Doda Browser engineering teams use the same design patterns covered in this roadmap. When Doda Browser needed to build a real-time sync feature across 10 million devices, the engineering lead sketched a solution using the same methodology taught here: start with requirements, estimate scale, design the data model, then iterate on the architecture.

Your Learning Path

flowchart LR
  A[Week 1-2: Fundamentals] --> B[Week 3-4: Core Patterns]
  B --> C[Week 5-6: Real Systems]
  C --> D[Week 7: Mock Practice]
  D --> E[Week 8: Final Review]
  A --> F{You Are Here}
  style F fill:#f90,color:#fff
â„šī¸ Info

Prerequisites: 2+ years of software engineering experience. Familiarity with databases, APIs, caching, and basic networking concepts. This roadmap assumes you can already code at a senior engineer level.

Week 1-2: Fundamentals

Scalability Concepts

Concept | Definition | Key Question
Vertical scaling | Add more power to a single machine | When is this still viable?
Horizontal scaling | Add more machines | How do you handle state?
Load balancing | Distribute traffic across servers | Which algorithm? Round-robin, least connections?
Caching | Store frequently accessed data in fast storage | What is the eviction policy?
CDN | Distribute static assets geographically | What should NOT go on a CDN?
Database sharding | Split data across multiple databases | What is the shard key?
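
To make the shard key question concrete, here is a minimal sketch of hash-based routing. It assumes a fixed shard count and hash-modulo placement; the name `shard_for` and the value of `NUM_SHARDS` are illustrative and do not come from any particular database.

```python
import hashlib

NUM_SHARDS = 8  # assumed shard count, for illustration only


def shard_for(shard_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key (for example a user ID) to a shard index.

    A stable hash (SHA-256) is used instead of Python's built-in hash(),
    which is randomized per process for strings.
    """
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# The same key always routes to the same shard.
print(shard_for("user:42") == shard_for("user:42"))  # True
```

With hash-modulo placement, changing the shard count remaps most keys. That is why interview answers often move on to consistent hashing once resharding comes up.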

Estimation Skills

Example: Estimate Twitter timeline storage

Daily active users: 300M
Tweets per active user per day: 2
Total tweets per day: 600M
Average tweet size: 280 bytes (text) + ~100 bytes metadata ≈ 400 bytes (rounded up)
Daily storage: 600M * 400 bytes = 240 GB/day
Yearly storage: 240 GB * 365 ≈ 87 TB/year
With 3x replication: ~262 TB/year

This tells us we need distributed storage from day 1.

Expected behavior: Running these estimations before designing gives you concrete constraints. If the numbers are too large for a single database, you know you need sharding or partitioning.
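
The same arithmetic can be kept in a short script so the inputs are easy to change during practice. This is a sketch using the numbers from the example above; the write-rate line is an added derived figure, and decimal units (1 GB = 10^9 bytes) are assumed.

```python
DAU = 300_000_000
TWEETS_PER_USER_PER_DAY = 2
BYTES_PER_TWEET = 400        # 280 text + ~100 metadata, rounded up
REPLICATION_FACTOR = 3

GB = 10**9                   # decimal units assumed
TB = 10**12
SECONDS_PER_DAY = 86_400

tweets_per_day = DAU * TWEETS_PER_USER_PER_DAY
daily_bytes = tweets_per_day * BYTES_PER_TWEET
yearly_bytes = daily_bytes * 365
replicated_bytes = yearly_bytes * REPLICATION_FACTOR
avg_writes_per_second = tweets_per_day / SECONDS_PER_DAY

print(f"Tweets/day:       {tweets_per_day / 1e6:.0f}M")       # 600M
print(f"Daily storage:    {daily_bytes / GB:.0f} GB")         # 240 GB
print(f"Yearly storage:   {yearly_bytes / TB:.1f} TB")        # 87.6 TB
print(f"With replication: {replicated_bytes / TB:.1f} TB")    # 262.8 TB
print(f"Average writes/s: {avg_writes_per_second:.0f}")       # 6944
```

The average write rate is a useful second constraint: peak traffic is typically a multiple of the average, so the design has to absorb more than this figure.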

Key Trade-offs to Know

Trade-off | Description
Consistency vs Availability | CAP theorem: network partitions cannot be ruled out in a distributed system, so during a partition you choose between consistency and availability
Read vs Write optimization | Denormalize for reads, normalize for writes
Latency vs Throughput | Batch for throughput, stream for latency
Synchronous vs Asynchronous | Sync is simpler but blocks; async is more complex but scales better
SQL vs NoSQL | SQL for complex queries and transactions, NoSQL for horizontal scale

Week 3-4: Core Design Patterns

Load Balancer Pattern

Client -> DNS -> Load Balancer -> Web Servers
                               -> Cache (Redis)
                               -> Database (Primary/Replica)

Layer | Technology | Purpose
DNS | Route 53, Cloud DNS | Geographic routing
L4 LB | HAProxy, NLB | TCP load balancing
L7 LB | ALB, Envoy, Nginx | HTTP routing, SSL termination
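
The two selection algorithms usually named for this layer, round-robin and least connections, can be sketched in a few lines. The class names below are hypothetical and the servers are plain strings; real load balancers add health checks, weights, and connection draining.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation, ignoring current load."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Pick the server with the fewest in-flight connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties resolve to the server that was listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


rr = RoundRobinBalancer(["web-1", "web-2", "web-3"])
print([rr.pick() for _ in range(4)])  # ['web-1', 'web-2', 'web-3', 'web-1']
```

Round-robin works well when requests cost about the same. Least connections adapts better when some requests are long-lived, such as WebSocket sessions.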

Database Replication

flowchart LR
  A[Application] -->|writes| B[Primary DB]
  B -->|async replication| C[Replica 1]
  B -->|async replication| D[Replica 2]
  B -->|async replication| E[Replica 3]
  A -->|reads| C
  A -->|reads| D
  A -->|reads| E

  style B fill:#4a9,color:#fff
  style C fill:#49a,color:#fff
  style D fill:#49a,color:#fff
  style E fill:#49a,color:#fff

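Expected behavior: Writes go to the primary. Reads are distributed across replicas, so read throughput grows roughly in proportion to replica count. Because replication is asynchronous, a replica can lag behind the primary and serve stale reads. If the primary fails, one replica is promoted, and any writes that had not yet been replicated may be lost.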
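
The routing in the diagram can be sketched as a small class. This is a simplified illustration, not a real driver: `ReplicatedRouter` is a hypothetical name, connections are represented by strings, and any statement that starts with SELECT is treated as a read (which ignores cases such as `SELECT ... FOR UPDATE`).

```python
import itertools


class ReplicatedRouter:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self._next_replica = itertools.cycle(self.replicas)

    def route(self, sql: str) -> str:
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self.replicas:
            return next(self._next_replica)
        # Writes, and reads when no replica is available, go to the primary.
        return self.primary

    def promote(self, replica):
        """Fail over: the chosen replica becomes the new primary."""
        self.replicas.remove(replica)
        self.primary = replica
        self._next_replica = itertools.cycle(self.replicas)


router = ReplicatedRouter("primary", ["replica-1", "replica-2", "replica-3"])
print(router.route("INSERT INTO users VALUES (1)"))  # primary
print(router.route("SELECT * FROM users"))           # replica-1
```

A common refinement is to route a user's reads to the primary for a short time after that user writes, so they do not see their own change disappear because of replica lag.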

Caching Strategies

# Cache-aside pattern (lazy loading)
import json

import redis

cache = redis.Redis(host="localhost", port=6379)

# Note: `database` is a placeholder for your database client; it is not
# defined in this snippet. Its result must be JSON-serializable (e.g. a dict).

def get_user_profile(user_id):
    """Fetch user profile with cache-aside pattern."""

    # Try cache first (Redis returns None on a miss)
    cached = cache.get(f"user:{user_id}")
    if cached is not None:
        return json.loads(cached)

    # Cache miss: query database
    profile = database.query(
        "SELECT * FROM users WHERE id = %s",
        (user_id,)
    )

    if profile:
        # Store in cache with TTL
        cache.setex(
            f"user:{user_id}",
            3600,  # 1 hour TTL
            json.dumps(profile)
        )

    return profile

Expected behavior: On first request, cache misses and the database is queried. Subsequent requests hit the cache. After 1 hour, the cache entry expires and the next request refreshes it.

Rate Limiting

import time
from collections import defaultdict

class SlidingWindowRateLimiter:
    """Rate limiter using a sliding window log (one timestamp per request)."""

    def __init__(self, max_requests=100, window_seconds=60):
        self.max_requests = max_requests
        self.window = window_seconds
        self.requests = defaultdict(list)

    def allow_request(self, client_id):
        now = time.monotonic()  # monotonic clock: unaffected by system clock changes
        window_start = now - self.window

        # Remove expired entries
        self.requests[client_id] = [
            t for t in self.requests[client_id]
            if t > window_start
        ]

        # Check limit
        if len(self.requests[client_id]) >= self.max_requests:
            return False

        self.requests[client_id].append(now)
        return True

limiter = SlidingWindowRateLimiter(max_requests=5, window_seconds=10)

for i in range(8):
    allowed = limiter.allow_request("user-42")
    print(f"Request {i+1}: {'Allowed' if allowed else 'Rate limited'}")
    time.sleep(1)

Expected output:

Request 1: Allowed
Request 2: Allowed
Request 3: Allowed
Request 4: Allowed
Request 5: Allowed
Request 6: Rate limited
Request 7: Rate limited
Request 8: Rate limited
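
The limiter above stores one timestamp per request, which is exact but uses memory proportional to the request rate. A common alternative is the sliding window counter, which keeps only two counters per client and estimates the rolling count. The sketch below is one way to write it; the class name and the injectable `clock` parameter (used here to avoid sleeping) are assumptions for illustration.

```python
import time


class SlidingWindowCounter:
    """Approximate sliding window using two fixed-window counters per client."""

    def __init__(self, max_requests=100, window_seconds=60, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        self.state = {}  # client_id -> (window_index, current_count, previous_count)

    def allow_request(self, client_id):
        now = self.clock()
        index = int(now // self.window)
        saved_index, current, previous = self.state.get(client_id, (index, 0, 0))

        if index == saved_index + 1:    # moved into the next fixed window
            previous, current = current, 0
        elif index > saved_index + 1:   # idle for more than one full window
            previous, current = 0, 0

        # Weight the previous window by how much of it still overlaps
        # the sliding window that ends now.
        elapsed = (now % self.window) / self.window
        estimated = previous * (1 - elapsed) + current

        allowed = estimated < self.max_requests
        if allowed:
            current += 1
        self.state[client_id] = (index, current, previous)
        return allowed


fake_now = [0.0]  # simulated clock so the example runs instantly
limiter = SlidingWindowCounter(max_requests=5, window_seconds=10,
                               clock=lambda: fake_now[0])
results = []
for second in range(8):
    fake_now[0] = float(second)
    results.append(limiter.allow_request("user-42"))
print(results)
# [True, True, True, True, True, False, False, False]
```

The estimate assumes requests in the previous window were evenly spread, so it can be slightly off near window boundaries. In exchange, memory per client stays constant.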

Week 5-6: Study Real Systems

System Design Practice Problems

Problem | Key Concepts | Sample Question
URL Shortener | Hashing, Base62, key generation | "Design TinyURL"
Chat System | WebSockets, presence, persistence | "Design WhatsApp"
News Feed | Fan-out, pull vs push, ranking | "Design Facebook Feed"
Video Streaming | Chunking, CDN, adaptive bitrate | "Design YouTube"
Ride Sharing | Geospatial indexing, matching | "Design Uber"
E-commerce | Inventory, cart, order processing | "Design Amazon"

How to Approach a Design Problem

Step 1: Clarify Requirements (5 min)
  - Functional: What should the system do?
  - Non-functional: Scale, latency, availability, consistency
  - Constraints: Budget, timeline, team size

Step 2: High-Level Design (10 min)
  - Draw the main components
  - Show data flow
  - Identify the core API endpoints

Step 3: Deep Dive (15 min)
  - Data model (schema, storage choice)
  - Key component details
  - Bottlenecks and trade-offs

Step 4: Scale (10 min)
  - Where do you add more servers?
  - How do you handle 10x traffic?
  - What breaks first?

Week 7: Mock Interviews

Practice with real system design prompts under timed conditions.

Mock Prompt: Design a real-time leaderboard for a gaming platform

Requirements:
- 10M daily active users
- Scores update in real-time during matches
- Leaderboard shows top 100 players globally
- Also need friends leaderboard
- Latency: < 500ms for leaderboard view

Expected approach:
1. Use Redis sorted sets for real-time ranking
2. ZADD for score updates (O(log N) per update)
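3. ZREVRANGE for top 100 (O(log N + M), where M is the number of entries returned; since Redis 6.2 the equivalent is ZRANGE with the REV option)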
4. Shard by game region for write scalability
5. Read replicas for friends leaderboard queries
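
To practice the approach without a Redis server, the sorted-set operations can be mimicked in memory. This is a stand-in for illustration only: `InMemoryLeaderboard` is a hypothetical class, and its queries scan all scores, whereas a real sorted set answers a top-M query in O(log N + M).

```python
import heapq


class InMemoryLeaderboard:
    """In-memory stand-in for a Redis sorted set; not a substitute for one."""

    def __init__(self):
        self.scores = {}  # player_id -> score

    def set_score(self, player_id, score):
        # Mirrors ZADD: insert the member or overwrite its score.
        self.scores[player_id] = score

    def top(self, n=100):
        # Mirrors a reverse range query: highest scores first.
        return heapq.nlargest(n, self.scores.items(), key=lambda item: item[1])

    def friends_top(self, friend_ids, n=100):
        # Friends leaderboard: rank only the given players, skipping unknown IDs.
        known = ((f, self.scores[f]) for f in friend_ids if f in self.scores)
        return heapq.nlargest(n, known, key=lambda item: item[1])


board = InMemoryLeaderboard()
board.set_score("alice", 1200)
board.set_score("bob", 900)
board.set_score("carol", 1500)
board.set_score("bob", 1300)  # score update during a match

print(board.top(2))                          # [('carol', 1500), ('bob', 1300)]
print(board.friends_top(["alice", "bob"]))   # [('bob', 1300), ('alice', 1200)]
```

In an interview, the friends leaderboard is a good place to discuss trade-offs: fetching each friend's score individually is simple, while maintaining a sorted set per user trades write amplification for faster reads.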

Common System Design Interview Mistakes

1. Jumping to a Solution Without Requirements

Starting with "we need a load balancer" before clarifying the problem's constraints leads to over-engineering or wrong architecture. Always spend the first 5 minutes on requirements.

2. Ignoring Non-Functional Requirements

Designing for 100 users when the question asks for 100 million means your solution misses the point. Confirm numbers: DAU, write volume, read volume, latency requirements.

3. Not Discussing Trade-offs

Every design choice has trade-offs. Not acknowledging them shows shallow understanding. When you choose a technology, explain what you are giving up: "I chose Cassandra over Postgres because write scalability matters more here than complex queries."

4. Going Too Deep Too Early

Explaining sharding internals before drawing the high-level architecture wastes time. Start top-down: first show the box diagram, then dive into details.

5. Forgetting About Failure Modes

Systems fail. Database replicas lag. Caches evict. Networks partition. An architecture that assumes everything works perfectly is not a production design. Discuss: what happens when Redis goes down? How do you handle a database failover?

6. Not Using the Whiteboard Effectively

A messy, unlabeled diagram confuses the interviewer. Use clear labels, consistent symbols, and show data flow with arrows. Practice drawing clean diagrams.

7. Ignoring the "So What" Question

After explaining your design, the interviewer will ask follow-ups to test depth. Common follow-ups: "How do you handle 10x more traffic?", "What if this component fails?", "How do you reduce latency by 50%?"

Practice Questions

1. What is the CAP theorem and how does it apply to system design?
The CAP theorem states that a distributed system can guarantee at most two of: Consistency (every read gets the latest write), Availability (every request gets a response), and Partition Tolerance (the system keeps working despite network failures between nodes). Because partitions cannot be ruled out, the practical choice during a partition is between CP (Consistency + Partition Tolerance) and AP (Availability + Partition Tolerance).

2. What is the difference between vertical and horizontal scaling?
Vertical scaling adds resources (CPU, RAM) to a single machine. Horizontal scaling adds more machines. Vertical scaling has a hardware ceiling and leaves a single point of failure. Horizontal scaling has no hard ceiling, but it requires handling distributed state.

3. When would you choose a NoSQL database over a SQL database?
Choose NoSQL for: horizontal scalability, high write throughput, flexible schema, and simple query patterns. Choose SQL for: complex joins, transactions, strict consistency, and well-defined schema.

4. What is the cache-aside pattern and when should you use it?
Cache-aside loads data into the cache on demand (lazy loading). On a read, check the cache; on a miss, query the database and populate the cache. It is simple to implement and degrades gracefully when the cache fails, because the system can still read from the database. It is best for read-heavy workloads that can tolerate some staleness.

5. Challenge: Design a distributed rate limiter that can handle 1 million requests per second across multiple data centers. Consider accuracy, latency, and fault tolerance.

Mini Project: Design a URL Shortener

Design a complete URL shortening service similar to TinyURL. Your design should include:
- API design for creating and resolving short URLs
- Data model (relational or NoSQL)
- Hash function and collision strategy (Base62 encoding with unique ID)
- Cache layer for frequently accessed URLs
- Database sharding strategy for scaling to billions of URLs
- Analytics tracking for click counts and referrer data
- A written explanation of all trade-offs made

# Conceptual implementation of the core logic
import itertools
import string

BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase

# Placeholder ID source so this example runs in a single process.
# In production, use a distributed unique ID service (Snowflake, etc.).
_id_counter = itertools.count(1)

def get_next_id():
    """Return the next unique integer (single-process stand-in)."""
    return next(_id_counter)

def encode_base62(num):
    """Convert a non-negative integer to a Base62 string."""
    if num == 0:
        return BASE62[0]

    result = []
    while num > 0:
        result.append(BASE62[num % 62])
        num //= 62
    return "".join(reversed(result))

def generate_short_key(url, user_id):
    """Generate a short key from a unique ID.

    url and user_id do not affect the key; they would be stored alongside it.
    """
    unique_id = get_next_id()
    return encode_base62(unique_id)

# Example
short_key = generate_short_key("https://example.com/long-url", "user_42")
print(f"Short URL: https://short.ly/{short_key}")

Expected behavior: Every URL gets a unique short key. With dense sequential IDs, the key stays at 7 characters or fewer up to 62^7 ≈ 3.5 trillion URLs; 64-bit Snowflake-style IDs encode to as many as 11 characters. The resolver looks up the key in a distributed cache (Redis), falls back to the database, increments the click counter asynchronously, and redirects with a 302.
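
The resolver flow described above can be sketched with in-memory stand-ins for the cache, the database, and the analytics pipeline. All names here (`url_store`, `url_cache`, `click_counts`, `resolve`) are hypothetical, and the click counter is incremented inline only to keep the example short.

```python
from collections import Counter

url_store = {"abc123": "https://example.com/long-url"}  # stand-in for the database
url_cache = {}                                          # stand-in for Redis
click_counts = Counter()                                # stand-in for async analytics


def resolve(short_key):
    """Return (status_code, headers) for a short-key lookup."""
    long_url = url_cache.get(short_key)
    if long_url is None:
        # Cache miss: fall back to the database.
        long_url = url_store.get(short_key)
        if long_url is None:
            return 404, {}
        url_cache[short_key] = long_url

    # A real service would enqueue this event and count it off the request path.
    click_counts[short_key] += 1
    return 302, {"Location": long_url}


print(resolve("abc123"))   # (302, {'Location': 'https://example.com/long-url'})
print(resolve("missing"))  # (404, {})
```

A 302 keeps each click flowing through the service, which the analytics requirement depends on. A 301 can be cached by browsers, which lowers load but hides repeat visits.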

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
