Event Sourcing — State as a Sequence of Events (2026)

DodaTech Updated 2026-06-20 7 min read

In this tutorial, you'll learn event sourcing: storing every state change as an immutable event rather than only the current state. Why does this matter? Traditional databases overwrite the past, so you lose the history of how state evolved. Event sourcing preserves every change, giving you a complete audit trail, temporal querying, and the ability to reconstruct past states. Real-world uses include financial systems (every transaction is an event), version control (Git's commit history behaves much like an event log), and systems built on event stores such as EventStoreDB and Axon.

What Is Event Sourcing?

Event sourcing stores state as a sequence of events. Instead of saving the current balance of a bank account ($500), you save every transaction — AccountOpened, Deposited(200), Withdrew(50), Deposited(350). The current balance is derived by replaying those events ($0 + 200 - 50 + 350 = $500).
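The replay arithmetic above can be sketched as a fold over the event list. This is a minimal illustration, assuming events are plain `(type, amount)` tuples; the `apply` and `replay` helpers are names chosen for this example only.

```python
from functools import reduce

# Each event is a (type, amount) pair; the names mirror the example above.
events = [
    ("AccountOpened", 0),
    ("Deposited", 200),
    ("Withdrew", 50),
    ("Deposited", 350),
]


def apply(balance: int, event: tuple[str, int]) -> int:
    """Return the balance after applying one event."""
    kind, amount = event
    if kind == "Deposited":
        return balance + amount
    if kind == "Withdrew":
        return balance - amount
    return balance  # AccountOpened leaves the starting balance of zero unchanged


def replay(history: list[tuple[str, int]]) -> int:
    """Fold the event history into a balance, starting from zero."""
    return reduce(apply, history, 0)


current_balance = replay(events)          # all four events
balance_after_three = replay(events[:3])  # state before the last deposit
```

Replaying a prefix of the list is all it takes to see an earlier state, which is the basis for the temporal queries discussed later.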

The event store is an append-only log. Events are never updated or deleted — only appended. This is the foundation for audit, debugging, and time travel.

graph LR
    subgraph AppendOnlyLog[Event Store — Append-Only Log]
        E1[AccountOpened]
        E2[Deposited $200]
        E3[Withdrew $50]
        E4[Deposited $350]
    end
    
    subgraph CurrentState[Current State]
        B[Balance: $500]
    end
    
    subgraph TemporalQuery[Temporal Query]
        Q["State at time t₃<br/>Balance: $150"]
    end
    
    E1 --> E2 --> E3 --> E4
    E4 --> B
    E3 --> Q
    
    style AppendOnlyLog fill:#4A90D9,color:#fff
    style CurrentState fill:#2ECC71,color:#fff
    style TemporalQuery fill:#E67E22,color:#fff

How Event Sourcing Works

The Event Store

Events are stored sequentially with a version number. Each event represents a fact that happened:

from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4


def _utc_now() -> datetime:
    # datetime.utcnow() is deprecated since Python 3.12; use an aware timestamp
    return datetime.now(timezone.utc)


@dataclass
class Event:
    """Base record: one fact about one aggregate."""
    event_id: str = field(default_factory=lambda: str(uuid4()))
    aggregate_id: str = ""
    event_type: str = ""
    data: dict = field(default_factory=dict)
    version: int = 0  # position in the aggregate's stream
    timestamp: datetime = field(default_factory=_utc_now)


# The subclasses define their own __init__, so they are plain classes rather
# than dataclasses; they still inherit Event's fields, repr, and equality.
class AccountOpened(Event):
    def __init__(self, account_id: str, owner: str, initial_deposit: float):
        super().__init__(
            aggregate_id=account_id,
            event_type="AccountOpened",
            data={"owner": owner, "initial_deposit": initial_deposit}
        )


class Deposited(Event):
    def __init__(self, account_id: str, amount: float):
        super().__init__(
            aggregate_id=account_id,
            event_type="Deposited",
            data={"amount": amount}
        )


class Withdrew(Event):
    def __init__(self, account_id: str, amount: float):
        super().__init__(
            aggregate_id=account_id,
            event_type="Withdrew",
            data={"amount": amount}
        )
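The event classes describe what gets stored; the store itself can be sketched as one append-only list per aggregate. The version below is an in-memory illustration, not a production design. `InMemoryEventStore`, `StoredEvent`, `ConcurrencyError`, and the `expected_version` check are all assumptions introduced for this example, and versions are assumed to start at 1.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredEvent:
    aggregate_id: str
    event_type: str
    data: dict
    version: int  # position within the aggregate's stream, starting at 1


class ConcurrencyError(Exception):
    """Raised when the stream has moved past the version the writer expected."""


class InMemoryEventStore:
    def __init__(self) -> None:
        self._streams: dict[str, list[StoredEvent]] = defaultdict(list)

    def append(self, aggregate_id: str, event_type: str, data: dict,
               expected_version: int) -> StoredEvent:
        stream = self._streams[aggregate_id]
        # Reject the write if another writer appended first
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, stream is at {len(stream)}"
            )
        event = StoredEvent(aggregate_id, event_type, dict(data), len(stream) + 1)
        stream.append(event)  # the only write path: no update, no delete
        return event

    def get_events(self, aggregate_id: str, from_version: int = 0) -> list[StoredEvent]:
        # Build a new list so callers cannot reorder or truncate the log
        return [e for e in self._streams.get(aggregate_id, [])
                if e.version >= from_version]


store = InMemoryEventStore()
store.append("acc-1", "AccountOpened", {"owner": "Ada"}, expected_version=0)
store.append("acc-1", "Deposited", {"amount": 200.0}, expected_version=1)
```

Passing the version the writer last saw is a common way to detect two writers racing on the same aggregate: the second append fails instead of silently interleaving.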

Rebuilding State

The current state is derived by replaying events for an aggregate:

class BankAccount:
    def __init__(self):
        self._id = ""
        self._balance = 0.0
        self._owner = ""
        self._version = 0

    @classmethod
    def from_events(cls, events: list[Event]) -> "BankAccount":
        # Rebuild state by applying every event in order
        account = cls()
        for event in events:
            account.apply(event)
        return account

    def apply(self, event: Event) -> None:
        if isinstance(event, AccountOpened):
            self._id = event.aggregate_id
            self._owner = event.data["owner"]
            self._balance = event.data["initial_deposit"]
        elif isinstance(event, Deposited):
            self._balance += event.data["amount"]
        elif isinstance(event, Withdrew):
            self._balance -= event.data["amount"]
        self._version = event.version

    # Commands validate against current state and return a new event;
    # state changes only when that event is applied.
    def deposit(self, amount: float) -> Deposited:
        if amount <= 0:
            raise ValueError("Deposit amount must be positive")
        return Deposited(account_id=self._id, amount=amount)

    def withdraw(self, amount: float) -> Withdrew:
        if amount <= 0:
            raise ValueError("Withdrawal amount must be positive")
        if amount > self._balance:
            raise ValueError("Insufficient funds")
        return Withdrew(account_id=self._id, amount=amount)
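The aggregate's command methods return events without changing state, so something has to load, decide, append, and apply. A compact sketch of that round trip follows. The `Account` class is a simplified stand-in for `BankAccount` that uses dict events, and `handle_withdraw` is a hypothetical helper name.

```python
class Account:
    """Compact stand-in for the BankAccount aggregate, using dict events."""

    def __init__(self, account_id: str) -> None:
        self.account_id = account_id
        self.balance = 0.0
        self.version = 0

    def apply(self, event: dict) -> None:
        if event["type"] == "Deposited":
            self.balance += event["amount"]
        elif event["type"] == "Withdrew":
            self.balance -= event["amount"]
        self.version += 1

    def withdraw(self, amount: float) -> dict:
        # Decide: check the business rules against current state
        if amount <= 0:
            raise ValueError("Withdrawal amount must be positive")
        if amount > self.balance:
            raise ValueError("Insufficient funds")
        return {"type": "Withdrew", "account_id": self.account_id, "amount": amount}


def handle_withdraw(log: list[dict], account_id: str, amount: float) -> Account:
    """Load, decide, append, apply: one command-handling round trip."""
    account = Account(account_id)
    for event in log:                    # load: replay the history
        account.apply(event)
    new_event = account.withdraw(amount) # decide: raises before anything is stored
    log.append(new_event)                # append: record the new fact
    account.apply(new_event)             # apply: bring in-memory state up to date
    return account
```

Because validation happens before the append, a rejected command leaves the log exactly as it was.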

Snapshots

Replaying thousands of events to load state is slow. Snapshots capture state at a point in time so you can start from the snapshot and replay only events since then:

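import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    version: int  # version of the last event folded into the state
    state: BankAccount


class SnapshotStore:
    def __init__(self, event_store: "EventStore", threshold: int = 100):
        self._event_store = event_store  # must provide async get_events()
        self._threshold = threshold  # Snapshot every N events
        self._snapshots: dict[str, Snapshot] = {}  # in-memory stand-in for durable storage

    async def load_snapshot(self, aggregate_id: str) -> Optional[Snapshot]:
        return self._snapshots.get(aggregate_id)

    async def save_snapshot(self, aggregate_id: str, account: BankAccount) -> None:
        self._snapshots[aggregate_id] = Snapshot(account._version, copy.deepcopy(account))

    async def load_aggregate(self, aggregate_id: str) -> BankAccount:
        snapshot = await self.load_snapshot(aggregate_id)

        if snapshot:
            from_version = snapshot.version + 1
            account = copy.deepcopy(snapshot.state)  # keep the stored snapshot untouched
        else:
            from_version = 0
            account = BankAccount()

        # Fetch only the events recorded after the snapshot (assumed to be a list)
        events = await self._event_store.get_events(
            aggregate_id, from_version=from_version
        )

        for event in events:
            account.apply(event)

        # Take a new snapshot once enough events have piled up since the last one
        if len(events) >= self._threshold:
            await self.save_snapshot(aggregate_id, account)

        return account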
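A snapshot should change only how much is replayed, never the result. The sketch below checks that property on a simplified model in which each event is just a signed amount. `Snapshot` and `load_balance` are illustrative names for this example, not part of the classes above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    version: int    # number of events already folded into balance
    balance: float


def load_balance(events: list[float], snapshot: Optional[Snapshot]) -> tuple[float, int]:
    """Return (balance, events_replayed), starting from the snapshot if present."""
    start = snapshot.version if snapshot else 0
    balance = snapshot.balance if snapshot else 0.0
    tail = events[start:]  # only the events recorded after the snapshot
    for amount in tail:
        balance += amount
    return balance, len(tail)


events = [1.0] * 250                              # 250 deposits of 1.0
full_replay = load_balance(events, None)          # replays everything
snap = Snapshot(version=200, balance=200.0)       # state after the first 200 events
from_snapshot = load_balance(events, snap)        # replays only the last 50
```

Both calls arrive at the same balance; the snapshot path simply touches fewer events.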

Event Versioning

Events evolve over time. A UserRegistered event might start with 3 fields and grow to 10. Handle versioning with:

Upcasting — transform old event formats to new ones during replay
Backward compatibility — new consumers handle both old and new event formats

import copy


class EventUpcaster:
    def upcast(self, event: dict) -> dict:
        # Work on a copy so the stored event is never mutated
        event = copy.deepcopy(event)
        if event["event_type"] == "UserRegistered":
            # Old format: no "phone" field
            event["data"].setdefault("phone", None)
        return event
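When a schema changes more than once, upcasters can be chained so that each one handles a single version step. The sketch below assumes every stored event carries a `schema_version` field (defaulting to 1), which the code above does not show. The second step, renaming `name` to `full_name`, is a hypothetical change invented for illustration.

```python
import copy


def user_registered_v1_to_v2(data: dict) -> dict:
    data.setdefault("phone", None)  # v2 added an optional phone field
    return data


def user_registered_v2_to_v3(data: dict) -> dict:
    data["full_name"] = data.pop("name")  # hypothetical v3 rename
    return data


# Maps (event_type, from_version) to the function that lifts it one version
UPCASTERS = {
    ("UserRegistered", 1): user_registered_v1_to_v2,
    ("UserRegistered", 2): user_registered_v2_to_v3,
}


def upcast(stored: dict) -> dict:
    """Return a copy of the event lifted to the newest known schema."""
    event = copy.deepcopy(stored)  # the stored record stays untouched
    version = event.get("schema_version", 1)
    while (event["event_type"], version) in UPCASTERS:
        step = UPCASTERS[(event["event_type"], version)]
        event["data"] = step(event["data"])
        version += 1
    event["schema_version"] = version
    return event


old_event = {"event_type": "UserRegistered", "schema_version": 1, "data": {"name": "Ada"}}
new_event = upcast(old_event)
```

Each new schema version then needs only one new function, and events of any age pass through the same loop.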

CQRS + Event Sourcing

CQRS (Command Query Responsibility Segregation) and event sourcing are natural partners. Commands produce events, and projections consume those events to build read models:

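class OrderProjection:
    def __init__(self, read_db):
        # read_db: any client exposing insert(table, row), injected by the caller
        self._read_db = read_db

    def on_order_placed(self, event: "OrderPlaced") -> None:
        # Copy the fields the read model needs into a query-friendly row
        self._read_db.insert("order_views", {
            "order_id": event.aggregate_id,
            "customer_name": event.data["customer_name"],
            "total": event.data["total"],
            "version": event.version
        })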
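Because a read model is derived data, it can be discarded and rebuilt from the event history at any time. The following sketch shows that with an in-memory dict standing in for the read database. `OrderViewProjection`, `rebuild`, and the `OrderShipped` and `OrderCancelled` event types are assumptions made for this example.

```python
class OrderViewProjection:
    """Builds a query-friendly table of orders from the event stream."""

    def __init__(self) -> None:
        self.rows: dict[str, dict] = {}  # stands in for a read database table

    def handle(self, event: dict) -> None:
        # Dispatch on event type; types without a handler are ignored
        handler = getattr(self, "on_" + event["type"], None)
        if handler is not None:
            handler(event)

    def on_OrderPlaced(self, event: dict) -> None:
        self.rows[event["order_id"]] = {
            "customer_name": event["customer_name"],
            "total": event["total"],
            "status": "placed",
        }

    def on_OrderShipped(self, event: dict) -> None:
        self.rows[event["order_id"]]["status"] = "shipped"


def rebuild(events: list[dict]) -> OrderViewProjection:
    """Recreate the read model from scratch by replaying the full history."""
    projection = OrderViewProjection()
    for event in events:
        projection.handle(event)
    return projection


history = [
    {"type": "OrderPlaced", "order_id": "o-1", "customer_name": "Ada", "total": 40.0},
    {"type": "OrderPlaced", "order_id": "o-2", "customer_name": "Grace", "total": 15.5},
    {"type": "OrderShipped", "order_id": "o-1"},
    {"type": "OrderCancelled", "order_id": "o-2"},  # no handler here, so it is skipped
]
view = rebuild(history)
```

Adding a new read model later means writing a new projection class and replaying the same history through it.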

Pros and Cons

Pros	Cons
Complete audit trail: every state change is recorded forever	Complexity: replay, snapshots, and versioning add overhead
Temporal queries: reconstruct state at any point in time	Storage growth: the event store grows indefinitely
Debugging: replay events to reproduce bugs	Event schema evolution: versioning old events is tricky
Integration: events feed analytics, ML, and other systems naturally	Query performance: replaying events for current state is slow without snapshots
CQRS compatibility: events are a natural source for read projections	Learning curve: thinking in events is different from thinking in state
	Deletion challenges: "right to be forgotten" requests conflict with an append-only log

Real-World Examples

Banking and Financial Systems

Every transaction is an event — deposits, withdrawals, transfers, interest calculations. Event sourcing provides a complete audit trail required by regulation. A bank can replay all events for an account to verify the balance.

Git

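Git resembles an event-sourced system. The commit history is an append-only record of changes, and the working tree is the current state. You can check out any commit to see the state at that point, and branching forks the history. The analogy is loose: each commit stores a snapshot of the whole tree rather than a delta to replay.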

E-Commerce Order Systems

Orders go through many states (pending, paid, shipped, delivered, returned). Each state change is an event. Event sourcing enables accurate order tracking, customer notifications, and analytics on order flow.

When to Use Event Sourcing

Use event sourcing when:

Audit compliance — regulations require a full history of changes
Temporal queries — you need to answer "what was the state last Tuesday?"
Complex event-driven workflows — multiple systems react to state changes
Debugging and replay — you need to replay production scenarios

Skip event sourcing for simple CRUD applications, for systems that combine very high write volume with a tight storage budget, or when the team lacks experience with event-driven patterns.

FAQ

What is the difference between event sourcing and event-driven architecture?

Event-Driven Architecture is about communication between services via events. Event sourcing is about persistence — storing state as events. They complement each other: event sourcing produces the events that drive an event-driven architecture.

How do you update events that were stored with the wrong schema?

Use upcasting — transform old events to the current schema during replay. Never mutate stored events. The upcast function reads the old format and returns events in the new format. Store the upcasted events in a new stream or cache.

Does event sourcing require a specific database?

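Any append-only store works: PostgreSQL (event store tables), Kafka (with retention configured to keep the full log; log compaction discards older records per key, so it does not suit event streams), or purpose-built databases like EventStoreDB and AxonDB. PostgreSQL is a common starting point because teams already know it.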
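As a rough picture of what an "event store table" can look like in a relational database, the sketch below uses SQLite from Python's standard library as a stand-in for PostgreSQL. The table layout and the `append_event` and `load_events` helpers are assumptions for illustration. The key idea is the unique constraint on `(aggregate_id, version)`, which rejects two writers claiming the same position in a stream.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        id           INTEGER PRIMARY KEY AUTOINCREMENT,
        aggregate_id TEXT    NOT NULL,
        version      INTEGER NOT NULL,
        event_type   TEXT    NOT NULL,
        data         TEXT    NOT NULL,  -- JSON payload
        UNIQUE (aggregate_id, version)
    )
    """
)


def append_event(aggregate_id: str, version: int, event_type: str, data: dict) -> bool:
    """Insert one event; return False if that stream version is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO events (aggregate_id, version, event_type, data) "
                "VALUES (?, ?, ?, ?)",
                (aggregate_id, version, event_type, json.dumps(data)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def load_events(aggregate_id: str) -> list[tuple[str, dict]]:
    """Return the aggregate's events in stream order."""
    rows = conn.execute(
        "SELECT event_type, data FROM events WHERE aggregate_id = ? ORDER BY version",
        (aggregate_id,),
    )
    return [(event_type, json.loads(data)) for event_type, data in rows]
```

The application only ever issues INSERT and SELECT against this table; keeping UPDATE and DELETE out of the code path is what makes it an append-only log in practice.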

How do you handle soft deletes and GDPR?

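Model deletion as events (like AccountClosed, DataPurged) rather than removing events from the log. GDPR's "right to be forgotten" is harder: a compensating event that marks fields as redacted hides them from new reads, but the personal data still sits in the earlier events. Common workarounds are to encrypt personal fields with a per-user key and delete the key on request (crypto-shredding), or to keep personal data in a separate store that events reference by ID. Either way, the event stream itself remains intact.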
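A sketch of one erasure-friendly layout, under the assumption that personal data lives in a separate mutable store and events carry only an opaque reference to it. `PersonalDataStore`, `subject_id`, and `owner_name` are hypothetical names for this example.

```python
from typing import Optional

ERASED = "[erased]"


class PersonalDataStore:
    """Mutable side store for personal fields, keyed by an opaque subject id."""

    def __init__(self) -> None:
        self._records: dict[str, dict] = {}

    def put(self, subject_id: str, fields: dict) -> None:
        self._records[subject_id] = dict(fields)

    def get(self, subject_id: str) -> Optional[dict]:
        return self._records.get(subject_id)

    def forget(self, subject_id: str) -> None:
        # Erasure happens here; the event log is never touched
        self._records.pop(subject_id, None)


# Events reference the subject instead of embedding a name or email
events = [
    {"type": "AccountOpened", "account_id": "acc-1", "subject_id": "subj-42"},
    {"type": "Deposited", "account_id": "acc-1", "amount": 200},
]


def owner_name(pii: PersonalDataStore, event: dict) -> str:
    """Resolve the owner's name at read time, or a placeholder once erased."""
    record = pii.get(event["subject_id"])
    return record["name"] if record else ERASED


pii = PersonalDataStore()
pii.put("subj-42", {"name": "Ada Lovelace"})
name_before = owner_name(pii, events[0])
pii.forget("subj-42")
name_after = owner_name(pii, events[0])
```

After `forget`, the events still replay and the balance is still derivable, but nothing in the log identifies the person.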

How do snapshots work in practice?

After every N events (typically 100-1000), serialize the aggregate state and store it alongside the event stream. On load, fetch the latest snapshot and replay only subsequent events. This keeps load time bounded by N rather than by the total event count.

CQRS Pattern — commands produce events, projections consume them
Event-Driven Architecture — events as the communication backbone
Saga Pattern — compensation events in sagas
Repository Pattern — event stores as repositories
Microservices Architecture — services communicate via events

Practice Questions

How does event sourcing differ from storing current state in a traditional database?
What is a snapshot, and why is it necessary for performance?
How do you handle schema changes when old events used a different data format?
Why is event sourcing particularly well-suited for banking and financial applications?
What is a compensating event, and when would you use one?

Challenge

Implement a simple bank account using event sourcing. Define at least three event types. Implement the event store (use PostgreSQL or an in-memory list). Implement the aggregate that replays events to derive current balance. Add snapshot support. Test by replaying events to verify the final balance.

Real-World Task

Identify one aggregate in your current system that would benefit from a full history. Design the event types it would produce. Estimate the storage requirements if you used event sourcing (average events per aggregate × average bytes per event × number of aggregates). Implement a read-side projection that publishes these events to a message bus.
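The storage estimate can be worked through as a small calculation. The figures below are placeholders rather than measurements, and the sketch assumes you estimate an average serialized size per event in bytes; substitute numbers from your own system.

```python
def estimate_storage_bytes(aggregates: int, events_per_aggregate: int,
                           bytes_per_event: int) -> int:
    """Raw event payload size, before indexes, replication, or snapshots."""
    return aggregates * events_per_aggregate * bytes_per_event


# Placeholder inputs: 1 million aggregates, 50 events each, 400 bytes per event
estimate = estimate_storage_bytes(
    aggregates=1_000_000,
    events_per_aggregate=50,
    bytes_per_event=400,
)
estimate_gb = estimate / 1_000_000_000  # decimal gigabytes
```

With these inputs the raw payload comes to 20 GB; indexes and replicas would add to that.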
