
Saga Pattern — Distributed Transaction Management (2026)

DodaTech Updated 2026-06-20 7 min read

In this tutorial, you'll learn the Saga pattern: managing distributed transactions across microservices using sequences of local transactions, with compensating actions for rollback. Why does this matter? In a monolith, a single database transaction with ACID guarantees keeps data consistent. In microservices, no single database spans all services. Sagas provide eventual data consistency without distributed transactions, which tend to be slow, fragile, and often unsupported. Real-world use: multi-step workflows such as Amazon-style order processing, Uber-style ride lifecycles, and airline booking systems are commonly modeled as sagas.

What Is the Saga Pattern?

A Saga is a sequence of local transactions. Each transaction is a local ACID operation within a single service. When a transaction fails, the saga executes compensating transactions to undo the effect of previous steps. There is no two-phase commit (2PC) — sagas use eventual consistency with rollback capability.

Two implementation styles exist: choreography (services react to events) and orchestration (a central coordinator tells services what to do).

graph TB
    subgraph BookingSaga[Flight + Hotel + Payment Saga]
        subgraph SuccessFlow[Happy Path]
            F1[Book Flight ✅]
            H1[Book Hotel ✅]
            P1[Process Payment ✅]
            C1[Confirmation ✅]
        end
        
        subgraph FailureFlow[Compensation Path]
            P2[Process Payment ❌]
            H2[Cancel Hotel 🔄]
            F2[Cancel Flight 🔄]
        end
        
        F1 --> H1
        H1 --> P1
        P1 --> C1
        
        P1 -.->|Failure| P2
        P2 --> H2
        H2 --> F2
    end
    
    style SuccessFlow fill:#2ECC71,color:#fff
    style FailureFlow fill:#E74C3C,color:#fff

How Sagas Work

Choreography Saga

Each service emits events and listens for events from other services. No central coordinator:

# --- Choreography Saga: Order Processing ---

# Order Service
class OrderService:
    def create_order(self, user_id: str, items: list) -> None:
        order = Order.create(user_id, items)
        self.repo.save(order)
        self.event_bus.publish(OrderCreated(order.id, items))
    
    def on_payment_failed(self, event: PaymentFailed) -> None:
        # Compensating action: cancel the order
        order = self.repo.find_by_id(event.order_id)
        order.cancel(reason="Payment failed")
        self.repo.save(order)
        self.event_bus.publish(OrderCancelled(order.id))

# Inventory Service
class InventoryService:
    def on_order_created(self, event: OrderCreated) -> None:
        try:
            self.reserve_items(event.items)
            self.event_bus.publish(InventoryReserved(event.order_id))
        except InsufficientStockError:
            self.event_bus.publish(InventoryReservationFailed(event.order_id))

# Payment Service
class PaymentService:
    def on_inventory_reserved(self, event: InventoryReserved) -> None:
        try:
            self.charge(event.order_id)
            self.event_bus.publish(PaymentProcessed(event.order_id))
        except PaymentError:
            self.event_bus.publish(PaymentFailed(event.order_id))
            # Inventory Service listens and releases reservation
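
To see the event chain above run end to end, here is a minimal in-process sketch. The `EventBus` class, the dict-based events, and the stock and balance numbers are illustrative assumptions, not a real broker API; a production system would use a durable message broker instead. It also wires up the inventory release on `PaymentFailed` that the comment above mentions.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """Hypothetical in-memory bus; stands in for a real message broker."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)
        self.log: list[str] = []  # names of published events, in order

    def subscribe(self, name: str, handler: Callable[[dict], None]) -> None:
        self._handlers[name].append(handler)

    def publish(self, name: str, payload: dict) -> None:
        self.log.append(name)
        for handler in list(self._handlers[name]):
            handler(payload)


def run_order(stock: int, balance: int, qty: int, price: int) -> tuple[EventBus, dict]:
    bus = EventBus()
    state = {"stock": stock, "balance": balance, "order": "pending"}

    def reserve(evt: dict) -> None:  # Inventory Service: forward step
        if state["stock"] >= evt["qty"]:
            state["stock"] -= evt["qty"]
            bus.publish("InventoryReserved", evt)
        else:
            bus.publish("InventoryReservationFailed", evt)

    def charge(evt: dict) -> None:  # Payment Service: forward step
        if state["balance"] >= evt["total"]:
            state["balance"] -= evt["total"]
            bus.publish("PaymentProcessed", evt)
        else:
            bus.publish("PaymentFailed", evt)

    def release(evt: dict) -> None:  # Inventory Service: compensation
        state["stock"] += evt["qty"]

    def cancel(evt: dict) -> None:  # Order Service: compensation
        state["order"] = "cancelled"

    def confirm(evt: dict) -> None:  # Order Service: final step
        state["order"] = "confirmed"

    bus.subscribe("OrderCreated", reserve)
    bus.subscribe("InventoryReserved", charge)
    bus.subscribe("PaymentProcessed", confirm)
    bus.subscribe("PaymentFailed", release)
    bus.subscribe("PaymentFailed", cancel)
    bus.subscribe("InventoryReservationFailed", cancel)

    bus.publish("OrderCreated", {"order_id": "o-1", "qty": qty, "total": qty * price})
    return bus, state
```

Calling `run_order(stock=5, balance=10, qty=2, price=20)` triggers the failure path: the reservation is made, the charge fails, and the two `PaymentFailed` subscribers restore the stock and cancel the order. No component knows the whole workflow, which is the defining trait of choreography.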

Orchestration Saga

A central Saga Orchestrator tells each service what to do and handles failures:

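# --- Orchestration Saga ---
import logging
from dataclasses import dataclass
from typing import Any, Awaitable, Callable

logger = logging.getLogger(__name__)

class SagaFailedError(Exception):
    """Raised after a step fails and compensations have been attempted."""

@dataclass
class SagaStep:
    forward: Callable[[Any], Awaitable[Any]]
    compensate: Callable[[Any], Awaitable[None]]

class BookingSagaOrchestrator:
    def __init__(self) -> None:
        self._steps: list[SagaStep] = []
    
    def add_step(self, forward: Callable, compensate: Callable) -> None:
        self._steps.append(SagaStep(forward, compensate))
    
    # SagaContext is assumed to be defined elsewhere and to expose add_result()
    async def execute(self, context: "SagaContext") -> None:
        executed_steps: list[SagaStep] = []
        
        for step in self._steps:
            try:
                result = await step.forward(context)
                context.add_result(result)
                executed_steps.append(step)
            except Exception as e:
                # Compensate completed steps in reverse order
                for executed in reversed(executed_steps):
                    try:
                        await executed.compensate(context)
                    except Exception:
                        # Keep compensating the remaining steps; flag this one for follow-up
                        logger.exception("Compensation failed; manual intervention required")
                raise SagaFailedError(str(e)) from e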

# Usage (assumes the service methods are async, so each lambda returns an awaitable)
saga = BookingSagaOrchestrator()
saga.add_step(
    forward=lambda ctx: flight_service.book(ctx.flight_id),
    compensate=lambda ctx: flight_service.cancel(ctx.flight_id)
)
saga.add_step(
    forward=lambda ctx: hotel_service.book(ctx.hotel_id),
    compensate=lambda ctx: hotel_service.cancel(ctx.hotel_id)
)
saga.add_step(
    forward=lambda ctx: payment_service.charge(ctx.user_id, ctx.total),
    compensate=lambda ctx: payment_service.refund(ctx.user_id, ctx.total)
)

await saga.execute(context)  # run inside an async function, e.g. via asyncio.run(main())

Compensating Transactions

Every step must have a compensating action that semantically undoes it:

Step | Compensating Action
--- | ---
Reserve inventory | Release inventory
Charge payment | Refund payment
Book flight | Cancel flight
Send email | None (cannot be undone once sent; log it)

Compensations are business rollbacks, not database rollbacks. Charge → Refund is a separate transaction, not a ROLLBACK. The refund itself must succeed — sagas assume compensations eventually succeed.
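
The difference between a business rollback and a database rollback can be made concrete with an append-only ledger. In this sketch, `Ledger`, `charge`, `refund`, and `balance` are hypothetical names chosen for illustration: the refund adds a new entry rather than deleting the charge, so both actions remain visible in the history.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Hypothetical append-only payment ledger (amounts in cents)."""

    entries: list[tuple[str, str, int]] = field(default_factory=list)

    def charge(self, order_id: str, cents: int) -> None:
        self.entries.append(("charge", order_id, cents))

    def refund(self, order_id: str) -> None:
        # Compensation: a new, separate transaction that offsets the charge.
        charged = sum(c for kind, oid, c in self.entries
                      if kind == "charge" and oid == order_id)
        refunded = sum(c for kind, oid, c in self.entries
                       if kind == "refund" and oid == order_id)
        if charged > refunded:  # nothing left to refund on a repeated call
            self.entries.append(("refund", order_id, charged - refunded))

    def balance(self, order_id: str) -> int:
        sign = {"charge": 1, "refund": -1}
        return sum(sign[kind] * c for kind, oid, c in self.entries if oid == order_id)
```

After a charge followed by a refund, the net balance is zero but the ledger holds two entries. A database `ROLLBACK` would have left no trace of either.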

Idempotency

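Steps may be retried after a timeout or a redelivered message, so every saga step and every compensation must be idempotent: running it twice must have the same effect as running it once.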

class PaymentService:
    def charge(self, order_id: str, amount: float) -> None:
        if self.has_been_charged(order_id):
            return  # Idempotent — already processed
        self.db.execute("INSERT INTO payments ...")
    
    def refund(self, order_id: str) -> None:
        if self.has_been_refunded(order_id):
            return  # Idempotent
        self.db.execute("UPDATE payments SET status = 'refunded' ...")
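
The check-then-insert shape above can race when two retries arrive at the same time. One common alternative, sketched here with Python's built-in `sqlite3` module, is to let a uniqueness constraint enforce idempotency. The `payments` table layout and the `charge_once` helper are assumptions made for illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payments ("
        " order_id TEXT PRIMARY KEY,"  # at most one charge row per order
        " amount_cents INTEGER NOT NULL,"
        " status TEXT NOT NULL)"
    )
    return conn


def charge_once(conn: sqlite3.Connection, order_id: str, amount_cents: int) -> bool:
    """Return True if this call recorded the charge, False if it was a retry."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO payments (order_id, amount_cents, status)"
            " VALUES (?, ?, 'charged')",
            (order_id, amount_cents),
        )
    # rowcount is 0 when the primary key already existed and the insert was ignored
    return cur.rowcount == 1
```

Because the database rejects the duplicate key atomically, a second call for the same `order_id` becomes a no-op even if it overlaps with the first.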

Data Flow

sequenceDiagram
    participant O as Orchestrator
    participant F as Flight Service
    participant H as Hotel Service
    participant P as Payment Service
    
    O->>F: Book Flight
    F-->>O: Confirmed
    O->>H: Book Hotel
    H-->>O: Confirmed
    O->>P: Process Payment
    P-->>O: Failed ❌
    O->>H: Cancel Hotel (compensate)
    H-->>O: Done
    O->>F: Cancel Flight (compensate)
    F-->>O: Done
    O-->>Client: Saga Failed

Real-World Examples

Amazon Order Processing

An Amazon-style order flow can be modeled as a choreography saga: OrderCreated → InventoryReserved → PaymentCharged → OrderConfirmed. If payment fails, a compensation event releases the reserved inventory.

Uber Ride Lifecycle

An Uber-style ride can be modeled as the sequence DriverAssigned → RiderPicked → RideStarted → RideCompleted → PaymentProcessed. If the driver cancels, a compensation runs: the driver is unassigned and a new driver is found.

Airline Booking

Travel booking is a natural fit for an orchestration saga: Book Flight → Book Hotel → Book Car → Process Payment. If payment fails, the orchestrator cancels all earlier bookings using compensating transactions.

Pros and Cons

Pros | Cons
--- | ---
No distributed transaction: no 2PC, no locking across services | Eventual consistency: temporary inconsistency between steps
Resilience: failures handled via compensation, not rollback | Complex failure logic: compensations must be reliable
Scalability: each service scales independently | Debugging difficulty: tracing saga executions across services
Flexibility: choreography or orchestration styles | No isolation: intermediate states are visible (lack of ACID isolation)
Business alignment: compensations mirror real-world rollbacks | Idempotency required: every step must handle retries

When to Use Sagas

Use sagas when:

  • Multi-service transaction — a workflow spans multiple microservices
  • No 2PC available — services use different databases or databases don't support distributed transactions
  • Eventual consistency acceptable — temporary inconsistency (seconds to minutes) is tolerable
  • Business rollbacks exist — real-world processes have natural compensations (cancel booking, refund payment)

Skip sagas for single-service transactions (use ACID), for workflows requiring strict ACID isolation, or when a step with irreversible side effects (e.g., "email already sent") can neither be compensated nor moved to the end of the workflow.

FAQ

What is the difference between choreography and orchestration sagas?

Choreography uses events — services react to each other's events with no central coordinator. Orchestration uses a central orchestrator that commands each service. Choreography is simpler for small workflows; orchestration is better for complex workflows with many failure scenarios.

How do sagas handle the 'lost message' problem?

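Use a persistent message broker with at-least-once delivery (or exactly-once where the broker supports it), combined with idempotent consumers so that duplicates are harmless. A transactional outbox ensures that an event is published whenever the corresponding local transaction commits, and a durable log such as Kafka retains events until consumers have processed them. Saga orchestrators can also set a timeout on each step and retry or compensate when no reply arrives.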
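
As a rough illustration of the transactional outbox, the sketch below writes the business row and the outgoing event in one local transaction, then relays unsent events in a separate pass. The table names, `place_order`, and `relay` are hypothetical, and `publish` stands in for a real broker client.

```python
import json
import sqlite3
from typing import Callable


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
    conn.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " event TEXT NOT NULL,"
        " sent INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def place_order(conn: sqlite3.Connection, order_id: str) -> None:
    # The state change and the event commit together, so neither can be lost alone.
    event = json.dumps({"type": "OrderCreated", "order_id": order_id})
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, 'pending')", (order_id,))
        conn.execute("INSERT INTO outbox (event) VALUES (?)", (event,))


def relay(conn: sqlite3.Connection, publish: Callable[[dict], None]) -> int:
    """Publish unsent events and return how many were sent in this pass.

    A crash between publish() and the UPDATE causes a re-send on the next
    pass, which is why delivery is at-least-once and consumers must be idempotent.
    """
    rows = conn.execute(
        "SELECT id, event FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for row_id, event in rows:
        publish(json.loads(event))
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

In practice the relay would run on a schedule or be driven by change data capture; the polling loop is omitted here to keep the sketch short.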

Can a saga mix synchronous and asynchronous steps?

Yes. Some steps (such as payment) may complete asynchronously, with the result arriving later as a callback event, and the saga waits for that event. Orchestration sagas handle this naturally: the orchestrator awaits each response before proceeding to the next step.
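
One way to picture an orchestrator waiting on a callback is an `asyncio.Future` that the callback handler resolves. This is only a sketch: `PaymentGateway`, `on_callback`, the step names, and the simulated delay are invented for illustration.

```python
import asyncio


class PaymentGateway:
    """Hypothetical async payment step: request now, result arrives later."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    def request_charge(self, order_id: str) -> asyncio.Future:
        fut = asyncio.get_running_loop().create_future()
        self._pending[order_id] = fut
        return fut

    def on_callback(self, order_id: str, ok: bool) -> None:
        # Called when the provider's webhook or reply event arrives.
        self._pending.pop(order_id).set_result(ok)


async def run_saga(order_id: str, payment_ok: bool) -> list[str]:
    steps = ["inventory_reserved"]  # a synchronous step that already completed
    gateway = PaymentGateway()
    fut = gateway.request_charge(order_id)
    # Simulate the provider calling back a little later.
    asyncio.get_running_loop().call_later(
        0.01, gateway.on_callback, order_id, payment_ok
    )
    try:
        ok = await asyncio.wait_for(fut, timeout=1.0)
    except asyncio.TimeoutError:
        ok = False  # treat a missing reply as a failure and compensate
    steps.append("payment_processed" if ok else "inventory_released")
    return steps
```

The timeout matters as much as the await: without it, a lost callback would leave the saga waiting forever with inventory still reserved.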

What happens if a compensating transaction fails?

Sagas assume compensations eventually succeed. Retry with exponential backoff. If compensation repeatedly fails, the saga enters a "manual intervention required" state — log it, alert the operations team, and provide a manual tool to complete the compensation.
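
The retry-then-escalate policy described above might look like the following sketch. The function name `compensate_with_retry`, the returned status strings, and the default delays are assumptions; a real system would also persist the state and alert operators.

```python
import asyncio
from typing import Awaitable, Callable


async def compensate_with_retry(
    compensate: Callable[[], Awaitable[None]],
    max_attempts: int = 4,
    base_delay: float = 0.01,
) -> str:
    """Return "compensated", or "manual_intervention_required" after max_attempts."""
    for attempt in range(max_attempts):
        try:
            await compensate()
            return "compensated"
        except Exception:
            if attempt < max_attempts - 1:
                # Exponential backoff: base_delay * 1, 2, 4, ...
                await asyncio.sleep(base_delay * 2 ** attempt)
    # Out of retries: a real system would log the failure and alert operators here.
    return "manual_intervention_required"
```

Production code would usually add jitter to the delay and catch only errors known to be transient, so that a permanent failure escalates immediately instead of burning through retries.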

How do you test sagas?

Test each service's saga steps independently with unit tests. Use integration tests with a real message broker to verify event chains. Use contract tests to ensure event formats are compatible. For orchestration sagas, test the orchestrator with mock services and verify the compensation execution order.
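
A test of compensation order with mock services could be sketched as follows, using `unittest.mock.AsyncMock` from the standard library. The `run_saga` function is a tiny stand-in orchestrator written for this example, and the step names are hypothetical.

```python
import asyncio
from unittest.mock import AsyncMock


async def run_saga(steps) -> bool:
    """Stand-in orchestrator: steps is a list of (forward, compensate) pairs."""
    done = []
    for forward, compensate in steps:
        try:
            await forward()
            done.append(compensate)
        except Exception:
            for comp in reversed(done):  # undo completed steps, newest first
                await comp()
            return False
    return True


def check_compensation_order() -> list[str]:
    calls: list[str] = []

    def recorder(name: str, fail: bool = False) -> AsyncMock:
        def effect() -> None:
            calls.append(name)
            if fail:
                raise RuntimeError(f"{name} failed")
        return AsyncMock(side_effect=effect)

    steps = [
        (recorder("book_flight"), recorder("cancel_flight")),
        (recorder("book_hotel"), recorder("cancel_hotel")),
        (recorder("charge", fail=True), recorder("refund")),
    ]
    ok = asyncio.run(run_saga(steps))
    assert ok is False
    steps[2][1].assert_not_awaited()  # the failed step itself is not compensated
    return calls
```

The returned call list shows the forward steps in order, followed by the compensations in reverse, which is the property the test is meant to pin down.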

Practice Questions

  1. What is the fundamental difference between a saga and a distributed ACID transaction?

  2. Compare choreography and orchestration sagas — provide one advantage of each.

  3. What is a compensating transaction, and how does it differ from a database rollback?

  4. Why must every step in a saga be idempotent?

  5. What happens when a compensating transaction itself fails?

Challenge

Design a saga for an e-commerce checkout flow with these steps: validate cart, reserve inventory, charge payment, create shipment, send confirmation email. Implement both choreography and orchestration versions. Add compensation for each step that can fail.

Real-World Task

Identify a multi-step workflow in your application that currently runs in a single database transaction but spans multiple services. Convert it to a saga. Define the steps, compensations, and event contracts. Test the saga by simulating a failure in the middle step and verifying that all preceding steps are compensated correctly.
