Saga Pattern — Distributed Transaction Management (2026)
In this tutorial, you'll learn the Saga pattern: managing distributed transactions across microservices using a sequence of local transactions, each paired with a compensating action for rollback. Why does this matter? In a monolith, a single database transaction with ACID guarantees keeps data consistent. In microservices, no single database spans all services. Sagas provide data consistency without distributed transactions, which are slow, fragile, and often unsupported. Real-world use: multi-step workflows such as Amazon-style order processing, an Uber-style ride lifecycle, or airline booking are typical candidates for sagas.
What Is the Saga Pattern?
A Saga is a sequence of local transactions. Each transaction is a local ACID operation within a single service. When a transaction fails, the saga executes compensating transactions to undo the effect of previous steps. There is no two-phase commit (2PC) — sagas use eventual consistency with rollback capability.
Two implementation styles exist: choreography (services react to events) and orchestration (a central coordinator tells services what to do).
graph TB
subgraph BookingSaga[Flight + Hotel + Payment Saga]
subgraph SuccessFlow[Happy Path]
F1[Book Flight ✅]
H1[Book Hotel ✅]
P1[Process Payment ✅]
C1[Confirmation ✅]
end
subgraph FailureFlow[Compensation Path]
P2[Process Payment ❌]
H2[Cancel Hotel 🔄]
F2[Cancel Flight 🔄]
end
F1 --> H1
H1 --> P1
P1 --> C1
P1 -.->|Failure| P2
P2 --> H2
H2 --> F2
end
style SuccessFlow fill:#2ECC71,color:#fff
style FailureFlow fill:#E74C3C,color:#fff
How Sagas Work
Choreography Saga
Each service emits events and listens for events from other services. No central coordinator:
# --- Choreography Saga: Order Processing ---

# Order Service
class OrderService:
    def create_order(self, user_id: str, items: list) -> None:
        order = Order.create(user_id, items)
        self.repo.save(order)
        # Starts the saga: downstream services react to this event
        self.event_bus.publish(OrderCreated(order.id, items))

    def on_payment_failed(self, event: PaymentFailed) -> None:
        # Compensating action: cancel the order
        order = self.repo.find_by_id(event.order_id)
        order.cancel(reason="Payment failed")
        self.repo.save(order)
        self.event_bus.publish(OrderCancelled(order.id))


# Inventory Service
class InventoryService:
    def on_order_created(self, event: OrderCreated) -> None:
        try:
            self.reserve_items(event.items)
            self.event_bus.publish(InventoryReserved(event.order_id))
        except InsufficientStockError:
            self.event_bus.publish(InventoryReservationFailed(event.order_id))


# Payment Service
class PaymentService:
    def on_inventory_reserved(self, event: InventoryReserved) -> None:
        try:
            self.charge(event.order_id)
            self.event_bus.publish(PaymentProcessed(event.order_id))
        except PaymentError:
            self.event_bus.publish(PaymentFailed(event.order_id))
            # Inventory Service also listens for PaymentFailed
            # and releases the reservation
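The services above depend on an `event_bus` that the snippet leaves undefined. As a minimal sketch, assuming a synchronous in-process bus (a real deployment would more likely use a message broker), the wiring could look like the following. `InMemoryEventBus`, `subscribe`, `history`, and the simplified event classes are hypothetical names introduced here for illustration, and the payment step is hard-coded to fail so the compensation path runs.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class OrderCreated:
    order_id: str


@dataclass(frozen=True)
class InventoryReserved:
    order_id: str


@dataclass(frozen=True)
class PaymentFailed:
    order_id: str


class InMemoryEventBus:
    """Hypothetical synchronous bus: dispatches each event to handlers by type."""

    def __init__(self) -> None:
        self._handlers: dict[type, list[Callable]] = defaultdict(list)
        self.history: list[object] = []  # every published event, in order

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: object) -> None:
        self.history.append(event)
        for handler in list(self._handlers[type(event)]):
            handler(event)


bus = InMemoryEventBus()
released: list[str] = []  # order IDs whose reservation was released

# Inventory reacts to OrderCreated by reserving stock.
bus.subscribe(OrderCreated, lambda e: bus.publish(InventoryReserved(e.order_id)))
# Payment reacts to InventoryReserved; here it is simulated to always fail.
bus.subscribe(InventoryReserved, lambda e: bus.publish(PaymentFailed(e.order_id)))
# Compensation: Inventory releases the reservation on PaymentFailed.
bus.subscribe(PaymentFailed, lambda e: released.append(e.order_id))

bus.publish(OrderCreated("order-1"))
event_names = [type(e).__name__ for e in bus.history]
```

No component in this sketch knows the whole workflow: the saga emerges from the subscriptions, which is both the appeal of choreography and the reason it can be hard to trace.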
Orchestration Saga
A central Saga Orchestrator tells each service what to do and handles failures:
# --- Orchestration Saga ---
from dataclasses import dataclass
from typing import Any, Awaitable, Callable

# SagaContext and SagaFailedError are assumed to be defined elsewhere.


@dataclass
class SagaStep:
    # A forward action paired with the compensation that undoes it
    forward: Callable[..., Awaitable[Any]]
    compensate: Callable[..., Awaitable[Any]]


class BookingSagaOrchestrator:
    def __init__(self):
        self._steps: list[SagaStep] = []

    def add_step(self, forward: Callable, compensate: Callable) -> None:
        self._steps.append(SagaStep(forward, compensate))

    async def execute(self, context: SagaContext) -> None:
        executed_steps = []
        for step in self._steps:
            try:
                result = await step.forward(context)
                context.add_result(result)
                executed_steps.append(step)
            except Exception as e:
                # Compensate only the steps that completed, in reverse order
                for executed in reversed(executed_steps):
                    await executed.compensate(context)
                # Chain the original exception so the root cause is preserved
                raise SagaFailedError(str(e)) from e


# Usage (the service methods are assumed to be coroutine functions,
# so each lambda returns an awaitable)
saga = BookingSagaOrchestrator()
saga.add_step(
    forward=lambda ctx: flight_service.book(ctx.flight_id),
    compensate=lambda ctx: flight_service.cancel(ctx.flight_id),
)
saga.add_step(
    forward=lambda ctx: hotel_service.book(ctx.hotel_id),
    compensate=lambda ctx: hotel_service.cancel(ctx.hotel_id),
)
saga.add_step(
    forward=lambda ctx: payment_service.charge(ctx.user_id, ctx.total),
    compensate=lambda ctx: payment_service.refund(ctx.user_id, ctx.total),
)

# Inside an async function:
await saga.execute(context)
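To see the reverse-order compensation in action, here is a self-contained sketch that can be run as-is. It is a simplified stand-in for the orchestrator above, not the same class: `Step`, `run_saga`, `SagaFailed`, and the stub coroutines `ok` and `declined` are hypothetical names, and the real services are replaced by stubs so the payment step always fails.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class Step:
    name: str
    forward: Callable[[], Awaitable[None]]
    compensate: Callable[[], Awaitable[None]]


class SagaFailed(Exception):
    pass


async def run_saga(steps: list[Step], log: list[str]) -> None:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: list[Step] = []
    for step in steps:
        try:
            await step.forward()
            log.append(f"done:{step.name}")
            completed.append(step)
        except Exception as exc:
            log.append(f"failed:{step.name}")
            for done in reversed(completed):
                await done.compensate()
                log.append(f"compensated:{done.name}")
            raise SagaFailed(step.name) from exc


async def ok() -> None:
    """Stub for a service call that succeeds."""
    return None


async def declined() -> None:
    """Stub for a payment call that fails."""
    raise RuntimeError("card declined")


async def main() -> list[str]:
    log: list[str] = []
    steps = [
        Step("flight", ok, ok),
        Step("hotel", ok, ok),
        Step("payment", declined, ok),
    ]
    try:
        await run_saga(steps, log)
    except SagaFailed:
        log.append("saga_failed")
    return log


trace = asyncio.run(main())
```

The failed payment step is never compensated because it never completed; only the hotel and flight bookings are undone, hotel first.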
Compensating Transactions
Every step must have a compensating action that semantically undoes it:
| Step | Compensating Action |
|---|---|
| Reserve inventory | Release inventory |
| Charge payment | Refund payment |
| Book flight | Cancel flight |
| Send email | No true undo: the email is already sent. Log it and, if needed, send a follow-up notice. |
Compensations are business rollbacks, not database rollbacks. Charge → Refund is a separate transaction, not a ROLLBACK. The refund itself must succeed: sagas assume compensations eventually succeed, so a failed compensation is retried until it goes through or is escalated for manual handling.
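Because a compensation is expected to succeed eventually, a transient failure is usually handled by retrying rather than by giving up. The following is a minimal sketch of that idea, assuming exponential backoff and a fixed attempt limit. `run_compensation`, `CompensationExhausted`, and `flaky_refund` are hypothetical names, and the retry parameters are illustrative rather than recommended values.

```python
import time
from typing import Callable


class CompensationExhausted(Exception):
    """Raised when a compensation still fails after all retries."""


def run_compensation(
    compensate: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 0.01,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry a compensation with exponential backoff; return attempts used."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            compensate()
            return attempt
        except Exception as exc:
            if attempt == max_attempts:
                # Out of retries: surface the failure for manual intervention.
                raise CompensationExhausted(str(exc)) from exc
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# Hypothetical refund that times out twice and then succeeds.
calls = {"count": 0}


def flaky_refund() -> None:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("payment gateway timeout")


# The sleep function is swapped out so the example runs instantly.
attempts_used = run_compensation(flaky_refund, sleep=lambda _seconds: None)
```

Retrying like this is only safe if the compensation is idempotent, since an attempt that appeared to fail may have taken effect.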
Idempotency
Steps may be retried, so every step and every compensation must be idempotent: running it twice has the same effect as running it once.
class PaymentService:
    def charge(self, order_id: str, amount: float) -> None:
        if self.has_been_charged(order_id):
            return  # Idempotent: already processed
        self.db.execute("INSERT INTO payments ...")

    def refund(self, order_id: str) -> None:
        if self.has_been_refunded(order_id):
            return  # Idempotent
        self.db.execute("UPDATE payments SET status = 'refunded' ...")
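A check followed by a write, as in the snippet above, can still double-charge if two retries arrive at the same time. One way to close that gap is to let the database enforce uniqueness. The sketch below assumes an in-memory SQLite database, a hypothetical `payments` schema keyed by `order_id`, and amounts stored as integer cents; none of these come from the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE payments (
        order_id     TEXT PRIMARY KEY,   -- at most one payment row per order
        amount_cents INTEGER NOT NULL,
        status       TEXT NOT NULL
    )
    """
)


def charge(order_id: str, amount_cents: int) -> bool:
    """Record a charge once; return True only if this call created it."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO payments (order_id, amount_cents, status)"
            " VALUES (?, ?, 'charged')",
            (order_id, amount_cents),
        )
    return cur.rowcount == 1


def refund(order_id: str) -> bool:
    """Mark a charge refunded once; return True only if this call changed it."""
    with conn:
        cur = conn.execute(
            "UPDATE payments SET status = 'refunded'"
            " WHERE order_id = ? AND status = 'charged'",
            (order_id,),
        )
    return cur.rowcount == 1


first_charge = charge("order-1", 4999)
second_charge = charge("order-1", 4999)  # retry: ignored by the primary key
first_refund = refund("order-1")
second_refund = refund("order-1")        # retry: no row matches, so a no-op
row_count = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
```

Both functions report whether they did any work, so a caller can tell a fresh operation from a replay without a separate lookup.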
Data Flow
sequenceDiagram
participant O as Orchestrator
participant F as Flight Service
participant H as Hotel Service
participant P as Payment Service
O->>F: Book Flight
F-->>O: Confirmed
O->>H: Book Hotel
H-->>O: Confirmed
O->>P: Process Payment
P-->>O: Failed ❌
O->>H: Cancel Hotel (compensate)
H-->>O: Done
O->>F: Cancel Flight (compensate)
F-->>O: Done
O-->>Client: Saga Failed
Real-World Examples
Amazon Order Processing
An order flow like Amazon's can be modeled as a choreography saga: OrderCreated → InventoryReserved → PaymentCharged → OrderConfirmed. If payment fails, inventory is released via compensation events.
Uber Ride Lifecycle
A ride goes through: DriverAssigned → RiderPicked → RideStarted → RideCompleted → PaymentProcessed. If the driver cancels, a compensation (unassign driver, find new driver) runs.
Airline Booking
Airline and travel booking flows fit orchestration sagas well: Book Flight → Book Hotel → Book Car → Process Payment. If payment fails, all bookings are cancelled using compensating transactions.
Pros and Cons
| Pros | Cons |
|---|---|
| No distributed transaction — no 2PC, no locking across services | Eventual consistency — temporary inconsistency between steps |
| Resilience — failures handled via compensation, not rollback | Complex failure logic — compensations must be reliable |
| Scalability — each service scales independently | Debugging difficulty — tracing saga executions across services |
| Flexibility — choreography or orchestration styles | No isolation — intermediate states are visible (lack of ACID isolation) |
| Business alignment — compensations mirror real-world rollbacks | Idempotency required — every step must handle retries |
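The missing isolation listed in the table is often softened with a semantic lock: a record created by a saga carries a pending status until the saga finishes, and readers ignore pending records. The sketch below illustrates the idea under that assumption. `OrderStatus`, `Order`, `complete_saga`, and `visible_to_customers` are hypothetical names, not part of the examples above.

```python
from dataclasses import dataclass
from enum import Enum


class OrderStatus(Enum):
    PENDING = "pending"      # saga in progress: acts as the semantic lock
    CONFIRMED = "confirmed"  # saga finished successfully
    CANCELLED = "cancelled"  # saga was compensated


@dataclass
class Order:
    order_id: str
    status: OrderStatus = OrderStatus.PENDING


def complete_saga(order: Order, succeeded: bool) -> None:
    """Release the semantic lock by moving the order to a final state."""
    order.status = OrderStatus.CONFIRMED if succeeded else OrderStatus.CANCELLED


def visible_to_customers(orders: list[Order]) -> list[str]:
    """Return only orders whose saga completed successfully."""
    return [o.order_id for o in orders if o.status is OrderStatus.CONFIRMED]


orders = [Order("a"), Order("b"), Order("c")]
complete_saga(orders[1], succeeded=True)   # saga for "b" succeeded
complete_saga(orders[2], succeeded=False)  # saga for "c" was compensated
visible = visible_to_customers(orders)     # "a" is still pending
```

This does not restore ACID isolation. It only makes intermediate states explicit so that other code can choose to skip them.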
When to Use Sagas
Use sagas when:
- Multi-service transaction — a workflow spans multiple microservices
- No 2PC available — services use different databases or databases don't support distributed transactions
- Eventual consistency acceptable — temporary inconsistency (seconds to minutes) is tolerable
- Business rollbacks exist — real-world processes have natural compensations (cancel booking, refund payment)
Skip sagas for single-service transactions (use ACID), for workflows requiring strict ACID isolation, or when a step's effects can't be compensated and the step can't be moved to the end of the saga (e.g., an irreversible side effect like an email that has already been sent).
Related Concepts
- Event-Driven Architecture — sagas run on events
- Microservices Architecture — sagas solve the distributed data problem
- CQRS Pattern — commands in sagas produce events
- Event Sourcing — event-sourced aggregates in saga steps
- Modular Monolith vs Microservices — avoid sagas with modular monoliths (use ACID)
Practice Questions
What is the fundamental difference between a saga and a distributed ACID transaction?
Compare choreography and orchestration sagas — provide one advantage of each.
What is a compensating transaction, and how does it differ from a database rollback?
Why must every step in a saga be idempotent?
What happens when a compensating transaction itself fails?
Challenge
Design a saga for an e-commerce checkout flow with these steps: validate cart, reserve inventory, charge payment, create shipment, send confirmation email. Implement both choreography and orchestration versions. Add compensation for each step that can fail.
Real-World Task
Identify a multi-step workflow in your application that currently runs in a single database transaction but spans multiple services. Convert it to a saga. Define the steps, compensations, and event contracts. Test the saga by simulating a failure in the middle step and verifying that all preceding steps are compensated correctly.