TLA+ — Specifying Distributed Systems

DodaTech Updated 2026-06-21 4 min read

This tutorial introduces TLA+ through its key concepts, small worked examples, and the mistakes newcomers most often make.

TLA+ (Temporal Logic of Actions) is a formal specification language designed by Leslie Lamport for modeling and verifying distributed and concurrent systems through state machines and temporal properties.

Learning Path

flowchart LR
  A["Temporal Logic"] --> B["TLA+<br/>Specifying Distributed Systems"]
  B --> C["Model Checking"]
  B --> D["Correct-by-Construction"]
  style B fill:#f90,color:#fff,stroke-width:2px

ℹ️ Info

What you'll learn: TLA+ syntax, specifying systems as state machines, writing invariants and temporal properties, and running the TLC model checker.

Why it matters: Bugs in distributed systems are notoriously hard to find through testing. TLA+ catches design errors before any code is written.

Real-world use: AWS uses TLA+ to verify the consistency models of DynamoDB, S3, and EBS. Durga Antivirus Pro uses TLA+ to model distributed signature update protocols.

Prerequisites

An understanding of temporal logic. Basic distributed systems concepts (consensus, replication).

What Is TLA+?

TLA+ describes a system as a state machine. A state is an assignment of values to variables. An action is a relation describing how variables change. A behavior is an infinite sequence of states.

A specification is a formula Init ∧ □[Next]_vars ∧ Fairness describing all possible behaviors.
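
The definitions above can be made concrete with a small Python sketch. It is only an illustration on finite prefixes of a behavior, not the full TLA+ semantics, and it leaves out the fairness part of the formula. The one-variable light switch (`on`) and all function names are assumptions made for this example.

```python
# Illustrative only: a state is a dict of variable values, and an action
# is a predicate over a pair (current state, next state).
# The "on" variable and the function names are hypothetical.

def init(state):
    # Init: the switch starts off.
    return state["on"] is False

def toggle(cur, nxt):
    # Next: the next-state value is the negation of the current one.
    return nxt["on"] == (not cur["on"])

def satisfies_spec_prefix(prefix):
    """Check Init and [][Next]_vars on a finite prefix of a behavior.

    [Next]_vars allows either a Next step or a stuttering step
    in which all variables are unchanged.
    """
    if not prefix or not init(prefix[0]):
        return False
    for cur, nxt in zip(prefix, prefix[1:]):
        stutter = cur == nxt
        if not (toggle(cur, nxt) or stutter):
            return False
    return True

good = [{"on": False}, {"on": True}, {"on": True}, {"on": False}]
bad = [{"on": True}, {"on": False}]  # first state violates Init

print(satisfies_spec_prefix(good))  # True
print(satisfies_spec_prefix(bad))   # False
```

The second step of `good` is a stuttering step, which the `[Next]_vars` form explicitly permits.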

Step-by-Step: A Simple TLA+ Spec

Step 1: Define the Module

---- MODULE Counter ----
EXTENDS Naturals
VARIABLE counter

Init == counter = 0

Increment == counter' = counter + 1

Next == Increment

Spec == Init ∧ [][Next]_counter


### Step 2: Write Invariants

TypeOK == counter ∈ Nat


### Step 3: Check with TLC

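TLC evaluates TypeOK in every reachable state. Because counter grows without bound, a real TLC run needs a bounded model, for example through a state constraint. The Python loop below imitates a ten-step run of the spec rather than invoking TLC.

```python
# Python simulation of the TLA+ spec
class CounterSpec:
    def __init__(self):
        self.counter = 0  # Init

    def increment(self):
        # Increment: counter' = counter + 1
        self.counter = self.counter + 1

    def invariant(self):
        # TypeOK: counter is a natural number
        return isinstance(self.counter, int) and self.counter >= 0

spec = CounterSpec()
for _ in range(10):
    spec.increment()
    if not spec.invariant():
        print("Invariant violated!")
        break
print(f"Final counter: {spec.counter}, invariant holds: {spec.invariant()}")
```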

Expected output:

Final counter: 10, invariant holds: True

Step-by-Step: The PlusCal Algorithm Language

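PlusCal is a pseudocode-like algorithm language that is translated to TLA+. It is often easier to use for algorithmic specifications. PlusCal has a C-style syntax and a Pascal-style syntax; the example here uses the Pascal-style one (begin ... end).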

Step 1: Write a PlusCal Algorithm

(* --algorithm transfer
variables
    balance_a = 100,
    balance_b = 0;

process Transfer = 1
begin
    TransferLoop:
        while TRUE do
            if balance_a > 0 then
                balance_a := balance_a - 1;
                balance_b := balance_b + 1;
            end if;
        end while;
end process;
end algorithm; *)

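Step 2: Translate to TLA+ and Check

The PlusCal translator writes the generated TLA+ definitions into the same module, where TLC can check them. The Python simulation that follows mirrors the algorithm for 50 transfers and checks that the total balance is conserved.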

class TransferSimulation:
    def __init__(self):
        self.balance_a = 100
        self.balance_b = 0

    def transfer(self):
        if self.balance_a > 0:
            self.balance_a -= 1
            self.balance_b += 1

    def invariant(self):
        return self.balance_a + self.balance_b == 100

sim = TransferSimulation()
for _ in range(50):
    sim.transfer()
    if not sim.invariant():
        print("Invariant violated!")
        break
print(f"A={sim.balance_a}, B={sim.balance_b}, invariant: {sim.invariant()}")

Expected output:

A=50, B=50, invariant: True
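
A single simulated run visits only some of the states. An explicit-state model checker instead visits every reachable state. The sketch below applies that idea to the transfer algorithm with a breadth-first search. It is a simplified illustration, not how TLC is implemented, and the function names are assumptions made for this example.

```python
from collections import deque

TOTAL = 100  # matches balance_a = 100, balance_b = 0 in the algorithm

def initial_states():
    return [(TOTAL, 0)]

def next_states(state):
    """Successors of one state under a single TransferLoop step."""
    a, b = state
    if a > 0:
        return [(a - 1, b + 1)]
    return [state]  # the loop body does nothing, so the state repeats

def invariant(state):
    a, b = state
    return a + b == TOTAL and a >= 0 and b >= 0

def explore():
    """Breadth-first search over all reachable states."""
    seen = set(initial_states())
    queue = deque(seen)
    while queue:
        state = queue.popleft()
        if not invariant(state):
            return False, len(seen), state  # counterexample state
        for succ in next_states(state):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return True, len(seen), None

ok, count, bad = explore()
print(f"invariant holds: {ok}, distinct states: {count}")
# invariant holds: True, distinct states: 101
```

The search terminates because only 101 balance pairs are reachable. The unbounded Counter spec would never finish under the same loop.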

Step-by-Step: Checking Liveness

Liveness properties use temporal operators. Example: every request is eventually served.

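Liveness == □ (request → ◇ served)

A finite trace can only approximate this property, because liveness is defined over infinite behaviors. The Python check that follows looks for a served state at or after each request in a finite trace:
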
def check_liveness(trace):
    for i, state in enumerate(trace):
        if state["request"]:
            # Find future serve
            served_later = False
            for j in range(i, len(trace)):
                if trace[j]["served"]:
                    served_later = True
                    break
            if not served_later:
                return False
    return True

trace = [
    {"request": True, "served": False},
    {"request": True, "served": True},
    {"request": False, "served": False}
]
print(f"Liveness holds: {check_liveness(trace)}")

Expected output:

Liveness holds: True
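
Liveness is a property of infinite behaviors, so a finite trace cannot settle it. One common finite encoding of an infinite behavior is a "lasso": a finite prefix followed by a cycle that repeats forever. The sketch below checks the same request/served property on such a lasso. The encoding and the function name are assumptions made for this example.

```python
def leads_to_on_lasso(prefix, cycle):
    """Check [](request => <>served) on prefix + cycle + cycle + ...

    From any position, the future contains the whole cycle, so a
    request is eventually served iff a served state occurs later in
    the prefix or anywhere in the cycle.
    """
    if not cycle:
        raise ValueError("cycle must contain at least one state")
    served_in_cycle = any(s["served"] for s in cycle)
    # A request inside the cycle needs a served state inside the cycle.
    if any(s["request"] for s in cycle) and not served_in_cycle:
        return False
    # A request in the prefix may be served later in the prefix or in the cycle.
    for i, state in enumerate(prefix):
        if state["request"]:
            served_later = any(s["served"] for s in prefix[i:])
            if not (served_later or served_in_cycle):
                return False
    return True

request = {"request": True, "served": False}
served = {"request": False, "served": True}
idle = {"request": False, "served": False}

print(leads_to_on_lasso([request], [served]))  # True
print(leads_to_on_lasso([request], [idle]))    # False: the request starves
```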

Common Errors

1. Forgetting Primed Variables in Next

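In TLA+, x' means the value of x in the next state. An action with no primed variables is only a predicate on the current state: it says nothing about the next state, so it cannot describe a change. Variables that an action leaves alone must be mentioned too, for example with UNCHANGED.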
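
The difference between a primed and an unprimed formula can be seen by enumerating the steps each one allows. This is a minimal sketch that treats an action as a predicate over a pair of values. The small value range and the function names are assumptions made for this example.

```python
DOMAIN = range(4)  # hypothetical small value range for x

def with_prime(x, x_next):
    # x' = x + 1 : relates the current value to the next one
    return x_next == x + 1

def without_prime(x, x_next):
    # x = x + 1 : never mentions x_next, and is false for every x
    return x == x + 1

def allowed_steps(action):
    """All (current, next) pairs over DOMAIN that the action permits."""
    return [(x, xn) for x in DOMAIN for xn in DOMAIN if action(x, xn)]

print(allowed_steps(with_prime))     # [(0, 1), (1, 2), (2, 3)]
print(allowed_steps(without_prime))  # []
```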

2. Weak Fairness Missing

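Without fairness constraints, a TLA+ specification allows behaviors that stutter forever even though an action stays enabled, so liveness properties fail. Adding a condition such as WF_vars(Next) to the specification rules out those behaviors.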
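
Weak fairness can be illustrated on the repeating part of an infinite behavior. The sketch below assumes the behavior eventually repeats a finite cycle of states and checks the usual reading of weak fairness: if the action is enabled in every state of the cycle, some step of the cycle must be that action. The Serve action, the `pending` variable, and the function names are hypothetical.

```python
def weakly_fair(cycle, enabled, is_step):
    """Check weak fairness of an action on a cycle repeated forever.

    It fails only if the action is enabled in every state of the
    cycle and yet no step of the cycle is a step of that action.
    """
    if not cycle:
        raise ValueError("cycle must contain at least one state")
    n = len(cycle)
    always_enabled = all(enabled(s) for s in cycle)
    # Include the wrap-around step from the last state back to the first.
    steps = [(cycle[i], cycle[(i + 1) % n]) for i in range(n)]
    taken = any(is_step(cur, nxt) for cur, nxt in steps)
    return (not always_enabled) or taken

# Hypothetical Serve action on a state with one boolean, "pending".
def serve_enabled(state):
    return state["pending"]

def is_serve_step(cur, nxt):
    return cur["pending"] and not nxt["pending"]

stuck = [{"pending": True}]                          # stutters forever
progress = [{"pending": True}, {"pending": False}]   # serve, then a new request

print(weakly_fair(stuck, serve_enabled, is_serve_step))     # False
print(weakly_fair(progress, serve_enabled, is_serve_step))  # True
```

The `stuck` cycle is exactly the infinite stuttering that a fairness condition excludes.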

3. Non-Constant vs Constant Parameters

Constants (declared with CONSTANT) are fixed by the model and keep the same value throughout a behavior. Variables (declared with VARIABLE) can change from state to state. Confusing them leads to incorrect specifications.

4. State Explosion in TLC

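TLC is an explicit-state model checker, so it enumerates every reachable state. Too many variables or too large a set of values causes state explosion. Keep constants small, declare symmetry sets for interchangeable values, and add state constraints to bound the model.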
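
A rough count shows why the state space grows so fast and why symmetry helps. The sketch below assumes a hypothetical model of identical processes, each in one of a few local states, with no shared variables. It is an idealized upper bound for illustration, not a description of TLC's symmetry reduction.

```python
from math import comb

def states_without_symmetry(processes, local_states):
    # Each process independently occupies one local state.
    return local_states ** processes

def states_with_full_symmetry(processes, local_states):
    # If processes are interchangeable, only the number of processes in
    # each local state matters: count multisets of size `processes`.
    return comb(processes + local_states - 1, local_states - 1)

for n in (2, 4, 8):
    print(n, states_without_symmetry(n, 3), states_with_full_symmetry(n, 3))
# 2 9 6
# 4 81 15
# 8 6561 45
```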

5. Infinite State Without Refinement

TLA+ specifications can have infinite state spaces. TLC can only model check finite instances. Prove infinite-state properties with TLAPS.

Practice Questions

Q1: What does [][Next]_vars mean in TLA+?

Every step of the behavior is either a Next step or a stuttering step that leaves vars unchanged.

Q2: What is stuttering in TLA+?

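A step in which all variables remain unchanged. A specification of the form [][Next]_vars is invariant under stuttering: adding or removing stuttering steps does not change whether a behavior satisfies it, which is what makes refinement between specifications possible.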

Q3: How does TLA+ differ from Alloy?

TLA+ focuses on the temporal behavior of dynamic systems. Alloy is built around relational modeling of structure and was traditionally used on static snapshots, although Alloy 6 adds temporal operators.

Q4: What is PlusCal?

A high-level algorithm language that compiles to TLA+, making specification more accessible to programmers.

Q5: Can TLA+ verify liveness properties?

Yes. The TLC model checker checks both safety properties (invariants) and liveness properties (temporal formulas). Liveness checks usually need fairness conditions in the specification.

Challenge

Model a simple distributed lock service with two clients in TLA+. Each client acquires and releases a lock. Write invariants: mutual exclusion (only one client holds the lock) and freedom from deadlock. Simulate in Python.

FAQ

### Do I need TLA+ for simple systems?

For single-threaded sequential programs, unit tests usually suffice. TLA+ excels at concurrent and distributed systems with complex interleavings.

### How is TLA+ different from testing?

A test checks one execution at a time. A TLA+ specification describes all possible executions, and TLC explores them for a finite model.

### What is the TLC model checker?

An explicit-state model checker for TLA+ specifications that explores all reachable states up to configurable bounds.

### Is TLA+ used in industry?

Yes. AWS, Microsoft, Oracle, and Alibaba use TLA+ to design and verify distributed protocols.

### What is the best way to learn TLA+?

Read Leslie Lamport's "Specifying Systems" book and work through the TLA+ course on the TLA+ website.


Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
