Abstract Interpretation — Sound Static Analysis

DodaTech Updated 2026-06-21 5 min read

In this tutorial, you'll learn about Abstract Interpretation. We cover key concepts, practical examples, and best practices to help you understand and apply this topic effectively.

Abstract Interpretation provides a mathematical framework for sound Static Analysis by overapproximating program behavior, enabling compilers and analyzers to prove the absence of certain bugs.

Learning Path

flowchart LR
  A["Model Checking"] --> B["Abstract Interpretation
Sound Static Analysis"]
  B --> C["Invariant Generation"]
  B --> D["Separation Logic"]
  style B fill:#f90,color:#fff,stroke-width:2px

ℹ️ Info

What you'll learn: Abstract domains (sign, interval, polyhedra), Galois connections, the abstract Interpreter design, widening for convergence, and soundness proofs.

Why it matters: Abstract Interpretation powers production static analyzers like Astrée, Polyspace, and Frama-C, used in aerospace and automotive software certification.

Real-world use: Astrée verified the absence of runtime errors in Airbus flight control software. Durga Antivirus Pro uses Abstract Interpretation to statically detect malicious code patterns.

Prerequisites

Program analysis basics, order theory (lattices), and a programming language like C or Python.

What Is Abstract Interpretation?

Abstract Interpretation runs a program over an abstract domain instead of concrete values. The abstract domain overapproximates all possible concrete values. If the analysis shows no error in the abstract, no error can occur in the concrete program.

The key insight: it is acceptable to have false positives (reporting possible errors that cannot happen), but never false negatives (missing real errors).

Step-by-Step: Sign Domain

Step 1: Define the Abstract Domain

The sign domain classifies integers as negative, zero, or positive.

from enum import Enum

class Sign(Enum):
    BOT = 0     # bottom (impossible)
    NEG = 1     # negative
    ZERO = 2    # zero
    POS = 3     # positive
    TOP = 4     # top (any)

    def join(self, other):
        if self == Sign.BOT:
            return other
        if other == Sign.BOT:
            return self
        if self == other:
            return self
        return Sign.TOP

Step 2: Define Abstract Operators

def abstract_add(a, b):
    if a == Sign.BOT or b == Sign.BOT:
        return Sign.BOT
    if a == Sign.TOP or b == Sign.TOP:
        return Sign.TOP
    if a == Sign.ZERO:
        return b
    if b == Sign.ZERO:
        return a
    if a == Sign.POS and b == Sign.POS:
        return Sign.POS
    if a == Sign.NEG and b == Sign.NEG:
        return Sign.NEG
    return Sign.TOP  # POS + NEG could be anything

print(f"POS + POS = {abstract_add(Sign.POS, Sign.POS)}")
print(f"POS + NEG = {abstract_add(Sign.POS, Sign.NEG)}")

Expected output:

POS + POS = Sign.POS
POS + NEG = Sign.TOP

Step-by-Step: Interval Domain

Step 2 (Interval): Define the Interval Lattice

class Interval:
    def __init__(self, lo, hi):
        self.lo = lo  # None means -infinity
        self.hi = hi  # None means +infinity

    @staticmethod
    def top():
        return Interval(None, None)

    @staticmethod
    def bot():
        return Interval(None, None)  # flagged by is_bot

    def join(self, other):
        lo = min(self.lo, other.lo) if self.lo is not None and other.lo is not None else None
        hi = max(self.hi, other.hi) if self.hi is not None and other.hi is not None else None
        return Interval(lo, hi)

    def __repr__(self):
        lo = "-inf" if self.lo is None else str(self.lo)
        hi = "+inf" if self.hi is None else str(self.hi)
        return f"[{lo}, {hi}]"

interval = Interval(1, 5).join(Interval(3, 10))
print(f"Joined interval: {interval}")

Expected output:

Joined interval: [1, 10]

Widening for Convergence

Loops require infinite ascending chains in the abstract domain. Widening ensures termination by extrapolating to a fixed point.

def widening(prev, curr):
    # Widen: if upper bound increases, set to +inf
    lo = prev.lo if prev.lo == curr.lo else None
    hi = prev.hi if prev.hi == curr.hi else None
    return Interval(lo, hi)

prev = Interval(0, 1)
curr = Interval(0, 2)
result = widening(prev, curr)
print(f"Widened: {result}")

Expected output:

Widened: [0, +inf]

Common Errors

1. Forgetting Bottom

Failing to define bottom leads to incorrect results for dead code paths. Every abstract domain must include bottom.

2. Non-Monotonic Transfer Functions

Abstract operators must be monotonic with respect to the lattice order. Non-monotonic functions break the fixed-point computation.

3. Not Widening Early Enough

Without widening, loops may never converge. Apply widening after a fixed number of iterations.

4. Over-Abstracting Precision

An overly coarse domain (e.g., sign only) produces too many false positives to be useful. Choose a domain with sufficient precision for your properties.

5. Soundness vs Completeness Tradeoff

Abstract Interpretation is sound but incomplete. False positives are guaranteed; false negatives are not. Understanding this tradeoff is critical for tool users.

Practice Questions

Q1: What is a Galois connection?

A pair of monotonic functions (alpha, gamma) between concrete and abstract domains such that alpha concretizes to at most gamma and gamma abstracts to at most alpha.

Q2: Why does widening guarantee termination?

Widening replaces an infinite ascending chain with a finite one by extrapolating to the top element.

Q3: What is the interval domain good for?

Proving the absence of buffer overflows and array out-of-bounds errors by tracking variable ranges.

Q4: What is a false positive in Abstract Interpretation?

The analyzer reports a possible error that cannot actually occur in the concrete program.

Q5: Name three abstract domains.

Sign domain (neg/zero/pos), interval domain ([lo, hi]), polyhedron domain (linear inequalities).

Challenge

Implement a simple abstract Interpreter for a tiny imperative language with assignments, if-statements, and while-loops. Use the interval domain to prove that for i in range(0, n): a[i] = 0 never accesses array a out of bounds.

FAQ

### How does Abstract Interpretation relate to type systems?

Type systems are a form of Abstract Interpretation where types abstract over sets of values. Type soundness is analogous to Abstract Interpretation soundness.

### What is the octagon domain?

An abstract domain representing constraints of the form ±x ± y ≤ c, more precise than intervals but less costly than polyhedra.

### Can Abstract Interpretation prove program termination?

Yes, with ranking functions abstracted over well-founded sets.

### What tools use Abstract Interpretation?

Astrée, Polyspace, Frama-C, CodeSonar, and Infer all use Abstract Interpretation for different analysis goals.

### Is Abstract Interpretation used in compilers?

Yes. GCC and LLVM use Abstract Interpretation in their optimization passes for constant propagation and range analysis.

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.

← Previous Symbolic Execution — KLEE, Angr and Path Exploration Next → TLA+ — Specifying Distributed Systems

Built by the developers of DodaTech

Doda Browser, DodaZIP & Durga Antivirus Pro

Home Browse Formal Verification