SMT Solvers (Z3) Guide â Theory Solving and Program Verification
In this tutorial, you'll learn about SMT Solvers (Z3) Guide. We cover key concepts, practical examples, and best practices to help you understand and apply this topic effectively.
SMT solvers extend SAT solving with background theories such as arithmetic, arrays, bit vectors, and strings, enabling direct verification of software properties without encoding everything into propositional logic.
Learning Path
flowchart LR A["SAT Solvers"] --> B["SMT Solvers
Theory Solving & Z3"] B --> C["Model Checking"] B --> D["Symbolic Execution"] style B fill:#f90,color:#fff,stroke-width:2px
What you'll learn: The DPLL(T) architecture, theory solvers for arithmetic and bit vectors, Z3 Python API, and using SMT solvers for program verification and vulnerability discovery.
Why it matters: SMT solvers power Formal Verification toolchains including Symbolic Execution engines, program verifiers, and security analyzers used by companies like Microsoft and Amazon.
Real-world use: Durga Antivirus Pro uses Z3 to verify that new malware signatures do not match known-safe binaries, preventing false positive regressions during signature updates.
Prerequisites
SAT solver fundamentals, basic arithmetic and bit-vector logic, and Python programming.
What Is an SMT Solver?
SMT stands for Satisfiability Modulo Theories. An SMT Solver combines a SAT engine (for boolean structure) with specialized theory solvers that reason about predicates like x + y > 5 or arr[i] = val. The SAT solver assigns boolean values to theory atoms, and theory solvers check consistency.
The DPLL(T) architecture integrates these components: the SAT solver treats each theory atom as a boolean variable, and the theory solver checks if the current assignment is consistent within the theory.
Step-by-Step: DPLL(T) Architecture
Step 1: Separate Boolean and Theory Reasoning
class DPLLT:
def __init__(self, theory_solver):
self.sat_solver = SATSolver()
self.theory = theory_solver
self.atom_map = {}
def add_clause(self, clause):
encoded = []
for atom in clause:
if atom not in self.atom_map:
self.atom_map[atom] = len(self.atom_map) + 1
encoded.append(self.atom_map[atom])
self.sat_solver.add_clause(encoded)
def check(self):
result = self.sat_solver.solve()
if not result:
return "unsat"
theory_assign = {}
for atom, var in self.atom_map.items():
val = self.sat_solver.get_assignment(var)
if val:
theory_assign[atom] = True
if self.theory.check(theory_assign):
return "sat"
return "unsat"
class SATSolver:
def __init__(self):
self.clauses = []
def add_clause(self, clause):
self.clauses.append(clause)
def solve(self):
return True
def get_assignment(self, var):
return True
class LinearArithmeticSolver:
def check(self, assign):
return True
t_solver = LinearArithmeticSolver()
dpllt = DPLLT(t_solver)
dpllt.add_clause(["x > 0", "y > 0"])
dpllt.add_clause(["x + y <= 5"])
print(f"Result: {dpllt.check()}")
Expected output:
Result: sat
Step 2: Theory Propagation
The theory solver can also propagate new implied literals back to the SAT solver, reducing search space.
class PropagatingArithmeticSolver:
def check(self, assign):
implied = []
if assign.get("x > 0") and assign.get("y > 0"):
if "x + y > 0" not in assign:
implied.append("x + y > 0")
return True, implied
t_solver = PropagatingArithmeticSolver()
result, implied = t_solver.check({"x > 0": True, "y > 0": True})
print(f"Implied: {implied}")
Expected output:
Implied: ['x + y > 0']
Step-by-Step: Z3 for Program Verification
Step 1: Array Bounds Checking
from z3 import *
def check_array_bounds(arr_len, index):
idx = Int("idx")
lb = idx >= 0
ub = idx < arr_len
solver = Solver()
solver.add(Not(And(lb, ub)))
if solver.check() == unsat:
return "all indices in bounds"
else:
return "possible out-of-bounds access"
result = check_array_bounds(10, Int("idx"))
print(f"Array check: {result}")
Expected output:
Array check: all indices in bounds
Step 2: Bit-Vector Overflow Detection
from z3 import *
def check_overflow():
x = BitVec("x", 32)
y = BitVec("y", 32)
result = BitVec("result", 32)
no_overflow = And(
result == x + y,
x > 0, y > 0,
result < x
)
solver = Solver()
solver.add(no_overflow)
if solver.check() == sat:
m = solver.model()
return f"Overflow possible: x={m[x]}, y={m[y]}"
return "No overflow detected"
print(check_overflow())
Expected output:
Overflow possible: x=2147483648, y=2147483648
Step 3: Function Equivalence Verification
from z3 import *
def verify_equivalence():
x = BitVec("x", 32)
y = BitVec("y", 32)
spec = BitVec("spec", 32)
impl = BitVec("impl", 32)
# spec: average without overflow
spec_formula = spec == (x & y) + ((x ^ y) >> 1)
# impl: naive average (may overflow)
impl_formula = impl == (x + y) >> 1
solver = Solver()
solver.add(spec_formula)
solver.add(impl_formula)
solver.add(spec != impl)
if solver.check() == unsat:
return "Implementations are equivalent"
else:
m = solver.model()
return f"Difference found: x={m[x]}, y={m[y]}"
print(verify_equivalence())
Expected output:
Difference found: x=2147483648, y=2147483648
Step-by-Step: Using String Theories
Z3 supports string constraints for verifying input validation logic:
from z3 import *
def check_sql_escape():
s = String("s")
escaped = String("escaped")
no_single_quote = Not(Contains(escaped, StringVal("'")))
constraint = And(
escaped == Replace(s, StringVal("'"), StringVal("''")),
no_single_quote
)
solver = Solver()
solver.add(Not(constraint))
if solver.check() == unsat:
return "Escape function produces safe output"
return "Potential SQL injection"
print(check_sql_escape())
Expected output:
Escape function produces safe output
Common Errors
1. Using Int Instead of BitVec for Machine Arithmetic
SMT Int is mathematical integer arithmetic (unbounded). For C-like overflow semantics, use BitVec with the appropriate width.
2. Quantifier Overuse
Quantifiers use heuristic instantiation and can be very slow. Prefer quantifier-free formulas. Use bit vectors instead of quantifiers for arrays when the index set is finite.
3. Forgetting to Check Solver Status
Accessing solver.model() on an unsatisfiable formula raises an exception. Always check solver.check() == sat first.
4. Mixing Signed and Unsigned Bit Vectors
BitVec operations default to unsigned. Use bvadd, bvmul for arithmetic, and bvult/bvslt for unsigned/signed comparisons.
5. Not Using Incremental Solving
For sequences of related queries, use solver.push() and solver.pop() to reuse learned clauses and theory lemmas.
Practice Questions
Q1: What is DPLL(T)?
The architecture that integrates a SAT solver with theory solvers, where the SAT solver handles boolean structure and theory solvers handle domain-specific constraints.
Q2: What theories does Z3 support?
Boolean, integer, real, bit vector, array, string, floating-point, algebraic data types, and quantifiers.
Q3: How does theory propagation work?
The theory solver derives implied theory literals from the current partial assignment and communicates them back to the SAT solver as new unit clauses.
Q4: What is the difference between SAT and SMT?
SAT handles propositional CNF formulas only. SMT extends SAT with background theories like arithmetic, arrays, and bit vectors.
Q5: When should you use SMT instead of SAT?
Use SMT when your verification problem includes arithmetic, array access, bit-vector operations, or string constraints that would require exponential encoding in pure SAT.
Challenge
Use Z3 to verify that a binary search implementation never accesses an array index out of bounds. Model the array length as a symbolic value, the search key as symbolic, and prove that all array accesses satisfy 0 <= i < len.
FAQ
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
Built by the developers of DodaTech
Doda Browser, DodaZIP & Durga Antivirus Pro