
Error Handling and Recovery in Compilers

DodaTech Updated 2026-06-21 7 min read

This tutorial covers error handling and recovery in compilers: the key concepts, practical examples, and best practices for applying them.

Error handling and recovery in compilers refers to the techniques used to detect, report, and recover from errors in source code, allowing the compiler to continue processing and discover multiple errors in a single compilation session.

What You'll Learn & Why It Matters

You will learn how production compilers detect errors, report them with actionable messages, and recover to find additional errors. Good error handling determines whether a compiler is usable or frustrating, and users often judge a compiler largely by the quality of its error messages.

Real-world use: Durga Antivirus Pro uses compiler-grade error recovery to parse malformed executables that attackers intentionally corrupt to evade detection, recovering meaningful analysis data from broken structures.

Prerequisites

You should understand parsing from the syntax analysis tutorial. Familiarity with lexer and parser implementation from the lexical analysis tutorial and the AST tutorial is helpful.

Error Classification

Errors in a program fall into five categories. The first three are detected during the compilation phases:

| Error Type | Phase | Examples | Severity |
| --- | --- | --- | --- |
| Lexical | Tokenization | Invalid character, unterminated string | Recoverable |
| Syntax | Parsing | Missing semicolon, unmatched brace | Recoverable |
| Semantic | Type Checking | Type mismatch, undeclared variable | Recoverable |
| Logical | Runtime | Division by zero, null dereference | Not detectable |
| Fatal | Any | Out of memory, disk full | Unrecoverable |
graph TD
    A[Source Code] --> B[Lexical Analysis]
    B -->|Error| C[Lexical Error: bad character]
    B --> D[Syntax Analysis]
    D -->|Error| E[Syntax Error: missing semicolon]
    D --> F[Semantic Analysis]
    F -->|Error| G[Semantic Error: type mismatch]
    F --> H[Code Generation]
    H --> I[Executable]
    style C fill:#f44336,color:#fff
    style E fill:#f44336,color:#fff
    style G fill:#f44336,color:#fff
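
To make the "recoverable" label concrete for the lexical phase, here is a minimal sketch of a tokenizer that records an invalid character and keeps scanning instead of stopping. The function name `tokenize`, the token shape, and the set of accepted symbols are assumptions made for this illustration, not part of any particular compiler.

```python
def tokenize(source):
    """Return (tokens, errors); on a bad character, record it and keep scanning."""
    tokens, errors = [], []
    line, col, i = 1, 1, 0
    while i < len(source):
        ch = source[i]
        if ch == "\n":
            line, col, i = line + 1, 1, i + 1
        elif ch.isspace():
            col, i = col + 1, i + 1
        elif ch.isalnum() or ch == "_":
            start = i
            while i < len(source) and (source[i].isalnum() or source[i] == "_"):
                i += 1
            tokens.append(("WORD", source[start:i], line, col))
            col += i - start
        elif ch in "=+-*/;(){}":
            tokens.append(("SYMBOL", ch, line, col))
            col, i = col + 1, i + 1
        else:
            # Lexical error: report it, skip one character, and continue
            errors.append(f"<stdin>:{line}:{col}: error: Invalid character {ch!r}")
            col, i = col + 1, i + 1
    return tokens, errors

tokens, errors = tokenize("x = 4 @ 2;\ny = $3;")
for e in errors:
    print(e)
# <stdin>:1:7: error: Invalid character '@'
# <stdin>:2:5: error: Invalid character '$'
```

Both bad characters are reported in one pass, and the remaining nine valid tokens are still available to the parser.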

Error Reporting

A good error message contains three parts:

  1. Location: File, line, and column
  2. Description: What went wrong, in human terms
  3. Suggestion: How to fix it (when possible)
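class CompilerError(Exception):
    """An error tied to a source location."""
    def __init__(self, message, line, column, filename="<stdin>"):
        super().__init__(message)  # keep standard exception behavior (e.args)
        self.message = message
        self.line = line
        self.column = column
        self.filename = filename

    def __str__(self):
        # file:line:column: error: message
        return f"{self.filename}:{self.line}:{self.column}: error: {self.message}"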

class ErrorReporter:
    def __init__(self):
        self.errors = []
        self.warnings = []

    def report(self, error):
        self.errors.append(error)
        print(error)

    def warn(self, message, line, column):
        warning = f"<stdin>:{line}:{column}: warning: {message}"
        self.warnings.append(warning)
        print(warning)

    def has_errors(self):
        return len(self.errors) > 0

reporter = ErrorReporter()
reporter.report(CompilerError("Expected ';' after statement", 5, 12))
reporter.warn("Variable 'x' is assigned but never used", 3, 1)

Expected output:

<stdin>:5:12: error: Expected ';' after statement
<stdin>:3:1: warning: Variable 'x' is assigned but never used
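
The third part of a good message, the suggestion, can be sketched as follows. This is one possible approach, assuming the compiler has a list of declared names at hand; the function `undeclared_variable_message` is hypothetical, and the similarity lookup uses `difflib.get_close_matches` from the Python standard library.

```python
import difflib

def undeclared_variable_message(name, declared, line, column, filename="<stdin>"):
    """Build an error with location, description, and an optional suggestion."""
    message = f"{filename}:{line}:{column}: error: Undeclared variable '{name}'"
    # get_close_matches returns the most similar candidates first
    matches = difflib.get_close_matches(name, declared, n=1)
    if matches:
        message += f"\nhelp: did you mean '{matches[0]}'?"
    return message

print(undeclared_variable_message("coutn", ["count", "total", "index"], 7, 9))
# <stdin>:7:9: error: Undeclared variable 'coutn'
# help: did you mean 'count'?
```

When no declared name is similar enough, the help line is omitted, which is better than offering a misleading guess.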

Panic Mode Recovery

Panic mode is the simplest recovery strategy. When the parser encounters an error, it discards tokens until it finds a synchronization token (typically ;, }, end, or a keyword), then resumes parsing from that point.

class PanicModeParser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.reporter = ErrorReporter()
        self.sync_tokens = {";", "}", "end", "else", "fi"}

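    def parse(self):
        while self.pos < len(self.tokens):
            start = self.pos
            try:
                self.parse_statement()
            except CompilerError as e:
                self.reporter.report(e)
                self.synchronize()
                # Guarantee progress: if neither parsing nor recovery
                # consumed a token, skip one so the loop cannot spin forever
                if self.pos == start:
                    self.pos += 1

    def synchronize(self):
        # Discard tokens until a statement boundary is found
        while self.pos < len(self.tokens):
            token = self.tokens[self.pos]
            if token[0] == "SEPARATOR" and token[1] in self.sync_tokens:
                self.pos += 1  # consume the separator; the next statement follows it
                return
            if token[0] == "KEYWORD":
                return  # a keyword starts the next statement, so keep it
            self.pos += 1

    def parse_statement(self):
        token = self.peek()
        if token[0] == "IDENTIFIER":
            self.parse_assignment()
        elif token[0] == "KEYWORD" and token[1] == "if":
            self.parse_if()
        else:
            # Tokens here are (type, value) pairs with no position,
            # so line and column are reported as 0
            raise CompilerError(
                f"Unexpected token: {token[1]}", 0, 0
            )

    def parse_if(self):
        # Placeholder: 'if' statements are outside the scope of this example
        self.consume()
        raise CompilerError("'if' statements are not supported in this example", 0, 0)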

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("EOF", "")

    def parse_assignment(self):
        ident = self.consume()
        if self.peek()[0] != "OPERATOR" or self.peek()[1] != "=":
            raise CompilerError("Expected '=' in assignment", 0, 0)
        self.consume()
        self.parse_expression()
        if self.peek()[0] != "SEPARATOR" or self.peek()[1] != ";":
            raise CompilerError("Expected ';' after assignment", 0, 0)
        self.consume()

    def parse_expression(self):
        if self.peek()[0] == "NUMBER":
            self.consume()
        elif self.peek()[0] == "IDENTIFIER":
            self.consume()
        else:
            raise CompilerError("Expected expression", 0, 0)

    def consume(self):
        token = self.peek()
        self.pos += 1
        return token
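
To see panic mode report several errors in one pass, here is a compact standalone sketch. It is a simplified illustration rather than a full parser: the function `parse_assignments` is hypothetical, it only accepts statements of the form `IDENTIFIER = NUMBER ;`, and it reports token indexes instead of line and column because the tokens carry no position.

```python
def parse_assignments(tokens):
    """Parse `IDENTIFIER = NUMBER ;` statements, recovering in panic mode."""
    errors, parsed, pos = [], [], 0

    def at(i, kind, value=None):
        # True if token i exists and has the given type (and value, if given)
        if i >= len(tokens):
            return False
        return tokens[i][0] == kind and (value is None or tokens[i][1] == value)

    while pos < len(tokens):
        if (at(pos, "IDENTIFIER") and at(pos + 1, "OPERATOR", "=")
                and at(pos + 2, "NUMBER") and at(pos + 3, "SEPARATOR", ";")):
            parsed.append((tokens[pos][1], tokens[pos + 2][1]))
            pos += 4
            continue
        errors.append(f"error: malformed statement starting at token {pos}")
        # Panic mode: discard tokens up to and including the next ';'
        while pos < len(tokens) and not at(pos, "SEPARATOR", ";"):
            pos += 1
        pos += 1  # step past the ';' so the loop always makes progress
    return parsed, errors

tokens = [
    ("IDENTIFIER", "a"), ("OPERATOR", "="), ("NUMBER", "1"), ("SEPARATOR", ";"),
    ("IDENTIFIER", "b"), ("NUMBER", "2"), ("SEPARATOR", ";"),   # missing '='
    ("IDENTIFIER", "c"), ("OPERATOR", "="), ("NUMBER", "3"), ("SEPARATOR", ";"),
    ("OPERATOR", "="), ("NUMBER", "4"), ("SEPARATOR", ";"),     # missing name
]
parsed, errors = parse_assignments(tokens)
print(parsed)   # [('a', '1'), ('c', '3')]
print(errors)
```

Two malformed statements produce exactly two errors, and the valid statements on either side of them are still parsed.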

Error Productions

Error productions extend the grammar with rules that match common mistakes, allowing the parser to produce better error messages and continue parsing.

class ErrorProductionDecorator:
    """Wraps a parser and recognizes one common mistake: two adjacent numbers."""
    def __init__(self, parser):
        self.parser = parser

    def parse_expression(self):
        result = self.parser.parse_expression()
        # Error production: an operand directly followed by a number
        # means the operator between them is missing
        if self.parser.peek()[0] == "NUMBER":
            raise CompilerError(
                "Missing operator between numbers. Did you mean '3 + 4'?",
                # Fall back to 0 when the wrapped parser does not track positions
                getattr(self.parser, "line", 0),
                getattr(self.parser, "column", 0),
            )
        return result

Error Recovery Strategies Comparison

| Strategy | Complexity | Error Quality | Implementation |
| --- | --- | --- | --- |
| Panic mode | Low | Poor | Skip to sync token |
| Error productions | Medium | Good | Add error grammar rules |
| Phrase level | Medium | Good | Local patch of parse stack |
| Global correction | High | Best | Find minimal edit distance |
| Fault tolerance | Medium | Good | Continue with speculative parse |
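
Phrase-level recovery is the least obvious row in the comparison, so here is a rough sketch of the idea: instead of discarding input, the parser makes a small local repair and carries on. The example assumes a toy grammar of assignment statements in which an operand followed directly by an identifier can only mean a dropped `;`. The function `insert_missing_semicolons` is hypothetical, and it patches the token stream rather than a real parse stack.

```python
def insert_missing_semicolons(tokens):
    """Return (patched_tokens, errors), adding ';' where a statement clearly ended."""
    patched, errors = [], []
    for i, token in enumerate(tokens):
        prev = patched[-1] if patched else None
        # An operand directly followed by an identifier: the ';' was dropped
        if (prev is not None and prev[0] in ("NUMBER", "IDENTIFIER")
                and token[0] == "IDENTIFIER"):
            errors.append(f"error: expected ';' before '{token[1]}' (token {i})")
            patched.append(("SEPARATOR", ";"))  # local repair: pretend it was there
        patched.append(token)
    return patched, errors

tokens = [
    ("IDENTIFIER", "x"), ("OPERATOR", "="), ("NUMBER", "5"),   # ';' missing here
    ("IDENTIFIER", "y"), ("OPERATOR", "="), ("NUMBER", "3"), ("SEPARATOR", ";"),
]
patched, errors = insert_missing_semicolons(tokens)
print(errors)        # ["error: expected ';' before 'y' (token 3)"]
print(len(patched))  # 8
```

Unlike panic mode, nothing is skipped: the second statement is still parsed, so any real error inside it would still be reported.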

Contextual Error Messages

Modern compilers provide rich contextual information:

def format_error_with_context(source, line, column, message, fix=None):
    lines = source.split("\n")
    context_line = lines[line - 1]
    pointer = " " * (column - 1) + "^"
    result = f"error: {message}\n"
    result += f" --> line {line}, column {column}\n"
    # Size the gutter to the line number so both '|' separators line up
    gutter = " " * len(str(line))
    result += f"  {line} | {context_line}\n"
    result += f"  {gutter} | {pointer}\n"
    if fix:
        result += f"help: {fix}\n"
    return result

source = "let x = 5\nreturn x + "
msg = format_error_with_context(
    source, 2, 12, "Unexpected end of file",
    "Add the missing expression after the '+' operator."
)
print(msg)

Expected output:

error: Unexpected end of file
 --> line 2, column 12
  2 | return x +
    |            ^
help: Add the missing expression after the '+' operator.

Common Errors in Error Handling

Error 1: Cascade Errors

A single missing semicolon can cause dozens of spurious errors. Always synchronize after the first error to avoid error cascades.
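
One common way to limit cascades is a panic flag: after the first error, further reports are suppressed until the parser says it has resynchronized. The sketch below illustrates that idea; the class `QuietReporter` and its method names are assumptions for this example.

```python
class QuietReporter:
    """Reports the first error, then stays silent until the parser resynchronizes."""
    def __init__(self):
        self.errors = []
        self.panicking = False

    def report(self, message):
        if self.panicking:
            return  # likely a knock-on effect of the previous error
        self.panicking = True
        self.errors.append(message)

    def synchronized(self):
        # The parser calls this once it reaches a synchronization point
        self.panicking = False

reporter = QuietReporter()
reporter.report("3:9: error: Expected ';' after statement")
reporter.report("4:1: error: Unexpected token 'y'")    # suppressed
reporter.report("4:3: error: Expected expression")     # suppressed
reporter.synchronized()
reporter.report("8:5: error: Undeclared variable 'z'")
print(len(reporter.errors))  # 2
```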

Error 2: Vague Messages

"Syntax error" without location or context is useless. Always include line number, column, and a specific description of what was expected vs found.

Error 3: Stopping at First Error

Stopping compilation at the first error wastes the developer's time. Implement recovery to report multiple errors per compilation.

Error 4: Incorrect Error Recovery

Recovering by skipping too many tokens can cause real errors later to be missed. Choose synchronization points carefully.

Error 5: Ignoring Warnings

Warnings highlight potential bugs. Modern compilers let users promote warnings to errors or suppress specific warning codes.
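
A minimal sketch of such a policy might look like the following. The class `WarningPolicy`, its parameters, and the warning codes are hypothetical and do not correspond to any specific compiler's flags.

```python
class WarningPolicy:
    """Routes warnings according to user settings."""
    def __init__(self, warnings_as_errors=False, suppressed=()):
        self.warnings_as_errors = warnings_as_errors
        self.suppressed = set(suppressed)
        self.errors, self.warnings = [], []

    def warn(self, code, message):
        if code in self.suppressed:
            return  # the user silenced this warning code
        if self.warnings_as_errors:
            self.errors.append(f"error[{code}]: {message}")
        else:
            self.warnings.append(f"warning[{code}]: {message}")

strict = WarningPolicy(warnings_as_errors=True, suppressed={"unused-import"})
strict.warn("unused-variable", "Variable 'x' is assigned but never used")
strict.warn("unused-import", "Module 'os' is imported but never used")
print(strict.errors)    # ["error[unused-variable]: Variable 'x' is assigned but never used"]
print(strict.warnings)  # []
```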

Practice Questions

Question 1

What is panic mode error recovery?

Answer: Panic mode discards input tokens until a synchronization token (like `;` or `}`) is found, then resumes parsing. It is simple to implement but may skip valid code between the error and the sync point.

Question 2

What are error productions?

Answer: Error productions are grammar rules that match common programming mistakes, allowing the parser to produce specific, helpful error messages and continue parsing normally.

Question 3

What information should every compiler error message include?

Answer: Every error message should include the file name, line number, column number, a description of the problem, what was expected, and ideally a suggestion for fixing it.

Question 4

What is error cascading?

Answer: Error cascading occurs when a single error triggers many subsequent errors because the parser has lost synchronization with the input stream. Proper error recovery techniques prevent cascading.

Question 5

How do modern compilers like Rust and Elm improve error messages?

Answer: Modern compilers show the erroneous code with markers pointing to the problem area, suggest fixes, include examples of correct code, and provide links to documentation for the relevant error code.

Challenge

Implement an error recovery system for a recursive descent parser that handles three common mistakes: missing semicolons, unmatched parentheses, and a missing operator between operands. Each recovery strategy should produce a specific error message and resume parsing at the next statement boundary.

FAQ

What is the difference between an error and a warning?

An error indicates a violation of the language specification that makes the program invalid. A warning indicates questionable or potentially incorrect code that is still technically valid. A program with warnings still compiles, while a program with errors does not, but warnings often point to real bugs and are worth reviewing.

How does GCC's -fmax-errors work?

-fmax-errors=n limits the number of error messages to n. Once the limit is reached, GCC stops instead of continuing to process the source code. This prevents a single missing header from generating thousands of useless errors.
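
The same idea can be sketched in a few lines. This is only an illustration of an error limit, not a description of how GCC implements the option; `LimitedReporter` and `TooManyErrors` are hypothetical names.

```python
class TooManyErrors(Exception):
    """Raised to abort compilation once the error limit is reached."""

class LimitedReporter:
    def __init__(self, max_errors=3):
        self.max_errors = max_errors
        self.errors = []

    def report(self, message):
        self.errors.append(message)
        if len(self.errors) >= self.max_errors:
            raise TooManyErrors(f"stopping after {self.max_errors} errors")

reporter = LimitedReporter(max_errors=3)
try:
    # Simulate one missing header causing an error on every line
    for n in range(1, 1001):
        reporter.report(f"main.c:{n}:1: error: unknown type name 'size_t'")
except TooManyErrors as stop:
    print(stop)              # stopping after 3 errors
print(len(reporter.errors))  # 3
```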

What is an ICE (Internal Compiler Error)?

An ICE indicates a bug in the compiler itself, not in the user's program. Compiler developers request bug reports with test cases to fix ICEs. Modern compilers include crash reporting tools.

How do IDEs use compiler error output?

IDEs parse compiler error output to highlight errors inline, show tooltips with error messages, and provide quick-fix suggestions. The error format must be machine-parseable for IDE integration.
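
As a rough sketch of what "machine-parseable" means in practice, the following reads the `file:line:column: severity: message` format used earlier in this tutorial with a regular expression. The pattern and the function `parse_diagnostics` are assumptions for this example. Real tools handle more cases, such as file paths that contain colons.

```python
import re

# Matches lines like: <stdin>:5:12: error: Expected ';' after statement
DIAGNOSTIC = re.compile(
    r"^(?P<file>[^:]+):(?P<line>\d+):(?P<column>\d+): "
    r"(?P<severity>error|warning): (?P<message>.*)$"
)

def parse_diagnostics(output):
    """Return one dict per diagnostic line; lines in other formats are skipped."""
    results = []
    for text in output.splitlines():
        match = DIAGNOSTIC.match(text)
        if match:
            d = match.groupdict()
            d["line"], d["column"] = int(d["line"]), int(d["column"])
            results.append(d)
    return results

output = """<stdin>:5:12: error: Expected ';' after statement
<stdin>:3:1: warning: Variable 'x' is assigned but never used
note: this line has no location and is skipped"""
for d in parse_diagnostics(output):
    print(d["severity"], d["line"], d["column"], d["message"])
# error 5 12 Expected ';' after statement
# warning 3 1 Variable 'x' is assigned but never used
```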
