Code Generation — From IR to Machine Code

DodaTech Updated 2026-06-21 6 min read

In this tutorial, you'll learn about Code Generation. We cover key concepts, practical examples, and best practices to help you understand and apply this topic effectively.

Code Generation is the final phase of a compiler that translates the optimized Intermediate Representation into target machine code, selecting appropriate machine instructions and managing registers to produce executable output.

What You'll Learn & Why It Matters

In this tutorial, you will learn how compilers select machine instructions, map virtual registers to physical registers, and generate executable code for real processors. Code Generation determines the runtime performance and correctness of the compiled output.

Real-world use: Game engines generate specialized shader code for different GPUs using Code Generation techniques. Durga Antivirus Pro uses instruction-level analysis to detect malicious shellcode embedded in documents by recognizing suspicious instruction sequences.

Prerequisites

You should understand intermediate representations from the IR tutorial. Familiarity with C programming or assembly language basics is helpful. Understanding CPU architecture concepts like registers and instruction pipelines is assumed.

The Code Generation Pipeline

Code Generation involves three major tasks:

Instruction selection — mapping IR operations to target machine instructions
Register Allocation — assigning virtual registers to physical machine registers
Instruction scheduling — reordering instructions for pipeline efficiency

graph TD
    A[IR Instructions] --> B[Instruction Selection]
    B --> C[Register Allocation]
    C --> D[Instruction Scheduling]
    D --> E[Machine Code Output]

    B --> F[Pattern matching IR opcodes to machine instructions]
    C --> G[Graph coloring or linear scan for register assignment]
    D --> H[Reorder for pipeline and cache efficiency]

    style A fill:#4CAF50,color:#fff
    style E fill:#f44336,color:#fff
    style B fill:#2196F3,color:#fff
    style C fill:#2196F3,color:#fff
    style D fill:#2196F3,color:#fff

Instruction Selection

Instruction selection maps each IR operation to one or more target machine instructions. The simplest approach uses a macro-expansion pattern: each IR opcode maps to a fixed sequence of machine instructions.

Simple Macro-Expansion Code Generator

class CodeGenerator:
    def __init__(self, target="x86"):
        self.target = target
        self.code = []
        self.label_counter = 0

    def new_label(self):
        self.label_counter += 1
        return f"L{self.label_counter}"

    def generate(self, ir_instructions):
        for instr in ir_instructions:
            self.emit_instruction(instr)
        return self.code

    def emit_instruction(self, instr):
        parts = instr.split()
        if len(parts) >= 3 and parts[1] == "=":
            self.emit_assignment(parts)
        elif parts[0] == "if":
            self.emit_branch(parts)
        elif parts[0] == "goto":
            self.code.append(f"  jmp {parts[1]}")
        elif parts[0] == "return":
            self.code.append("  ret")
        elif parts[0] == "label:":
            self.code.append(f"{parts[0]}")
        else:
            self.code.append(f"  ; unrecognized: {instr}")

    def emit_assignment(self, parts):
        target = parts[0]
        source = parts[2:]
        if len(source) == 1:
            self.code.append(f"  mov eax, [{source[0]}]")
            self.code.append(f"  mov [{target}], eax")
        elif len(source) == 3:
            _, op, right = source
            left = parts[2]
            self.code.append(f"  mov eax, [{left}]")
            if op == "+":
                self.code.append(f"  add eax, [{right}]")
            elif op == "-":
                self.code.append(f"  sub eax, [{right}]")
            elif op == "*":
                self.code.append(f"  imul eax, [{right}]")
            elif op == "/":
                self.code.append(f"  idiv [{right}]")
            self.code.append(f"  mov [{target}], eax")

    def emit_branch(self, parts):
        _, left, op, right, _, label = parts
        self.code.append(f"  mov eax, [{left}]")
        self.code.append(f"  cmp eax, [{right}]")
        jump_map = {">": "jg", "<": "jl", ">=": "jge", "<=": "jle", "==": "je", "!=": "jne"}
        self.code.append(f"  {jump_map[op]} {label}")

Testing the code generator:

ir_code = [
    "t1 = 5",
    "t2 = 3",
    "t3 = t1 + t2",
    "if t3 > 0 goto L1",
    "return",
    "label: L1",
    "x = t3",
]
gen = CodeGenerator()
output = gen.generate(ir_code)
for line in output:
    print(line)

Expected output:

  mov eax, [5]
  mov [t1], eax
  mov eax, [3]
  mov [t2], eax
  mov eax, [t1]
  add eax, [t2]
  mov [t3], eax
  mov eax, [t3]
  cmp eax, [0]
  jg L1
  ret
L1:
  mov eax, [t3]
  mov [x], eax

Tree Pattern Matching

Modern compilers use tree pattern matching (like LLVM's DAG-to-DAG selection) to produce optimal instruction sequences. The IR is treated as a tree of operations, and the instruction selector finds the best covering of the tree using target-specific patterns.

class TreePatternMatcher:
    patterns = {
        ("add", ("reg", "reg")): "ADD R{}, R{}, R{}",
        ("add", ("reg", "imm")): "ADD R{}, R{}, #{}",
        ("mul", ("reg", "reg")): "MUL R{}, R{}, R{}",
        ("load", ("mem",)): "LDR R{}, [{}]",
        ("store", ("reg", "mem")): "STR R{}, [{}]",
    }

    def select(self, ir_node, registers):
        op = ir_node[0]
        args = ir_node[1:]
        for pattern, template in self.patterns:
            if self.match(pattern, (op, args)):
                operands = self.extract_operands(args, registers)
                return template.format(*operands)
        raise CodeGenError(f"No pattern matches {op}")

Memory-to-Memory vs Register Machines

The generated code depends on the target architecture:

Architecture	Model	Example
x86	Register-memory	`ADD eax, [mem]`
ARM	Load-store	`LDR r0, [addr]; ADD r0, r0, r1; STR r0, [addr]`
RISC-V	Load-store	`lw t0, 0(a0); add t0, t0, t1; sw t0, 0(a0)`

The code generator must respect the target's instruction formats and addressing modes.

Common Errors in Code Generation

Error 1: Incorrect Operand Order

x86 uses AT&T syntax (add src, dst) vs Intel syntax (add dst, src). Getting operand order wrong produces incorrect code without warnings.

Error 2: Forgetting to Spill Registers

When all physical registers are in use, the compiler must spill a register to memory. Failing to handle spilling causes Register Allocation failures.

Error 3: Misaligned Memory Access

Some architectures require aligned memory access. Generating unaligned load/store instructions causes runtime faults on ARM and other RISC processors.

Error 4: Incorrect Instruction Sizes

x86 instructions have varying sizes (BYTE, WORD, DWORD, QWORD). Specifying the wrong size truncates or corrupts data.

Error 5: Missing Prologue/Epilogue

Function calls require stack frame setup (prologue) and teardown (epilogue). Omitting these corrupts the call stack and causes crashes.

Practice Questions

Question 1

What is instruction selection?

Show answer

Instruction selection maps IR operations to target machine instructions. Each IR operation may map to one or more machine instructions depending on the target architecture's capabilities.

Question 2

Why do compilers use an instruction selector with tree pattern matching?

Show answer

Tree pattern matching produces better code than simple macro expansion because it considers the structure of the entire expression, choosing optimal instruction sequences and exploiting complex addressing modes.

Question 3

What is register spilling?

Show answer

When the number of live variables exceeds available physical registers, the compiler moves (spills) some values to memory and reloads them when needed. Spilling is expensive because memory access is much slower than register access.

Question 4

What does a function prologue do?

Show answer

The prologue saves the return address, caller's frame pointer, and callee-saved registers on the stack, then allocates space for local variables by adjusting the stack pointer.

Question 5

How does instruction scheduling improve performance?

Show answer

Instruction scheduling reorders instructions to reduce pipeline stalls. For example, it separates a load instruction from the instruction that uses the loaded value to hide memory latency.

Challenge

Build a code generator for a RISC-V target that handles a subset of IR including arithmetic operations, loads, stores, branches, and function calls. Ensure it generates valid RISC-V assembly that can be assembled with the standard RISC-V toolchain.

FAQ

What is the difference between a compiler and an assembler?

A compiler generates assembly code or machine code from a high-level language. An assembler converts assembly code (mnemonics) into binary machine code. Many compilers output assembly and invoke the assembler as a separate step.

How does position-independent code work?

Position-independent code (PIC) uses relative addressing instead of absolute addresses. The compiler generates code that works regardless of where it is loaded in memory, essential for shared libraries and ASLR security features.

Can Code Generation introduce security vulnerabilities?

Yes. Poor Code Generation can enable buffer overflows (missing bounds checks), information leaks (uninitialized memory), or timing side channels (data-dependent branch patterns). Security-conscious compilers add mitigations.

What is a cross-compiler?

A cross-compiler runs on one platform (host) but generates code for a different platform (target). Cross-compilation is essential for Embedded Systems, mobile apps, and game consoles where the target cannot run a compiler.

← Previous Intermediate Representations — Three-Address Code and SSA Next → Compiler Optimization Passes — Constant Folding, Inlining and Loop Unrolling

Built by the developers of DodaTech

Doda Browser, DodaZIP & Durga Antivirus Pro

Home Browse Compiler Design