
Memory Leak Detection & Fixing -- Heap Analysis, Object Retention & Garbage Collection

DodaTech Updated 2026-06-23 9 min read

Memory leaks cause applications to slow down and crash after running for hours or days -- this guide teaches you how to detect them using heap analysis, identify what is retaining objects, and fix the root cause in Python, Node.js, and JVM-based applications.

What You'll Learn

Why It Matters

A leak that grows by 1MB per minute crashes a 2GB server in 33 hours. Most teams only notice when pager duty alerts at 3 AM. Learning to detect leaks before they reach production, and to fix them quickly when they do, is a critical skill for any developer.
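The arithmetic behind that estimate is easy to reproduce. The sketch below assumes a constant leak rate and a hypothetical helper name (`hours_until_exhaustion`); treating 2 GB as 2048 MB gives about 34 hours, and treating it as 2000 MB gives about 33.

```python
def hours_until_exhaustion(capacity_mb: float, used_mb: float, leak_mb_per_min: float) -> float:
    """Hours until a steady leak consumes the remaining headroom."""
    if leak_mb_per_min <= 0:
        raise ValueError("leak rate must be positive")
    headroom_mb = capacity_mb - used_mb
    return headroom_mb / leak_mb_per_min / 60


# 2 GB server counted as 2048 MB, nothing else resident, leaking 1 MB per minute
print(f"{hours_until_exhaustion(2048, 0, 1.0):.1f} hours")  # 34.1 hours
```

In practice the baseline footprint of the application eats into the headroom, so the real deadline is usually shorter than this upper bound.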

Real-World Use

When your Python data processing script consumes all available RAM after running for 6 hours, a Node.js API server's RSS grows from 200MB to 2GB over a week, or a Java microservice gets OOM-killed by Kubernetes every 3 days, these detection and fixing techniques apply.

Common Memory Leak Patterns Table

Pattern | Language | Cause | Detection Method
Closure retention | Python, JS, Java | Function closure retains large variable | Heap snapshot comparison
Event listener leak | JS, Java | Listeners registered but never removed | Retained size analysis
Global cache growth | All | Unbounded cache without eviction | Memory timeline monitoring
Thread-local storage | Java | Thread pool with accumulating data | Thread dump analysis
Circular references | Python, JS | Reference cycles that delay collection or stay reachable from a live root | Referrer tracing (gc.get_referrers, DevTools Retainers); break cycles with weak references
Native memory leak | All | C/C++ extension or JNI not freeing memory | RSS monitoring, valgrind
Detached DOM nodes | JavaScript | DOM elements removed but still referenced | Heap snapshot with Detached filter
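To make the circular-reference row concrete, here is a small sketch of how CPython treats a cycle. It assumes CPython's reference counting plus cyclic collector; the `Node` class and `make_cycle` helper are hypothetical names used only for illustration. The cycle survives until the collector runs, and a weak reference lets you observe that without keeping the object alive.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.peer = None


def make_cycle():
    a, b = Node("a"), Node("b")
    a.peer, b.peer = b, a       # a <-> b reference cycle
    return weakref.ref(a)        # a weak reference does not keep `a` alive


gc.disable()                     # pause automatic collection so the demo is deterministic
probe = make_cycle()
alive_before = probe() is not None   # the cycle keeps both refcounts above zero
gc.collect()                         # an explicit pass of the cycle collector
alive_after = probe() is not None    # the unreachable cycle has been reclaimed
gc.enable()

print(alive_before, alive_after)     # True False (on CPython)
```

A cycle only becomes a real leak when something still reachable, such as a global list or a registry, points into it.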

Step-by-Step Fixes

Fix 1: Python Memory Leak Detection

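# leaky.py -- Global list that grows forever
import gc
import time
import tracemalloc

processed = []  # Global cache -- never cleared

def process_item(item):
    # Simulates a leak: every result stays referenced by the global list
    result = {
        "id": item,
        "data": "x" * 100_000,  # 100KB per item
        "metadata": {"timestamp": time.time(), "status": "processed"}
    }
    processed.append(result)
    return result

def get_memory_usage():
    # tracemalloc.start() must have been called before the allocations happen
    snapshot = tracemalloc.take_snapshot()
    top_stats = snapshot.statistics("lineno")
    for stat in top_stats[:5]:
        frame = stat.traceback[0]
        # Statistic.size is in bytes; convert to KB for display
        print(f"{stat.count:>6} blocks  {stat.size / 1024:>8.1f} KB  {frame.filename}:{frame.lineno}")

# Check for objects keeping references
def find_leaks():
    gc.collect()
    for obj in gc.get_objects():
        if isinstance(obj, list) and len(obj) > 1000:
            referrers = gc.get_referrers(obj)
            # Print referrer types only; printing the objects can dump huge containers
            kinds = [type(r).__name__ for r in referrers[:3]]
            print(f"Large list ({len(obj)} items) held by {kinds}")

if __name__ == "__main__":
    tracemalloc.start()
    for i in range(1500):
        process_item(i)
    get_memory_usage()
    find_leaks()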

Example output (illustrative; block counts, sizes, and line numbers depend on your run):

   500 blocks   50000.0 KB  /app/leaky.py:10
   100 blocks   10000.0 KB  /app/leaky.py:15
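Once the growing list is identified, one possible fix is to bound it. The sketch below swaps the unbounded list for a `collections.deque` with a `maxlen`; the cap of 100 entries and the names `recent` and `process_item_bounded` are assumptions for illustration, and a smaller payload is used to keep the demo light.

```python
from collections import deque

MAX_RESULTS = 100  # hypothetical cap; size it to what the application really needs

recent = deque(maxlen=MAX_RESULTS)


def process_item_bounded(item):
    result = {"id": item, "data": "x" * 10_000}
    recent.append(result)    # once full, the oldest entry is dropped automatically
    return result


for i in range(1_000):
    process_item_bounded(i)

print(len(recent))        # 100
print(recent[0]["id"])    # 900
```

This only fits when older results are genuinely disposable. If callers need every result, stream them to disk or a database instead of holding them in memory.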

Fix 2: Node.js Heap Snapshot Comparison

// snapshot-helper.js
const v8 = require('v8');
const fs = require('fs');

function takeHeapSnapshot(label) {
  const filename = `/tmp/heap-${label}-${Date.now()}.heapsnapshot`;
  const snapshot = v8.getHeapSnapshot();
  const stream = fs.createWriteStream(filename);
  snapshot.pipe(stream);
  stream.on('finish', () => {
    console.log(`Heap snapshot saved: ${filename}`);
    console.log(`Heap used: ${(process.memoryUsage().heapUsed / 1024 / 1024).toFixed(1)}MB`);
  });
  return filename;
}

// Take a snapshot every 10 minutes to observe growth.
// Snapshot generation is expensive on large heaps, so keep the interval long in production.
setInterval(() => {
  takeHeapSnapshot('periodic');
}, 10 * 60 * 1000).unref();

// Expose snapshot trigger via HTTP
require('http').createServer((req, res) => {
  if (req.url === '/snapshot') {
    takeHeapSnapshot('manual');
    res.end('Snapshot taken');
  } else {
    res.end('OK');
  }
}).listen(3000);

# Compare snapshots in Chrome DevTools
# 1. Take snapshot A
# 2. Perform operations
# 3. Take snapshot B
# 4. Open DevTools -> Memory -> Load both
# 5. Select snapshot B -> switch to "Comparison" view against snapshot A -> sort by "Size Delta"

Expected output:

Heap snapshot saved: /tmp/heap-periodic-1719123456789.heapsnapshot
Heap used: 245.3MB
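The same before/after comparison can be tried in Python with the standard-library `tracemalloc` module. This is a minimal sketch: the `retained` list and `do_work` function are hypothetical stand-ins for whatever your application does between the two snapshots.

```python
import tracemalloc

retained = []  # hypothetical leak: nothing ever removes these buffers


def do_work(n):
    for _ in range(n):
        retained.append(bytearray(10_000))


tracemalloc.start()
snapshot_a = tracemalloc.take_snapshot()
do_work(200)
snapshot_b = tracemalloc.take_snapshot()
tracemalloc.stop()

# Entries are sorted by the absolute size difference, largest first
top = snapshot_b.compare_to(snapshot_a, "lineno")
for stat in top[:3]:
    frame = stat.traceback[0]
    print(f"{stat.size_diff / 1024:+9.1f} KB  {stat.count_diff:+5d} blocks  "
          f"{frame.filename}:{frame.lineno}")
```

The first entry should point at the `retained.append(...)` line, which plays the role of the largest "Size Delta" row in DevTools.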

Fix 3: Java Memory Leak with JProfiler / VisualVM

// LeakExample.java -- Accidental object retention
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LeakExample {
    // Static list acts as a global cache -- never cleared
    private static List<byte[]> cache = new ArrayList<>();

    public static void main(String[] args) {
        Executors.newScheduledThreadPool(1).scheduleAtFixedRate(() -> {
            // Each run adds 1MB to the cache
            cache.add(new byte[1024 * 1024]);
            long used = Runtime.getRuntime().totalMemory() -
                        Runtime.getRuntime().freeMemory();
            System.out.printf("Memory used: %d MB%n", used / 1024 / 1024);
        }, 0, 1, TimeUnit.SECONDS);
    }
}

# Compile, then run with JVM flags for heap dump on OOM
javac LeakExample.java
java -Xmx256m \
     -XX:+HeapDumpOnOutOfMemoryError \
     -XX:HeapDumpPath=/tmp/heapdump.hprof \
     LeakExample

# Analyze the dump with VisualVM or Eclipse MAT.
# jhat shipped only with JDK 8 and earlier (it was removed in JDK 9):
jhat /tmp/heapdump.hprof

# With jhat, open http://localhost:7000 to browse the heap

Expected output:

java.lang.OutOfMemoryError: Java heap space
Dumping heap to /tmp/heapdump.hprof ...
Heap dump file created [268435456 bytes in 0.456 secs]

Fix 4: RSS Monitoring and Heap Trend Analysis

# monitor_memory.py -- Track memory over time
import psutil
import time
import json
from datetime import datetime

def track_memory(pid=None, interval=5, duration=300):
    """Track RSS and virtual memory size (VMS) over time."""
    process = psutil.Process(pid) if pid else psutil.Process()
    start = time.time()
    data_points = []

    try:
        while time.time() - start < duration:
            mem_info = process.memory_info()
            cpu_percent = process.cpu_percent()
            data_points.append({
                "timestamp": datetime.now().isoformat(),
                "rss_mb": mem_info.rss / 1024 / 1024,
                "vms_mb": mem_info.vms / 1024 / 1024,
                "cpu_percent": cpu_percent,
                "elapsed_sec": round(time.time() - start, 1)
            })
            time.sleep(interval)
    except KeyboardInterrupt:
        pass

    # Identify growth rate (n samples span n - 1 intervals)
    rss_values = [d["rss_mb"] for d in data_points]
    if len(rss_values) > 1:
        growth = (rss_values[-1] - rss_values[0]) / (len(rss_values) - 1)
        print(f"Memory growth rate: {growth:.2f} MB per sample")
        print(f"Start RSS: {rss_values[0]:.1f} MB")
        print(f"End RSS: {rss_values[-1]:.1f} MB")

    return data_points

if __name__ == "__main__":
    data = track_memory(interval=2, duration=60)
    with open("memory-trend.json", "w") as f:
        json.dump(data, f, indent=2)

Example output (illustrative; values depend on the monitored process):

Memory growth rate: 2.35 MB per sample
Start RSS: 145.2 MB
End RSS: 215.8 MB
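A first-to-last difference is sensitive to a single noisy sample. One alternative, sketched below, is a least-squares slope over all data points. It assumes records shaped like the ones written to memory-trend.json (`elapsed_sec` and `rss_mb` keys); the function name `growth_mb_per_min` and the synthetic samples are made up for illustration.

```python
def growth_mb_per_min(samples):
    """Least-squares slope of rss_mb against elapsed_sec, expressed in MB per minute."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    xs = [s["elapsed_sec"] for s in samples]
    ys = [s["rss_mb"] for s in samples]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all samples share the same timestamp")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx * 60


# Synthetic trend: 150 MB baseline growing by 0.05 MB per second
samples = [{"elapsed_sec": t, "rss_mb": 150 + 0.05 * t} for t in range(0, 60, 2)]
print(round(growth_mb_per_min(samples), 2))  # 3.0
```

Reporting the rate per minute rather than per sample keeps the number comparable when the sampling interval changes.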

Fix 5: Fixing Event Listener Leaks

// bad.js -- Event listeners accumulate
class ButtonHandler {
  constructor(button) {
    this.clickCount = 0;
    // Each constructor call adds a new listener to the button
    button.addEventListener('click', () => {
      this.clickCount++;
      console.log(`Clicked ${this.clickCount} times`);
    });
  }
}

// Every time this runs, a new listener is added without removing the old one
function attachHandler() {
  const btn = document.getElementById('myButton');
  new ButtonHandler(btn); // Listener added but never removable
}

// fixed.js -- Proper listener cleanup
class ButtonHandler {
  constructor(button) {
    this.clickCount = 0;
    this.handleClick = this.handleClick.bind(this);
    button.addEventListener('click', this.handleClick);
    // Store the element reference for cleanup
    this.button = button;
  }

  handleClick() {
    this.clickCount++;
    console.log(`Clicked ${this.clickCount} times`);
  }

  destroy() {
    this.button.removeEventListener('click', this.handleClick);
    this.button = null;
  }
}

// Use with proper lifecycle management
const handler = new ButtonHandler(document.getElementById('myButton'));
// Later, when cleaning up:
handler.destroy();

Expected output:

Clicked 1 times
Clicked 2 times
(no duplicate listeners accumulating)
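The same register-then-unregister discipline applies outside the browser. Below is a rough Python analogue, not a DOM API: `Emitter` and `ClickCounter` are hypothetical classes, and `on` returns an unsubscribe callable so the owner of a handler can always remove it.

```python
class Emitter:
    """Tiny in-process event emitter (illustrative only)."""

    def __init__(self):
        self._listeners = {}

    def on(self, event, callback):
        self._listeners.setdefault(event, []).append(callback)

        def off():
            callbacks = self._listeners.get(event, [])
            if callback in callbacks:
                callbacks.remove(callback)

        return off  # the caller keeps this handle for cleanup

    def emit(self, event, *args):
        for callback in list(self._listeners.get(event, [])):
            callback(*args)

    def listener_count(self, event):
        return len(self._listeners.get(event, []))


class ClickCounter:
    def __init__(self, emitter):
        self.count = 0
        self._off = emitter.on("click", self._handle)

    def _handle(self):
        self.count += 1

    def destroy(self):
        self._off()  # without this, the emitter keeps the counter alive


emitter = Emitter()
counter = ClickCounter(emitter)
emitter.emit("click")
counter.destroy()
emitter.emit("click")  # no longer delivered
print(counter.count, emitter.listener_count("click"))  # 1 0
```

Returning the unsubscribe handle from `on` avoids the mistake in bad.js, where the anonymous callback could never be passed to a removal call.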

Memory Leak Detection Flowchart

flowchart TD
    A[Suspect Memory Leak] --> B{Check RSS growth}
    B -->|Increases over time| C[Take pair of heap snapshots]
    C --> D[Compare snapshots for growing objects]
    D --> E{Objects growing?}
    E -->|Yes| F[Trace GC roots and referrers]
    F --> G[Identify what retains the objects]
    G --> H[Fix: clear caches, remove listeners, null refs]
    E -->|No| I{Is native memory growing?}
    I -->|Yes| J[Check with valgrind or similar]
    J --> K[Update native library or fix native code]
    I -->|No| L[Monitor again under different load]
    H --> M[Memory Stabilized]
    K --> M
    L --> B

Prevention Tips

  • Set memory limits on all application containers using --memory (Docker) or resources.limits.memory (Kubernetes)
  • Implement cache eviction with LRU or TTL strategies instead of unbounded caches
  • Use weak references (WeakRef in JS, WeakValueDictionary in Python, WeakReference in Java) for caches
  • Run memory profiling as part of your CI pipeline with tools like memray, clinic.js, or JMH with its GC profiler
  • Monitor RSS and heap usage in production with Prometheus and set alerts on growth trends
  • Always pair addEventListener with removeEventListener in JavaScript, especially in SPA frameworks
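As one way to apply the cache-eviction tip, the sketch below combines an LRU bound with a TTL using only the standard library. The class name `BoundedCache`, its parameters, and the injectable `clock` are assumptions made for illustration, not part of any particular library.

```python
import time
from collections import OrderedDict


class BoundedCache:
    """LRU cache with a per-entry time-to-live (illustrative sketch)."""

    def __init__(self, max_entries=128, ttl_seconds=60.0, clock=time.monotonic):
        self._data = OrderedDict()   # key -> (expires_at, value), oldest first
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]      # expired: drop it on access
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max_entries:
            self._data.popitem(last=False)  # evict the least recently used entry

    def __len__(self):
        return len(self._data)


cache = BoundedCache(max_entries=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # touch "a" so "b" becomes the eviction candidate
cache.put("c", 3)
print(cache.get("b"), cache.get("a"), len(cache))  # None 1 2
```

Expired entries are only removed when they are read or pushed out by newer ones, which keeps the sketch short. A production cache might also sweep them periodically.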

Practice Questions

  1. What is the difference between a heap snapshot and a memory timeline? Answer: A heap snapshot is a point-in-time picture of all objects in memory, their sizes, and references to them. A memory timeline tracks memory usage over time, showing when allocations and GC events occur. Use snapshots for detailed object analysis and timelines for observing growth patterns.

  2. How do you find what is retaining a specific object in a heap snapshot? Answer: In Chrome DevTools Memory panel, select the object and view the "Retainers" section -- it shows the chain of references from the GC root to the selected object. In Python, use gc.get_referrers(obj) to find what holds a reference. In Java, use Eclipse MAT or VisualVM to view the "incoming references" of an object (jhat offers a similar view on JDK 8 and earlier, but it was removed in JDK 9).

  3. What is the difference between a memory leak and memory bloat? Answer: A memory leak is memory that is allocated but never freed because it is still referenced (reachable) but no longer needed -- it accumulates indefinitely. Memory bloat is using more memory than necessary for current operations (e.g., oversized buffers, inefficient data structures) but can still be garbage collected. Bloat is usually fixed by optimization; leaks require reference analysis.

  4. Challenge: Write a Python decorator that tracks the memory usage of a function, prints the peak memory, and identifies any objects that were created but not freed after the function returns. Answer:

    import gc
    import tracemalloc
    from functools import wraps

    def track_memory(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            gc.collect()
            tracemalloc.start()
            before = tracemalloc.take_snapshot()
            result = func(*args, **kwargs)
            gc.collect()  # drop garbage so only retained allocations remain
            after = tracemalloc.take_snapshot()
            peak = tracemalloc.get_traced_memory()[1]
            tracemalloc.stop()
            diff = after.compare_to(before, "lineno")
            print(f"{func.__name__}: peak {peak / 1024:.1f} KB")
            for stat in diff[:3]:
                # StatisticDiff reports bytes; size_diff is the net change per line
                print(f"  {stat.size_diff / 1024:+.1f} KB at {stat.traceback.format()[0]}")
            return result
        return wrapper
    

Quick Reference

Pattern | Detection Tool | Fix
Closure/retention | Heap snapshot comparison | Null references after use
Event listener leak | Retained size analysis | Always call removeEventListener
Global cache | Memory timeline | Add LRU eviction or TTL
Detached DOM | Heap snapshot Detached filter | Store references in WeakRef
Native memory | RSS + valgrind | Fix native library or update drivers

FAQ

How do you distinguish a memory leak from normal GC pressure?

Normal GC pressure shows a sawtooth pattern in memory usage -- allocation increases until a GC cycle drops it back to a baseline. A memory leak shows a steady upward trend where each GC cycle drops to a higher baseline than the previous one. Use the Chrome DevTools Memory or Performance panel (or your runtime's GC logs) to observe the pattern over several minutes.
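The rising-baseline test can be approximated in code. The sketch below treats local minima in a series of heap samples as post-GC baselines and flags a leak when each baseline is higher than the one before. The function names, the minimum of three troughs, and the sample data are all assumptions; real data is noisier and usually needs a tolerance or a fitted trend.

```python
def gc_baselines(samples):
    """Local minima of the series, used as a stand-in for post-GC baselines."""
    return [
        samples[i]
        for i in range(1, len(samples) - 1)
        if samples[i] < samples[i - 1] and samples[i] <= samples[i + 1]
    ]


def looks_like_leak(samples, min_troughs=3):
    """True when every post-GC baseline is higher than the previous one."""
    troughs = gc_baselines(samples)
    if len(troughs) < min_troughs:
        return False  # not enough GC cycles observed to judge
    return all(later > earlier for earlier, later in zip(troughs, troughs[1:]))


# Heap usage in MB, sampled at a fixed interval (synthetic data)
healthy = [100, 140, 180, 100, 140, 180, 101, 140, 180, 100, 140]
leaking = [100, 140, 180, 110, 150, 190, 120, 160, 200, 130, 170]
print(looks_like_leak(healthy), looks_like_leak(leaking))  # False True
```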

What is the most reliable way to detect memory leaks in production?

Use a combination of: (1) RSS monitoring with alerts on growth trends over a 24-hour window, (2) periodic heap dump collection (every 6-12 hours) compared to find growing object graphs, and (3) integration of a leak detection tool like memray (Python), clinic.js (Node.js), or Eclipse MAT (Java) into your CI pipeline.

Why does my application use more memory over time even without a traditional leak?

This can be caused by JIT compilation caches (Java/.NET), V8 code caching (Node.js), module import caches (Python), or database connection pools that grow under load. Monitor the heap and identify which object types are growing. If the growth plateaus after warmup, it is likely JIT/metadata caching -- if it grows linearly without bound, it is a leak.

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
