Memory Leak Detection & Fixing -- Heap Analysis, Object Retention & Garbage Collection
Memory leaks cause applications to slow down and crash after running for hours or days -- this guide teaches you how to detect them using heap analysis, identify what is retaining objects, and fix the root cause in Python, Node.js, and JVM-based applications.
What You'll Learn
Why It Matters
A leak that grows by 1 MB per minute exhausts a 2 GB server in roughly 33 hours. Most teams only notice when the pager goes off at 3 AM. Learning to detect leaks before they reach production, and to fix them quickly when they do, is a critical skill for any developer.
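The arithmetic behind that estimate can be sketched in a few lines. The function name and the idea of subtracting a baseline are illustrative assumptions, and the figure of 33 hours treats 2 GB as 2000 MB with nothing else resident.

```python
def hours_until_exhaustion(capacity_mb: float, baseline_mb: float, leak_mb_per_min: float) -> float:
    """Hours until a steady leak consumes the remaining headroom."""
    if leak_mb_per_min <= 0:
        raise ValueError("leak rate must be positive")
    headroom_mb = capacity_mb - baseline_mb
    return headroom_mb / leak_mb_per_min / 60


# 2 GB treated as 2000 MB, empty at start (both simplifying assumptions)
print(f"{hours_until_exhaustion(2000, 0, 1.0):.1f} hours")    # 33.3 hours
# The same leak on a server that already uses 500 MB
print(f"{hours_until_exhaustion(2000, 500, 1.0):.1f} hours")  # 25.0 hours
```

A realistic baseline shortens the window noticeably, which is why alerting on the growth trend is more useful than alerting on an absolute threshold.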
Real-World Use
When your Python data processing script consumes all available RAM after running for 6 hours, a Node.js API server's RSS grows from 200MB to 2GB over a week, or a Java microservice gets OOM-killed by Kubernetes every 3 days, these detection and fixing techniques apply.
Common Memory Leak Patterns Table
| Pattern | Language | Cause | Detection Method |
|---|---|---|---|
| Closure retention | Python, JS, Java | Function closure retains large variable | Heap snapshot comparison |
| Event listener leak | JS, Java | Listeners registered but never removed | Retained size analysis |
| Global cache growth | All | Unbounded cache without eviction | Memory timeline monitoring |
| Thread-local storage | Java | ThreadLocal values accumulating on long-lived pool threads | Heap dump analysis (ThreadLocalMap entries) |
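| Circular references | Python | Cycles are freed only by the cyclic collector, so they linger if gc is disabled or runs rarely (tracing GCs in JS engines and the JVM reclaim unreachable cycles on their own) | gc.collect() return value, gc.DEBUG_SAVEALL |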
| Native memory leak | All | C/C++ extension or JNI not freeing memory | RSS monitoring, valgrind |
| Detached DOM nodes | JavaScript | DOM elements removed but still referenced | Heap snapshot with Detached filter |
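The circular-reference row can be probed directly in CPython. The sketch below uses a hypothetical `Node` class and disables the automatic collector for the duration of the demo so the timing is deterministic. It shows that a cycle survives the end of the function that created it and disappears once the cyclic collector runs.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at a partner, forming a cycle."""

    def __init__(self, name):
        self.name = name
        self.partner = None


def make_cycle():
    a, b = Node("a"), Node("b")
    a.partner, b.partner = b, a      # a -> b -> a
    return weakref.ref(a)            # weak probe: does not keep `a` alive


gc.disable()                         # keep the automatic collector out of the demo
gc.collect()                         # start from a clean slate
probe = make_cycle()
alive_before = probe() is not None   # reference counts never reach zero
found = gc.collect()                 # the cyclic collector finds the unreachable pair
alive_after = probe() is not None
gc.enable()

print("alive after function returned:", alive_before)
print("unreachable objects found:", found)
print("alive after gc.collect():", alive_after)
```

If the second line reports a large count in a real application, `gc.set_debug(gc.DEBUG_SAVEALL)` keeps the collected objects in `gc.garbage` for inspection.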
Step-by-Step Fixes
Fix 1: Python Memory Leak Detection
# leaky.py -- Global list that grows forever
import gc
import time
import tracemalloc

processed = []  # Global cache -- never cleared

def process_item(item):
    # Simulates a leak: every result stays referenced by `processed`
    result = {
        "id": item,
        "data": "x" * 100_000,  # ~100 KB per item
        "metadata": {"timestamp": time.time(), "status": "processed"},
    }
    processed.append(result)
    return result

def get_memory_usage():
    # take_snapshot() raises RuntimeError unless tracemalloc.start() ran first
    snapshot = tracemalloc.take_snapshot()
    top_stats = snapshot.statistics("lineno")
    for stat in top_stats[:5]:
        frame = stat.traceback[0]
        print(f"{stat.count:>6} blocks {stat.size / 1024:>8.1f} KB {frame.filename}:{frame.lineno}")

# Check for objects keeping references
def find_leaks():
    gc.collect()
    for obj in gc.get_objects():
        if isinstance(obj, list) and len(obj) > 1000:
            # Print referrer types only; printing the referrers themselves can dump whole modules
            referrers = [type(r).__name__ for r in gc.get_referrers(obj)[:3]]
            print(f"Large list ({len(obj)} items) held by {referrers}")

if __name__ == "__main__":
    tracemalloc.start()
    for i in range(500):
        process_item(i)
    get_memory_usage()
    find_leaks()
Expected output (illustrative; exact counts, sizes, and line numbers depend on the run):
500 blocks 50000.0 KB /app/leaky.py:10
100 blocks 10000.0 KB /app/leaky.py:15
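A single snapshot shows where memory is, not where it is growing. A minimal sketch of the two-snapshot approach with `tracemalloc` follows. `handle_request`, `retained`, and `top_growth` are hypothetical names for illustration, and the leak is simulated.

```python
import tracemalloc

retained = []  # hypothetical leak: module-level list that is never cleared


def handle_request(i):
    retained.append(bytearray(10_000))  # ~10 KB kept alive per call


def top_growth(before, after, limit=3):
    """Return (location, size_diff_bytes, count_diff) for the fastest-growing lines."""
    rows = []
    for stat in after.compare_to(before, "lineno")[:limit]:
        frame = stat.traceback[0]
        rows.append((f"{frame.filename}:{frame.lineno}", stat.size_diff, stat.count_diff))
    return rows


tracemalloc.start()
before = tracemalloc.take_snapshot()
for i in range(200):
    handle_request(i)
after = tracemalloc.take_snapshot()
growth = top_growth(before, after)
tracemalloc.stop()

for location, size_diff, count_diff in growth:
    print(f"{size_diff / 1024:+9.1f} KB {count_diff:+5d} blocks  {location}")
```

`compare_to` sorts by the absolute size difference, so the first row points at the line whose allocations grew the most between the two snapshots.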
Fix 2: Node.js Heap Snapshot Comparison
// snapshot-helper.js
const v8 = require('v8');
const fs = require('fs');

function takeHeapSnapshot(label) {
  const filename = `/tmp/heap-${label}-${Date.now()}.heapsnapshot`;
  // getHeapSnapshot() returns a readable stream of the snapshot JSON
  const snapshot = v8.getHeapSnapshot();
  const stream = fs.createWriteStream(filename);
  snapshot.pipe(stream);
  stream.on('finish', () => {
    console.log(`Heap snapshot saved: ${filename}`);
    console.log(`Heap used: ${(process.memoryUsage().heapUsed / 1024 / 1024).toFixed(1)}MB`);
  });
  return filename;
}

// Take snapshot every 10 minutes to observe growth
// (do not clear caches here; that would hide the growth you are trying to measure)
setInterval(() => {
  takeHeapSnapshot('periodic');
}, 10 * 60 * 1000).unref();

// Expose snapshot trigger via HTTP (keep this endpoint internal-only)
require('http').createServer((req, res) => {
  if (req.url === '/snapshot') {
    takeHeapSnapshot('manual');
    res.end('Snapshot taken');
  } else {
    res.end('OK');
  }
}).listen(3000);
# Compare snapshots in Chrome DevTools
# 1. Take snapshot A
# 2. Perform the operations you suspect of leaking
# 3. Take snapshot B
# 4. Open DevTools -> Memory -> Load both .heapsnapshot files
# 5. Select snapshot B, switch to the "Comparison" view against snapshot A, and sort by "Size Delta"
Expected output:
Heap snapshot saved: /tmp/heap-periodic-1719123456789.heapsnapshot
Heap used: 245.3MB
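When DevTools is not at hand, two saved snapshots can be compared roughly from a script. The sketch below is in Python and rests on an assumption: that the `.heapsnapshot` file is JSON with `snapshot.meta.node_fields`, `snapshot.meta.node_types`, and a flat `nodes` array, which is how V8 has laid it out but is not a documented, stable contract. The helper names are hypothetical, and the demo runs on tiny synthetic data instead of a real file.

```python
import json
from collections import Counter


def load_snapshot(path):
    """Parse a .heapsnapshot file (large files need a lot of memory to load)."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def self_size_by_type(snapshot):
    """Sum node self sizes per node type (assumed V8 snapshot layout)."""
    meta = snapshot["snapshot"]["meta"]
    fields = meta["node_fields"]
    type_at = fields.index("type")
    size_at = fields.index("self_size")
    type_names = meta["node_types"][type_at]   # enum of type names for the "type" field
    nodes = snapshot["nodes"]                  # flat array, len(fields) ints per node
    totals = Counter()
    for start in range(0, len(nodes), len(fields)):
        totals[type_names[nodes[start + type_at]]] += nodes[start + size_at]
    return totals


def growth_by_type(before, after):
    """Per-type self-size delta between two snapshots, largest growth first."""
    a, b = self_size_by_type(before), self_size_by_type(after)
    deltas = {t: b[t] - a[t] for t in set(a) | set(b)}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)


def fake_snapshot(sizes):
    """Build a tiny synthetic snapshot from (type_index, self_size) pairs."""
    fields = ["type", "name", "id", "self_size", "edge_count"]
    nodes = []
    for i, (type_index, size) in enumerate(sizes):
        nodes += [type_index, 0, i, size, 0]
    types = [["hidden", "array", "string", "object"], "string", "number", "number", "number"]
    return {"snapshot": {"meta": {"node_fields": fields, "node_types": types}},
            "nodes": nodes, "strings": [""]}


snap_a = fake_snapshot([(2, 100), (3, 40)])
snap_b = fake_snapshot([(2, 100), (2, 900), (3, 40)])
deltas = growth_by_type(snap_a, snap_b)
print(deltas[0])  # ('string', 900)
```

This only aggregates self sizes per type. Retainer chains still need DevTools or a similar viewer.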
Fix 3: Java Memory Leak with JProfiler / VisualVM
// LeakExample.java -- Accidental object retention
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LeakExample {
    // Static list acts as a global cache -- never cleared
    private static final List<byte[]> cache = new ArrayList<>();

    public static void main(String[] args) {
        Executors.newScheduledThreadPool(1).scheduleAtFixedRate(() -> {
            // Each run adds 1MB to the cache
            cache.add(new byte[1024 * 1024]);
            long used = Runtime.getRuntime().totalMemory() -
                        Runtime.getRuntime().freeMemory();
            System.out.printf("Memory used: %d MB%n", used / 1024 / 1024);
        }, 0, 1, TimeUnit.SECONDS);
    }
}
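# Compile, then run with JVM flags for heap dump on OOM
javac LeakExample.java
java -Xmx256m \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:HeapDumpPath=/tmp/heapdump.hprof \
  LeakExample
# Analyze: open /tmp/heapdump.hprof in VisualVM or Eclipse MAT
# jhat was removed in JDK 9; on JDK 8 and earlier you can still run:
#   jhat /tmp/heapdump.hprof
#   then open http://localhost:7000 to browse the heap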
Expected output:
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /tmp/heapdump.hprof ...
Heap dump file created [268435456 bytes in 0.456 secs]
Fix 4: RSS Monitoring and Heap Trend Analysis
# monitor_memory.py -- Track memory over time
import json
import time
from datetime import datetime

import psutil

def track_memory(pid=None, interval=5, duration=300):
    """Track RSS and virtual memory size of a process over time."""
    process = psutil.Process(pid) if pid is not None else psutil.Process()
    start = time.time()
    data_points = []
    try:
        while time.time() - start < duration:
            mem_info = process.memory_info()
            cpu_percent = process.cpu_percent()
            data_points.append({
                "timestamp": datetime.now().isoformat(),
                "rss_mb": mem_info.rss / 1024 / 1024,
                "vms_mb": mem_info.vms / 1024 / 1024,
                "cpu_percent": cpu_percent,
                "elapsed_sec": round(time.time() - start, 1),
            })
            time.sleep(interval)
    except KeyboardInterrupt:
        pass
    # Identify growth rate: total change divided by the number of intervals between samples
    rss_values = [d["rss_mb"] for d in data_points]
    if len(rss_values) > 1:
        growth = (rss_values[-1] - rss_values[0]) / (len(rss_values) - 1)
        print(f"Memory growth rate: {growth:.2f} MB per sample")
        print(f"Start RSS: {rss_values[0]:.1f} MB")
        print(f"End RSS: {rss_values[-1]:.1f} MB")
    return data_points

if __name__ == "__main__":
    data = track_memory(interval=2, duration=60)
    with open("memory-trend.json", "w") as f:
        json.dump(data, f, indent=2)
Expected output (illustrative values for a leaking process sampled 30 times):
Memory growth rate: 2.43 MB per sample
Start RSS: 145.2 MB
End RSS: 215.8 MB
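Comparing only the first and last sample is sensitive to a single noisy reading. A least-squares fit over all samples gives a steadier trend. The sketch below assumes the list-of-dicts shape that `track_memory()` returns (`elapsed_sec` and `rss_mb` keys); the function name and the synthetic data are illustrative.

```python
def rss_slope_mb_per_min(samples):
    """Least-squares slope of RSS (MB) against elapsed time, in MB per minute."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    xs = [s["elapsed_sec"] / 60 for s in samples]
    ys = [s["rss_mb"] for s in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("samples must span more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic data: 100 MB baseline growing 1.5 MB per minute, sampled every 30 s
demo = [{"elapsed_sec": t, "rss_mb": 100 + 1.5 * t / 60} for t in range(0, 600, 30)]
print(f"{rss_slope_mb_per_min(demo):.2f} MB/min")  # 1.50 MB/min
```

Expressing the result per minute instead of per sample keeps it comparable across runs that use different sampling intervals.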
Fix 5: Fixing Event Listener Leaks
// bad.js -- Event listeners accumulate
class ButtonHandler {
  constructor(button) {
    this.clickCount = 0;
    // Each constructor call adds a new listener to the button
    button.addEventListener('click', () => {
      this.clickCount++;
      console.log(`Clicked ${this.clickCount} times`);
    });
  }
}

// Every time this runs, a new listener is added without removing the old one
function attachHandler() {
  const btn = document.getElementById('myButton');
  new ButtonHandler(btn); // Listener added but never removable
}
// fixed.js -- Proper listener cleanup
class ButtonHandler {
  constructor(button) {
    this.clickCount = 0;
    this.handleClick = this.handleClick.bind(this);
    button.addEventListener('click', this.handleClick);
    // Store the element reference for cleanup
    this.button = button;
  }

  handleClick() {
    this.clickCount++;
    console.log(`Clicked ${this.clickCount} times`);
  }

  destroy() {
    this.button.removeEventListener('click', this.handleClick);
    this.button = null;
  }
}

// Use with proper lifecycle management
const handler = new ButtonHandler(document.getElementById('myButton'));
// Later, when cleaning up:
handler.destroy();
Expected output:
Clicked 1 times
Clicked 2 times
(no duplicate listeners accumulating)
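The same retention mechanism exists outside the browser. Here is a Python analogue, offered as a sketch: `EventEmitter` and `Handler` are hypothetical classes, and the emitter holds callbacks strongly in the same way a DOM element holds its listeners. A weak reference is used as a probe to show when the handler is actually freed.

```python
import gc
import weakref


class EventEmitter:
    """Hypothetical emitter that keeps strong references to its listeners."""

    def __init__(self):
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def remove_listener(self, callback):
        self._listeners.remove(callback)

    def listener_count(self):
        return len(self._listeners)


class Handler:
    def __init__(self, emitter):
        self.emitter = emitter
        emitter.add_listener(self.on_event)  # bound method keeps this Handler alive

    def on_event(self, payload):
        pass

    def destroy(self):
        self.emitter.remove_listener(self.on_event)
        self.emitter = None


emitter = EventEmitter()

# Leaky use: no variable holds the Handler, yet the emitter's list still does
leaked = weakref.ref(Handler(emitter))
gc.collect()
print("leaked handler still alive:", leaked() is not None)

# Fixed use: unregister before dropping the last reference
handler = Handler(emitter)
probe = weakref.ref(handler)
handler.destroy()
del handler
gc.collect()
print("destroyed handler freed:", probe() is None)
print("listeners remaining:", emitter.listener_count())
```

The one listener that remains is the leaked handler from the first half of the demo, which is exactly what accumulates when registration has no matching cleanup.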
Memory Leak Detection Flowchart
flowchart TD
A[Suspect Memory Leak] --> B{Check RSS growth}
B -->|Increases over time| C[Take pair of heap snapshots]
C --> D[Compare snapshots for growing objects]
D --> E{Objects growing?}
E -->|Yes| F[Trace GC roots and referrers]
F --> G[Identify what retains the objects]
G --> H[Fix: clear caches, remove listeners, null refs]
E -->|No| I{Is native memory growing?}
I -->|Yes| J[Check with valgrind or similar]
J --> K[Update native library or fix native code]
I -->|No| L[Monitor again under different load]
H --> M[Memory Stabilized]
K --> M
L --> B
Prevention Tips
- Set memory limits on all application containers using --memory (Docker) or resources.limits.memory (Kubernetes)
- Implement cache eviction with LRU or TTL strategies instead of unbounded caches
- Use weak references (WeakRef in JS, WeakValueDictionary in Python, WeakReference in Java) for caches
- Run memory profiling as part of your CI pipeline with tools like memray, clinic.js, or JMH with its GC profiler
- Monitor RSS and heap usage in production with Prometheus and set alerts on growth trends
- Always pair addEventListener with removeEventListener in JavaScript, especially in SPA frameworks
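The eviction tip can be sketched as a small cache that enforces both a size cap (LRU) and a per-entry TTL. `BoundedCache` is a hypothetical class written for illustration, it is not thread-safe, and for pure function memoization the standard library's `functools.lru_cache(maxsize=...)` may already be enough.

```python
import time
from collections import OrderedDict


class BoundedCache:
    """LRU cache with a size cap and per-entry TTL (illustrative, not thread-safe)."""

    def __init__(self, max_entries=1000, ttl_seconds=300.0, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._data = OrderedDict()  # key -> (expires_at, value), oldest first

    def get(self, key, default=None):
        item = self._data.get(key)
        if item is None:
            return default
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._data[key]          # expired: drop it on access
            return default
        self._data.move_to_end(key)      # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self._clock() + self.ttl_seconds, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict the least recently used entry

    def __len__(self):
        return len(self._data)


cache = BoundedCache(max_entries=3, ttl_seconds=60)
for i in range(10):
    cache.put(i, "x" * 100)

print(len(cache))                 # 3
print(cache.get(0))               # None (evicted)
print(cache.get(9) is not None)   # True
```

The injectable `clock` parameter is there only to make expiry testable without sleeping.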
Practice Questions
What is the difference between a heap snapshot and a memory timeline?
Answer: A heap snapshot is a point-in-time picture of all objects in memory, their sizes, and references to them. A memory timeline tracks memory usage over time, showing when allocations and GC events occur. Use snapshots for detailed object analysis and timelines for observing growth patterns.

How do you find what is retaining a specific object in a heap snapshot?
Answer: In Chrome DevTools Memory panel, select the object and view the "Retainers" section -- it shows the chain of references from the GC root to the selected object. In Python, use gc.get_referrers(obj) to find what holds a reference. In Java, use Eclipse MAT or VisualVM to view the incoming references of an object.

What is the difference between a memory leak and memory bloat?
Answer: A memory leak is memory that is allocated but never freed because it is still referenced (reachable) but no longer needed -- it accumulates indefinitely. Memory bloat is using more memory than necessary for current operations (e.g., oversized buffers, inefficient data structures) but can still be garbage collected. Bloat is usually fixed by optimization; leaks require reference analysis.

Challenge: Write a Python decorator that tracks the memory usage of a function, prints the peak memory, and identifies any objects that were created but not freed after the function returns.
Answer:
import gc
import tracemalloc
from functools import wraps

def track_memory(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        gc.collect()
        tracemalloc.start()
        before = tracemalloc.take_snapshot()
        result = func(*args, **kwargs)
        gc.collect()  # drop garbage so only retained allocations show up
        after = tracemalloc.take_snapshot()
        diff = after.compare_to(before, "lineno")
        peak = tracemalloc.get_traced_memory()[1]
        tracemalloc.stop()
        print(f"{func.__name__}: peak {peak / 1024:.1f} KB")
        # Top allocation sites whose memory is still held after the call
        for stat in diff[:3]:
            frame = stat.traceback[0]
            print(f"  {stat.size_diff / 1024:+.1f} KB at {frame.filename}:{frame.lineno}")
        return result
    return wrapper
Quick Reference
| Pattern | Detection Tool | Fix |
|---|---|---|
| Closure/retention | Heap snapshot comparison | Null references after use |
| Event listener leak | Retained size analysis | Always call removeEventListener |
| Global cache | Memory timeline | Add LRU eviction or TTL |
| Detached DOM | Heap snapshot Detached filter | Drop references to removed nodes, or hold them through WeakRef |
| Native memory | RSS + valgrind | Fix native library or update drivers |
FAQ
How do you distinguish a memory leak from normal GC pressure?
Normal GC pressure shows a sawtooth pattern in memory usage: allocation increases until a GC cycle drops it back to a baseline. A memory leak shows a steady upward trend where each GC cycle drops to a higher baseline than the previous one. To observe the pattern over several minutes, use the Memory or Performance panel in Chrome DevTools (for Node.js, attach DevTools with node --inspect), or a heap graph such as the VisualVM monitor view for the JVM.
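The rising-baseline test can also be automated on a series of heap readings. The sketch below treats local minima as post-GC troughs and checks whether each trough sits above the previous one. The function names, the tolerance, and the sample series are illustrative assumptions, and real data is noisier than this.

```python
def post_gc_baselines(heap_mb):
    """Return the local minima of a heap series (the troughs right after a GC cycle)."""
    return [
        heap_mb[i]
        for i in range(1, len(heap_mb) - 1)
        if heap_mb[i] < heap_mb[i - 1] and heap_mb[i] <= heap_mb[i + 1]
    ]


def baseline_is_rising(heap_mb, tolerance_mb=1.0):
    """True when every trough is more than tolerance_mb above the previous trough."""
    troughs = post_gc_baselines(heap_mb)
    if len(troughs) < 3:
        return False  # too few GC cycles to judge
    return all(b - a > tolerance_mb for a, b in zip(troughs, troughs[1:]))


# Sawtooth that returns to ~50 MB after each collection
healthy = [50, 80, 110, 52, 82, 112, 51, 81, 111, 50, 80]
# Sawtooth whose floor climbs 10 MB per cycle
leaking = [50, 80, 110, 60, 90, 120, 70, 100, 130, 80, 110]

print(post_gc_baselines(healthy), baseline_is_rising(healthy))  # [52, 51, 50] False
print(post_gc_baselines(leaking), baseline_is_rising(leaking))  # [60, 70, 80] True
```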
What is the most reliable way to detect memory leaks in production?
Use a combination of: (1) RSS monitoring with alerts on growth trends over a 24-hour window, (2) periodic heap dump collection (every 6-12 hours) compared to find growing object graphs, and (3) integration of a leak detection tool like memray (Python), clinic.js (Node.js), or Eclipse MAT (Java) into your CI pipeline.
Why does my application use more memory over time even without a traditional leak?
This can be caused by JIT compilation caches (Java/.NET), V8 code caching (Node.js), module import caches (Python), or database connection pools that grow under load. Monitor the heap and identify which object types are growing. If the growth plateaus after warmup, it is likely JIT/metadata caching -- if it grows linearly without bound, it is a leak.
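One rough way to tell warmup from unbounded growth is to compare how much memory grew in the first half of an observation window with how much it grew in the second half. The thresholds and the function name below are assumptions chosen for illustration and would need tuning against real workloads.

```python
def classify_growth(rss_mb, min_growth_mb=5.0, plateau_ratio=0.2):
    """Label a memory series as 'stable', 'plateau', or 'unbounded'."""
    if len(rss_mb) < 4:
        raise ValueError("need at least four samples")
    mid = len(rss_mb) // 2
    early = rss_mb[mid] - rss_mb[0]    # growth during the first half
    late = rss_mb[-1] - rss_mb[mid]    # growth during the second half
    if early + late < min_growth_mb:
        return "stable"
    if late < plateau_ratio * max(early, 0.0):
        return "plateau"               # growth has mostly stopped: likely warmup
    return "unbounded"                 # still growing: investigate as a leak


warmup = [100, 140, 170, 185, 192, 195, 196, 196, 196, 196]
linear = [100 + 10 * i for i in range(10)]
flat = [100.0] * 10

print(classify_growth(warmup))  # plateau
print(classify_growth(linear))  # unbounded
print(classify_growth(flat))    # stable
```

The window has to be long enough to cover warmup. A series that is cut off mid-warmup will look unbounded.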
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.