Grafana Tempo Ingester Out of Memory Fix
Your Grafana Tempo ingester runs out of memory and crashes: either the kernel OOM killer terminates the process, or the ingester refuses new traces with "ingester full" errors. The ingester holds traces in memory before flushing them to the backend, so heavy traffic can exhaust the available memory.
The Problem
level=error msg="ingester full, refusing traces" component=ingester
FATAL: OOM killer terminated tempo-ingester (exit code 137)
The ingester cannot flush blocks to the backend fast enough, so unflushed trace data accumulates in memory until the process is killed.
Step-by-Step Fix
1. Reduce max block duration
ingester:
  lifecycler: ...
  trace_idle_period: 5s        # Default: 10s; flush idle traces sooner
  max_block_duration: 30s      # Upstream default is 30m; cut and flush blocks frequently
  max_block_bytes: 500000000   # 500MB max block size
2. Limit ingester memory usage
ingester:
  max_block_duration: 1m
  max_block_bytes: 256000000   # 256MB per block
  concurrent_flushes: 16       # More concurrent flushes (default: 4)
  flush_check_period: 10s      # Check for flush every 10s
3. Configure overrides for tenant limits
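overrides:
  defaults:
    ingestion:
      rate_limit_bytes: 15000000   # ~15 MB/s per tenant; Tempo rate limits are measured in bytes, not spans
      burst_size_bytes: 30000000   # ~30 MB burst
      max_traces_per_user: 10000   # Max live traces in memory per tenant
    global:
      max_ingesters: 3             # Unverified key: check the overrides reference for your Tempo version before relying on it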
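The defaults above apply to every tenant. If a single noisy tenant causes most of the memory pressure, a tighter limit for that tenant alone may be enough. The following is a minimal sketch, assuming the nested overrides format used by recent Tempo releases. The tenant ID `team-checkout`, the file path, and the limit value are all hypothetical.

```yaml
# tempo.yaml: point Tempo at a per-tenant overrides file (path is an assumption)
overrides:
  per_tenant_override_config: /conf/overrides.yaml
---
# /conf/overrides.yaml: limits for one hypothetical tenant
overrides:
  "team-checkout":
    ingestion:
      max_traces_per_user: 5000   # illustrative value, tighter than the default above
```

Tenants without an entry in the overrides file keep the values under `defaults`.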
4. Scale ingesters horizontally
# Increase ingester replicas (Helm values, e.g. the tempo-distributed chart; this is not a tempo.yaml setting)
ingester:
  replicas: 5   # Distribute load across more ingesters
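Adding replicas spreads the load, but each pod still needs a memory limit that matches its share of traffic. Because Tempo is a Go program, the Go runtime's `GOMEMLIMIT` soft limit can make the garbage collector work harder before the kernel OOM killer steps in. The sketch below assumes the tempo-distributed Helm chart exposes `ingester.resources` and `ingester.extraEnv`; the sizes are placeholders, not recommendations.

```yaml
# Helm values sketch (tempo-distributed chart assumed; sizes are illustrative)
ingester:
  resources:
    requests:
      memory: 2Gi
    limits:
      memory: 3Gi           # the container is OOM-killed above this
  extraEnv:
    - name: GOMEMLIMIT      # Go runtime soft memory limit (Go 1.19+)
      value: "2750MiB"      # roughly 90% of the 3Gi container limit
```

Keeping `GOMEMLIMIT` a little below the container limit leaves headroom for memory the Go runtime does not account for.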
5. Monitor ingester memory
# Memory usage per ingester
sum(container_memory_usage_bytes{container="tempo-ingester"}) by (pod)
# Blocks waiting to be flushed
tempo_ingester_flush_queue_length
# Bytes received per second (5m average)
rate(tempo_ingester_bytes_received_total[5m])
Expected metrics after tuning:
Memory usage: 2GB → 800MB (ingester)
Blocks flushed per minute: 10 → 120
Ingester full errors: 0
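Queries like these are easier to act on when they drive alerts. The rule file below is a sketch in the standard Prometheus alerting-rule format. The alert names, thresholds, and `for` durations are hypothetical and should be tuned to your environment. The flush queue metric name is an assumption, so confirm it against your ingester's /metrics endpoint.

```yaml
groups:
  - name: tempo-ingester-memory
    rules:
      # Flush queue backing up: blocks are not reaching the backend fast enough
      - alert: TempoIngesterFlushQueueBacklog
        expr: tempo_ingester_flush_queue_length > 10   # threshold is illustrative
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Tempo ingester flush queue is backing up"
      # Memory approaching the container limit (1.5e9 bytes is a placeholder)
      - alert: TempoIngesterMemoryHigh
        expr: sum by (pod) (container_memory_usage_bytes{container="tempo-ingester"}) > 1.5e9
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Tempo ingester {{ $labels.pod }} memory usage is high"
```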
Prevention Tips
- Set max_block_duration to 30s-1m for high-throughput services
- Monitor tempo_ingester_flush_queue_length; it should stay near zero
- Use concurrent_flushes: 16 for faster backend writes
- Align max_block_bytes with available memory per ingester
- Scale ingesters as traffic grows