Grafana Tempo Span Metrics Not Generating Fix
Your Grafana Tempo span metrics show no data in Prometheus: traces_spanmetrics_latency has no values, or the metrics never appear in the target Prometheus instance. The usual causes are that the span metrics processor is not enabled or that the remote write configuration is incorrect.
The Problem
# tempo.yaml: span metrics processor not enabled
metrics_generator:
  registry:
    collection_interval: 15s

overrides:
  defaults:
    metrics_generator:
      # only service graphs are active; span-metrics is missing
      processors: [service-graphs]
The metrics generator runs but only processes service graphs. Span metrics are not generated because the span_metrics processor is not enabled.
Step-by-Step Fix
1. Enable span metrics processor
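metrics_generator:
  registry:
    collection_interval: 15s

# Processors are switched on through overrides, not an `enabled` flag.
# Newer Tempo releases use the layout below; older ones use the flat
# key overrides.metrics_generator_processors instead.
overrides:
  defaults:
    metrics_generator:
      processors: [service-graphs, span-metrics]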
2. Configure remote write
metrics_generator:
  processor:
    span_metrics:
      histogram_buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]
  registry:
    collection_interval: 15s
  storage:
    path: /var/tempo/generator/wal
    remote_write:
      # Prometheus only accepts pushes on this endpoint when started
      # with --web.enable-remote-write-receiver
      - url: http://prometheus:9090/api/v1/write
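Once remote write is configured, one way to confirm that samples are arriving is to ask Prometheus for span metrics series through its HTTP query API. The sketch below assumes Prometheus is reachable at the host used in the config above and that the metric is named traces_spanmetrics_calls_total; the helper names are hypothetical and the metric name can be overridden.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, expr: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": expr})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def series_count(payload: dict) -> int:
    """Return how many series a successful instant query returned."""
    if payload.get("status") != "success":
        return 0
    return len(payload.get("data", {}).get("result", []))


def check_span_metrics(base_url: str,
                       metric: str = "traces_spanmetrics_calls_total") -> int:
    """Query Prometheus and return the number of series found for `metric`."""
    url = build_query_url(base_url, metric)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return series_count(json.load(resp))


# Example call (needs a reachable Prometheus, so it is left commented out):
# print(check_span_metrics("http://prometheus:9090"))
```

A count of 0 suggests that either the processor is not active or remote write is failing; the Tempo metrics-generator logs are the next place to look.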
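3. Tune latency buckets and add dimensions
metrics_generator:
  processor:
    span_metrics:
      # bucket bounds are in seconds and apply to every service
      histogram_buckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1]
      # span attributes exposed as extra labels (http.method becomes http_method)
      dimensions:
        - http.method
        - http.status_code
        - grpc.method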
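Prometheus histograms are cumulative: each le bucket counts every observation at or below its bound. The sketch below uses made-up span durations in seconds to show how a set of bounds turns into bucket counts, which can help when choosing bounds around an SLO threshold. It illustrates the histogram model only and is not Tempo's implementation; the function name and sample data are hypothetical.

```python
from bisect import bisect_left


def cumulative_buckets(durations, bounds):
    """Count observations per cumulative `le` bucket, plus the +Inf bucket."""
    bounds = sorted(bounds)
    counts = [0] * (len(bounds) + 1)  # last slot is the +Inf bucket
    for d in durations:
        # index of the first bound >= d, i.e. the smallest bucket holding d
        counts[bisect_left(bounds, d)] += 1
    # turn per-bucket counts into cumulative counts
    total = 0
    result = {}
    for bound, count in zip(bounds + [float("inf")], counts):
        total += count
        result[bound] = total
    return result


spans = [0.004, 0.02, 0.05, 0.3, 2.0]  # hypothetical durations in seconds
bounds = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1]
print(cumulative_buckets(spans, bounds))
```

In this sample the 2.0 s span lands only in the +Inf bucket, so a top bound of 1 s says nothing about how slow the slowest spans are. Extending the bounds past the latencies you care about avoids that blind spot.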
4. Query span metrics in Prometheus
# Latency histogram (p99)
histogram_quantile(0.99,
  sum(rate(traces_spanmetrics_latency_bucket[5m])) by (le, service)
)
# Error rate
sum(rate(traces_spanmetrics_calls_total{status_code="STATUS_CODE_ERROR"}[5m])) by (service)
# Request rate
sum(rate(traces_spanmetrics_calls_total[5m])) by (service)
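The error-rate query above yields errors per second, which is hard to compare across services with different traffic. Dividing it by the request rate gives an error ratio. The sketch below performs that division on already-fetched results; the input dictionaries (service name to per-second rate) and the function name are hypothetical stand-ins for whatever your client returns.

```python
def error_ratio(error_rates: dict, request_rates: dict) -> dict:
    """Return errors/requests per service, skipping services with no traffic."""
    ratios = {}
    for service, total in request_rates.items():
        if total <= 0:
            continue  # avoid division by zero for idle services
        # a service absent from error_rates had no error spans in the window
        ratios[service] = error_rates.get(service, 0.0) / total
    return ratios


# Hypothetical sample values, in requests per second
print(error_ratio({"payment-service": 2.5},
                  {"payment-service": 50.0, "cart-service": 10.0}))
```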
5. Add span metrics to Grafana dashboard
{
  "title": "Span Metrics (RED)",
  "panels": [
    {
      "title": "Request Rate by Service",
      "type": "timeseries",
      "targets": [{
        "expr": "sum(rate(traces_spanmetrics_calls_total[5m])) by (service)"
      }]
    },
    {
      "title": "Error Rate by Service",
      "type": "timeseries",
      "targets": [{
        "expr": "sum(rate(traces_spanmetrics_calls_total{status_code=\"STATUS_CODE_ERROR\"}[5m])) by (service)"
      }]
    },
    {
      "title": "Latency p99 by Service",
      "type": "timeseries",
      "targets": [{
        "expr": "histogram_quantile(0.99, sum(rate(traces_spanmetrics_latency_bucket[5m])) by (le, service))"
      }]
    }
  ]
}
Expected output in Prometheus:
traces_spanmetrics_calls_total{service="payment-service"} 15000
traces_spanmetrics_latency_bucket{service="payment-service", le="0.1"} 12000
traces_spanmetrics_latency_bucket{service="payment-service", le="0.5"} 14800
traces_spanmetrics_latency_bucket{service="payment-service", le="+Inf"} 15000
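To see how histogram_quantile turns bucket counts like the sample above into a latency estimate, here is a simplified sketch of the linear interpolation Prometheus applies to classic histograms. It works on raw cumulative counts rather than rates, assumes 0 < q <= 1, and skips the edge cases the real function handles, so treat it as an illustration and not a reimplementation. The function name is hypothetical.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate a quantile from cumulative (upper_bound, count) pairs.

    `buckets` must be sorted by upper bound and end with (math.inf, total).
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, nothing to estimate
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # rank falls in +Inf: report the highest finite bound
                return lower_bound
            # linear interpolation inside the bucket
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Bucket counts taken from the sample output above
sample = [(0.1, 12000), (0.5, 14800), (math.inf, 15000)]
print(estimate_quantile(0.50, sample))
print(estimate_quantile(0.99, sample))
```

With these counts the p99 rank (14850) lies beyond the 14800 observations at or below 0.5 s, so the estimate is capped at 0.5, the highest finite bound. That is one reason to choose histogram_buckets that extend past your latency SLO.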
Prevention Tips
- Enable both span_metrics and service_graphs processors
- Configure remote_write to a Prometheus-compatible backend
- Use custom histogram_buckets that match your latency SLOs
- Add relevant span attributes as metric dimensions
- Verify metrics with a test trace before production rollout