
Metrics — Complete Request Rate, Errors, Latency Guide

DodaTech Updated 2026-06-28 2 min read

This tutorial covers the core metrics for monitoring an API: the key concepts, a practical instrumentation example, and best practices for alerting on them.

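API monitoring usually starts with three signals: request rate (traffic volume), error rate (failed requests), and latency (response time). This trio is also known as the RED method (rate, errors, duration); Google's SRE "golden signals" list adds a fourth, saturation. Together, these metrics show how much load an API is handling, how often it fails, and how quickly it responds.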

What You'll Learn

You'll learn the key API metrics, how to measure them, and how to set meaningful thresholds.

Why It Matters

These three metrics capture most API problems. A traffic spike, an error burst, or a latency increase each signals an issue that needs investigation.

Real-World Use

An API dashboard shows: 1000 requests/sec, 0.5% error rate, p50 latency 50ms, p95 200ms, p99 500ms. When p99 latency jumps to 2000ms, the team investigates the slowest 1% of requests.
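To make the percentile figures concrete, here is a minimal sketch that computes p50, p95, and p99 from raw latency samples using the nearest-rank method. The sample data and the `percentile` helper are illustrative assumptions, not part of any monitoring library; real systems usually estimate percentiles from histograms rather than storing every sample.

```python
import math
import statistics

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[max(rank, 1) - 1]

# Hypothetical latencies in milliseconds: 98 fast requests and 2 very slow ones.
latencies_ms = [50] * 98 + [2000] * 2

mean_ms = statistics.mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)

print(f"mean={mean_ms}ms p50={p50}ms p95={p95}ms p99={p99}ms")
```

With this data the mean is 89ms while p99 is 2000ms: the average looks healthy even though 2% of requests take two seconds, which is why the dashboard tracks percentiles.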

Key Metrics

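from flask import Flask, request
from prometheus_client import Counter, Histogram, Gauge
import time

app = Flask(__name__)

# Counts every request, labeled so rates can be broken down per endpoint and status.
REQUEST_COUNT = Counter(
    "api_requests_total",
    "Total API requests",
    ["method", "endpoint", "status"]
)

# Records request durations into buckets (upper bounds in seconds).
REQUEST_LATENCY = Histogram(
    "api_request_duration_seconds",
    "API request latency in seconds",
    ["method", "endpoint"],
    buckets=[.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5]
)

# Tracks how many requests are being handled right now.
IN_FLIGHT = Gauge(
    "api_requests_in_flight",
    "Current requests being processed",
    ["method"]
)

def process_request():
    # Placeholder for the real handler logic (assumed for this example).
    return {"ok": True}

@app.route("/api/data")
def get_data():
    method = request.method
    endpoint = "/api/data"
    IN_FLIGHT.labels(method=method).inc()
    start = time.perf_counter()  # monotonic clock, unaffected by system time changes
    try:
        result = process_request()
        REQUEST_COUNT.labels(method=method, endpoint=endpoint, status="200").inc()
        return result
    except Exception:
        # An unhandled exception becomes a 500 response, so count it as a server error.
        REQUEST_COUNT.labels(method=method, endpoint=endpoint, status="500").inc()
        raise
    finally:
        # Runs on success and failure, so the gauge and histogram always stay consistent.
        IN_FLIGHT.labels(method=method).dec()
        REQUEST_LATENCY.labels(method=method, endpoint=endpoint).observe(
            time.perf_counter() - start
        )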
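The histogram above stores cumulative bucket counts rather than raw samples, so percentiles have to be estimated from those buckets. The sketch below illustrates the general idea behind that estimation: find the bucket containing the target rank, then interpolate linearly inside it. The `estimate_quantile` function and the bucket counts are hypothetical and written for illustration; this is not Prometheus's own code, and results depend heavily on how the bucket boundaries are chosen.

```python
import math

def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 < q <= 1) from cumulative histogram buckets.

    `buckets` is a list of (upper_bound_seconds, cumulative_count) pairs,
    sorted by upper bound and ending with (math.inf, total_count).
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, nothing to estimate
    target = q * total  # rank of the observation we are looking for
    prev_bound, prev_count = 0.0, 0
    for upper, count in buckets:
        if count >= target:
            if math.isinf(upper):
                # The open-ended bucket has no width to interpolate over,
                # so fall back to the highest finite bound.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (target - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return prev_bound

# Hypothetical cumulative counts for 1000 observed requests.
example_buckets = [
    (0.1, 900),
    (0.25, 950),
    (0.5, 990),
    (1.0, 1000),
    (math.inf, 1000),
]

p50_estimate = estimate_quantile(0.50, example_buckets)
p99_estimate = estimate_quantile(0.99, example_buckets)
print(f"p50 ~ {p50_estimate:.3f}s, p99 ~ {p99_estimate:.3f}s")
```

Because the estimate can only be as precise as the bucket widths, it helps to place bucket boundaries near the latency thresholds you plan to alert on.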

Metric Types

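| Metric | Description | Example Alert |
|--------|-------------|---------------|
| Request rate | Requests per second | Drop below 50% of normal |
| Error rate | Percentage of 5xx responses | Error rate > 5% for 5 minutes |
| Latency p50 | Median response time | p50 > 200ms |
| Latency p95 | Response time that 95% of requests stay under | p95 > 500ms |
| Latency p99 | Response time that 99% of requests stay under | p99 > 2000ms |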
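The example alerts in the table combine a threshold with a duration or a baseline. As a rough sketch of that logic, the functions below check whether the error rate has stayed above a threshold for a number of consecutive minutes and whether traffic has dropped below a fraction of normal. The function names, the one-sample-per-minute assumption, and the sample values are all hypothetical; in practice an alerting system evaluates rules like these for you.

```python
def sustained_breach(per_minute_values, threshold, minutes):
    """True if each of the last `minutes` samples is above `threshold`."""
    if len(per_minute_values) < minutes:
        return False  # not enough history to judge yet
    return all(value > threshold for value in per_minute_values[-minutes:])

def traffic_dropped(current_rps, baseline_rps, fraction=0.5):
    """True if current traffic is below `fraction` of the normal baseline."""
    return current_rps < baseline_rps * fraction

# Hypothetical error rates, one sample per minute (0.06 means 6%).
brief_spike = [0.005, 0.20, 0.005, 0.005, 0.005, 0.005]
real_outage = [0.005, 0.06, 0.07, 0.08, 0.06, 0.09]

# "Error rate > 5% for 5 minutes"
spike_alert = sustained_breach(brief_spike, threshold=0.05, minutes=5)
outage_alert = sustained_breach(real_outage, threshold=0.05, minutes=5)

# "Drop below 50% of normal", assuming a normal rate of 1000 requests/sec
drop_alert = traffic_dropped(current_rps=400, baseline_rps=1000)

print(spike_alert, outage_alert, drop_alert)
```

Requiring the condition to hold for several minutes keeps a single noisy minute from paging anyone, at the cost of reacting a little later to a real outage.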

Common Mistakes

| Mistake | Consequence | Fix |
|---------|-------------|-----|
| Only tracking average latency | Hides tail latency | Track p50, p95, p99 percentiles |
| No error rate by endpoint | One broken endpoint hidden | Track errors per endpoint |
| Counting 4xx as errors | Client errors skew data | Track 5xx as errors, 4xx separately |
| No rate of change alerts | Gradual degradation missed | Alert on rate of change metrics |
| Too many metrics | Noise hides signal | Start with 5 key metrics |
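Two of these fixes, tracking errors per endpoint and keeping 4xx separate from 5xx, can be sketched in a few lines. The example below is an illustration under assumed data: the `summarize` function, the endpoint names, and the request log are hypothetical, and a real setup would derive the same breakdown from labeled counters instead of a list of tuples.

```python
from collections import defaultdict

def summarize(requests):
    """Aggregate (endpoint, status_code) pairs into per-endpoint statistics."""
    stats = defaultdict(lambda: {"total": 0, "client_4xx": 0, "server_5xx": 0})
    for endpoint, status in requests:
        entry = stats[endpoint]
        entry["total"] += 1
        if 400 <= status < 500:
            entry["client_4xx"] += 1  # tracked, but not counted as a service error
        elif 500 <= status < 600:
            entry["server_5xx"] += 1
    # Error rate counts only 5xx responses, per endpoint.
    return {
        endpoint: {**entry, "error_rate": entry["server_5xx"] / entry["total"]}
        for endpoint, entry in stats.items()
    }

# Hypothetical request log: one healthy endpoint, one that is failing.
request_log = (
    [("/api/data", 200)] * 95
    + [("/api/data", 404)] * 5
    + [("/api/upload", 200)] * 8
    + [("/api/upload", 500)] * 2
)

report = summarize(request_log)
overall_error_rate = sum(e["server_5xx"] for e in report.values()) / len(request_log)

for endpoint, entry in report.items():
    print(endpoint, f"{entry['error_rate']:.1%}", f"4xx={entry['client_4xx']}")
print(f"overall: {overall_error_rate:.1%}")
```

Here the overall 5xx rate is under 2%, while `/api/upload` fails on 20% of its requests, a problem that only the per-endpoint breakdown exposes.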

Practice Questions

  1. What are the three core API metrics covered in this guide?
  2. Why is p99 latency more important than average?
  3. What is a good error rate threshold?
  4. How do you set latency SLOs?
  5. What is the difference between p95 and p99?

Challenge

Instrument a Flask API with Prometheus metrics: request count (by method, endpoint, status), latency histogram, and in-flight requests gauge. Query with PromQL to find p99 latency.

What's Next

Learn about structured JSON logging for APIs.
