
Distributed Tracing — Complete Request Tracing Guide

DodaTech Updated 2026-06-28 2 min read

This tutorial introduces distributed tracing: the key concepts, a practical OpenTelemetry example, and the common mistakes to avoid.

Distributed tracing follows a request as it travels through multiple microservices. Each service records spans (individual timed operations), and a shared trace ID correlates those spans across service boundaries.
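To make the relationship between a trace ID and its spans concrete, here is a minimal sketch using only the Python standard library. The `Span` class and the `child_of` helper are hypothetical names invented for illustration; they are not part of OpenTelemetry or any other tracing library.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One timed operation. Illustrative model only, not a library API."""
    name: str
    service: str
    trace_id: str                    # shared by every span of the same request
    parent_id: Optional[str] = None  # None marks the root span
    span_id: str = field(default_factory=lambda: secrets.token_hex(8))


def child_of(parent: Span, name: str, service: str) -> Span:
    # A child keeps the parent's trace ID and records the parent's span ID.
    return Span(name=name, service=service,
                trace_id=parent.trace_id, parent_id=parent.span_id)


# The gateway starts the trace; downstream services add child spans to it.
root = Span(name="checkout", service="gateway", trace_id=secrets.token_hex(16))
auth = child_of(root, "authenticate", "auth")
payment = child_of(root, "charge", "payment")

spans = [root, auth, payment]
# Grouping by trace ID is what lets a tracing backend rebuild the request.
same_trace = {s.trace_id for s in spans} == {root.trace_id}
```

The parent IDs give the backend the tree structure, while the trace ID tells it which spans belong to the same request in the first place.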

What You'll Learn

You'll learn how distributed tracing works, how trace context is propagated between services, and how to identify performance bottlenecks.

Why It Matters

Without tracing, you cannot tell which service in a chain is slow. A 2-second API response that passes through five services could be slow in any one of them. Tracing pinpoints the exact bottleneck.

Real-World Use

A checkout request takes 3 seconds. Tracing shows: gateway (50ms), auth (30ms), inventory (200ms), payment (2500ms), notification (220ms). The payment service is the bottleneck.

sequenceDiagram
    participant G as Gateway
    participant A as Auth
    participant I as Inventory
    participant P as Payment
    participant N as Notify
    G->>G: Span: total (3000ms)
    G->>A: Span: auth (30ms)
    G->>I: Span: inventory (200ms)
    G->>P: Span: payment (2500ms)
    G->>N: Span: notify (220ms)
    Note over G,N: Trace ID: abc-123
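The bottleneck in the checkout example can be found mechanically from the span durations. The sketch below uses the numbers quoted above and assumes the five operations run one after another without overlapping, so their durations add up to the total.

```python
# Durations in milliseconds, taken from the checkout example.
span_durations_ms = {
    "gateway": 50,
    "auth": 30,
    "inventory": 200,
    "payment": 2500,
    "notification": 220,
}

total_ms = sum(span_durations_ms.values())
# The slowest span is the first candidate for optimization.
slowest = max(span_durations_ms, key=span_durations_ms.get)
share = span_durations_ms[slowest] / total_ms

print(f"total: {total_ms} ms")
print(f"slowest: {slowest} ({span_durations_ms[slowest]} ms, {share:.0%} of total)")
```

With these figures, payment accounts for 2500 of 3000 ms, so speeding up any other service would change the response time very little.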

Implementation

from flask import Flask, jsonify
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Export finished spans in batches to an OTLP/HTTP collector endpoint.
provider = TracerProvider()
processor = BatchSpanProcessor(
    OTLPSpanExporter(endpoint="http://otel-collector:4318/v1/traces")
)
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)

app = Flask(__name__)

# current_user, validate_cart, and payment_service are application-specific
# and assumed to be defined elsewhere in the project.

@app.route("/api/checkout")
def checkout():
    # Root span for this request; the nested spans become its children.
    with tracer.start_as_current_span("checkout") as span:
        span.set_attribute("user_id", current_user.id)
        with tracer.start_as_current_span("validate_cart") as child:
            cart = validate_cart()
            child.set_attribute("cart_total", cart.total)
        with tracer.start_as_current_span("process_payment") as child:
            result = payment_service.charge(cart)
            child.set_attribute("payment_id", result.id)
        return jsonify({"status": "success"})
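The handler above records spans inside a single process. For a downstream service to continue the same trace, the trace context has to travel with the outgoing request, typically in the W3C `traceparent` header. In practice this is usually left to OpenTelemetry's propagation API (`opentelemetry.propagate.inject` and `extract`) or to auto-instrumentation; the standard-library sketch below only illustrates what the header carries. The helper names `inject_traceparent` and `extract_traceparent` are hypothetical.

```python
import re
import secrets
from typing import Dict, Optional, Tuple

# Layout: version "00" - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def inject_traceparent(headers: Dict[str, str], trace_id: str,
                       span_id: str, sampled: bool = True) -> None:
    """Caller side: write the current trace context into outgoing headers."""
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"


def extract_traceparent(headers: Dict[str, str]) -> Optional[Tuple[str, str, bool]]:
    """Callee side: return (trace_id, parent_span_id, sampled), or None if unusable."""
    match = _TRACEPARENT.match(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero IDs are defined as invalid, so treat them as "no context".
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return trace_id, parent_id, bool(int(flags, 16) & 0x01)


# Round trip: the caller injects, the callee extracts the same context.
trace_id = secrets.token_hex(16)
span_id = secrets.token_hex(8)
outgoing: Dict[str, str] = {}
inject_traceparent(outgoing, trace_id, span_id)
received = extract_traceparent(outgoing)
```

The callee would start its own span with the extracted trace ID and use the extracted span ID as the parent. This sketch assumes lowercase header keys in a plain dict; real HTTP header names are case-insensitive.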

Common Mistakes

| Mistake | Consequence | Fix |
|---------|-------------|-----|
| Not propagating trace context | Broken trace chain | Always propagate the traceparent header |
| Sampling too aggressively | Missing traces for rare errors | Use tail-based sampling that always keeps traces containing errors |
| Too many spans | Overhead and noise | Only trace meaningful operations |
| No span attributes | Cannot analyze traces | Add key attributes (user_id, order_id) |
| Ignoring async operations | Missing spans for background jobs | Trace async processes with separate traces linked back to the originating request |
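The sampling mistake comes down to when the keep-or-drop decision is made. A head-based sampler decides at the start of a request, before it can know whether the request will fail. A tail-based sampler decides after the trace has finished, so it can always keep errors. The sketch below contrasts the two with the standard library only; `FinishedTrace`, `head_sample`, `tail_sample`, and the 1000 ms slow threshold are illustrative assumptions, not a real collector configuration.

```python
from dataclasses import dataclass


@dataclass
class FinishedTrace:
    trace_id: str       # 32 hex characters
    duration_ms: float
    has_error: bool


def head_sample(trace_id: str, ratio: float) -> bool:
    """Decide at request start, using nothing but the trace ID."""
    # Map the low 64 bits of the ID onto [0, 1] for a stable, repeatable decision.
    bucket = int(trace_id[-16:], 16) / 2**64
    return bucket < ratio


def tail_sample(t: FinishedTrace, ratio: float, slow_ms: float = 1000.0) -> bool:
    """Decide after the trace completes, so errors and slow requests are always kept."""
    if t.has_error or t.duration_ms >= slow_ms:
        return True
    # Everything else falls back to the same ratio-based decision.
    return head_sample(t.trace_id, ratio)


# A rare failing request whose trace ID falls outside the 1% head sample.
rare_error = FinishedTrace(trace_id="f" * 32, duration_ms=40.0, has_error=True)
kept_by_head = head_sample(rare_error.trace_id, 0.01)
kept_by_tail = tail_sample(rare_error, 0.01)
```

Here the head-based decision drops the failing request while the tail-based decision keeps it. The trade-off is that tail-based sampling has to buffer all spans of a trace until the request completes.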

Practice Questions

  1. What is a trace vs a span?
  2. How is trace context propagated?
  3. What is sampling and why is it needed?
  4. How do you identify bottlenecks with tracing?
  5. What is distributed context propagation?

Challenge

Instrument a Flask API with OpenTelemetry tracing. Create a service that calls two downstream services. View the trace in Jaeger or Zipkin. Identify the slowest span.

What's Next

Learn about OpenTelemetry for vendor-neutral instrumentation.
