OpenTelemetry Collector: Architecture and Deployment Guide
This guide explains the architecture of the OpenTelemetry Collector and walks through deploying it with Docker, Docker Compose, and Kubernetes. It covers the key concepts, practical configuration examples, and common pitfalls.
What You Will Learn
This tutorial teaches you how to deploy and configure the OpenTelemetry Collector for receiving telemetry data (traces, metrics, logs) from instrumented applications, processing and enriching the data, and exporting it to multiple backends.
Why It Matters
Directly sending telemetry from each application to every backend does not scale. The OpenTelemetry Collector acts as a central gateway: it decouples instrumentation from backends, provides data transformation and filtering, and reduces the configuration burden on application developers.
Real-World Use
The DodaTech platform runs 200 microservices across multiple Kubernetes clusters. Instead of configuring each service to send data to every backend separately, every service sends OTLP to the Collector, which routes traces to Jaeger, metrics to Prometheus, and logs to Loki, all from a single configuration file.
The OpenTelemetry Collector is a vendor-agnostic service for receiving, processing, and exporting telemetry data. It consists of receivers (data ingestion), processors (data transformation), and exporters (data output), which are wired together into pipelines. It supports the OTLP protocol natively and can also ingest data in other formats.
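The receiver, processor, exporter flow can be pictured with a toy model. The sketch below is purely conceptual and is not how the Collector is implemented: `Pipeline`, `add_attribute`, `drop_if`, and the record shape are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]                        # one telemetry item (hypothetical shape)
Processor = Callable[[Record], Optional[Record]]  # return None to drop the record
Exporter = Callable[[Record], None]


@dataclass
class Pipeline:
    """Toy model: processors run in order, then every exporter gets the result."""
    processors: List[Processor] = field(default_factory=list)
    exporters: List[Exporter] = field(default_factory=list)

    def receive(self, record: Record) -> None:
        current: Optional[Record] = dict(record)  # copy so the caller's dict is untouched
        for process in self.processors:
            current = process(current)
            if current is None:                   # a processor dropped the record
                return
        for export in self.exporters:
            export(dict(current))                 # fan out a copy to each exporter


def add_attribute(key: str, value: str) -> Processor:
    """Processor that sets one attribute on every record."""
    def process(record: Record) -> Record:
        record[key] = value
        return record
    return process


def drop_if(predicate: Callable[[Record], bool]) -> Processor:
    """Processor that drops records for which the predicate is true."""
    return lambda record: None if predicate(record) else record


jaeger_like: List[Record] = []   # stand-ins for real backends
debug_like: List[Record] = []

traces = Pipeline(
    processors=[drop_if(lambda r: r.get("name") == "healthcheck"),
                add_attribute("environment", "production")],
    exporters=[jaeger_like.append, debug_like.append],
)
traces.receive({"name": "GET /orders"})
traces.receive({"name": "healthcheck"})
print(jaeger_like)
```

The point of the model is the ordering: a record passes through every processor in sequence, and only what survives is fanned out to all exporters of that pipeline.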
Prerequisites
- Docker and Docker Compose installed
- Basic understanding of OpenTelemetry tracing
- A target backend (Jaeger, Prometheus, Loki)
- Kubernetes cluster (optional, for production deployment)
Step-by-Step Tutorial
Step 1: Deploy the OpenTelemetry Collector
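docker run -d --name otel-collector \
  -p 4317:4317 \
  -p 4318:4318 \
  -p 8888:8888 \
  -p 8889:8889 \
  -v "$(pwd)/otel-config.yaml:/etc/otel-collector-config.yaml" \
  otel/opentelemetry-collector-contrib:0.104.0 \
  --config=/etc/otel-collector-config.yaml
Expected output: The Collector starts and listens on OTLP gRPC (4317), OTLP HTTP (4318), and internal metrics (8888). Port 8889 is reserved for the Prometheus exporter added in Step 4. The --config flag is required because the image otherwise loads its built-in default configuration instead of the mounted file. Create otel-config.yaml (Step 2) before running the command, because the container needs the mounted file at startup.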
Step 2: Create a Basic Configuration
Create otel-config.yaml:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
exporters:
  debug:
    verbosity: detailed
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
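To make the two batch settings concrete, here is a minimal sketch of size-or-timeout batching. It is a simplification written for illustration, not the batch processor's actual code: the `Batcher` class, its `tick` method, and the injectable `clock` are assumptions, and the real processor's timer semantics differ in detail.

```python
import time
from typing import Callable, List, Optional


class Batcher:
    """Flush when the batch is full or when the timeout has elapsed."""

    def __init__(self, send: Callable[[List[str]], None],
                 send_batch_size: int = 1024, timeout_s: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.send = send
        self.send_batch_size = send_batch_size
        self.timeout_s = timeout_s
        self.clock = clock
        self.items: List[str] = []
        self.first_item_at: Optional[float] = None

    def add(self, item: str) -> None:
        if not self.items:
            self.first_item_at = self.clock()   # start timing at the first queued item
        self.items.append(item)
        if len(self.items) >= self.send_batch_size:
            self.flush()                        # size trigger

    def tick(self) -> None:
        """Call periodically; flushes a partial batch once the timeout passes."""
        if self.items and self.clock() - self.first_item_at >= self.timeout_s:
            self.flush()                        # timeout trigger

    def flush(self) -> None:
        if self.items:
            batch, self.items = self.items, []
            self.first_item_at = None
            self.send(batch)


# Demo with a fake clock so the behaviour is deterministic.
now = [0.0]
sent: List[List[str]] = []
batcher = Batcher(sent.append, send_batch_size=3, timeout_s=1.0, clock=lambda: now[0])
for i in range(4):
    batcher.add(f"span-{i}")   # the third add fills the batch and triggers a send
now[0] = 1.5
batcher.tick()                 # the leftover span goes out on timeout
print(sent)
```

A larger send_batch_size means fewer, bigger export requests; a shorter timeout bounds how long a quiet service waits before its data leaves the Collector.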
Step 3: Send Test Data to the Collector
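# Optional: install telemetrygen, the load generator from the contrib repository
go install github.com/open-telemetry/opentelemetry-collector-contrib/cmd/telemetrygen@latest
# Send one test span over OTLP/HTTP.
# traceId must be 32 hex characters and spanId 16 hex characters.
curl -X POST http://localhost:4318/v1/traces \
  -H "Content-Type: application/json" \
  -d '{"resourceSpans": [{"resource": {}, "scopeSpans": [{"spans": [{"traceId": "5b8efff798038103d269b633813fc60c", "spanId": "eee19b7ec3c1b174", "name": "test-span", "kind": 1, "startTimeUnixNano": "1700000000000000000", "endTimeUnixNano": "1700000001000000000"}]}]}]}'
Expected output: The Collector logs (docker logs otel-collector) show the ingested span named test-span.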
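The same request can be built from Python using only the standard library, which makes it easier to generate valid identifiers and timestamps. This is a sketch under assumptions: `build_trace_payload`, `send_trace`, and the `demo-service` name are hypothetical, and the endpoint assumes the Collector from Step 1 is reachable on localhost.

```python
import json
import secrets
import time
import urllib.request


def build_trace_payload(span_name: str, duration_ms: int = 50) -> dict:
    """Build a minimal OTLP/JSON body holding a single span."""
    end_ns = time.time_ns()
    start_ns = end_ns - duration_ms * 1_000_000
    return {
        "resourceSpans": [{
            "resource": {"attributes": [
                # service.name value is a placeholder for illustration
                {"key": "service.name", "value": {"stringValue": "demo-service"}},
            ]},
            "scopeSpans": [{
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 16 bytes -> 32 hex characters
                    "spanId": secrets.token_hex(8),    # 8 bytes -> 16 hex characters
                    "name": span_name,
                    "kind": 1,                         # SPAN_KIND_INTERNAL
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                }],
            }],
        }],
    }


def send_trace(payload: dict,
               endpoint: str = "http://localhost:4318/v1/traces") -> int:
    """POST the payload to the OTLP/HTTP receiver and return the HTTP status."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

With the Collector running, calling `send_trace(build_trace_payload("test-span"))` should return a 2xx status, and the span should then show up in the debug exporter's log output.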
Step 4: Add Multiple Exporters
# Merge into otel-config.yaml; receivers and processors stay as in Step 2.
exporters:
  # Keep the debug exporter from Step 2: the pipelines below still reference it.
  debug:
    verbosity: detailed
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
  prometheus:
    endpoint: 0.0.0.0:8889
    namespace: otel
  otlphttp/loki:
    endpoint: http://loki:3100/otlp
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger, debug]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus, debug]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/loki, debug]
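Every component named in a pipeline must also be declared in the matching top-level section, otherwise the Collector refuses to start. A quick pre-flight check can catch this before deployment. The sketch below is an illustration only: `undefined_references` is a hypothetical helper, and the dict is a hand-written mirror of part of the configuration with settings elided. It deliberately leaves `debug` undeclared to show what the check reports.

```python
from typing import Dict, List

SECTIONS = ("receivers", "processors", "exporters")


def undefined_references(config: Dict) -> List[str]:
    """Return 'pipeline/section/name' for each pipeline entry with no definition."""
    problems: List[str] = []
    pipelines = config.get("service", {}).get("pipelines", {})
    for pipeline_name, pipeline in pipelines.items():
        for section in SECTIONS:
            defined = config.get(section) or {}
            for component in pipeline.get(section, []):
                if component not in defined:
                    problems.append(f"{pipeline_name}/{section}/{component}")
    return problems


# Hand-written mirror of part of the configuration (component settings elided).
config = {
    "receivers": {"otlp": {}},
    "processors": {"batch": {}},
    "exporters": {"otlp/jaeger": {}, "prometheus": {}, "otlphttp/loki": {}},
    "service": {"pipelines": {
        "traces": {"receivers": ["otlp"], "processors": ["batch"],
                   "exporters": ["otlp/jaeger", "debug"]},
        "metrics": {"receivers": ["otlp"], "processors": ["batch"],
                    "exporters": ["prometheus", "debug"]},
    }},
}
print(undefined_references(config))
```

An empty list means every pipeline entry resolves to a declared component; here the two `debug` references are flagged because the exporter is missing from the `exporters` section.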
Step 5: Add Data Processing Pipelines
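# Processors only take effect once they are listed under
# service.pipelines.<signal>.processors. List memory_limiter first.
processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
    spike_limit_mib: 128
  attributes:
    actions:
      - key: environment
        value: production
        action: upsert
      - key: region
        value: us-east-1
        action: upsert
  filter:
    error_mode: ignore
    metrics:
      # A metric is dropped when it matches any condition below.
      metric:
        - 'IsMatch(name, "otel.*")'
        - 'HasAttrKeyOnDatapoint("http.method")'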
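The difference between the attribute actions, and the fact that filter conditions describe what to drop rather than what to keep, are easy to get backwards. The sketch below mimics both behaviours in plain Python. `apply_action` and `keep_metrics` are hypothetical helpers for illustration, not Collector APIs, and the regular-expression search is an assumption about how a pattern such as "otel.*" is matched against a metric name.

```python
import re
from typing import Dict, List


def apply_action(attrs: Dict[str, str], key: str, value: str,
                 action: str) -> Dict[str, str]:
    """Mimic three attribute actions on a copy of the attributes."""
    result = dict(attrs)
    if action == "insert":      # add only when the key is missing
        result.setdefault(key, value)
    elif action == "update":    # change only when the key already exists
        if key in result:
            result[key] = value
    elif action == "upsert":    # add the key or overwrite its value
        result[key] = value
    else:
        raise ValueError(f"unsupported action: {action}")
    return result


def keep_metrics(names: List[str], drop_pattern: str) -> List[str]:
    """Return the names that survive: anything matching the pattern is dropped."""
    pattern = re.compile(drop_pattern)
    return [name for name in names if not pattern.search(name)]


span_attrs = {"environment": "staging"}
print(apply_action(span_attrs, "environment", "production", "insert"))
print(apply_action(span_attrs, "environment", "production", "upsert"))
print(keep_metrics(["otel.sdk.spans", "http.server.duration"], r"otel.*"))
```

With `insert`, the existing `staging` value is left alone; with `upsert`, it is overwritten, which is why the configuration above uses `upsert` for enrichment.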
Step 6: Deploy the Collector with Docker Compose
version: "3.8"
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:0.104.0
    volumes:
      - ./otel-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "4317:4317"
      - "4318:4318"
      - "8888:8888"
      - "8889:8889"
    command: ["--config=/etc/otel-collector-config.yaml"]
  jaeger:
    image: jaegertracing/all-in-one:1.57
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    ports:
      - "16686:16686"
  prometheus:
    image: prom/prometheus:v2.53.0
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
Step 7: Deploy on Kubernetes
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-conf
data:
  otel-config.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
    processors:
      batch:
        timeout: 1s
    exporters:
      # debug must be declared because the traces pipeline references it
      debug:
      prometheus:
        endpoint: 0.0.0.0:8889
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [debug]
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [prometheus]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
spec:
  replicas: 2
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      containers:
        - name: collector
          image: otel/opentelemetry-collector-contrib:0.104.0
          args: ["--config=/conf/otel-config.yaml"]
          ports:
            - containerPort: 4317
            - containerPort: 8889
          volumeMounts:
            - name: config
              mountPath: /conf
      volumes:
        - name: config
          configMap:
            name: otel-collector-conf
Step 8: Monitor the Collector Itself
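# Prometheus scrape config for the collector
scrape_configs:
  - job_name: "otel-collector"
    static_configs:
      - targets: ["otel-collector:8888"]
The Collector exposes its own metrics on port 8888, including:
- otelcol_receiver_accepted_spans
- otelcol_exporter_sent_metric_points
- otelcol_processor_batch_timeout_trigger_send
Exact metric names can vary between Collector versions (for example, counters may carry a _total suffix), so check the /metrics output of the version you run.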
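When debugging without a Prometheus server, the metrics endpoint can be read directly, since it serves the Prometheus text format. The sketch below sums sample values per metric name. It is a simplified parser written for illustration: `sum_by_metric` is a hypothetical helper, the `SAMPLE` text and its values are made up, and the parsing assumes lines carry no trailing timestamp.

```python
from typing import Dict

# Made-up sample in Prometheus text format; the values are illustrative only.
SAMPLE = """\
# TYPE otelcol_receiver_accepted_spans counter
otelcol_receiver_accepted_spans{receiver="otlp",transport="grpc"} 1200
otelcol_receiver_accepted_spans{receiver="otlp",transport="http"} 300
# TYPE otelcol_exporter_sent_metric_points counter
otelcol_exporter_sent_metric_points{exporter="prometheus"} 4500
"""


def sum_by_metric(exposition_text: str, prefix: str = "otelcol_") -> Dict[str, float]:
    """Sum sample values per metric name, ignoring labels and comment lines."""
    totals: Dict[str, float] = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                              # skip blanks, HELP and TYPE lines
        series, _, value = line.rpartition(" ")   # value is the last field
        name = series.split("{", 1)[0]            # strip the label set, if any
        if name.startswith(prefix):
            totals[name] = totals.get(name, 0.0) + float(value)
    return totals


print(sum_by_metric(SAMPLE))
```

Against a live Collector, the text would come from fetching http://localhost:8888/metrics; comparing accepted counts on the receiver side with sent counts on the exporter side is a quick way to spot data being dropped inside a pipeline.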
Learning Path
flowchart LR
A[OTel Collector] --> B{Receivers}
B --> C[OTLP gRPC]
B --> D[OTLP HTTP]
B --> E[Other Formats]
C --> F{Processors}
D --> F
E --> F
F --> G[Batch]
F --> H[Attributes]
F --> I[Filter]
F --> J[Memory Limiter]
G --> K{Exporters}
H --> K
I --> K
J --> K
K --> L[Jaeger]
K --> M[Prometheus]
K --> N[Loki]
K --> O[Other Backends]
style A fill:#4a90d9,color:#fff
style K fill:#e67e22,color:#fff
Common Errors
- Collector fails to start with "unknown receivers" -- The configuration references a component not included in the distribution. Use the contrib image for full component support.
- OTLP gRPC port is not listening -- Another process is already using port 4317. Change the port or stop the conflicting service.
- Exporter returns "connection refused" -- The target backend is not running or is unreachable. Verify the backend endpoint with curl.
- High memory usage causes OOM kill -- The memory_limiter processor is not configured. Set limit_mib to 80% of available memory.
- Data appears in debug output but not in the backend -- The exporter configuration is wrong. Verify the backend endpoint, authentication, and TLS settings.
- Kubernetes Collector cannot schedule -- Resource requests are too high for the available nodes. Reduce CPU/memory requests or add more nodes.
- Attribute processor does not add labels -- The action key is wrong or the attribute already exists with a different value. Use the upsert action to overwrite existing values.
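The 80% guideline for the memory limiter can be turned into a small calculation. This is a sketch under assumptions: `memory_limiter_settings` is a hypothetical helper, and the 25% spike ratio is simply the ratio between spike_limit_mib (128) and limit_mib (512) used in Step 5, not an official recommendation.

```python
def memory_limiter_settings(available_mib: int, limit_percent: int = 80,
                            spike_percent: int = 25) -> dict:
    """Derive memory_limiter values from the memory available to the Collector."""
    if available_mib <= 0:
        raise ValueError("available_mib must be positive")
    limit_mib = available_mib * limit_percent // 100      # 80% of available memory
    spike_limit_mib = limit_mib * spike_percent // 100    # assumed ratio from Step 5
    return {
        "check_interval": "1s",
        "limit_mib": limit_mib,
        "spike_limit_mib": spike_limit_mib,
    }


# A container with 640 MiB available reproduces the Step 5 values.
print(memory_limiter_settings(640))
```

On Kubernetes, the available memory would normally be the container's memory limit, so the two numbers should be kept in sync whenever the resource limits change.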
Practice Questions
What are the three main components of the OpenTelemetry Collector pipeline? Answer: Receivers (ingest data), processors (transform data), and exporters (send data to backends).
What is the purpose of the Batch processor? Answer: It groups telemetry data into batches to improve export efficiency and reduce the number of outgoing requests.
How does the Collector handle multiple telemetry signals? Answer: It supports separate pipelines for traces, metrics, and logs. Each pipeline has its own receivers, processors, and exporters.
What port does the Collector use for its own metrics? Answer: Port 8888 for internal Prometheus metrics.
Why use the Collector instead of sending telemetry directly from applications? Answer: The Collector decouples instrumentation from backends, centralizes configuration, and provides data transformation that would otherwise need to be implemented in every service.
Challenge
Deploy the OpenTelemetry Collector in a Docker Compose stack alongside a sample application, Jaeger, Prometheus, and Loki, then work through the following tasks:
- Configure the Collector with three pipelines: traces -> Jaeger, metrics -> Prometheus, and logs -> Loki.
- Add processors for batching, memory limiting, and attribute enrichment (adding environment=production and region=us-east-1 to all telemetry).
- Add a filter processor that drops all metrics with names starting with "internal.".
- Instrument a Python application to send OTLP data to the Collector.
- Verify that traces appear in Jaeger, metrics in Prometheus, and logs in Loki.
- Scale the Collector to handle 10x the load and verify with the internal metrics endpoint.
- Export the Collector's own metrics to Prometheus and create a Grafana dashboard showing data ingestion rates.