
Datadog Introduction: APM and Infrastructure Monitoring

DodaTech Updated 2026-06-23 6 min read

This tutorial introduces Datadog for APM and infrastructure monitoring. It covers the key concepts and walks through a hands-on setup you can follow on a single host.

What You Will Learn

This tutorial teaches you how to set up Datadog for infrastructure monitoring, application performance monitoring (APM), log management, and dashboard creation -- all through a single agent.

Why It Matters

Most observability tools require separate agents, backends, and dashboards for metrics, traces, and logs. Datadog unifies all three signals into one platform, reducing operational overhead and enabling faster correlation during incident investigations.

Real-World Use

The Doda Browser team uses Datadog APM to trace every API request from the browser client through the backend services. When a user reports slowness, the team finds the trace, identifies the slowest span, and sees the host-level CPU and memory metrics alongside the trace -- all in one view.

Datadog is a SaaS-based monitoring and analytics platform. It provides infrastructure monitoring, APM, log management, synthetic monitoring, and security monitoring. The Datadog Agent is installed on hosts and collects metrics, traces, and logs, forwarding them to the Datadog backend.


Prerequisites

  • A Datadog account (14-day free trial available)
  • A Linux server or local VM
  • Python 3.8+ for the sample application
  • Basic understanding of Prometheus or other monitoring tools

Step-by-Step Tutorial

Step 1: Install the Datadog Agent

DD_API_KEY=your_api_key_here DD_SITE="datadoghq.com" bash -c "$(curl -L https://s3.amazonaws.com/dd-agent/scripts/install_script_agent7.sh)"

Expected output: The Agent installs and starts. Verify with sudo datadog-agent status.

Step 2: Verify Agent Installation

sudo datadog-agent status | head -20

Look for "Running" in the Agent status. Open the Datadog web dashboard and navigate to Infrastructure > Host Map. Your host should appear.
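If the host does not show up, one thing worth ruling out is the API key itself. The sketch below builds a request against Datadog's key validation endpoint (/api/v1/validate) using only the Python standard library. The function names and the your_api_key_here placeholder are illustrative, and the default site assumes the datadoghq.com region used in Step 1.

```python
import json
import urllib.error
import urllib.request


def build_validate_request(api_key, site="datadoghq.com"):
    """Build a GET request for the API key validation endpoint."""
    url = f"https://api.{site}/api/v1/validate"
    return urllib.request.Request(url, headers={"DD-API-KEY": api_key}, method="GET")


def api_key_is_valid(api_key, site="datadoghq.com", timeout=10):
    """Send the request; return True only if Datadog reports the key as valid.

    An HTTP error response (for example, a rejected key) is treated as invalid.
    Network failures are not caught, so they surface as exceptions.
    """
    request = build_validate_request(api_key, site)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return bool(json.load(response).get("valid"))
    except urllib.error.HTTPError:
        return False


# Building the request makes no network call; it only shows what would be sent.
request = build_validate_request("your_api_key_here")
print(request.full_url)
```

Call api_key_is_valid with your real key from a machine that has outbound HTTPS access. If you use a different Datadog site, pass it as the site argument.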

Step 3: Enable Integrations

Datadog provides 700+ integrations. Enable common ones:

# Enable the Redis integration
sudo cp /etc/datadog-agent/conf.d/redisdb.d/conf.yaml.example \
        /etc/datadog-agent/conf.d/redisdb.d/conf.yaml
sudo vi /etc/datadog-agent/conf.d/redisdb.d/conf.yaml

In conf.yaml, point the check at your Redis instance:

init_config:

instances:
  - host: localhost
    port: 6379

Enable the Nginx integration the same way:

# Enable the Nginx integration
sudo cp /etc/datadog-agent/conf.d/nginx.d/conf.yaml.example \
        /etc/datadog-agent/conf.d/nginx.d/conf.yaml

Step 4: Restart the Agent

sudo systemctl restart datadog-agent

Step 5: Instrument a Python Application with APM

pip install flask ddtrace

Create app.py:

from flask import Flask
import time
import random

app = Flask(__name__)

@app.route("/")
def home():
    return {"message": "Hello from Datadog"}

@app.route("/process")
def process():
    time.sleep(random.uniform(0.1, 0.3))
    return {"status": "processed"}

if __name__ == "__main__":
    app.run(port=5000)

Run with ddtrace:

DD_SERVICE="my-app" DD_ENV="production" DD_VERSION="1.0" \
  ddtrace-run python app.py

Generate traffic:

for i in $(seq 1 50); do curl http://localhost:5000/process; done

Step 6: View Traces in Datadog

In the Datadog web dashboard:

  1. Navigate to APM > Traces
  2. Select the my-app service
  3. Click on a trace to see the Waterfall view

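Step 7: Create Custom Metrics

Add the following route to app.py. It wraps the work in a custom span, tags the span, and attaches a numeric span metric named custom.processing_time. The value lives on the trace; it is not a standalone metric submitted through DogStatsD.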

from ddtrace import tracer

@app.route("/custom")
def custom():
    with tracer.trace("custom.operation", service="my-app") as span:
        span.set_tag("user.id", "demo-user")
        time.sleep(0.2)
        span.set_metric("custom.processing_time", 200)
    return {"custom": "done"}
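The span metric above stays attached to its trace. To get a metric you can graph and alert on by itself, one option is to send it to the Agent's DogStatsD listener. The following is a minimal sketch using only the standard library, assuming DogStatsD is listening on its default UDP port 8125 on the same host. The helper names are illustrative, and in practice you would more likely use an official client library.

```python
import socket


def format_dogstatsd(name, value, metric_type="g", tags=None):
    """Build a DogStatsD datagram: <name>:<value>|<type>|#<tag1>,<tag2>."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Send one datagram over UDP. Delivery is fire-and-forget."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


# "g" is a gauge; the tags mirror the DD_SERVICE and DD_ENV values from Step 5.
payload = format_dogstatsd(
    "custom.processing_time", 200, "g", ["service:my-app", "env:production"]
)
print(payload)  # custom.processing_time:200|g|#service:my-app,env:production
```

Calling send_metric(payload) from inside the /custom route would submit the gauge each time the route runs.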

Step 8: Create a Dashboard

  1. In Datadog, go to Dashboards > New Dashboard
  2. Add a Timeseries widget with the metric system.cpu.user
  3. Add an APM trace search widget showing traces from my-app
  4. Add a Log Stream widget showing recent error logs
  5. Set template variables for env and service
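The same dashboard can also be described as data and created through the Dashboards API instead of the UI. The sketch below only builds the JSON payload for the Timeseries widget and the two template variables. The field names reflect my understanding of the v1 dashboard schema and should be checked against the current API reference. The trace search and Log Stream widgets are left out, and build_dashboard is an illustrative name.

```python
import json


def build_dashboard(service="my-app"):
    """Return a dashboard payload with one CPU timeseries widget."""
    return {
        "title": f"{service} overview",
        "layout_type": "ordered",
        # Template variables let viewers filter every widget by env and service.
        "template_variables": [
            {"name": "env", "prefix": "env", "default": "*"},
            {"name": "service", "prefix": "service", "default": service},
        ],
        "widgets": [
            {
                "definition": {
                    "type": "timeseries",
                    "title": "CPU user time by host",
                    "requests": [
                        # $env is replaced by the selected template variable value.
                        {"q": "avg:system.cpu.user{$env} by {host}", "display_type": "line"}
                    ],
                }
            }
        ],
    }


print(json.dumps(build_dashboard(), indent=2))
```

Keeping the definition in code makes it possible to review dashboard changes and recreate the dashboard in another account.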

Step 9: Set Up Monitors and Alerts

  1. Go to Monitors > New Monitor
  2. Choose Metric Monitor
  3. Define: avg:system.cpu.user{*} by {host} > 80
  4. Set alert condition: above 80 for 5 minutes
  5. Configure the notification message with a handle such as @slack-<channel> or @pagerduty
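The monitor from these steps can be expressed as a single query string, where the 5-minute condition becomes the evaluation window in front of the metric query. The sketch below builds a payload in the shape accepted by the monitor creation API, as far as I know it. The function name, the monitor name, and the @slack-ops-alerts handle are hypothetical.

```python
import json


def build_cpu_monitor(threshold=80, notify="@slack-ops-alerts"):
    """Return a metric monitor payload: average CPU over 5 minutes, per host."""
    # Doubled braces in the f-string produce literal { and } in the query.
    query = f"avg(last_5m):avg:system.cpu.user{{*}} by {{host}} > {threshold}"
    message = (
        "CPU user time above " + str(threshold) + "% for 5 minutes on {{host.name}}. "
        + notify
    )
    return {
        "name": "High CPU user time",
        "type": "metric alert",
        "query": query,
        "message": message,
        # The critical threshold must match the value used in the query.
        "options": {"thresholds": {"critical": threshold}},
    }


monitor = build_cpu_monitor()
print(monitor["query"])  # avg(last_5m):avg:system.cpu.user{*} by {host} > 80
print(json.dumps(monitor, indent=2))
```

Because the query groups by host, each host is evaluated separately and {{host.name}} in the message resolves to the host that triggered the alert.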

Learning Path

flowchart LR
    A[Install Datadog Agent] --> B[Enable Integrations]
    B --> C[Infrastructure Metrics]
    A --> D[APM Instrumentation]
    D --> E[Distributed Traces]
    A --> F[Log Collection]
    C --> G[Dashboards]
    E --> G
    F --> G
    G --> H[Monitors & Alerts]
    style A fill:#4a90d9,color:#fff
    style H fill:#e67e22,color:#fff

Common Errors

  1. Agent status shows "not running" -- The Agent service failed to start. Check journalctl -u datadog-agent for error logs and verify the API key is correct.

  2. Host does not appear in the infrastructure list -- The Agent cannot connect to the Datadog backend. Verify network access to trace.agent.datadoghq.com and api.datadoghq.com.

  3. APM traces do not appear -- The ddtrace library is not instrumenting the application correctly. Ensure you run the app with ddtrace-run or call patch_all().

  4. Integration metrics are missing -- The integration configuration file has syntax errors or the target service is not running. Validate the YAML with yamllint.

  5. Custom metrics are not queryable -- The metric name has a typo or namespace mismatch. Wait up to 10 minutes for custom metrics to appear.

  6. High Agent CPU usage -- Too many integrations are enabled or the DogStatsD metrics rate is too high. Disable unused integrations.

  7. Logs not appearing in Datadog -- Log collection is not enabled in the datadog.yaml configuration. Set logs_enabled: true and restart the Agent.
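For the connectivity problem in item 2, a quick first check is whether the host can open a TCP connection to the Datadog endpoints on port 443. The sketch below uses only the standard library. The function names are illustrative, and a successful connection only shows that the port is reachable; it does not rule out proxy or TLS inspection issues.

```python
import socket


def can_connect(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_datadog_endpoints(hosts=("api.datadoghq.com", "trace.agent.datadoghq.com")):
    """Check each endpoint named in the list above and map it to a result."""
    return {host: can_connect(host) for host in hosts}
```

Run check_datadog_endpoints() on the host where the Agent is installed. Any endpoint that maps to False points to a firewall, DNS, or routing problem between that host and Datadog.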


Practice Questions

  1. How does the Datadog Agent collect metrics? Answer: The Agent collects system metrics directly, pulls metrics from integration endpoints, and accepts custom metrics through DogStatsD.

  2. What is ddtrace and how is it used? Answer: ddtrace is Datadog's tracing library, which auto-instruments Python applications for APM. It is invoked with ddtrace-run python app.py.

  3. How do you enable log collection in Datadog? Answer: Set logs_enabled: true in datadog.yaml, configure log integration files, and restart the Agent.

  4. What is the purpose of Datadog monitors? Answer: Monitors evaluate metric thresholds, anomaly conditions, or log patterns and trigger notifications when conditions are met.

  5. How does Datadog APM correlate traces with infrastructure metrics? Answer: Every trace includes host and container metadata, allowing you to see CPU, memory, and network metrics alongside the trace Waterfall.


Challenge

Set up Datadog monitoring for a two-tier application (Flask API + PostgreSQL). Install the Datadog Agent, enable the PostgreSQL integration, instrument the Flask app with ddtrace, and configure log collection. Create a dashboard with: a timeseries of request latency p99, a table of slowest database queries, a heatmap of error rates by endpoint, and a log stream filtered to ERROR level. Set up monitors for: CPU > 80% (warning), Error rate > 5% (critical), and Database connection count > 100 (info). Verify everything works by simulating load and checking the dashboard.


FAQ

How does Datadog pricing work?

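Datadog prices each product separately. Infrastructure monitoring and APM are billed per host, with additional volume-based charges for APM spans beyond the included allotment, and logs are billed by volume ingested and indexed. A free 14-day trial is available.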

Can I use Datadog for on-premises monitoring?

Datadog is primarily SaaS, but the Agent can send data from on-premises hosts. There is no on-premises backend option.

Does Datadog support OpenTelemetry?

Yes, Datadog supports OpenTelemetry ingestion. You can send OTLP data directly to the Datadog Agent or use the Datadog Exporter for the OpenTelemetry Collector.

What is the difference between Datadog and Grafana?

Datadog is a fully managed SaaS platform with integrated metrics, traces, and logs. Grafana is an open-source visualization layer that connects to various backends.

How long does Datadog retain data?

Metrics are retained at full resolution for 15 months. Traces are retained for 15 days. Log retention depends on your plan, starting at 3 days.
