How to Fix Datadog Agent Not Running or Reporting
This tutorial shows how to diagnose and fix a Datadog Agent that is not running or not reporting data. It walks through the most common causes, the commands that confirm each one, and ways to prevent a recurrence.
The Problem
The Datadog UI shows "No data received" for a host, or sudo datadog-agent status reports "Not running". The Agent may fail to start because of an invalid API key, network restrictions that block outbound HTTPS to Datadog's intake, permission issues on the configuration directory, or a corrupted configuration file in /etc/datadog-agent/.
Quick Fix
1. Check the Agent status
sudo datadog-agent status
Expected output (healthy):
========
Collector
========
Status: Healthy
Running for: 2d 14h
Emitted metrics: 42,500
Last flush: 30s ago
Expected output (unhealthy):
Status: Not running
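To use this check in a script, the status text can be reduced to a single word. The sketch below is a minimal example, not part of the Agent CLI: the function name agent_state is hypothetical, and it assumes a stopped Agent prints the string "Not running" as in the sample output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Agent CLI): classify status text read from stdin.
# Assumption: a stopped Agent prints "Not running", as in the sample output above.
agent_state() {
  local text
  text=$(cat)
  if [ -z "$text" ]; then
    echo "unknown"   # no output at all, e.g. the command was not found
  elif printf '%s\n' "$text" | grep -q "Not running"; then
    echo "down"
  else
    echo "up"
  fi
}
```

Usage, on a host where the Agent is installed: sudo datadog-agent status 2>&1 | agent_state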
2. Restart the Agent
sudo systemctl restart datadog-agent
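The restart command prints nothing on success. Confirm the service state:
sudo systemctl status datadog-agent
Expected output: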
● datadog-agent.service - Datadog Agent
Loaded: loaded (/lib/systemd/system/datadog-agent.service; enabled; vendor preset: enabled)
Active: active (running) since Mon 2024-01-15 10:30:00 UTC
Check that it started successfully:
sleep 5 && sudo datadog-agent status | head -10
3. Validate the API key
sudo datadog-agent config | grep api_key
Expected output:
api_key: ************************12345
Test the key against the Datadog API:
DATADOG_API_KEY=$(sudo awk '/^api_key:/ {print $2}' /etc/datadog-agent/datadog.yaml)
curl -X GET "https://api.datadoghq.com/api/v1/validate" \
-H "DD-API-KEY: $DATADOG_API_KEY"
Expected output:
{"valid": true}
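When this check runs from a script, the JSON response can be turned into an exit code. The sketch below is one way to do that under stated assumptions: the function name key_is_valid is hypothetical, and it only looks for the "valid": true field shown above rather than parsing the JSON fully. The URL is the one used in this step; accounts on other Datadog sites use a different API host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the validate endpoint's JSON response on stdin
# and succeed only if it contains "valid": true (whitespace around the colon is optional).
key_is_valid() {
  grep -Eq '"valid"[[:space:]]*:[[:space:]]*true'
}

# Usage (DATADOG_API_KEY set as in the step above):
#   if curl -s "https://api.datadoghq.com/api/v1/validate" \
#        -H "DD-API-KEY: $DATADOG_API_KEY" | key_is_valid; then
#     echo "API key accepted"
#   else
#     echo "API key rejected or endpoint unreachable" >&2
#   fi
```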
4. Fix network connectivity to Datadog intake
openssl s_client -connect intake.datadoghq.com:443 -servername intake.datadoghq.com < /dev/null 2>&1 | head -5
Expected output:
CONNECTED(00000003)
---
Certificate chain
0 s:CN = intake.datadoghq.com
If blocked, configure a proxy in /etc/datadog-agent/datadog.yaml:
proxy:
  http: http://proxy.example.com:3128
  https: https://proxy.example.com:3128
  no_proxy:
    - 169.254.169.254
5. Check for YAML configuration errors
sudo datadog-agent configcheck
Expected output:
=== Checks with configuration ===
loading: OK
loaded: 23 checks
running: 23
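Fix any YAML syntax or indentation errors in /etc/datadog-agent/datadog.yaml and in the check configurations under /etc/datadog-agent/conf.d/. The command below parses the main file and prints a traceback if it is invalid (it assumes PyYAML is available to python3):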
python3 -c "import yaml; yaml.safe_load(open('/etc/datadog-agent/datadog.yaml'))"
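YAML does not allow tab characters for indentation, and a stray tab is a common cause of these parse errors. The sketch below is a minimal way to find tab-indented lines across the check configurations. The function name find_tab_indents is hypothetical, and it only inspects files ending in .yaml.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print file:line:content for every line that is indented
# with a tab (optionally preceded by spaces) in the *.yaml files under a directory.
find_tab_indents() {
  local dir=$1
  local tab
  tab=$(printf '\t')
  # /dev/null is passed as an extra file so grep always prints the file name.
  find "$dir" -type f -name '*.yaml' -exec grep -n "^ *${tab}" /dev/null {} +
}
```

Usage: sudo bash -c "$(declare -f find_tab_indents); find_tab_indents /etc/datadog-agent/conf.d" or, if the files are readable by your user, call find_tab_indents /etc/datadog-agent/conf.d directly. No output means no tab-indented lines were found.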
6. Check disk space and permissions on logs directory
ls -la /var/log/datadog/
df -h /var/log/datadog/
Expected output:
total 1234
-rw-r--r-- 1 dd-agent dd-agent 12345 Jan 15 10:30 agent.log
If the directory is owned by root instead of dd-agent, fix it:
sudo chown -R dd-agent:dd-agent /var/log/datadog/
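The ownership and free-space checks can also be scripted. The sketch below is illustrative only: the function names owned_by and disk_used_pct are hypothetical, and the 90 percent threshold in the usage comment is an arbitrary example value, not a Datadog recommendation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the path ($1) is owned by the user ($2).
# Tries the GNU form of stat first and falls back to the BSD form.
owned_by() {
  local owner
  owner=$(stat -c %U "$1" 2>/dev/null || stat -f %Su "$1")
  [ "$owner" = "$2" ]
}

# Hypothetical helper: print the used-space percentage (digits only)
# of the filesystem that holds the path ($1), using POSIX df output.
disk_used_pct() {
  df -P "$1" | awk 'NR==2 {gsub("%", "", $5); print $5}'
}

# Usage:
#   owned_by /var/log/datadog dd-agent || echo "wrong owner" >&2
#   [ "$(disk_used_pct /var/log/datadog)" -lt 90 ] || echo "disk nearly full" >&2
```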
7. Restart the Agent with full debug logging
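Set log_level: debug in /etc/datadog-agent/datadog.yaml, then restart the service so the new level takes effect and read the most recent log entries. An environment variable set on the command line is not passed to the systemd-managed service, so the level has to be set in the configuration file:
sudo systemctl restart datadog-agent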
sudo journalctl -u datadog-agent --no-pager -n 30
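Debug logs are verbose, so it can help to pull out only the problem lines from /var/log/datadog/agent.log. The sketch below is a minimal filter under one assumption: that each log line carries its level between pipe characters (for example "| CORE | ERROR |"). The function name recent_problems is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last N (default 20) ERROR or WARN lines of a log file.
# Assumption: the level appears between pipes, e.g. "... | CORE | ERROR | ...".
recent_problems() {
  grep -E '\| (ERROR|WARN) \|' "$1" | tail -n "${2:-20}"
}

# Usage:
#   recent_problems /var/log/datadog/agent.log 30
```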
8. Test the Agent connectivity to Datadog
sudo datadog-agent diagnose
Expected output:
=== Connectivity Diagnosis ===
Client: OK
API Key: OK
Logs: OK
Prevention
- Enable the Agent to start on boot: sudo systemctl enable datadog-agent
- Monitor the Agent process itself through Datadog's process check (process.datadog)
- Source the API key from an environment variable or secrets manager in containerized deployments
- Add firewall rules to allow outbound HTTPS to *.datadoghq.com on port 443
- Run sudo datadog-agent status after every configuration change as a quick smoke test