Alerting with Alertmanager: Configuring Alerts and Notifications

DodaTech Updated 2026-06-23 6 min read

This tutorial covers alerting with Prometheus Alertmanager: the key concepts, practical examples, and best practices for configuring alerts and notifications.

What You Will Learn

This tutorial teaches you how to configure Prometheus Alertmanager to receive alerts from Prometheus, route them to the right teams, and send notifications through email, Slack, PagerDuty, and webhooks.

Why It Matters

Metrics without alerts are just history. When a service goes down, you need to know immediately -- not when a user complains. Alertmanager gives you the flexibility to route, group, and silence alerts so your team gets the right signal at the right time.

Real-World Use

The Durga Antivirus Pro team uses Alertmanager with multiple routes: critical security alerts go to PagerDuty with a 5-minute escalation, warning-level alerts go to Slack for the next business day, and informational alerts are grouped into a daily digest email.

Alertmanager is the Prometheus component responsible for alert handling. It deduplicates, groups, and routes alerts to notification channels. It also handles silencing and inhibition -- ensuring that a single root cause does not trigger a cascade of redundant notifications.


Prerequisites


Step-by-Step Tutorial

Step 1: Deploy Alertmanager

docker run -d --name alertmanager \
  -p 9093:9093 \
  -v "$(pwd)/alertmanager.yml:/etc/alertmanager/alertmanager.yml" \
  prom/alertmanager:v0.27.0

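Expected output: Alertmanager listens on port 9093. Visit http://localhost:9093 for the web UI. The mounted alertmanager.yml (created in Step 2) must exist before you run this command; otherwise Docker creates an empty directory at that path and Alertmanager fails to start.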

Step 2: Create the Alertmanager Configuration

Create alertmanager.yml:

route:
  receiver: "default"
  group_by: ["alertname", "severity"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: "default"
    slack_configs:
      - api_url: "https://hooks.slack.com/services/T0000/B0000/XXXXX"
        channel: "#alerts"
        send_resolved: true

This sends all alerts to a Slack channel. Alerts are grouped by name and severity.
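
The grouping behaviour can be sketched in a few lines of Python. This is only an illustration of the idea behind group_by, not Alertmanager's implementation; the alerts and the helper name group_alerts are made up for the example.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alerts by the values of the group_by labels.

    A label that is missing on an alert is treated as an empty string.
    """
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


# Hypothetical alerts, for illustration only.
alerts = [
    {"labels": {"alertname": "InstanceDown", "severity": "critical", "instance": "web-1"}},
    {"labels": {"alertname": "InstanceDown", "severity": "critical", "instance": "web-2"}},
    {"labels": {"alertname": "HighCPUUsage", "severity": "warning", "instance": "web-1"}},
]

groups = group_alerts(alerts, ["alertname", "severity"])
for key, members in groups.items():
    # Each group would become one notification listing all of its members.
    print(key, [a["labels"]["instance"] for a in members])
```

With group_by set to alertname and severity, the two InstanceDown alerts share one group and would arrive as a single Slack message, while HighCPUUsage forms its own group.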

Step 3: Configure Prometheus to Use Alertmanager

In prometheus.yml:

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]

rule_files:
  - "alert-rules.yml"

Step 4: Write Alerting Rules

Create alert-rules.yml:

groups:
  - name: infrastructure
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} has been down for more than 5 minutes."

      - alert: HighCPUUsage
        expr: 100 - (avg by(instance)(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "CPU usage above 80% on {{ $labels.instance }}"
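
The for clause means an alert must stay true across consecutive evaluations before it fires. The sketch below models that as a small state machine. It is a simplification under stated assumptions: the function name alert_states, the 60-second evaluation interval, and the sample data are all hypothetical.

```python
def alert_states(samples, for_seconds, interval_seconds):
    """Return the alert state after each rule evaluation.

    samples: one boolean per evaluation, True when the expression matches.
    """
    states = []
    active_since = None
    for i, condition in enumerate(samples):
        now = i * interval_seconds
        if not condition:
            # Any evaluation where the expression is false resets the timer.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        held = now - active_since
        states.append("firing" if held >= for_seconds else "pending")
    return states


# up == 0 for three evaluations, a brief recovery, then a sustained outage.
samples = [True, True, True, False] + [True] * 7
states = alert_states(samples, for_seconds=300, interval_seconds=60)
for i, state in enumerate(states):
    print(f"t={i * 60:>3}s  {state}")
```

The brief recovery at the fourth evaluation resets the timer, so the alert only fires after a further five uninterrupted minutes. This is why a short for value can still page during rolling updates.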

Step 5: Add Multiple Routes

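route:
  receiver: "default"  # must also be listed under receivers (defined in Step 2)
  routes:
    # Routes are tried top to bottom and the first match wins.
    # `match` still works but is deprecated in favor of `matchers`.
    - match:
        severity: critical
      receiver: "pagerduty-critical"
      repeat_interval: 5m
    - match:
        severity: warning
      receiver: "slack-warnings"
      repeat_interval: 4h
    # Reached only by InstanceDown alerts that are neither critical nor warning.
    - match:
        alertname: "InstanceDown"
      receiver: "pagerduty-critical"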

receivers:
  - name: "pagerduty-critical"
    pagerduty_configs:
      - routing_key: "YOUR_PAGERDUTY_KEY"
        severity: critical

  - name: "slack-warnings"
    slack_configs:
      - api_url: "https://hooks.slack.com/services/T0000/B0000/XXXXX"
        channel: "#warnings"
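
Route order matters because, without continue: true, the first matching child route wins. The following sketch mirrors the Step 5 routing tree as a Python dictionary and walks it the same way. It is a simplified model (exact-match labels only, no regex matchers, no continue), and pick_receiver is a hypothetical helper name.

```python
ROUTE_TREE = {
    "receiver": "default",
    "routes": [
        {"match": {"severity": "critical"}, "receiver": "pagerduty-critical"},
        {"match": {"severity": "warning"}, "receiver": "slack-warnings"},
        {"match": {"alertname": "InstanceDown"}, "receiver": "pagerduty-critical"},
    ],
}


def pick_receiver(labels, route):
    """Depth-first, first-match routing; falls back to the parent's receiver."""
    for child in route.get("routes", []):
        wanted = child.get("match", {})
        if all(labels.get(name) == value for name, value in wanted.items()):
            return pick_receiver(labels, child)
    return route["receiver"]


examples = [
    {"alertname": "InstanceDown", "severity": "critical"},
    {"alertname": "HighCPUUsage", "severity": "warning"},
    {"alertname": "InstanceDown", "severity": "warning"},
    {"alertname": "DiskNote", "severity": "info"},
]
for labels in examples:
    print(labels, "->", pick_receiver(labels, ROUTE_TREE))
```

An InstanceDown alert labelled warning lands in slack-warnings, because the severity route is listed before the alertname route. To page on every InstanceDown regardless of severity, the alertname route would have to come first.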

Step 6: Configure Email Notifications

receivers:
  - name: "email-alerts"
    email_configs:
      - to: "team@dodatech.com"
        from: "alertmanager@dodatech.com"
        smarthost: "smtp.sendgrid.net:587"
        auth_username: "apikey"
        auth_identity: "apikey"
        auth_password: "SG.XXXXX"

Step 7: Set Up Silences and Inhibitions

Inhibitions prevent redundant alerts when a root cause is already firing:

inhibit_rules:
  - source_match:
      severity: "critical"
    target_match:
      severity: "warning"
    equal: ["instance"]

This suppresses all warning-level alerts on an instance that already has a critical alert.
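
To make the role of equal concrete, here is a minimal Python sketch of the check this rule performs. It assumes exact-match labels only and treats a missing label as an empty string; is_inhibited and the sample alerts are illustrative, not Alertmanager code.

```python
RULE = {
    "source_match": {"severity": "critical"},
    "target_match": {"severity": "warning"},
    "equal": ["instance"],
}


def matches(labels, wanted):
    return all(labels.get(name) == value for name, value in wanted.items())


def is_inhibited(target, firing, rule):
    """True if some other firing alert matches the source side of the rule
    and agrees with the target on every label listed in `equal`."""
    if not matches(target, rule["target_match"]):
        return False
    for source in firing:
        if source is target or not matches(source, rule["source_match"]):
            continue
        if all(source.get(l, "") == target.get(l, "") for l in rule["equal"]):
            return True
    return False


# Hypothetical firing alerts (label sets only).
down = {"alertname": "InstanceDown", "severity": "critical", "instance": "web-1"}
cpu_web1 = {"alertname": "HighCPUUsage", "severity": "warning", "instance": "web-1"}
cpu_web2 = {"alertname": "HighCPUUsage", "severity": "warning", "instance": "web-2"}
firing = [down, cpu_web1, cpu_web2]

print(is_inhibited(cpu_web1, firing, RULE))  # same instance as the critical alert
print(is_inhibited(cpu_web2, firing, RULE))  # different instance
```

Only the warning on web-1 is muted. The warning on web-2 still notifies because its instance label differs from the critical alert's.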

To silence an alert from the web UI:

  1. Go to http://localhost:9093/#/silences
  2. Click New Silence
  3. Set matchers: alertname="HighCPUUsage"
  4. Set duration and comment
  5. Create
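
The same silence can also be created programmatically. The sketch below only builds the JSON body; it assumes the v2 API, where such a body would be sent as a POST to http://localhost:9093/api/v2/silences. The helper name build_silence, the creator, and the comment text are placeholders.

```python
import json
from datetime import datetime, timedelta, timezone

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # RFC 3339 timestamp in UTC


def build_silence(alertname, hours, created_by, comment, now=None):
    """Build a silence body that mutes one alert name for a number of hours."""
    now = now or datetime.now(timezone.utc)
    return {
        "matchers": [
            {"name": "alertname", "value": alertname, "isRegex": False, "isEqual": True}
        ],
        "startsAt": now.strftime(TIME_FORMAT),
        "endsAt": (now + timedelta(hours=hours)).strftime(TIME_FORMAT),
        "createdBy": created_by,
        "comment": comment,
    }


silence = build_silence(
    "HighCPUUsage",
    hours=2,
    created_by="ops-oncall",          # placeholder
    comment="Planned load test",      # placeholder
    now=datetime(2026, 1, 1, tzinfo=timezone.utc),
)
print(json.dumps(silence, indent=2))
```

Fixing now in the call keeps the example reproducible; in real use you would omit it so the silence starts at the current time.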

Step 8: Test via the API

curl -X POST http://localhost:9093/api/v2/alerts \
  -H "Content-Type: application/json" \
  -d '[
    {
      "labels": {
        "alertname": "TestAlert",
        "severity": "critical",
        "instance": "test-01"
      },
      "annotations": {
        "summary": "This is a test alert"
      }
    }
  ]'

Expected output: The alert appears in the Alertmanager web UI and triggers the configured route.
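
When several routes need testing, a small script is easier to repeat than hand-edited curl commands. This sketch assumes Alertmanager is reachable at http://localhost:9093 and uses the v2 endpoint /api/v2/alerts (the v1 API was removed in Alertmanager v0.27.0). The helper names are hypothetical, and the network call is left commented out so the script can be read and run without a server.

```python
import json
import urllib.request

ALERTMANAGER_URL = "http://localhost:9093"  # assumption: the instance from Step 1


def build_test_alert(alertname, severity, instance):
    """Build one alert in the shape the alerts endpoint expects."""
    return {
        "labels": {"alertname": alertname, "severity": severity, "instance": instance},
        "annotations": {"summary": f"Test alert for the {severity} route"},
    }


def post_alerts(alerts, base_url=ALERTMANAGER_URL):
    """POST a list of alerts and return the HTTP status code."""
    request = urllib.request.Request(
        f"{base_url}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# One alert per severity, so each route gets exercised.
alerts = [build_test_alert("TestAlert", s, "test-01") for s in ("critical", "warning", "info")]
print(json.dumps(alerts, indent=2))
# post_alerts(alerts)  # uncomment once Alertmanager is running
```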

Step 9: Use Webhook Receivers for Custom Integrations

receivers:
  - name: "webhook"
    webhook_configs:
      - url: "http://internal-webhook.dodatech.com/alert"
        send_resolved: true

Your webhook endpoint receives a JSON payload with the alert data.
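
A minimal receiver might look like the sketch below, built on Python's standard http.server. The sample payload is trimmed to a few fields and its values are invented; the port, the handler name, and the summarize helper are assumptions for illustration, and a production endpoint would need authentication and error handling.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize(payload):
    """Turn a webhook payload into one line per alert."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        lines.append(
            f'{alert.get("status", "?")}: {labels.get("alertname", "?")} '
            f'on {labels.get("instance", "?")}'
        )
    return lines


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        for line in summarize(payload):
            print(line)
        # Answer quickly; hand slow work to a queue or background worker.
        self.send_response(200)
        self.end_headers()


def serve(port=8080):
    """Start the receiver (blocks until interrupted)."""
    HTTPServer(("", port), AlertHandler).serve_forever()


# Trimmed example of what Alertmanager posts; values are illustrative.
sample_payload = {
    "version": "4",
    "status": "firing",
    "receiver": "webhook",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "InstanceDown", "instance": "web-1"},
            "annotations": {"summary": "Instance web-1 down"},
        }
    ],
}
print(summarize(sample_payload))
```

Calling serve() starts the listener; the url in webhook_configs would then point at that host and port.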


Learning Path

flowchart LR
    A[Alert Rules] -->|fires| B[Prometheus]
    B -->|sends| C[Alertmanager]
    C --> D{Routing}
    D --> E[Slack]
    D --> F[PagerDuty]
    D --> G[Email]
    D --> H[Webhook]
    C --> I[Silences]
    C --> J[Inhibitions]
    style D fill:#4a90d9,color:#fff
    style C fill:#e67e22,color:#fff

Common Errors

  1. Alerts are firing but no notifications are sent -- The Alertmanager configuration has a routing error. Check Alertmanager logs with docker logs alertmanager.

  2. Slack notification says "invalid_url" -- The Slack webhook URL is incorrect or expired. Generate a new webhook in the Slack API dashboard.

  3. Email notifications are not delivered -- SMTP credentials are wrong or the smarthost is not reachable. Test with curl to verify the SMTP endpoint.

  4. Alerts repeat too frequently -- The repeat_interval is set too low. Increase it to 4h or more for warning-level alerts.

  5. The InstanceDown alert fires during rolling updates -- The for: 5m clause is not long enough. Increase the duration to accommodate your deployment cycle.

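  6. Webhook receiver times out -- The endpoint takes too long to respond. Alertmanager gives up on a notification attempt that outlasts the route's group_interval and retries it, so make sure your endpoint returns a 2xx response quickly and defers slow work.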

  7. Inhibition does not suppress warnings -- The equal label must match exactly. Ensure the source and target alerts share the same label values.
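
One routing mistake behind the first error is a route that names a receiver which is never defined. amtool check-config catches this kind of problem; the sketch below shows the same idea in Python on an already-parsed config. The dictionary mirrors the Step 5 fragment taken on its own, and the helper names are hypothetical.

```python
def referenced_receivers(route):
    """Collect every receiver name used anywhere in the routing tree."""
    names = set()
    if "receiver" in route:
        names.add(route["receiver"])
    for child in route.get("routes", []):
        names |= referenced_receivers(child)
    return names


def undefined_receivers(config):
    """Return receiver names that routes use but the config never defines."""
    defined = {receiver["name"] for receiver in config.get("receivers", [])}
    return sorted(referenced_receivers(config["route"]) - defined)


# The Step 5 fragment as a dictionary; "default" is referenced but not defined here.
config = {
    "route": {
        "receiver": "default",
        "routes": [
            {"match": {"severity": "critical"}, "receiver": "pagerduty-critical"},
            {"match": {"severity": "warning"}, "receiver": "slack-warnings"},
        ],
    },
    "receivers": [
        {"name": "pagerduty-critical"},
        {"name": "slack-warnings"},
    ],
}

print(undefined_receivers(config))
```

Adding the default receiver from Step 2 to the receivers list makes the result empty.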


Practice Questions

  1. What port does Alertmanager listen on by default? Answer: 9093.

  2. What is the purpose of the group_by parameter in the route configuration? Answer: It groups alerts by specified labels so that related alerts arrive as a single notification instead of many individual ones.

  3. How do you prevent alert notifications during scheduled maintenance? Answer: Create a silence in the Alertmanager web UI or API matching the specific instance or alert name. The alert still fires in Prometheus; the silence only mutes its notifications.

  4. What is an inhibition rule? Answer: It suppresses lower-severity alerts when a higher-severity alert is already firing for the same entity.

  5. How do you configure Alertmanager to send alerts to multiple destinations? Answer: Define multiple receivers and route alerts to different receivers using match conditions on labels.


Challenge

Set up a complete alerting pipeline for a production environment. Configure Prometheus alerting rules for: InstanceDown (critical), HighCPUUsage (warning), HighMemoryUsage (warning), DiskSpaceFull (critical), and HighErrorRate (critical based on application metrics). Route critical alerts to PagerDuty with escalation, warnings to Slack in #ops channel, and disk alerts to email. Set up an inhibition rule that suppresses all warnings on an instance that is already down. Create a webhook receiver that forwards alerts to an internal incident management system. Verify every route by firing test alerts through the API.


FAQ

What is the difference between group_wait and group_interval?

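group_wait is how long Alertmanager waits before sending the first notification for a new group, so that alerts arriving close together are batched. group_interval is how long it waits before sending a notification about new alerts added to a group that has already been notified. repeat_interval, by contrast, controls how often an unchanged, still-firing group is re-sent.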
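
A rough timeline model can make the timers easier to compare. The sketch below is a deliberate simplification, not Alertmanager's scheduler: it assumes one group, alerts that never resolve, a first flush at group_wait, and later flushes every group_interval that notify only when the group has changed or repeat_interval has elapsed. All times are in seconds and the function name is hypothetical.

```python
def notification_times(arrivals, group_wait, group_interval, repeat_interval, horizon):
    """Return the times at which a single alert group would be notified."""
    arrivals = sorted(arrivals)
    times = []
    flush = arrivals[0] + group_wait  # first flush waits for stragglers
    last_sent = None
    sent_count = 0
    while flush <= horizon:
        seen = sum(1 for t in arrivals if t <= flush)
        changed = seen > sent_count
        repeat_due = last_sent is not None and flush - last_sent >= repeat_interval
        if changed or repeat_due:
            times.append(flush)
            last_sent = flush
            sent_count = seen
        flush += group_interval
    return times


# Values from Step 2: group_wait 30s, group_interval 5m, repeat_interval 4h.
# One alert arrives at t=0 and a second joins the same group at t=100.
times = notification_times(
    arrivals=[0, 100],
    group_wait=30,
    group_interval=300,
    repeat_interval=4 * 3600,
    horizon=15000,
)
print(times)  # [30, 330, 14730]
```

The first notification goes out after group_wait, the late arrival is reported one group_interval later, and the unchanged group is repeated four hours after that.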

Can I use Alertmanager without Prometheus?

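Yes. Alertmanager accepts alerts from any client that posts them to its HTTP API, as the curl command in Step 8 does. It does not ingest or evaluate metrics itself, so something else (usually Prometheus) must decide when an alert fires.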

How do I manage Alertmanager configuration?

Store the YAML file in version control and use a deployment pipeline to apply changes. Alertmanager reloads configuration on SIGHUP.

Does Alertmanager support multi-team routing?

Yes, you can define nested routes with different matchers to create team-specific notification channels.

What happens if Alertmanager is down?

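Prometheus keeps evaluating rules and periodically re-sends every alert that is still firing, so those alerts reach Alertmanager once it is back. Notifications are delayed during the outage, and alerts that both fired and resolved while Alertmanager was down are never delivered.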
