18. API Monitoring

DodaTech 4 min read

title: "API Monitoring Tests"
description: "Monitor API health and performance in production using synthetic monitoring tests. Learn health checks, SLA validation, multi-step monitors, alerting, and integration with monitoring platforms."
weight: 18
date: 2026-06-28
lastmod: 2026-06-28
tags: [api-development, testing]

API monitoring tests continuously validate API health, performance, and correctness in production. Synthetic monitors run against live APIs at regular intervals, alerting teams when SLAs are breached or errors occur.

What You'll Learn

  • Synthetic monitoring fundamentals
  • Health check endpoints and patterns
  • Multi-step transaction monitors
  • SLA validation and alerting
  • Integration with monitoring platforms

Why It Matters

Unit and integration tests catch issues before deployment. Monitoring catches issues after deployment: infrastructure failures, performance degradation, and data corruption. Proactive monitoring lets teams detect and fix these problems early, often before most customers notice them.

Real-World Use

Stripe monitors payment APIs from multiple geographic regions. GitHub runs synthetic monitors for API availability. Cloud providers like AWS have health dashboards powered by monitoring tests.

flowchart LR
    Monitor[Monitoring Service] --> API[Production API]
    Monitor --> Check1[Health Check]
    Monitor --> Check2[Multi-Step Flow]
    Monitor --> Check3[Performance Check]
    Check1 --> Alert[Alert on Failure]
    Check2 --> Alert
    Check3 --> Alert
    Alert --> Pager[On-Call Engineer]
    Monitor --> Dashboard[Status Dashboard]

Teacher Mindset

Monitor what matters: availability, latency, and correctness. Run monitors from multiple locations. Alert on symptoms (user-facing failures), not causes (infrastructure metrics). Escalate any alert that is not acknowledged within a set time.
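The escalation rule above can be sketched as a small pure function. This is a minimal illustration, not a feature of any paging product: the tier names, the waiting times, and the names `escalationPolicy` and `currentEscalationTarget` are all assumptions made for the example.

```javascript
// Hypothetical escalation policy: who is notified after how many minutes
// without acknowledgement. Tier names and timings are illustrative assumptions.
const escalationPolicy = [
  { afterMinutes: 0, notify: 'primary-on-call' },
  { afterMinutes: 15, notify: 'secondary-on-call' },
  { afterMinutes: 30, notify: 'engineering-manager' }
];

// Returns the tier to notify for an open alert, or null once it is acknowledged.
function currentEscalationTarget(openAlert, nowMs, policy = escalationPolicy) {
  if (openAlert.acknowledged) return null;
  const minutesOpen = (nowMs - openAlert.openedAtMs) / 60000;
  // Pick the last tier whose waiting time has already elapsed
  let target = policy[0].notify;
  for (const tier of policy) {
    if (minutesOpen >= tier.afterMinutes) target = tier.notify;
  }
  return target;
}

const openedAtMs = Date.UTC(2026, 0, 1, 12, 0, 0);
const openAlert = { openedAtMs, acknowledged: false };
console.log(currentEscalationTarget(openAlert, openedAtMs + 5 * 60000));  // primary-on-call
console.log(currentEscalationTarget(openAlert, openedAtMs + 20 * 60000)); // secondary-on-call
```

Keeping the decision separate from the delivery code makes the policy easy to test without paging anyone.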

Code Examples

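// Example 1: Simple health check monitor
const http = require('http');

const TIMEOUT_MS = 5000;

function checkHealth() {
  const start = Date.now();

  return new Promise((resolve) => {
    // A promise settles only once, so any later finish() call is a no-op
    const finish = (fields) => resolve({
      ...fields,
      duration: Date.now() - start,
      timestamp: new Date().toISOString()
    });

    const req = http.get('http://api.example.com/health', (res) => {
      let body = '';
      res.on('data', (chunk) => body += chunk);
      res.on('end', () => {
        try {
          finish({ status: res.statusCode, body: JSON.parse(body) });
        } catch (err) {
          // A non-JSON body is a failed check, not a reason to crash the monitor
          finish({ status: 0, error: `Invalid JSON: ${err.message}` });
        }
      });
    });

    req.on('error', (err) => finish({ status: 0, error: err.message }));

    // Abort slow requests; destroy() also emits 'error', but the first finish() wins
    req.setTimeout(TIMEOUT_MS, () => {
      finish({ status: 0, error: 'Timeout' });
      req.destroy();
    });
  });
}

// Placeholder alert hook: replace with a PagerDuty, Opsgenie, or Slack integration
async function sendAlert(result) {
  console.error('ALERT:', JSON.stringify(result));
}

function monitorLoop() {
  setInterval(async () => {
    try {
      const result = await checkHealth();
      if (result.status !== 200 || result.duration > 2000) {
        console.error('Health check failed:', result);
        await sendAlert(result);
      }
    } catch (err) {
      // Keep the loop alive even if alerting itself fails
      console.error('Monitor error:', err);
    }
  }, 30000);
}

monitorLoop();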

# Example 2: Postman monitor configuration
# Illustrative outline of a collection that a Postman Monitor runs every 5 minutes
# (Postman stores collections as JSON; YAML is used here for readability)
name: Production API Monitor
requests:
  - name: Health Check
    method: GET
    url: "{{base_url}}/health"
    tests: |
      pm.test("Status is 200", () => pm.response.to.have.status(200));
      pm.test("Response time < 2s", () => pm.expect(pm.response.responseTime).to.be.below(2000));

  - name: User Login Flow
    method: POST
    url: "{{base_url}}/auth/login"
    header:
      - key: Content-Type
        value: application/json
    body:
      mode: raw
      raw: '{"username": "monitor", "password": "{{monitor_password}}"}'
    tests: |
      pm.test("Login successful", () => pm.response.to.have.status(200));
      pm.test("Token received", () => pm.expect(pm.response.json().token).to.be.a('string'));
      // Save the token so the next request can resolve {{auth_token}}
      pm.environment.set("auth_token", pm.response.json().token);

  - name: Authenticated API Call
    method: GET
    url: "{{base_url}}/api/users/me"
    header:
      - key: Authorization
        value: "Bearer {{auth_token}}"
    tests: |
      pm.test("Authenticated call works", () => pm.response.to.have.status(200));
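
// Example 3: SLA validation with Prometheus metrics
const promClient = require('prom-client');

const ENDPOINT = '/api/users';

const apiAvailability = new promClient.Gauge({
  name: 'api_availability',
  help: 'API availability (1 = up, 0 = down)',
  labelNames: ['endpoint']
});

// A gauge holds only the latest value; percentiles such as p95 need a histogram
const apiLatency = new promClient.Gauge({
  name: 'api_latency_ms',
  help: 'API response latency in milliseconds',
  labelNames: ['endpoint']
});

// checkEndpoint and sendAlert are assumed helpers defined elsewhere:
// checkEndpoint(path) resolves to { status, duration } like checkHealth in Example 1,
// and sendAlert delivers the alert to your paging or chat tool.
async function monitorAndRecord() {
  const result = await checkEndpoint(ENDPOINT);
  apiAvailability.set({ endpoint: ENDPOINT }, result.status === 200 ? 1 : 0);
  apiLatency.set({ endpoint: ENDPOINT }, result.duration);

  // Alert if SLA breached (slower than 2 seconds or not a 200 response)
  if (result.duration > 2000 || result.status !== 200) {
    await sendAlert({
      message: `SLA breach on ${ENDPOINT}`,
      duration: result.duration,
      status: result.status
    });
  }
}

// Run once a minute (example interval); log errors instead of crashing the process
setInterval(() => monitorAndRecord().catch(console.error), 60000);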

Common Mistakes

  • Monitoring only the health endpoint without testing real functionality
  • Not monitoring from multiple geographic locations
  • Setting alert thresholds too tight, causing alert fatigue
  • Not testing authentication and user flows in monitors
  • Ignoring slow degradation that does not trigger immediate alerts
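Two of the mistakes above, alert fatigue and unnoticed slow degradation, can be reduced with a little state in the monitor. The sketch below is one possible approach under stated assumptions: the function name `createAlertPolicy`, the threshold of three consecutive failures, the window size, and the 1.5x degradation factor are all illustrative values, not recommendations.

```javascript
// Tracks recent check results and decides when to notify someone.
// All thresholds are illustrative assumptions.
function createAlertPolicy({ failuresToAlert = 3, windowSize = 20, degradationFactor = 1.5 } = {}) {
  let consecutiveFailures = 0;
  const durations = []; // durations of recent successful checks, oldest first

  // Returns 'alert', 'degraded', or 'none' for each recorded result
  return function record(result) {
    const failed = result.status !== 200;
    consecutiveFailures = failed ? consecutiveFailures + 1 : 0;
    if (!failed) {
      durations.push(result.duration);
      if (durations.length > windowSize) durations.shift();
    }

    // Fire exactly once, when the failure streak reaches the threshold
    if (consecutiveFailures === failuresToAlert) return 'alert';

    // Compare the newer half of the window with the older half to catch slow drift
    if (!failed && durations.length === windowSize) {
      const half = Math.floor(windowSize / 2);
      const avg = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
      const older = avg(durations.slice(0, half));
      const newer = avg(durations.slice(half));
      if (newer > older * degradationFactor) return 'degraded';
    }
    return 'none';
  };
}

const record = createAlertPolicy({ failuresToAlert: 3 });
const outcomes = [200, 500, 500, 500, 500].map((status) => record({ status, duration: 100 }));
console.log(outcomes); // [ 'none', 'none', 'none', 'alert', 'none' ]
```

A single failed check produces no page, and a streak produces exactly one, which keeps a flapping endpoint from flooding the on-call channel.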

Practice

  1. Create a health check endpoint in your API.
  2. Write a synthetic monitor that checks health every minute.
  3. Add a multi-step monitor that tests a user flow.
  4. Set up alerting when response time exceeds 2 seconds.
  5. Challenge: Build a monitoring dashboard that shows API availability over the last 30 days.
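As a starting point for the first exercise, here is a minimal sketch of a health check endpoint using only Node's built-in `http` module. The dependency names (`database`, `cache`) and the helper names `buildHealthResponse` and `createHealthServer` are hypothetical; substitute the checks your API actually depends on.

```javascript
const http = require('http');

// Builds the health payload from dependency checks (name -> boolean).
function buildHealthResponse(checks) {
  const healthy = Object.values(checks).every(Boolean);
  return {
    statusCode: healthy ? 200 : 503,
    body: { status: healthy ? 'ok' : 'degraded', checks, timestamp: new Date().toISOString() }
  };
}

// Creates (but does not start) a server that exposes GET /health.
// getChecks is an async function supplied by the caller.
function createHealthServer(getChecks) {
  return http.createServer(async (req, res) => {
    if (req.method !== 'GET' || req.url !== '/health') {
      res.writeHead(404);
      res.end();
      return;
    }
    let response;
    try {
      response = buildHealthResponse(await getChecks());
    } catch (err) {
      // If the checks themselves fail, report the service as unhealthy
      response = { statusCode: 503, body: { status: 'error', error: err.message } };
    }
    res.writeHead(response.statusCode, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(response.body));
  });
}

// Usage (port 3000 is an example value):
// createHealthServer(async () => ({ database: true, cache: true })).listen(3000);
console.log(buildHealthResponse({ database: true, cache: false }).statusCode); // 503
```

Returning 503 when a dependency is down lets a status-code-only monitor detect the problem without parsing the body.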

FAQ

What is the difference between monitoring and testing?

Testing catches issues before deployment. Monitoring catches issues after deployment. Both are necessary.

How often should I run monitors?

Every 1-5 minutes for critical endpoints. Every 15-60 minutes for less critical paths.

Should I monitor from multiple locations?

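Yes. A failure seen from a single location may be a local network problem rather than an API outage. Cloud providers such as AWS and GCP operate in many regions that can host monitors. Monitor from at least 3 locations.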
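One way to use results from several locations is to require agreement before declaring an outage. The following is a minimal sketch under assumptions: the function name `isOutage`, the region labels, and the quorum of 2 out of 3 are illustrative choices.

```javascript
// Declares an outage only when at least `quorum` locations see a failure.
// Region names and the default quorum are illustrative assumptions.
function isOutage(resultsByLocation, quorum = 2) {
  const failing = Object.entries(resultsByLocation)
    .filter(([, result]) => result.status !== 200)
    .map(([location]) => location);
  return { outage: failing.length >= quorum, failing };
}

const verdict = isOutage({
  'us-east': { status: 200 },
  'eu-west': { status: 0 },
  'ap-south': { status: 200 }
});
console.log(verdict); // { outage: false, failing: [ 'eu-west' ] }
```

A single failing location is still worth recording, since it may point to a regional routing or DNS problem, but it does not need to page the on-call engineer.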

What SLAs should I monitor?

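Availability (uptime), latency (p95, p99), error rate (percentage of failed responses, typically 5xx statuses and timeouts, rather than every non-200 response, since codes such as 201 or 204 are successes), and correctness (response structure).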
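These SLA figures can be computed from stored check results. The sketch below assumes each sample has the `{ status, duration }` shape used in the code examples, treats status 0 (no response) and 5xx as errors, and uses the nearest-rank method for percentiles; the function names `percentile` and `summarizeSla` are hypothetical.

```javascript
// Nearest-rank percentile of an ascending-sorted array (p between 0 and 100).
function percentile(sorted, p) {
  if (sorted.length === 0) return null;
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Summarizes availability, error rate, and latency percentiles for a set of checks.
function summarizeSla(samples) {
  const total = samples.length;
  if (total === 0) return null;
  const ok = samples.filter((s) => s.status >= 200 && s.status < 300);
  // Assumption: unreachable checks (status 0) and 5xx responses count as errors
  const errors = samples.filter((s) => s.status === 0 || s.status >= 500);
  // Latency percentiles are taken over successful checks only
  const durations = ok.map((s) => s.duration).sort((a, b) => a - b);
  return {
    availabilityPct: (ok.length * 100) / total,
    errorRatePct: (errors.length * 100) / total,
    p95: percentile(durations, 95),
    p99: percentile(durations, 99)
  };
}

// 18 successful checks (10 ms to 180 ms), one 500 response, one timeout
const samples = [];
for (let i = 1; i <= 18; i++) samples.push({ status: 200, duration: i * 10 });
samples.push({ status: 500, duration: 40 }, { status: 0, duration: 5000 });

console.log(summarizeSla(samples));
// { availabilityPct: 90, errorRatePct: 10, p95: 180, p99: 180 }
```

With few samples, p95 and p99 collapse to the slowest request, so percentile targets are only meaningful over windows with many checks.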

How do I set up alerting?

Use PagerDuty, Opsgenie, or Slack webhooks. Alert on symptoms (user sees error) not causes (CPU high).
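For the Slack option, an incoming webhook accepts a JSON body with a `text` field. The sketch below is a minimal illustration: the function names `buildSlackPayload` and `sendSlackAlert`, the incident fields, and the message wording are assumptions, and the webhook URL must come from your own Slack workspace. It relies on the global `fetch` available in Node 18 and later.

```javascript
// Builds the JSON body for a Slack incoming webhook.
// The message describes the user-facing symptom, not an infrastructure cause.
function buildSlackPayload(incident) {
  return {
    text: `${incident.endpoint} is failing for users: ` +
      `status ${incident.status}, ${incident.duration} ms ` +
      `(threshold ${incident.thresholdMs} ms)`
  };
}

// Posts the alert. webhookUrl is a placeholder supplied by the caller.
async function sendSlackAlert(webhookUrl, incident) {
  const res = await fetch(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildSlackPayload(incident))
  });
  if (!res.ok) throw new Error(`Slack webhook returned ${res.status}`);
}

const payload = buildSlackPayload({
  endpoint: '/api/users',
  status: 503,
  duration: 2400,
  thresholdMs: 2000
});
console.log(payload.text);
// /api/users is failing for users: status 503, 2400 ms (threshold 2000 ms)
```

Throwing on a non-2xx webhook response matters here: a silently dropped alert is worse than a noisy one.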

Mini Project

Set up production monitoring for your API. Create health and multi-step monitors. Configure Prometheus metrics for availability and latency. Set up alerting via Slack when an SLA is breached (response time > 2s or error rate > 1%).

What's Next

Next, you will learn about test data management strategies for API testing.
