Anomaly Detection — Complete AI-Powered API Monitoring Guide
This tutorial covers anomaly detection for API monitoring: the key concepts, a worked Python example, and common mistakes to avoid.
Anomaly detection automatically identifies unusual patterns in API metrics that deviate from normal behavior. It catches issues that static thresholds miss, like gradual degradation or subtle changes in traffic patterns.
What You'll Learn
You'll learn statistical and ML-based anomaly detection techniques for API metrics.
Why It Matters
Static thresholds miss slow degradations and complex anomalies. Anomaly detection catches subtle issues before they become critical.
Real-World Use
An API's normal traffic is 1000 requests/second. Anomaly detection spots a gradual increase to 1200 req/s over 2 hours. Investigation reveals a scraper slowly ramping up to avoid detection. Static thresholds (e.g., >2000) would not catch this.
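The scenario above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production detector: it simulates one sample per minute for two hours, with a noise level chosen arbitrarily, and compares a static `> 2000` check against the slope of a least-squares line fitted to the same window. The function names and the `slope_limit` of 0.5 req/s per minute are hypothetical.

```python
import numpy as np


def static_threshold_alert(samples, limit=2000):
    """Return True if any sample exceeds the fixed limit."""
    return bool(np.max(samples) > limit)


def trend_alert(samples, slope_limit=0.5):
    """Return True if the fitted linear slope exceeds slope_limit per sample."""
    x = np.arange(len(samples))
    slope, _intercept = np.polyfit(x, samples, 1)
    return bool(slope > slope_limit)


rng = np.random.default_rng(0)
minutes = 120  # two hours, one sample per minute (assumed sampling rate)

# Slow ramp from 1000 to 1200 req/s, plus noise (std of 20 is an assumption).
ramp = np.linspace(1000, 1200, minutes) + rng.normal(0, 20, minutes)
# Flat traffic at 1000 req/s with the same noise, for comparison.
flat = np.full(minutes, 1000.0) + rng.normal(0, 20, minutes)

print("static threshold fires on ramp:", static_threshold_alert(ramp))  # False
print("trend check fires on ramp:", trend_alert(ramp))                  # True
print("trend check fires on flat:", trend_alert(flat))                  # False
```

The static check stays silent because the ramp never gets near 2000, while the fitted slope (about 1.7 req/s per minute) is well above the limit. In practice the slope limit would need tuning against the metric's normal daily trend.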
Implementation
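from collections import deque

import numpy as np


class StatisticalAnomalyDetector:
    """Flag values whose z-score against a rolling window exceeds a threshold."""

    def __init__(self, window_size=100, z_threshold=3):
        self.window = deque(maxlen=window_size)
        self.z_threshold = z_threshold

    def add_observation(self, value):
        # Score the value against the existing window before appending it,
        # so an outlier cannot inflate the mean and std it is compared to.
        is_anomaly = False
        if len(self.window) >= 10:  # wait for a minimal baseline
            mean = np.mean(self.window)
            std = np.std(self.window)
            if std > 0:
                z_score = abs(value - mean) / std
                if z_score > self.z_threshold:
                    print(f"Anomaly detected: {value:.2f}"
                          f" (z-score: {z_score:.2f}, mean: {mean:.2f})")
                    is_anomaly = True
        self.window.append(value)
        return is_anomaly


class SeasonalDecomposition:
    """Compare each value with past values seen at the same hour.

    This is a per-hour z-score baseline, not a full trend/seasonal/residual
    decomposition.
    """

    def __init__(self, period=24, threshold=2):
        self.period = period
        self.threshold = threshold
        self.history = {}

    def check_hourly(self, metric_name, value, hour):
        # One rolling window per metric and hour slot within the period.
        slots = self.history.setdefault(metric_name, {})
        values = slots.setdefault(hour % self.period, deque(maxlen=100))
        is_anomaly = False
        if len(values) >= 5:
            mean = np.mean(values)
            std = np.std(values)
            is_anomaly = bool(std > 0 and abs(value - mean) / std > self.threshold)
        values.append(value)
        return is_anomaly


detector = StatisticalAnomalyDetector(z_threshold=3)
seasonal = SeasonalDecomposition()

# Normal traffic: with a 3-sigma threshold, a few of these 1000 samples
# are expected to be flagged as false positives.
for value in np.random.normal(100, 10, 1000):
    detector.add_observation(value)

# Test with anomalous value
detector.add_observation(500)

# Seasonal check: build a baseline for 14:00, then test a high reading.
# "request_rate" is an illustrative metric name.
for value in np.random.normal(300, 15, 10):
    seasonal.check_hourly("request_rate", value, hour=14)
print(seasonal.check_hourly("request_rate", 450, hour=14))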
Anomaly Types
| Type | Description | Detection Method |
|---|---|---|
| Spike | Sudden increase in metric | Z-score, MAD |
| Drop | Sudden decrease | Z-score, MAD |
| Level shift | Sustained change in baseline | CUSUM, change point detection |
| Seasonal deviation | Unusual for time of day | Seasonal decomposition |
| Slow ramp | Gradual increase | Trend analysis, ML models |
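Two of the methods named in the table, MAD and CUSUM, can be sketched as follows. This is a simplified illustration rather than a tuned implementation: the function names, the sample data, the 3.5 cutoff for MAD scores (a commonly used convention), and the CUSUM `slack` and `limit` values are all assumptions.

```python
import numpy as np


def mad_scores(values):
    """Robust z-scores based on the median and median absolute deviation."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        return np.zeros_like(values)
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the data is normally distributed.
    return 0.6745 * (values - median) / mad


def cusum_shift_index(values, target, slack, limit):
    """One-sided (upward) CUSUM.

    Accumulates deviations above target + slack and returns the first index
    where the running sum exceeds limit, or None if it never does.
    """
    running = 0.0
    for i, v in enumerate(values):
        running = max(0.0, running + (v - target - slack))
        if running > limit:
            return i
    return None


# Spike: one latency sample far from the rest.
latencies = [100, 102, 98, 101, 99, 100, 400, 101, 97, 103]
scores = mad_scores(latencies)
spikes = [i for i, s in enumerate(scores) if abs(s) > 3.5]
print(spikes)  # [6]

# Level shift: the baseline moves from 100 to 110 at index 20.
shifted = [100] * 20 + [110] * 20
print(cusum_shift_index(shifted, target=100, slack=5, limit=20))  # 24
```

MAD is less sensitive than the standard deviation to the outlier it is trying to find, which is why the single spike stands out so clearly. CUSUM reacts a few samples after the shift begins because it needs the small deviations to accumulate past the limit.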
Common Mistakes
| Mistake | Consequence | Fix |
|---|---|---|
| No baseline period | Detector fires immediately | Collect 2+ weeks of baseline data |
| Not accounting for seasonality | Lunch hour looks anomalous | Use seasonal decomposition |
| Only detecting spikes | Drops are equally important | Monitor both directions |
| No anomaly grouping | 1000 alerts for the same root cause | Group by metric and time window |
| Black-box ML models | Cannot explain alerts | Use interpretable models (isolation forest, Z-score) |
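Grouping by metric and time window could look something like this. It is a minimal sketch: the `(timestamp, metric)` alert format, the `group_alerts` name, and the 5-minute window are assumptions made for illustration.

```python
from collections import defaultdict


def group_alerts(alerts, window_seconds=300):
    """Group (timestamp, metric) alerts into one incident per metric and time bucket."""
    groups = defaultdict(list)
    for timestamp, metric in alerts:
        bucket = int(timestamp // window_seconds)  # fixed, non-overlapping windows
        groups[(metric, bucket)].append(timestamp)
    return dict(groups)


# Timestamps are seconds since an arbitrary start.
alerts = [
    (0, "latency"),
    (30, "latency"),
    (90, "latency"),
    (100, "error_rate"),
    (400, "latency"),
]
incidents = group_alerts(alerts)
print(len(alerts), "alerts ->", len(incidents), "incidents")  # 5 alerts -> 3 incidents
```

Fixed buckets are simple but can split one burst of alerts across a bucket boundary. Grouping by the gap since the previous alert for the same metric is one alternative.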
Practice Questions
- What is the difference between a static threshold and anomaly detection?
- How does a Z-score detect anomalies?
- Why is seasonality important in anomaly detection?
- What is a false positive and how do you reduce it?
- How do you handle gradual degradation (slow ramp)?
Challenge
Implement statistical anomaly detection for API request rate using Z-score. Add seasonal decomposition for hourly patterns. Test with normal traffic and injected anomalies (spike, slow ramp, level shift).
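As a possible starting point for the test data, the helpers below generate noisy baseline traffic and inject the three anomaly shapes named in the challenge. They are only a sketch: every function name, default value, and magnitude is an assumption, and the detectors themselves are left for you to write.

```python
import numpy as np


def make_traffic(n=600, base=1000.0, noise=20.0, seed=0):
    """Baseline request rate: a constant level plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    return base + rng.normal(0, noise, n)


def inject_spike(series, index, magnitude):
    """Add a single-sample spike at the given index."""
    out = series.copy()
    out[index] += magnitude
    return out


def inject_ramp(series, start, end, total_increase):
    """Raise the level linearly between start and end, then hold it there."""
    out = series.copy()
    out[start:end] += np.linspace(0, total_increase, end - start)
    out[end:] += total_increase
    return out


def inject_level_shift(series, start, offset):
    """Shift every sample from start onward by a constant offset."""
    out = series.copy()
    out[start:] += offset
    return out


normal = make_traffic()
spiked = inject_spike(normal, index=300, magnitude=800)
ramped = inject_ramp(normal, start=200, end=400, total_increase=200)
shifted = inject_level_shift(normal, start=300, offset=150)

# Sanity check: mean level before and after each injected change.
print(round(ramped[:200].mean()), round(ramped[400:].mean()))
print(round(shifted[:300].mean()), round(shifted[300:].mean()))
```

Because each helper returns a copy, the same `normal` series can be reused as the control case when comparing detectors.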
What's Next
Complete the monitoring and analytics project.
Built by the developers of DodaTech