Health Check Endpoint Returning 503 Fix
This tutorial explains why a health check endpoint can return 503 while the application is still working, and how to fix it with separate liveness and readiness probes, sensible probe thresholds, and graceful degradation.
Your health check endpoint returns HTTP 503 Service Unavailable: the application is running but reports itself as unhealthy. As a result, Kubernetes, load balancers, or monitoring systems mark the service as down.
The Problem
# WRONG: the health check fails whenever a non-critical dependency is down
from flask import Flask, jsonify
import redis

app = Flask(__name__)
cache = redis.Redis()  # client instance; assumes Redis on localhost:6379

@app.route('/health')
def health():
    try:
        cache.ping()
        return jsonify({"status": "healthy"})
    except Exception:
        # Any Redis error marks the whole service unhealthy
        return jsonify({"status": "unhealthy"}), 503
Redis is a caching layer. A Redis restart causes the health check to return 503, even though the application can still serve requests using database fallback. Kubernetes kills the pod, causing unnecessary disruption.
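One way to avoid this failure mode is to decide up front which dependencies are critical. The sketch below is a minimal illustration and not part of the original application: the `summarize` helper, the `critical` set, and the status names are assumptions chosen for this example.

```python
from typing import Dict, Set, Tuple


def summarize(checks: Dict[str, bool], critical: Set[str]) -> Tuple[str, int]:
    """Return (status, http_code); only a critical failure yields 503."""
    failed = {name for name, ok in checks.items() if not ok}
    if failed & critical:
        return "unhealthy", 503
    if failed:
        # A non-critical dependency is down, but traffic can still be served.
        return "degraded", 200
    return "healthy", 200


# Redis is down, the database is up: the service stays in rotation.
print(summarize({"database": True, "redis": False}, critical={"database"}))
# ('degraded', 200)
```

With this split, a Redis restart shows up in the response body without taking the pod out of service.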
Step-by-Step Fix
1. Separate liveness and readiness probes
@app.route('/health/live')
def liveness():
    # Simple: is the app process alive?
    return jsonify({"status": "alive"}), 200

@app.route('/health/ready')
def readiness():
    # More thorough: can the app serve traffic?
    checks = {
        "database": check_database(),
        "redis": check_redis(),
        "queue": check_queue(),
    }
    ready = all(checks.values())
    status = "ready" if ready else "not ready"
    return jsonify({"status": status, "checks": checks}), 200 if ready else 503
2. Configure Kubernetes probes correctly
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: app
      image: myapp:latest
      livenessProbe:
        httpGet:
          path: /health/live
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 15
      readinessProbe:
        httpGet:
          path: /health/ready
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 10
        failureThreshold: 3
3. Add appropriate timeouts and thresholds
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  timeoutSeconds: 3      # Don't wait too long
  failureThreshold: 3    # Allow 3 failures before marking unhealthy
  periodSeconds: 10
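To see what these settings mean in practice, it helps to estimate how long a real outage takes to remove the pod from rotation. The following is a rough back-of-the-envelope sketch: the `unready_window` function is hypothetical, and the estimate ignores kubelet scheduling jitter and the time it takes for endpoint changes to propagate.

```python
from typing import Tuple


def unready_window(period_seconds: int, failure_threshold: int,
                   timeout_seconds: int) -> Tuple[int, int]:
    """Return a rough (fastest, slowest) time in seconds until the pod is unready."""
    # Fastest: the outage starts just before a probe and every probe fails at once.
    fastest = (failure_threshold - 1) * period_seconds
    # Slowest: the outage starts just after a probe and every probe runs to its timeout.
    slowest = failure_threshold * period_seconds + timeout_seconds
    return fastest, slowest


print(unready_window(period_seconds=10, failure_threshold=3, timeout_seconds=3))
# (20, 33)
```

Under these assumptions the values above tolerate a blip of a few seconds, while a sustained outage is acted on within roughly half a minute.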
4. Implement graceful degradation checks
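# Assumes `db` and `cache` (a redis.Redis client) are created elsewhere in the app
def check_database(timeout=2):
    # Critical dependency: a failure makes the app not ready
    try:
        db.execute("SELECT 1", timeout=timeout)
        return True
    except Exception:
        return False

def check_redis(timeout=1):
    # Non-critical dependency: report ready even when Redis is down.
    # `timeout` is not applied here; configure it on the client (e.g. socket_timeout).
    try:
        cache.ping()
        return True
    except Exception:
        return True  # Degraded: Redis unavailable but app still works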
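The checks above depend on each client honoring its own timeout. If a client cannot be configured that way, one possible approach is to enforce the deadline from outside the check. The sketch below uses only the standard library; `run_check` and the shared `_executor` are hypothetical names, and the pool size is an arbitrary choice.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as CheckTimeout

# Small shared pool so a hung check does not block the request thread.
_executor = ThreadPoolExecutor(max_workers=4)


def run_check(check, timeout: float) -> bool:
    """Run `check()` and return False if it raises or exceeds `timeout` seconds."""
    future = _executor.submit(check)
    try:
        return bool(future.result(timeout=timeout))
    except CheckTimeout:
        # The worker thread keeps running in the background; only the wait is abandoned.
        return False
    except Exception:
        return False


print(run_check(lambda: True, timeout=1.0))                        # True
print(run_check(lambda: time.sleep(0.3) or True, timeout=0.05))    # False
```

Keep the per-check timeout below the probe's `timeoutSeconds`, otherwise the probe gives up before the endpoint can answer.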
5. Log health check failures
import logging

logger = logging.getLogger(__name__)

@app.route('/health')
def health():
    try:
        db.execute("SELECT 1")
        db_ok = True
    except Exception as e:
        db_ok = False
        logger.warning("Health check: database unavailable: %s", e)
    # The database is critical, so a failure is reported as unhealthy with 503
    status = "healthy" if db_ok else "unhealthy"
    return jsonify({"status": status, "database": db_ok}), 200 if db_ok else 503
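Because probes run every few seconds, logging every failed check can flood the logs during a long outage. One option is to log only when a dependency changes state. This is a minimal sketch under that assumption; the `TransitionLogger` class and the `health` logger name are hypothetical and not part of the original code.

```python
import logging

logger = logging.getLogger("health")


class TransitionLogger:
    """Log a dependency's status only when it changes between checks."""

    def __init__(self):
        self._last = {}

    def record(self, name: str, ok: bool) -> bool:
        """Store the latest result; return True if a log line was emitted."""
        # A dependency is assumed healthy until it is first seen failing.
        previous = self._last.get(name, True)
        self._last[name] = ok
        if previous == ok:
            return False
        if ok:
            logger.info("Health check: %s recovered", name)
        else:
            logger.warning("Health check: %s unavailable", name)
        return True


tracker = TransitionLogger()
for result in (True, False, False, True):
    tracker.record("database", result)  # logs on the second and fourth call only
```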
Expected output:
$ curl http://localhost:8080/health/ready
{
  "status": "ready",
  "checks": {
    "database": true,
    "redis": true,
    "queue": true
  }
}
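When the endpoint does return 503, the `checks` object in the body shows which dependency failed. As an illustration, a small helper can pull out the failing names from a response like the one above; `failing_checks` is a hypothetical name and the sample body is made up for this example.

```python
import json
from typing import List


def failing_checks(body: str) -> List[str]:
    """Return the sorted names of checks reported as false in a readiness response."""
    payload = json.loads(body)
    return sorted(name for name, ok in payload.get("checks", {}).items() if not ok)


sample = '{"status": "not ready", "checks": {"database": false, "redis": true, "queue": true}}'
print(failing_checks(sample))
# ['database']
```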
Prevention Tips
- Separate liveness (is process alive) from readiness (can serve traffic)
- Use initialDelaySeconds to avoid startup failures
- Use failureThreshold: 3 to avoid flapping
- Log health check failures for debugging
- Degrade gracefully: don't fail on non-critical dependencies