
Input Validation for APIs — Complete Injection Prevention Guide

DodaTech Updated 2026-06-28 3 min read

In this tutorial, you will learn about Input Validation for APIs. We cover key concepts, practical examples, and best practices to help you master this topic.

Input validation is the process of verifying that user-supplied data conforms to expected formats, types, and ranges before it is processed. It is a first line of defense against injection attacks, including SQL injection and command injection.

What You'll Learn

You'll learn input validation techniques including whitelist validation, schema validation, sanitization, and secure parsing.

Why It Matters

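According to OWASP, injection was the top web application security risk in the 2013 and 2017 Top 10 lists and ranked third (A03) in the 2021 edition, which also folded cross-site scripting into the category. Proper input validation stops many SQLi, XSS, and command injection payloads before they reach vulnerable code, although it works best alongside other defenses.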

Real-World Use

An e-commerce API validates product IDs as integers between 1 and 100,000. An attacker sending "1; DROP TABLE products" is rejected immediately because the input is not a valid integer.
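The product ID check described above could be sketched as follows. The function name `parse_product_id` and the error messages are illustrative assumptions; only the 1 to 100,000 range comes from the example.

```python
def parse_product_id(raw: str) -> int:
    """Return the product ID as an int, or raise ValueError if it is not allowed."""
    # Allow only plain ASCII digits; int() alone would also accept "+5", " 5 ", or "1_000".
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError("product ID must contain only digits")
    # Cap the length before converting so very long digit strings are rejected cheaply.
    if len(raw) > 6:
        raise ValueError("product ID is too long")
    value = int(raw)
    if not 1 <= value <= 100_000:
        raise ValueError("product ID must be between 1 and 100000")
    return value


for candidate in ["42", "1; DROP TABLE products", "0", "100001"]:
    try:
        print(candidate, "->", parse_product_id(candidate))
    except ValueError as exc:
        print(candidate, "-> rejected:", exc)
```

Because the function returns an `int`, later code never handles the original string, so there is nothing left to inject.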

flowchart LR
    A[Raw Input] --> B{Type Check}
    B -->|Valid| C{Length Check}
    B -->|Invalid| D[Reject]
    C -->|Valid| E{Pattern Match}
    C -->|Invalid| D
    E -->|Match| F[Sanitize]
    E -->|No Match| D
    F --> G[Safe Input]
    D --> H[400 Bad Request]
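The stages in the flowchart can be sketched as a single function. This is a minimal illustration for one hypothetical username field; `validate_field`, `MAX_LENGTH`, and the error strings are assumptions, not part of any library.

```python
import re
from typing import Any, Tuple

MAX_LENGTH = 30                          # hypothetical limit for this field
ALLOWED = re.compile(r"[a-zA-Z0-9_]+")   # allowlist of permitted characters


def validate_field(raw: Any) -> Tuple[int, str]:
    """Walk the flowchart stages; return (HTTP status, safe value or error message)."""
    if not isinstance(raw, str):             # Type Check
        return 400, "expected a string"
    if not 1 <= len(raw) <= MAX_LENGTH:      # Length Check
        return 400, "length out of range"
    if not ALLOWED.fullmatch(raw):           # Pattern Match
        return 400, "contains disallowed characters"
    # Sanitize: the allowlist already excluded markup, so this step only normalizes case.
    return 200, raw.lower()


for value in ["John_Doe", 123, "a" * 31, "<script>"]:
    print(repr(value), "->", validate_field(value))
```

Running the cheap checks (type, length) before the regex keeps the work done on hostile input small.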

Teacher's Mindset

Input validation is like checking IDs at an airport security checkpoint. You verify the name matches the ticket (format), check expiration (validity), and scan for prohibited items (sanitization) before letting passengers through.

Implementing Input Validation

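from flask import Flask, request, jsonify
import re

app = Flask(__name__)

# Precompiled allowlist patterns. fullmatch() is used below because re.match()
# with a "$" anchor still accepts a trailing newline.
USERNAME_RE = re.compile(r"[a-zA-Z0-9_]+")
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def validate_username(username) -> bool:
    # Reject non-strings (e.g. JSON numbers or lists) before calling len().
    if not isinstance(username, str) or not username or len(username) > 30:
        return False
    return USERNAME_RE.fullmatch(username) is not None

def validate_email(email) -> bool:
    if not isinstance(email, str) or not email or len(email) > 254:
        return False
    return EMAIL_RE.fullmatch(email) is not None

@app.route("/api/users", methods=["POST"])
def create_user():
    # silent=True returns None instead of raising when the body is not valid JSON.
    data = request.get_json(silent=True)
    if not isinstance(data, dict):
        return jsonify({"error": "Request body must be a JSON object"}), 400
    if not validate_username(data.get("username")):
        return jsonify({"error": "Invalid username"}), 400
    if not validate_email(data.get("email")):
        return jsonify({"error": "Invalid email"}), 400
    return jsonify({"message": "User created"}), 201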
# Schema validation with Pydantic (v2 API; EmailStr requires the email-validator package)
from pydantic import BaseModel, EmailStr, Field, field_validator

class CreateUserRequest(BaseModel):
    username: str = Field(..., min_length=3, max_length=30, pattern=r"^[a-zA-Z0-9_]+$")
    email: EmailStr
    age: int = Field(..., ge=0, le=150)
    bio: str = Field("", max_length=500)

    # field_validator replaces the deprecated v1 @validator decorator.
    @field_validator("username")
    @classmethod
    def username_must_be_valid(cls, v: str) -> str:
        if "admin" in v.lower():
            raise ValueError("Username cannot contain 'admin'")
        return v

user = CreateUserRequest(
    username="john_doe",
    email="john@example.com",
    age=25,
)
# model_dump_json() replaces the deprecated v1 .json() method.
print(user.model_dump_json())
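# Input sanitization
import html
import os
import re

def sanitize_input(value: str) -> str:
    # Drop NUL bytes and surrounding whitespace, then escape for display in an HTML context.
    value = value.replace("\0", "").strip()
    return html.escape(value)

def sanitize_filename(filename: str) -> str:
    # Discard any directory part (both separator styles) so "../x" cannot escape the target folder.
    filename = os.path.basename(filename.replace("\\", "/"))
    # Keep ASCII letters, digits, "_", "-", and "."; strip leading dots so "." and ".." never survive.
    filename = re.sub(r"[^\w.-]", "", filename, flags=re.ASCII).lstrip(".")
    # The result can be empty; callers should reject an empty name.
    return filename[:255]

user_input = "<script>alert('xss')</script>"
print(sanitize_input(user_input))  # &lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;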

Common Mistakes

| Mistake | Why It's Wrong | Fix |
| --- | --- | --- |
| Blacklist instead of whitelist | Attackers bypass with new patterns | Always whitelist allowed characters and patterns |
| Validating only on frontend | Attackers call APIs directly | Always validate server-side |
| Not validating all input fields | One unvalidated field becomes the attack vector | Validate every field, including optional ones |
| Accepting unexpected fields | Mass assignment attacks | Strip unknown fields from the request |
| Using eval or exec on user input | Remote code execution | Never execute user input as code |
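The "accepting unexpected fields" row can be illustrated with a small allowlist filter. This is a sketch using only the standard library; `ALLOWED_FIELDS` and both helper names are hypothetical, and the choice between rejecting and stripping depends on how strict the API should be.

```python
ALLOWED_FIELDS = {"username", "email", "bio"}  # hypothetical set of client-writable fields


def reject_unknown_fields(payload: dict) -> dict:
    """Raise ValueError if the payload carries keys outside the allowlist."""
    unknown = set(payload) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    return payload


def strip_unknown_fields(payload: dict) -> dict:
    """Keep only allowlisted keys, silently dropping everything else."""
    return {key: value for key, value in payload.items() if key in ALLOWED_FIELDS}


# An attacker adds "is_admin" hoping it is copied straight onto the user record.
request_body = {"username": "john_doe", "email": "john@example.com", "is_admin": True}
print(strip_unknown_fields(request_body))
```

Rejecting gives clients a clear error, while stripping is more forgiving; either way the `is_admin` key never reaches the code that builds the record.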

Practice Questions

  1. Why is whitelist validation better than blacklist?
  2. What is mass assignment and how do you prevent it?
  3. Why should validation be server-side even with client-side validation?
  4. What is the difference between validation and sanitization?
  5. How does input length validation prevent buffer overflow?

Challenge

Build a comprehensive input validation middleware using Pydantic. Support nested JSON validation, custom validators, and detailed error messages. Test with injection payloads.

FAQ

Should I validate on the client or server?

Both. Client validation improves UX. Server validation is mandatory for security. Never trust the client.

What is the difference between validation and sanitization?

Validation rejects invalid input. Sanitization cleans potentially dangerous content. Use both together for defense in depth.

Can validation prevent all injection attacks?

Not by itself. Validation is the first line of defense; combine it with parameterized queries, output encoding, and least privilege for layered protection.
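To show why parameterized queries belong next to validation, here is a minimal sketch using the standard library's `sqlite3` module. The `products` table and the `find_product_by_name` helper are assumptions made up for the illustration.

```python
import sqlite3

# In-memory database with one row, purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO products (id, name) VALUES (?, ?)", (1, "Widget"))


def find_product_by_name(connection: sqlite3.Connection, name: str):
    # The ? placeholder passes `name` as data, so quotes or SQL keywords in it are never parsed as SQL.
    cursor = connection.execute("SELECT id, name FROM products WHERE name = ?", (name,))
    return cursor.fetchall()


print(find_product_by_name(conn, "Widget"))             # [(1, 'Widget')]
print(find_product_by_name(conn, "Widget' OR '1'='1"))  # []
```

The second call returns nothing because the whole payload is compared as a literal product name, which is the behavior you want even if a validation rule is ever loosened by mistake.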

What is a regex denial of service (ReDoS)?

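Some regex patterns, typically those with nested or overlapping quantifiers such as (a+)+, take exponential time on crafted input in backtracking engines. An attacker can send such input to tie up your server. Cap input length before matching and avoid nested quantifiers.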
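One way to apply this advice is sketched below for a hypothetical URL slug field. The names `MAX_INPUT_LENGTH`, `SLUG`, and `is_valid_slug` are illustrative; the idea is a length cap first, then a pattern whose repeated parts cannot overlap.

```python
import re

MAX_INPUT_LENGTH = 64  # hypothetical cap, checked before any regex runs

# Each repetition must start with a literal "-", so the engine has only one way
# to split the input and cannot backtrack exponentially. The repeat count is bounded too.
SLUG = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+){0,9}")


def is_valid_slug(value: str) -> bool:
    """Return True for slugs like 'input-validation-guide'."""
    if len(value) > MAX_INPUT_LENGTH:
        return False
    return SLUG.fullmatch(value) is not None


for candidate in ["input-validation-guide", "bad--slug", "a" * 10_000]:
    print(candidate[:30], "->", is_valid_slug(candidate))
```

The length check costs almost nothing and bounds the worst case for any pattern, so it is worth keeping even when the regex itself looks safe.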

How do you validate JSON input?

Use libraries like Pydantic (Python), Zod (TypeScript), or Joi (Node.js). Define schemas with types, constraints, and custom validators.
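When adding a schema library is not an option, the same checks can be written by hand. The sketch below uses only the standard library; `parse_user_payload` and its field rules are assumptions chosen to mirror the username and age constraints used earlier in this tutorial, and a schema library would normally replace this boilerplate.

```python
import json


def parse_user_payload(body: str) -> dict:
    """Parse a JSON request body and check types and ranges by hand."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed JSON: {exc.msg}") from exc
    # A JSON body can be a list, string, or number; insist on an object.
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    username = data.get("username")
    age = data.get("age")
    if not isinstance(username, str) or not 3 <= len(username) <= 30:
        raise ValueError("username must be a string of 3 to 30 characters")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(age, bool) or not isinstance(age, int) or not 0 <= age <= 150:
        raise ValueError("age must be an integer between 0 and 150")
    # Return only the fields that were checked.
    return {"username": username, "age": age}


print(parse_user_payload('{"username": "john_doe", "age": 25}'))
```

Returning a new dictionary with only the validated keys means unchecked fields in the request never travel further into the application.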

Mini Project

Create a user registration endpoint with Pydantic validation. Include username (alphanumeric, 3-30 chars), email (valid format), age (13-120), password (8+ chars, 1 uppercase, 1 number). Test with valid and invalid inputs.

What's Next

Learn about output encoding to prevent XSS when returning user-controlled data.
