
Scrape Shield — Protect Your Content

DodaTech 5 min read

In this tutorial, you'll learn about Scrape Shield. We cover key concepts, practical examples, and best practices to help you understand and apply this topic effectively.

Cloudflare Scrape Shield groups three content protection settings: email address obfuscation, server-side excludes, and hotlink protection. This tutorial also covers two measures that sites commonly add alongside them in their own code, disabling right-click and script-based content protection, to defend against automated scraping.

What You Will Learn

You will learn how each Scrape Shield component works, how to enable them through the dashboard and API, and how to balance protection against legitimate user experience.

Why It Matters

Content scraping costs businesses billions annually in lost revenue, bandwidth theft, and competitive intelligence leaks. Scrape Shield provides a baseline defence that blocks casual scrapers without requiring custom development.

Real-World Use Case

A real estate listings site found its property data being scraped daily by competing sites. Enabling Scrape Shield's email obfuscation and server-side excludes stopped 70% of automated scrapers, while the remaining sophisticated scrapers were handled by WAF rate limiting rules.

Scrape Shield Components

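| Feature | Purpose |
| --- | --- |
| Email Address Obfuscation | Hides emails from harvesters |
| Server-Side Excludes | Hides content wrapped in <!--sse--> tags from suspicious visitors |
| Hotlink Protection | Blocks image leeching |
| Disable Right-Click (custom site script, not a Cloudflare setting) | Discourages manual copy-paste |
| Script-Based Protection (custom site script, not a Cloudflare setting) | Injects JS to detect automation |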
flowchart LR
  A[Visitor Request] --> B{Scrape Shield Check}
  B --> C[1. Email Obfuscation]
  B --> D[2. Server-Side Excludes]
  B --> E[3. Hotlink Protection]
  B --> F["4. Right-Click Disable (custom script)"]
  B --> G["5. Script Detection (custom script)"]
  C --> H[Protected Content]
  D --> H
  E --> H
  F --> H
  G -->|Automation Detected| I["Block / Challenge"]
  G -->|Normal Browser| H

Enabling Scrape Shield via Dashboard

  1. Go to Security > Settings in your Cloudflare dashboard.
  2. Under Scrape Shield, toggle each feature:
    • Email Address Obfuscation: On
    • Server-Side Excludes: On
    • Hotlink Protection: On
  3. Disable Right-Click and Script-Based Protection are not Cloudflare toggles. If you want them, add them in your own site code, and use right-click disabling selectively.
  4. Test each feature individually to verify it does not break site functionality.

API: Enable All Scrape Shield Features

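# Assumes CLOUDFLARE_ZONE_ID and CLOUDFLARE_API_TOKEN are set in your shell

# Enable Email Obfuscation
curl -X PATCH "https://api.cloudflare.com/client/v4/zones/$CLOUDFLARE_ZONE_ID/settings/email_obfuscation" \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"value": "on"}'

# Enable Server-Side Excludes
curl -X PATCH "https://api.cloudflare.com/client/v4/zones/$CLOUDFLARE_ZONE_ID/settings/server_side_exclude" \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"value": "on"}'

# Enable Hotlink Protection
curl -X PATCH "https://api.cloudflare.com/client/v4/zones/$CLOUDFLARE_ZONE_ID/settings/hotlink_protection" \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"value": "on"}'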

Expected output for each (abridged; the id matches the setting you changed):

{
  "result": { "id": "email_obfuscation", "value": "on" },
  "success": true
}
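
The same PATCH calls can be scripted. The following is a minimal sketch, not an official client: it uses only the Python standard library, and the helper name `build_enable_request` and the placeholder zone ID and token are assumptions made for illustration. It builds the requests without sending them, so you can inspect them first.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"
FEATURES = ["email_obfuscation", "server_side_exclude", "hotlink_protection"]


def build_enable_request(zone_id: str, token: str, feature: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that sets one zone setting to "on"."""
    url = f"{API_BASE}/zones/{zone_id}/settings/{feature}"
    body = json.dumps({"value": "on"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder credentials: replace with your real zone ID and API token.
pending = [build_enable_request("ZONE_ID", "TOKEN", feature) for feature in FEATURES]
for req in pending:
    print(req.get_method(), req.full_url)
```

To apply a change, pass each request to `urllib.request.urlopen(req, timeout=10)` and read the JSON body. Keeping the build step separate from the send step makes it easy to review what will change before touching a production zone.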

Python: Verify Scrape Shield Status

import os
import requests

ZONE_ID = os.environ["CLOUDFLARE_ZONE_ID"]
TOKEN = os.environ["CLOUDFLARE_API_TOKEN"]
headers = {"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"}

features = ["email_obfuscation", "server_side_exclude", "hotlink_protection"]

for feature in features:
    url = f"https://api.cloudflare.com/client/v4/zones/{ZONE_ID}/settings/{feature}"
    resp = requests.get(url, headers=headers, timeout=10)
    resp.raise_for_status()  # fail loudly on HTTP errors such as 401 or 403
    status = resp.json()["result"]["value"]
    print(f"{feature}: {status}")

Expected output:

email_obfuscation: on
server_side_exclude: on
hotlink_protection: on
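
The script above assumes every call succeeds. Cloudflare API responses carry a `success` flag and an `errors` list next to `result`, so a slightly more defensive reader can check those first. This is a sketch under that assumption; the function name `read_setting` and the sample error payload are made up for illustration.

```python
def read_setting(payload: dict) -> str:
    """Return the setting value, or raise with the API's own error messages."""
    if not payload.get("success"):
        errors = payload.get("errors") or []
        detail = "; ".join(
            f"{err.get('code')}: {err.get('message')}" for err in errors
        ) or "unknown error"
        raise RuntimeError(f"Cloudflare API call failed: {detail}")
    return payload["result"]["value"]


# Sample payloads (illustrative, not captured from a real zone).
ok = {"success": True, "errors": [], "result": {"id": "email_obfuscation", "value": "on"}}
bad = {"success": False, "errors": [{"code": 1000, "message": "example failure"}], "result": None}

print(read_setting(ok))
try:
    read_setting(bad)
except RuntimeError as exc:
    print(exc)
```

In the verification loop you would call `read_setting(resp.json())` instead of indexing into the response directly.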

Testing Email Obfuscation

Before enabling email obfuscation, an email in your HTML looks like:

<a href="mailto:hello@example.com">Email us</a>

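After enabling, Cloudflare rewrites it to something like the following (the hex string after # is an encoded form of the address, shown here as a placeholder):

<a href="/cdn-cgi/l/email-protection#HEX_ENCODED_ADDRESS">Email us</a>

Scrapers parsing the raw HTML see only the encoded string, while a small decoder script that Cloudflare adds to the page restores the address for visitors whose browsers run JavaScript.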

import re
import requests

url = "https://example.com/contact"
resp = requests.get(url, timeout=10)

# Try to extract email addresses from the raw HTML, as a simple scraper would
emails = re.findall(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}", resp.text)
print(f"Emails found in HTML: {len(emails)}")

# A browser would still show the address to the visitor
print(f"Response size: {len(resp.content)} bytes")

Expected output:

Emails found in HTML: 0
Response size: 12450 bytes
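
The regex above finds nothing because the address is no longer present as plain text. The encoding is widely described as a hex string whose first byte is an XOR key applied to every following byte. The sketch below assumes that scheme; the function names are hypothetical, and this is not Cloudflare's own code.

```python
def cf_encode_email(address: str, key: int) -> str:
    """Hex-encode an address: the first byte is the key, the rest are XORed with it."""
    data = bytes([key]) + bytes(b ^ key for b in address.encode("utf-8"))
    return data.hex()


def cf_decode_email(encoded: str) -> str:
    """Reverse cf_encode_email: read the key from the first byte, then XOR the rest."""
    raw = bytes.fromhex(encoded)
    key, body = raw[0], raw[1:]
    return bytes(b ^ key for b in body).decode("utf-8")


encoded = cf_encode_email("hello@example.com", key=0x5A)
print(encoded)                   # hex digits only, so no "@" for a regex to match
print(cf_decode_email(encoded))  # round-trips back to the original address
```

This also shows why obfuscation only deters casual harvesters: anyone who knows the scheme can decode the string in a few lines, so treat it as one layer rather than a guarantee.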

Common Mistakes

| Mistake | Consequence |
| --- | --- |
| Serving API or JSON payloads as text/html while email obfuscation is on | Email fields in the payload may be rewritten |
| Turning on right-click disable globally | Frustrates legitimate users |
| Not testing with screen readers | Accessibility features break |
| Relying only on Scrape Shield | Determined scrapers bypass it |
| Enabling script protection on static sites | Adds unnecessary JS overhead |

Practice Questions

  1. How does email obfuscation prevent harvesters while keeping emails visible to users?
  2. Why should you avoid enabling right-click disable on documentation sites?
  3. What limits the effectiveness of Scrape Shield against advanced scrapers?

Challenge

Write a Python script that attempts to scrape your site before and after enabling Scrape Shield. Compare the amount of extractable content, number of visible email addresses, and response sizes. Document what Scrape Shield successfully protected.

Real-World Task

Your directory site displays business contact information including email addresses. Enable email obfuscation and hotlink protection. Verify that emails appear correctly in browsers but are not extractable from raw HTML using a simple regex scraper. Create a Cloudflare rate limiting rule as a second layer to block IPs that make more than 100 requests per minute.
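
A possible starting point for the verification part of this task is sketched below. It assumes hotlink protection decides based on the Referer header of image requests. The helper names, sample HTML, and URLs are all hypothetical. Nothing is sent over the network until you call `fetch_status` yourself.

```python
import re
import urllib.error
import urllib.request

EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def extract_emails(html: str) -> list[str]:
    """Return the unique plain-text email addresses a regex scraper would find."""
    return sorted(set(EMAIL_RE.findall(html)))


def build_hotlink_probe(image_url: str, foreign_referer: str) -> urllib.request.Request:
    """Build an image request that claims to be embedded on another site."""
    return urllib.request.Request(image_url, headers={"Referer": foreign_referer})


def fetch_status(req: urllib.request.Request) -> int:
    """Send the request and return the HTTP status, including error statuses."""
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


# Offline demonstration with made-up HTML; no requests are sent here.
before = '<a href="mailto:hello@example.com">Email us</a>'
after = '<a href="/cdn-cgi/l/email-protection#HEX_ENCODED_ADDRESS">Email us</a>'
print(extract_emails(before))
print(extract_emails(after))

probe = build_hotlink_probe("https://example.com/images/logo.png", "https://other-site.example/")
print(probe.get_header("Referer"))
```

Against your own site, compare `fetch_status(probe)` with the status of the same request sent with your own domain as the Referer. A different result for the foreign Referer suggests hotlink protection is active.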

FAQ

Does email obfuscation work with JavaScript-rendered content?

Cloudflare's email obfuscation works on HTML served from the origin before JS executes. If your email addresses are rendered via JavaScript frameworks like React or Vue, Cloudflare cannot obfuscate them because the origin response does not contain the email in its raw HTML form. Use a Worker or server-side rendering to apply obfuscation.

Can Scrape Shield block all scrapers?

No single tool stops all scrapers. Scrape Shield blocks casual scrapers and automated harvesters. Sophisticated scrapers using headless browsers that mimic real users can bypass Scrape Shield. For advanced protection, combine Scrape Shield with Bot Fight Mode, Rate Limiting, and WAF custom rules.

Do Server-Side Excludes modify my robots.txt?

No. Server-Side Excludes hides content that you wrap in <!--sse--> and <!--/sse--> comment tags from visitors Cloudflare considers suspicious. It is enforced at Cloudflare's edge rather than depending on crawler compliance, and it neither reads nor changes robots.txt, which is served as-is.
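
Server-Side Excludes acts on content you mark in your own HTML with <!--sse--> and <!--/sse--> comments, which Cloudflare hides from visitors it treats as suspicious. The sketch below only simulates that effect locally so you can preview what such a visitor would receive. It is not Cloudflare's implementation, and both function names are hypothetical.

```python
import re

# Matches one wrapped span, including the markers, across line breaks.
SSE_BLOCK = re.compile(r"<!--sse-->.*?<!--/sse-->", re.DOTALL)


def wrap_sse(fragment: str) -> str:
    """Mark a fragment of HTML as sensitive using the SSE comment markers."""
    return f"<!--sse-->{fragment}<!--/sse-->"


def view_as_suspicious_visitor(html: str) -> str:
    """Local simulation only: drop every SSE-wrapped span from the page."""
    return SSE_BLOCK.sub("", html)


page = "<p>Call us: " + wrap_sse("555-0100") + "</p><p>Opening hours: 9-5</p>"
print(view_as_suspicious_visitor(page))
```

Because the markers are ordinary HTML comments, regular visitors see the page unchanged, and robots.txt plays no part in the mechanism.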


Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro — security-first tools for the modern web.
