How to Detect Cloudflare Blocking

Build a detector that flags Cloudflare challenges so your scraper can fall back to a managed API like Scavio instead of retry-storming.

Cloudflare blocks scrapers aggressively in 2026, and 'blocked by cloudflare for 3 days' is the top thread in r/vibecoding. The fix is to detect the Cloudflare challenge page early and route affected URLs to a managed API. This tutorial shows how to build the detector and fall back cleanly.

Prerequisites

  • Python 3.8+
  • requests library
  • A Scavio API key (for fallback)

Walkthrough

Step 1: Identify Cloudflare response signatures

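Cloudflare challenge pages typically return a 403 or 503 status along with recognizable HTML markers. Keep the markers lowercase so they can be matched against a lowercased response body.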

Python
CF_MARKERS = [
    'cf-browser-verification',
    'cf_chl_opt',
    'checking your browser',
    'ray id'
]
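
Body markers are not the only signal. As a complementary sketch, the helpers below inspect response headers instead. This assumes Cloudflare's documented behavior of setting a `cf-mitigated: challenge` header on challenge pages, and that sites behind Cloudflare usually send `Server: cloudflare` or a `CF-RAY` header. The function names are hypothetical, and the header behavior is worth verifying against the current Cloudflare docs.

```python
def _lowered(headers):
    """Lowercase header names and values so plain dicts compare reliably."""
    return {str(k).lower(): str(v).lower() for k, v in headers.items()}

def is_challenge_by_header(headers):
    """True when the response is marked as a Cloudflare challenge page.

    Assumption: challenge responses carry 'cf-mitigated: challenge'.
    """
    return _lowered(headers).get('cf-mitigated') == 'challenge'

def is_behind_cloudflare(headers):
    """True when the response appears to have passed through Cloudflare at all.

    This alone does not mean the request was blocked.
    """
    h = _lowered(headers)
    return h.get('server') == 'cloudflare' or 'cf-ray' in h
```

Both helpers accept any mapping with an `items()` method, so `response.headers` from requests can be passed directly.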

Step 2: Build the detector

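Return True if the status code is 403 or 503, or if the body contains any Cloudflare marker. Treating every 403 or 503 as a block is deliberately conservative: it can also flag ordinary errors, which then take the fallback path.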

Python
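def is_cloudflare_blocked(response):
    # A 403 or 503 is treated as a block even without markers.
    if response.status_code in (403, 503):
        return True
    # Markers are lowercase, so compare against a lowercased body.
    body = response.text.lower()
    return any(m in body for m in CF_MARKERS)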
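
If false positives matter more than missed challenges, one possible variation is to require both a blocking status code and a body marker. The sketch below is an assumption for illustration, not the tutorial's detector: `is_cloudflare_blocked_strict` is a hypothetical name, and the stand-in responses only mimic the `status_code` and `text` attributes of a real `requests.Response`.

```python
from types import SimpleNamespace

CF_MARKERS = ['cf-browser-verification', 'cf_chl_opt', 'checking your browser', 'ray id']

def is_cloudflare_blocked_strict(response):
    """Flag a response only when status AND body both point to Cloudflare."""
    if response.status_code not in (403, 503):
        return False
    body = response.text.lower()
    return any(m in body for m in CF_MARKERS)

# Stand-in responses for a quick check without network access.
challenge = SimpleNamespace(status_code=403, text='Checking your browser before accessing')
plain_403 = SimpleNamespace(status_code=403, text='<h1>Forbidden</h1>')
ok_page = SimpleNamespace(status_code=200, text='An article that mentions a Ray ID')

print(is_cloudflare_blocked_strict(challenge))  # True
print(is_cloudflare_blocked_strict(plain_403))  # False
print(is_cloudflare_blocked_strict(ok_page))    # False
```

The trade-off is that a challenge served with a different status code, or with markup not covered by the marker list, would slip through.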

Step 3: Try direct scrape first

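Attempt a direct fetch with a browser-like user-agent. Network errors such as timeouts are treated like a block, so the caller can still fall back.

Python
import requests

HEADERS = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36'}

def try_direct(url):
    # Return the response, or None if the request failed or looks blocked.
    try:
        r = requests.get(url, headers=HEADERS, timeout=10)
    except requests.RequestException:
        return None
    return r if not is_cloudflare_blocked(r) else None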

Step 4: Fall back to Scavio

When blocked, route through Scavio's managed extractor.

Python
import os

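def fallback_scavio(url):
    r = requests.post('https://api.scavio.dev/api/v1/extract',
        headers={'x-api-key': os.environ['SCAVIO_API_KEY']},
        json={'url': url, 'render_js': True, 'bypass_cloudflare': True},
        timeout=60)  # rendering can be slow; adjust as needed
    r.raise_for_status()  # surface API errors instead of parsing an error body
    return r.json()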
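
Reading `os.environ['SCAVIO_API_KEY']` raises a bare `KeyError` when the variable is unset, which can be confusing in the middle of a scrape. A minimal sketch of a clearer check follows. The helper name `scavio_headers` is hypothetical, and the `x-api-key` header name is taken from the call above.

```python
import os

def scavio_headers(env=None):
    """Build the auth header, failing early with a readable message."""
    env = os.environ if env is None else env
    key = env.get('SCAVIO_API_KEY')
    if not key:
        raise RuntimeError('SCAVIO_API_KEY is not set; export it before using the fallback')
    return {'x-api-key': key}
```

Calling `scavio_headers()` once at startup surfaces a missing key before any URL is fetched.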

Step 5: Combine into a resilient fetcher

Try direct, fall back on detection.

Python
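def resilient_fetch(url):
    direct = try_direct(url)
    # Compare against None: a Response is falsy for any 4xx/5xx status,
    # so a plain 404 would otherwise be sent to the fallback.
    if direct is not None:
        return {'source': 'direct', 'content': direct.text}
    print(f'Cloudflare detected on {url}, falling back to Scavio')
    return {'source': 'scavio', 'content': fallback_scavio(url).get('html', '')}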
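
When many URLs share a host, retrying the direct path for each one repeats the same blocked request. One way to avoid that, sketched here under assumptions, is to remember blocked hosts and route them straight to the fallback. `HostRouter` is a hypothetical class. Its `direct` and `fallback` arguments are callables you supply: `direct` returns page content or None when blocked, and `fallback` returns page content.

```python
from urllib.parse import urlparse

class HostRouter:
    """Route URLs to the fallback once their host has been seen blocked."""

    def __init__(self, direct, fallback):
        self.direct = direct          # callable(url) -> content, or None if blocked
        self.fallback = fallback      # callable(url) -> content
        self.blocked_hosts = set()    # hosts that failed the direct path

    def fetch(self, url):
        host = urlparse(url).netloc
        if host not in self.blocked_hosts:
            content = self.direct(url)
            if content is not None:
                return {'source': 'direct', 'content': content}
            # Remember the host so later URLs skip the direct attempt.
            self.blocked_hosts.add(host)
        return {'source': 'scavio', 'content': self.fallback(url)}
```

The set lives in memory for the life of the process. A long-running scraper would probably want entries to expire, since blocks are not necessarily permanent.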

Python Example

Python
import os
import requests

CF_MARKERS = ['cf-browser-verification', 'cf_chl_opt', 'checking your browser', 'ray id']
HEADERS = {'User-Agent': 'Mozilla/5.0'}

def is_cf(response):
    return response.status_code in (403, 503) or any(m in response.text.lower() for m in CF_MARKERS)

def fetch(url):
    try:
        r = requests.get(url, headers=HEADERS, timeout=10)
        if not is_cf(r):
            return r.text
    except requests.RequestException:
        pass  # network failure: fall through to the managed API
    r = requests.post('https://api.scavio.dev/api/v1/extract',
        headers={'x-api-key': os.environ['SCAVIO_API_KEY']},
        json={'url': url, 'render_js': True, 'bypass_cloudflare': True},
        timeout=60)
    r.raise_for_status()
    return r.json().get('html', '')

print(fetch('https://example-cloudflare-site.com')[:200])

JavaScript Example

JavaScript
const CF_MARKERS = ['cf-browser-verification', 'cf_chl_opt', 'checking your browser', 'ray id'];

async function fetchResilient(url) {
  try {
    const r = await fetch(url, { headers: { 'User-Agent': 'Mozilla/5.0' } });
    const body = await r.text();
    // Match markers on a lowercased copy, but return the original HTML.
    const lower = body.toLowerCase();
    if (r.ok && !CF_MARKERS.some(m => lower.includes(m))) return body;
  } catch {
    // Network failure: fall through to the managed API.
  }
  const r = await fetch('https://api.scavio.dev/api/v1/extract', {
    method: 'POST',
    headers: { 'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ url, render_js: true, bypass_cloudflare: true })
  });
  return (await r.json()).html;
}
// Top-level await requires an ES module (for example, a .mjs file).
console.log(await fetchResilient('https://example-cloudflare-site.com'));

Expected Output

When the target URL is not blocked, the direct fetch succeeds and returns the HTML. When it is blocked, the Step 5 fetcher prints a "Cloudflare detected" message and returns the page content from Scavio's extractor instead.

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

You need Python 3.8+, the requests library, and a Scavio API key for the fallback. The key comes with 250 free credits per month.

Yes. The free tier includes 250 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.

© 2026 Scavio. All rights reserved.
