Tutorial

How to Handle Cloudflare Turnstile Challenges

Cloudflare Turnstile blocks most scrapers. Route requests through Scavio's managed resolver to handle the challenge transparently.

Cloudflare Turnstile replaced reCAPTCHA on most protected sites in 2025 and blocks 90% of naive scrapers. This tutorial shows how to route requests through Scavio's managed resolver so the challenge is handled transparently and your scraper returns clean HTML.

Prerequisites

  • Python 3.10+
  • A Scavio API key
  • A target URL behind Turnstile

Walkthrough

Step 1: Detect the Turnstile block

A baseline fetch returns a challenge page, not your content.

Python
import requests
html = requests.get('https://turnstile-protected.com').text
if 'Just a moment' in html or 'challenge-platform' in html:
    print('Blocked by Turnstile')

Step 2: Route through Scavio extract

Scavio handles the challenge behind the scenes.

Python
import os
API_KEY = os.environ['SCAVIO_API_KEY']

def fetch(url):
    r = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'query': url, 'platform': 'extract', 'render_js': True})
    return r.json().get('html', '')

Step 3: Validate the response

No Turnstile markers in the returned HTML.

Python
def passed(html):
    return 'challenge-platform' not in html and len(html) > 1000

Step 4: Retry with stronger profile

If still blocked, ask Scavio for the premium resolver.

Python
def fetch_premium(url):
    r = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'query': url, 'platform': 'extract', 'render_js': True, 'resolver': 'premium'})
    return r.json().get('html', '')

Step 5: Cache to avoid rework

Keep successful fetches cached for 24h.

Python
import time, hashlib, os

def cache_key(url):
    return 'cache/' + hashlib.md5(url.encode()).hexdigest() + '.html'

def cached_fetch(url):
    k = cache_key(url)
    if os.path.exists(k) and time.time() - os.path.getmtime(k) < 86400:
        return open(k).read()
    html = fetch(url)
    os.makedirs('cache', exist_ok=True); open(k, 'w').write(html)
    return html

Python Example

Python
import os, requests

API_KEY = os.environ['SCAVIO_API_KEY']

def fetch(url):
    r = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'query': url, 'platform': 'extract', 'render_js': True, 'resolver': 'premium'})
    return r.json().get('html', '')

html = fetch('https://turnstile-protected.com')
print('clean' if 'challenge-platform' not in html else 'still blocked')

JavaScript Example

JavaScript
const API_KEY = process.env.SCAVIO_API_KEY;
export async function fetchPage(url) {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: url, platform: 'extract', render_js: true, resolver: 'premium' })
  });
  return (await r.json()).html;
}

Expected Output

JSON
Clean HTML from Turnstile-protected pages in 2-8 seconds. Typical success rate via premium resolver: 95%+ on Turnstile-protected pages.

Related Tutorials

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

Python 3.10+. A Scavio API key. A target URL behind Turnstile. A Scavio API key gives you 500 free credits per month.

Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.

Start Building

Cloudflare Turnstile blocks most scrapers. Route requests through Scavio's managed resolver to handle the challenge transparently.