Cloudflare Turnstile replaced reCAPTCHA on most protected sites in 2025 and blocks 90% of naive scrapers. This tutorial shows how to route requests through Scavio's managed resolver so the challenge is handled transparently and your scraper returns clean HTML.
Prerequisites
- Python 3.10+
- A Scavio API key
- A target URL behind Turnstile
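Before starting the walkthrough, the first two prerequisites can be checked up front. The following is a minimal sketch: `check_prerequisites` is a hypothetical helper name, not part of Scavio, and the only thing taken from this tutorial is the `SCAVIO_API_KEY` environment variable used in Step 2.

```python
import sys
from typing import Mapping


def check_prerequisites(env: Mapping[str, str],
                        version: tuple = sys.version_info[:2]) -> str:
    """Return the API key, or raise a clear error if a prerequisite is missing.

    Hypothetical helper for illustration; `env` is any mapping such as os.environ.
    """
    # The tutorial assumes Python 3.10 or newer.
    if version < (3, 10):
        raise RuntimeError(
            f'Python 3.10+ required, found {version[0]}.{version[1]}')
    # The key is read from SCAVIO_API_KEY, as in Step 2.
    key = env.get('SCAVIO_API_KEY', '').strip()
    if not key:
        raise RuntimeError('Set the SCAVIO_API_KEY environment variable')
    return key
```

Typical use would be `API_KEY = check_prerequisites(os.environ)` at the top of the script, so a missing key fails immediately instead of surfacing later as a `KeyError`.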
Walkthrough
Step 1: Detect the Turnstile block
A baseline fetch returns a challenge page, not your content.
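import requests
# A plain GET returns Cloudflare's interstitial instead of the real page.
html = requests.get('https://turnstile-protected.com', timeout=30).text
# Both strings appear in the challenge page's markup.
if 'Just a moment' in html or 'challenge-platform' in html:
    print('Blocked by Turnstile')
Step 2: Route through Scavio extract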
Scavio handles the challenge behind the scenes.
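import os
API_KEY = os.environ['SCAVIO_API_KEY']
def fetch(url):
    # render_js asks Scavio to execute the page's JavaScript before returning HTML.
    r = requests.post('https://api.scavio.dev/api/v1/search',
                      headers={'x-api-key': API_KEY},
                      json={'query': url, 'platform': 'extract', 'render_js': True},
                      timeout=60)
    # Fall back to an empty string if the response has no 'html' field.
    return r.json().get('html', '')
Step 3: Validate the response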
A successful response contains no Turnstile markers and is longer than a stub challenge page.
def passed(html):
    return 'challenge-platform' not in html and len(html) > 1000
Step 4: Retry with stronger profile
If still blocked, ask Scavio for the premium resolver.
def fetch_premium(url):
    # Same request as fetch(), plus 'resolver': 'premium'.
    r = requests.post('https://api.scavio.dev/api/v1/search',
                      headers={'x-api-key': API_KEY},
                      json={'query': url, 'platform': 'extract', 'render_js': True, 'resolver': 'premium'},
                      timeout=60)
    return r.json().get('html', '')
Step 5: Cache to avoid rework
Keep successful fetches cached for 24h.
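import hashlib
import os
import time
def cache_key(url):
    return 'cache/' + hashlib.md5(url.encode()).hexdigest() + '.html'
def cached_fetch(url):
    k = cache_key(url)
    # Reuse the cached copy if it is under 24 hours (86400 s) old.
    if os.path.exists(k) and time.time() - os.path.getmtime(k) < 86400:
        with open(k, encoding='utf-8') as f:
            return f.read()
    html = fetch(url)
    # Cache only pages that pass the Step 3 check, so blocked responses are not reused.
    if passed(html):
        os.makedirs('cache', exist_ok=True)
        with open(k, 'w', encoding='utf-8') as f:
            f.write(html)
    return html
Python Example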
import os
import requests
API_KEY = os.environ['SCAVIO_API_KEY']
def fetch(url):
    # Request rendered HTML through the premium resolver.
    r = requests.post('https://api.scavio.dev/api/v1/search',
                      headers={'x-api-key': API_KEY},
                      json={'query': url, 'platform': 'extract', 'render_js': True, 'resolver': 'premium'},
                      timeout=60)
    return r.json().get('html', '')
html = fetch('https://turnstile-protected.com')
# An empty response also counts as blocked.
print('clean' if html and 'challenge-platform' not in html else 'still blocked')
JavaScript Example
const API_KEY = process.env.SCAVIO_API_KEY;
export async function fetchPage(url) {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: url, platform: 'extract', render_js: true, resolver: 'premium' })
  });
  return (await r.json()).html;
}
Expected Output
Expect clean HTML from Turnstile-protected pages in 2-8 seconds. The typical success rate via the premium resolver is 95%+.
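The walkthrough steps (fetch, validate, retry with the premium resolver) can be chained into one call. The sketch below is one possible arrangement, not Scavio's own client: `looks_blocked` and `fetch_with_fallback` are hypothetical names, the markers and the 1000-character threshold are the ones used in Steps 1 and 3, and the fetchers are passed in as plain callables so the logic can be exercised without network access.

```python
from typing import Callable

# Marker strings from Step 1; treating both as block signals is an assumption.
MARKERS = ('Just a moment', 'challenge-platform')


def looks_blocked(html: str, min_length: int = 1000) -> bool:
    """Return True if the HTML is too short or contains a challenge marker."""
    return len(html) <= min_length or any(m in html for m in MARKERS)


def fetch_with_fallback(url: str,
                        fetchers: list[Callable[[str], str]]) -> str:
    """Try each fetcher in order and return the first unblocked HTML.

    Raises RuntimeError if every fetcher returns a blocked page.
    """
    for fetcher in fetchers:
        html = fetcher(url)
        if not looks_blocked(html):
            return html
    raise RuntimeError(f'All fetchers were blocked for {url}')
```

With the functions from Steps 2 and 4, a call would look like `fetch_with_fallback(url, [fetch, fetch_premium])`, so the premium resolver is requested only when the standard one still returns a challenge page.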