An r/webscraping post described a browser-side YouTube clipper that kept hitting IP-level firewall blocks on Supabase. This guide walks through the architectural fix.
Prerequisites
- Scavio API key
- Decide: do you need video bytes or just metadata?
Walkthrough
Step 1: Classify need: metadata or bytes?
Most clip-tool UX needs transcripts and timestamps, not video bytes.
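The classification can be written down as a small decision function. This is only a sketch: `ClipPath`, `classify_need`, and both flags are hypothetical names introduced here for illustration, not part of any library.

```python
from enum import Enum


class ClipPath(Enum):
    METADATA = "metadata"  # transcript + timestamps, played back in an iframe
    BYTES = "bytes"        # raw video data fetched server-side


def classify_need(iframe_playback_ok: bool, needs_raw_video: bool) -> ClipPath:
    """Pick the cheapest path that still satisfies the product requirement."""
    if needs_raw_video or not iframe_playback_ok:
        return ClipPath.BYTES
    return ClipPath.METADATA
```

Writing the rule out makes the default explicit: the bytes path is chosen only when iframe playback is ruled out or raw video is a hard requirement.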
# Decision: if iframe playback + transcript-driven clip moments are acceptable, you don't need bytes.
Step 2: Metadata path: Scavio YouTube endpoint
The endpoint returns typed JSON with title, duration, transcript_segments, and chapters.
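To make that shape concrete, the fields could be modeled as dataclasses. Only the four top-level field names come from this guide. The segment keys (`start`, `text`), the unit of `duration` (seconds), and the names `TranscriptSegment`, `VideoMetadata`, and `parse_metadata` are assumptions made for illustration, so check them against a real response before relying on them.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class TranscriptSegment:
    start: float  # assumed: offset in seconds from the start of the video
    text: str     # assumed key name for the spoken text


@dataclass
class VideoMetadata:
    title: str
    duration: float  # assumed: seconds
    transcript_segments: list[TranscriptSegment] = field(default_factory=list)
    chapters: list[dict[str, Any]] = field(default_factory=list)


def parse_metadata(payload: dict[str, Any]) -> VideoMetadata:
    """Convert a decoded JSON payload into typed objects, tolerating missing lists."""
    segments = [
        TranscriptSegment(start=float(s["start"]), text=str(s["text"]))
        for s in payload.get("transcript_segments") or []
    ]
    return VideoMetadata(
        title=str(payload["title"]),
        duration=float(payload["duration"]),
        transcript_segments=segments,
        chapters=list(payload.get("chapters") or []),
    )
```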
import os
import requests
# video_url: full YouTube URL of the video to clip, set by the caller
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}
r = requests.post('https://api.scavio.dev/api/v1/search', headers=H, json={'platform': 'youtube', 'url': video_url, 'include_transcript': True}).json()
Step 3: Front-end: iframe playback + transcript clip UX
The user picks a clip moment from the transcript, and the iframe seeks to that timestamp.
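On the server side, the embed URL for a chosen transcript window can be built with a small helper. This is a minimal sketch: `build_embed_url` is a hypothetical name, and it relies on the YouTube embed player's `start` and `end` parameters, which take whole seconds.

```python
import math
from urllib.parse import urlencode


def build_embed_url(video_id: str, start: float, end: float) -> str:
    """Return an embed URL that plays only the [start, end] window."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    # The embed player takes whole seconds: round start down and end up
    # so the clip never cuts off the selected transcript segment.
    params = {"start": math.floor(start), "end": math.ceil(end)}
    return f"https://www.youtube.com/embed/{video_id}?{urlencode(params)}"
```

Rounding outward slightly widens the clip, which is usually preferable to trimming the first or last word of the selected segment.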
// <iframe src={`https://www.youtube.com/embed/${id}?start=${start}&end=${end}`} />
Step 4: Bytes path (only if needed): edge worker
A Cloudflare or Vercel Edge worker performs the fetch through a rotating residential proxy.
// Cloudflare Worker (sketch): fetch(`https://www.youtube.com/watch?v=${id}`, { /* proxy headers */ })
Step 5: Cache aggressively per video URL
Cache transcripts in Postgres or Redis.
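The caching behavior can be sketched with an in-memory stand-in before wiring up a real store. `TranscriptCache` and its methods are hypothetical names. The dictionary only imitates what Redis or Postgres would do, using the `youtube:{video_id}` key and 7-day expiry from this step.

```python
import time
from typing import Any, Callable, Optional

SEVEN_DAYS = 7 * 24 * 60 * 60  # TTL in seconds


class TranscriptCache:
    """In-memory stand-in for a Redis or Postgres cache with per-entry expiry."""

    def __init__(self, ttl: float = SEVEN_DAYS, clock: Callable[[], float] = time.time):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._store: dict[str, tuple[float, Any]] = {}

    @staticmethod
    def key(video_id: str) -> str:
        return f"youtube:{video_id}"

    def get(self, video_id: str) -> Optional[Any]:
        entry = self._store.get(self.key(video_id))
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller refetches.
            del self._store[self.key(video_id)]
            return None
        return value

    def set(self, video_id: str, value: Any) -> None:
        self._store[self.key(video_id)] = (self._clock() + self._ttl, value)
```

A caller would check `get` first and hit the API only on a miss, then `set` the result. With Redis, the same expiry could be delegated to the store's own TTL support instead of being tracked by hand.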
# Cache key: youtube:{video_id} → expire 7 days.
Step 6: Test: no Supabase IP fights
Tail the logs and confirm the block errors are gone.
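The same check can be automated instead of eyeballed. This sketch assumes the logs are available as plain text lines and that block errors contain the literal message shown in this step. `count_block_errors` and `BLOCK_MARKER` are hypothetical names.

```python
from typing import Iterable

BLOCK_MARKER = "YouTube blocked the request"


def count_block_errors(lines: Iterable[str]) -> int:
    """Count log lines that mention the block error."""
    return sum(1 for line in lines if BLOCK_MARKER in line)
```

Because an open file is iterable line by line, a log file handle can be passed straight in. A result of zero is the passing condition.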
# Sanity: tail logs; expect zero 'YouTube blocked the request' errors.
Python Example
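As a rough budgeting sketch for the figures quoted in this section (about 1 credit per video and a cache hit rate around 70%), credit usage can be estimated from the number of lookups. `estimate_credits` is a hypothetical helper, and it assumes only cache misses reach the API.

```python
def estimate_credits(lookups: int, hit_rate: float, credits_per_video: int = 1) -> int:
    """Estimate credits spent when only cache misses reach the API."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    hits = round(lookups * hit_rate)  # lookups served from the cache
    return (lookups - hits) * credits_per_video
```

For example, 10,000 lookups at a 70% hit rate come to 3,000 credits under these assumptions.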
# Per video: ~1 Scavio credit. Cache hit rate after first 1K videos: typically 70%+.
JavaScript Example
// Same shape in Node + Hono on edge.
Expected Output
The metadata path produces typed JSON without IP blocks. The bytes path, when needed, routes through a residential proxy.