Karpathy LLM Wiki Stack in 2026
An r/AI_Agents post asked which content, web-scraping, web-search, ingestion libraries, or MCPs suit a Karpathy-style LLM Wiki. The post is short and the thread's answers are scattered, but the practical 2026 stack is more concrete than the thread suggests: Scavio + Qdrant + an LLM. Three vendors, four ingestion surfaces, citation-grounded answers.
What a Karpathy-style LLM Wiki ingests
Four surfaces, in roughly this order of usefulness:
- Web SERP for top-ranked summaries of any topic
- Reddit for community discussion and edge cases
- YouTube for lecture transcripts and demos
- arXiv for primary papers
The minimum stack handles all four with citation-grade source URLs and embedding-ready markdown.
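Whatever the surface, each ingested item can be normalized into one record before embedding. A minimal sketch; the field names are my assumption, not any vendor's schema:

```python
from dataclasses import dataclass

@dataclass
class Source:
    """One ingested document, normalized across the four surfaces."""
    url: str       # citation-grade source URL
    surface: str   # 'web' | 'reddit' | 'youtube' | 'arxiv'
    title: str
    markdown: str  # embedding-ready extracted content

doc = Source(url="https://arxiv.org/abs/1706.03762",
             surface="arxiv",
             title="Attention Is All You Need",
             markdown="# Attention Is All You Need\n...")
```

Keeping the URL on every record is what makes citation-grounded answers possible downstream.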
The 2026 minimum stack
- Scavio for search + extract across web/Reddit/YouTube
- Firecrawl for whole-site crawls of paginated docs (e.g., docs.python.org)
- Qdrant Cloud for vector storage
- Jina or OpenAI for embeddings
- Claude/GPT/DeepSeek for citation-grounded answers
Verified pricing for the data layer (2026-04):
- Scavio: $30/mo for 7K credits ($0.0043 per search or extract)
- Firecrawl Hobby: $16/mo for 3K credits
- Qdrant Cloud: free 1GB; from $25/mo
- Jina embeddings: free tier; paid PAYG
The ingestion loop
```python
import os
import requests

H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def discover(topic):
    """Pull candidate sources from Google + Reddit + YouTube under one key."""
    return {
        'web': requests.post('https://api.scavio.dev/api/v1/search',
                             headers=H, json={'query': topic}).json().get('organic_results', []),
        'reddit': requests.post('https://api.scavio.dev/api/v1/reddit/search',
                                headers=H, json={'query': topic}).json().get('posts', []),
        'youtube': requests.post('https://api.scavio.dev/api/v1/youtube/search',
                                 headers=H, json={'query': topic}).json().get('videos', []),
    }

def extract(url):
    """Markdown-ready content for embedding."""
    return requests.post('https://api.scavio.dev/api/v1/extract',
                         headers=H, json={'url': url, 'format': 'markdown'}).json().get('markdown', '')
```

Citation-grounded answers
The wiki's value is sourced answers, not just retrieval. A strict citation prompt enforces this:
```python
PROMPT = '''Answer using ONLY the sources below. Every claim must be followed by [N], the source number.
If the sources don't answer the question, say "I don't know based on these sources."

Sources:
{src}

Question: {q}'''
```
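The bracketed markers can then be rewritten into links. A minimal sketch using Python's `re` module; only the `[N]` marker format comes from the prompt above, the rest is an assumption:

```python
import re

def render_citations(answer: str, sources: list[str]) -> str:
    """Replace [N] markers with markdown links to the N-th source URL."""
    def link(m):
        n = int(m.group(1))
        if 1 <= n <= len(sources):
            return f"[[{n}]]({sources[n - 1]})"
        return m.group(0)  # leave out-of-range markers untouched
    return re.sub(r"\[(\d+)\]", link, answer)
```

Unknown markers are left as-is rather than silently dropped, so a hallucinated `[9]` stays visible for review.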
At render time, each [N] becomes a clickable link to its source URL.

Cost math at typical wiki volumes
Assume 100 wiki questions/day, each requiring 5 searches + 3 extracts + 1 LLM call:
- Scavio: 800 credits/day = ~24K credits/mo, which sits inside the Bootstrap tier ($100/mo for 28K credits)
- Qdrant: free tier covers ~1M points
- LLM tokens: ~$0.05/answer = $5/day = $150/mo at typical Claude Sonnet rates
Total: ~$250/mo for a 100-questions/day wiki with citation-grounded answers across web/Reddit/YouTube.
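The arithmetic above is easy to parameterize and sanity-check; this sketch just encodes the section's own rates (the tier cutoff is a simplification):

```python
def monthly_cost(questions_per_day, searches=5, extracts=3,
                 llm_cost_per_answer=0.05, days=30):
    """Back-of-envelope monthly cost for the wiki's data + LLM layers."""
    credits_mo = questions_per_day * (searches + extracts) * days
    scavio = 100.0 if credits_mo > 7_000 else 30.0  # Bootstrap vs. base tier
    llm = questions_per_day * llm_cost_per_answer * days
    return credits_mo, scavio + llm

credits, dollars = monthly_cost(100)
# matches the section's totals: 24,000 credits/mo and $250/mo
```

Halving question volume drops the LLM bill linearly but Scavio tiers move in steps, so the data layer dominates at low volume.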
Honest constraints
Scavio doesn't replace the vector store, the embedding model, or the LLM. It replaces the search + extract layer, which typically means 3-5 vendors before consolidation. The wiki still needs Qdrant or Weaviate for semantic recall and a model for embeddings.
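One step the stack also leaves to you is chunking extracted markdown before it goes to the embedding model and into Qdrant. A minimal fixed-size, overlapping chunker; the size and overlap values are arbitrary assumptions:

```python
def chunk(markdown: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split extracted markdown into overlapping character windows for embedding."""
    step = size - overlap
    return [markdown[i:i + size]
            for i in range(0, max(len(markdown) - overlap, 1), step)]
```

Each chunk then gets embedded (Jina or OpenAI, per the stack above) and upserted to Qdrant with the source URL in its payload so citations survive retrieval.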
Whole-site recursive crawls (e.g., ingesting all of docs.python.org) belong in Firecrawl's crawl mode; Scavio is a per-URL extractor, not a site walker.
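The Firecrawl side is one POST that returns a job to poll. This sketch reflects my reading of Firecrawl's v1 crawl endpoint (`/v1/crawl`, bearer auth, `limit`, `scrapeOptions`); verify the field names against their current docs before relying on it:

```python
def crawl_payload(root_url: str, limit: int = 100) -> dict:
    """Body for a recursive site crawl capped at `limit` pages, markdown output."""
    return {"url": root_url, "limit": limit,
            "scrapeOptions": {"formats": ["markdown"]}}

def start_crawl(root_url: str) -> dict:
    """Kick off the crawl; Firecrawl responds with a job id to poll for results."""
    import os
    import requests  # imported lazily so the payload helper stays dependency-free
    return requests.post(
        "https://api.firecrawl.dev/v1/crawl",
        headers={"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"},
        json=crawl_payload(root_url),
    ).json()
```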
Why this stack vs alternatives
The Tavily + Reddit scraper + YouTube Data API + Firecrawl + Qdrant + LLM stack is six vendors; Scavio + Qdrant + LLM is three. Vendor sprawl costs more engineering time than per-call pricing differences ever save; consolidation is the real win.