Grounding LLM answers in source code beats hallucinated explanations. This tutorial uses Scavio's SERP with site:github.com plus its fetch endpoint to bring repo content into the agent loop without a heavy GitHub API integration.
Prerequisites
- Python 3.10+
- A Scavio API key
- An LLM API key
Walkthrough
Step 1: Search inside a repo via SERP
A search scoped with site:github.com/ORG/REPO finds the right file fast.
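Before wiring up the API call, it helps to see the query construction and result filtering in isolation. A small sketch (the helper names and sample hits are my own, not part of the Scavio API; only results with `/blob/` in the URL point at files, which is what Step 2's raw-URL rewrite expects):

```python
def scoped_query(repo, query):
    # Restrict SERP results to a single repository with a site: operator.
    return f'site:github.com/{repo} {query}'

def file_hits(hits):
    # Keep only results that point at a file ("blob" URLs),
    # skipping issues, PRs, and repo landing pages.
    return [h for h in hits if '/blob/' in h.get('link', '')]

sample = [
    {'link': 'https://github.com/prisma/prisma/blob/main/packages/migrate/src/Migrate.ts'},
    {'link': 'https://github.com/prisma/prisma/issues/123'},
]
print(scoped_query('prisma/prisma', 'migrate'))
# site:github.com/prisma/prisma migrate
print(file_hits(sample))  # keeps only the /blob/ result
```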
```python
import os
import requests

API_KEY = os.environ['SCAVIO_API_KEY']

def repo_search(repo, query):
    # Scope the SERP query to one repository via the site: operator.
    r = requests.post('https://api.scavio.dev/api/v1/search',
                      headers={'x-api-key': API_KEY},
                      json={'query': f'site:github.com/{repo} {query}',
                            'num_results': 10})
    r.raise_for_status()
    return r.json().get('organic_results', [])
```

Step 2: Fetch the selected file
Rewrite GitHub blob URLs to their raw.githubusercontent.com form; those raw URLs work directly with Scavio's extract endpoint.
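The rewrite is plain string substitution: swap the host and drop the `/blob/` path segment. A concrete before/after (the example URL is illustrative):

```python
def to_raw(url):
    # github.com/ORG/REPO/blob/BRANCH/PATH ->
    # raw.githubusercontent.com/ORG/REPO/BRANCH/PATH
    return url.replace('github.com', 'raw.githubusercontent.com').replace('/blob/', '/')

blob = 'https://github.com/prisma/prisma/blob/main/packages/migrate/src/Migrate.ts'
print(to_raw(blob))
# https://raw.githubusercontent.com/prisma/prisma/main/packages/migrate/src/Migrate.ts
```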
```python
def fetch_raw(url):
    # Map a GitHub blob URL to its raw-content equivalent before extracting.
    raw = url.replace('github.com', 'raw.githubusercontent.com').replace('/blob/', '/')
    r = requests.post('https://api.scavio.dev/api/v1/extract',
                      headers={'x-api-key': API_KEY},
                      json={'url': raw})
    r.raise_for_status()
    return r.json().get('content', '')
```

Step 3: Ground the answer
Pass the fetched content into the LLM prompt with source citation.
```python
import anthropic

client = anthropic.Anthropic()

def grounded_answer(repo, question):
    hits = repo_search(repo, question)
    content = fetch_raw(hits[0]['link']) if hits else ''
    msg = client.messages.create(
        model='claude-sonnet-4-6',
        max_tokens=1024,
        messages=[{'role': 'user',
                   'content': f'{question}\n\nCONTEXT:\n{content[:4000]}'}])
    return msg.content[0].text
```

Step 4: Add multi-file composition
Pull the top three results, rank them by relevance, and compose a combined context.
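The function below takes the SERP order as the relevance ranking. If you want an explicit re-ranking pass, a minimal keyword-overlap scorer could look like this (a sketch, not part of Scavio; the sample hits are illustrative):

```python
def rank_by_overlap(hits, question):
    # Score each hit by how many question words appear in its
    # title or snippet; crude, but an explicit relevance signal.
    words = set(question.lower().split())
    def score(h):
        text = (h.get('title', '') + ' ' + h.get('snippet', '')).lower()
        return sum(w in text for w in words)
    return sorted(hits, key=score, reverse=True)

hits = [
    {'title': 'README', 'snippet': 'project overview'},
    {'title': 'Migrate.ts', 'snippet': 'schema migrate engine entrypoint'},
]
print(rank_by_overlap(hits, 'migrate engine')[0]['title'])  # Migrate.ts
```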
```python
def multi_file_context(repo, question):
    hits = repo_search(repo, question)[:3]
    return '\n\n'.join(fetch_raw(h['link'])[:2000] for h in hits)
```

Step 5: Validate citations
Ensure the LLM response mentions at least one source URL.
```python
def has_citations(answer, urls):
    return any(u in answer for u in urls)
```

Python Example
```python
import os
import requests

API_KEY = os.environ['SCAVIO_API_KEY']

def repo_grounded(repo, question):
    r = requests.post('https://api.scavio.dev/api/v1/search',
                      headers={'x-api-key': API_KEY},
                      json={'query': f'site:github.com/{repo} {question}'})
    return r.json().get('organic_results', [])[:3]

print(repo_grounded('prisma/prisma', 'migrate.ts'))
```

JavaScript Example
```javascript
const API_KEY = process.env.SCAVIO_API_KEY;

export async function repoGrounded(repo, question) {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: `site:github.com/${repo} ${question}` })
  });
  return ((await r.json()).organic_results || []).slice(0, 3);
}
```

Expected Output
LLM answers cite exact files and code paths in the target repo. Hallucination rate drops materially versus ungrounded answers.
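The steps above compose into one loop: search, fetch, prompt, then validate citations. A runnable end-to-end sketch with the network and LLM calls stubbed out so it executes offline (the stubs and their return values are illustrative; swap in the real `repo_search`, `fetch_raw`, and LLM call from the walkthrough):

```python
def repo_search_stub(repo, query):
    # Stands in for the Scavio SERP call.
    return [{'link': f'https://github.com/{repo}/blob/main/src/example.py'}]

def fetch_raw_stub(url):
    # Stands in for the extract call.
    return 'def migrate(): ...'

def llm_stub(prompt):
    # Stands in for the model; here it happens to cite its source URL.
    return ('migrate() is defined in '
            'https://github.com/acme/widgets/blob/main/src/example.py')

def grounded_pipeline(repo, question):
    hits = repo_search_stub(repo, question)[:3]
    context = '\n\n'.join(fetch_raw_stub(h['link'])[:2000] for h in hits)
    answer = llm_stub(f'{question}\n\nCONTEXT:\n{context}')
    urls = [h['link'] for h in hits]
    cited = any(u in answer for u in urls)   # same check as has_citations
    return answer, cited

answer, cited = grounded_pipeline('acme/widgets', 'where is migrate defined?')
print(cited)  # True
```

If `cited` comes back False in a real run, a reasonable fallback is to re-prompt with an explicit instruction to cite the file URL from the context.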