Tutorial

How to Ground an LLM with GitHub Repo Data

Ground LLM answers in actual repo content by combining GitHub search via SERP site operators with Scavio's fetch endpoint.

Grounding LLM answers in source code beats hallucinated explanations. This tutorial uses Scavio's SERP with site:github.com plus its fetch endpoint to bring repo content into the agent loop without a heavy GitHub API integration.

Prerequisites

  • Python 3.10+
  • A Scavio API key
  • An LLM API key

Walkthrough

Step 1: Search inside a repo via SERP

A search scoped with site:github.com/ORG/REPO finds the right file fast.

Python
import requests, os
API_KEY = os.environ['SCAVIO_API_KEY']

def repo_search(repo, query):
    # Scope the SERP query to a single repo with a site: operator.
    r = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'query': f'site:github.com/{repo} {query}', 'num_results': 10},
        timeout=30)
    r.raise_for_status()
    return r.json().get('organic_results', [])
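SERP results for a site:-scoped query can include issues, pull requests, and directory pages as well as files. A small filter helps before fetching; this is a sketch that assumes each result dict carries a 'link' key, as repo_search returns above.

```python
# Sketch: keep only results that point at file pages ('/blob/' paths),
# since site: queries also surface issues, PRs, and directory listings.
# Assumes each SERP result is a dict with a 'link' key.
def file_hits(results):
    return [h for h in results if '/blob/' in h.get('link', '')]
```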

Step 2: Fetch the selected file

GitHub raw URLs work with Scavio's fetch endpoint.

Python
def fetch_raw(url):
    # Rewrite the GitHub web URL to its raw-content equivalent.
    raw = url.replace('github.com', 'raw.githubusercontent.com').replace('/blob/', '/')
    r = requests.post('https://api.scavio.dev/api/v1/extract',
        headers={'x-api-key': API_KEY},
        json={'url': raw},
        timeout=30)
    r.raise_for_status()
    return r.json().get('content', '')
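The URL rewrite inside fetch_raw can be factored into a helper. The sketch below performs the same '/blob/' substitution with a guard for non-file URLs; the mapping only holds for GitHub file pages, so anything else (issues, tree views) is rejected.

```python
# Sketch: convert a GitHub file URL to its raw-content equivalent.
# https://github.com/ORG/REPO/blob/BRANCH/path
#   -> https://raw.githubusercontent.com/ORG/REPO/BRANCH/path
# Only '/blob/' URLs map cleanly; other GitHub pages have no raw form.
def to_raw_url(url):
    if '/blob/' not in url:
        raise ValueError(f'not a GitHub file URL: {url}')
    return url.replace('github.com', 'raw.githubusercontent.com').replace('/blob/', '/', 1)
```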

Step 3: Ground the answer

Pass the fetched content into the LLM prompt along with its source URL so the model can cite it.

Python
import anthropic
client = anthropic.Anthropic()

def grounded_answer(repo, question):
    hits = repo_search(repo, question)
    content = fetch_raw(hits[0]['link']) if hits else ''
    msg = client.messages.create(
        model='claude-sonnet-4-6',
        max_tokens=1024,
        messages=[{'role': 'user', 'content': f'{question}\n\nCONTEXT:\n{content[:4000]}'}])
    return msg.content[0].text
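One way to make citation more reliable is to put the source URL into the prompt itself, rather than only the file content. The helper below is a sketch; MAX_CHARS and the instruction wording are assumptions to adapt to your model and context budget.

```python
# Sketch: build a grounded prompt that embeds the source URL alongside the
# content, so the model can cite the exact file it was grounded on.
# MAX_CHARS is an assumed budget; tune it to your model's context window.
MAX_CHARS = 4000

def build_prompt(question, content, source_url):
    return (f'{question}\n\n'
            f'Answer using only the CONTEXT below and cite SOURCE in your answer.\n'
            f'SOURCE: {source_url}\n'
            f'CONTEXT:\n{content[:MAX_CHARS]}')
```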

Step 4: Add multi-file composition

Pull the top three results (SERP order serves as a first-pass relevance ranking) and concatenate truncated excerpts into one context block.

Python
def multi_file_context(repo, question):
    hits = repo_search(repo, question)[:3]
    return '\n\n'.join([fetch_raw(h['link'])[:2000] for h in hits])
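SERP order is a reasonable first pass, but you can rerank hits before fetching. The sketch below scores results by keyword overlap between the question and the result's title and snippet (both assumed keys of the SERP result dict); an embedding-based reranker would be stronger.

```python
# Sketch: rerank SERP hits by keyword overlap with the question.
# Assumes results may carry 'title' and 'snippet' keys; missing keys score 0.
def rerank(hits, question):
    words = set(question.lower().split())
    def score(h):
        text = f"{h.get('title', '')} {h.get('snippet', '')}".lower()
        return sum(1 for w in words if w in text)
    # Stable sort: ties keep their original SERP order.
    return sorted(hits, key=score, reverse=True)
```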

Step 5: Validate citations

Ensure the LLM response mentions at least one source URL.

Python
def has_citations(answer, urls):
    return any(u in answer for u in urls)
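A stricter check also catches invented citations: extract every github.com URL that appears in the answer and verify each one against the sources actually fetched. This is a sketch using a simple regex; the trailing-punctuation strip is a heuristic for URLs at sentence end.

```python
import re

# Sketch: pull github.com URLs out of the answer and require that at least
# one citation exists and that every cited URL was actually fetched.
def cited_urls(answer):
    return re.findall(r'https://github\.com/\S+', answer)

def citations_valid(answer, allowed):
    found = cited_urls(answer)
    # rstrip is a heuristic for sentence-final punctuation stuck to the URL.
    return bool(found) and all(u.rstrip('.,)') in allowed for u in found)
```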

Python Example

Python
import os, requests
API_KEY = os.environ['SCAVIO_API_KEY']

def repo_grounded(repo, question):
    r = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY},
        json={'query': f'site:github.com/{repo} {question}'})
    return r.json().get('organic_results', [])[:3]

print(repo_grounded('prisma/prisma', 'migrate.ts'))

JavaScript Example

JavaScript
const API_KEY = process.env.SCAVIO_API_KEY;
export async function repoGrounded(repo, question) {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: `site:github.com/${repo} ${question}` })
  });
  return ((await r.json()).organic_results || []).slice(0, 3);
}

Expected Output

LLM answers cite exact files and code paths in the target repo. Hallucination rates drop materially versus ungrounded answers.

Frequently Asked Questions

How long does this tutorial take?

Most developers complete it in 15 to 30 minutes. You will need a Scavio API key (the free tier works) and a working Python or JavaScript environment.

What are the prerequisites?

Python 3.10+, a Scavio API key, and an LLM API key. A Scavio API key comes with 500 free credits per month.

Can I complete this on the free tier?

Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio integrate with LLM frameworks?

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt it to your framework of choice.
