Grounding LLMs in Code Repo Context
Naive RAG on code hallucinates. Four grounding strategies that work: structural indexing, call graph traversal, git blame, external docs.
An r/LanguageTechnology thread on "grounding LLM workflows in repo understanding" captured a real 2026 problem: LLMs hallucinate API details constantly, and the fix is not a better model but grounding. This post is the practical implementation.
What Grounding Actually Means
Grounding means the LLM's answer is traceable back to a concrete source in the context window. A grounded answer includes citations ("per file x.py line 42, the function returns None") and refuses when the source is absent. An ungrounded answer sounds confident regardless of whether the source exists.
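That contract can be checked mechanically. A minimal sketch of a citation validator, assuming an illustrative citation format ("per file x.py line 42" or "x.py:42") and a refusal phrase; neither is a standard, both are stand-ins for whatever your system prompt mandates:

```python
import re

# Hypothetical citation formats: "per file <path> line <n>" or "<path>:<n>".
CITATION = re.compile(r'(?:per file |\b)([\w./-]+\.py)(?::| line )(\d+)')

def is_grounded(answer: str, context_files: dict[str, int]) -> bool:
    """True if the answer cites a file/line present in the context window.

    context_files maps each file path in the context to its line count.
    """
    for path, line in CITATION.findall(answer):
        if path in context_files and int(line) <= context_files[path]:
            return True
    # No valid citation: only an explicit refusal counts as grounded.
    return "don't have that information" in answer.lower()
```

A validator like this runs after generation and rejects confident-sounding answers that cite nothing in the context.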
The Problem with Naive RAG
Most teams ship a vector-search RAG, push code chunks in, and call it grounded. It is not. Vector search retrieves semantically similar chunks, which for code is often wrong. The function you asked about might live in a different file than the one with the similar embedding. The LLM then composes an answer based on the wrong chunk and reports it as grounded. This is the worst of both worlds.
Four Grounding Strategies That Work
- Structural indexing: index by function/class name, not by chunk similarity. When the user asks about calculateTax, retrieve every occurrence of that symbol.
- Call graph traversal: given a function, also retrieve its callers and callees. The LLM gets the full execution context.
- Git blame integration: surface the commit that introduced the code. The commit message often explains the why.
- External docs grounding: when the code uses a library, retrieve the library's current docs via Scavio SERP with the site:docs.* operator.
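Of the four, call graph traversal is the least obvious to implement. A minimal sketch using Python's stdlib ast module rather than tree-sitter, so it runs standalone; it only resolves direct calls to plain names, and the function names are illustrative:

```python
import ast
from collections import defaultdict

def build_call_graph(source: str):
    """Map each top-level function to the names it calls, plus the reverse."""
    tree = ast.parse(source)
    callees = defaultdict(set)
    for fn in [n for n in tree.body if isinstance(n, ast.FunctionDef)]:
        for node in ast.walk(fn):
            # Only direct calls to bare names; attribute calls need more work.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                callees[fn.name].add(node.func.id)
    callers = defaultdict(set)
    for fn, called in callees.items():
        for c in called:
            callers[c].add(fn)
    return callees, callers

def execution_context(symbol: str, callees, callers) -> set[str]:
    # Retrieve the symbol plus its direct callers and callees.
    return {symbol} | callees.get(symbol, set()) | callers.get(symbol, set())
```

The set returned by execution_context is what you feed the retriever: the function itself, everything it calls, and everything that calls it.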
Implementing Structural Indexing
Tree-sitter parses source files into an AST. Index every top-level function and class name with its file path and line range.
```python
import os

import psycopg2
import tree_sitter_python as tspython
from tree_sitter import Language, Parser

parser = Parser(Language(tspython.language()))

def index_file(path: str):
    with open(path) as f:
        src = f.read().encode()
    tree = parser.parse(src)
    conn = psycopg2.connect(os.environ['DATABASE_URL'])
    try:
        for node in tree.root_node.children:
            if node.type in ('function_definition', 'class_definition'):
                name = node.child_by_field_name('name').text.decode()
                # Tree-sitter points are zero-based (row, column) tuples.
                start, end = node.start_point[0], node.end_point[0]
                body = src[node.start_byte:node.end_byte].decode()
                with conn.cursor() as c:
                    c.execute("""
                        INSERT INTO symbols(name, file, start_line, end_line, body)
                        VALUES (%s, %s, %s, %s, %s)
                    """, (name, path, start, end, body))
        conn.commit()
    finally:
        conn.close()
```

Combining Internal and External Grounding
A code agent that only sees your repo is missing half the context. The other half is the library documentation. Scavio plus the site operator on docs domains pulls in the third-party side.
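ground_answer below leans on a lookup_symbol helper from the structural index. One possible shape, shown here against sqlite3 so it runs standalone (with the Postgres table from the previous section you would pass a psycopg2 connection and use %s placeholders; the helper itself is not in the original):

```python
import sqlite3

def lookup_symbol(conn, symbol: str) -> list[dict]:
    """Fetch every indexed occurrence of a symbol, ready for citation."""
    rows = conn.execute(
        "SELECT file, start_line, end_line, body FROM symbols WHERE name = ?",
        (symbol,),
    ).fetchall()
    return [
        {'file': f, 'start_line': s, 'end_line': e, 'body': b}
        for f, s, e, b in rows
    ]
```

Returning the file and line range alongside the body is what makes citation possible: the LLM can quote "per file tax.py lines 10-25" instead of asserting from memory.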
```python
import os

import requests

API_KEY = os.environ['SCAVIO_API_KEY']

def ground_answer(symbol: str, library_domain: str | None = None):
    internal = lookup_symbol(symbol)  # from the structural index
    external = []
    if library_domain:
        r = requests.post(
            'https://api.scavio.dev/api/v1/search',
            headers={'x-api-key': API_KEY},
            json={'query': f'site:{library_domain} {symbol}'},
            timeout=10,
        )
        r.raise_for_status()
        external = r.json().get('organic_results', [])[:3]
    return {'internal': internal, 'external': external}
```

The Refusal Rule
A grounded agent must refuse when it has nothing to cite. The system prompt is explicit:
SYSTEM PROMPT:
You are a code assistant. For every factual claim:
1. Cite a specific file and line range from the context, or
2. Cite a specific external documentation URL from the context, or
3. Say "I don't have that information in my context" and stop.
Never invent function names, parameter types, or return values.

How to Evaluate Grounding Quality
Build a test set of 50 questions where the correct answer is unambiguously in one specific file. Score the agent on: (a) does the answer cite that file? (b) is the line range correct? (c) if removed from context, does the agent refuse?
A well-grounded agent scores above 80% on (a) and (b) and 100% on (c). Most naive RAG implementations land around 40% on (a) and well below 100% on (c), which is why they feel "confident but wrong".
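The three checks can be scored with a small harness. A sketch; the per-question result shape and field names are assumed conventions, not from the original:

```python
def score_grounding(results: list[dict]) -> dict:
    """Score a grounding eval run.

    Each result dict holds: cited_file, cited_range, expected_file,
    expected_range, refused_when_absent. Ranges are (start, end) tuples.
    """
    n = len(results)
    # (a) does the answer cite the right file?
    file_ok = sum(r['cited_file'] == r['expected_file'] for r in results)
    # (b) is the line range correct (only counts if the file was right)?
    range_ok = sum(
        r['cited_file'] == r['expected_file']
        and r['cited_range'] == r['expected_range']
        for r in results
    )
    # (c) did the agent refuse when the source was removed from context?
    refusal_ok = sum(r['refused_when_absent'] for r in results)
    return {
        'file_citation': file_ok / n,
        'line_range': range_ok / n,
        'refusal': refusal_ok / n,
    }
```

Run it after each retrieval change; a regression in the refusal score is the earliest warning that the agent has started bluffing.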
The Honest Limitation
Grounding raises the floor on hallucination but does not eliminate it. An LLM can still misread a correctly retrieved chunk. Pair grounded retrieval with unit tests on the agent's output (does the code it suggests actually compile and pass tests?) for the highest reliability tier.
The full implementation walkthrough is in the how-to-ground-llm-with-github-repo-data tutorial.