AI Agent Context Handoff Problem

Agents waste tokens re-explaining context between steps. Structured search results as JSON fit in context windows better than prose summaries.

May 19, 2026

8 min

AI agents waste tokens re-explaining context between steps. When a research agent searches, analyzes, and then searches again, it often re-describes previous findings in the prompt, consuming context window space. Structured search results (JSON with typed fields) take fewer tokens than prose summaries of the same findings, leaving more room for actual reasoning.

The problem: context bloat between steps

A multi-step agent research task works like this: Step 1 searches for "best CRM tools." Step 2 analyzes the results. Step 3 searches for pricing details. But Step 3's prompt includes all of Step 1's results and Step 2's analysis as context. By Step 5, the context window is dominated by accumulated search results, not reasoning.

Tavily AI-summarized results: ~200-500 tokens per search (paragraphs of prose)
Structured JSON results: ~80-150 tokens per search (compact key-value pairs)
Over 50 searches: a difference of roughly 6,000-17,500 tokens in context window usage (120-350 tokens saved per search)
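The estimate above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch that pairs the low ends and the high ends of the two per-search ranges quoted in the list; the constant and function names are illustrative, not part of any API.

```python
# Per-search token ranges quoted in the list above (the article's estimates).
PROSE_TOKENS = (200, 500)       # AI-summarized prose: low, high
STRUCTURED_TOKENS = (80, 150)   # compact JSON: low, high

def tokens_saved(searches: int) -> tuple[int, int]:
    """Pair low with low and high with high, then scale by the search count."""
    low = (PROSE_TOKENS[0] - STRUCTURED_TOKENS[0]) * searches
    high = (PROSE_TOKENS[1] - STRUCTURED_TOKENS[1]) * searches
    return low, high

print(tokens_saved(50))  # (6000, 17500)
```

Swapping in token counts measured on your own results gives a more reliable figure than these ranges.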

Structured results: smaller context footprint

Python

import json
import os

import requests

# Auth header; expects the API key in the SCAVIO_API_KEY environment variable
H = {"x-api-key": os.environ["SCAVIO_API_KEY"]}

def compact_search(query: str) -> list[dict]:
    """Return minimal structured results that stay small in context."""
    resp = requests.post("https://api.scavio.dev/api/v1/search",
        headers=H, json={"query": query, "platform": "google"}, timeout=30)
    resp.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
    results = resp.json().get("organic_results", [])[:5]
    # Compact format: only the fields the agent needs, under short keys
    return [{
        "t": r.get("title", "")[:60],     # title, truncated to 60 characters
        "u": r.get("link", ""),           # URL
        "s": r.get("snippet", "")[:100],  # snippet, truncated to 100 characters
    } for r in results]

# Compact results use ~50% fewer tokens than full results
results = compact_search("best CRM for startups 2026")
print(json.dumps(results, indent=2))
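How the results are serialized also matters once they go into a prompt. Pretty-printing with `indent=2` is convenient for reading but pads the payload with whitespace, while `separators=(",", ":")` emits the same data with none. The sketch below compares the two on a made-up sample in the compact shape used above. The `rough_tokens` helper is a hypothetical stand-in based on the common rule of thumb of about 4 characters per token, so treat its numbers as a rough guide and use your model's tokenizer for real measurements.

```python
import json

# Hypothetical sample in the compact {"t", "u", "s"} shape (not real API output).
sample = [
    {"t": "Example CRM A", "u": "https://example.com/a", "s": "Short snippet about CRM A."},
    {"t": "Example CRM B", "u": "https://example.com/b", "s": "Short snippet about CRM B."},
]

pretty = json.dumps(sample, indent=2)                # readable, padded with whitespace
compact = json.dumps(sample, separators=(",", ":"))  # same data, no padding

def rough_tokens(text: str) -> int:
    """Assumed heuristic: roughly 4 characters per token for English text."""
    return len(text) // 4

print("chars:", len(pretty), "vs", len(compact))
print("approx tokens:", rough_tokens(pretty), "vs", rough_tokens(compact))
```

Both strings parse back to the same list, so the compact form loses nothing the agent needs.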

The re-search vs pass-through tradeoff

When an agent needs information from a previous step, it has two options: pass the previous results forward in context (costs input tokens but saves API calls) or re-search for the information (costs a search fee but saves tokens). The optimal choice depends on your cost structure.

Python

def should_research_or_reuse(
    previous_results: list,
    tokens_per_result: int = 100,
    cost_per_search: float = 0.005,
    cost_per_1k_tokens: float = 0.003  # typical input token cost
):
    """Decide whether to re-search or pass previous results in context."""
    context_cost = len(previous_results) * tokens_per_result * cost_per_1k_tokens / 1000
    search_cost = cost_per_search

    if search_cost < context_cost:
        return "re-search (cheaper than carrying context forward)"
    return "reuse (carrying context is cheaper than re-searching)"

# 10 previous results at 100 tokens each, $0.003/1K input tokens
# Context cost: 10 * 100 * 0.003 / 1000 = $0.003
# Search cost: $0.005
# Verdict: reuse the context (cheaper)
print(should_research_or_reuse([{}] * 10))

# 50 previous results
# Context cost: 50 * 100 * 0.003 / 1000 = $0.015
# Search cost: $0.005
# Verdict: re-search (cheaper)
print(should_research_or_reuse([{}] * 50))
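The comparison above prices the carried context for a single step. If the same results ride along for several steps before they are needed again, the input-token cost is paid on each of those steps. A minimal sketch of that extension follows. It assumes a re-search returns the same number of results and that those results then occupy the context for one step. The function names are illustrative and the default prices are the same placeholder values used above.

```python
def carry_cost(num_results: int, steps: int,
               tokens_per_result: int = 100,
               cost_per_1k_tokens: float = 0.003) -> float:
    """Input-token cost of resending the same results on each of `steps` steps."""
    return num_results * tokens_per_result * steps * cost_per_1k_tokens / 1000

def cheaper_option(num_results: int, steps_until_needed: int,
                   cost_per_search: float = 0.005) -> str:
    """Compare carrying results until they are needed vs. dropping and re-searching."""
    carry = carry_cost(num_results, steps_until_needed)
    # Assumption: a re-search costs one search fee plus one step of context.
    research = cost_per_search + carry_cost(num_results, 1)
    return "reuse" if carry <= research else "re-search"

print(cheaper_option(10, steps_until_needed=1))  # reuse
print(cheaper_option(10, steps_until_needed=5))  # re-search
```

Under these assumptions, even a small result set becomes cheaper to re-fetch once it has to be carried across enough intermediate steps.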

Design patterns for context-efficient agents

Summarize-and-discard: after analyzing search results, keep only the summary, drop raw results
Reference-by-ID: store results externally, pass only IDs in context, retrieve when needed
Structured over prose: JSON results take fewer tokens than AI-generated summaries
Budget-aware handoff: calculate whether re-searching or carrying context is cheaper
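As one illustration, the reference-by-ID pattern from the list above might look like the sketch below. `ResultStore` is a hypothetical in-memory stand-in for whatever external store you use (a database, a cache, a file), and the sample results are made up.

```python
import hashlib
import json

class ResultStore:
    """In-memory stand-in for an external result store (hypothetical helper)."""

    def __init__(self) -> None:
        self._items: dict[str, dict] = {}

    def put(self, result: dict) -> str:
        """Store a result and return a short ID derived from its content."""
        payload = json.dumps(result, sort_keys=True).encode("utf-8")
        rid = hashlib.sha1(payload).hexdigest()[:8]
        self._items[rid] = result
        return rid

    def get(self, rid: str) -> dict:
        """Fetch the full result for an ID."""
        return self._items[rid]

store = ResultStore()
ids = [store.put(r) for r in [
    {"t": "Example CRM A", "u": "https://example.com/a", "s": "Snippet A"},
    {"t": "Example CRM B", "u": "https://example.com/b", "s": "Snippet B"},
]]

# Only IDs (plus titles, so the model can tell them apart) go into the prompt.
context_refs = [{"id": rid, "t": store.get(rid)["t"]} for rid in ids]
print(json.dumps(context_refs, separators=(",", ":")))

# A later step that needs the full record fetches it by ID.
full = store.get(ids[0])
```

Keeping a title next to each ID is a design choice: it costs a few tokens but lets the model decide which record to fetch without guessing.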

The practical impact

For a 10-step research agent making 5 searches per step, the context management strategy can mean the difference between fitting in a 128K context window and hitting the limit around step 7, once per-step analysis and prompt overhead accumulate on top of the search results. Structured results can buy you 2-3 more steps before context overflow. That extra headroom often means completing the task instead of returning partial results.
