Tags: claude-code · mcp · scavio

Local Code Search MCP vs grep+read for Claude Code

An r/ClaudeAI post launched Semble. Local code search MCP cuts grep+read tokens ~98%. Pair with Scavio for out-of-repo grounding.

5 min read

An r/ClaudeAI post launched Semble, a local code search MCP that benchmarked at ~98% fewer tokens than grep+read for Claude Code. The number is striking, and the structural reason it's real is worth unpacking. The claim isn't marketing; it's a direct consequence of how Claude Code currently approaches in-repo search.

Why grep+read costs so many tokens

Default Claude Code on a large repo (>100K LOC) approaches in-repo lookups like this: grep for the keyword, get a list of N matches, read N files in full to understand context, decide which file to edit. On a 200K LOC codebase, "find the pagination logic" can fan out across 12-15 files of 500-2000 lines each. That's 30-50K input tokens per query.
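The fan-out cost is easy to reproduce with a back-of-envelope sketch. Everything below is fabricated for illustration — synthetic files, and the common rough heuristic of ~4 characters per token:

```shell
#!/bin/sh
# Sketch of the grep+read fan-out cost. All files and numbers are
# fabricated; token estimate assumes ~4 characters per token.
cd "$(mktemp -d)"

# 12 files of ~1000 lines, each mentioning "pagination" once.
for i in $(seq 1 12); do
  { echo "def paginate(cursor):  # pagination entry point"
    yes "    pass  # filler line" | head -n 999
  } > "handler_$i.py"
done

# grep finds the matching files...
matches=$(grep -rl "pagination" . | wc -l)

# ...but the agent then reads each matched file in full.
chars=$(grep -rl "pagination" . | xargs cat | wc -c)
tokens=$((chars / 4))

echo "files read: $matches"
echo "approx input tokens: $tokens"
```

Twelve modest files already land in the tens of thousands of input tokens, before the agent has reasoned about any of them.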

Multiply by the number of search-shaped queries in a session (typically 5-15 per hour of dev work), and the input-token bill on Opus 4.7 is real money. This is the "Opus 4.7 burns through Plus" pain from the parallel r/ClaudeCode thread.

What an indexed code search MCP changes

Semble (and similar MCPs like sourcegraph-cody-bridge) build a local index — typically a vector DB or ripgrep-on-symbol-aware-index variant — that returns matching ranges instead of full files. A query for "pagination cursor logic" returns 3-5 ranges of 20-50 lines each. Total: a few hundred tokens instead of tens of thousands. The 98% number comes from this structural shift, not from compression tricks.
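You can approximate the shape of that shift with plain grep, no index needed: request matching ranges (`-C` context lines) instead of whole files. A real indexed MCP adds ranking and symbol awareness, but the token arithmetic is the same. The file below is synthetic:

```shell
#!/bin/sh
# Full-file read vs. matched range, on the same query.
cd "$(mktemp -d)"

# One big file where only a small region concerns cursors.
{ yes "# unrelated code" | head -n 900
  echo "def next_cursor(c):   # pagination cursor logic"
  echo "    return c + 1"
  yes "# more unrelated code" | head -n 900
} > api.py

full=$(wc -c < api.py)

# Range-shaped result: the matching line plus 20 lines of context.
range=$(grep -C 20 "pagination cursor" api.py | wc -c)
saved=$((100 - range * 100 / full))

echo "full file bytes: $full"
echo "range bytes:     $range"
echo "reduction: ~$saved%"
```

On this toy file the range is ~98% smaller than the full read — the same order of reduction the benchmark reports, which is why the number is structural rather than a compression trick.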

Where Scavio is the complement, not the substitute

Scavio is a web search MCP, not a code search MCP. The two are complementary in a Claude Code session:

  • Semble (or equivalent) for in-repo lookups: where is X defined, where is Y called.
  • Scavio for out-of-repo grounding: framework docs, GitHub issues for upstream libraries, recent Stack Overflow threads, RFC docs.

Two named MCPs, no overlap, clean tool surface. Routing accuracy stays high because each tool has a clear job.

Bash
# Two-MCP setup for Claude Code on large repos
claude mcp add semble <semble-url-or-path>

# Double quotes so the shell expands $SCAVIO_API_KEY;
# single quotes would send the literal string.
claude mcp add --transport http scavio https://mcp.scavio.dev/mcp \
  --header "x-api-key: $SCAVIO_API_KEY"

# System prompt rule:
# 'For in-repo code questions, call semble.search.
#  For framework docs, recent issues, Stack Overflow,
#  call scavio.search. Do not use grep+read for in-repo.'

The session-level token bill drops 30-50%

That's the empirical number across a few teams that adopted both. The Semble side cuts in-repo search input cost dramatically. The Scavio side prevents fallback to grep+read for out-of-repo questions (which the agent would otherwise approximate by reading random files looking for the answer). Together, the input-token bill compresses noticeably.
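The composition can be sanity-checked with back-of-envelope arithmetic. This sketch covers only the search-shaped queries, borrowing the per-query figures above; the 70/30 in-repo split is an assumption. It comes out higher than the 30-50% whole-session figure because generation and edit tokens don't shrink:

```shell
#!/bin/sh
# Back-of-envelope model of search-query token cost per hour.
# All inputs illustrative: ~40K tokens per grep+read query,
# ~1.5K per indexed query, assumed 70% of queries are in-repo.
queries_per_hour=10
in_repo_share=70

old=$((queries_per_hour * 40000))

# In-repo queries go through the index; out-of-repo queries
# would otherwise fall back to grep+read approximation.
new=$((queries_per_hour * in_repo_share / 100 * 1500 \
     + queries_per_hour * (100 - in_repo_share) / 100 * 40000))
saved=$((100 - new * 100 / old))

echo "old: $old tokens/hour, new: $new tokens/hour"
echo "search-token savings: $saved%"
```

Dilute that by the session's fixed generation cost and the 30-50% whole-session figure is plausible.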

Local-only matters for proprietary repos

Semble's pitch is local-only — the index lives on the dev's machine, no cloud upload of code. For OSS projects this is a non-issue; for proprietary repos it's what clears security review. Vendor-hosted code search means uploading the codebase to a third party, which most enterprise teams can't do without a long compliance process. A local index sidesteps that.

Honest tradeoffs

The break-even repo size is roughly 10K-20K LOC. Below that, default grep+read finishes in a few thousand tokens and the indexing overhead isn't worth it. Index maintenance is a small ongoing tax — re-index on branch switches, on large refactors. Worth it on big repos; overkill on small ones.
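The re-index-on-branch-switch tax can be automated with a standard git post-checkout hook. The indexer invocation below is a placeholder — the article doesn't name Semble's re-index command, so don't take the commented line as its real CLI:

```shell
#!/bin/sh
# Save as .git/hooks/post-checkout and chmod +x it.
# git passes: $1 = old HEAD, $2 = new HEAD, $3 = 1 for a branch
# checkout, 0 for a file checkout.
[ "$3" = "1" ] || exit 0

# Placeholder: substitute your indexer's actual re-index command, e.g.
#   semble reindex "$(git rev-parse --show-toplevel)"   # hypothetical
echo "branch switch detected: re-index the code search MCP here"
```

Large refactors still need a manual re-index, but hooking branch switches removes the most common source of stale results.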

What this looks like in practice

Task: "Find where we handle pagination in the API and update the cursor logic to support backward pagination." Old flow: grep for "pagination", read 12 files, identify the right one, edit. ~40K input tokens. New flow: semble.search("pagination cursor") returns 3 ranges, agent reads those, edits. ~1.5K input tokens. Same outcome, ~25× less input.

Don't buy the indexing tool, buy the discipline

Semble being open source is great. So are sourcegraph-cody-bridge variants and ripgrep+ast-grep wrappers. The specific implementation matters less than the discipline: stop letting the agent fan out with grep+read on large repos. Measure tokens before and after. The savings are not theoretical. For Claude Code Plus users feeling the Opus 4.7 bill, this is one of the highest-leverage fixes available.