An r/opencodeCLI build reported a 99.3% reduction in MCP schema load via a gateway. This tutorial walks the pattern: tool description compression, lazy schema loading, and surface-specific gateways.
Prerequisites
- MCP gateway running (see related tutorial)
Walkthrough
Step 1: Audit current token cost per session
Measure schema-load tokens at session start.
// In Claude Code, run /context and inspect tools section.Step 2: Identify duplicate or overlapping tools
If three MCPs each expose `search`, the model wastes tokens.
// Replace 4 single-surface search MCPs with 1 multi-surface MCP (Scavio).Step 3: Use Scavio MCP for the search surface
One MCP, six tools, ~80 tokens per tool description.
{ "scavio": { "url": "https://mcp.scavio.dev/mcp", "headers": { "x-api-key": "${SCAVIO_API_KEY}" } } }Step 4: Use a gateway for non-search MCPs
Postgres, GitHub, internal tools proxy through the gateway.
// Already covered in 'mcp-proxy-setup' tutorial.Step 5: Re-measure tokens
Schema load should drop dramatically.
// Expected: from 30K-50K to under 5K.Python Example
# Configuration-driven. No code beyond the JSON above.JavaScript Example
// See JSON above.Expected Output
Schema-load tokens drop 80-99% depending on starting fleet size. A 30-turn session that cost $0.50 on schema overhead drops under $0.05.