Glossary

MCP Tool Description Token Overhead

Definition

MCP tool description token overhead is the hidden token cost of including MCP server tool definitions in every LLM prompt, where each server adds 500-2000 tokens of system prompt, compounding with every server added to an agent's configuration.

In Depth

When an MCP client (Claude Desktop, Cursor, a custom agent) connects to MCP servers, it includes each server's tool descriptions in the system prompt sent to the LLM on every turn. A typical MCP server exposes 3-10 tools, each with a name, description, and parameter schema, adding 500-2000 tokens per server to every LLM call. With 5 MCP servers connected, you pay for 2,500-10,000 extra input tokens on every single message, even if the user's question has nothing to do with those tools. At Claude's input pricing ($3 per million tokens), 10K extra tokens per message across 1K messages/day costs $30/day in pure overhead.

The compounding effect is worse: more tools in context also degrade the LLM's tool selection accuracy, because the model must parse through more options before choosing one.

The solution is server consolidation: use fewer servers that each cover more surface area. Scavio's MCP server (mcp.scavio.dev/mcp) covers Google, Amazon, YouTube, Walmart, Reddit, and TikTok search in a single server, replacing what would otherwise be six separate search-related MCP servers. One server's tool descriptions instead of six means roughly a sixfold reduction in search-related token overhead, assuming comparable description sizes per server.
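The arithmetic above can be sketched in a few lines. This is an illustrative estimate only: the per-server token counts and the $3/million input-token price are the assumptions stated in the text, not measured values.

```python
# Illustrative sketch of MCP tool-description overhead cost.
# Assumption: $3 per million input tokens, per the pricing cited above.
PRICE_PER_MILLION_INPUT_TOKENS = 3.00

def daily_overhead_cost(tokens_per_message: int, messages_per_day: int) -> float:
    """Dollar cost per day of tool-description tokens alone."""
    daily_tokens = tokens_per_message * messages_per_day
    return daily_tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS

# Five servers at ~2,000 tokens of descriptions each, 1,000 messages/day:
cost = daily_overhead_cost(tokens_per_message=5 * 2_000, messages_per_day=1_000)
print(f"${cost:.2f}/day")  # $30.00/day in pure overhead
```

Note that this cost scales linearly with both server count and message volume, which is why the overhead is easy to overlook at low traffic and painful at scale.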

Real-World Example

A development team had 8 MCP servers connected in Claude Desktop: separate servers for Google search, Amazon lookup, YouTube search, Reddit search, a weather API, a database, a file system, and a calculator. Tool descriptions consumed 12K tokens per message. Consolidating the four search servers into Scavio's single MCP server dropped tool description overhead to 5K tokens per message, saving the team roughly $18/day in token costs.
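The team's savings can be checked with the same pricing assumption as above ($3 per million input tokens). A minimal sketch; the implied message volume is derived from the figures in the example, not stated in the source.

```python
# Reproducing the consolidation arithmetic from the example above.
before = 12_000  # tokens of tool descriptions per message, 8 servers
after = 5_000    # tokens per message after consolidating 4 search servers into 1
saved_per_message = before - after  # 7,000 tokens saved per message

# At $3 per million input tokens, $18/day saved implies this daily volume:
price_per_token = 3.00 / 1_000_000
messages_per_day = 18 / (saved_per_message * price_per_token)
print(round(messages_per_day))  # ~857 messages/day across the team
```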

Platforms

MCP Tool Description Token Overhead is relevant across the following platforms, all accessible through Scavio's unified API:

  • Google
  • Amazon
  • YouTube
  • Walmart
  • Reddit
  • TikTok

MCP Tool Description Token Overhead

Start using Scavio to work with MCP tool description token overhead across Google, Amazon, YouTube, Walmart, Reddit, and TikTok.