An r/openclaw post asked why the poster's assistant was dumb. The answer is rarely the LLM; it is almost always tool naming or attribution. This post walks through the debug pattern.
Prerequisites
- A failing agent
- Logging set up to capture tool calls
- Patience to read 5-10 actual traces
Walkthrough
Step 1: Capture every tool call with name + args + result
Most agent frameworks log this; turn it on.
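If your framework doesn't log tool calls for you, a thin wrapper does the job. This is a minimal, framework-agnostic sketch; the decorator and the `reddit_search` stub are illustrative, not a real agent-framework API.

```python
import functools
import json
import logging

logging.basicConfig(level=logging.DEBUG, format="%(asctime)s %(message)s")
log = logging.getLogger("tool_calls")

def log_tool_call(fn):
    """Wrap a tool so every call records its name, args, and result."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        log.debug(json.dumps({
            "tool": fn.__name__,
            "args": args,
            "kwargs": kwargs,
            "result_preview": str(result)[:200],  # truncate large payloads
        }, default=str))
        return result
    return wrapper

@log_tool_call
def reddit_search(query: str) -> list:
    # Stand-in tool body; a real implementation would call an API.
    return [{"title": "example post", "score": 42, "comment_count": 7}]

reddit_search("why is my agent dumb")
```

Read the resulting log lines, not a summary of them; the wrong-tool pattern is usually visible within the first handful of traces.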
# OpenClaw: set CLAW_LOG_LEVEL=DEBUG
# LangChain: callbacks=[StdOutCallbackHandler()]
# Inspect 5-10 actual traces before guessing the issue.
Step 2: Check tool name semantics
Tools named search_v1, fetch_url, or do_thing get misrouted more often than tools named web_search, reddit_search, or extract_markdown.
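You can lint for this mechanically. A rough sketch, assuming tools are plain JSON-function-calling-style dicts with `name` and `description` fields (the `lint_tool_name` helper and the generic-name list are made up for illustration):

```python
bad_tool = {
    "name": "search",                 # vague: search what?
    "description": "search the web",
}

good_tool = {
    "name": "reddit_search",          # names the domain it covers
    "description": ("Search Reddit threads. Returns posts with "
                    "score and comment_count fields."),
}

# Names that give the LLM nothing to route on.
GENERIC_NAMES = {"search", "fetch", "query", "run", "get", "do_thing"}

def lint_tool_name(tool: dict) -> list:
    """Flag tool schemas likely to cause misrouting."""
    problems = []
    if tool["name"] in GENERIC_NAMES:
        problems.append(f"'{tool['name']}' is generic; name the domain it covers")
    if len(tool["description"].split()) < 5:
        problems.append("description too short to disambiguate")
    return problems
```

Run it over every attached tool before blaming the model.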
# Bad: name='search', description='search the web'
# Good: name='reddit_search', description='Search Reddit threads. Returns posts with score and comment_count.'
Step 3: Check tool descriptions for ambiguity
If two tools have overlapping descriptions, the LLM frequently picks the wrong one.
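A cheap way to catch this is word-set (Jaccard) overlap between descriptions. This is a heuristic sketch, not part of any framework; the threshold of 0.5 is an arbitrary starting point.

```python
def description_overlap(a: str, b: str) -> float:
    """Jaccard overlap between two descriptions' word sets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

# The classic bug from this step: two near-identical search tools attached.
tools = {
    "search": "search the web",
    "web_search": "search the web for pages",
}

for n1 in tools:
    for n2 in tools:
        if n1 < n2 and description_overlap(tools[n1], tools[n2]) > 0.5:
            print(f"overlap: {n1} vs {n2} -- keep one, sharpen its description")
```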
# Common bug: 'search' and 'web_search' both attached. LLM guesses.
# Fix: keep only one, with a clear description.
Step 4: Reduce attached tool count
Agents with 12+ tools score lower than agents with 4-6 well-named tools.
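One way to enforce this is a per-task allowlist over the full tool registry, so each task only ever sees its 4-6 relevant tools. The registry contents and task names below are hypothetical:

```python
# Hypothetical registry: everything your MCP servers expose (trimmed here).
REGISTRY = {
    "web_search": "search tool",
    "reddit_search": "reddit tool",
    "extract_markdown": "extractor",
    "read_file": "reader",
    "write_file": "writer",
    "shell": "shell runner",
}

# Small, task-scoped allowlists instead of attaching everything.
TASK_TOOLSETS = {
    "research": ["web_search", "reddit_search", "extract_markdown"],
    "files":    ["read_file", "write_file"],
}

def tools_for_task(task: str, registry: dict) -> dict:
    """Attach only the task-relevant toolset, not the full registry."""
    return {name: registry[name]
            for name in TASK_TOOLSETS[task] if name in registry}
```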
# If the agent has access to 15 MCP servers, drop to 4-6 you actually need per task.
# Scavio MCP exposes 11 tools but they have unambiguous names; multi-MCP setups usually conflict.
Step 5: Add explicit tool-selection scaffolding
If routing is still wrong, prepend a 'choose your tool' step.
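A small helper can prepend that scaffold to whatever system prompt you already use. Sketch only; the function name is made up and the scaffold wording is the one suggested below:

```python
def with_tool_selection_scaffold(system_prompt: str, tool_names: list) -> str:
    """Prepend an explicit 'choose your tool first' instruction."""
    scaffold = (
        "For each user request, first state which tool you will use and why. "
        "Then call it. Available tools: " + ", ".join(tool_names) + "."
    )
    return scaffold + "\n\n" + system_prompt
```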
# System prompt addition:
# 'For each user request, first state which tool you will use and why. Then call it.'
# Improves routing accuracy on agents with 5+ tools.
Python Example
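A concrete, framework-agnostic sketch of the "read your traces" half of the protocol: tally which tool each trace actually called, so generic tools stealing calls from well-named ones show up as a count. The trace shape and names are illustrative.

```python
import json

def audit_traces(traces: list) -> dict:
    """Count tool usage per trace; a generic tool winning points at a naming bug."""
    counts = {}
    for t in traces:
        counts[t["tool"]] = counts.get(t["tool"], 0) + 1
    return counts

traces = [
    {"tool": "search", "args": {"q": "reddit openclaw"}},   # misrouted: generic tool
    {"tool": "reddit_search", "args": {"q": "openclaw"}},   # correct
    {"tool": "search", "args": {"q": "site:reddit.com"}},   # misrouted again
]
print(json.dumps(audit_traces(traces)))
```

If `search` dominates the tally on Reddit-flavored queries, rename or drop it and retest.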
# Debug protocol: log -> read 10 traces -> rename / dedupe tools -> retest. 90% of 'dumb agent' issues end here.
JavaScript Example
// Same debug protocol works for any agent framework, JS or Python.
Expected Output
An agent that picks the right tool on the first try. The LLM rarely needs upgrading; the tool surface usually does.