hermessearch-qualityllm

Hermes Web Search Quality: Root Causes and Fixes

Hermes 3 reformulates queries before searching, losing critical keywords. Fix by routing search through an API you control.

5 min read

Hermes 3 web search returns irrelevant results because the model reformulates your query before searching, often stripping critical keywords or adding hallucinated context. The fix is routing search through an API you control, where the exact query hits the search engine without LLM reformulation.

Root Cause: Query Reformulation

When Hermes 3 uses its built-in web search tool, it rewrites your query to what it thinks is a better search. "Best search API for agents 2026" might become "search API comparison" -- losing the year filter and the agent context. The reformulated query returns broader, less relevant results.

This is not a bug; it is how tool-use models work. The LLM decides what to search for. But when your pipeline depends on specific query terms, reformulation breaks result quality.

Fix 1: Bypass LLM Query Reformulation

Call the search API directly with the exact user query. Do not let the LLM choose what to search. Feed the raw results back into Hermes as context.

Python
import requests, os

H = {"x-api-key": os.environ["SCAVIO_API_KEY"]}

def search_then_reason(user_query):
    """Search with exact query, then reason with Hermes."""
    # Step 1: Exact query search -- no LLM reformulation
    r = requests.post("https://api.scavio.dev/api/v1/search",
        headers=H,
        json={"platform": "google", "query": user_query},
        timeout=10
    ).json()
    context = "\n".join([
        f"- {item['title']}: {item.get('snippet', '')}"
        for item in r.get("organic", [])[:5]
    ])

    # Step 2: Feed exact results to Hermes for reasoning
    prompt = f"""Search results for "{user_query}":
{context}

Based ONLY on these search results, answer the user's question.
Do not add information not present in the results."""

    # Send to Hermes 3 via Ollama
    hermes_r = requests.post("http://localhost:11434/api/generate",
        json={
            "model": "hermes3",
            "prompt": prompt,
            "stream": False,
        },
        timeout=30
    ).json()
    return hermes_r.get("response", "")

answer = search_then_reason("best search api for agents 2026")
print(answer)

Fix 2: Multi-Source Grounding

Hermes's built-in search uses a single source. Cross-reference with multiple platforms to improve result quality. If Google and Reddit agree on an answer, it is more likely correct than a single-source result.

Python
def multi_source_search(query):
    """Search Google and Reddit for cross-referencing."""
    sources = {}
    for platform in ["google", "reddit"]:
        r = requests.post("https://api.scavio.dev/api/v1/search",
            headers=H,
            json={"platform": platform, "query": query},
            timeout=10
        ).json()
        sources[platform] = [
            {"title": item["title"], "snippet": item.get("snippet", "")}
            for item in r.get("organic", [])[:3]
        ]
    return sources

def grounded_answer(query):
    """Answer with multi-source grounding."""
    sources = multi_source_search(query)
    context = "Google results:\n"
    for r in sources.get("google", []):
        context += f"- {r['title']}: {r['snippet']}\n"
    context += "\nReddit discussions:\n"
    for r in sources.get("reddit", []):
        context += f"- {r['title']}: {r['snippet']}\n"

    # Hermes reasons over multi-source data
    return context  # pass to Hermes

result = grounded_answer("hermes 3 web search quality issues")
print(result)

Fix 3: Query Template Enforcement

If you must let Hermes reformulate queries, constrain the reformulation with a template. "Search for: [original query] [current year]" prevents the model from stripping year filters.

Why Built-In Search Breaks

Hermes's web search tool is typically backed by DuckDuckGo or a similar free search engine. These have smaller indexes than Google and return weaker results on niche queries. The combination of query reformulation plus a weaker search engine compounds the quality problem.

Quality Metrics to Track

Compare answers from Hermes's built-in search vs API-grounded answers on the same queries. Track: answer relevance (does it address the actual question), factual accuracy (can claims be verified), and freshness (does it reference current data). In testing, API-grounded answers score 30-40% higher on relevance for technical queries.