
How to Add Search Grounding to Hermes Agent

Integrate real-time web search into Hermes Agent using the Scavio API. Ground responses with live data instead of relying on training cutoff knowledge.

Hermes is a popular fine-tuned model series for agentic use cases, but like all LLMs, it can hallucinate when asked about events after its training cutoff. Adding search grounding via the Scavio API gives Hermes access to real-time web data during inference. This tutorial wires a search tool into Hermes Agent's tool-use loop, so the agent can decide on its own when to search and then fold the results into its response. Each search costs $0.005.
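
To make the per-search price concrete, the arithmetic can be sketched as follows. The $0.005 figure is the one quoted above; the function name estimate_cost and the workload numbers are hypothetical, chosen only for illustration, and actual billing may differ.

```python
from decimal import Decimal

# Per-search price quoted in this tutorial; confirm current pricing before relying on it.
PRICE_PER_SEARCH = Decimal('0.005')

def estimate_cost(num_searches: int, price: Decimal = PRICE_PER_SEARCH) -> Decimal:
    """Return the estimated spend in USD for a given number of searches."""
    if num_searches < 0:
        raise ValueError('num_searches must be non-negative')
    return price * num_searches

# Hypothetical workload: 40 agent runs per day, 2 searches per run, over 30 days.
monthly_searches = 40 * 2 * 30
print(f'{monthly_searches} searches -> ${estimate_cost(monthly_searches)}')
```

Decimal is used instead of float so that small per-call prices add up without rounding drift.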

Prerequisites

  • Python 3.9+ installed
  • requests library installed, plus the ollama Python package if you follow Step 3 as written
  • A Scavio API key from scavio.dev
  • Hermes model running via Ollama, vLLM, or similar

Walkthrough

Step 1: Build the search grounding function

Create a search function that returns formatted results suitable for inclusion in the Hermes context window.

Python
import json
import os

import requests

# Read the API key from the environment so it never lands in source control.
SCAVIO_KEY = os.environ['SCAVIO_API_KEY']
H = {'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'}
URL = 'https://api.scavio.dev/api/v1/search'

def grounded_search(query: str, num: int = 5) -> str:
    """Search the web and format results for LLM context."""
    resp = requests.post(
        URL,
        headers=H,
        json={'query': query, 'country_code': 'us', 'num_results': num},
        timeout=30,  # fail fast instead of hanging the agent loop
    )
    resp.raise_for_status()
    results = resp.json().get('organic_results', [])
    if not results:
        return 'No search results found.'
    formatted = []
    for i, r in enumerate(results, 1):
        # Use .get() so a result with a missing field does not raise KeyError.
        formatted.append(f'[{i}] {r.get("title", "Untitled")}')
        if r.get('snippet'):
            formatted.append(f'    {r["snippet"]}')
        formatted.append(f'    Source: {r.get("link", "unknown")}')
    return '\n'.join(formatted)

results = grounded_search('Hermes 3 model capabilities 2026')
print(results[:400])
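
The formatting half of this step can be exercised without an API key or network access. The sketch below is one possible way to do that: it assumes result items carry the same title, snippet, and link fields used above, and it adds a character budget so the formatted text stays within a fixed share of the context window. The name format_results, the max_chars parameter, and the sample data are all hypothetical.

```python
from typing import Dict, List

def format_results(results: List[Dict[str, str]], max_chars: int = 1500) -> str:
    """Format results as numbered entries, stopping before max_chars is exceeded."""
    entries: List[str] = []
    used = 0
    for i, r in enumerate(results, 1):
        lines = [f'[{i}] {r.get("title", "Untitled")}']
        if r.get('snippet'):
            lines.append(f'    {r["snippet"]}')
        lines.append(f'    Source: {r.get("link", "unknown")}')
        entry = '\n'.join(lines)
        # Count the newline that joins this entry to the previous one.
        cost = len(entry) + (1 if entries else 0)
        if used + cost > max_chars:
            break  # drop whole entries rather than cutting one mid-sentence
        entries.append(entry)
        used += cost
    return '\n'.join(entries) if entries else 'No search results found.'

# Made-up sample data, shaped like the organic_results items used in Step 1.
sample = [
    {'title': 'First result', 'snippet': 'A short summary.', 'link': 'https://example.com/a'},
    {'title': 'Second result', 'link': 'https://example.com/b'},
]
print(format_results(sample))
```

Dropping whole entries keeps every citation the model sees complete, which matters more for grounding than squeezing in a partial snippet.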

Step 2: Define the tool schema for Hermes

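Hermes is trained on a ChatML-based tool-use format. Define the search tool as a JSON schema so Hermes knows when and how to call it. Step 3 passes this schema list straight to Ollama through the tools argument; the format_tools_for_hermes helper is only needed if you assemble the system prompt yourself.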

Python
HERMES_TOOLS = [
    {
        'type': 'function',
        'function': {
            'name': 'web_search',
            'description': 'Search the web for current information. Use this tool when you need up-to-date data, facts about recent events, or information beyond your training cutoff.',
            'parameters': {
                'type': 'object',
                'properties': {
                    'query': {
                        'type': 'string',
                        'description': 'The search query. Be specific and include the year 2026 for recent information.'
                    }
                },
                'required': ['query']
            }
        }
    }
]

def format_tools_for_hermes(tools: list) -> str:
    """Format tools for Hermes ChatML system prompt."""
    tool_descriptions = []
    for t in tools:
        fn = t['function']
        tool_descriptions.append(
            f'Tool: {fn["name"]}\n'
            f'Description: {fn["description"]}\n'
            f'Parameters: {json.dumps(fn["parameters"], indent=2)}'
        )
    return '\n\n'.join(tool_descriptions)

print(format_tools_for_hermes(HERMES_TOOLS))
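
If Hermes runs behind a server that returns raw completion text rather than parsed tool calls, the calls have to be extracted from that text. The sketch below assumes each call is wrapped in <tool_call>...</tool_call> tags containing a JSON object with name and arguments keys, which is the convention described in Nous Research's Hermes function-calling material; verify it against the model card for your model version. The helper name parse_tool_calls and the sample completion are hypothetical.

```python
import json
import re
from typing import Any, Dict, List

# Assumed Hermes convention: one JSON object per <tool_call>...</tool_call> block.
TOOL_CALL_RE = re.compile(r'<tool_call>\s*(.*?)\s*</tool_call>', re.DOTALL)

def parse_tool_calls(text: str) -> List[Dict[str, Any]]:
    """Extract well-formed tool calls from raw model output; skip malformed ones."""
    calls: List[Dict[str, Any]] = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # ignore blocks that are not valid JSON
        if isinstance(payload, dict) and isinstance(payload.get('name'), str):
            args = payload.get('arguments')
            calls.append({
                'name': payload['name'],
                'arguments': args if isinstance(args, dict) else {},
            })
    return calls

# Made-up completion text for illustration.
sample_output = (
    'Let me check.\n'
    '<tool_call>\n'
    '{"name": "web_search", "arguments": {"query": "Hermes 3 release notes"}}\n'
    '</tool_call>'
)
print(parse_tool_calls(sample_output))
```

Skipping malformed blocks instead of raising keeps the agent loop alive when the model emits a broken call; logging the raw text at that point is a reasonable addition.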

Step 3: Wire search into the Hermes inference loop

Integrate the search tool into the Hermes conversation loop. When Hermes outputs a tool call, execute the search and feed results back.

Python
import ollama

def run_hermes_with_search(prompt: str, model: str = 'hermes3') -> str:
    messages = [{'role': 'user', 'content': prompt}]
    response = ollama.chat(model=model, messages=messages, tools=HERMES_TOOLS)
    tool_calls = response.message.tool_calls or []
    # No tool call: Hermes answered from its own knowledge.
    if not tool_calls:
        return response.message.content
    # Record the assistant turn once, then add one tool message per call.
    messages.append(response.message)
    for tc in tool_calls:
        if tc.function.name != 'web_search':
            messages.append({'role': 'tool', 'content': f'Unknown tool: {tc.function.name}'})
            continue
        query = tc.function.arguments.get('query', prompt)
        print(f'Hermes searched: "{query}"')
        messages.append({'role': 'tool', 'content': grounded_search(query)})
    # Second pass: Hermes writes the final answer using the search results.
    final = ollama.chat(model=model, messages=messages)
    return final.message.content

result = run_hermes_with_search('What are the latest developments in AI regulation in 2026?')
print(f'\nHermes response:\n{result[:500]}')
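
The function above handles a single round of tool use. If the model may search more than once before answering, the loop can be generalized. The sketch below is one way to do that under simplifying assumptions: messages and replies are plain dicts rather than Ollama response objects, the chat call is injected as chat_fn so the loop can be tested offline, and run_tool_loop, fake_chat, and fake_search are hypothetical names.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]

def run_tool_loop(
    chat_fn: Callable[[List[Message]], Message],
    tools: Dict[str, Callable[..., str]],
    messages: List[Message],
    max_rounds: int = 3,
) -> str:
    """Call chat_fn until it answers without tool calls or max_rounds is reached."""
    for _ in range(max_rounds):
        reply = chat_fn(messages)
        calls = reply.get('tool_calls') or []
        if not calls:
            return reply.get('content', '')
        messages.append({'role': 'assistant', 'content': reply.get('content', ''),
                         'tool_calls': calls})
        for call in calls:
            handler = tools.get(call['name'])
            if handler is None:
                output = f'Unknown tool: {call["name"]}'
            else:
                try:
                    output = handler(**call.get('arguments', {}))
                except Exception as exc:  # tell the model about the failure instead of crashing
                    output = f'Tool {call["name"]} failed: {exc}'
            messages.append({'role': 'tool', 'name': call['name'], 'content': output})
    return 'Stopped: tool-call limit reached.'

# Offline stand-ins for the model and the search API.
def fake_search(query: str) -> str:
    return f'[1] Stub result for: {query}'

def fake_chat(messages: List[Message]) -> Message:
    if not any(m['role'] == 'tool' for m in messages):
        return {'content': '', 'tool_calls': [
            {'name': 'web_search', 'arguments': {'query': 'AI regulation 2026'}}]}
    return {'content': 'Answer based on: ' + messages[-1]['content']}

history = [{'role': 'user', 'content': 'What changed in AI regulation?'}]
answer = run_tool_loop(fake_chat, {'web_search': fake_search}, history)
print(answer)
```

The max_rounds cap bounds both latency and spend, since every extra round can trigger another paid search.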

Python Example

Python
import os

import requests

SCAVIO_KEY = os.environ['SCAVIO_API_KEY']
H = {'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json'}

def grounded_search(query, num=5):
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'query': query, 'country_code': 'us', 'num_results': num},
        timeout=30)
    resp.raise_for_status()  # surface auth or quota errors instead of parsing an error body
    results = resp.json().get('organic_results', [])
    return '\n'.join(f'[{i}] {r.get("title", "Untitled")}: {r.get("snippet", "")}'
                     for i, r in enumerate(results, 1))

def hermes_search(prompt):
    # Simulate Hermes deciding to search
    results = grounded_search(prompt)
    print('Search grounding for Hermes:')
    print(results[:400])
    print('\nCost: $0.005')

hermes_search('AI regulation developments 2026')
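
Because every search is billed, repeated identical queries are worth caching. The sketch below wraps any search function in a small in-memory cache with a time-to-live, so results do not go stale indefinitely. The names make_cached_search, ttl_seconds, and clock are hypothetical, and the stand-in search function exists only so the example runs offline.

```python
import time
from typing import Callable, Dict, Tuple

def make_cached_search(
    search_fn: Callable[[str], str],
    ttl_seconds: float = 300.0,
    clock: Callable[[], float] = time.monotonic,
) -> Callable[[str], str]:
    """Wrap search_fn so repeated queries within ttl_seconds reuse the stored result."""
    cache: Dict[str, Tuple[float, str]] = {}

    def cached(query: str) -> str:
        key = query.strip().lower()  # treat trivially different spellings as one query
        now = clock()
        hit = cache.get(key)
        if hit is not None and now - hit[0] < ttl_seconds:
            return hit[1]
        result = search_fn(query)
        cache[key] = (now, result)
        return result

    return cached

# Stand-in for grounded_search that counts how often it is actually called.
call_log = []

def fake_search(query: str) -> str:
    call_log.append(query)
    return f'results for {query}'

search = make_cached_search(fake_search, ttl_seconds=60)
search('AI regulation 2026')
search('ai regulation 2026 ')  # served from the cache, no second call
print(f'Underlying searches made: {len(call_log)}')
```

A short TTL suits grounding use cases, where the point of searching is freshness; for a long-running service, a bounded cache would avoid unbounded memory growth.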

JavaScript Example

JavaScript
const SCAVIO_KEY = process.env.SCAVIO_API_KEY;

async function groundedSearch(query) {
  const resp = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': SCAVIO_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, country_code: 'us', num_results: 5 })
  });
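  // fetch does not reject on HTTP errors, so check the status explicitly.
  if (!resp.ok) throw new Error(`Scavio request failed with status ${resp.status}`);
  const data = await resp.json();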
  return (data.organic_results || []).map((r, i) => `[${i + 1}] ${r.title}: ${r.snippet || ''}`).join('\n');
}

async function hermesSearch(prompt) {
  const results = await groundedSearch(prompt);
  console.log('Search grounding for Hermes:');
  console.log(results.slice(0, 400));
  console.log('Cost: $0.005');
}

hermesSearch('AI regulation 2026').catch(console.error);

Expected Output

Text
Hermes searched: "AI regulation developments 2026"

[1] EU AI Act Enforcement Begins: What Companies Need to Know
    The EU AI Act officially entered enforcement in March 2026...
    Source: https://reuters.com/technology/eu-ai-act-enforcement
[2] US Congress Passes Bipartisan AI Safety Bill
    The American AI Safety Act of 2026 introduces mandatory...
    Source: https://nytimes.com/2026/04/us-ai-safety-bill
[3] China Updates AI Governance Framework
    China's Cyberspace Administration released updated guidelines...
    Source: https://scmp.com/tech/china-ai-governance

Cost: $0.005


Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

You need Python 3.9+, the requests library, a Scavio API key from scavio.dev, and a Hermes model running via Ollama, vLLM, or similar. A Scavio API key gives you 250 free credits per month.

Yes. The free tier includes 250 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.
