Tutorial

How to Add Web Search to a Local LLM Agent

Give your local LLM agent real-time web search via a simple HTTP tool. Works with Ollama, llama.cpp, and any model that supports tool calling.

Local LLMs running on Ollama or llama.cpp lack internet access by default. When users ask about current events, pricing, or anything time-sensitive, the model either hallucinates or refuses to answer. Adding a web search tool solves this. The Scavio API acts as a lightweight HTTP search endpoint that returns structured JSON, which is small enough to fit in any local model's context window. This tutorial wires a search tool into an Ollama agent using Python, so the model can autonomously decide when to search and ground its answers in live data. No LangChain or heavyweight framework needed.

Prerequisites

  • Ollama installed with a tool-capable model (llama3.1, mistral, etc.)
  • Python 3.10+
  • requests library installed
  • A Scavio API key from scavio.dev

Walkthrough

Step 1: Define the search tool schema

Create a tool definition that tells the LLM what the search function does and what parameters it accepts. Ollama uses the OpenAI-compatible tool format.

Python
import os
import requests
import json

API_KEY = os.environ['SCAVIO_API_KEY']

SEARCH_TOOL = {
    'type': 'function',
    'function': {
        'name': 'web_search',
        'description': 'Search the web for current information. Use this when the user asks about recent events, prices, news, or anything you do not have up-to-date knowledge about.',
        'parameters': {
            'type': 'object',
            'properties': {
                'query': {'type': 'string', 'description': 'The search query'}
            },
            'required': ['query']
        }
    }
}

Step 2: Implement the search function

The function calls the Scavio API and returns a condensed string of results that fits comfortably in the model's context window.

Python
def web_search(query):
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us'})
    results = resp.json().get('organic_results', [])[:5]
    lines = []
    for r in results:
        lines.append(f"{r['title']}: {r.get('snippet', '')} ({r['link']})")
    return '\n'.join(lines) if lines else 'No results found.'

Step 3: Run the agent loop with Ollama

Send the user message to Ollama with the tool definition. If the model calls the tool, execute the search and feed the result back for the final answer.

Python
def chat_with_search(user_message, model='llama3.1'):
    messages = [{'role': 'user', 'content': user_message}]
    resp = requests.post('http://localhost:11434/api/chat', json={
        'model': model, 'messages': messages, 'tools': [SEARCH_TOOL], 'stream': False
    }).json()
    msg = resp['message']
    if msg.get('tool_calls'):
        for tc in msg['tool_calls']:
            if tc['function']['name'] == 'web_search':
                args = tc['function']['arguments']
                result = web_search(args['query'])
                messages.append(msg)
                messages.append({'role': 'tool', 'content': result})
        final = requests.post('http://localhost:11434/api/chat', json={
            'model': model, 'messages': messages, 'stream': False
        }).json()
        return final['message']['content']
    return msg['content']

Step 4: Test with a real-time question

Ask the agent something it cannot answer from training data alone. The model should call web_search and synthesize the results.

Python
answer = chat_with_search('What are the top trending Python libraries released in 2026?')
print(answer)

Python Example

Python
import os, requests, json

API_KEY = os.environ['SCAVIO_API_KEY']

SEARCH_TOOL = {
    'type': 'function',
    'function': {
        'name': 'web_search',
        'description': 'Search the web for current information.',
        'parameters': {
            'type': 'object',
            'properties': {'query': {'type': 'string', 'description': 'Search query'}},
            'required': ['query']
        }
    }
}

def web_search(query):
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': API_KEY, 'Content-Type': 'application/json'},
        json={'query': query, 'country_code': 'us'})
    results = resp.json().get('organic_results', [])[:5]
    return '\n'.join(f"{r['title']}: {r.get('snippet', '')}" for r in results) or 'No results.'

def chat(user_msg, model='llama3.1'):
    messages = [{'role': 'user', 'content': user_msg}]
    resp = requests.post('http://localhost:11434/api/chat', json={
        'model': model, 'messages': messages, 'tools': [SEARCH_TOOL], 'stream': False
    }).json()
    msg = resp['message']
    if msg.get('tool_calls'):
        for tc in msg['tool_calls']:
            result = web_search(tc['function']['arguments']['query'])
            messages.append(msg)
            messages.append({'role': 'tool', 'content': result})
        return requests.post('http://localhost:11434/api/chat', json={
            'model': model, 'messages': messages, 'stream': False
        }).json()['message']['content']
    return msg['content']

print(chat('What Python libraries were released in 2026?'))

JavaScript Example

JavaScript
const API_KEY = process.env.SCAVIO_API_KEY;

const SEARCH_TOOL = {
  type: 'function',
  function: {
    name: 'web_search',
    description: 'Search the web for current information.',
    parameters: {
      type: 'object',
      properties: { query: { type: 'string', description: 'Search query' } },
      required: ['query']
    }
  }
};

async function webSearch(query) {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST',
    headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, country_code: 'us' })
  }).then(r => r.json());
  return (r.organic_results || []).slice(0, 5)
    .map(r => `${r.title}: ${r.snippet || ''}`).join('\n') || 'No results.';
}

async function chat(userMsg, model = 'llama3.1') {
  const messages = [{ role: 'user', content: userMsg }];
  let resp = await fetch('http://localhost:11434/api/chat', {
    method: 'POST',
    body: JSON.stringify({ model, messages, tools: [SEARCH_TOOL], stream: false })
  }).then(r => r.json());
  if (resp.message.tool_calls) {
    for (const tc of resp.message.tool_calls) {
      const result = await webSearch(tc.function.arguments.query);
      messages.push(resp.message, { role: 'tool', content: result });
    }
    resp = await fetch('http://localhost:11434/api/chat', {
      method: 'POST',
      body: JSON.stringify({ model, messages, stream: false })
    }).then(r => r.json());
  }
  return resp.message.content;
}

chat('Latest AI news in 2026?').then(console.log).catch(console.error);

Expected Output

JSON
User: What Python libraries were released in 2026?

[Agent calls web_search("new python libraries released 2026")]
[Search returns 5 results with titles and snippets]

Agent: Based on web search results, several notable Python libraries
launched in 2026 including... The agent synthesizes the search
results into a grounded, accurate answer.

Related Tutorials

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

Ollama installed with a tool-capable model (llama3.1, mistral, etc.). Python 3.10+. requests library installed. A Scavio API key from scavio.dev. A Scavio API key gives you 250 free credits per month.

Yes. The free tier includes 250 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.

Start Building

Give your local LLM agent real-time web search via a simple HTTP tool. Works with Ollama, llama.cpp, and any model that supports tool calling.