An r/LocalLLM user wanted a local LLM for privacy but could not get good search results; even Tavily came up empty. This tutorial sets up Scavio MCP with Ollama for reliable search grounding.
Prerequisites
- Ollama installed
- Scavio API key
- Agent runtime (opencode, Claude Code, or custom)
Walkthrough
Step 1: Install and run Ollama with a capable model
Pull a model that supports tool calling.
ollama pull qwen3.6:27b
# Or: ollama pull llama3.3:latest
# Model must support function/tool calling for agentic search

Step 2: Configure Scavio MCP
Add Scavio as an MCP server.
# For opencode/Claude Code:
claude mcp add scavio https://mcp.scavio.dev/mcp --header 'x-api-key: YOUR_SCAVIO_KEY'
# For custom setups, configure in mcp.json:
# {
#   "mcpServers": {
#     "scavio": {
#       "url": "https://mcp.scavio.dev/mcp",
#       "headers": { "x-api-key": "KEY" }
#     }
#   }
# }

Step 3: Set up the agent loop
The model decides when to search versus when to answer from its own knowledge.
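Ollama's function-calling format describes tools as JSON-schema dicts. A sketch of the definition for the search tool used below (the description wording here is my own; recent versions of the Python client can also derive this schema automatically from a typed, docstringed function):

```python
# JSON-schema style tool definition in Ollama's function-calling format.
# The model sees this and can respond with a structured call to "search".
search_tool = {
    'type': 'function',
    'function': {
        'name': 'search',
        'description': 'Search the web for current, factual information.',
        'parameters': {
            'type': 'object',
            'properties': {
                'query': {'type': 'string', 'description': 'The search query'},
            },
            'required': ['query'],
        },
    },
}
```

Pass it as `tools=[search_tool]` to `ollama.chat`; when the model emits a tool call, execute the real search and feed the result back in a `role: 'tool'` message.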
import ollama, requests, os

H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def search(query: str) -> dict:
    """Search the web for current, factual information."""
    return requests.post('https://api.scavio.dev/api/v1/search', headers=H,
                         json={'platform': 'google', 'query': query}).json()

def agent(question):
    messages = [
        {'role': 'system', 'content': 'Search for factual questions. Answer reasoning questions directly.'},
        {'role': 'user', 'content': question}]
    # Recent ollama-python versions build the tool schema from the typed function
    response = ollama.chat(model='qwen3.6:27b', messages=messages, tools=[search])
    # Handle tool calls if the model requests a search, then ask for the final answer
    if response.message.tool_calls:
        messages.append(response.message)
        for call in response.message.tool_calls:
            messages.append({'role': 'tool', 'content': str(search(**call.function.arguments))})
        response = ollama.chat(model='qwen3.6:27b', messages=messages)
    return response

Step 4: Test with factual vs reasoning queries
Verify search grounding works correctly.
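When grounding kicks in for a factual query, the conversation accumulates roughly these message shapes. A canned, simplified transcript (the search payload is made up):

```python
# Canned transcript of one search round-trip (illustrative data only).
conversation = [
    {'role': 'system', 'content': 'Search for factual questions. Answer reasoning questions directly.'},
    {'role': 'user', 'content': 'What is the current price of Helium 10?'},
    # Assistant turn where the model asked for a search (simplified shape)
    {'role': 'assistant', 'content': '', 'tool_calls': [
        {'function': {'name': 'search', 'arguments': {'query': 'Helium 10 pricing'}}}]},
    # Tool result fed back so the final answer can be grounded (made-up payload)
    {'role': 'tool', 'content': "{'results': [{'title': 'Helium 10 Pricing', 'snippet': '...'}]}"},
]
# A reasoning query would skip the assistant tool_call and tool turns entirely.
roles = [m['role'] for m in conversation]
```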
# Factual (should search): 'What is the current price of Helium 10?'
# Reasoning (should not search): 'Explain gradient descent in simple terms'
# Mixed: 'Compare current LLM pricing across providers'

Python Example
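The privacy claim is checkable: the only external request in the pipeline is the search call. A stdlib-only sketch that builds (but never sends) that request, using the URL and header from the earlier steps:

```python
import json
import urllib.request

def build_search_request(query, api_key):
    # Construct the one external call in the pipeline: the Scavio search POST.
    # The LLM itself runs locally and never leaves the machine.
    body = json.dumps({'platform': 'google', 'query': query}).encode()
    return urllib.request.Request(
        'https://api.scavio.dev/api/v1/search',
        data=body,
        headers={'x-api-key': api_key, 'Content-Type': 'application/json'},
        method='POST')

req = build_search_request('current price of Helium 10', 'YOUR_SCAVIO_KEY')
```

Nothing here touches an OpenAI or Anthropic endpoint; in the real loop, the `search()` helper from Step 3 makes this same call with requests.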
# Privacy model: your query goes to Scavio (zero data retention)
# but NOT to OpenAI/Anthropic. The LLM runs locally.
# The search API is the only external call.

JavaScript Example
// Same pattern with the Ollama JS client.

Expected Output
A local Ollama model grounded with live search: factual queries get accurate, current answers; reasoning queries use local knowledge; privacy is preserved.