Overview
Local LLMs hallucinate on current events. This workflow injects live SERP results into the system prompt before each query, grounding the model's response in real data. It works with Ollama, vLLM, llama.cpp, or any OpenAI-compatible local endpoint.
Trigger
Per user query to local LLM
Schedule
Per user query
Workflow Steps
Receive user query
The user submits a factual question through the local LLM chat interface.
Pre-flight search via Scavio
POST /api/v1/search with the user's query and platform=google. Take the top 5 results with their snippets.
Inject search context into system prompt
Prepend to system message: 'Use the following search results to answer. Cite sources. Results: [...]'
Forward augmented prompt to local LLM
Send to Ollama /api/chat or vLLM /v1/chat/completions with the enriched messages array (an OpenAI-compatible sketch follows these steps).
Return grounded response to user
The local LLM now answers with real data instead of hallucinating.
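Step 4 can also target vLLM's OpenAI-compatible /v1/chat/completions route (llama.cpp's server speaks the same protocol). The sketch below shows that variant; the base URL, port, and model name are assumptions for a typical local deployment and should be adjusted to match yours.

import os, requests

SCAVIO_API_KEY = os.environ["SCAVIO_API_KEY"]
LLM_BASE_URL = "http://localhost:8000/v1"  # assumed vLLM default; llama.cpp's server often listens on 8080
MODEL_NAME = "Qwen/Qwen2.5-7B-Instruct"    # placeholder model id; use whatever the server has loaded

user_query = "What is the current price of NVIDIA stock?"

# Pre-flight SERP lookup via Scavio, then build the grounding context
search_resp = requests.post("https://api.scavio.dev/api/v1/search",
    headers={"x-api-key": SCAVIO_API_KEY},
    json={"query": user_query, "platform": "google", "limit": 5})
context = "\n".join(f"- {r['title']}: {r['snippet']}"
                    for r in search_resp.json().get("results", []))

# OpenAI-compatible chat completion with the enriched messages array
llm_resp = requests.post(f"{LLM_BASE_URL}/chat/completions",
    json={"model": MODEL_NAME, "messages": [
        {"role": "system", "content": f"Answer using these search results. Cite sources.\n{context}"},
        {"role": "user", "content": user_query}
    ]})

# OpenAI-style responses nest the answer under choices[0].message.content
print(llm_resp.json()["choices"][0]["message"]["content"])

The only differences from the Ollama path are the endpoint path and the response shape: choices[0].message.content instead of message.content.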
Python Implementation
import requests, os

scavio_key = os.environ["SCAVIO_API_KEY"]
user_query = "What is the current price of NVIDIA stock?"

# Pre-flight SERP lookup via Scavio (top-5 Google results with snippets)
search_resp = requests.post("https://api.scavio.dev/api/v1/search",
    headers={"x-api-key": scavio_key},
    json={"query": user_query, "platform": "google", "limit": 5})
context = "\n".join(f"- {r['title']}: {r['snippet']}"
                    for r in search_resp.json().get("results", []))

# Forward the augmented prompt to Ollama. stream=False makes /api/chat return a
# single JSON object instead of NDJSON chunks, so .json() can parse the reply.
ollama_resp = requests.post("http://localhost:11434/api/chat",
    json={"model": "qwen2.5:7b", "stream": False, "messages": [
        {"role": "system", "content": f"Answer using these search results. Cite sources.\n{context}"},
        {"role": "user", "content": user_query}
    ]})
print(ollama_resp.json()["message"]["content"])

JavaScript Implementation
const query = "What is the current price of NVIDIA stock?";

// Pre-flight SERP lookup via Scavio (top-5 Google results with snippets)
const searchResp = await fetch("https://api.scavio.dev/api/v1/search", {
  method: "POST",
  headers: { "x-api-key": process.env.SCAVIO_API_KEY, "Content-Type": "application/json" },
  body: JSON.stringify({ query, platform: "google", limit: 5 })
});
const results = (await searchResp.json()).results ?? [];
const context = results.map(r => `- ${r.title}: ${r.snippet}`).join("\n");

// Forward the augmented prompt to Ollama. stream: false makes /api/chat return a
// single JSON object instead of NDJSON chunks, so .json() can parse the reply.
const llmResp = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ model: "qwen2.5:7b", stream: false, messages: [
    { role: "system", content: `Answer using these search results. Cite sources.\n${context}` },
    { role: "user", content: query }
  ]})
});
console.log((await llmResp.json()).message.content);

Platforms Used
Google Search: web search with knowledge graph, People Also Ask (PAA), and AI Overviews.