Local LLM Grounding Pipeline Workflow

Ground a local LLM (Ollama, vLLM) with live search results: query Scavio, inject SERP context into the prompt, get factual answers without fine-tuning.

Overview

Local LLMs hallucinate on current events. This workflow injects live SERP results into the system prompt before each query, grounding the model's response in real data. It works with Ollama, vLLM, llama.cpp, or any OpenAI-compatible local endpoint.

Trigger

Per user query to local LLM

Schedule

Per user query

Workflow Steps

1

Receive user query

User asks a factual question to the local LLM chat interface.

2

Pre-flight search via Scavio

POST to /api/v1/search with the user's query and platform=google. Take the top five results with their snippets.

3

Inject search context into system prompt

Prepend to system message: 'Use the following search results to answer. Cite sources. Results: [...]'

4

Forward augmented prompt to local LLM

Send to Ollama /api/chat or vLLM /v1/chat/completions with the enriched messages array.

5

Return grounded response to user

The local LLM now answers with real data instead of hallucinating.

Python Implementation

Python
import os
import requests

scavio_key = os.environ["SCAVIO_API_KEY"]
user_query = "What is the current price of NVIDIA stock?"

# Step 2: pre-flight search via Scavio
search_resp = requests.post(
    "https://api.scavio.dev/api/v1/search",
    headers={"x-api-key": scavio_key},
    json={"query": user_query, "platform": "google", "limit": 5},
)
search_resp.raise_for_status()
context = "\n".join(
    f"- {r['title']}: {r['snippet']}"
    for r in search_resp.json().get("results", [])
)

# Steps 3-4: inject the SERP context and forward to the local LLM.
# Ollama's /api/chat streams NDJSON by default, so request a single
# JSON object with "stream": False before calling .json().
ollama_resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "qwen2.5:7b",
        "stream": False,
        "messages": [
            {"role": "system",
             "content": f"Answer using these search results. Cite sources.\n{context}"},
            {"role": "user", "content": user_query},
        ],
    },
)
ollama_resp.raise_for_status()
print(ollama_resp.json()["message"]["content"])
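The guide mentions vLLM as an alternative backend, but the implementations target Ollama only. Below is a hedged Python sketch against vLLM's OpenAI-compatible /v1/chat/completions endpoint; the base URL (localhost:8000) and model name (Qwen/Qwen2.5-7B-Instruct) are assumptions, so substitute whatever your vLLM server actually serves. The messages array is identical in shape, only the endpoint and response structure change.

```python
import requests

# Assumed local vLLM server; adjust to your deployment.
VLLM_BASE = "http://localhost:8000/v1"


def build_grounded_messages(context: str, user_query: str) -> list:
    """Assemble the enriched messages array (same shape for Ollama and vLLM)."""
    return [
        {"role": "system",
         "content": f"Answer using these search results. Cite sources.\n{context}"},
        {"role": "user", "content": user_query},
    ]


def ask_vllm(context: str, user_query: str,
             model: str = "Qwen/Qwen2.5-7B-Instruct") -> str:
    # vLLM speaks the OpenAI chat-completions protocol, so the answer
    # lives under choices[0].message.content rather than Ollama's
    # top-level "message" key.
    resp = requests.post(
        f"{VLLM_BASE}/chat/completions",
        json={"model": model,
              "messages": build_grounded_messages(context, user_query)},
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```

Because the protocol is OpenAI-compatible, the same code also points at any other OpenAI-style local endpoint by changing VLLM_BASE.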

JavaScript Implementation

JavaScript
const query = "What is the current price of NVIDIA stock?";
const searchResp = await fetch("https://api.scavio.dev/api/v1/search", {
  method: "POST",
  headers: { "x-api-key": process.env.SCAVIO_API_KEY, "Content-Type": "application/json" },
  body: JSON.stringify({ query, platform: "google", limit: 5 })
});
const context = ((await searchResp.json()).results ?? []).map(r => `- ${r.title}: ${r.snippet}`).join("\n");

const llmResp = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "qwen2.5:7b",
    stream: false, // Ollama streams NDJSON by default; request one JSON object
    messages: [
      { role: "system", content: `Answer using these search results. Cite sources.\n${context}` },
      { role: "user", content: query }
    ]
  })
});
console.log((await llmResp.json()).message.content);
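Both implementations abort the whole chat turn if the search call fails. A minimal fallback sketch in Python, assuming you would rather serve an ungrounded answer than an error when Scavio is unreachable: return an empty context on any search failure and let the LLM answer unaided.

```python
import requests


def get_search_context(api_key: str, query: str, timeout: float = 10.0) -> str:
    """Fetch SERP context from Scavio; return '' on any failure so the
    chat still works (ungrounded) instead of raising."""
    try:
        resp = requests.post(
            "https://api.scavio.dev/api/v1/search",
            headers={"x-api-key": api_key},
            json={"query": query, "platform": "google", "limit": 5},
            timeout=timeout,
        )
        resp.raise_for_status()
        results = resp.json().get("results", [])
        return "\n".join(f"- {r['title']}: {r['snippet']}" for r in results)
    except (requests.RequestException, KeyError, ValueError):
        # Network error, bad status, malformed JSON, or missing fields:
        # degrade gracefully to an empty context.
        return ""
```

If the returned context is empty, you can drop the "Cite sources" instruction from the system prompt so the model does not invent citations.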

Platforms Used

Google

Web search with knowledge graph, PAA, and AI overviews

Frequently Asked Questions

How does this workflow ground a local LLM?

Local LLMs hallucinate on current events. This workflow injects live SERP results into the system prompt before each query, grounding the model's response in real data. It works with Ollama, vLLM, llama.cpp, or any OpenAI-compatible local endpoint.

When does this workflow run?

It runs per user query to the local LLM: every question triggers a fresh pre-flight search.

Which Scavio platforms does it use?

This workflow uses the Google platform. Each platform is called via the same unified API endpoint.

Can I try it for free?

Yes. Scavio's free tier includes 500 credits per month with no credit card required. That is enough to test and validate this workflow before scaling it.