Hybrid RAG Search Workflow with Live API

Overview

Routes each query to a vector database, a live search API, or both, based on query classification. Results from the sources used are merged and passed to the LLM as combined context for grounded, comprehensive responses.

Trigger

On every user query to the RAG agent

Schedule

On every query (real-time)

Workflow Steps

Classify the query

Determine whether the query needs internal docs only, external search only, or both. Keywords such as 'current', 'price', and 'latest' signal that external search is needed.
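A minimal sketch of keyword-based classification, assuming whole-word matching is preferred over plain substring checks (a substring check would flag "concurrent" because it contains "current"). The name `classify_query` and the two return labels are hypothetical. An "external only" route would need an additional signal that this page does not specify, so it is left out here.

```python
import re

# Signal words taken from this workflow's keyword list (assumed, tune as needed).
EXTERNAL_SIGNALS = ["current", "latest", "price", "today", "competitor"]

# \b matches whole words only; the optional "s" also catches simple plurals.
_EXTERNAL_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, EXTERNAL_SIGNALS)) + r")s?\b",
    re.IGNORECASE,
)

def classify_query(query: str) -> str:
    """Return "both" when a freshness signal is present, else "internal"."""
    return "both" if _EXTERNAL_RE.search(query) else "internal"
```

Keyword rules are cheap but coarse. A small classifier model could replace `classify_query` without changing the rest of the pipeline.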

Retrieve from vector DB

Run similarity search against the internal document store. Return top-k most relevant chunks.
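To illustrate what a top-k similarity search does under the hood, here is a small in-memory sketch using cosine similarity. It assumes embeddings are already computed and stored as plain lists of floats. `cosine` and `top_k` are illustrative names, not part of any vector database API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunks, k=5):
    """chunks is a list of (text, embedding) pairs.

    Returns the k highest-scoring (score, text) pairs, best first.
    """
    scored = [(cosine(query_vec, emb), text) for text, emb in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
```

A production vector store typically replaces this linear scan with an approximate nearest-neighbor index, but the ranking idea is the same.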

Query live search if needed

If classified as needing external data, query Scavio's search endpoint for current public information.

Merge and re-rank results

Combine internal and external results. Score by relevance to the original query and source authority.
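One way to sketch the merge and re-rank step, under stated assumptions: relevance is approximated by the share of query tokens that appear in each item's text, and source authority is a fixed weight per source. Both the `AUTHORITY` weights and the token-overlap measure are placeholders, not the scoring this workflow actually ships with. An embedding-based or cross-encoder scorer could be swapped in.

```python
import re

# Assumed authority weights per source; adjust for your own data.
AUTHORITY = {"internal": 1.0, "web": 0.8}

def _tokens(text):
    """Lowercased alphanumeric tokens; tolerates None."""
    return set(re.findall(r"[a-z0-9]+", (text or "").lower()))

def rerank(query, context, top_n=5):
    """Score each context item and return the top_n, best first."""
    q = _tokens(query)
    scored = []
    for item in context:
        overlap = len(q & _tokens(item.get("text")))
        relevance = overlap / len(q) if q else 0.0
        score = relevance * AUTHORITY.get(item["source"], 0.5)
        scored.append({**item, "score": round(score, 3)})
    return sorted(scored, key=lambda i: i["score"], reverse=True)[:top_n]
```

The function expects items shaped like `{"source": ..., "text": ...}`, with any extra keys such as `url` passed through unchanged.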

Feed to LLM

Pass the merged context to the LLM with clear source attribution (internal doc vs live search result).
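Source attribution can be made explicit by numbering each context item and labeling where it came from. The sketch below assumes context items are dicts with `source`, `text`, and an optional `url`. The prompt wording and the `[n]` citation style are illustrative choices.

```python
def build_prompt(query, context):
    """Render context items as numbered, labeled sources followed by the question."""
    lines = []
    for i, item in enumerate(context, start=1):
        if item["source"] == "internal":
            label = "internal doc"
        else:
            label = f"live search: {item.get('url') or 'unknown URL'}"
        lines.append(f"[{i}] ({label}) {item.get('text') or ''}")
    sources = "\n".join(lines)
    return (
        "Answer using only the numbered sources below and cite them as [n].\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )
```

The returned string can be sent as the user message to whichever LLM client you already use.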

Generate grounded response

The LLM produces a response anchored in both internal knowledge and current public data, with source citations.
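If the model is asked to cite sources as bracketed numbers such as `[1]` (an assumed convention, not something this workflow mandates), a quick post-check can catch citations that point at sources that were never supplied. `check_citations` is a hypothetical helper for that check.

```python
import re

def check_citations(response, context):
    """Split cited source numbers into those that exist in context and those that do not."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", response)}
    valid = set(range(1, len(context) + 1))
    return {"cited": sorted(cited & valid), "invalid": sorted(cited - valid)}
```

A non-empty `invalid` list is a reasonable trigger for regenerating the response or flagging it for review.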

Python Implementation

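import os

import requests

# Auth header; expects SCAVIO_API_KEY in the environment.
H = {"x-api-key": os.environ["SCAVIO_API_KEY"]}

# Substrings that suggest the query needs fresh, public data.
EXTERNAL_SIGNALS = ["current", "latest", "price", "2026", "today", "competitor"]

def hybrid_retrieve(query, vector_db, k=5):
    # Step 1: classify the query with a simple keyword check.
    needs_external = any(s in query.lower() for s in EXTERNAL_SIGNALS)
    # Step 2: always retrieve the top-k internal chunks.
    internal = vector_db.similarity_search(query, k=k)
    context = [{"source": "internal", "text": doc.page_content} for doc in internal]
    if needs_external:
        # Step 3: query live search and keep the top three organic results.
        resp = requests.post("https://api.scavio.dev/api/v1/search", headers=H,
            json={"platform": "google", "query": query}, timeout=10)
        resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
        for result in resp.json().get("organic", [])[:3]:
            context.append({
                "source": "web", "title": result.get("title"),
                "text": result.get("snippet"), "url": result.get("link")
            })
    return context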

JavaScript Implementation

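// Auth and content-type headers; expects SCAVIO_API_KEY in the environment.
const H = {"x-api-key": process.env.SCAVIO_API_KEY, "Content-Type": "application/json"};

async function hybridRetrieve(query, vectorDb, k = 5) {
  // Step 1: classify the query with a simple keyword check.
  const needsExternal = /current|latest|price|2026|today|competitor/i.test(query);
  // Step 2: always retrieve the top-k internal chunks.
  const internal = await vectorDb.similaritySearch(query, k);
  const context = internal.map(d => ({source: "internal", text: d.pageContent}));
  if (needsExternal) {
    // Step 3: query live search and keep the top three organic results.
    const res = await fetch("https://api.scavio.dev/api/v1/search", {
      method: "POST", headers: H,
      body: JSON.stringify({platform: "google", query})
    });
    // Surface HTTP errors instead of parsing an error body.
    if (!res.ok) throw new Error(`Search request failed: ${res.status}`);
    const data = await res.json();
    for (const result of (data.organic || []).slice(0, 3)) {
      context.push({source: "web", title: result.title,
        text: result.snippet, url: result.link});
    }
  }
  return context;
}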

Platforms Used

Google

Web search with knowledge graph, People Also Ask (PAA), and AI overviews

Reddit

Community posts and threaded comments from any subreddit

YouTube

Video search with transcripts and metadata

Amazon

Product search with prices, ratings, and reviews

Frequently Asked Questions

How is this workflow triggered?

It runs on every user query to the RAG agent, in real time.

Which Scavio platforms does this workflow use?

This workflow uses the following Scavio platforms: Google, Reddit, YouTube, and Amazon. Each platform is called via the same unified API endpoint.

Can I run this workflow on the free tier?

Yes. Scavio's free tier includes 50 credits on signup with no credit card required. That is enough to test and validate this workflow before scaling it.
