
How to Add Web Search to Your LangChain Agent

Step-by-step tutorial for adding real-time web search to LangChain agents using langchain-scavio. Covers installation, configuration, async usage, and LangGraph ToolNode integration.

12 min read

Large language models are trained on static datasets. Ask GPT-4o about a product launched last week and it will either hallucinate an answer or admit it does not know. For any agent that needs to act on current information -- stock prices, breaking news, competitor analysis, live inventory -- you need a search tool that returns structured, real-time data.

This guide walks through adding web search to a LangChain agent using langchain-scavio. By the end, you will have a working agent that searches the web, parses structured SERP data, and uses knowledge graphs and People Also Ask to give grounded answers.

What You Will Build

A LangChain agent that can answer questions like "What are the top Python web frameworks in 2026?" by searching the web in real time, extracting structured data from Google's SERP, and synthesizing a grounded answer with source citations.

Prerequisites

  • A recent Python (3.9 or later) and pip
  • An OpenAI API key (the agent examples use gpt-4o)
  • A Scavio API key for the search tool

Step 1: Install Dependencies

Bash
pip install langchain-scavio langchain-openai langgraph

langchain-scavio v2.0 ships a single tool class -- ScavioSearch -- that implements LangChain's BaseTool interface. It works with both the legacy AgentExecutor and the newer LangGraph agent patterns.

Step 2: Configure Your API Key

Set the SCAVIO_API_KEY environment variable. The tool reads it automatically:

Bash
export SCAVIO_API_KEY="sk_live_your_key_here"
export OPENAI_API_KEY="sk-your_openai_key"

You can also pass the key directly to the constructor, which is useful for testing or when managing multiple keys:

Python
from langchain_scavio import ScavioSearch

tool = ScavioSearch(scavio_api_key="sk_live_your_key_here")
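The fallback order -- explicit argument first, then the environment variable -- can be sketched with a small stdlib-only helper. resolve_api_key is illustrative, not part of langchain-scavio:

```python
import os

def resolve_api_key(explicit_key=None, env_var="SCAVIO_API_KEY"):
    """Prefer an explicitly passed key; fall back to the environment."""
    key = explicit_key or os.environ.get(env_var)
    if not key:
        raise ValueError(f"No API key found: pass one or set {env_var}")
    return key

os.environ["SCAVIO_API_KEY"] = "sk_live_from_env"
print(resolve_api_key())                    # -> sk_live_from_env
print(resolve_api_key("sk_live_explicit"))  # -> sk_live_explicit
```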

Step 3: Test the Tool Standalone

Before wiring it into an agent, verify the tool returns data:

Python
from langchain_scavio import ScavioSearch

tool = ScavioSearch(max_results=3)
result = tool.invoke({"query": "best python web frameworks 2026"})
print(result)

Here is what the response looks like. Unlike basic search tools that return a plain text summary, ScavioSearch returns the full SERP structure:

JSON
{
  "organic_results": [
    {
      "title": "Top Python Web Frameworks to Learn in 2026",
      "url": "https://example.com/python-frameworks-2026",
      "description": "Django, FastAPI, and Flask remain the top choices...",
      "position": 1
    },
    {
      "title": "FastAPI vs Django in 2026: Which Should You Choose?",
      "url": "https://example.com/fastapi-vs-django",
      "description": "FastAPI has overtaken Flask as the second most popular...",
      "position": 2
    }
  ],
  "knowledge_graph": {
    "title": "Python",
    "type": "Programming language",
    "description": "Python is a high-level, general-purpose programming language..."
  },
  "people_also_ask": [
    {
      "question": "What is the fastest Python web framework?",
      "answer": "FastAPI and Starlette are the fastest Python web frameworks..."
    },
    {
      "question": "Is Django still relevant in 2026?",
      "answer": "Yes, Django remains the most popular full-stack framework..."
    }
  ],
  "related_searches": [
    "python async web framework",
    "fastapi production deployment",
    "django vs fastapi performance"
  ]
}

This structured output is what makes ScavioSearch different from tools that return a flat text summary. Your agent gets organic results, knowledge graph entities, related questions, and follow-up query suggestions -- all as structured data it can reason about. For more on why this matters, see our guide on structured SERP data in AI agents.
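To see how an agent might consume this, here is a sketch in plain Python (not library code) that flattens such a response into a compact, citable context string for the LLM prompt. The field names follow the sample response above:

```python
def serp_to_context(serp, max_organic=3):
    """Flatten a structured SERP dict into numbered, citable lines."""
    lines = []
    # Organic results become numbered citations the LLM can reference.
    for r in serp.get("organic_results", [])[:max_organic]:
        lines.append(f"[{r['position']}] {r['title']} ({r['url']}): {r['description']}")
    # Knowledge graph gives the agent a grounded entity definition.
    kg = serp.get("knowledge_graph")
    if kg:
        lines.append(f"Entity: {kg['title']} ({kg['type']}): {kg['description']}")
    # People Also Ask pairs often answer the user's follow-ups directly.
    for qa in serp.get("people_also_ask", []):
        lines.append(f"Q: {qa['question']} A: {qa['answer']}")
    return "\n".join(lines)

sample = {
    "organic_results": [
        {"title": "Top Python Web Frameworks", "url": "https://example.com/1",
         "description": "Django, FastAPI, and Flask...", "position": 1}
    ],
    "knowledge_graph": {"title": "Python", "type": "Programming language",
                        "description": "A high-level language..."},
    "people_also_ask": [],
}
print(serp_to_context(sample))
```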

Step 4: Build a LangGraph Agent

The recommended way to build agents in LangChain is with LangGraph's create_react_agent. This is the pattern shown in the langchain-scavio README:

Python
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_scavio import ScavioSearch

agent = create_react_agent(
    ChatOpenAI(model="gpt-4o"),
    tools=[ScavioSearch(max_results=5)],
)

response = agent.invoke({
    "messages": [
        {"role": "user", "content": "What are the latest AI regulations in the EU?"}
    ]
})

# Print the final answer
print(response["messages"][-1].content)

The agent will automatically decide when to call the search tool based on the user's question. For questions it can answer from training data, it will respond directly. For current information, it will invoke ScavioSearch.

Step 5: Configure What Data You Get Back

ScavioSearch lets you control which SERP sections are included in the response. This is important for two reasons: it reduces token usage (less data sent to the LLM), and it reduces cost (light requests cost 1 credit, full requests cost 2).

Python
tool = ScavioSearch(
    max_results=5,
    include_knowledge_graph=True,    # entity data (default: True)
    include_questions=True,          # People Also Ask (default: True)
    include_related=True,            # related queries (default: False)
    include_ai_overviews=False,      # AI overviews (default: False)
    include_news_results=False,      # news results (default: False)
    country_code="us",               # ISO 3166-1 alpha-2
    language="en",                   # ISO 639-1
)

See the full parameter reference for all available options.
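As a rough way to reason about cost, the sketch below assumes -- and this mapping is a guess, not documented behavior -- that the defaults make a light request (1 credit) and that enabling any opt-in section upgrades it to a full request (2 credits):

```python
def estimate_credits(include_related=False, include_ai_overviews=False,
                     include_news_results=False):
    """Estimate per-request credit cost.

    Assumption (not documented): the defaults are a light request
    (1 credit); turning on any opt-in section makes it full (2 credits).
    """
    full = include_related or include_ai_overviews or include_news_results
    return 2 if full else 1

print(estimate_credits())                           # -> 1
print(estimate_credits(include_ai_overviews=True))  # -> 2
```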

Step 6: Async Support for High-Throughput Pipelines

If you are processing multiple queries concurrently -- for example, in a multi-step research agent -- use the async variant. ScavioSearch implements native ainvoke using aiohttp:

Python
import asyncio
from langchain_scavio import ScavioSearch

tool = ScavioSearch(max_results=3)

async def parallel_search():
    queries = [
        "python async frameworks 2026",
        "rust web frameworks comparison",
        "go vs rust for web APIs",
    ]
    tasks = [tool.ainvoke({"query": q}) for q in queries]
    results = await asyncio.gather(*tasks)
    for query, result in zip(queries, results):
        print(f"--- {query} ---")
        print(str(result)[:200])

asyncio.run(parallel_search())
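When query lists grow large, an unbounded gather can trip rate limits. A common refinement, sketched here in pure asyncio with a stub standing in for tool.ainvoke, is to cap in-flight requests with a semaphore:

```python
import asyncio

async def bounded_search(queries, search_fn, limit=2):
    """Run searches concurrently, capping in-flight requests at `limit`."""
    sem = asyncio.Semaphore(limit)

    async def one(q):
        async with sem:
            return await search_fn(q)

    # gather preserves input order even though requests overlap
    return await asyncio.gather(*(one(q) for q in queries))

async def fake_search(q):
    # Stand-in for tool.ainvoke({"query": q}) so the sketch is runnable.
    await asyncio.sleep(0)
    return f"results for {q}"

print(asyncio.run(bounded_search(["a", "b", "c"], fake_search)))
# -> ['results for a', 'results for b', 'results for c']
```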

Step 7: Error Handling

ScavioSearch handles errors gracefully without crashing your agent:

  • Empty results raise a ToolException with actionable suggestions so the LLM can retry with a different query
  • API errors (rate limits, invalid keys) return an {"error": "message"} dict instead of throwing, so the agent loop continues
  • Set handle_tool_error=True on the tool to let LangChain pass error messages to the LLM as context for self-correction

Python
# The LLM receives error context and can retry with a different query
tool = ScavioSearch(max_results=5, handle_tool_error=True)
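The self-correction loop this enables can be sketched outside LangChain. Here run_search stands in for tool.invoke and rewrite for the LLM proposing a new query; both are stubs for illustration, not library APIs:

```python
def search_with_retry(run_search, rewrite, query, max_attempts=3):
    """Retry with a rewritten query whenever an error dict comes back."""
    result = {"error": "no attempts made"}
    for _ in range(max_attempts):
        result = run_search(query)
        if not (isinstance(result, dict) and "error" in result):
            return result  # success: structured SERP data
        query = rewrite(query, result["error"])  # let the "LLM" try again
    return result  # still an error dict after max_attempts

attempts = []
def flaky_search(q):
    # Fails once, then succeeds -- simulates a transient empty result.
    attempts.append(q)
    return {"error": "no results"} if len(attempts) == 1 else {"organic_results": []}

print(search_with_retry(flaky_search, lambda q, e: q + " 2026", "python frameworks"))
# -> {'organic_results': []}
```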

Migrating from Tavily

If you are currently using Tavily, switching requires minimal code changes. The API surface is intentionally similar:

Python
# Before: Tavily
from langchain_tavily import TavilySearch

tool = TavilySearch(max_results=5)

# After: Scavio
from langchain_scavio import ScavioSearch

tool = ScavioSearch(max_results=5)

The key difference is what you get back: Scavio returns the full SERP structure (knowledge graphs, People Also Ask, related searches) while Tavily returns simplified summaries. For a detailed comparison, see our LangChain search tool comparison.

What You Get vs. a Plain Search

To understand why structured SERP data matters, consider what happens when an agent searches for "OpenAI":

| Data type                                  | Plain search tool | ScavioSearch          |
| ------------------------------------------ | ----------------- | --------------------- |
| Organic results (title, URL, snippet)      | Yes (text only)   | Yes (structured JSON) |
| Knowledge graph (entity type, founded, HQ) | No                | Yes                   |
| People Also Ask (related Q&A pairs)        | No                | Yes                   |
| Related searches (follow-up queries)       | No                | Yes                   |
| AI overviews (Google's AI summary)         | No                | Yes (opt-in)          |
| Result position / ranking data             | No                | Yes                   |

Next Steps