
Real-Time SERP Data in AI Agents (Knowledge Graphs, PAA & More)

Why structured SERP data makes AI agents smarter. Practical patterns for using knowledge graphs, People Also Ask, and related searches to build grounded, multi-step agents.

13 min read

Most search tools give your AI agent a flat list of links and snippets. The agent reads them, tries to extract facts, and generates an answer. It works -- until it does not. The agent cannot tell which results are about the same entity. It cannot see what related questions people are asking. It has no way to discover follow-up queries it should run.

This is the difference between a search tool that returns text and one that returns structured SERP data. A modern Google results page contains organic results, knowledge graphs, People Also Ask boxes, featured snippets, related searches, and AI overviews. When your agent can access all of this as structured JSON, it reasons better, hallucinates less, and produces more comprehensive answers.

Anatomy of a Search Engine Results Page

A SERP is not a list of links. It is a rich data structure that Google assembles from multiple sources. Here is what a typical SERP contains and why each element matters for AI agents:

SERP element      | What it contains                                        | Why agents need it
Organic results   | Title, URL, snippet, position                           | Primary source links for grounding claims
Knowledge graph   | Entity type, description, attributes, related entities  | Structured facts without parsing web pages
People Also Ask   | Related questions with expandable answers               | Intent signals, gap detection, answer validation
Related searches  | Alternative queries suggested by Google                 | Follow-up queries for multi-step research
Featured snippets | Highlighted answer extracted from a top page            | Direct answer for factual questions
AI overviews      | Google's AI-generated summary                           | Pre-synthesized context from Google's own models

Case Study: Flat Results vs. Structured SERP

To see the difference in practice, consider an agent asked: "Tell me about OpenAI."

With a flat search tool

The agent receives a list of 5-10 search results, each with a title, URL, and text snippet. To answer the question, the LLM must:

  1. Parse snippets to extract facts (founded, CEO, products)
  2. Guess which results are most authoritative
  3. Infer entity relationships from unstructured text
  4. Hope the snippets contain enough information

The result is often a surface-level answer that may mix up facts from different entities or miss key information that was not in the snippets.

With structured SERP data

The agent receives the same organic results plus structured data:

JSON
{
  "knowledge_graph": {
    "title": "OpenAI",
    "type": "Artificial intelligence company",
    "description": "OpenAI is an American artificial intelligence research organization...",
    "attributes": {
      "Founded": "December 11, 2015",
      "Headquarters": "San Francisco, California",
      "CEO": "Sam Altman",
      "Number of employees": "3,500+"
    },
    "related_entities": [
      {"name": "ChatGPT", "type": "Software"},
      {"name": "GPT-4", "type": "Large language model"},
      {"name": "DALL-E", "type": "AI image generator"}
    ]
  },
  "people_also_ask": [
    {
      "question": "Is OpenAI still a nonprofit?",
      "answer": "OpenAI was originally founded as a nonprofit in 2015, but restructured in 2019 to create a 'capped profit' subsidiary..."
    },
    {
      "question": "What is OpenAI's latest model?",
      "answer": "As of 2026, OpenAI's latest flagship model is GPT-4.5..."
    },
    {
      "question": "How much does OpenAI cost?",
      "answer": "ChatGPT Plus costs $20/month. API pricing varies by model..."
    }
  ],
  "related_searches": [
    "openai api pricing 2026",
    "openai competitors",
    "openai vs anthropic",
    "sam altman net worth"
  ]
}

Now the agent has structured facts it can reference directly. It does not need to guess the founding date -- it is in the knowledge graph. It can see what related questions other users ask and use those to provide a more complete answer. And it has related searches to drive follow-up queries if needed.
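In code, "referencing facts directly" can be as simple as flattening the knowledge graph into prompt-ready statements. A minimal sketch, assuming the response shape shown above (the `kg_to_facts` helper is illustrative, not part of any library):

```python
# Sketch: turn a knowledge-graph object into grounded fact strings
# the agent can quote without re-parsing web pages.

def kg_to_facts(kg: dict) -> list[str]:
    """Flatten a knowledge-graph dict into one fact per line."""
    facts = [f"{kg['title']} ({kg['type']})"]
    for key, value in kg.get("attributes", {}).items():
        facts.append(f"{key}: {value}")
    for ent in kg.get("related_entities", []):
        facts.append(f"Related: {ent['name']} ({ent['type']})")
    return facts

kg = {
    "title": "OpenAI",
    "type": "Artificial intelligence company",
    "attributes": {"Founded": "December 11, 2015", "CEO": "Sam Altman"},
    "related_entities": [{"name": "ChatGPT", "type": "Software"}],
}
for fact in kg_to_facts(kg):
    print(fact)
```

Injecting these lines into the system prompt as "verified facts" is usually cheaper and more reliable than asking the model to re-derive them from snippets.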

Three Practical Patterns

Pattern 1: Entity Resolution with Knowledge Graphs

When your agent searches for a company, person, or place, the knowledge graph provides structured facts without needing to scrape and parse web pages. This is faster, more reliable, and cheaper in tokens.

Python
from langchain_scavio import ScavioSearch

tool = ScavioSearch(
    max_results=3,
    include_knowledge_graph=True,
)

result = tool.invoke({"query": "Anthropic AI company"})
# The knowledge_graph object includes:
# - Entity type (AI safety company)
# - Description (from Google's knowledge base)
# - Founded date, CEO, headquarters
# - Related entities (Claude, Constitutional AI)
# Your agent can directly reference these facts

This pattern is especially useful for agents that need to verify claims or cross-reference information. Instead of asking the LLM to infer facts from search snippets, you give it structured data it can directly consume. See the API reference for the full knowledge graph response schema.
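For illustration, a claim check against knowledge-graph attributes might look like the following. `verify_claim` is a hypothetical helper, and the attribute keys follow the example response above; real entities may expose different keys:

```python
# Sketch: cross-check an extracted claim against knowledge-graph
# attributes before the agent asserts it as fact.

def verify_claim(kg_attributes: dict, field: str, claimed: str) -> bool:
    """True if the claimed value appears in the attribute's text."""
    actual = kg_attributes.get(field)
    return actual is not None and claimed.lower() in actual.lower()

attrs = {"Founded": "December 11, 2015", "CEO": "Sam Altman"}
print(verify_claim(attrs, "CEO", "sam altman"))   # True
print(verify_claim(attrs, "Founded", "2019"))     # False
```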

Pattern 2: Research Depth with People Also Ask

PAA data reveals what real users want to know about a topic. This is valuable for research agents in three ways:

  • Gap detection -- if PAA questions are not covered by the initial search results, the agent can run follow-up queries to fill those gaps
  • Scope expansion -- PAA reveals angles the agent (and the user) might not have considered. "Is OpenAI still a nonprofit?" is a question a research agent should address even if the user did not ask
  • Answer validation -- PAA answers provide a second source to cross-check information extracted from organic results

Python
tool = ScavioSearch(
    max_results=5,
    include_questions=True,  # People Also Ask (default: True)
)

result = tool.invoke({"query": "LangChain vs LlamaIndex 2026"})

# Use PAA questions as follow-up queries
# for a multi-step research agent:
#
# PAA: "Is LlamaIndex better for RAG?"
# PAA: "Can you use LangChain and LlamaIndex together?"
# PAA: "What is the difference between agents and chains?"
#
# Each becomes a search query in the next iteration
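The gap-detection bullet above can be sketched without any agent framework. This toy `uncovered_questions` helper (an assumption, not library code) queues PAA questions whose key terms are missing from the snippets collected so far; the term-matching heuristic is deliberately simple:

```python
# Sketch: queue PAA questions not covered by snippets already retrieved.

def uncovered_questions(paa: list[dict], snippets: list[str]) -> list[str]:
    corpus = " ".join(snippets).lower()
    gaps = []
    for item in paa:
        # Key terms: words longer than 3 chars, punctuation stripped
        terms = [w.strip("?,.") for w in item["question"].lower().split()
                 if len(w.strip("?,.")) > 3]
        # Covered only if at least half the key terms appear in snippets
        if sum(t in corpus for t in terms) < len(terms) / 2:
            gaps.append(item["question"])
    return gaps

paa = [
    {"question": "Is LlamaIndex better for RAG?"},
    {"question": "Can you use LangChain and LlamaIndex together?"},
]
snippets = ["LlamaIndex is optimized for RAG pipelines over your data..."]
print(uncovered_questions(paa, snippets))
```

Each returned question becomes a search query in the agent's next iteration; in production you would likely swap the keyword heuristic for embedding similarity.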

Pattern 3: Adaptive Query Refinement with Related Searches

Instead of relying on the LLM to generate follow-up queries from scratch (which costs tokens and may produce poor queries), use the related searches that Google already suggests:

Python
tool = ScavioSearch(
    max_results=5,
    include_related=True,  # related searches (default: False)
)

result = tool.invoke({"query": "best search API for AI agents"})

# Related searches returned by Google:
# - "tavily vs serpapi vs scavio"
# - "langchain search tool comparison"
# - "web search api for llm"
# - "real-time search api pricing"
#
# These are search queries that real users run --
# they are more likely to return relevant results
# than LLM-generated queries

This pattern is particularly effective in LangGraph research agents, where you can add a "refine" node that examines related searches and generates additional queries to fill gaps in the research.
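A framework-free sketch of that refine step: select related searches that have not been run yet, with a light normalization to skip near-duplicate queries. The `pick_followups` name and the word-sorting normalization rule are assumptions for illustration:

```python
# Sketch: choose unseen related searches as the next round of queries.

def pick_followups(related: list[str], seen: set[str], limit: int = 2) -> list[str]:
    picked = []
    for query in related:
        # Normalize by sorting words so reordered duplicates collide
        key = " ".join(sorted(query.lower().split()))
        if key not in seen:
            seen.add(key)
            picked.append(query)
        if len(picked) == limit:
            break
    return picked

seen = {" ".join(sorted("best search api for ai agents".split()))}
related = [
    "web search api for llm",
    "best search API for AI agents",   # duplicate of the original query
    "real-time search api pricing",
]
print(pick_followups(related, seen))
```

In a LangGraph setup, this logic would live in the "refine" node, with `seen` carried in the graph state between iterations.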

When Structured SERP Data Is Overkill

Not every use case needs the full SERP structure. If your agent only answers simple factual questions ("What is the capital of France?"), a flat search result or even a direct API call is sufficient. Structured SERP data adds the most value when:

  • The agent needs to reason about entities and relationships
  • Multi-step research requires follow-up queries
  • The user expects comprehensive answers with multiple perspectives
  • Answer accuracy matters enough to cross-reference sources

For simple Q&A chatbots, a simplified search response may be sufficient. See our comparison of LangChain search tools for guidance on choosing the right tool for your use case.

Cost Optimization Tips

Structured SERP data is richer but also larger. Here is how to keep token and API costs under control:

  1. Use field-level filtering -- only enable the SERP sections your agent actually uses. If you do not need AI overviews, set include_ai_overviews=False. ScavioSearch has 12 toggleable sections.
  2. Limit result count -- max_results=5 is usually sufficient. Going to 10+ results rarely improves answer quality but doubles token usage.
  3. Use light requests -- Scavio's light request mode costs 1 credit (vs. 2 for full) and still returns all SERP sections. It is the default and covers most use cases.
  4. Cache repeated queries -- if your agent frequently searches for the same entities, cache results at the application level to avoid redundant API calls.
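Tip 4 can be as simple as a small TTL cache in front of the search call. In this sketch, the `search` function is a stand-in for any SERP tool call; the `TTLCache` class is illustrative, not part of any library:

```python
# Sketch: application-level cache with a time-to-live, so repeated
# entity lookups skip the API while stale data eventually refreshes.
import time

class TTLCache:
    def __init__(self, ttl_seconds: float = 3600):
        self.ttl = ttl_seconds
        self.store: dict[str, tuple[float, object]] = {}

    def get_or_fetch(self, query, fetch):
        now = time.monotonic()
        hit = self.store.get(query)
        if hit and now - hit[0] < self.ttl:
            return hit[1]          # fresh enough: no API call
        value = fetch(query)
        self.store[query] = (now, value)
        return value

calls = []
def search(q):
    calls.append(q)                # stands in for a paid API call
    return {"query": q}

cache = TTLCache()
cache.get_or_fetch("Anthropic AI company", search)
cache.get_or_fetch("Anthropic AI company", search)  # served from cache
print(len(calls))
```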

Getting Started

The fastest way to try structured SERP data in your agent:

Python
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_scavio import ScavioSearch

agent = create_react_agent(
    ChatOpenAI(model="gpt-4o"),
    tools=[ScavioSearch(
        max_results=5,
        include_knowledge_graph=True,
        include_questions=True,
        include_related=True,
    )],
)

response = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Research the current state of AI regulation in the EU"
    }]
})

print(response["messages"][-1].content)

Next Steps