What Powers DeepSeek Search: Serper and Exa Backends

DeepSeek uses Serper for Google SERP and Exa for semantic search. How to replicate the architecture with any search API.

DeepSeek uses Serper for Google SERP data and Exa for semantic web search as its primary search backends. When DeepSeek V4 activates its search tool, queries route to these providers to fetch live results that get injected into the model context window before generating a grounded response. This is the same architecture any developer can replicate with a search API and an LLM.

Serper: the Google data layer

Serper provides Google SERP data at $50/yr (Dev, 50K searches) or $50/mo (Pro, 500K searches). It returns organic results, featured snippets, People Also Ask, knowledge panels, and local results from Google. DeepSeek uses Serper when it needs traditional web search results -- finding documentation, current pricing, news articles, and factual information.
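As a sketch, a direct Serper call looks like the following. The endpoint and field names match Serper's public docs as I understand them (`X-API-KEY` header, `q`/`num` body fields, `organic` results key), but verify against the current API before relying on them:

```python
import os

import requests

SERPER_URL = "https://google.serper.dev/search"  # Serper's main search endpoint

def build_serper_request(query: str, num: int = 5) -> dict:
    """Assemble the headers and JSON body for a Serper search call."""
    return {
        "headers": {
            "X-API-KEY": os.environ.get("SERPER_API_KEY", ""),
            "Content-Type": "application/json",
        },
        "json": {"q": query, "num": num},
    }

def serper_search(query: str, num: int = 5) -> list[dict]:
    """POST the query and return the list of organic results."""
    req = build_serper_request(query, num)
    resp = requests.post(SERPER_URL, headers=req["headers"],
                         json=req["json"], timeout=30)
    resp.raise_for_status()
    return resp.json().get("organic", [])
```

Each organic result carries a title, link, and snippet, which is exactly the shape you want to inject into a model's context.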

Exa: the semantic search layer

Exa provides neural search that understands meaning rather than just matching keywords. At 1K free searches/month and $5/1K after, it excels at finding conceptually related content. DeepSeek uses Exa when the query is more exploratory -- "papers about transformer efficiency" rather than "transformer model latest version."
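A direct Exa call has the same shape. The endpoint and body fields here follow Exa's public docs as I understand them (`numResults`, `type: "neural"` for embedding-based search); treat the details as assumptions to double-check:

```python
import os

import requests

EXA_URL = "https://api.exa.ai/search"  # Exa's search endpoint

def build_exa_request(query: str, num_results: int = 10) -> dict:
    """Assemble the headers and JSON body for an Exa neural search."""
    return {
        "headers": {"x-api-key": os.environ.get("EXA_API_KEY", "")},
        "json": {"query": query, "numResults": num_results, "type": "neural"},
    }

def exa_search(query: str, num_results: int = 10) -> list[dict]:
    """POST the query and return Exa's result list."""
    req = build_exa_request(query, num_results)
    resp = requests.post(EXA_URL, headers=req["headers"],
                         json=req["json"], timeout=30)
    resp.raise_for_status()
    return resp.json().get("results", [])
```

Note the exploratory phrasing matters more with Exa: you describe the kind of content you want rather than the keywords it should contain.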

How to replicate DeepSeek search locally

Python
import os, requests

def deepseek_style_search(query: str, search_type: str = "auto") -> dict:
    """
    Replicate DeepSeek's search grounding pattern.
    Route to keyword search for factual queries,
    semantic search for exploratory queries.
    """
    # Simple routing heuristic
    factual_signals = ["price", "cost", "version", "release", "how to",
                       "what is", "when did", "latest"]
    is_factual = any(s in query.lower() for s in factual_signals)

    if search_type == "auto":
        search_type = "factual" if is_factual else "semantic"

    if search_type == "factual":
        # Use structured SERP API (like DeepSeek uses Serper)
        resp = requests.post(
            "https://api.scavio.dev/api/v1/search",
            headers={"x-api-key": os.environ["SCAVIO_API_KEY"]},
            json={"query": query, "num_results": 5,
                  "include_ai_overview": True},
            timeout=30,
        )
        resp.raise_for_status()
        data = resp.json()
        return {
            "type": "factual",
            "results": [
                {"title": r["title"], "url": r["link"],
                 "text": r["snippet"]}
                for r in data.get("organic_results", [])
            ],
            "ai_overview": data.get("ai_overview", {}).get("text", ""),
        }
    else:
        # Same API, different query framing
        resp = requests.post(
            "https://api.scavio.dev/api/v1/search",
            headers={"x-api-key": os.environ["SCAVIO_API_KEY"]},
            json={"query": query, "num_results": 10},
            timeout=30,
        )
        resp.raise_for_status()
        data = resp.json()
        return {
            "type": "semantic",
            "results": [
                {"title": r["title"], "url": r["link"],
                 "text": r["snippet"]}
                for r in data.get("organic_results", [])
            ],
        }

# Factual query (would use Serper in DeepSeek)
pricing = deepseek_style_search("SerpAPI pricing 2026")
print(f"Found {len(pricing['results'])} results (factual)")

# Exploratory query (would use Exa in DeepSeek)
research = deepseek_style_search("approaches to reducing LLM hallucination")
print(f"Found {len(research['results'])} results (semantic)")

Building a full search-grounded agent

Python
from anthropic import Anthropic

client = Anthropic()

def grounded_agent(question: str) -> str:
    """Agent with DeepSeek-style search grounding."""
    # Step 1: Search for context
    search_result = deepseek_style_search(question)

    # Step 2: Build grounding context
    context_parts = []
    if search_result.get("ai_overview"):
        context_parts.append(f"AI Overview: {search_result['ai_overview']}")
    for r in search_result["results"][:5]:
        context_parts.append(f"[{r['title']}]({r['url']}): {r['text']}")
    context = "\n\n".join(context_parts)

    # Step 3: Generate grounded response
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content":
            f"Answer based on these sources:\n{context}\n\nQuestion: {question}"}],
    )
    return response.content[0].text

answer = grounded_agent("What search APIs does DeepSeek use?")
print(answer)

Why you might want a different backend

Serper is Google-only and Exa is semantic-only, and neither covers Reddit or TikTok. A multi-platform search API gives you Google SERP, Maps, Shopping, News, Reddit, and TikTok data from one integration, which is more versatile for agents that need diverse data sources.
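A minimal sketch of what that routing looks like, assuming a hypothetical `platform` parameter on the search payload. The platform names and payload shape here are illustrative, not a real API contract:

```python
SUPPORTED_PLATFORMS = {"web", "news", "shopping", "maps", "reddit", "tiktok"}

def build_platform_queries(query: str, platforms: list[str]) -> list[dict]:
    """Fan one query out into per-platform request payloads (shape is hypothetical)."""
    unknown = set(platforms) - SUPPORTED_PLATFORMS
    if unknown:
        raise ValueError(f"unsupported platforms: {sorted(unknown)}")
    return [
        {"query": query, "platform": p, "num_results": 5}
        for p in platforms
    ]
```

An agent can then fire these requests in parallel and merge the results, instead of wiring up one SDK per provider.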

Cost to replicate DeepSeek search

  • Serper + Exa: $50/yr + $5/1K = ~$10/mo for 1K searches
  • Scavio alone: $30/mo for 7K credits (covers both factual and semantic)
  • Tavily: $30/mo Researcher (1K-ish queries)
  • Brave: $5/1K queries
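The arithmetic behind those numbers can be checked in a few lines. Prices are taken from the list above; Serper Dev is billed annually, so it is amortized to a monthly figure, and the flat plans assume you stay within their quotas:

```python
def monthly_cost(searches: int) -> dict:
    """Rough monthly cost in USD at the list prices quoted above."""
    per_k = searches / 1000
    return {
        "serper+exa": round(50 / 12 + 5 * per_k, 2),  # $50/yr amortized + $5/1K Exa
        "brave": round(5 * per_k, 2),                 # $5/1K queries
        "scavio": 30.0,                               # flat $30/mo, up to 7K credits
        "tavily": 30.0,                               # flat $30/mo Researcher plan
    }

costs = monthly_cost(1000)  # the 1K-searches/month scenario from the list
```

At 1K searches/month, Serper + Exa comes out to about $9.17, which is where the "~$10/mo" figure comes from; the flat plans only win once volume grows.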

Key takeaway

DeepSeek V4 search is not magic. It is Serper + Exa behind a tool-calling interface. You can build the same thing with any search API and any LLM. The architecture is simple: route query to search, inject results into context, generate grounded response. The value is in the routing logic and prompt design, not the search provider.
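As for the tool-calling interface itself: with Anthropic's tools API, the loop looks roughly like this. The `web_search` tool name and the stub backend are mine; swap in any real search call:

```python
# Schema the model sees; field names follow Anthropic's tools API.
SEARCH_TOOL = {
    "name": "web_search",
    "description": "Search the web for current information.",
    "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

def stub_search(query: str) -> str:
    """Stand-in for a real backend (Serper, Exa, or any search API)."""
    return f"(search results for: {query})"

def tool_loop(question: str) -> str:
    """Let the model decide when to search, feeding results back until it answers."""
    from anthropic import Anthropic  # imported here so the sketch loads without the SDK
    client = Anthropic()
    messages = [{"role": "user", "content": question}]
    while True:
        resp = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=[SEARCH_TOOL],
            messages=messages,
        )
        if resp.stop_reason != "tool_use":
            return resp.content[0].text
        # Echo the assistant turn, then answer each tool call with a tool_result
        messages.append({"role": "assistant", "content": resp.content})
        tool_results = [
            {"type": "tool_result", "tool_use_id": block.id,
             "content": stub_search(block.input["query"])}
            for block in resp.content if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": tool_results})
```

The routing heuristic from earlier slots into `stub_search`: factual queries go to the SERP backend, exploratory ones to the semantic backend.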