How to Combine a Memory Server with Search MCP

Wire a memory MCP server alongside Scavio search MCP to give AI agents persistent context enriched with live web data. Full setup guide.

Combine a memory MCP server with a search MCP server to give your AI agent both persistent context recall and live web data retrieval in a single conversation. Without memory, agents repeat searches for facts they already found. Without search, agents hallucinate when asked about current events. Running both servers together lets the agent store research findings in memory and only search when it encounters genuinely new questions. This tutorial configures both servers in Claude Desktop and demonstrates patterns for efficient memory-augmented search.

Prerequisites

  • Claude Desktop or Claude Code installed
  • A Scavio API key from scavio.dev
  • A memory MCP server (e.g., mem0 or local knowledge graph server)
  • Basic familiarity with MCP configuration

Walkthrough

Step 1: Configure both MCP servers

Add Scavio for search and your memory server to the same MCP configuration file. Claude will have access to tools from both servers simultaneously.

JSON
// .mcp.json or claude_desktop_config.json:
{
  "mcpServers": {
    "scavio": {
      "url": "https://mcp.scavio.dev/mcp",
      "headers": { "x-api-key": "your_scavio_api_key" }
    },
    "memory": {
      "command": "npx",
      "args": ["@modelcontextprotocol/server-memory"]
    }
  }
}

Step 2: Establish the search-then-store pattern

Instruct the agent to check memory first, search only if needed, and always store new findings.

Python
# System prompt or custom command for Claude:
RESEARCH_PROMPT = """
When I ask a factual question:
1. Check memory for existing knowledge on this topic
2. If memory has a recent answer (stored within 7 days), use it
3. If not in memory or outdated, search Google via scavio
4. Store the key findings in memory with a timestamp
5. Answer using the combined context

Always cite whether the answer came from memory or fresh search.
"""

Step 3: Build a research session workflow

Walk through a multi-turn research session where the agent uses memory to avoid redundant searches.

Bash
# Example multi-turn session:
# Turn 1: 'What is Vercel's latest funding round?'
#   -> Agent searches Google via scavio, finds Series E info
#   -> Agent stores: {entity: 'Vercel', funding: 'Series E $250M', date: '2026-05-07'}
#
# Turn 2: 'How does Vercel compare to Netlify on pricing?'
#   -> Agent recalls Vercel info from memory (no search needed)
#   -> Agent searches for Netlify pricing (new topic)
#   -> Stores Netlify pricing in memory
#
# Turn 3: 'Summarize both for my CTO'
#   -> Agent uses memory for both, zero searches needed
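The structured entry in Turn 1 suggests keying memory by entity rather than by raw query text, so later turns can reuse facts even when the question is phrased differently. The sketch below shows one possible shape for such a store; `entity_memory`, `remember`, and `recall` are illustrative names standing in for your memory server's tools, not a specific API.

Python
from datetime import datetime, timezone

# In-memory stand-in for the memory server, keyed by entity rather than query text.
entity_memory: dict[str, dict] = {}

def remember(entity: str, facts: dict) -> None:
    """Store findings about an entity plus a timestamp for later freshness checks."""
    entity_memory[entity.lower()] = {
        'facts': facts,
        'stored_at': datetime.now(timezone.utc).isoformat(),
    }

def recall(entity: str) -> dict | None:
    """Return stored findings for an entity, or None if a search is needed."""
    return entity_memory.get(entity.lower())

# Turn 1: a search happened, so store the finding.
remember('Vercel', {'funding': 'Series E $250M'})

# Turn 2: Vercel comes from memory; Netlify is a miss and triggers a fresh search.
print(recall('Vercel'))   # stored facts, no search needed
print(recall('Netlify'))  # None -> search via scavio, then remember()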

Step 4: Verify memory persistence

Test that stored facts survive across conversations by restarting Claude and querying previously researched topics.

Python
# After restarting Claude Desktop, test recall:
# 'What do you remember about Vercel funding?'
#   -> Should retrieve from memory server without searching
#
# Programmatic verification via API:
import requests, os
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

# Baseline search to seed memory; once the facts are stored, repeat questions
# should be answered from memory without triggering this call again:
def verify_memory_reduces_searches(topic: str):
    # First call: search and store
    resp = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': 'google', 'query': topic}, timeout=15)
    result = resp.json().get('organic_results', [])
    print(f'Search returned {len(result)} results - store these in memory')
    return result

verify_memory_reduces_searches('Vercel series e funding 2026')

Python Example

Python
import requests, os, json
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}
memory = {}

def search_with_memory(query):
    if query in memory:
        print(f'Memory hit: {query}')
        return memory[query]
    data = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': 'google', 'query': query}, timeout=15).json()
    results = data.get('organic_results', [])[:3]
    memory[query] = results
    print(f'Searched and cached: {query}')
    return results

search_with_memory('Vercel funding 2026')
search_with_memory('Vercel funding 2026')  # memory hit

JavaScript Example

JavaScript
const H = {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'};
const memory = new Map();
async function searchWithMemory(query) {
  if (memory.has(query)) { console.log('Memory hit'); return memory.get(query); }
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST', headers: H, body: JSON.stringify({platform: 'google', query})
  });
  const results = (await r.json()).organic_results?.slice(0, 3) || [];
  memory.set(query, results);
  return results;
}
await searchWithMemory('Vercel funding 2026');
await searchWithMemory('Vercel funding 2026'); // memory hit

Expected Output

A dual-MCP setup where Claude uses memory for previously researched topics and only calls the search API for new queries, reducing total API calls by 40-60% in typical research sessions.
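To check whether your own sessions see a reduction like that, count memory hits versus live searches. The sketch below extends the earlier `search_with_memory` pattern with simple counters; the `stats` dictionary and the replayed session are illustrative, and the 40-60% figure depends on how often a session revisits topics, so treat this as a way to measure rather than a guarantee.

Python
import os
import requests

H = {'x-api-key': os.environ['SCAVIO_API_KEY']}
memory: dict[str, list] = {}
stats = {'hits': 0, 'searches': 0}

def search_with_memory(query: str) -> list:
    """Answer from the session cache when possible; otherwise search and cache."""
    if query in memory:
        stats['hits'] += 1
        return memory[query]
    stats['searches'] += 1
    data = requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': 'google', 'query': query}, timeout=15).json()
    memory[query] = data.get('organic_results', [])[:3]
    return memory[query]

# Replay a short session and report how many calls memory absorbed.
for q in ['Vercel funding 2026', 'Netlify pricing', 'Vercel funding 2026']:
    search_with_memory(q)
total = stats['hits'] + stats['searches']
print(f"{stats['hits']} of {total} queries answered from memory "
      f"({stats['hits'] / total:.0%} fewer search API calls)")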

Frequently Asked Questions

How long does this tutorial take?

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What do I need before starting?

Claude Desktop or Claude Code installed. A Scavio API key from scavio.dev. A memory MCP server (e.g., mem0 or a local knowledge graph server). Basic familiarity with MCP configuration. A Scavio API key gives you 500 free credits per month.

Can I complete this tutorial on the free tier?

Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio only work through MCP?

No. Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. The code examples in this tutorial use the raw REST API, but you can adapt them to your framework of choice.
