
How to Build a Web Extraction MCP Server

Build an MCP server that exposes web search and page extraction as tools for AI agents. Uses Scavio search and extract endpoints.

This tutorial wraps Scavio's search and extract endpoints as MCP tools that any compatible AI client can call, giving agents the ability to both find relevant pages and pull structured content from them in a single workflow. The search tool discovers URLs via Google, Reddit, YouTube, Amazon, or Walmart queries; the extract tool returns clean text and metadata for any URL. Both tools live in one MCP server built with the MCP SDK.

Prerequisites

  • Node.js 18+ installed
  • A Scavio API key from scavio.dev
  • Basic understanding of the MCP protocol
  • npm or bun package manager

Walkthrough

Step 1: Scaffold the MCP server

Set up a minimal MCP server project using the MCP SDK that registers two tools: search and extract.

// package.json (zod is imported below, so it must be declared too):
// { "dependencies": { "@modelcontextprotocol/sdk": "latest", "zod": "latest" } }

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

const API_KEY = process.env.SCAVIO_API_KEY;
const server = new McpServer({ name: 'web-extraction', version: '1.0.0' });
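The server reads its key from the environment. A minimal fail-fast guard avoids sending unauthenticated requests later; `requireApiKey` is a hypothetical helper sketched here, not part of the SDK:

```javascript
// Hypothetical helper: fail fast when the key is missing rather than
// letting every tool call come back as a 401 from the API.
function requireApiKey(env = process.env) {
  const key = env.SCAVIO_API_KEY;
  if (!key) {
    throw new Error('SCAVIO_API_KEY is not set; create a key at scavio.dev');
  }
  return key;
}
```

Calling `requireApiKey()` once at startup turns a misconfigured deployment into an immediate, readable error instead of a string of failed tool calls.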

Step 2: Add the search tool

Register a search tool that queries Scavio and returns structured results the agent can use to decide which pages to extract.

server.tool('search', 'Search Google, Reddit, YouTube, Amazon, or Walmart',
  { query: z.string(), platform: z.string().default('google') },
  async ({ query, platform }) => {
    const resp = await fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST',
      headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
      body: JSON.stringify({ platform, query }),
    });
    if (!resp.ok) {
      // Surface API errors to the agent instead of returning malformed JSON.
      return { content: [{ type: 'text', text: `Search failed: HTTP ${resp.status}` }], isError: true };
    }
    const data = await resp.json();
    // Return only the top five organic results so the agent can pick pages to extract.
    return { content: [{ type: 'text', text: JSON.stringify(data.organic_results?.slice(0, 5) || [], null, 2) }] };
  }
);
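The schema above accepts any string for `platform`. A stricter sketch validates the argument against the supported platforms before the request goes out, so typos fail locally instead of burning an API credit; `SUPPORTED_PLATFORMS` and `normalizePlatform` are hypothetical names, and the platform list is assumed from the tool description above:

```javascript
// Platforms the search tool's description advertises.
const SUPPORTED_PLATFORMS = ['google', 'reddit', 'youtube', 'amazon', 'walmart'];

// Hypothetical helper: normalize and validate the platform argument.
function normalizePlatform(platform = 'google') {
  const p = platform.toLowerCase().trim();
  if (!SUPPORTED_PLATFORMS.includes(p)) {
    throw new Error(`Unsupported platform "${platform}"; use one of: ${SUPPORTED_PLATFORMS.join(', ')}`);
  }
  return p;
}
```

With zod, the same constraint fits directly in the tool schema as `z.enum(['google', 'reddit', 'youtube', 'amazon', 'walmart'])`, which also documents the allowed values to the calling agent.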

Step 3: Add the extract tool

Register an extract tool that takes a URL and returns the page's clean text content via Scavio's extract endpoint.

server.tool('extract', 'Extract clean text content from a URL',
  { url: z.string().url() },
  async ({ url }) => {
    const resp = await fetch('https://api.scavio.dev/api/v1/extract', {
      method: 'POST',
      headers: { 'x-api-key': API_KEY, 'Content-Type': 'application/json' },
      body: JSON.stringify({ url }),
    });
    if (!resp.ok) {
      return { content: [{ type: 'text', text: `Extract failed: HTTP ${resp.status}` }], isError: true };
    }
    const data = await resp.json();
    // Prefer the plain-text field; fall back to raw JSON if the response shape differs.
    return { content: [{ type: 'text', text: data.text || data.content || JSON.stringify(data) }] };
  }
);
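Extracted pages can be very long, and everything a tool returns lands in the model's context. A truncation guard the extract handler could apply before returning is sketched below; `truncateForContext` is a hypothetical helper, and the 20,000-character cap is an arbitrary assumption to tune for your model:

```javascript
// Hypothetical helper: cap tool output so one large page cannot
// crowd out the rest of the agent's context window.
function truncateForContext(text, maxChars = 20000) {
  if (typeof text !== 'string') return '';
  if (text.length <= maxChars) return text;
  const omitted = text.length - maxChars;
  return text.slice(0, maxChars) + `\n\n[truncated ${omitted} characters]`;
}
```

The trailing marker tells the agent that content was dropped, so it can call extract again or narrow its question rather than assume it saw the whole page.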

Step 4: Start the server and configure Claude

Connect the server via stdio transport and add it to your Claude Desktop or Claude Code MCP config.

// Start the server:
const transport = new StdioServerTransport();
await server.connect(transport);

// In .mcp.json:
// {
//   "mcpServers": {
//     "web-extraction": {
//       "command": "node",
//       "args": ["server.js"],
//       "env": { "SCAVIO_API_KEY": "your_key" }
//     }
//   }
// }
//
// Now Claude can: search for pages, then extract content from the best results.
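Once connected, the client drives these tools over JSON-RPC. A `tools/call` request for the search tool is shaped roughly like this (per the MCP specification; the `id` and argument values are placeholders):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search",
    "arguments": { "query": "best crm 2026", "platform": "google" }
  }
}
```

You normally never construct this by hand — the client does — but it is useful to know when debugging a server over stdio.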

Python Example

import requests, os
H = {'x-api-key': os.environ['SCAVIO_API_KEY']}

def search(query, platform='google'):
    return requests.post('https://api.scavio.dev/api/v1/search', headers=H,
        json={'platform': platform, 'query': query}).json().get('organic_results', [])

def extract(url):
    return requests.post('https://api.scavio.dev/api/v1/extract', headers=H,
        json={'url': url}).json()

results = search('best crm 2026')
if results:
    content = extract(results[0]['link'])
    print(content.get('text', '')[:500])

JavaScript Example

const H = {'x-api-key': process.env.SCAVIO_API_KEY, 'Content-Type': 'application/json'};
async function search(query) {
  const r = await fetch('https://api.scavio.dev/api/v1/search', {
    method: 'POST', headers: H, body: JSON.stringify({platform: 'google', query})
  });
  return (await r.json()).organic_results || [];
}
async function extract(url) {
  const r = await fetch('https://api.scavio.dev/api/v1/extract', {
    method: 'POST', headers: H, body: JSON.stringify({url})
  });
  return r.json();
}
const results = await search('best crm 2026');
if (results[0]) console.log((await extract(results[0].link)).text?.slice(0, 500));
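The example above extracts only the first result. A sketch that fans out over the top results concurrently while tolerating individual failures is below; `extractTopResults` is a hypothetical helper that takes the `extract` function above (or any `url => Promise` lookalike) as a parameter:

```javascript
// Hypothetical helper: extract several result URLs in parallel.
// Failed extractions are dropped rather than failing the whole batch.
async function extractTopResults(results, extractFn, limit = 3) {
  const urls = results.slice(0, limit).map(r => r.link).filter(Boolean);
  const settled = await Promise.allSettled(urls.map(url => extractFn(url)));
  return settled
    .filter(s => s.status === 'fulfilled')
    .map(s => s.value);
}
```

Usage: `const pages = await extractTopResults(results, extract);` — `Promise.allSettled` keeps one dead link from sinking the whole batch.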

Expected Output

An MCP server exposing search and extract tools that AI agents can call to discover URLs and extract their content in a single workflow.

Frequently Asked Questions

How long does this tutorial take?
Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

What are the prerequisites?
Node.js 18+, a Scavio API key from scavio.dev, basic understanding of the MCP protocol, and the npm or bun package manager. A Scavio API key gives you 500 free credits per month.

Can I complete this on the free tier?
Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Does Scavio integrate with agent frameworks?
Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial calls the REST API from inside an MCP server, but you can adapt it to your framework of choice.
