The Problem
LangChain projects that use web scraping tools (BeautifulSoup loaders, Playwright scrapers, or Selenium-based extractors) face constant maintenance: anti-bot detection breaks scrapers, HTML layouts change without warning, and proxy costs escalate. Every scraper in the chain is a fragile point that requires monitoring and patching. The LangChain community has moved toward structured APIs over raw scraping, but migration paths are not well documented.
The Scavio Solution
Replace LangChain scraping tools with Scavio's search API as a custom tool. The migration is straightforward: remove the scraper tool definition, add a new tool that calls Scavio's REST endpoint, and update the prompt to reference the new tool name. The response is already structured JSON, so you do not need a parsing step. For LangChain agents, define the tool with a clear description so the LLM knows when to use it. For LangChain chains, replace the scraper node with an API call node.
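Since the response arrives as structured JSON, the only code you keep is a small formatter. A minimal sketch, assuming the response exposes a top-level "organic" array of objects with "title", "snippet", and "link" fields (the shape used in the code examples on this page):

```python
def format_results(data: dict, limit: int = 5) -> str:
    """Flatten Scavio-style organic results into a newline-joined string."""
    return "\n".join(
        f"{r['title']}: {r['snippet']} ({r['link']})"
        for r in data.get("organic", [])[:limit]
    )

# Illustrative hand-written response, not a real API reply:
sample = {
    "organic": [
        {"title": "Example", "snippet": "An example result.", "link": "https://example.com"},
    ]
}
print(format_results(sample))
```

This is the entire "parsing" layer after migration; there is no HTML traversal to maintain.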
Before
Before migration, a LangChain research agent used a Playwright-based Google scraper that required 150 lines of Python, a headless browser runtime, and a proxy subscription. The scraper broke every 2-3 weeks when Google updated its layout, requiring emergency patches.
After
After migrating to Scavio's search API, the tool definition is 20 lines of Python. No browser runtime, no proxy subscription, no HTML parsing. The agent gets structured JSON directly. The tool has not required a single patch in 4 months because the API contract is stable.
Who It Is For
LangChain developers maintaining brittle web scraping tools who want to migrate to a stable, structured search API without rewriting their agent architecture.
Key Benefits
- Replace 150 lines of scraping code with 20 lines of API calls
- Eliminate headless browser and proxy dependencies
- Structured JSON response requires no HTML parsing
- Stable API contract eliminates layout-change breakage
- Multi-platform coverage from a single tool definition
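The multi-platform benefit can be sketched as a single payload builder parameterized by platform. The platform identifiers below are assumed from the "Platforms Used" list on this page; confirm the exact accepted values against Scavio's API reference.

```python
# Assumed platform identifiers -- verify against the actual API docs.
SUPPORTED_PLATFORMS = {"google", "reddit", "youtube", "amazon", "walmart"}

def build_search_payload(platform: str, query: str) -> dict:
    """Validate the platform and build the body for POST /api/v1/search."""
    if platform not in SUPPORTED_PLATFORMS:
        raise ValueError(f"unsupported platform: {platform!r}")
    return {"platform": platform, "query": query}
```

One tool definition plus a platform argument replaces a separate scraper per site.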
Python Example
from langchain.tools import tool
import requests, os

@tool
def web_search(query: str) -> str:
    """Search the web for current information. Returns structured results from Google."""
    resp = requests.post(
        'https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': os.environ['SCAVIO_API_KEY']},
        json={'platform': 'google', 'query': query},
        timeout=10,
    )
    resp.raise_for_status()  # surface API errors instead of parsing an error body
    results = resp.json().get('organic', [])[:5]
    return '\n'.join(f"{r['title']}: {r['snippet']} ({r['link']})" for r in results)
# Use in a LangChain agent:
# from langchain.agents import AgentExecutor, create_tool_calling_agent
# agent = create_tool_calling_agent(llm, [web_search], prompt)
# agent_executor = AgentExecutor(agent=agent, tools=[web_search])
JavaScript Example
import { tool } from '@langchain/core/tools';
import { z } from 'zod';

const webSearch = tool(
  async ({ query }) => {
    const resp = await fetch('https://api.scavio.dev/api/v1/search', {
      method: 'POST',
      headers: {
        'x-api-key': process.env.SCAVIO_API_KEY,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ platform: 'google', query }),
    });
    if (!resp.ok) throw new Error(`Scavio API error: ${resp.status}`);
    const data = await resp.json();
    return (data.organic || [])
      .slice(0, 5)
      .map((r) => `${r.title}: ${r.snippet} (${r.link})`)
      .join('\n');
  },
  {
    name: 'web_search',
    description: 'Search the web for current information.',
    schema: z.object({ query: z.string().describe('Search query') }),
  }
);
Platforms Used
Google
Web search with knowledge graph, PAA, and AI overviews
Reddit
Community posts and threaded comments from any subreddit
YouTube
Video search with transcripts and metadata
Amazon
Product search with prices, ratings, and reviews
Walmart
Product search with pricing and fulfillment data