
Migrating LangChain Scrapers to Search API Tools

LangChain scraping tools require 150 lines of Python and break every 2-3 weeks. A search API tool is 15 lines and has not needed a patch in months.


The LangChain community has moved from web scrapers to structured APIs for search tools. The reason is simple: scrapers break. A Playwright-based Google scraper requires a headless browser runtime, a proxy subscription, and 150 lines of HTML parsing code that breaks every 2-3 weeks when Google updates its layout. A search API tool requires 15 lines of Python and has not needed a patch since it was written.

The migration is 15 lines

Python
from langchain.tools import tool
import os
import requests

@tool
def web_search(query: str, platform: str = 'google') -> str:
    """Search the web. Platforms: google, reddit, youtube, amazon, walmart."""
    resp = requests.post('https://api.scavio.dev/api/v1/search',
        headers={'x-api-key': os.environ['SCAVIO_API_KEY']},
        json={'platform': platform, 'query': query}, timeout=10)
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    # Keep the top five organic results and format one per line for the LLM.
    results = resp.json().get('organic', [])[:5]
    return '\n'.join(f'{r.get("title", "")}: {r.get("snippet", "")} ({r.get("link", "")})'
                      for r in results)

What the agent gains

The docstring lists the supported platforms, and LangChain passes that docstring to the LLM as the tool description, so the model knows it can route a query to different search types through the platform parameter. Ask about products and the agent can pick Amazon. Ask about community opinions and it can pick Reddit. Ask about tutorials and it can pick YouTube. One tool definition replaces what would otherwise be 5 separate scraper tools.
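Because the LLM chooses the platform value, it can occasionally pass a name the tool does not list. A minimal sketch of a guard for that case follows. The helper name `normalize_platform` and the fallback to `google` are assumptions for illustration, not part of any library or of the Scavio API. The platform names come from the tool's docstring.

```python
# Platforms named in the tool's docstring.
SUPPORTED_PLATFORMS = ("google", "reddit", "youtube", "amazon", "walmart")


def normalize_platform(platform: str, default: str = "google") -> str:
    """Return a supported platform name, or the default if the value is unknown.

    Hypothetical helper: trims whitespace and lowercases the model's choice
    before checking it against the supported list.
    """
    cleaned = platform.strip().lower()
    return cleaned if cleaned in SUPPORTED_PLATFORMS else default
```

One way to use it would be to call `normalize_platform(platform)` at the top of `web_search` before building the request, so an unexpected value degrades to a general web search rather than an API error.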

The stability difference

Scraper-based tools typically need a fix every 2-3 weeks, whenever the target site changes its layout. API-based tools have been running unchanged for months. The API provider absorbs the work of tracking website changes, handling anti-bot detection, and normalizing the response format. Your agent gets a stable interface that does not degrade over time.

Cost comparison at agent scale

Consider a LangChain agent that runs 10 searches per task at 100 tasks per day, or 1,000 searches daily. With Scavio at $0.005 per query, that is $5 per day or $150 per month. A proxy subscription for the equivalent scraping volume costs $50-200 per month for proxies alone, plus compute for headless browsers, plus engineering time for maintenance. The API approach is cheaper than or comparable to scraping on direct costs, and dramatically cheaper once you factor in engineering time.
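The arithmetic above can be reproduced with a few lines of Python, which makes it easy to plug in your own volumes. This is a sketch: the function name is hypothetical, and the 30-day month is an assumption implied by the $150 figure.

```python
def api_cost_usd(searches_per_task: int, tasks_per_day: int,
                 price_per_query: float, days: int = 30) -> float:
    """Estimate API spend over `days` days (30-day month assumed by default)."""
    searches_per_day = searches_per_task * tasks_per_day
    return searches_per_day * price_per_query * days


# Figures from the article: 10 searches per task, 100 tasks per day, $0.005 per query.
daily = api_cost_usd(10, 100, 0.005, days=1)
monthly = api_cost_usd(10, 100, 0.005)
print(f"daily: ${daily:.2f}, monthly: ${monthly:.2f}")
```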