CrewAI Content Researcher with Live SERP Data
Build a CrewAI two-agent pipeline for content research: a researcher collects SERP data, a strategist evaluates gaps. Working code with Scavio integration.
Building a CrewAI content research agent with live SERP data produces better output than using Perplexity summaries or LLM training data alone. Perplexity gives you a summarized answer. Live SERP data gives you the raw search landscape: what ranks, what people ask, what content gaps exist. Two agents working together, a researcher and a strategist, turn this into actionable content briefs.
Why Perplexity summaries are not enough for content research
Perplexity answers questions. Content research requires understanding the search landscape: which pages rank for a keyword, what angles they cover, what questions appear in People Also Ask, and what related searches suggest about user intent. A Perplexity summary collapses all of this into a single answer, losing the competitive intelligence that makes content strategy work.
When you ask Perplexity "best project management tools 2026," you get a list. When you pull SERP data for the same query, you see which domains rank, what their title tags emphasize, which subtopics get dedicated sections, and what questions Google surfaces in PAA boxes. That is the difference between having an answer and having a strategy.
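The difference is easy to see in data terms: a summary is one string, while SERP data keeps per-result fields you can mine for patterns. A minimal sketch with invented records (the `title`/`url`/`snippet` field names mirror common SERP APIs; adjust to whatever your provider returns):

```python
from collections import Counter

# Hypothetical SERP records for "best project management tools 2026".
serp = [
    {"title": "15 Best Project Management Tools (2026)", "url": "https://a.com/pm-tools", "snippet": "..."},
    {"title": "Best Project Management Software 2026", "url": "https://b.com/software", "snippet": "..."},
    {"title": "Asana vs Monday vs ClickUp: 2026 Comparison", "url": "https://a.com/compare", "snippet": "..."},
]

# A summary collapses the landscape into a single answer...
summary = "The best tools are Asana, Monday, and ClickUp."

# ...while raw SERP data preserves competitive signals you can count.
domains = Counter(r["url"].split("/")[2] for r in serp)
listicles = [r["title"] for r in serp if r["title"][0].isdigit()]

print(domains.most_common(1))  # which domain ranks more than once
print(listicles)               # which titles use a numbered-list format
```

Counting domains and title formats is exactly the kind of competitive signal a summarized answer throws away.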
The two-agent CrewAI architecture
The researcher agent searches, collects, and structures raw data. The strategist agent evaluates the data, identifies gaps, and produces a content brief. Separating these roles prevents the common failure where a single agent tries to research and strategize simultaneously, producing shallow analysis.
```python
from crewai import Agent, Task, Crew
from crewai.tools import BaseTool
import httpx

class SerpSearchTool(BaseTool):
    name: str = "serp_search"
    description: str = "Search Google and return SERP results with titles, URLs, and snippets"

    def _run(self, query: str) -> str:
        resp = httpx.post(
            "https://api.scavio.dev/api/v1/search",
            headers={"x-api-key": "sc-xxxx"},
            json={"query": query, "type": "web", "limit": 10},
        )
        results = resp.json().get("results", [])
        output = []
        for i, r in enumerate(results, 1):
            output.append(
                f"{i}. {r.get('title', 'No title')}\n"
                f"   URL: {r.get('url', '')}\n"
                f"   Snippet: {r.get('snippet', '')}"
            )
        return "\n".join(output)

serp_tool = SerpSearchTool()

researcher = Agent(
    role="Content Researcher",
    goal="Collect comprehensive SERP data for a target keyword",
    backstory=(
        "You analyze search results to understand the competitive "
        "landscape. You search the main keyword, related variations, "
        "and question-based queries to build a complete picture."
    ),
    tools=[serp_tool],
    verbose=True,
)

strategist = Agent(
    role="Content Strategist",
    goal="Evaluate SERP data and produce an actionable content brief",
    backstory=(
        "You identify content gaps by comparing what ranks against "
        "what searchers actually need. You produce specific, "
        "actionable content briefs with angles competitors miss."
    ),
    verbose=True,
)
```

Defining the tasks
The researcher runs three searches: the exact keyword, a "best"-modified variation, and a comparison ("vs") variation. Together these capture informational, commercial, and comparison intent. The strategist receives all collected data and produces a structured brief.
```python
research_task = Task(
    description=(
        "Research the keyword: {keyword}\n"
        "1. Search the exact keyword and record all 10 results\n"
        "2. Search 'best {keyword}' and record results\n"
        "3. Search '{keyword} vs' and record results\n"
        "4. Note: which domains appear multiple times, what angles "
        "titles emphasize, what snippets reveal about content depth"
    ),
    expected_output=(
        "Structured SERP analysis with:\n"
        "- Top 10 results for each search (title, URL, snippet)\n"
        "- Domains that rank for multiple queries\n"
        "- Common title patterns and angles\n"
        "- Content depth signals from snippets"
    ),
    agent=researcher,
)

strategy_task = Task(
    description=(
        "Using the SERP research, create a content brief that:\n"
        "1. Identifies the top 3 content gaps (topics competitors miss)\n"
        "2. Recommends a specific angle that differentiates from page 1\n"
        "3. Lists 5-7 sections the article should cover\n"
        "4. Suggests a title that targets the keyword with a unique hook\n"
        "5. Estimates word count based on competing content depth"
    ),
    expected_output=(
        "Content brief with: title, angle, sections, word count, "
        "and competitive gaps to exploit"
    ),
    agent=strategist,
)

crew = Crew(
    agents=[researcher, strategist],
    tasks=[research_task, strategy_task],
    verbose=True,
)

result = crew.kickoff(inputs={"keyword": "ai agent frameworks python"})
```

What the output looks like
The researcher collects 30 results across three queries and surfaces patterns: which domains dominate, what title structures rank, and what content depth the snippets suggest. The strategist uses this to identify gaps. For example, if every ranking page for "ai agent frameworks python" covers LangChain and CrewAI but none compare production deployment patterns, that is a gap worth targeting.
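The gap-finding step the strategist performs can be approximated mechanically. A minimal sketch, with both the candidate subtopics and the competitor titles invented for illustration (in the real pipeline the titles come from the researcher's SERP data):

```python
def find_gaps(candidate_topics, competitor_titles):
    """Return candidate subtopics that no ranking title mentions."""
    titles_lower = [t.lower() for t in competitor_titles]
    return [
        topic for topic in candidate_topics
        if not any(topic.lower() in t for t in titles_lower)
    ]

# Invented example data for "ai agent frameworks python".
titles = [
    "Top 10 AI Agent Frameworks in Python",
    "LangChain vs CrewAI: Which Framework Wins?",
    "Building AI Agents in Python: A Beginner's Guide",
]
candidates = ["LangChain", "CrewAI", "production deployment", "cost comparison"]

print(find_gaps(candidates, titles))  # → ['production deployment', 'cost comparison']
```

Substring matching on titles is crude compared to what an LLM strategist does with full snippets, but it shows why separating data collection from evaluation works: the gap analysis is only as good as the raw titles and snippets fed into it.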
The brief output includes a recommended title, angle, section structure, target word count, and the specific competitive gaps it exploits. This is significantly more actionable than a Perplexity summary, which would just list frameworks without competitive context.
Cost breakdown
Each research cycle runs 3 search queries at $0.005 each = $0.015 per keyword. The LLM cost for two CrewAI agents depends on your model: GPT-4o runs about $0.01-0.03 per task, smaller models less. Total cost per content brief: $0.025-0.045. Running 100 keyword analyses per month costs $1.50 in search API credits, or $2.50-4.50 all-in with LLM usage included.
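The arithmetic is easy to parameterize. A small sketch using the per-search and per-task prices quoted above (plug in your own provider's rates):

```python
def brief_cost(searches_per_keyword=3, price_per_search=0.005,
               llm_cost_low=0.01, llm_cost_high=0.03):
    """Per-brief cost range: search API credits plus LLM usage."""
    search = searches_per_keyword * price_per_search
    return search + llm_cost_low, search + llm_cost_high

low, high = brief_cost()
print(f"per brief: ${low:.3f}-${high:.3f}")
print(f"100 briefs: ${100 * low:.2f}-${100 * high:.2f}")
```

Doubling the number of searches per keyword (say, to also pull PAA-style question queries) only adds $0.015 per brief, so the LLM remains the dominant cost at GPT-4o prices.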
Compare this to Perplexity Pro at $20/month (limited to 300 searches) or Ahrefs at $99/month for keyword data. The CrewAI + SERP API approach is cheaper and produces output tailored to your content workflow rather than generic answers.
Extending the pipeline
Add a third agent for content drafting that takes the strategist's brief and produces a first draft. Add a fourth for SEO optimization that checks keyword density, heading structure, and internal link opportunities. Each agent in the chain gets more specific, and the SERP data collected by the researcher flows through the entire pipeline.
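The data flow through such a chain can be pictured with plain functions standing in for the agents. Everything here is invented for illustration (in CrewAI these would be additional `Agent`/`Task` pairs appended to the `Crew`); the point is that the researcher's SERP data stays available downstream:

```python
# Stand-in functions for each agent in the extended pipeline.
# Names and dict fields are hypothetical, not CrewAI API.

def research(keyword):
    # In the real pipeline this calls the SERP tool several times.
    return {"keyword": keyword, "serp": ["...raw results..."]}

def strategize(serp_data):
    # Produces the brief; SERP data is carried along, not discarded.
    return {"title": f"Guide to {serp_data['keyword']}",
            "gaps": ["production deployment"],
            "serp": serp_data["serp"]}

def draft(brief):
    # The drafting agent sees the gaps and the SERP data upstream.
    return f"# {brief['title']}\n(covers gaps: {', '.join(brief['gaps'])})"

def seo_review(article):
    # Checks structure: here, just that the draft has a proper H1.
    return article.startswith("# ")

article = draft(strategize(research("ai agent frameworks python")))
print(seo_review(article))
```

Each function only consumes what the previous one emits, which is why the SERP data must be threaded through every intermediate result rather than held only by the researcher.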
The key insight is that live SERP data makes every downstream agent better. A drafting agent with knowledge of what ranks produces content that competes. A drafting agent working from LLM training data produces content that sounds good but misses competitive positioning entirely.