Real Estate Data Agent: Multi-Source Architecture

Real estate agents combining Zillow, Redfin, and local MLS data need structured aggregation. A SERP-based approach pulls listing data across platforms at $0.005 per query.

A property search agent that aggregates Google results (Rightmove, Zoopla listings), Reddit market sentiment, and YouTube area reviews into one structured report replaces hours of manual browsing. The agent searches multiple platforms via a single API, normalizes results, and produces a location brief with pricing data, resident opinions, and neighborhood context.

Why multi-source matters for real estate

Rightmove shows listings. Zoopla shows estimates. Neither shows what it is actually like to live somewhere. Reddit has threads like "moving to Bristol, what should I know" with real resident opinions. YouTube has walkthrough videos and area guides. An agent that combines all three source types (property portals, Reddit threads, and YouTube videos) gives a fuller picture: price, sentiment, and context.

The agent architecture

Python
import os
from dataclasses import dataclass

import requests

@dataclass
class PropertyBrief:
    location: str
    listings: list
    market_sentiment: list
    area_reviews: list
    price_signals: list

def build_property_brief(location: str, property_type: str = "2 bed flat") -> PropertyBrief:
    """Aggregate property data from multiple sources."""
    api_key = os.environ["SCAVIO_API_KEY"]
    headers = {"x-api-key": api_key}
    base_url = "https://api.scavio.dev/api/v1/search"

    brief = PropertyBrief(
        location=location,
        listings=[],
        market_sentiment=[],
        area_reviews=[],
        price_signals=[],
    )

    # Source 1: Google (captures Rightmove, Zoopla, OnTheMarket results)
    google_resp = requests.post(
        base_url,
        headers=headers,
        json={
            "query": f"{property_type} for sale {location}",
            "platform": "google",
            "country_code": "gb",
        },
        timeout=15,
    )
    google_resp.raise_for_status()  # surface HTTP errors instead of silently returning no listings
    for r in google_resp.json().get("organic_results", []):
        brief.listings.append({
            "title": r.get("title", ""),
            "url": r.get("link", ""),
            "snippet": r.get("snippet", ""),
            "source": "google",
        })

    # Source 2: Reddit (resident opinions, market sentiment)
    reddit_resp = requests.post(
        base_url,
        headers=headers,
        json={"query": f"living in {location} experience", "platform": "reddit"},
        timeout=15,
    )
    reddit_resp.raise_for_status()  # surface HTTP errors instead of silently returning no threads
    for r in reddit_resp.json().get("organic_results", []):
        brief.market_sentiment.append({
            "title": r.get("title", ""),
            "url": r.get("link", ""),
            "snippet": r.get("snippet", ""),
        })

    # Source 3: YouTube (area guides, walkthrough videos)
    yt_resp = requests.post(
        base_url,
        headers=headers,
        json={"query": f"{location} area guide living", "platform": "youtube"},
        timeout=15,
    )
    yt_resp.raise_for_status()  # surface HTTP errors instead of silently returning no videos
    for r in yt_resp.json().get("organic_results", []):
        brief.area_reviews.append({
            "title": r.get("title", ""),
            "url": r.get("link", ""),
            "snippet": r.get("snippet", ""),
        })

    return brief

# 3 credits per location = $0.015
brief = build_property_brief("Clifton Bristol", "2 bed flat")
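
The three request blocks above repeat the same mapping from `organic_results` entries to flat records. A minimal sketch of a shared normalizer follows, assuming every platform returns the `organic_results`, `title`, `link`, and `snippet` fields used above. `normalize_results` and the `portal` field are hypothetical names introduced for illustration, the duplicate-URL filter is an added assumption rather than part of the original agent, and the sample payload is made up.

```python
from urllib.parse import urlparse


def normalize_results(payload: dict, source: str) -> list:
    """Flatten one search response into records tagged with their source.

    Hypothetical helper: assumes the response shape used in
    build_property_brief (an "organic_results" list of dicts).
    """
    records = []
    seen_urls = set()
    for item in payload.get("organic_results", []):
        url = item.get("link", "")
        if url and url in seen_urls:
            continue  # skip a URL that already appeared in this response
        seen_urls.add(url)
        records.append({
            "title": item.get("title", ""),
            "url": url,
            "snippet": item.get("snippet", ""),
            "source": source,
            # Host name of the result page; empty string when there is no URL.
            "portal": urlparse(url).netloc.lower(),
        })
    return records


# Made-up sample payload with one repeated URL and one missing snippet.
sample = {
    "organic_results": [
        {"title": "2 bed flat", "link": "https://listings.example.com/flat-1", "snippet": "Bright flat"},
        {"title": "2 bed flat", "link": "https://listings.example.com/flat-1", "snippet": "Bright flat"},
        {"title": "Flats for sale", "link": "https://homes.example.org/search"},
    ]
}
records = normalize_results(sample, "google")
print([r["portal"] for r in records])
# ['listings.example.com', 'homes.example.org']
```

Keeping the host name on each record makes it possible to filter the Google results down to known property portals later without re-parsing URLs.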

Extracting price signals from SERP data

Python
import re

def extract_price_signals(listings: list) -> list:
    """Extract price data from listing snippets."""
    prices = []
    for listing in listings:
        text = listing.get("snippet", "") + " " + listing.get("title", "")

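        # UK price patterns such as "£450,000" or "£450000"; the capture group
        # requires at least one digit, so a stray "£," cannot crash int().
        gbp_matches = re.findall(r'\u00a3(\d{1,3}(?:,\d{3})+|\d+)', text)
        for match in gbp_matches:
            price = int(match.replace(',', ''))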
            if 50000 < price < 5000000:
                prices.append({
                    "price_gbp": price,
                    "source": listing.get("url", ""),
                    "context": text[:100],
                })

        # Also catch shorthand such as "450k" (two or three digits followed by "k")
        k_matches = re.findall(r'\b(\d{2,3})k\b', text.lower())
        for match in k_matches:
            price = int(match) * 1000
            if 50000 < price < 5000000:
                prices.append({
                    "price_gbp": price,
                    "source": listing.get("url", ""),
                    "context": text[:100],
                })

    return prices

prices = extract_price_signals(brief.listings)
if prices:
    avg = sum(p["price_gbp"] for p in prices) / len(prices)
    print(f"Average asking price signal: {avg:,.0f} GBP")
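
A mean over regex-extracted prices is sensitive to a single stray figure, such as a development price that happens to fall inside the accepted range. A minimal sketch of a more outlier-tolerant summary using the standard library's `statistics.median` follows. `summarize_prices` is a hypothetical helper, the input is assumed to be the list of dicts with a `price_gbp` key that `extract_price_signals` returns, and the sample prices are made up.

```python
from statistics import median


def summarize_prices(prices: list) -> dict:
    """Summarize extracted prices with the median plus the observed range."""
    values = sorted(p["price_gbp"] for p in prices)
    if not values:
        # No data points: return explicit Nones rather than a misleading zero.
        return {"count": 0, "median_gbp": None, "min_gbp": None, "max_gbp": None}
    return {
        "count": len(values),
        "median_gbp": median(values),
        "min_gbp": values[0],
        "max_gbp": values[-1],
    }


# Made-up sample: two typical flats and one outlier.
sample_prices = [
    {"price_gbp": 325000},
    {"price_gbp": 340000},
    {"price_gbp": 1950000},
]
summary = summarize_prices(sample_prices)
print(summary["median_gbp"])  # 340000
```

For this sample the mean would be roughly 871,667 GBP, while the median stays at 340,000 GBP, which is closer to the two typical listings.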

Sentiment analysis from Reddit

Python
def analyze_location_sentiment(threads: list) -> dict:
    """Basic sentiment scoring from Reddit thread snippets."""
    positive_signals = [
        "love living", "great area", "recommend", "safe",
        "friendly", "good schools", "nice parks", "walkable",
    ]
    negative_signals = [
        "avoid", "crime", "noisy", "expensive", "parking nightmare",
        "overrated", "gentrification", "traffic", "nothing to do",
    ]

    pos_count = 0
    neg_count = 0
    themes = {"positive": [], "negative": []}

    for thread in threads:
        text = (thread.get("snippet", "") + " " + thread.get("title", "")).lower()
        for signal in positive_signals:
            if signal in text:
                pos_count += 1
                themes["positive"].append(signal)
        for signal in negative_signals:
            if signal in text:
                neg_count += 1
                themes["negative"].append(signal)

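    total = pos_count + neg_count
    # max(total, 1) avoids division by zero. With no matched signals the score
    # is 0.0, so check the mention counts before reading it as "all negative".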
    return {
        "sentiment_score": round(pos_count / max(total, 1), 2),
        "positive_mentions": pos_count,
        "negative_mentions": neg_count,
        "themes": themes,
    }

sentiment = analyze_location_sentiment(brief.market_sentiment)
print(f"Sentiment score: {sentiment['sentiment_score']} "
      f"(1.0 = all positive, 0.0 = all negative)")
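
Plain substring checks have a blind spot: `"safe" in text` is also true for "unsafe", so a warning can be counted as praise. A minimal sketch of whole-phrase matching with `re` word boundaries follows. `count_signal_hits` is a hypothetical helper, not part of the function above, and it makes no attempt at negation handling ("not safe" still counts as a hit).

```python
import re


def count_signal_hits(text: str, signals: list) -> list:
    """Return the signals that occur in text as whole words or phrases."""
    lowered = text.lower()
    hits = []
    for signal in signals:
        # The \b anchors stop "safe" from matching inside "unsafe".
        pattern = r"\b" + re.escape(signal.lower()) + r"\b"
        if re.search(pattern, lowered):
            hits.append(signal)
    return hits


signals = ["safe", "friendly", "good schools"]
print(count_signal_hits("Felt unsafe at night, but neighbours are friendly", signals))
# ['friendly']
print(count_signal_hits("Safe streets and good schools nearby", signals))
# ['safe', 'good schools']
```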

Generating the location report

Python
def generate_report(brief: PropertyBrief) -> str:
    """Generate a structured location report."""
    prices = extract_price_signals(brief.listings)
    sentiment = analyze_location_sentiment(brief.market_sentiment)

    report = f"Location Report: {brief.location}\n"
    report += "=" * 50 + "\n\n"

    if prices:
        avg = sum(p["price_gbp"] for p in prices) / len(prices)
        report += f"Price signals: {len(prices)} data points, "
        report += f"average {avg:,.0f} GBP\n\n"

    report += f"Resident sentiment: {sentiment['sentiment_score']:.0%} positive\n"
    # sorted(set(...)) removes repeated themes and keeps the output order stable.
    if sentiment["themes"]["positive"]:
        report += f"  Positives: {', '.join(sorted(set(sentiment['themes']['positive'])))}\n"
    if sentiment["themes"]["negative"]:
        report += f"  Concerns: {', '.join(sorted(set(sentiment['themes']['negative'])))}\n"

    report += f"\nListings found: {len(brief.listings)}\n"
    report += f"Reddit threads: {len(brief.market_sentiment)}\n"
    report += f"YouTube reviews: {len(brief.area_reviews)}\n"

    return report

print(generate_report(brief))

Scaling to multiple locations

  • Compare 5 locations: 15 credits = $0.075 on Scavio
  • Weekly monitoring of the same 5 target areas: 15 credits/week = $0.075/week, roughly $0.30/month
  • Add Amazon search for "moving to X" books and guides for additional context
  • Store historical data to track price trends and sentiment shifts over time
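
The credit arithmetic in the list above can be checked with a few lines. This is a minimal sketch, assuming one credit per query, three queries per location as in `build_property_brief`, and the $0.005 per-query rate quoted at the top of the article. `estimate_cost` is a hypothetical helper, and actual billing may differ.

```python
from decimal import Decimal

PRICE_PER_CREDIT = Decimal("0.005")  # assumed rate in USD per query
QUERIES_PER_LOCATION = 3  # one each for Google, Reddit, and YouTube


def estimate_cost(locations: int, runs: int = 1) -> Decimal:
    """Estimated USD cost of briefing `locations` areas `runs` times."""
    credits = locations * QUERIES_PER_LOCATION * runs
    return credits * PRICE_PER_CREDIT


print(estimate_cost(5))          # 0.075  (one comparison of 5 locations)
print(estimate_cost(5, runs=4))  # 0.300  (weekly runs over 4 weeks)
```

`Decimal` is used instead of `float` so that small currency amounts are not distorted by binary rounding.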

This approach does not replace Rightmove or Zoopla for actual property listings. It adds the context those platforms lack: what residents actually think, what video reviewers show about the neighborhood, and a structured way to compare multiple locations before committing to in-person viewings.