GSC + GA4 MCP Analytics Pipeline
Connect Google Search Console and GA4 to AI agents via MCP. Add SERP context for keywords GSC shows. Architecture and implementation guide.
Build an analytics pipeline where n8n pulls Google Search Console and GA4 data via their APIs, saves pre-aggregated reports to Google Drive, and Claude reads them via MCP to write analysis into Obsidian. The key insight: pre-aggregate data before sending to Claude. Raw GSC exports with 50,000 rows overwhelm context windows. Summarize first, analyze second.
Architecture
# Pipeline stages:
#
# [n8n Cron: daily 6am]
# |
# v
# [GSC API] --> aggregate by page, query, date
# |
# [GA4 API] --> aggregate by page, source, events
# |
# v
# [n8n Function: pre-aggregate]
# |
# v
# [Google Drive: /analytics/daily-reports/]
# |
# v
# [Claude via MCP: reads Drive files]
# |
# v
# [Obsidian vault: /analysis/weekly-seo-report.md]Step 1: pull GSC data with n8n
# n8n HTTP Request node: GSC Search Analytics
# Method: POST
# URL: https://www.googleapis.com/webmasters/v3/sites/
# https%3A%2F%2Fyoursite.com/searchAnalytics/query
# Auth: OAuth2 (Google credentials)
#
# Body:
# {
# "startDate": "{{ $now.minus(7, 'days').format('yyyy-MM-dd') }}",
# "endDate": "{{ $now.minus(1, 'day').format('yyyy-MM-dd') }}",
# "dimensions": ["query", "page", "date"],
# "rowLimit": 5000,
# "dimensionFilterGroups": [{
# "filters": [{
# "dimension": "country",
# "expression": "usa"
# }]
# }]
# }
# This returns up to 5,000 rows with clicks, impressions, CTR, position
# for each query+page+date combinationStep 2: pre-aggregate before Claude sees it
# n8n Function node: aggregate GSC data
# Place after GSC HTTP Request node
const rows = $input.first().json.rows || [];
// Aggregate by page (sum clicks/impressions, avg position)
const byPage = {};
for (const row of rows) {
const page = row.keys[1]; // page dimension
if (!byPage[page]) {
byPage[page] = { clicks: 0, impressions: 0, positions: [], queries: new Set() };
}
byPage[page].clicks += row.clicks;
byPage[page].impressions += row.impressions;
byPage[page].positions.push(row.position);
byPage[page].queries.add(row.keys[0]);
}
// Create summary (fits in Claude's context)
const summary = Object.entries(byPage)
.map(([page, d]) => ({
page,
clicks: d.clicks,
impressions: d.impressions,
avg_position: (d.positions.reduce((a, b) => a + b, 0) / d.positions.length).toFixed(1),
unique_queries: d.queries.size,
}))
.sort((a, b) => b.clicks - a.clicks)
.slice(0, 50); // Top 50 pages only
return [{ json: { summary, total_pages: Object.keys(byPage).length, period: "last_7_days" } }];Step 3: pull GA4 data
# n8n HTTP Request node: GA4 Data API
# Method: POST
# URL: https://analyticsdata.googleapis.com/v1beta/
# properties/YOUR_PROPERTY_ID:runReport
# Auth: OAuth2 (Google credentials)
#
# Body:
# {
# "dateRanges": [{"startDate": "7daysAgo", "endDate": "yesterday"}],
# "dimensions": [
# {"name": "pagePath"},
# {"name": "sessionSource"}
# ],
# "metrics": [
# {"name": "sessions"},
# {"name": "engagedSessions"},
# {"name": "conversions"}
# ],
# "limit": 1000,
# "orderBys": [{"metric": {"metricName": "sessions"}, "desc": true}]
# }
# Aggregate GA4 in a Function node similar to GSC:
# Group by pagePath, sum sessions/conversions, list top sourcesStep 4: save to Google Drive
# n8n Google Drive node: Create File
# Operation: Upload
# File Name: analytics_{{ $now.format('yyyy-MM-dd') }}.json
# Folder: /analytics/daily-reports/
# Content: {{ JSON.stringify($json) }}
# The file contains pre-aggregated data:
# {
# "gsc": { "summary": [...top 50 pages...], "total_pages": 230, "period": "last_7_days" },
# "ga4": { "summary": [...top 50 pages...], "total_sessions": 4500, "period": "last_7_days" }
# }
#
# This is ~5-10KB instead of the raw 500KB+ exportStep 5: Claude reads via MCP and writes analysis
# MCP configuration for Claude (in .mcp.json or settings):
# {
# "mcpServers": {
# "google-drive": {
# "command": "npx",
# "args": ["-y", "@anthropic/google-drive-mcp"],
# "env": { "GOOGLE_CLIENT_ID": "...", "GOOGLE_CLIENT_SECRET": "..." }
# }
# }
# }
# Claude prompt for weekly analysis:
# "Read the latest analytics file from Google Drive at
# /analytics/daily-reports/. Compare this week's top pages
# to last week. Identify:
# 1. Pages with biggest position improvements
# 2. Pages losing clicks despite stable position (CTR drop)
# 3. New queries driving traffic this week
# 4. Pages with high impressions but low CTR (optimization targets)
# Write the analysis to my Obsidian vault."Why pre-aggregation matters
- Raw GSC export: 50,000 rows = 5MB+ = exceeds Claude's practical context window
- Pre-aggregated: 50 top pages with metrics = 10KB = fits easily, faster response
- Claude analyzes patterns in summaries better than scanning raw rows
- Cheaper: fewer tokens = lower API cost if using Claude API
- Reproducible: saved aggregates serve as historical records
Enriching with SERP data
import requests, os
def enrich_gsc_with_serp(gsc_summary: list) -> list:
"""Add SERP feature data to top GSC pages."""
for page in gsc_summary[:10]: # Top 10 pages only
# Find the top query for this page
top_query = page.get("top_query", page["page"].split("/")[-1].replace("-", " "))
resp = requests.post(
"https://api.scavio.dev/api/v1/search",
headers={"x-api-key": os.environ["SCAVIO_API_KEY"]},
json={"query": top_query, "platform": "google"},
timeout=10,
)
serp = resp.json()
page["serp_features"] = {
"ai_overview": bool(serp.get("ai_overview")),
"featured_snippet": bool(serp.get("featured_snippet")),
"paa_count": len(serp.get("people_also_ask", [])),
}
return gsc_summary
# 10 credits = $0.05 to enrich top pages with SERP context
# Helps Claude identify WHY CTR changed (AI Overview appeared, etc.)This pipeline costs under $5/month: n8n self-hosted is free, GSC/GA4 APIs are free, Google Drive is free, and optional SERP enrichment via Scavio is $1-2/month. The value is not the data collection -- it is the pre-aggregation that makes Claude's analysis actually useful instead of drowning in raw rows.