Google Scholar contains valuable data — paper title, authors, citation count, abstract snippet, and more. Scraping this data directly means dealing with anti-bot detection, CAPTCHAs, IP rotation, and constantly breaking selectors. The Scavio API handles all of that and returns clean, structured JSON from a single POST request.
This tutorial shows you how to scrape Google Scholar using Rust and the Scavio API. By the end, you will have a working Rust script that fetches real-time Google Scholar data and parses the results.
Prerequisites
- Rust installed on your machine
- A Scavio API key (free tier includes 500 credits/month — no credit card required)
Step 1: Install Dependencies
Install reqwest for HTTP requests, serde and serde_json for JSON handling, and tokio for the async runtime:

cargo add reqwest serde serde_json tokio --features reqwest/json,tokio/full

Step 2: Make Your First Google Scholar Search
Send a POST request to the Scavio Google Scholar API endpoint with your query. The API returns structured JSON with paper title, authors, citation count, and more.
use reqwest::Client;
use serde_json::{json, Value};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key = "your_scavio_api_key";
    let query = "retrieval augmented generation 2024";
    let client = Client::new();

    let response = client
        .post("https://api.scavio.dev/api/v1/search")
        .header("x-api-key", api_key)
        .json(&json!({ "query": query, "tbs": "" }))
        .send()
        .await?;

    let data: Value = response.json().await?;
    println!("{}", serde_json::to_string_pretty(&data)?);
    Ok(())
}

Step 3: Example Response
The API returns structured JSON. Here is an example response for a Google Scholar search:
{
  "search_metadata": { "status": "success" },
  "organic_results": [
    {
      "position": 1,
      "title": "Retrieval-Augmented Generation for Large Language Models: A Survey",
      "link": "https://scholar.google.com/scholar?hl=en&q=retrieval+augmented+generation",
      "authors": ["Y. Gao", "Y. Xiong", "X. Gao"],
      "publication_year": 2024,
      "cited_by": 1240,
      "snippet": "We survey RAG approaches that combine parametric and non-parametric memory..."
    }
  ]
}

Every field is structured and typed — no HTML parsing, no CSS selectors, no regex extraction. Your Rust code can access any field directly.
Step 4: Full Working Example
Here is a complete, runnable Rust script that searches Google Scholar and prints the results:
use reqwest::Client;
use serde_json::{json, Value};
use std::env;

/// Scrape Google Scholar search results using the Scavio API.
/// Returns structured JSON with paper title, authors, citation count, and more.
async fn search_google_scholar(query: &str) -> Result<Value, Box<dyn std::error::Error>> {
    let api_key = env::var("SCAVIO_API_KEY")?;
    let client = Client::new();

    let response = client
        .post("https://api.scavio.dev/api/v1/search")
        .header("x-api-key", &api_key)
        .json(&json!({ "query": query, "tbs": "" }))
        .send()
        .await?;

    if !response.status().is_success() {
        return Err(format!("Scavio API error: {}", response.status()).into());
    }

    Ok(response.json().await?)
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let results = search_google_scholar("retrieval augmented generation 2024").await?;
    println!("{}", serde_json::to_string_pretty(&results)?);
    Ok(())
}

Why Use Scavio Instead of Scraping Google Scholar Directly?
- No proxy management. Direct scraping requires rotating proxies to avoid IP bans. Scavio handles all of this server-side.
- No CAPTCHA solving. Google Scholar aggressively blocks automated requests. Scavio returns clean data every time.
- Structured JSON output. No HTML parsing or CSS selector maintenance. Get typed, consistent data from every request.
- Multi-platform in one API. Search Google, Amazon, YouTube, and Walmart from the same API key with the same authentication pattern.
- Free tier included. 500 credits/month with no credit card required. Each search costs 1 credit.