Google Scholar contains valuable data — paper title, authors, citation count, abstract snippet, and more. Scraping this data directly means dealing with anti-bot detection, CAPTCHAs, IP rotation, and constantly breaking selectors. The Scavio API handles all of that and returns clean, structured JSON from a single POST request.
This tutorial shows you how to scrape Google Scholar using JavaScript and the Scavio API. By the end, you will have a working JavaScript script that fetches real-time Google Scholar data and parses the results.
Prerequisites
- Node.js 18 or later installed on your machine
- A Scavio API key (free tier includes 500 credits/month — no credit card required)
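To confirm the first prerequisite, a quick check like the one below can help. It is a minimal sketch: `hasBuiltInFetch` is a hypothetical helper name, not part of Scavio or Node.js, and it only tests whether a global `fetch` function exists, which is the case in Node.js 18 and later.

```javascript
// Hypothetical helper (not part of any library): reports whether the given
// runtime object exposes a fetch function. Defaults to the current global scope.
function hasBuiltInFetch(runtime = globalThis) {
  return typeof runtime.fetch === "function";
}

if (hasBuiltInFetch()) {
  console.log("fetch is available, no extra packages needed");
} else {
  console.log("No global fetch found; upgrade to Node.js 18 or later");
}
```

Running `node --version` in a terminal gives the same information without any code.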
Step 1: Install Dependencies
fetch is built into Node.js 18 and later, so there is nothing to install.
# No installation needed: fetch is built into Node.js 18+
Step 2: Make Your First Google Scholar Search
Send a POST request to the Scavio Google Scholar API endpoint with your query. The API returns structured JSON with paper title, authors, citation count, and more.
// Save as an ES module (for example search.mjs): top-level await requires one.
const API_KEY = "your_scavio_api_key";
const query = "retrieval augmented generation";
const response = await fetch("https://api.scavio.dev/api/v1/search", {
  method: "POST",
  headers: {
    "x-api-key": API_KEY,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ query, tbs: "" }),
});
const data = await response.json();
// Print the first five results, or nothing if the list is missing.
for (const result of data.organic_results?.slice(0, 5) ?? []) {
  console.log(`${result.position}. ${result.title}`);
  console.log(`   ${result.link}\n`);
}
Step 3: Example Response
The API returns structured JSON. Here is an example response for a Google Scholar search:
{
  "search_metadata": { "status": "success" },
  "organic_results": [
    {
      "position": 1,
      "title": "Retrieval-Augmented Generation for Large Language Models: A Survey",
      "link": "https://scholar.google.com/scholar?hl=en&q=retrieval+augmented+generation",
      "authors": ["Y. Gao", "Y. Xiong", "X. Gao"],
      "publication_year": 2024,
      "cited_by": 1240,
      "snippet": "We survey RAG approaches that combine parametric and non-parametric memory..."
    }
  ]
}
Every field is structured and typed: no HTML parsing, no CSS selectors, no regex extraction. Your JavaScript code can access any field directly.
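As an illustration of direct field access, the sketch below turns one result into a one-line summary. It assumes the field names from the example response above (`authors`, `publication_year`, `cited_by`); `formatResult` is a hypothetical helper, and the fallbacks are a guess at how to handle fields that might be absent from some results.

```javascript
// Sample record copied from the example response above.
const sample = {
  position: 1,
  title: "Retrieval-Augmented Generation for Large Language Models: A Survey",
  authors: ["Y. Gao", "Y. Xiong", "X. Gao"],
  publication_year: 2024,
  cited_by: 1240,
};

// Hypothetical helper: builds a one-line summary of a single result.
// Optional chaining and defaults guard against missing fields.
function formatResult(result) {
  const authors = result.authors?.join(", ") ?? "Unknown authors";
  const year = result.publication_year ?? "n.d.";
  const citations = result.cited_by ?? 0;
  return `${result.title} (${authors}, ${year}), cited by ${citations}`;
}

console.log(formatResult(sample));
```

The same helper can be mapped over `data.organic_results` to print a summary for every result.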
Step 4: Full Working Example
Here is a complete, runnable JavaScript script that searches Google Scholar and prints the results:
/**
 * Scrape Google Scholar search results using Scavio API.
 * Returns structured JSON with paper title, authors, citation count, and more.
 * Save as an ES module (for example scholar.mjs): top-level await requires one.
 */
const API_KEY = "your_scavio_api_key";
async function searchGoogleScholar(query) {
  const response = await fetch("https://api.scavio.dev/api/v1/search", {
    method: "POST",
    headers: {
      "x-api-key": API_KEY,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query, tbs: "" }),
  });
  // Surface HTTP failures instead of trying to parse an error body.
  if (!response.ok) {
    throw new Error(`Scavio API error: ${response.status}`);
  }
  return response.json();
}
const results = await searchGoogleScholar("retrieval augmented generation 2024");
console.log(JSON.stringify(results, null, 2));
Why Use Scavio Instead of Scraping Google Scholar Directly?
- No proxy management. Direct scraping requires rotating proxies to avoid IP bans. Scavio handles all of this server-side.
- No CAPTCHA solving. Google Scholar aggressively blocks automated requests. Scavio returns clean data every time.
- Structured JSON output. No HTML parsing or CSS selector maintenance. Get typed, consistent data from every request.
- Multi-platform in one API. Search Google, Amazon, YouTube, and Walmart from the same API key with the same authentication pattern.
- Free tier included. 500 credits/month with no credit card required. Each search costs 1 credit.
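To make the structured-output point concrete, post-processing a response needs only ordinary array methods. The sketch below ranks results by citation count. It assumes the `organic_results` and `cited_by` fields from the example response in Step 3; `topCited` is a hypothetical helper, and the input is made-up data rather than real API output.

```javascript
// Hypothetical helper: returns the n most-cited results, highest first.
// Results without a cited_by field are treated as having zero citations.
function topCited(response, n = 3) {
  const results = response.organic_results ?? [];
  return [...results] // copy so the original response is not reordered
    .sort((a, b) => (b.cited_by ?? 0) - (a.cited_by ?? 0))
    .slice(0, n);
}

// Made-up data for illustration only, not real API output.
const mockResponse = {
  organic_results: [
    { title: "Paper A", cited_by: 12 },
    { title: "Paper B", cited_by: 340 },
    { title: "Paper C" },
  ],
};

console.log(topCited(mockResponse, 2).map((r) => r.title));
```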