Tutorial

How to Extract Reddit Comments from a Post

Pull a Reddit post and its full threaded comment tree in Python. Use depth and parentId to rebuild the hierarchy or aggregate by level.

Comment threads are where Reddit's real signal lives. The headline is marketing copy, but the replies decide whether a claim holds up under scrutiny. This tutorial shows how to fetch a Reddit post and its full threaded comment tree with a single API call, then work with the flat, depth-annotated comment array to render, filter, or aggregate.

Prerequisites

  • Python 3.8 or higher
  • requests library
  • A Scavio API key
  • A Reddit post URL you want to analyze

Walkthrough

Step 1: Pick a Reddit post URL

Any canonical Reddit post URL works. The /comments/<id>/ path is enough.

Python
POST_URL = "https://www.reddit.com/r/Python/comments/1smb9du/fastapi_vs_django/"

Step 2: Fetch the post and comments

POST to /api/v1/reddit/post with the URL. Response includes the post plus the flat comment array.

Python
import os, requests

r = requests.post(
    "https://api.scavio.dev/api/v1/reddit/post",
    headers={"Authorization": f"Bearer {os.environ['SCAVIO_API_KEY']}"},
    json={"url": POST_URL},
    timeout=30,
)
data = r.json()["data"]

Step 3: Render comments with depth indentation

Each comment has a depth field starting at 0 for top-level replies. Multiply depth by two spaces for a readable tree.

Python
for c in data["comments"]:
    indent = "  " * c["depth"]
    print(f"{indent}[{c['score']:>4}] u/{c['author']}: {c['body'][:100]}")

Step 4: Aggregate by depth for quick analysis

Use depth to summarize top-level sentiment separately from deep replies.

Python
from collections import defaultdict

by_depth = defaultdict(list)
for c in data["comments"]:
    by_depth[c["depth"]].append(c["score"])

for depth, scores in sorted(by_depth.items()):
    print(f"depth {depth}: {len(scores)} comments, avg score {sum(scores)/len(scores):.1f}")

Python Example

Python
import os, requests

API_KEY = os.environ["SCAVIO_API_KEY"]

def fetch_post(url: str):
    r = requests.post(
        "https://api.scavio.dev/api/v1/reddit/post",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"url": url},
        timeout=30,
    )
    r.raise_for_status()
    return r.json()["data"]

data = fetch_post("https://www.reddit.com/r/Python/comments/1smb9du/")
print(data["post"]["title"])
for c in data["comments"][:20]:
    print("  " * c["depth"] + f"u/{c['author']}: {c['body'][:80]}")

JavaScript Example

JavaScript
const API_KEY = process.env.SCAVIO_API_KEY;

async function fetchPost(url) {
  const r = await fetch("https://api.scavio.dev/api/v1/reddit/post", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ url }),
  });
  const { data } = await r.json();
  return data;
}

const data = await fetchPost("https://www.reddit.com/r/Python/comments/1smb9du/");
console.log(data.post.title);
for (const c of data.comments.slice(0, 20)) {
  console.log("  ".repeat(c.depth) + `u/${c.author}: ${c.body.slice(0, 80)}`);
}

Expected Output

JSON
FastAPI vs Django in 2026 -- what the teams are actually using
u/senior_py: We moved to FastAPI for the API surface and kept Django for admin
  u/django_dev: Django ORM is still unmatched for anything with relational depth.
    u/another_dev: Agreed -- the admin is a force multiplier for internal tools.

Related Tutorials

Frequently Asked Questions

Most developers complete this tutorial in 15 to 30 minutes. You will need a Scavio API key (free tier works) and a working Python or JavaScript environment.

Python 3.8 or higher. requests library. A Scavio API key. A Reddit post URL you want to analyze. A Scavio API key gives you 500 free credits per month.

Yes. The free tier includes 500 credits per month, which is more than enough to complete this tutorial and prototype a working solution.

Scavio has a native LangChain package (langchain-scavio), an MCP server, and a plain REST API that works with any HTTP client. This tutorial uses the raw REST API, but you can adapt to your framework of choice.

Start Building

Pull a Reddit post and its full threaded comment tree in Python. Use depth and parentId to rebuild the hierarchy or aggregate by level.