Reddit Threaded Comments

Extract full threaded comment trees with depth, parentId, score, and author fields. Render inline or rebuild the hierarchy yourself.

What is Reddit Threaded Comments?

Scavio returns Reddit comments as a flat, depth-annotated array rather than a nested JSON blob, which is easier to page through, filter, and feed into downstream pipelines. Every comment includes an id, author handle, body text, score, timestamp, and parentId pointing either to the post id (for top-level replies) or another comment id (for nested replies). A depth field, 0-indexed, makes it trivial to render the tree with indentation or to group by conversation level for sentiment aggregation. Deleted and removed comments are surfaced with explicit markers so your downstream analysis does not silently drop them.

Example Response

JSON
{
  "data": {
    "comments": [
      {
        "id": "t1_lxs9a0k",
        "author": "senior_py",
        "body": "We moved to FastAPI for the API surface and kept Django for admin",
        "score": 312,
        "depth": 0,
        "parentId": "t3_1smb9du",
        "timestamp": "2026-04-15T17:02:11.000000+0000"
      },
      {
        "id": "t1_lxsa1b2",
        "author": "django_dev",
        "body": "Django ORM is still unmatched for anything with relational depth.",
        "score": 178,
        "depth": 1,
        "parentId": "t1_lxs9a0k",
        "timestamp": "2026-04-15T17:15:42.000000+0000"
      }
    ]
  }
}
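
Because each comment carries its own depth, an indented view needs no tree at all. The sketch below is illustrative rather than part of any Scavio SDK: `render_thread` is a hypothetical helper, and it assumes the array arrives in thread order (each reply after its parent), which the example response suggests but this page does not state.

```python
# Two comments shaped like the example response above (timestamps omitted).
comments = [
    {"id": "t1_lxs9a0k", "author": "senior_py", "score": 312, "depth": 0,
     "parentId": "t3_1smb9du",
     "body": "We moved to FastAPI for the API surface and kept Django for admin"},
    {"id": "t1_lxsa1b2", "author": "django_dev", "score": 178, "depth": 1,
     "parentId": "t1_lxs9a0k",
     "body": "Django ORM is still unmatched for anything with relational depth."},
]


def render_thread(comments, indent="    "):
    """Return one text line per comment, indented by its depth field.

    Hypothetical helper; assumes the input is already in thread order.
    """
    lines = []
    for c in comments:
        prefix = indent * c["depth"]  # depth 0 gets no indentation
        lines.append(f"{prefix}u/{c['author']} ({c['score']}): {c['body']}")
    return lines


print("\n".join(render_thread(comments)))
```

If the order turns out not to be guaranteed, rebuild the hierarchy from parentId first and walk that instead.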

Use Cases

  • Rendering comment threads in a custom UI with any indentation style
  • Aggregating sentiment per discussion depth level
  • Training reply-ranking models with score-weighted data
  • Extracting quotable community feedback for product research
  • Generating summaries of long threads with an LLM
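
As one illustration of the per-depth aggregation use case, the sketch below groups comments by depth and averages a metric for each level. `aggregate_by_depth` and its `metric` parameter are hypothetical names. Since this page does not specify a sentiment model, the default metric is simply the comment score; swap in your own sentiment scorer as needed.

```python
from collections import defaultdict
from statistics import mean

# Two comments shaped like the example response (only the fields used here).
comments = [
    {"id": "t1_lxs9a0k", "score": 312, "depth": 0, "parentId": "t3_1smb9du"},
    {"id": "t1_lxsa1b2", "score": 178, "depth": 1, "parentId": "t1_lxs9a0k"},
]


def aggregate_by_depth(comments, metric=lambda c: c["score"]):
    """Average a per-comment metric for each depth level.

    Hypothetical helper: `metric` defaults to the score and stands in
    for whatever sentiment function you actually use.
    """
    buckets = defaultdict(list)
    for c in comments:
        buckets[c["depth"]].append(metric(c))
    # Sort by depth so the result reads top-level first.
    return {depth: mean(values) for depth, values in sorted(buckets.items())}


per_depth = aggregate_by_depth(comments)
print(per_depth)
```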

Why Reddit Threaded Comments Matters

The depth-plus-parentId shape is the cleanest contract for a comment tree. You can render it, re-tree it, or slice it by score without writing a recursive parser. Teams building community-intelligence tooling save days of glue code by skipping the Reddit API's paginated kind-and-data wrapper.
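
Re-treeing the flat array can be done in a single non-recursive pass over parentId. The sketch below is a minimal version under stated assumptions: `build_tree` and the `replies` key are hypothetical names, not part of the Scavio response, and any comment whose parent is not in the array (for example, because parentId is the post id) is treated as a root.

```python
# Two comments shaped like the example response (only the fields used here).
comments = [
    {"id": "t1_lxs9a0k", "author": "senior_py", "score": 312, "depth": 0,
     "parentId": "t3_1smb9du"},
    {"id": "t1_lxsa1b2", "author": "django_dev", "score": 178, "depth": 1,
     "parentId": "t1_lxs9a0k"},
]


def build_tree(comments):
    """Nest a flat comment list into root nodes, each with a 'replies' list.

    Hypothetical helper; 'replies' is a key added here for illustration.
    """
    # Copy each comment so the input list is left untouched.
    nodes = {c["id"]: {**c, "replies": []} for c in comments}
    roots = []
    for c in comments:
        node = nodes[c["id"]]
        parent = nodes.get(c["parentId"])
        if parent is None:
            # parentId is the post id, or the parent is not in this array.
            roots.append(node)
        else:
            parent["replies"].append(node)
    return roots


tree = build_tree(comments)
print(tree[0]["author"], "->", [r["author"] for r in tree[0]["replies"]])
```

Indexing by id first means the result does not depend on the order of the array, at the cost of one extra dictionary.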

LangChain Example

Drop Reddit threaded comments data into your LangChain agent in a few lines:

Python
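from langchain_scavio import ScavioRedditPost

# Fetch the post and its comments by URL.
tool = ScavioRedditPost()
result = tool.invoke({"url": "https://www.reddit.com/r/Python/comments/1smb9du/"})

# Keep only top-level replies (depth 0), then take the five highest-scoring ones.
top_level = [c for c in result["data"]["comments"] if c["depth"] == 0]
top_by_score = sorted(top_level, key=lambda c: c["score"], reverse=True)[:5]

# Print score, author, and the first 100 characters of each comment body.
for c in top_by_score:
    print(f"{c['score']:>5}  u/{c['author']}: {c['body'][:100]}")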

Frequently Asked Questions

Send a search request with the platform set to reddit, and Scavio returns Reddit threaded comments data in the response. The comments sit under data.comments, as shown in the example response above.

Yes. Scavio fetches Reddit threaded comments data in real time on each request. There is no caching layer and no stale data.

Reddit Threaded Comments data is returned as part of the standard search response. Each request costs 1 credit. The free tier includes 500 credits per month.
