What is Reddit Threaded Comments?
Scavio returns Reddit comments as a flat, depth-annotated array rather than a nested JSON blob, which is easier to page through, filter, and feed into downstream pipelines. Every comment includes an id, author handle, body text, score, timestamp, and a parentId that points either to the post id (for top-level replies) or to another comment id (for nested replies). A 0-indexed depth field makes it straightforward to render the tree with indentation or to group comments by conversation level for sentiment aggregation. Deleted and removed comments are surfaced with explicit markers, so downstream analysis does not silently drop them.
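As a rough illustration of why the depth field helps, the sketch below renders a flat comment list as an indented thread. The sample records are made up and only mirror the field names described above; it also assumes the array arrives in thread order (each reply after its parent), which is not confirmed here.

```python
# Illustrative records using the field names described above (values are invented).
comments = [
    {"id": "t1_a", "author": "alice", "body": "Top-level reply", "score": 10, "depth": 0, "parentId": "t3_post"},
    {"id": "t1_b", "author": "bob", "body": "Nested reply", "score": 4, "depth": 1, "parentId": "t1_a"},
    {"id": "t1_c", "author": "carol", "body": "Another top-level reply", "score": 7, "depth": 0, "parentId": "t3_post"},
]


def render_thread(comments, indent="  "):
    """Return one text line per comment, indented by its 0-indexed depth."""
    lines = []
    for c in comments:
        # Assumption: comments are already in thread order, so no sorting is done.
        lines.append(f"{indent * c['depth']}u/{c['author']} ({c['score']}): {c['body']}")
    return lines


rendered = render_thread(comments)
print("\n".join(rendered))
```

No tree has to be built for this: the depth value alone decides the indentation of each line.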
Example Response
{
  "data": {
    "comments": [
      {
        "id": "t1_lxs9a0k",
        "author": "senior_py",
        "body": "We moved to FastAPI for the API surface and kept Django for admin",
        "score": 312,
        "depth": 0,
        "parentId": "t3_1smb9du",
        "timestamp": "2026-04-15T17:02:11.000000+0000"
      },
      {
        "id": "t1_lxsa1b2",
        "author": "django_dev",
        "body": "Django ORM is still unmatched for anything with relational depth.",
        "score": 178,
        "depth": 1,
        "parentId": "t1_lxs9a0k",
        "timestamp": "2026-04-15T17:15:42.000000+0000"
      }
    ]
  }
}
Use Cases
- Rendering comment threads in a custom UI with any indentation style
- Aggregating sentiment per discussion depth level
- Training reply-ranking models with score-weighted data
- Extracting quotable community feedback for product research
- Generating summaries of long threads with an LLM
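The per-depth aggregation use case can be sketched as a simple group-by on the depth field. This example averages the score field as a stand-in; a sentiment value from whatever model you use could be substituted. The function name and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_depth(comments):
    """Group comments by depth and average their score per level."""
    by_depth = defaultdict(list)
    for c in comments:
        by_depth[c["depth"]].append(c["score"])
    # Sort so the result is ordered from top-level (0) downwards.
    return {depth: mean(scores) for depth, scores in sorted(by_depth.items())}


# Invented sample rows carrying only the two fields this function reads.
sample = [
    {"depth": 0, "score": 312},
    {"depth": 1, "score": 178},
    {"depth": 1, "score": 22},
]
summary = mean_score_by_depth(sample)
print(summary)
```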
Why Reddit Threaded Comments Matters
The depth-plus-parentId shape is the cleanest contract for a comment tree. You can render it, re-tree it, or slice it by score without writing a recursive parser. Teams building community-intelligence tooling save days of glue code by skipping the Reddit API's paginated kind-and-data wrapper.
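To make the "re-tree it" point concrete, here is a minimal sketch that nests a flat list through parentId lookups, with no recursion. The build_tree helper and the replies key are names chosen for this example, not part of the Scavio response; comments whose parent is not in the list (for instance, those pointing at the post id) are treated as roots.

```python
def build_tree(comments):
    """Nest a flat comment list by parentId; returns the top-level nodes."""
    # Copy each comment and give it an empty "replies" list (hypothetical key).
    nodes = {c["id"]: {**c, "replies": []} for c in comments}
    roots = []
    for c in comments:
        node = nodes[c["id"]]
        parent = nodes.get(c["parentId"])
        if parent is not None:
            parent["replies"].append(node)
        else:
            # Parent is the post itself, or is absent from this payload.
            roots.append(node)
    return roots


# Two comments shaped like the example response, trimmed to the fields used here.
flat = [
    {"id": "t1_lxs9a0k", "parentId": "t3_1smb9du", "depth": 0, "score": 312},
    {"id": "t1_lxsa1b2", "parentId": "t1_lxs9a0k", "depth": 1, "score": 178},
]
tree = build_tree(flat)
print(tree[0]["replies"][0]["id"])
```

Because every node is indexed by id before linking, the result does not depend on the order of the input array.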
LangChain Example
Drop Reddit threaded comments data into your LangChain agent in a few lines:
from langchain_scavio import ScavioRedditPost

# Fetch the post and its flat, depth-annotated comment list
tool = ScavioRedditPost()
result = tool.invoke({"url": "https://www.reddit.com/r/Python/comments/1smb9du/"})

# Keep top-level replies only, then take the five highest-scoring ones
top_level = [c for c in result["data"]["comments"] if c["depth"] == 0]
top_by_score = sorted(top_level, key=lambda c: c["score"], reverse=True)[:5]

for c in top_by_score:
    print(f"{c['score']:>5} u/{c['author']}: {c['body'][:100]}")