Threaded comments:

  • are a comment system where replies are attached to specific comments, creating conversation threads instead of one flat list.

  • Problem

    • I designed a threaded comment system for a post where users can leave top-level comments and replies.
    • Initially, I modeled it as a fully nested tree using the classic *adjacency-list approach (parent_id), which supports unlimited depth.
  • Issue at scale

    • As the number of comments grew, fetching and rendering a full nested tree became slow.
    • Recursive queries also made pagination and “Top” ranking harder, so I optimized for a two-level comment + replies UI.
  • Design decision

    • Since our product only needed two levels (top-level comments + replies), I simplified the model to match the product constraint and optimized reads.
  • Final schema

    • I kept parent_id, and added a root_id (the top-level comment id).
    • For top-level comments: parent_id = NULL, root_id = id
    • For replies: parent_id = <top-level id>, root_id = <top-level id>
    • To display the first level (top-level comments):
      • use parent_id IS NULL. These are the comments that are directly under the post.
    • To display the second level (replies):
      • use parent_id = <top_level_comment_id>. Each reply “hangs under” the top-level comment it responds to.
  • How reads became faster

    • With root_id, I can fetch:
      1. top-level comments by post_id AND parent_id IS NULL with pagination/sorting
      2. replies by post_id AND root_id IN (...) (or root_id = X) without recursion
    • This avoids recursive traversal and makes it straightforward to paginate top-level comments while loading replies efficiently.
  • Trade-off

    • The trade-off is that we intentionally don’t support deeper nesting.
    • But that matches the product requirement and gives us a simpler, faster, and more predictable query pattern.
  • Redis ZSET

    • ZSET stores top-level commentIds only (parent_id is NULL)
    • Replies are loaded separately by:
      • SELECT * FROM comments WHERE root_id = {topCommentId} ORDER BY created_at ASC LIMIT ...
    • PostgreSQL is the source of truth; Redis ZSET is a fast ranking index. On cache miss or eviction, we can rebuild the ZSET from the database or backfill asynchronously.