Hierarchical Data – Adjacency Lists and Nested Set Models

Topics covered
Popular Clips
Episode Highlights
Model Basics
The Nested Set Model is a method for managing hierarchical data by assigning left and right values to each node. Alan Underwood explains that each record has a left, right, and level value, which simplifies querying the hierarchy 1. Michael Outlaw highlights that this model is often built off an existing schema like an adjacency list, making it easier to find ancestors or lineage within the data 2.
If you want to know the ancestors of Michael, you just take that left to right values and say, give me everything in between, and you're done.
--- Joe Zack
Despite its complexity, the Nested Set Model offers significant advantages for specific use cases.
Building Tables
Building tables using the Nested Set Model involves assigning left and right values as you traverse the hierarchy. Michael Outlaw uses a family tree analogy to explain this process, where each node is visited twice to set these values 3. This method ensures that every node in the tree is accounted for, making it easier to manage complex hierarchies.
If you're building your own family tree, you start at the top and go down every node, counting as you go. When you get to the bottom, you start walking back up and continue counting.
--- Michael Outlaw
This approach helps visualize the structure and maintain the integrity of the hierarchical data.
Query Efficiency
Querying hierarchical data efficiently is a key advantage of the Nested Set Model. Michael Outlaw notes that the model's performance gains are significant, especially for large datasets 4. However, Alan Underwood points out that maintaining this model can be costly, as any changes require recalculating the entire tree 5.
If I wanted to remove my grandmother from that list, she was the third node I went to. But if I wanted to remove her, then she would change the values for my grandfather, my mother, my father, and myself.
--- Michael Outlaw
Thus, the Nested Set Model is best suited for scenarios where data is not frequently updated.
Related Episodes


Data Structures - (some) Trees
Answers 383 questions95. Data Structures – Arrays and Array-ish
Answers 383 questions

Data Structures - Heaps and Tries
Answers 383 questions94. Data Structures - Primitives
Answers 383 questions

Data Structures - Arrays and Array-ish
Answers 383 questionsDesigning Data-Intensive Applications – Scalability
Answers 383 questions

Designing Data-Intensive Applications - SSTables and LSM-Trees
Answers 383 questionsGraph Algorithms
Answers 383 questionsAll Your Database Are Belong to Us
Answers 383 questionsDesigning Data-Intensive Applications – Data Models: Query Languages
Answers 383 questions

Strategic Design and Domain Events
Answers 383 questions

Designing Data-Intensive Applications – Multi-Leader Replication
Answers 383 questionsDesign Patterns Part 3
Answers 383 questionsUnderstanding Complexity Theory
Answers 383 questions

Designing Data-Intensive Applications - Data Models: Relational vs Document
Answers 383 questions
