SE Radio 560: Sugu Sougoumarane on Distributed SQL Databases

Episode Highlights
Sharding Types
Vertical and horizontal sharding are two core strategies for scaling databases. Sugu Sougoumarane explains that vertical sharding separates unrelated tables into different databases, an approach YouTube first used to manage user and video data [1]. However, this method has limitations, which leads to horizontal sharding, where data is distributed across multiple shards based on groups of users [2]. Horizontal sharding requires reshaping the relational model into a hierarchical one by weakening many-to-many relationships, which also simplifies the application layer.
The first part is actually rewriting the application to not rely on the many to many relationships.
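As a rough illustration of horizontal sharding by user, the sketch below places every video row on the shard chosen by its owner's ID, so a user's data stays together once the model is hierarchical. The shard count, the `owner_id` field, and the function names are assumptions made for this example; this is not how YouTube or Vitess implemented it.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for_user(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard number using a stable hash."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def place_rows(videos: list[dict], num_shards: int = NUM_SHARDS) -> dict[int, list[dict]]:
    """Group video rows by the shard of their owning user.

    Each row is assumed to carry a hypothetical 'owner_id' field, which
    makes the user the root of the hierarchy.
    """
    shards: dict[int, list[dict]] = {i: [] for i in range(num_shards)}
    for video in videos:
        shards[shard_for_user(video["owner_id"], num_shards)].append(video)
    return shards


videos = [
    {"video_id": 1, "owner_id": 10},
    {"video_id": 2, "owner_id": 11},
    {"video_id": 3, "owner_id": 10},
]
placement = place_rows(videos)
```

Because the shard is derived only from the owner, a query for one user's videos touches a single shard, while a query that joins across many users would have to fan out.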
---
YouTube Evolution
The evolution of database management at YouTube highlights the necessity of innovative solutions for scaling. In 2006, YouTube faced frequent outages due to its growing user base, prompting Sougoumarane and his team to develop Vitess for better database clustering [3]. Rather than patching one failure at a time, the team catalogued its problems and candidate solutions systematically so the system could get ahead of them.
We had reached a point where there were outages, many outages every day, and our backs were against the wall.
---
Resharding
Resharding is a dynamic process essential for managing database growth. At YouTube, the number of shards increased from four to 256, demonstrating the exponential nature of resharding [4]. Sougoumarane describes how Vitess uses a sharding function, known as a Vindex, to efficiently manage data distribution across shards [5]. This method allows for live resharding without downtime, ensuring continuous data availability.
The core technology in Vitess, which is one of the best things we ever built in Vitess, is what we call as the materialization.
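The growth from four to 256 shards can be pictured as repeated splitting of key ranges. The sketch below assumes each shard owns a contiguous range of a 64-bit key space and that resharding halves every range; the names and the halving strategy are assumptions for illustration and do not reproduce the Vitess resharding workflow or its live data copy.

```python
KEYSPACE_SIZE = 2 ** 64  # assumed size of the key space


def initial_ranges(num_shards: int) -> list[tuple[int, int]]:
    """Divide the key space into equal half-open ranges [lo, hi)."""
    step = KEYSPACE_SIZE // num_shards
    return [(i * step, (i + 1) * step) for i in range(num_shards)]


def split_all(ranges: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Reshard by cutting every range in half, doubling the shard count."""
    result = []
    for lo, hi in ranges:
        mid = (lo + hi) // 2
        result.append((lo, mid))
        result.append((mid, hi))
    return result


def find_shard(ranges: list[tuple[int, int]], key: int) -> int:
    """Return the index of the range that contains the key."""
    for index, (lo, hi) in enumerate(ranges):
        if lo <= key < hi:
            return index
    raise ValueError("key outside key space")


def doublings(start: int, target: int) -> int:
    """Count how many doublings take start shards to at least target."""
    count = 0
    while start < target:
        start *= 2
        count += 1
    return count


old = initial_ranges(4)
new = split_all(old)
key = 3 * 2 ** 62 + 5  # an arbitrary key in the last of the four ranges
```

With this layout a row only ever moves to one of the two children of its old shard, which keeps each split local, and six doublings are enough to go from four shards to 256.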
---
Pluggable Indexes
Pluggable indexes offer flexibility in database management by allowing custom sharding schemes. Sougoumarane shares that Michael Stonebraker's work on Illustra inspired the pluggable indexes in Vitess [6]. These indexes make it possible to define sharding schemes and secondary indexes as code, so they can adapt to changing application needs.
The application use case may dictate one type of sharding today and it may change tomorrow.
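One way to picture a sharding scheme defined as code is a small interface that a router accepts without knowing which scheme is behind it. The class and method names below are hypothetical and are not the Vitess Vindex API; the sketch only shows how a scheme could be swapped without changing the routing code.

```python
import hashlib
from abc import ABC, abstractmethod

KEYSPACE_SIZE = 2 ** 64  # assumed size of the key space


class ShardingScheme(ABC):
    """Hypothetical plug-in interface: map a column value to a key space ID."""

    @abstractmethod
    def keyspace_id(self, value) -> int:
        ...


class HashScheme(ShardingScheme):
    """Spread values evenly by hashing them."""

    def keyspace_id(self, value) -> int:
        digest = hashlib.sha256(str(value).encode()).digest()
        return int.from_bytes(digest[:8], "big")


class LookupScheme(ShardingScheme):
    """Pin values to explicit key space IDs held in a lookup table."""

    def __init__(self, table: dict):
        self.table = dict(table)

    def keyspace_id(self, value) -> int:
        return self.table[value]


class Router:
    """Routes values to shards using whichever scheme is plugged in."""

    def __init__(self, scheme: ShardingScheme, num_shards: int):
        self.scheme = scheme
        self.num_shards = num_shards

    def shard_for(self, value) -> int:
        # Scale the key space ID down to a shard number in [0, num_shards).
        return self.scheme.keyspace_id(value) * self.num_shards // KEYSPACE_SIZE


hashed = Router(HashScheme(), 4)
pinned = Router(LookupScheme({"alice": 0, "bob": 2 ** 63}), 4)
```

Switching from `HashScheme` to `LookupScheme` changes where rows land but leaves `Router` untouched, which is the kind of flexibility the quote describes.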
---
Related Episodes


SE-Radio Episode 243: RethinkDB with Slava Akhmechet
SE Radio 605: Yingjun Wu on Streaming Databases
Episode 510: Deepthi Sigireddi on How Vitess Scales MySQL
SE-Radio Episode 362: Simon Riggs on Advanced Features of PostgreSQL
SE-Radio Episode 354: Avi Kivity on ScyllaDB
SE-Radio Episode 353: Max Neunhoffer on Multi-model databases and ArangoDB
SE Radio 561: Dan DeMers on Dataware
SE Radio 623: Mike Freedman on TimescaleDB
SE-Radio Episode 344: Pat Helland on Web Scale
SE-Radio Episode 288: DevSecOps
SE Radio 631: Abhay Paroha on Cloud Migration for Oil and Gas Operations
364: Peter Zaitsev on Choosing the Right Open Source Database
SE Radio 583: Lukas Fittl on Postgres Performance