Published Sep 3, 2019

Episode 222: Nathan Marz on Real-Time Processing with Apache Storm

Nathan Marz, creator of Apache Storm, delves into the platform's revolutionary impact on real-time data processing, detailing its scalable algorithms, fault tolerance, and how it simplifies distributed computation, transforming industry approaches to complex analytics tasks.

Software Engineering Radio - the podcast for professional software developers

Episode Highlights

  • Tuple Processing

    Apache Storm processes data as tuples that flow through a topology running across a cluster. Nathan Marz explains that a spout emits tuples, which bolts then process, so that each spout tuple becomes the root of a tree of computation across the cluster 1. Storm guarantees that every message is fully processed by tracking this tuple tree in a constant amount of memory, only about 20 bytes, even when the tree has a billion pending messages 2.

    Regardless of how big the tuple tree gets, a tuple tree could have a billion pending messages, and it still only needs 20 bytes to track the tuple tree.

    ---

    This tracking relies on a probabilistic algorithm with a very small chance of error, which keeps Storm's processing guarantee reliable for real-time workloads 2.
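To make the constant-memory claim concrete, here is a minimal sketch of XOR-based tracking, the approach Storm's documentation on guaranteed message processing describes for its acker. Every tuple gets a random 64-bit id; the tracker XORs an id into a single checksum once when the tuple is emitted and once when it is acked, so the checksum returns to zero only when every tuple in the tree has been acked (barring a collision, which is where the probabilistic element comes in). The names `new_tuple_id`, `TupleTreeTracker`, and `run_demo` are hypothetical and not part of Storm's API, and the sketch covers only the checksum, not the other bookkeeping that makes up the roughly 20 bytes mentioned above.

```python
import secrets


def new_tuple_id() -> int:
    """Return a random nonzero 64-bit id for a tuple (illustrative helper)."""
    return secrets.randbits(64) or 1


class TupleTreeTracker:
    """Tracks one tuple tree with a single 64-bit checksum, whatever the tree size."""

    def __init__(self) -> None:
        self.checksum = 0  # the only state kept for the whole tree

    def emitted(self, tuple_id: int) -> None:
        # XOR the id in when the tuple is created.
        self.checksum ^= tuple_id

    def acked(self, tuple_id: int) -> None:
        # XOR the same id again when the tuple is acked; the two cancel out.
        self.checksum ^= tuple_id

    def fully_processed(self) -> bool:
        # Zero means every emitted id has also been acked.
        return self.checksum == 0


def run_demo(fanout: int = 1000) -> bool:
    """Emit a root tuple plus `fanout` children, then ack them all."""
    tracker = TupleTreeTracker()
    root = new_tuple_id()
    tracker.emitted(root)
    children = [new_tuple_id() for _ in range(fanout)]
    for child in children:
        tracker.emitted(child)

    tracker.acked(root)
    for child in children[:-1]:
        tracker.acked(child)
    # One child is still pending, so the checksum equals its nonzero id.
    still_pending = not tracker.fully_processed()

    tracker.acked(children[-1])
    return still_pending and tracker.fully_processed()
```

Calling `run_demo(1_000_000)` exercises a much larger tree while the tracker still holds a single integer, which is the property the quote above points at.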

       

  • Fault Tolerance

    Storm's architecture is designed for fault tolerance, so a topology keeps running even when individual processes fail. Marz highlights that Storm's process fault tolerance allows processes to be restarted without disrupting the running application, a crucial feature for maintaining uptime 3. The architecture includes Nimbus, ZooKeeper, and the supervisor daemons, which coordinate to keep the system running during failures 4.

    You can kill -9 Nimbus or the supervisors and nothing will happen to running topologies.

    ---

    This design ensures that even if a node fails, Storm can reassign tasks to other nodes, maintaining the integrity and progress of the data processing 5.
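The reassignment idea can be illustrated with a small sketch. This is not Storm's scheduler: the function name `reassign_tasks`, the node and task names, and the least-loaded placement rule are all assumptions chosen to show the general shape of moving a failed node's tasks onto the surviving nodes.

```python
from __future__ import annotations


def reassign_tasks(
    assignments: dict[str, list[str]], failed_node: str
) -> dict[str, list[str]]:
    """Return a new assignment with the failed node's tasks moved to live nodes.

    Hypothetical helper: each orphaned task goes to the live node that
    currently holds the fewest tasks (ties go to the earliest-listed node).
    """
    survivors = {
        node: list(tasks)
        for node, tasks in assignments.items()
        if node != failed_node
    }
    if not survivors:
        raise RuntimeError("no live nodes left to take over tasks")

    for task in assignments.get(failed_node, []):
        target = min(survivors, key=lambda node: len(survivors[node]))
        survivors[target].append(task)
    return survivors


# Example cluster state (made-up names).
cluster = {
    "node-a": ["spout-1", "bolt-1"],
    "node-b": ["bolt-2"],
    "node-c": ["bolt-3", "bolt-4"],
}
after_failure = reassign_tasks(cluster, "node-c")
```

In this example the two tasks from `node-c` end up split between `node-a` and `node-b`, and the original `cluster` mapping is left unmodified.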
