SRE Insights at SoundCloud

Björn shares how site reliability engineering (SRE) principles have been instrumental in improving monitoring at SoundCloud, particularly as microservices increased system complexity. He highlights the challenges the team faced and the importance of establishing effective monitoring systems, such as Prometheus, to gain visibility into operations. He also introduces the concept of on-call duties, emphasizing their critical role in maintaining system reliability.

From this podcast
Software Engineering Radio - the podcast for professional software developers
SE-Radio Episode 276: Björn Rabenstein on Site Reliability Engineering
