The DevOps Handbook – The Technical Practices of Feedback

Telemetry Basics
Telemetry plays a crucial role in software development by providing insight into both the usage and the performance of applications. Joe Zack and Alan Underwood discuss this dual nature: telemetry captures usage metrics such as page visits and conversion rates as well as performance indicators such as speed and error rates [1]. With that data, developers can pinpoint issues without resorting to guesswork, and Alan emphasizes the importance of generating telemetry in order to solve infrastructure problems effectively [2].
In software, telemetry is used to gather data on the use and performance of applications and application components.
--- Alan Underwood
Telemetry should also be accessible via APIs, so that other systems can integrate with it and problems can be investigated quickly without aggregating data by hand [3].
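As a rough illustration of the two kinds of telemetry described in this section, and of exposing them in a machine-readable form, here is a minimal Python sketch. The `TelemetryStore` class, its method names, and the metric names are hypothetical and do not come from the episode; a real system would more likely use an established metrics library and serve the snapshot over HTTP.

```python
import json
from collections import defaultdict


class TelemetryStore:
    """Hypothetical in-memory store for usage counters and timing samples."""

    def __init__(self):
        self.counters = defaultdict(int)   # usage side, e.g. page visits, errors
        self.timings = defaultdict(list)   # performance side, durations in seconds

    def increment(self, name, amount=1):
        self.counters[name] += amount

    def record_timing(self, name, seconds):
        self.timings[name].append(seconds)

    def snapshot(self):
        """Return a JSON document that another system could fetch through an API."""
        summary = {
            name: {"count": len(samples), "avg_seconds": sum(samples) / len(samples)}
            for name, samples in self.timings.items()
        }
        return json.dumps(
            {"counters": dict(self.counters), "timings": summary}, sort_keys=True
        )


store = TelemetryStore()
store.increment("page_visits")
store.increment("page_visits")
store.increment("errors")
store.record_timing("checkout_request", 0.25)
store.record_timing("checkout_request", 0.75)
print(store.snapshot())
```

Because the snapshot is plain JSON, a dashboard or alerting tool could consume it directly instead of someone collecting the numbers by hand.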
Centralized Insights
A centralized telemetry infrastructure is essential for understanding how a system is operating as a whole. Alan Underwood explains the need to collect data across the business logic, application, and environment layers in order to visualize trends and detect anomalies [4]. This infrastructure, often built with tools like Prometheus and Grafana, aggregates and visualizes metrics to give a holistic view of system performance [5].
You have to create a comprehensive set of telemetry from your application metrics to your operational metrics.
--- Alan Underwood
The Etsy case study illustrates the power of telemetry: dashboards built to track metrics led to improved responsiveness and operational efficiency [6].
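To make the idea of detecting anomalies across layers concrete, here is a small Python sketch. It assumes a simple rule that the episode summary does not specify: the newest sample is flagged when it lies more than three standard deviations from the mean of the earlier samples. The layer names follow the text above; the metric names and values are invented for illustration.

```python
import statistics


def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it is more than `threshold` standard deviations from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return latest != mean
    return abs(latest - mean) > threshold * stdev


def scan(telemetry):
    """Check the newest sample of every metric in every layer against its own history."""
    flagged = []
    for layer, metrics in telemetry.items():
        for name, samples in metrics.items():
            *history, latest = samples
            if is_anomalous(history, latest):
                flagged.append((layer, name, latest))
    return flagged


# Illustrative data only: one series per layer, newest sample last.
telemetry = {
    "business": {"orders_per_minute": [20, 22, 19, 21, 20, 23, 21, 20, 22, 21]},
    "application": {"errors_per_minute": [1, 0, 2, 1, 1, 0, 1, 2, 1, 40]},
    "environment": {"cpu_percent": [35, 37, 36, 34, 38, 36, 35, 37, 36, 35]},
}

print(scan(telemetry))
```

The three-sigma rule works best for roughly bell-shaped data; metrics with skewed distributions may need a different test.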
Deployment Monitoring
Incorporating telemetry into deployment pipelines improves both monitoring and problem-solving. Joe Zack highlights the importance of tracking metrics such as unit test failures and build times to catch issues early in the deployment process [7]. By overlaying production changes on telemetry data, teams can pinpoint the moment an issue began, which speeds up resolution [8].
Knowing how long they take to build and execute again... it was always taking a minute before, now it's taking ten. What changed?
--- Alan Underwood
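The build-time question in the quote above can be turned into an automated check. The following Python sketch is one possible approach, not something described in the episode: it compares the latest build duration with the median of earlier builds and flags it when it exceeds a chosen multiple. The function name, the `factor` default, and the sample durations are assumptions.

```python
import statistics


def build_time_regressed(previous_seconds, latest_seconds, factor=2.0):
    """Return True when the latest build took more than `factor` times the median of earlier builds."""
    if not previous_seconds:
        return False  # no baseline yet
    baseline = statistics.median(previous_seconds)
    return latest_seconds > factor * baseline


# Illustrative durations: builds that used to take about a minute.
history = [58, 61, 60, 62, 59]

print(build_time_regressed(history, 600))  # a ten-minute build
print(build_time_regressed(history, 65))   # a normal build
```

Using the median rather than the mean keeps a single slow build in the history from shifting the baseline too far.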
Telemetry also fosters a scientific approach to problem-solving: teams can use data-driven methods to address issues, which improves collaboration between development and operations [9].
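The overlay of production changes on telemetry described in this section can be sketched in Python as follows. This is a minimal illustration under assumed data shapes: deployments and metric samples are lists of `(timestamp, value)` pairs sorted by time, and the release labels, timestamps, and error-rate limit are invented.

```python
from bisect import bisect_right


def first_breach(samples, limit):
    """Return the timestamp of the first sample whose value exceeds `limit`, or None."""
    for timestamp, value in samples:
        if value > limit:
            return timestamp
    return None


def last_deploy_before(deploys, timestamp):
    """Return the most recent (timestamp, label) deploy at or before `timestamp`, or None."""
    times = [t for t, _ in deploys]
    index = bisect_right(times, timestamp)
    return deploys[index - 1] if index else None


# Illustrative data: timestamps are seconds since some arbitrary start.
deploys = [(100, "release-41"), (220, "release-42"), (400, "release-43")]
error_rate = [(90, 0.4), (150, 0.5), (210, 0.4), (230, 3.9), (290, 4.2)]

breach_at = first_breach(error_rate, limit=1.0)
suspect = last_deploy_before(deploys, breach_at) if breach_at is not None else None
print(breach_at, suspect)
```

The deploy closest before the breach is only a starting point for investigation; correlation in time does not prove that the change caused the problem.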