Collective Wisdom from the Experts
Ratings: 1
Average rating: 3
When your system goes down, every minute means lost business and angry customers venting frustration on social media. You may be at your wits' end, wishing you knew more about the problem. Enter site reliability engineering (SRE). This practical book offers actionable advice on a wide range of topics, including how to adopt SRE, where DevOps and SRE overlap, and how monitoring and observability differ.

Editors Jaime Woo and Emil Stolarsky, cofounders of Incident Labs, have collected 97 concise and useful tips from colleagues and fellow professionals to help you expand your SRE skills through trusted best practices and new approaches to knotty problems. You'll hone those skills through sound advice, including how to ask thought-provoking questions that will drive the direction of the field.

Learn how SRE relates to concepts including DevOps and resilience engineering
Assess how SRE is implemented across companies of different sizes
Implement foundational concepts of SRE, including SLOs, error budgets, incident response, game days, and post-mortems
Build and scale an SRE team for your organization's changing needs
Evaluate the progress of SRE adoption and strategies and relate them back to stakeholders