Why Is My App SLOw? Defining Reliability in Platform Engineering
Platform engineering is all fun and games until platform customers start complaining about their apps running slowly. Is it the app code or the platform?
This talk looks at how Google’s Serverless SRE team detects platform-level latency regressions before users notice them, measures their impact, and tracks performance over time. We’ll discuss the limitations of SLOs in this context and show how a statistical approach can instead give a customer-centric picture of our platform’s performance.