Flame graphs are a visualization that helps developers quickly find performance bottlenecks, cutting computing costs and improving the end-user experience. They can answer many questions, including how software is consuming resources, especially CPUs, and how that consumption has changed since the previous software version. Flame graphs are now a standard for CPU profiling, have been adopted in many programming languages and observability products, and are the basis for multiple startups. They were defined in "The Flame Graph", published in Communications of the ACM by their creator, Brendan Gregg.

This talk covers the origins of flame graphs, how to create them using open source software, and how to interpret them. In practice, flame graphs do not always work completely, because of problems walking stack traces, resolving symbols, and other issues; this talk explains those problems and shows the latest techniques for fixing them.
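
As a rough illustration of the data behind a flame graph, the sketch below collapses sampled stack traces into "folded" lines, the one-line-per-unique-stack text format (frames joined by semicolons, followed by a sample count) that the open source FlameGraph scripts take as input. The function name fold_stacks and the sample stacks are hypothetical and exist only for this example; a real profiler would supply the stacks.

```python
from collections import Counter


def fold_stacks(samples):
    """Collapse sampled stacks into folded lines: 'root;...;leaf count'.

    Each sample is a list of frame names ordered root-first.
    (Hypothetical helper for illustration, not part of any profiler.)
    """
    counts = Counter(";".join(stack) for stack in samples)
    # Sort so identical prefixes end up adjacent, as in a flame graph.
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


# Hypothetical samples; in practice these come from a sampling profiler.
samples = [
    ["main", "parse", "read_file"],
    ["main", "parse", "read_file"],
    ["main", "render"],
]

folded = fold_stacks(samples)
for line in folded:
    print(line)
```

Each folded line becomes one tower of frames in the rendered graph, and the count determines its width, which is why a missing or unresolved frame in the sampled stacks shows up directly as a broken or mislabeled tower.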

Flame graphs are a tool for a bigger mission: to understand the performance of everything, all software and hardware. The talk also explains advanced types of flame graphs that further this goal, including differential and off-CPU flame graphs and flame graphs of memory, disk, and network events. Many of these advanced types need newer kernel technologies, especially extended BPF (eBPF), to be practical, and will see adoption in the years ahead.

Visualizing Performance - The Developers’ Guide to Flame Graphs

Brendan Gregg