Performance analysis with flame graphs under Linux
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Please credit the source when reprinting. Thank you for your cooperation.
My technical level and knowledge are limited. If anything in this article is lacking or needs correcting, corrections are welcome, as are suggestions of other good debugging tools to include. Thank you in advance.
Software performance analysis, often need to check
Flame graph (
1 Introduction to flame graphs
When many people catch a cold and run a fever, they imitate Shennong tasting herbs: they try antiviral drugs first, then antibacterial drugs, taking whatever medicine is in the house, Chinese or Western, hoping that a blind cat will eventually stumble on a dead mouse. This is naturally undesirable. The correct way is to go to the hospital for a blood test and get the right prescription after a diagnosis.
Let us recall how we usually debug programs: we rely on subjective assumptions without data, rather than thinking about what actually caused the problem!
1.1 Flame graph
The early flame diagram is in
|Sampling data used to generate the On-CPU flame graph (|
|Sampling data used to generate Off-CPU flame graph (|
1.2 On/Off-CPU flame graphs
So when to use
Depends on what the current bottleneck is, if it is
If you still can't confirm, then you might as well
When sampling, it is best to keep the program under load with a stress-testing tool so that enough samples are collected. Regarding the choice of the stress-testing tool, if you choose
1.3 Flame graph visualization generator
git clone https://github.com/brendangregg/FlameGraph.git
Generating a flame graph takes the following steps:
|Capture stack||Use perf/systemtap/dtrace and other tools to capture the running stacks of the program|
|Fold stack||The trace tool captures the stack information of the system and the program at each sampling moment. These stacks need to be analyzed and combined, and repeated stacks are accumulated to reflect the load and the critical path|
|Generate flame graph||Analyze the stack information output by stackcollapse to generate a flame graph|
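The folding step can be sketched in a few lines of Python. This is only an illustration of the idea (identical stacks are merged and counted), not how the stackcollapse scripts are actually implemented. The function names and the sample stacks are hypothetical; the output format, semicolon-separated frames followed by a sample count, is the folded format used later in this article.

```python
from collections import Counter


def fold_stacks(samples):
    """Collapse sampled call stacks into 'root;...;leaf' -> sample count."""
    folded = Counter()
    for frames in samples:
        # Each sample is a list of frames ordered from root to leaf.
        folded[";".join(frames)] += 1
    return folded


def to_folded_lines(folded):
    """Render the counts as 'stack count' lines, sorted by stack."""
    return [f"{stack} {count}" for stack, count in sorted(folded.items())]


# Hypothetical samples: three stacks captured at three sampling moments.
samples = [
    ["main", "parse", "read_file"],
    ["main", "parse", "read_file"],
    ["main", "render"],
]
folded = fold_stacks(samples)
print("\n".join(to_folded_lines(folded)))
```

Two of the three samples share the same stack, so they are accumulated into a single line with a count of 2. This accumulated count is what later becomes the width of a frame.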
Different trace tools capture different information, so
|stackcollapse.pl||for DTrace stacks|
|stackcollapse-perf.pl||for Linux perf_events "perf script" output|
|stackcollapse-pmc.pl||for FreeBSD pmcstat -G stacks|
|stackcollapse-stap.pl||for SystemTap stacks|
|stackcollapse-instruments.pl||for XCode Instruments|
|stackcollapse-vtune.pl||for Intel VTune profiles|
|stackcollapse-ljp.awk||for Lightweight Java Profiler|
|stackcollapse-jstack.pl||for Java jstack(1) output|
|stackcollapse-gdb.pl||for gdb(1) stacks|
|stackcollapse-go.pl||for Golang pprof stacks|
|stackcollapse-vsprof.pl||for Microsoft Visual Studio profiles|
2 Generate flame graph with perf
2.1 Collecting data with perf
Let's start from
sudo perf record -F 99 -p 3887 -g -- sleep 30
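-F sets the sampling frequency to 99 Hz (99 samples per second), -p 3887 selects the process to profile, -g records call stacks, and sleep 30 makes the recording last 30 seconds. If all 99 samples in a second return the same function name, the CPU spent that whole second in the same function, and there may be a performance problem.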
After running, a huge text file will be generated. If a server has
For ease of reading,
sudo perf report -n --stdio
2.2 Generate flame graph
# Parse perf.data into text call stacks
perf script -i perf.data > perf.unfold
Save the parsed information; it is the input for generating the flame graph.
# Fold the call stacks
./stackcollapse-perf.pl perf.unfold > perf.folded
# Generate the flame graph
./flamegraph.pl perf.folded > perf.svg
We can use pipes to combine the steps above into one command:
perf script | FlameGraph/stackcollapse-perf.pl | FlameGraph/flamegraph.pl > process.svg
3 Analyze the flame graph
Finally, you can use the browser to open the flame graph for analysis.
3.1 The meaning of the flame graph
The flame graph is based on
The flame graph is to see which function on the top layer occupies the largest width. As long as there is a "flat top" (
The color has no special meaning, because the flame diagram represents
The flame graph is
- Mouse hover
Each layer of the flame is labeled with a function name. When the mouse hovers over a frame, the complete function name, the number of samples, and the percentage of total samples are displayed.
- Click to zoom
Clicking on a frame zooms the flame graph horizontally: that frame takes up the full width and its details are shown.
"Reset Zoom" also appears in the upper left corner; clicking it restores the graph to its original state.
- Search
Pressing Ctrl + F displays a search box. The user can enter a keyword or a regular expression, and all matching function names are highlighted.
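The numbers shown on hover, and the "flat top" check from section 3.1, can also be computed directly from folded stacks. The following Python sketch assumes input lines in the folded format (semicolon-separated frames followed by a count); the helper names and the sample data are hypothetical and are not part of the FlameGraph tools.

```python
def parse_folded(lines):
    """Parse 'frame;frame;frame count' lines into {stack: count}."""
    counts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        stack, _, count = line.rpartition(" ")
        counts[stack] = counts.get(stack, 0) + int(count)
    return counts


def frame_stats(counts, path):
    """Samples and percentage for the frame reached via path (root..frame)."""
    total = sum(counts.values())
    prefix = ";".join(path)
    samples = sum(
        n for stack, n in counts.items()
        if stack == prefix or stack.startswith(prefix + ";")
    )
    return samples, 100.0 * samples / total


def leaf_samples(counts):
    """Samples per top-of-stack function; large values are 'flat tops'."""
    leaves = {}
    for stack, n in counts.items():
        leaf = stack.rsplit(";", 1)[-1]
        leaves[leaf] = leaves.get(leaf, 0) + n
    return leaves


# Hypothetical folded profile with 100 samples in total.
counts = parse_folded([
    "main;parse;read_file 60",
    "main;parse;tokenize 15",
    "main;render 25",
])
samples, percent = frame_stats(counts, ["main", "parse"])
print(f"parse ({samples} samples, {percent:.2f}%)")
print(max(leaf_samples(counts).items(), key=lambda item: item[1]))
```

In this made-up profile, read_file is the widest function at the top of the stacks, so it is the first place to look.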
In the following two cases, the flame graph cannot be drawn correctly, and the system behavior needs to be corrected.
- Incomplete call stack
When the call stack is too deep, some systems return only the first part of it (such as the first 10 layers).
- Function name is missing
Some functions have no names, and the compiler only uses memory addresses to represent them (such as anonymous functions).
3.4 Flame graphs in the browser
Open developer tools, switch to
At this time, the developer tool will display a timeline. Below it is the flame graph.
There are two differences between the browser flame graph and the standard flame graph: it is inverted (that is, the function at the top of the call stack is at the bottom);
4 Red/blue differential flame graphs
Refer to www.brendangregg.com/blog/2014-1...
The following therefore introduces red/blue differential flame graphs.
4.1 Example of a red/blue differential flame graph
The figure above is an interactive SVG image. Two colors are used in the figure to indicate state: red indicates growth, and blue indicates reduction.
The shape and size of each flame in this flame picture are the same as the second grab
The following example shows that after the system is upgraded, a workload of
Usually, the colors of the stack frame and the stack tower in the standard flame diagram are randomly selected. In the red/blue differential flame diagram, different colors are used to represent the two
In the second
This example is so simple that I could analyze it even without a differential flame graph. But imagine analyzing a small performance regression, say less than 5%, in more complex code: the problem would not be so easy to deal with.
4.2 Introduction to red/blue differential flame graphs
I have been discussing this matter for several years, and finally I wrote an implementation that I personally think is valuable. It works like this:
- Capture the stacks before the modification into the profile1 file.
- Capture the stacks after the modification into the profile2 file.
- Use profile2 to generate the flame graph. (So the width of each stack frame is based on the profile2 file.)
- Use the "2 - 1" difference to recolor the flame graph. The coloring rule is: if a stack frame appears more often in profile2, it is colored red; otherwise it is colored blue. The color is filled in according to the size of the difference before and after the modification.
The purpose of this is to use both before and after the modification
Only functions that directly contribute to the difference are colored (for example, functions that are running on the CPU); the functions they call are not colored again.
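The recoloring rule can be sketched as follows. This is a simplified illustration, not the actual flamegraph.pl code: the color values, the function name, and the sample counts are all hypothetical. It assumes that each folded stack's delta belongs to its leaf function, which is what "direct impact" means above.

```python
def diff_colors(before, after):
    """Map each stack in `after` to (width, delta, rgb_color)."""
    # Largest absolute change, used to scale color intensity (at least 1).
    max_delta = max(
        (abs(n - before.get(stack, 0)) for stack, n in after.items()),
        default=0,
    ) or 1
    result = {}
    for stack, n in after.items():
        delta = n - before.get(stack, 0)  # the "2 - 1" difference
        # Larger differences give a smaller fade value, i.e. a stronger color.
        fade = 210 - 210 * abs(delta) // max_delta
        if delta > 0:
            color = (255, fade, fade)   # red: more samples than before
        elif delta < 0:
            color = (fade, fade, 255)   # blue: fewer samples than before
        else:
            color = (255, 255, 255)     # white: unchanged
        # The width always comes from the second profile.
        result[stack] = (n, delta, color)
    return result


# Hypothetical folded profiles: {stack: sample count}.
before = {"main;parse;read_file": 31, "main;render": 40}
after = {"main;parse;read_file": 33, "main;render": 20, "main;compress": 10}
for stack, (width, delta, color) in diff_colors(before, after).items():
    print(stack, width, delta, color)
```

Here main;render lost the most samples and gets the strongest blue, while main;compress is new in the second profile and is colored red.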
4.3 Generate red/blue differential flame graph
- Capture the profile1 file before the modification:
# Collect data
perf record -F 99 -a -g -- sleep 30
# Parse the data to generate stack information
perf script > out.stacks1
# Fold the stacks (the ../ path assumes this runs inside the FlameGraph directory)
./stackcollapse-perf.pl ../out.stacks1 > out.folded1
- After a period of time (or after the program code is modified), capture the profile2 file:
# Collect data
perf record -F 99 -a -g -- sleep 30
# Parse the data to generate stack information
perf script > out.stacks2
# Fold the stacks (the ../ path assumes this runs inside the FlameGraph directory)
./stackcollapse-perf.pl ../out.stacks2 > out.folded2
- Generate the red/blue differential flame graph:
./difffolded.pl out.folded1 out.folded2 | ./flamegraph.pl > diff2.svg
func_a;func_b;func_c 31 33
[...]
In the above example, "func_a()->func_b()->func_c()" represents the call stack, which is in profile1
Here are some useful options:
|difffolded.pl -n||This option normalizes the data in the two profile files so that they can be compared. Without it, the sample totals of the two captures will almost certainly differ, because the capture time and the CPU load differ, and the graph then looks either all red (increased load) or all blue (decreased load). The -n option balances the first profile file, so you get a complete red/blue spectrum|
|difffolded.pl -x||This option removes hexadecimal addresses. The profiler often fails to convert an address to a symbol, so hexadecimal addresses appear in the stack. If such an address differs between the two profile files, the two stacks are treated as different stacks even though they are actually the same. If you run into this problem, use the -x option to fix it|
|flamegraph.pl --negate||Reverses the red/blue color scheme. This option is used in the next section|
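The effect of the -n and -x options can be illustrated with a small Python sketch. It only mirrors the idea described in the table (scale the first profile to the second profile's total, and replace hexadecimal addresses with a fixed placeholder); the function names, the placeholder text, and the sample data are assumptions and are not taken from difffolded.pl.

```python
import re

HEX_ADDRESS = re.compile(r"0x[0-9a-fA-F]+")


def strip_hex(counts):
    """Replace hex addresses so that equal stacks compare as equal (like -x)."""
    merged = {}
    for stack, n in counts.items():
        key = HEX_ADDRESS.sub("0x...", stack)
        merged[key] = merged.get(key, 0) + n
    return merged


def normalize(before, after):
    """Scale `before` so its total sample count matches `after` (like -n)."""
    total_before = sum(before.values())
    total_after = sum(after.values())
    if total_before == 0 or total_before == total_after:
        return dict(before)
    return {s: n * total_after // total_before for s, n in before.items()}


# Hypothetical profiles: the second capture has twice as many samples,
# and an unresolved address differs between the two captures.
before = {"main;0x7f3a10;work": 50, "main;idle": 50}
after = {"main;0x7f9b20;work": 120, "main;idle": 80}

after_clean = strip_hex(after)
before_norm = normalize(strip_hex(before), after_clean)
deltas = {s: n - before_norm.get(s, 0) for s, n in after_clean.items()}
print(deltas)
```

Without normalization every stack here would show an increase, simply because the second capture contains more samples. After scaling, one stack shows an increase and the other a decrease.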
Although the red/blue differential flame diagram is useful, there is actually a problem: if a code execution path disappears completely, then there is no place to mark blue in the flame diagram. You can only see the current one
One way is to reverse the order of comparison and draw the opposite differential flame graph. For example:
The flame diagram above is based on before modification
In the figure, the disappeared code is also highlighted (or it should be said that it is not highlighted), because the compression function was not enabled before the modification, so it did not appear before the modification
The following is the corresponding command line:
./difffolded.pl out.folded2 out.folded1 | ./flamegraph.pl --negate > diff1.svg
In this way, the previous generation
|Flame graph information||description|
|diff1.svg||The width is based on the profile file before modification, and the color indicates what will happen|
|diff2.svg||The width is based on the modified profile file, and the color indicates what has happened|
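Why both images are useful can be shown with a short Python sketch. It models only the idea: a stack is drawn only if it has width in the profile that supplies the widths, and negating the delta keeps red meaning "grows in the modified profile". The function name and the sample counts are hypothetical.

```python
def differential(width_profile, other_profile, negate=False):
    """Return {stack: delta} for the stacks that are actually drawn."""
    drawn = {}
    for stack, width in width_profile.items():
        if width == 0:
            continue  # zero-width frames never appear in the picture
        delta = width - other_profile.get(stack, 0)
        drawn[stack] = -delta if negate else delta
    return drawn


# Hypothetical profiles: legacy_path disappears after the modification.
profile1 = {"main;work": 60, "main;legacy_path": 40}
profile2 = {"main;work": 100}

diff2 = differential(profile2, profile1)               # widths from profile2
diff1 = differential(profile1, profile2, negate=True)  # widths from profile1
print("diff2:", diff2)
print("diff1:", diff1)
```

In diff2 the vanished main;legacy_path stack is not drawn at all, while in diff1 it is drawn with a negative delta (blue).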
When doing functional verification testing, I generate both images.
4.5 CPI flame graph
These scripts were initially used in the analysis of the CPI flame chart . Compare the before and after modification
4.6 Other differential flame graphs
Others have done similar work. Robert Mustacchi also made some attempts not long ago. His approach resembles the color coding used in code review: only the differences are shown, with red indicating added (increased) code paths and blue indicating removed (decreased) code paths. A key difference is that the width of a stack frame reflects only the number of differing samples. An example is on the right. This is a very good idea, but it feels a bit strange in practice: without the context of the complete profile file as a background, the picture is somewhat hard to understand.
Cor-Paul Bezemer also created a differential display method, flamegraphdiff. He places three flame graphs in the same figure: a standard flame graph for before and one for after the modification, with a differential flame graph below them, whose stack frame width is again the number of differing samples. The figure above is an example. When the mouse moves over a stack frame in the differential graph, the same stack frame is highlighted in all three graphs. Because this method adds the two standard flame graphs, the context problem is solved.
The differential flame graphs from the three of us each have their strengths, and they can be combined: the top two graphs in Cor-Paul's method could be my diff1.svg and diff2.svg, and the graph below could be drawn Robert's way. For consistency, the graph below could use my coloring: blue->white->red.
Flame graphs are spreading widely, and many companies now use them. I would not be surprised if there are other differential flame graph implementations that I do not know about. (Please tell me in the comments.)
If you have a performance regression, the red/blue differential flame graph is the fastest way to find the root cause. The method captures two ordinary flame graphs, compares them, and color-codes the differences: red means an increase, blue means a decrease. The differential flame graph is based on the current ("modified") profile file, so its shape and size stay the same. You can therefore spot the differences intuitively from the colors and see why they arise.
Differential flame graphs can be applied to a project's daily builds, so that performance regressions are discovered and corrected in time.
This work/blog post (AderStep - purple night-hearted - green grass Ling Lane, Copyright 2013-2017) was created by Chengjian (gatieme).
It is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. You are welcome to reprint, use, and republish it, but be sure to keep the attribution to Chengjian gatieme (including the link: blog.csdn.net/gatieme), and do not use it for commercial purposes.
Modified works based on this article must be published under the same license. If you have any questions, please contact me.