Open Source Performance Visualization Tool: FlameScope Pattern Recognition

FlameScope is a new open source performance visualization tool that uses subsecond-offset heat maps and flame graphs to analyze periodic activity, variance, and perturbations. We published a technical article, Netflix FlameScope, on the Netflix TechBlog, along with the tool's source code. Flame graphs are well understood; subsecond-offset heat maps are harder to understand (they are a visualization I invented). FlameScope can help you understand the latter.

In short, a subsecond-offset heat map is drawn as follows: the x-axis is the whole second, and the y-axis is the fraction within that second. Each fraction of a second is called a bucket (or box), and it holds the aggregated number of events that occurred in that fraction of the second. The color depth of a box represents that count: the darker the color, the more events occurred.
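The bucketing described above can be sketched in a few lines of Python. This is only an illustration, not FlameScope's implementation: the function name `subsecond_offset_heatmap`, the default of 50 rows, and the input format (non-negative event timestamps in seconds) are all assumptions made for the example.

```python
from collections import Counter

def subsecond_offset_heatmap(timestamps, rows=50):
    """Count events per (whole second, sub-second bucket) cell.

    Assumes non-negative timestamps in seconds (hypothetical input format).
    """
    if not timestamps:
        return {}
    start = int(min(timestamps))  # align column 0 to the first whole second
    cells = Counter()
    for ts in timestamps:
        column = int(ts) - start                          # x: whole second
        row = min(int((ts - int(ts)) * rows), rows - 1)   # y: fraction bucket
        cells[(column, row)] += 1
    return dict(cells)

# Two events land in the same bucket of second 0, one in the middle of second 1.
heatmap = subsecond_offset_heatmap([10.005, 10.010, 11.500])
print(heatmap)
```

A renderer would then map each cell's count to a color depth, with higher counts drawn darker.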

Here is a subsecond-offset heat map of real CPU samples:

2509688-d4bc8e0498c01fc8.png

What can we learn from this figure? To make the various patterns easy to tell apart, I have drawn some synthetic samples for this article. When actually using FlameScope, you can select each pattern and also generate a flame graph showing the corresponding code paths (I do not show the flame graphs here).

Periodic Activity

1. One thread, once per second

2509688-9accd2bb05a33960.png

The same thread wakes up at the same offset every second, does a few milliseconds of work, and then goes back to sleep.

2. One thread, twice per second

2509688-13c78dac11b968aa.png

A wakeup occurs every 500 ms. This could be two threads, or it could be one thread that wakes up every 500 ms.

3. Two threads

2509688-43444eea4e120ebd.png

This looks like two threads, each waking up once per second.

4. One busy-wait thread, once per second

2509688-11375fc04d97886c.png

This thread does about 20 ms of work and then sleeps for 1 s. This is a common pattern, and it causes the wakeup offset to creep forward every second.
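The creeping offset follows from simple arithmetic: if the sleep only starts once the work has finished, each cycle lasts slightly longer than one second. A minimal sketch, assuming exactly 20 ms of work and a 1000 ms sleep (the function name and parameters are hypothetical):

```python
def wake_offsets_ms(work_ms=20, sleep_ms=1000, wakeups=5):
    """Return the sub-second offset (in ms) at which each wakeup starts."""
    offsets = []
    now_ms = 0
    for _ in range(wakeups):
        offsets.append(now_ms % 1000)
        now_ms += work_ms + sleep_ms  # the sleep starts only after the work ends
    return offsets

print(wake_offsets_ms())  # [0, 20, 40, 60, 80]
```

Each wakeup lands 20 ms later within its second than the previous one, which is what draws the rising diagonal in the heat map.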

5. One busy-wait thread, twice per second

2509688-13626c6d01252e9a.png

A wakeup occurs every 500 ms. This may be a single-threaded program that wakes up twice per second.

6. One busy-wait thread, more busy

2509688-ee94e9924c5429e6.png

A steeper slope: more work is done per wakeup, about 80 ms here.

7. One busy-wait thread, less busy

2509688-9ad406245374e206.png

A shallower slope: less work is done per wakeup, perhaps only a few milliseconds.

8. One busy-wait thread, waking up every 5 seconds

2509688-91f72fc46dee1ac4.png

Now the thread wakes up once every five seconds.

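From the angle of the line and the wakeup interval, we can calculate the CPU busy time of each wakeup:
busy_time = (1000 ms / (heat map rows * interval)) * tan(angle)
For example, for a line at a 45° angle:
busy_time = (1000 ms / (50 * 1)) * tan(45°) = 20 ms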
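The formula above can be transcribed directly into Python. This is a sketch of the formula exactly as the text states it, not a verified derivation; the function and parameter names are hypothetical, and only the 45° case with 50 rows and a 1-second interval, the one worked in the text, is checked here.

```python
import math

def busy_time_ms(angle_degrees, rows=50, interval_s=1):
    """Busy time per wakeup in ms, transcribed from the formula in the text."""
    # Assumption: the angle is given in degrees and the interval in seconds.
    return (1000 / (rows * interval_s)) * math.tan(math.radians(angle_degrees))

print(round(busy_time_ms(45), 3))  # 20.0
```

The rounding is only there because `math.tan` of 45° in floating point is very slightly below 1.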

Variance

9. CPU utilization 100%

2509688-d78f9d0447162ae9.png

This is what a fully utilized CPU looks like.

10. CPU utilization 50%

2509688-fa6fa547b40fd119.png

Real workloads look more like this: they are made up of short requests that arrive at random.

11. CPU utilization 25%

2509688-2e3c4d586a335e46.png

The same type of workload, at 25% utilization.

12. CPU utilization 5%

2509688-cb4d3657d3ae19a4.png

The same type of workload, at 5% utilization.

13. Increasing load

2509688-3cad9eaf87391d77.png

Over a span of two minutes, the load grows heavier.

14. Varying load

2509688-e573f7dded14f3d5.png

In every 30 seconds, there are 5 seconds of heavier load.

Perturbations

15. CPU perturbation


2509688-b5e68d51ba3bcda1.png

Every now and then, all CPUs are fully busy for about 100 ms (for example, garbage collection).

16. CPU blocking


2509688-04272185acdf6fba.png

Every now and then, all CPUs are idle for about 100 ms (for example, waiting on I/O).

17. Single-thread blocking


2509688-2e271d0bf40a34f0.png

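Every now and then, only one CPU is not idle, which shows up as a pink stripe rather than a white one (for example, a global lock).

This last pattern is interesting: it occurs when one running thread holds a lock and all other threads are blocked on that lock.
So what is that thread doing? Click the pink stripe in FlameScope to see the flame graph for that moment. A complicated performance problem immediately becomes simple.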
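To make the three perturbation patterns concrete, here is a rough sketch that labels a single heat map bucket by how many CPUs appear to be busy. Everything in it is an assumption for illustration: it presumes fixed-rate on-CPU sampling (a hypothetical 49 Hz per CPU), 50 rows per second, and arbitrary thresholds. FlameScope itself does not classify buckets this way; it shows them as color depth.

```python
def classify_bucket(sample_count, cpus, hz=49, rows=50):
    """Label one heat map bucket by how many CPUs appear busy (hypothetical)."""
    per_cpu = hz / rows                    # expected samples from one busy CPU
    busy_cpus = sample_count / per_cpu     # estimated number of busy CPUs
    if busy_cpus < 0.5:
        return "idle"                      # white stripe: CPU blocking
    if busy_cpus < 1.5:
        return "one CPU busy"              # pale stripe: single-thread blocking
    if busy_cpus >= cpus - 0.5:
        return "all CPUs busy"             # dark stripe: CPU perturbation
    return "partly busy"

for count in (0, 1, 4, 8):
    print(count, classify_bucket(count, cpus=8))
```

A run of consecutive buckets with the same label, lasting around 100 ms, would correspond to one of the stripes described in patterns 15 to 17.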

Summary

What conclusions can you draw from this figure?


2509688-3c9567526f927b79.png

When actually using FlameScope, you can select each of these patterns and generate a flame graph that shows the corresponding code paths.

My colleague Martin Spier (who is also the tool's lead developer) and I are speaking at the LinkedIn performance meetup on November 8.

Enjoy using FlameScope, and feel free to share screenshots of any interesting patterns you come across!

Brendan

Practice

It is worth adding that the author's latest work, the powerful Differential Flame Graph, has also been integrated into FlameScope. You can now interactively select time ranges from two test groups in FlameScope and compare the sampled events of the two groups.

While using FlameScope, I found and fixed a number of FlameScope bugs, including some that prevented the Differential Flame Graph from running. Afterwards I used it to reproduce some performance problems, and found interesting patterns in some of them.

First, I wanted to analyze the scheduling characteristics of two test groups. I sampled each of them with perf sched record and visualized the data with FlameScope.

Good-performance group

Client


2509688-782ecf08c9198bfc.png


Server


2509688-e04a614547ec29f1.png


Poor-performance group

Client


2509688-1888947971d04fa3.png

Server


2509688-c7f71be47d6e4f48.png

We found that the poor-performance group has a large number of scheduling events, and that they occur very uniformly. The good-performance group is periodically busy for several milliseconds at a time (dark red stripes), and its periodic background tasks are also easy to spot (light red stripes).

This comparison gives us an intuitive feel for the scheduling characteristics of the two test groups. However, analyzing the problem seems to need more information.

So I used the Differential Flame Graph to analyze full call-stack samples from the two test groups.

2509688-3d86013ade789b07.png

This figure gives an important clue: the most significant difference between the two test groups lies on the path vfs_write -> do_sync_write -> sock_aio_write -> inet_sendmsg -> copy_user_enhanced_fast_string. (Note that, because of kernel compiler optimizations, the call path is slightly inaccurate.)

The good-performance test group calls copy_user_enhanced_fast_string many more times, while the poor-performance test group calls it far less often.
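The comparison behind a differential flame graph comes down to per-stack differences in sample counts between two profiles. The sketch below computes only those differences; it is not FlameScope's implementation and does no rendering. The function name, the semicolon-joined stack keys, and the sample counts are all hypothetical.

```python
from collections import Counter

def stack_deltas(baseline, comparison):
    """Return per-stack sample count changes, largest absolute change first."""
    before, after = Counter(baseline), Counter(comparison)
    deltas = {stack: after[stack] - before[stack]
              for stack in before.keys() | after.keys()}
    # Sort by size of change, then by stack name so the order is deterministic.
    return sorted(deltas.items(), key=lambda item: (-abs(item[1]), item[0]))

# Hypothetical sample counts, keyed by semicolon-joined call stacks.
poor = {"vfs_write;do_sync_write;sock_aio_write": 10, "schedule": 90}
good = {"vfs_write;do_sync_write;sock_aio_write": 60, "schedule": 60}

for stack, delta in stack_deltas(poor, good):
    print(f"{delta:+d} {stack}")
```

In this made-up data, the write path gains 50 samples in the good group and sorts to the top, which is the kind of signal that stands out in the differential flame graph.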

The work after this point has little to do with FlameScope. This was my hands-on trial of FlameScope as a tool for testing and performance tuning. This software, led by Brendan Gregg, is remarkably good at making performance data intuitive to interpret.



Author: chatter chatter chatter


This article is original content of the Yunqi community and may not be reproduced without permission.


Origin blog.csdn.net/weixin_34279246/article/details/90920453