Advanced Perfetto Analysis


1. Introduction to Perfetto

Perfetto is a next-generation, platform-level tracing tool introduced in Android Q. It provides a common set of performance instrumentation and trace analysis tools for the Android, Linux and Chrome platforms. At its core is a new userspace-to-userspace tracing protocol that writes captured data into shared memory buffers using protobuf serialization. The protocol is used both to collect the platform's built-in data sources (such as ftrace, atrace and logcat) and, through an SDK and library, to let upper-layer C++ applications add custom instrumentation. Data source capture is configured flexibly and dynamically through an extensible configuration file, and very long traces can be streamed to the file system.

In the current Android implementation, perfetto provides a service and library for recording system-level and application-level traces, a low-overhead native and Java heap profiling tool, a library for analyzing trace files with SQL, and a web-based visualization front end, the Perfetto UI.

Traditional systrace

685eb346af8e5d2b204c53bf20dc49ed.png

Perfetto UI

78c537a0cc693d50bb255f0366a2b5a2.png

Advantages of Perfetto over systrace:

a. Operating, querying, locating, visually analyzing and marking are all quick and convenient;

b. It can record long traces continuously and export them to the file system;

c. It is more extensible: it supports additional ftrace data sources, and parsers and visualizations are easy to update;

d. SQLite is built in, so trace data can be post-processed conveniently with SQL queries.



2. Use and Analysis

1. Capturing a perfetto trace

Use the official command-line workflow. config.pbtx is the trace configuration, which specifies the categories to capture, the duration, the buffer size and other settings. For details, see: https://perfetto.dev/docs/quickstart/android-tracing

adb push config.pbtx /data/local/tmp/config.pbtx

adb shell 'cat /data/local/tmp/config.pbtx | perfetto --txt -c - -o /data/misc/perfetto-traces/trace.perfetto-trace'
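The article does not show the contents of config.pbtx, so the following is only a minimal sketch of what such a file might contain. The helper name build_trace_config and the chosen defaults (duration, buffer size, ftrace events, atrace categories) are illustrative assumptions; the field names follow Perfetto's TraceConfig text format.

```python
def build_trace_config(duration_ms=10000, buffer_kb=65536,
                       ftrace_events=("sched/sched_switch", "sched/sched_wakeup"),
                       atrace_categories=("gfx", "view", "am", "wm")):
    """Render a small Perfetto TraceConfig in protobuf text format.

    All default values are illustrative assumptions, not recommendations
    taken from the article.
    """
    lines = [
        "buffers {",
        f"  size_kb: {buffer_kb}",
        "  fill_policy: RING_BUFFER",
        "}",
        "data_sources {",
        "  config {",
        '    name: "linux.ftrace"',
        "    ftrace_config {",
    ]
    # Kernel scheduling events and userspace atrace categories to enable.
    lines += [f'      ftrace_events: "{event}"' for event in ftrace_events]
    lines += [f'      atrace_categories: "{cat}"' for cat in atrace_categories]
    lines += ["    }", "  }", "}", f"duration_ms: {duration_ms}"]
    return "\n".join(lines) + "\n"


config_text = build_trace_config()
print(config_text)
```

Writing config_text to config.pbtx produces a file that can be pushed with the adb commands above.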



2. UI display

Official website: https://ui.perfetto.dev/#!/

Open the URL above, click Open trace file, and select a locally recorded perfetto trace (ftrace, systrace and other formats are also accepted). The timeline then shows detailed trace information for each process and thread.

When the trace file is larger than about 1 GB, Open trace file runs out of memory and the trace cannot be loaded.

ae259bdbb1a578bdc7a5f64430cf7892.png

In that case, use trace_processor to load the trace. It is best run on Linux; on Windows 10 you can install WSL (Ubuntu 20.04), as described in the appendix.

# Download the official trace_processor

curl -LO https://get.perfetto.dev/trace_processor

chmod +x ./trace_processor

Run the following command to load the perfetto trace file:

./trace_processor --full-sort -D xxx.pftrace

On Windows you can also run it through Python (less stable and uses more memory):

python3 trace_processor --full-sort -D xxx.pftrace

Open https://ui.perfetto.dev/#!/ in Chrome. The UI automatically detects whether trace_processor is already running a local HTTP server (port 9001). In the dialog shown in the figure below, select "YES, use loaded trace" and the UI will use the pftrace file that trace_processor has already loaded.

9ed6f3197829a9d84cb79e738d0152a4.png
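If the dialog does not appear, it can help to check that something is actually listening on port 9001 before reloading the UI. The sketch below only tests whether a TCP connection can be opened; it is not how the Perfetto UI itself performs the detection, and the function name httpd_is_listening is a hypothetical helper.

```python
import socket


def httpd_is_listening(host="127.0.0.1", port=9001, timeout=0.5):
    """Return True if something accepts TCP connections on host:port.

    This is only a reachability check; it does not verify that the
    listener really is trace_processor.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("trace_processor reachable:", httpd_is_listening())
```

When running trace_processor inside WSL, the same check can be run on the Windows side to confirm that the port is reachable from the browser's environment.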

3. Routine analysis

a. Legend

slice (a span on a track; a black border is drawn around it once it is selected)

Corresponds to events recorded in code with Trace.beginSection/ATRACE_BEGIN

4381b3086a47b151cea4237f46e6917a.png

counter (discrete value points): corresponds to events recorded in code with Trace.traceCounter/ATRACE_INT

199c576b3476f6ca09665f8d0ac80826.png

sched/freq (CPU scheduling, frequency)

db940cb147071578f4fe8a47741517ad.png

thread_state (thread state)

Click a thread scheduling segment (Running) above a slice to see which CPU the thread is running on at that moment

834cf79035b1fd29234cd5ab7f56ad40.png

Click the jump icon (539eb43ae302d04d2d2e9a76e2cb5925.png) to go to the corresponding running slice in the CPU scheduling track, where the scheduling latency is shown.

In this example the thread was woken by thread Binder_1754_18 (T) of process system_server (P), and it waited 363us between becoming runnable and actually running. Click the icon (63a83ca0cb99d9d3eb626fc557659312.png) again to return to the original slice. This kind of jumping is more flexible and convenient than in systrace.

Similarly, a Binder call can be analyzed by jumping between the target thread and the original calling thread.

53cdd63116d88e27ecef84909efe6d01.png
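The scheduling latency described above (363us from runnable to running) is simply the gap between the start of a runnable state and the start of the following Running state. A minimal sketch of that arithmetic follows; the ThreadState class and the sample timestamps are made up for illustration, and the state strings 'R', 'S' and 'Running' are assumed to match the values used in the thread_state table.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    ts: int     # start timestamp in ns
    dur: int    # duration in ns
    state: str  # 'R' = runnable, 'S' = sleeping, 'Running' = on a CPU


def scheduling_latencies(states):
    """Yield (running_ts, latency_ns) for each runnable -> Running transition.

    `states` must be the consecutive states of one thread, sorted by ts.
    """
    for prev, cur in zip(states, states[1:]):
        if prev.state == "R" and cur.state == "Running":
            yield cur.ts, cur.ts - prev.ts


# Hypothetical sample: the thread sleeps, is woken, waits 363us, then runs.
states = [
    ThreadState(ts=1_000_000, dur=500_000, state="S"),
    ThreadState(ts=1_500_000, dur=363_000, state="R"),
    ThreadState(ts=1_863_000, dur=200_000, state="Running"),
]
latencies = list(scheduling_latencies(states))
print(latencies)  # [(1863000, 363000)]
```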

b. Add markup

Click the time ruler at the top to add a marker at a point in time. To mark a range, drag with the left mouse button held down or click a slice, then press "shift+m" to add a permanent area marker. Select an existing marker and, in the Current Selection tab that appears at the bottom, you can name it, change its color, or remove it.

21688c0ddc8b2eee9638f0f8b4a30cd8.png

Press "m" to add a temporary area marker. Adding a temporary marker on another area automatically removes the previous one.

c. Lock contention

When you see a lock contention slice, click the monitor contention slice above it to view the call stack where the contention on the object lock occurred. The details below show that the lock is held by the owner thread (Binder:1754_16), which is currently executing serviceDoneExecuting (AMS.java line 16426), and that 2 threads are already waiting for the lock. The current thread is blocked in the getUidState method (AMS.java line 6614).

836535b339eb5762806f10b08dc3a0f8.png

3. SQL query and display

In a Perfetto UI session where a trace is already loaded, type ":" in the search box to switch it to SQL input. SQL can then be used to query and locate specific trace slices.

b4acb012bd6f4c3a4ad7d95c83ff59be.png

Enter the SQL statement and press Enter to get the query results, which are displayed in the table at the bottom. Click each row in the table to jump to the specific slice, and further analyze the problem according to the trace context.

f5b04013b291e2085b0307d37656ebbe.png

If you only need to run SQL against the data, you can also click Query (SQL) in the left navigation bar, enter the SQL statement, and press CTRL + ENTER to execute it.

[Commonly used tables/views and their key fields]

slice table: the individual slices on each horizontal track

ts: slice start timestamp (ns)

dur: slice duration (ns)

track_id: the track (horizontal timeline) the slice belongs to

name: the slice label, corresponding to the method name, tag or other text written to the trace

94ce472c535ade2f73f59a3dd589f677.png

thread_track table: utid is the unique thread identifier used inside the trace, not the thread's real tid

2afcba99de04a0553df6a828fd3ce82e.png

thread table: one row per thread; its utid joins to the utid of the thread_track table

011ede250c51c29d5c92e01fdbdea8c6.png

process table: its upid joins to the upid of the thread table and identifies the process that owns the thread

3f6e58ddb9161562203acabaa31f6224.png

sched_slice: thread scheduling slices (which thread ran on which CPU)

df48e7267882789ac33bb7e01da3e793.png

thread_state: the scheduling segments shown above each thread track, giving the thread's running state

4b0206c2d187e0aec1ff53b2810a4296.png
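The way these tables link together (slice.track_id to thread_track.id, then utid to thread, then upid to process) can be tried out without a trace file. The sketch below builds a tiny in-memory SQLite database with heavily simplified versions of the four tables; the column subsets and all sample rows are assumptions made for illustration, and the real trace processor tables have more columns.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Simplified stand-ins for the trace processor tables (assumed subset of columns).
CREATE TABLE process(upid INTEGER, pid INTEGER, name TEXT);
CREATE TABLE thread(utid INTEGER, tid INTEGER, upid INTEGER, name TEXT);
CREATE TABLE thread_track(id INTEGER, utid INTEGER);
CREATE TABLE slice(slice_id INTEGER, track_id INTEGER, ts INTEGER, dur INTEGER, name TEXT);

INSERT INTO process VALUES (1, 1754, 'system_server'), (2, 2301, 'com.android.systemui');
INSERT INTO thread VALUES (10, 1754, 1, 'main'), (20, 2301, 2, 'main');
INSERT INTO thread_track VALUES (100, 10), (200, 20);
INSERT INTO slice VALUES
  (1, 100,  5000000, 12000000, 'Choreographer#doFrame'),
  (2, 200,  6000000, 18500000, 'Choreographer#doFrame'),
  (3, 200, 30000000,  4000000, 'Choreographer#doFrame');
""")

# slice -> thread_track -> thread -> process, with ns converted to ms.
rows = con.execute("""
SELECT slice.slice_id, thread.tid, process.name, slice.dur / 1e6 AS dur_ms
FROM slice
JOIN thread_track ON slice.track_id = thread_track.id
JOIN thread USING(utid)
JOIN process USING(upid)
WHERE process.name = 'com.android.systemui' AND slice.name LIKE '%doFrame%'
ORDER BY slice.dur DESC
""").fetchall()
print(rows)
# [(2, 2301, 'com.android.systemui', 18.5), (3, 2301, 'com.android.systemui', 4.0)]
```

Note that the result reports the real tid from the thread table, while utid is only used as the join key.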

Some commonly used SQL queries:

1. List all doFrame slices in descending order of duration and keep the top 100

select slice_id,track_id,ts,dur,dur/1e6,name from slice WHERE name like '%doFrame%' order by dur desc limit 100

2. Building on query 1, restrict the process name to systemui, that is, the frames drawn by systemui itself

select slice_id,track_id,ts,dur,dur/1e6,slice.name from slice JOIN thread_track ON slice.track_id = thread_track.id JOIN thread USING(utid) JOIN process USING(upid) WHERE process.name = 'com.android.systemui' and slice.name like '%doFrame%' order by dur desc limit 100

3. Duration of each slice in system_server whose name starts with the keyword OPF:, including the real running time of each slice. A slice may spend part of its time running, part sleeping and part runnable, so the thread_state table is needed to work out how long the thread was actually scheduled during the slice.

select slice_id, track_id, thread.utid, slice.ts, slice.dur, (slice.dur/1e6) as dur_ms,
  (select total(case
      when thread_state.ts < slice.ts then MIN(slice.ts+slice.dur, thread_state.ts+thread_state.dur) - slice.ts
      when (thread_state.ts+thread_state.dur) > (slice.ts+slice.dur) then (slice.ts+slice.dur - MAX(thread_state.ts, slice.ts))
      else thread_state.dur end)
   from thread_state
   where thread_state.utid = thread.utid and thread_state.state = 'Running'
     and thread_state.ts < (slice.ts+slice.dur) and (thread_state.ts+thread_state.dur) > slice.ts)/1e6 as total_running,
  slice.name
from slice JOIN thread_track ON slice.track_id = thread_track.id JOIN thread USING(utid) JOIN process USING(upid)
WHERE process.name = 'system_server' and slice.name like 'OPF:%'
order by slice.dur desc limit 400
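The CASE expression in this query clips each Running interval to the boundaries of the slice and sums the clipped lengths. The same interval arithmetic can be sketched in a few lines of Python; the function names and the sample intervals are illustrative only.

```python
def overlap_ns(a_ts, a_dur, b_ts, b_dur):
    """Length in ns of the intersection of [a_ts, a_ts+a_dur) and [b_ts, b_ts+b_dur)."""
    start = max(a_ts, b_ts)
    end = min(a_ts + a_dur, b_ts + b_dur)
    return max(0, end - start)


def running_time_ns(slice_ts, slice_dur, thread_states):
    """Sum the time the thread spent in state 'Running' inside the slice.

    `thread_states` is an iterable of (ts, dur, state) tuples for one thread.
    """
    return sum(
        overlap_ns(slice_ts, slice_dur, ts, dur)
        for ts, dur, state in thread_states
        if state == "Running"
    )


# Hypothetical thread: runs, sleeps, becomes runnable, runs again.
thread_states = [
    (0, 150, "Running"),
    (150, 100, "S"),
    (250, 100, "R"),
    (350, 200, "Running"),
]
# Slice covers [100, 500): 50 ns of the first run plus 150 ns of the second.
total = running_time_ns(100, 400, thread_states)
print(total)  # 200
```

Comparing total_running with dur_ms in the query result shows at a glance whether a long slice was busy on a CPU or mostly waiting.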

4. Lock contention in system_server; lock_depth is the number of threads competing for the object lock at that time

select count(1) as lock_depth, s.slice_id, s.track_id, s.ts, s.dur, s.dur/1e6 as dur_ms, ctn.otid, s.name
from slice s, (select slice_id, track_id, ts, dur, name, substr(name, 46, instr(name,')')-46) as otid
    from slice t
    WHERE name like 'Lock contention on a monitor lock %'
    order by dur) ctn
JOIN thread_track ON s.track_id = thread_track.id JOIN thread USING(utid) JOIN process USING(upid)
WHERE
    process.name = 'system_server'
    and s.name like 'Lock contention on a monitor lock %'
    and substr(s.name, 46, instr(s.name,')')-46) = ctn.otid
    and ctn.slice_id <> s.slice_id
    and ctn.ts >= s.ts and (ctn.ts+ctn.dur) <= (s.ts+s.dur)
group by s.slice_id
order by s.dur desc
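The substr(name, 46, ...) calls in this query extract the owner thread id that is embedded in the slice name. The exact name format is not shown in the article; the sketch below assumes it is "Lock contention on a monitor lock (owner tid: N)", which is consistent with the offset 46 used in the query. It groups contention slices by owner only, a simplification of the SQL, which additionally requires the waiting slices to be nested in time.

```python
import re
from collections import Counter

# Assumed slice name format, inferred from the offsets used in the SQL above.
_OWNER_RE = re.compile(r"^Lock contention on a monitor lock \(owner tid: (\d+)\)")


def owner_tid(slice_name):
    """Return the owner tid embedded in a lock contention slice name, or None."""
    match = _OWNER_RE.match(slice_name)
    return int(match.group(1)) if match else None


# Hypothetical slice names as they might appear in the slice table.
names = [
    "Lock contention on a monitor lock (owner tid: 1790)",
    "Lock contention on a monitor lock (owner tid: 1790)",
    "Lock contention on a monitor lock (owner tid: 2044)",
    "Choreographer#doFrame",
]
waiters_per_owner = Counter(t for t in map(owner_tid, names) if t is not None)
print(waiters_per_owner)  # Counter({1790: 2, 2044: 1})
```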

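4. Summary

We hope this article helps readers understand how to capture Perfetto traces and analyze them routinely, become familiar with the features of the Perfetto UI, and learn how to analyze Perfetto traces with SQL. As a next step, we suggest reading the code to learn what common trace tags and counters mean, and gradually deepening your understanding of how the system framework works in concrete scenarios, to improve your performance analysis and optimization skills.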

5. Appendix

1. Common keyboard shortcuts

e0e45471107bc0cb6fef9655e7ee897d.png

2. Installing WSL and Ubuntu 20.04 on Windows 10

Open PowerShell as administrator and run:

dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart

After the command completes, restart the computer.

Download the Ubuntu 20.04 package and install it (or run wsl --install -d Ubuntu-20.04):

https://aka.ms/wslubuntu2004

After installation, find Ubuntu 20.04 in the Start menu and click it to launch the Ubuntu shell.

Reference links:

https://docs.microsoft.com/zh-cn/windows/wsl/install-win10#manual-installation-steps

https://docs.microsoft.com/zh-cn/windows/wsl/install-manual

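3. Capturing a perfetto trace on the phone (Traceur app)

A. Enable Developer options, find and tap System Tracing, and turn on the following switches

[Categories] Recommended selection: am, aidl, binder_driver, binder_lock, bionic, freq, gfx, hal, input, res, sched, ss, view, wm.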

86317429a314db7584cab7ef7391abdd.png

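B. Tap [Record trace] to start the test, then tap the System Tracing icon in the notification bar to stop recording.

C. Run the following command to pull the recorded trace files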

adb pull /data/local/traces

Clear the recorded traces (do this before each retest so that old traces are not mixed in):

adb shell "rm -rf /data/local/traces/*"

You can also clear them from Developer options: tap [Clear saved traces] in System Tracing

4. References

https://perfetto.dev/docs/

https://docs.microsoft.com/zh-cn/windows/wsl/install-manual


Origin blog.csdn.net/feelabclihu/article/details/126672666