[Performance optimization] Understanding and using Simpleperf

Introduction

Simpleperf is a CPU profiling tool for the Android platform. It can profile both app processes and native processes, and it handles Java as well as C++ code. The simpleperf executable runs on Android L and later, and the Python scripts work with Android N and later. (Translated from the documentation on googlesource.)

Overview

The Simpleperf directory contains two parts: the simpleperf executable and the Python scripts. The scripts wrap the executable and call it for you, which makes the basic operations easier to get through. This article only covers introductory use of the Python scripts. (You can also use the simpleperf executable directly, but then you need to push it to the device manually and learn its command syntax.)
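To make "the scripts wrap the executable" a bit more concrete, here is a minimal sketch of the kind of adb command such a wrapper might assemble. The device paths and the helper name build_record_command are assumptions for illustration only; the real scripts do considerably more (pushing the binary, pulling perf.data, collecting binaries).

```python
import shlex

# Assumed locations on the device; illustrative, not taken from the scripts.
DEVICE_SIMPLEPERF = "/data/local/tmp/simpleperf"
DEVICE_PERF_DATA = "/data/local/tmp/perf.data"


def build_record_command(package,
                         record_options="-e task-clock:u -f 1000 -g --duration 10"):
    """Return the adb argv a thin wrapper might run to record one app."""
    device_cmd = [DEVICE_SIMPLEPERF, "record", "--app", package]
    device_cmd += shlex.split(record_options)   # split the option string into argv items
    device_cmd += ["-o", DEVICE_PERF_DATA]      # where the recording is written on the device
    return ["adb", "shell"] + device_cmd


if __name__ == "__main__":
    print(" ".join(build_record_command("com.example.android.myapp")))
```

This only builds the command; it does not talk to a device, so it is safe to run on a machine without adb.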

Install

Method 1: simpleperf ships with the NDK. If you have Android Studio installed, you can download and install the NDK component from within Android Studio; it is placed under the SDK directory as ndk-bundle. You can also download the NDK separately.

Method 2: download the simpleperf archive separately (choose the one that matches your NDK version).

Look at the root directory of simpleperf. The simpleperf executables are in the bin directory, with builds provided for several platforms. If you only work on Windows, the android and windows directories are enough.

Recording scripts

Let's go back to the scripts, which fall into three categories by function:

Function          Scripts
Record            app_profiler.py, run_simpleperf_without_usb_connection.py, inferno, run_simpleperf_on_device.py
Generate report   report.py, report_sample.py
Parse data        simpleperf_report_lib.py
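The last entry, simpleperf_report_lib.py, is a library rather than a command. As a rough sketch of how it can be used, the function below sums sample periods per symbol. It only relies on GetNextSample(), GetSymbolOfCurrentSample(), sample.period and symbol.symbol_name, as I understand them from the official scripts reference; treat those names as assumptions and verify them against your simpleperf version. The function name samples_per_symbol is mine.

```python
from collections import Counter


def samples_per_symbol(lib):
    """Sum sample periods per symbol name from a ReportLib-like object."""
    totals = Counter()
    while True:
        sample = lib.GetNextSample()
        if sample is None:          # no more samples in the recording
            break
        symbol = lib.GetSymbolOfCurrentSample()
        totals[symbol.symbol_name] += sample.period
    return totals


# Hypothetical usage, run from the simpleperf scripts directory:
#   from simpleperf_report_lib import ReportLib
#   lib = ReportLib()
#   lib.SetRecordFile("perf.data")
#   for name, period in samples_per_symbol(lib).most_common(10):
#       print(period, name)
#   lib.Close()
```

Passing the library object in, instead of importing it inside the function, keeps the counting logic testable without a device or a perf.data file.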

So how are they used? Let's look at the official usage first, then try it ourselves later.

app_profiler.py

Record cpu profiling data of an android app or native program.
It downloads simpleperf on device, uses it to collect profiling data on the selected app,
and pulls profiling data and related binaries on host.

See the app_profiler.py example on the official website.

View the help text:

$ python .\app_profiler.py --help

usage: app_profiler.py [-h]
                       (-p APP | -np NATIVE_PROGRAM | -cmd CMD | --pid PID [PID ...] | --tid TID [TID ...] | --system_wide)
                       [--compile_java_code] [-a ACTIVITY | -t TEST]
                       [-r RECORD_OPTIONS] [-lib NATIVE_LIB_DIR]
                       [-o PERF_DATA_PATH] [-nb] [--ndk_path NDK_PATH]
                       [--disable_adb_root] [--log {debug,info,warning}]

app_profiler.py: Record cpu profiling data of an android app or native program.

    It downloads simpleperf on device, uses it to collect profiling data on the selected app,
    and pulls profiling data and related binaries on host.

optional arguments:
  -h, --help            show this help message and exit

Select profiling target:
  -p APP, --app APP     Profile an Android app, given the package name. Like
                        `-p com.example.android.myapp`.
  -np NATIVE_PROGRAM, --native_program NATIVE_PROGRAM
                        Profile a native program running on the Android
                        device. Like `-np surfaceflinger`.
  -cmd CMD              Profile running a command on the Android device. Like
                        `-cmd "pm -l"`.
  --pid PID [PID ...]   Profile native processes running on device given their
                        process ids.
  --tid TID [TID ...]   Profile native threads running on device given their
                        thread ids.
  --system_wide         Profile system wide.

Extra options for profiling an app:
  --compile_java_code   Used with -p. On Android N and Android O, we need to
                        compile Java code into native instructions to profile
                        Java code. Android O also needs wrap.sh in the apk to
                        use the native instructions.
  -a ACTIVITY, --activity ACTIVITY
                        Used with -p. Profile the launch time of an activity
                        in an Android app. The app will be started or
                        restarted to run the activity. Like `-a
                        .MainActivity`.
  -t TEST, --test TEST  Used with -p. Profile the launch time of an
                        instrumentation test in an Android app. The app will
                        be started or restarted to run the instrumentation
                        test. Like `-t test_class_name`.

Select recording options:
  -r RECORD_OPTIONS, --record_options RECORD_OPTIONS
                        Set recording options for `simpleperf record` command.
                        Use `run_simpleperf_on_device.py record -h` to see all
                        accepted options. Default is "-e task-clock:u -f 1000
                        -g --duration 10".
  -lib NATIVE_LIB_DIR, --native_lib_dir NATIVE_LIB_DIR
                        When profiling an Android app containing native
                        libraries, the native libraries are usually stripped
                        and lake of symbols and debug information to provide
                        good profiling result. By using -lib, you tell
                        app_profiler.py the path storing unstripped native
                        libraries, and app_profiler.py will search all shared
                        libraries with suffix .so in the directory. Then the
                        native libraries will be downloaded on device and
                        collected in build_cache.
  -o PERF_DATA_PATH, --perf_data_path PERF_DATA_PATH
                        The path to store profiling data. Default is
                        perf.data.
  -nb, --skip_collect_binaries
                        By default we collect binaries used in profiling data
                        from device to binary_cache directory. It can be used
                        to annotate source code and disassembly. This option
                        skips it.

Other options:
  --ndk_path NDK_PATH   Set the path of a ndk release. app_profiler.py needs
                        some tools in ndk, like readelf.
  --disable_adb_root    Force adb to run in non root mode. By default,
                        app_profiler.py will try to switch to root mode to be
                        able to profile released Android apps.
  --log {debug,info,warning}
                        set log level
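As a small illustration of how the options above fit together, here is a sketch that builds an app_profiler.py argument list from the flags shown in the help text (-p, -a, -r, -lib, -o). The helper name app_profiler_args and its parameter names are hypothetical; it assumes app_profiler.py is in the current directory.

```python
import shlex
import sys


def app_profiler_args(package, activity=None, record_options=None,
                      native_lib_dir=None, output="perf.data"):
    """Build an argv list for app_profiler.py from options shown in its help."""
    args = [sys.executable, "app_profiler.py", "-p", package]
    if activity:
        args += ["-a", activity]          # profile the launch of this activity
    if record_options:
        args += ["-r", record_options]    # one argument, forwarded to `simpleperf record`
    if native_lib_dir:
        args += ["-lib", native_lib_dir]  # directory holding unstripped .so files
    args += ["-o", output]
    return args


if __name__ == "__main__":
    # shlex.join (Python 3.8+) prints a copy-pasteable POSIX shell form.
    print(shlex.join(app_profiler_args(
        "com.example.android.myapp",
        activity=".MainActivity",
        record_options="-e task-clock:u -f 1000 -g --duration 10")))
```

Keeping the -r value as a single list element matters: the whole option string has to reach app_profiler.py as one argument.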

run_simpleperf_on_device.py


This script pushes the simpleperf executable to the device and runs a simpleperf command there. It is more convenient than running adb commands manually.

See the official example: Profile from launch of an application.

Check the help text. Its subcommands are essentially those of simpleperf itself; run simpleperf -h to compare.

$ python .\run_simpleperf_on_device.py -h

Usage: simpleperf [common options] subcommand [args_for_subcommand]
common options:
    -h/--help     Print this help information.
    --log <severity> Set the minimum severity of logging. Possible severities
                     include verbose, debug, warning, info, error, fatal.
                     Default is info.
    --log-to-android-buffer  Write log to android log buffer instead of stderr.
    --version     Print version of simpleperf.
subcommands:
    api-collect         Collect recording data generated by app api
    api-prepare         Prepare recording via app api
    debug-unwind        Debug/test offline unwinding.
    dump                dump perf record file
    help                print help information for simpleperf
    inject              parse etm instruction tracing data
    kmem                collect kernel memory allocation information
    list                list available event types
    record              record sampling info in perf.data
    report              report sampling information in perf.data
    report-sample       report raw sample information in perf.data
    stat                gather performance counter information
    trace-sched         Trace system-wide process runtime events.

run_simpleperf_without_usb_connection.py


run_simpleperf_without_usb_connection.py records profiling data while the USB cable isn't connected. api_profiler.py may be more suitable, since it also doesn't need a USB cable while recording.

See the official example for run_simpleperf_without_usb_connection.py.

Read the help text:

$ python .\run_simpleperf_without_usb_connection.py -h
usage: run_simpleperf_without_usb_connection.py [-h] {start,stop} ...

    Support profiling without usb connection in below steps:
    1. With usb connection, start simpleperf recording.
    2. Unplug the usb cable and play the app you want to profile, while the process of
       simpleperf keeps running and collecting samples.
    3. Replug the usb cable, stop simpleperf recording and pull recording file on host.

    Note that recording is stopped once the app is killed. So if you restart the app
    during profiling time, simpleperf only records the first running.

positional arguments:
  {start,stop}
    start       Start recording.
    stop        Stop recording.

optional arguments:
  -h, --help    show this help message and exit

In other words: run start while USB is still connected, unplug the cable and use the app, then reconnect USB, run stop, and pull the collected data. This suits scenarios where the device should not be connected over USB during the measurement, for example to avoid charging. Note that if the app is killed along the way, data collection ends automatically.
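The start/unplug/replug/stop sequence can be sketched as a small driver. Note the assumptions: the -p option passed to start comes from the official example and is not visible in the top-level help shown above (check start -h on your version), and the function names here are made up for illustration.

```python
import subprocess
import sys

SCRIPT = "run_simpleperf_without_usb_connection.py"


def start_command(package):
    # `-p` is an assumption: it does not appear in the top-level help output.
    return [sys.executable, SCRIPT, "start", "-p", package]


def stop_command():
    return [sys.executable, SCRIPT, "stop"]


def record_unplugged(package, wait_for_user=input, run=subprocess.check_call):
    """Start with USB connected, pause while the user runs the app unplugged, then stop."""
    run(start_command(package))
    wait_for_user("Unplug USB, use the app, replug USB, then press Enter... ")
    run(stop_command())
```

The run and wait_for_user parameters exist so the sequence can be dry-run without a device; by default they execute the commands and wait for Enter.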

View the report

The previous section covered how to run the scripts that collect data. Once we have the data, it can be viewed and analyzed in several forms; the official website lists many options.

Let's start with the one that ships with simpleperf; I haven't tried the other methods. Its purpose is to help analyze function time consumption graphically.

report_html.py

report_html.py generates report.html based on the profiling data. Then the report.html can show the profiling result without depending on other files. So it can be shown in local browsers or passed to other machines. Depending on which command-line options are used, the content of the report.html can include: chart statistics, sample table, flamegraphs, annotated source code for each function, annotated disassembly for each function.

In short, it converts perf.data into an HTML report, which can include flame graphs.

Read the help text:

$ python .\report_html.py -h
usage: report_html.py [-h] [-i RECORD_FILE [RECORD_FILE ...]] [-o REPORT_PATH]
                      [--min_func_percent MIN_FUNC_PERCENT]
                      [--min_callchain_percent MIN_CALLCHAIN_PERCENT]
                      [--add_source_code]
                      [--source_dirs SOURCE_DIRS [SOURCE_DIRS ...]]
                      [--add_disassembly]
                      [--binary_filter BINARY_FILTER [BINARY_FILTER ...]]
                      [--ndk_path NDK_PATH] [--no_browser] [--show_art_frames]
                      [--aggregate-by-thread-name]

report profiling data

optional arguments:
  -h, --help            show this help message and exit
  -i RECORD_FILE [RECORD_FILE ...], --record_file RECORD_FILE [RECORD_FILE ...]
                        Set profiling data file to report. Default is
                        perf.data.
  -o REPORT_PATH, --report_path REPORT_PATH
                        Set output html file. Default is report.html.
  --min_func_percent MIN_FUNC_PERCENT
                        Set min percentage of functions shown in the report.
                        For example, when set to 0.01, only functions taking
                        >= 0.01% of total event count are collected in the
                        report. Default is 0.01.
  --min_callchain_percent MIN_CALLCHAIN_PERCENT
                        Set min percentage of callchains shown in the report.
                        It is used to limit nodes shown in the function
                        flamegraph. For example, when set to 0.01, only
                        callchains taking >= 0.01% of the event count of the
                        starting function are collected in the report. Default
                        is 0.01.
  --add_source_code     Add source code.
  --source_dirs SOURCE_DIRS [SOURCE_DIRS ...]
                        Source code directories.
  --add_disassembly     Add disassembled code.
  --binary_filter BINARY_FILTER [BINARY_FILTER ...]
                        Annotate source code and disassembly only for selected
                        binaries.
  --ndk_path NDK_PATH   Find tools in the ndk path.
  --no_browser          Don't open report in browser.
  --show_art_frames     Show frames of internal methods in the ART Java
                        interpreter.
  --aggregate-by-thread-name
                        aggregate samples by thread name instead of thread id.
                        This is useful for showing multiple perf.data
                        generated for the same app.
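For reference, here is a sketch of assembling a report_html.py invocation from the options listed above. The helper name report_html_args and its parameters are hypothetical; only flags that appear in the help text are used, and the script is assumed to be in the current directory.

```python
import sys


def report_html_args(record_files=("perf.data",), output="report.html",
                     source_dirs=(), add_disassembly=False, open_browser=True):
    """Build an argv list for report_html.py from options shown in its help."""
    args = [sys.executable, "report_html.py", "-i", *record_files, "-o", output]
    if source_dirs:
        # Source annotation needs both the switch and the directories to search.
        args += ["--add_source_code", "--source_dirs", *source_dirs]
    if add_disassembly:
        args.append("--add_disassembly")
    if not open_browser:
        args.append("--no_browser")
    return args


if __name__ == "__main__":
    print(report_html_args(source_dirs=["app/src/main/cpp"], open_browser=False))
```

The path app/src/main/cpp is only a placeholder for wherever your native sources live.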

Example: observing an app's cold start (using Douyin)

  1. Determine the package name and launcher Activity.

PackageName: com.ss.android.ugc.aweme
LauncherActivity: com.ss.android.ugc.aweme/.splash.SplashActivity

Force-stop the app first so that the launch is a cold start: adb shell am force-stop com.ss.android.ugc.aweme

  2. Run the recording command:

python .\app_profiler.py -p com.ss.android.ugc.aweme -a com.ss.android.ugc.aweme.splash.SplashActivity -r "-e task-clock:u -f 100 -g --duration 10"

  3. When the previous step completes, you get a file named perf.data by default (use the -o option to rename it). Then run:

python .\report_html.py

  4. When the analysis completes, you get a file named report.html by default (use the -o option to rename it), and it usually opens automatically.
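If you repeat this often, the steps above can be chained in a short script. This is only a sketch under a few assumptions: adb is on the PATH, the script runs from the simpleperf scripts directory, and the function names are my own.

```python
import subprocess
import sys

PACKAGE = "com.ss.android.ugc.aweme"
ACTIVITY = "com.ss.android.ugc.aweme.splash.SplashActivity"
RECORD_OPTIONS = "-e task-clock:u -f 100 -g --duration 10"


def cold_start_steps(package=PACKAGE, activity=ACTIVITY, record_options=RECORD_OPTIONS):
    """Return the three commands of the walkthrough, in order."""
    return [
        ["adb", "shell", "am", "force-stop", package],          # ensure a cold start
        [sys.executable, "app_profiler.py", "-p", package,
         "-a", activity, "-r", record_options],                 # record -> perf.data
        [sys.executable, "report_html.py"],                     # perf.data -> report.html
    ]


def profile_cold_start(run=subprocess.check_call):
    """Run each step in turn; check_call stops the chain if a step fails."""
    for step in cold_start_steps():
        run(step)


if __name__ == "__main__":
    profile_cold_start()
```

Passing the commands as argument lists sidesteps shell quoting, so the -r value reaches app_profiler.py as one argument regardless of the shell in use.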

Flame graph analysis

Switch to the third tab: Flamegraph (flame graph)

  • Each column represents a call stack, and each box represents a function (one stack frame).

  • The vertical axis represents the depth of the call stack: the higher the column, the deeper the stack, and the lower box is the parent (caller) of the box above it.

  • The horizontal axis spans all the call stacks that were sampled, merged and sorted alphabetically. Note that it does not represent the passage of time.

  • The boxes are drawn in random warm colors.
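The merging described in these points can be sketched in a few lines: identical call paths are folded into one tree, each node counts the samples that passed through it, and siblings are listed alphabetically. This is an illustrative model with made-up sample data, not the code report_html.py actually uses.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.samples = 0      # number of sampled stacks passing through this frame
        self.children = {}    # child frame name -> Node


def build_flame_tree(stacks):
    """Fold call stacks (outermost caller first) into one tree."""
    root = Node("root")
    for stack in stacks:
        root.samples += 1
        node = root
        for frame in stack:
            node = node.children.setdefault(frame, Node(frame))
            node.samples += 1
    return root


def render(node, depth=0, lines=None):
    """Return indented text lines; siblings are sorted alphabetically, not by time."""
    lines = [] if lines is None else lines
    lines.append("  " * depth + f"{node.name} ({node.samples})")
    for name in sorted(node.children):
        render(node.children[name], depth + 1, lines)
    return lines


if __name__ == "__main__":
    samples = [
        ["main", "onCreate", "inflate"],
        ["main", "onCreate", "loadConfig"],
        ["main", "onCreate", "inflate"],
        ["main", "draw"],
    ]
    print("\n".join(render(build_flame_tree(samples))))
```

In a drawn flame graph the sample count of a node becomes the width of its box, which is why width reflects how often a function was on the stack rather than when it ran.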

First find the main thread, then search for the familiar onCreate method; matching methods are highlighted in purple.

Click a function to make it the new base and trace its callees. (Click zoom out to return to the previous level.)

Analyzing function time consumption mainly means looking for "flat top" functions: wide boxes with nothing stacked beyond them, meaning the samples landed in the function itself. The longer a function runs, the more likely it is to be sampled.
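One way to think about a "flat top" is self time versus total time: a function's self count is the number of samples in which it was the innermost frame, while its total count includes samples spent in its callees. The sketch below computes both from a list of call stacks; the data and function names are invented for illustration.

```python
from collections import Counter


def self_and_total(stacks):
    """Count samples per function: as innermost frame (self) and anywhere on the stack (total)."""
    self_counts = Counter()
    total_counts = Counter()
    for stack in stacks:
        if not stack:
            continue
        self_counts[stack[-1]] += 1       # innermost frame was executing when sampled
        for name in set(stack):           # count a recursive function once per sample
            total_counts[name] += 1
    return self_counts, total_counts


def flat_tops(stacks, limit=3):
    """Return the functions with the most self samples."""
    self_counts, _ = self_and_total(stacks)
    return self_counts.most_common(limit)


if __name__ == "__main__":
    samples = [
        ["main", "onCreate", "inflate"],
        ["main", "onCreate", "inflate"],
        ["main", "onCreate"],
        ["main", "draw"],
    ]
    print(flat_tops(samples))
```

A function with a large total but a small self count is mostly waiting on its callees, so the place to look is further along the stack.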

My own take on how to dig further once a function turns out to be slow:

  • Design: is the function's own logic reasonable, and is there recursion?
  • Scheduling: is the thread priority low, and is the thread running on a big core or a little core?
  • Frequency limits: is the CPU frequency being limited? This can be analyzed together with a trace.
  • Blocking: is there a blocking wait, such as waiting on a binder call?
  • Locks: is there lock contention?

Conclusion

The main goal for now is to learn how to capture and collect data with Simpleperf and to identify some obviously time-consuming functions. The official website remains the most comprehensive description. This article links to it in various places for easy jumping; if you need more detail or have other usage needs, check the official website.

Origin blog.csdn.net/lucky_tom/article/details/126885365