【Perfetto】Getting started with Perfetto from scratch

Background: I was investigating a video stuttering problem. To determine whether it was caused by CPU load or by audio/video encoding and decoding, I started using Perfetto. I found it fun and wanted to learn more about it. It also made me appreciate my company: it gives newcomers a lot of room to grow. I can fix bugs, work with new technologies, and learn at the same time, and whenever I have questions, everyone is happy to help.

Overview

Perfetto - System Analysis, Application Tracing and Trace Analysis

Perfetto is a production-grade open-source stack for performance instrumentation and trace analysis. It offers services and libraries for recording system-level and app-level traces, native and Java heap profiling, a library for analyzing traces using SQL, and a web-based UI for visualizing and exploring multi-GB traces.

Recording traces

At its core, Perfetto introduces a novel userspace-to-userspace tracing protocol based on direct protobuf serialization onto a shared memory buffer. The tracing protocol is used internally by the built-in data sources and is exposed to C++ applications through the Tracing SDK and the Track Event library.
This tracing protocol allows all aspects of tracing to be configured dynamically through an extensible, protobuf-based mechanism for capability advertisement and data source configuration (see the trace configuration documentation). Different data sources can be multiplexed onto different subsets of user-defined buffers, which also allows arbitrarily long traces to be streamed to the file system.

System-wide tracing on Android and Linux

On Linux and Android, Perfetto comes bundled with a number of data sources that collect detailed performance data from different system interfaces. For the complete set and details, see the Data Sources section of the documentation. Some examples:

  • Kernel tracing: Perfetto integrates with Linux's ftrace and allows kernel events (e.g. scheduling events, system calls) to be recorded into the trace.
  • /proc and /sys pollers, which allow sampling the state of process-wide or system-wide CPU and memory counters over time.
  • Integration with Android HAL modules for recording battery and energy-usage counters.
  • Native heap profiling: a low-overhead heap profiler that hooks malloc/free/new/delete and associates memory with call stacks. It is based on out-of-process unwinding, supports configurable sampling, and can be attached to an already running process.
  • Java heap dumps, captured with an out-of-process profiler that is tightly integrated with the Android runtime. It takes a full snapshot of the managed heap retention graph (types, field names, retained sizes, and references to other objects) without dumping the full heap contents (strings and bitmaps), which reduces serialization time and output file size.

On Android, Perfetto is the next-generation system tracing system and replaces the Chromium-based systrace. ATrace-based instrumentation is still fully supported. For more details, see the Android developer documentation.

Tracing SDK and userspace instrumentation

The Perfetto Tracing SDK enables C++ developers to enrich traces with application-specific trace points. You can define your own strongly typed events and create custom data sources, or you can use the simpler Track Event library, which makes it easy to create time-bounded slices, counters, and time markers using annotations of the form TRACE_EVENT("category", "event_name", "x", "str", "y", 42).
The SDK is designed for tracing multi-process systems and multi-threaded processes. It is based on ProtoZero, a library for writing protobuf events directly into thread-local shared memory buffers.
The same code can work in fully in-process mode, hosting an instance of the Perfetto tracing service on a dedicated thread, or in system mode, connecting to the Linux/Android tracing daemon through a UNIX socket, which allows app-specific instrumentation points to be combined with system-wide trace events.
The SDK is based on portable C++17 code and tested with the major C++ sanitizers (ASan, TSan, MSan, LSan). It does not rely on runtime code modifications or compiler plugins.

Tracing in Chromium

Perfetto is designed to replace the internals of the chrome://tracing infrastructure. Tracing in Chromium and its internals are based on the Perfetto codebase on all major platforms (Android, CrOS, Linux, macOS, Windows). The same service-based architecture as system-wide tracing applies, but internally the Chromium Mojo IPC system is used instead of Perfetto's own UNIX socket.
By default, tracing works in in-process mode in Chromium and only records data emitted by the Chromium process. On Android (and on Linux if the Chromium sandbox is disabled) tracing can work in a mixed in-process + system mode, combining chrome-specific tracing events with Perfetto system events.

Trace analysis

In addition to trace logging functionality, the Perfetto codebase includes a dedicated project for importing, parsing, and querying old and new trace formats: the Trace Processor.
Trace Processor is a portable C++17 library that provides column-oriented table storage designed to efficiently hold hours of trace data in memory, and it exposes a SQL query interface based on the popular SQLite query engine. The trace data model becomes a set of SQL tables that can be queried and joined in powerful and flexible ways to analyze the trace data.
In addition, Trace Processor includes a trace-based metrics subsystem consisting of pre-baked and extensible queries that can output strongly typed summaries of a trace (e.g. CPU usage at different frequency states, broken down by process and thread).
Trace-based metrics make it easy to integrate traces into performance testing scenarios or batch analysis of large trace corpora.
Trace Processor is also designed for low-latency queries and for building trace visualizers. Today, the Perfetto UI uses Trace Processor as a WebAssembly module, and Android Studio and Android GPU Inspector use it as a native C++ library.

Trace visualization

Perfetto also provides a new trace visualizer for opening and querying hours-long traces, available at ui.perfetto.dev. The new visualizer takes advantage of modern web platform technologies. Its multi-threaded design based on WebWorkers keeps the UI responsive at all times, and the analytical power of Trace Processor and SQLite is fully available in the browser through WebAssembly.
Once it has been opened, the Perfetto UI works completely offline. Traces opened with the UI are processed locally by the browser and do not require any server-side interaction.

Data sources

Memory counters and events

Perfetto allows collecting a large number of memory events and counters on Android and Linux. These events come from kernel interfaces, both ftrace and /proc, and are of two types: counters that are polled, and events that the kernel pushes into the ftrace buffer.

Per-process polled counters

The process stats data source allows polling /proc/<pid>/status and /proc/<pid>/oom_score_adj at user-defined intervals.


SQL

select c.ts, t.name as counter_name, c.value / 1024 as value_kb, p.name as proc_name, p.pid
from counter as c left join process_counter_track as t on c.track_id = t.id
left join process as p using (upid)
where t.name like 'mem.%'

ts               counter_name       value_kb  proc_name            pid
261187015027350  mem.virt           1326464   com.android.vending  28815
261187015027350  mem.rss            85592     com.android.vending  28815
261187015027350  mem.rss.anon       36948     com.android.vending  28815
261187015027350  mem.rss.file       46560     com.android.vending  28815
261187015027350  mem.swap           6908      com.android.vending  28815
261187015027350  mem.rss.watermark  102856    com.android.vending  28815
261187090251420  mem.virt           1326464   com.android.vending  28815
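To see how the joins in this query fit together without loading a real trace, the following Python sketch rebuilds tiny stand-ins for the three tables in an in-memory SQLite database. The schemas are assumptions that keep only the columns the query touches (the real Trace Processor tables have more), the oom_score_adj row is a made-up value added only to show the filter, and the memory values are the sample figures above converted to bytes on the assumption that the counter table stores bytes. The query mirrors the one above and returns the value in KB.

```python
import sqlite3


def build_mock_trace_db():
    """Build an in-memory DB with reduced, assumed versions of the tables."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE process (upid INTEGER PRIMARY KEY, pid INTEGER, name TEXT);
        CREATE TABLE process_counter_track (
            id INTEGER PRIMARY KEY, name TEXT, upid INTEGER);
        CREATE TABLE counter (ts INTEGER, track_id INTEGER, value REAL);
        """
    )
    db.execute("INSERT INTO process VALUES (1, 28815, 'com.android.vending')")
    db.executemany(
        "INSERT INTO process_counter_track VALUES (?, ?, ?)",
        [(1, "mem.virt", 1), (2, "mem.rss", 1), (3, "mem.swap", 1),
         (4, "oom_score_adj", 1)],
    )
    ts = 261187015027350
    db.executemany(
        "INSERT INTO counter VALUES (?, ?, ?)",
        [(ts, 1, 1326464 * 1024), (ts, 2, 85592 * 1024), (ts, 3, 6908 * 1024),
         (ts, 4, 900)],  # hypothetical non-memory counter, filtered out below
    )
    return db


QUERY = """
select c.ts, t.name as counter_name, c.value / 1024 as value_kb,
       p.name as proc_name, p.pid
from counter as c left join process_counter_track as t on c.track_id = t.id
left join process as p using (upid)
where t.name like 'mem.%'
order by t.id
"""


def query_memory_counters(db):
    """Return (ts, counter_name, value_kb, proc_name, pid) rows."""
    return db.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in query_memory_counters(build_mock_trace_db()):
        print(row)
```

Each counter sample is resolved to its track, and the track to its process through upid, so one row carries the timestamp, the counter name, and the owning process.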

Trace configuration

To collect process statistics counters every X milliseconds, set proc_stats_poll_ms = X in the process statistics configuration. X must be greater than 100ms to avoid excessive CPU usage. Details about the specific counters collected can be found in the ProcessStats reference.

data_sources: {
    config {
        name: "linux.process_stats"
        process_stats_config {
            scan_all_processes_on_start: true
            proc_stats_poll_ms: 1000
        }
    }
}
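If the polling interval is chosen by a script, a small helper can render this block and reject intervals that are too short. This is a minimal sketch: process_stats_config is a hypothetical helper, not part of Perfetto, and the assumption that exactly 100 ms is still acceptable goes beyond what the text above states.

```python
def process_stats_config(poll_ms: int, scan_all_on_start: bool = True) -> str:
    """Render a linux.process_stats data source block in protobuf text format.

    Hypothetical helper. The 100 ms lower bound follows the note above;
    accepting exactly 100 ms is an assumption.
    """
    if poll_ms < 100:
        raise ValueError(
            f"proc_stats_poll_ms={poll_ms} is too low; use at least 100 ms"
        )
    scan = "true" if scan_all_on_start else "false"
    return (
        "data_sources: {\n"
        "    config {\n"
        '        name: "linux.process_stats"\n'
        "        process_stats_config {\n"
        f"            scan_all_processes_on_start: {scan}\n"
        f"            proc_stats_poll_ms: {poll_ms}\n"
        "        }\n"
        "    }\n"
        "}\n"
    )


if __name__ == "__main__":
    print(process_stats_config(1000))
```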

Process memory events (ftrace)

rss_stat
Recent versions of the Linux kernel can emit an ftrace event whenever the Resident Set Size (RSS) mm counters change. This is the same counter that is available in /proc/pid/status as VmRSS. The main advantage of this event is that, being an event-driven push mechanism, it can detect very short bursts of memory usage that would otherwise go undetected when polling /proc counters.
Memory usage spikes of hundreds of MB can have a severe negative impact on Android, even if they last only a few milliseconds, because they can trigger a large number of low-memory kills to reclaim memory.
Kernel support for this feature was introduced in commit b3d1411b6 and later improved by e4dcad20. Both are available upstream as of Linux v5.5-rc1, and the patch has been backported to several Google Pixel kernels running Android 10 (Q).
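The difference between polling and push events can be illustrated with a toy simulation. The timeline below is made up for illustration (a 500 MB spike lasting 5 ms), and the function names are hypothetical; nothing here reads a real trace.

```python
from bisect import bisect_right

# Hypothetical RSS timeline: (timestamp in ms, RSS in MB) at each change.
# The process jumps from 100 MB to 600 MB for 5 ms and then drops back.
RSS_CHANGES = [(0, 100), (1203, 600), (1208, 100)]


def rss_at(changes, ts_ms):
    """Return the RSS in effect at ts_ms (the last change at or before it)."""
    idx = bisect_right([ts for ts, _ in changes], ts_ms) - 1
    return changes[idx][1]


def peak_seen_by_polling(changes, period_ms, end_ms):
    """Peak RSS observed when sampling every period_ms from 0 to end_ms."""
    return max(rss_at(changes, ts) for ts in range(0, end_ms + 1, period_ms))


def peak_seen_by_events(changes):
    """Peak RSS observed when every change is pushed as an event."""
    return max(value for _, value in changes)


if __name__ == "__main__":
    print("polling every 1000 ms:", peak_seen_by_polling(RSS_CHANGES, 1000, 2000))
    print("event-driven:", peak_seen_by_events(RSS_CHANGES))
```

With a 1000 ms polling period the samples land at 0, 1000, and 2000 ms, so the spike between 1203 and 1208 ms is never observed, while the event-driven view records it.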

mm_event
mm_event is an ftrace event that captures statistics about key memory events (a subset of the events exposed by /proc/vmstat). Unlike rss_stat counter updates, mm events are so high-volume that tracing them individually is not feasible. Instead, mm_event reports only periodic histograms in the trace, which significantly reduces overhead.
mm_event is only available on certain Google Pixel kernels running Android 10 (Q) and later.
When mm_event is enabled, the following mm event types are recorded:

  • mem.mm.min_flt: Minor page faults
  • mem.mm.maj_flt: Major page faults
  • mem.mm.swp_flt: Page faults served by the swap cache
  • mem.mm.read_io: Read page faults backed by I/O
  • mem.mm…compaction: Memory compaction events
  • mem.mm.reclaim: Memory reclaim events

For each event type, the following values are recorded:

  • count: How many times the event occurred since the previous event.
  • min_lat: The lowest latency (duration of the mm event) recorded since the previous event.
  • max_lat: The highest latency recorded since the previous event.
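The idea of reporting periodic summaries instead of individual events can be sketched as follows. This is not the kernel's implementation: summarize_mm_events is a hypothetical function, fixed time periods stand in for whatever reporting interval the kernel uses, and the input events are invented.

```python
def summarize_mm_events(events, period_ms):
    """Collapse individual events into one record per (period, event type).

    events: iterable of (ts_ms, event_type, latency_us) tuples.
    Returns {(period_index, event_type): {"count", "min_lat", "max_lat"}}.
    """
    summary = {}
    for ts_ms, event_type, latency_us in events:
        key = (ts_ms // period_ms, event_type)
        record = summary.get(key)
        if record is None:
            # First event of this type in this period.
            summary[key] = {
                "count": 1,
                "min_lat": latency_us,
                "max_lat": latency_us,
            }
        else:
            record["count"] += 1
            record["min_lat"] = min(record["min_lat"], latency_us)
            record["max_lat"] = max(record["max_lat"], latency_us)
    return summary


if __name__ == "__main__":
    # Invented sample: five events collapse into three summary records.
    sample = [
        (1, "min_flt", 4), (3, "min_flt", 8), (7, "min_flt", 2),
        (12, "min_flt", 5), (13, "swp_flt", 9),
    ]
    for key, record in sorted(summarize_mm_events(sample, 10).items()):
        print(key, record)
```

However many events fall into a period, the trace only has to carry one small record per event type, which is where the overhead reduction comes from.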

SQL
At the SQL level, these events are imported and exposed in the same way as the corresponding polling events. This allows collecting both types of events (push and poll) and processing them uniformly in queries and scripts.

select c.ts, c.value, t.name as counter_name, p.name as proc_name, p.pid
from counter as c left join process_counter_track as t on c.track_id = t.id
left join process as p using (upid)
where t.name like 'mem.%'

ts               value     counter_name            proc_name                          pid
777227867975055  18358272  mem.rss.anon            com.google.android.apps.safetyhub  31386
777227865995315  5         mem.mm.min_flt.count    com.google.android.apps.safetyhub  31386
777227865995315  8         mem.mm.min_flt.max_lat  com.google.android.apps.safetyhub  31386
777227865995315  4         mem.mm.min_flt.avg_lat  com.google.android.apps.safetyhub  31386
777227865998023  3         mem.mm.swp_flt.count    com.google.android.apps.safetyhub  31386

Trace configuration

data_sources: {
    config {
        name: "linux.ftrace"
        ftrace_config {
            ftrace_events: "kmem/rss_stat"
            ftrace_events: "mm_event/mm_event_record"
        }
    }
}

# This is for getting Thread<>Process associations and full process names.
data_sources: {
    config {
        name: "linux.process_stats"
    }
}

System-wide polled counters

This data source allows periodic polling of the following system data:

  • /proc/stat
  • /proc/vmstat
  • /proc/meminfo

The polling period and the specific counters to include in the trace can be set in the trace configuration.

SQL

select c.ts, t.name, c.value / 1024 as value_kb
from counters as c left join counter_track as t on c.track_id = t.id

ts               name            value_kb
775177736769834  MemAvailable    1708956
775177736769834  Buffers         6208
775177736769834  Cached          1352960
775177736769834  SwapCached      8232
775177736769834  Active          1021108
775177736769834  Inactive(file)  351496

Trace configuration

data_sources: {
    config {
        name: "linux.sys_stats"
        sys_stats_config {
            meminfo_period_ms: 1000
            meminfo_counters: MEMINFO_MEM_TOTAL
            meminfo_counters: MEMINFO_MEM_FREE
            meminfo_counters: MEMINFO_MEM_AVAILABLE

            vmstat_period_ms: 1000
            vmstat_counters: VMSTAT_NR_FREE_PAGES
            vmstat_counters: VMSTAT_NR_ALLOC_BATCH
            vmstat_counters: VMSTAT_NR_INACTIVE_ANON
            vmstat_counters: VMSTAT_NR_ACTIVE_ANON

            stat_period_ms: 1000
            stat_counters: STAT_CPU_TIMES
            stat_counters: STAT_FORK_COUNT
        }
    }
}

Low-memory kills (LMK)

Background
The Android framework kills apps and services, especially background ones, to make room for newly opened apps when memory is needed. These are called low-memory kills (LMK).
Note that LMKs are not always a symptom of a performance problem. As a rule of thumb, the severity (that is, the user-perceived impact) is proportional to the state of the app being killed. The app state can be derived in a trace from its OOM adjustment score.
An LMK of a foreground app or service is typically a big concern: the app the user was interacting with disappears from under their fingers, or their favorite music player service suddenly stops playing.
In contrast, an LMK of a cached app or service is often business as usual, and in most cases the end user will not notice until they try to return to the app, which then cold-starts.
Cases between these two extremes are more nuanced. LMKs of cached apps or services can still be problematic if they occur in storms (that is, most processes are killed within a short period), and they are often a symptom of some component of the system causing memory spikes.
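A storm of this kind can be flagged by grouping kill timestamps that fall close together. The sketch below is one simple way to do it, not something Perfetto provides: find_lmk_storms is a hypothetical function, the one-second window and the three-kill threshold are arbitrary assumptions, and the grouping is a greedy pass anchored on the first kill of each group rather than an exact sliding window.

```python
def find_lmk_storms(kill_ts_ns, window_ns=1_000_000_000, min_kills=3):
    """Group LMK timestamps into bursts and return the large ones.

    A burst holds every kill that happens within window_ns of the burst's
    first kill. Returns a list of (first_ts, last_ts, kill_count) for bursts
    with at least min_kills kills. Both thresholds are assumptions.
    """
    storms = []
    burst = []
    for ts in sorted(kill_ts_ns):
        if burst and ts - burst[0] > window_ns:
            # The current burst is over; keep it only if it is large enough.
            if len(burst) >= min_kills:
                storms.append((burst[0], burst[-1], len(burst)))
            burst = []
        burst.append(ts)
    if len(burst) >= min_kills:
        storms.append((burst[0], burst[-1], len(burst)))
    return storms


if __name__ == "__main__":
    # Invented timestamps: three kills within 45 ms, then one isolated kill.
    kills = [0, 20_000_000, 45_000_000, 10_000_000_000]
    print(find_lmk_storms(kills))
```

The input would typically be the ts column of a query over LMK events in a trace; isolated kills are ignored and only clusters are reported.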
lowmemorykiller vs lmkd
In-kernel lowmemorykiller driver
In Android, LMK used to be handled by a staging kernel driver (drivers/staging/android/lowmemorykiller.c in the Linux tree). This driver emitted the ftrace event lowmemorykiller/lowmemory_kill into the trace.
Userspace lmkd
Android 9 introduced a userspace native daemon, lmkd, that took over LMK responsibilities. Not all devices running Android 9 necessarily use lmkd: the choice between the in-kernel driver and the userspace daemon depends on the phone manufacturer, the kernel version, and the kernel configuration.
On Google Pixel phones, lmkd-side killing has been used since the Pixel 2 running Android 9.
See https://source.android.com/devices/tech/perf/lmkd for details.
lmkd emits a userspace atrace counter event named kill_one_process.
Android LMK vs Linux oomkiller
LMK on Android, whether performed by the old in-kernel lowmemorykiller or the newer lmkd, uses a completely different mechanism from the standard Linux kernel's OOM Killer. Perfetto currently supports only Android LMK events (both in-kernel and userspace) and does not support tracing Linux kernel OOM Killer events. Linux OOM Killer events are still theoretically possible on Android, but they are extremely unlikely; if they occur, they are most likely a symptom of a misconfigured BSP.
UI
The newer userspace LMKs are available in the UI as a counter under the lmkd track. The counter value is the PID of the killed process (for example, a value of 27985 means that the process with PID 27985 was killed).
SQL
Both the newer lmkd events and the older in-kernel lowmemorykiller events are normalized at import time and are available under the mem.lmk key in the instant table.

SELECT ts, process.name, process.pid 
FROM instant 
JOIN process_track ON instant.track_id = process_track.id
JOIN process USING (upid)
WHERE instant.name = 'mem.lmk'

ts               name                      pid
442206415875043  roid.apps.turbo           27324
442206446142234  android.process.acore     27683
442206462090204  com.google.process.gapps  28198

Trace configuration

data_sources: {
    config {
        name: "linux.ftrace"
        ftrace_config {
            # For old in-kernel events.
            ftrace_events: "lowmemorykiller/lowmemory_kill"

            # For new userspace lmkds.
            atrace_apps: "lmkd"

            # This is not strictly required but is useful to know the state
            # of the process (FG, cached, ...) before it got killed.
            ftrace_events: "oom/oom_score_adj_update"
        }
    }
}

App states and OOM adjustment scores

The Android app state can be inferred in a trace from the process's oom_score_adj. The mapping is not 1:1: there are more states than oom_score_adj value groups, and the oom_score_adj of cached processes ranges from 900 to 1000.

// This is a process only hosting activities that are not visible,
// so it can be killed without any disruption.
static final int CACHED_APP_MAX_ADJ = 999;
static final int CACHED_APP_MIN_ADJ = 900;

// This is the oom_adj level that we allow to die first. This cannot be equal to
// CACHED_APP_MAX_ADJ unless processes are actively being assigned an oom_score_adj of
// CACHED_APP_MAX_ADJ.
static final int CACHED_APP_LMK_FIRST_ADJ = 950;

// The B list of SERVICE_ADJ -- these are the old and decrepit
// services that aren't as shiny and interesting as the ones in the A list.
static final int SERVICE_B_ADJ = 800;

// This is the process of the previous application that the user was in.
// This process is kept above other things, because it is very common to
// switch back to the previous app.  This is important both for recent
// task switch (toggling between the two top recent apps) as well as normal
// UI flow such as clicking on a URI in the e-mail app to view in the browser,
// and then pressing back to return to e-mail.
static final int PREVIOUS_APP_ADJ = 700;

// This is a process holding the home application -- we want to try
// avoiding killing it, even if it would normally be in the background,
// because the user interacts with it so much.
static final int HOME_APP_ADJ = 600;

// This is a process holding an application service -- killing it will not
// have much of an impact as far as the user is concerned.
static final int SERVICE_ADJ = 500;

// This is a process with a heavy-weight application.  It is in the
// background, but we want to try to avoid killing it.  Value set in
// system/rootdir/init.rc on startup.
static final int HEAVY_WEIGHT_APP_ADJ = 400;

// This is a process currently hosting a backup operation.  Killing it
// is not entirely fatal but is generally a bad idea.
static final int BACKUP_APP_ADJ = 300;

// This is a process bound by the system (or other app) that's more important than services but
// not so perceptible that it affects the user immediately if killed.
static final int PERCEPTIBLE_LOW_APP_ADJ = 250;

// This is a process only hosting components that are perceptible to the
// user, and we really want to avoid killing them, but they are not
// immediately visible. An example is background music playback.
static final int PERCEPTIBLE_APP_ADJ = 200;

// This is a process only hosting activities that are visible to the
// user, so we'd prefer they don't disappear.
static final int VISIBLE_APP_ADJ = 100;

// This is a process that was recently TOP and moved to FGS. Continue to treat it almost
// like a foreground app for a while.
// @see TOP_TO_FGS_GRACE_PERIOD
static final int PERCEPTIBLE_RECENT_FOREGROUND_APP_ADJ = 50;

// This is the process running the current foreground app.  We'd really
// rather not kill it!
static final int FOREGROUND_APP_ADJ = 0;

// This is a process that the system or a persistent process has bound to,
// and indicated it is important.
static final int PERSISTENT_SERVICE_ADJ = -700;

// This is a system persistent process, such as telephony.  Definitely
// don't want to kill it, but doing so is not completely fatal.
static final int PERSISTENT_PROC_ADJ = -800;

// The system process runs at the default adjustment.
static final int SYSTEM_ADJ = -900;

// Special code for native processes that are not being managed by the system (so
// don't have an oom adj assigned by the system).
static final int NATIVE_ADJ = -1000;
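
When reading oom_score_adj values out of a trace, it can help to translate a raw score into the nearest named level from the constants above. The following Python sketch does that under one simplifying assumption that is not taken from the Android sources: a score that falls between two constants is attributed to the closest constant at or below it. describe_oom_score_adj is a hypothetical helper, and, as noted earlier, the mapping from scores to app states is not 1:1.

```python
# Thresholds copied from the constants above, highest first.
_ADJ_LEVELS = [
    (900, "CACHED_APP"),
    (800, "SERVICE_B"),
    (700, "PREVIOUS_APP"),
    (600, "HOME_APP"),
    (500, "SERVICE"),
    (400, "HEAVY_WEIGHT_APP"),
    (300, "BACKUP_APP"),
    (250, "PERCEPTIBLE_LOW_APP"),
    (200, "PERCEPTIBLE_APP"),
    (100, "VISIBLE_APP"),
    (50, "PERCEPTIBLE_RECENT_FOREGROUND_APP"),
    (0, "FOREGROUND_APP"),
    (-700, "PERSISTENT_SERVICE"),
    (-800, "PERSISTENT_PROC"),
    (-900, "SYSTEM"),
    (-1000, "NATIVE"),
]


def describe_oom_score_adj(adj: int) -> str:
    """Return the named level whose constant is the closest one <= adj.

    Bucketing in-between scores this way is an assumption made for
    readability, not the framework's own logic.
    """
    for threshold, label in _ADJ_LEVELS:
        if adj >= threshold:
            return label
    raise ValueError(f"oom_score_adj {adj} is below the lowest known level")


if __name__ == "__main__":
    for score in (950, 200, 0, -1000):
        print(score, describe_oom_score_adj(score))
```

Combined with the LMK events, this gives a quick severity hint: a kill at a score of 900 or above hit a cached process, while a kill near 0 hit something the user was actively using.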

Reference: the official Perfetto documentation

Origin blog.csdn.net/weixin_43233219/article/details/132348350