How to analyze application memory in Android (18), final chapter - Use Perfetto to link memory leaks to call stacks

In the previous two articles, we introduced how to use Android Studio (AS) to view Android's heap memory, and then how to use MAT to do the same. AS covers basic memory analysis needs, but it cannot compare multiple heap dumps against each other, which is why MAT was introduced: it works well for comparing two heaps. Between them, these two tools already solve about 95% of memory problems.

However, in some extreme cases, such as memory leaks caused by multi-threading, these two tools may not be able to pinpoint the problem, that is, the call stack and the calling thread at the leak point.

On Android, how can we locate a leak caused by this kind of multi-threaded call? Here are some tips from experience:

  1. If you can add code, add a field to the leaked object that records the id of the thread that created it. This method is relatively simple, so I won't go into details here.
  2. If you cannot add code, you need to record the Java call stack and the Java heap at the same time. By comparing the two logically over the same time period, you can work out which call stack caused the leak, for example by checking which call stack is invoked most often or allocates the most memory. Sometimes you even have to take differences between time intervals to relate the leaked object to a call stack. These methods require some experience to apply.

This article will focus on the situation where code cannot be added and analyze this extreme situation.

Android Studio can record the Java call stack and dump the Java heap, but it cannot do both at the same time, which makes a comparison on a common timeline impossible. Perfetto offers similar functionality and can record both together.

Next, using Perfetto, we first record the Java call stack and the Java heap simultaneously and compare them logically to obtain the call stack of the leak point. Then we make a differential comparison of both the Java call stack and the Java heap over a chosen time interval to obtain the call stack of the leak point.

Perfetto records stack and heap dump simultaneously

In the article How to analyze application memory in Android (13) - perfetto, we introduced how to use Perfetto. Here we use the regular mode to record the Java heap and the Java call stack at the same time.

adb shell perfetto \
  -c - --txt \
  -o /data/misc/perfetto-traces/trace \
<<EOF

buffers: {
    size_kb: 63488
    fill_policy: DISCARD
}
buffers: {
    size_kb: 2048
    fill_policy: DISCARD
}
data_sources: {
    config {
        name: "android.packages_list"
        target_buffer: 1
    }
}
data_sources: {
    config {
        name: "android.heapprofd"
        target_buffer: 0
        heapprofd_config {
            sampling_interval_bytes: 4096
            process_cmdline: "com.example.test_malloc"
            shmem_size_bytes: 8388608
            block_client: true
            ## Only record the com.android.art heap
            heaps: "com.android.art"
        }
    }
}
## A second data source is added
data_sources: {
    ## Data source configuration
    config {
        ## The name must be "android.java_hprof"
        name: "android.java_hprof"
        ## Target buffer; for its meaning see How to analyze application memory in Android (13)
        target_buffer: 0
        ## java_hprof configuration
        java_hprof_config {
            ## Name of the process to dump: "com.example.test_malloc"
            process_cmdline: "com.example.test_malloc"
        }
    }
}
## Duration changed to 60 s
duration_ms: 60000

EOF

In the above command, we added a new data_source, android.java_hprof, and use it to record the Java heap. The other data_source, android.heapprofd, records the heap allocations of the specified process. Because we do not need the native heap for now, we set heaps to "com.android.art".

For details on the Perfetto configuration file, see: How to analyze application memory in Android (13) - perfetto

Analyze results

Run the above command, then operate the app. After 60 seconds, the result file is written to /data/misc/perfetto-traces/trace. Pull it from the device and open it with https://ui.perfetto.dev/ , as shown below.
[Figure: the recorded trace opened in the Perfetto UI, annotated with marks 1 and 2]

In the picture:

  • Mark 1: the retained size of this path to the GC root.
  • Mark 2: the retained set of this path to the GC root.

Note: The retained size of an object is the amount of memory that would be reclaimed if the object were collected. The retained set of an object is the set of objects that are kept alive by it and would be collected along with it. The retained size of a path is the sum of the retained sizes of the objects on that path, and the retained set of a path is the union of the retained sets of the objects on that path.
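To make the two definitions concrete, here is a minimal Python sketch that computes a retained set and a retained size on a toy object graph. It is not how AS, MAT or Perfetto implement it (they use dominator trees); it simply applies the definition "what becomes unreachable if this object is removed". The graph, the object names and the sizes are made up for illustration.

```python
from collections import deque

def reachable(roots, edges, skip=None):
    """Objects reachable from the GC roots without passing through `skip`."""
    seen = set()
    queue = deque(r for r in roots if r != skip)
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        for child in edges.get(node, ()):
            if child != skip and child not in seen:
                queue.append(child)
    return seen

def retained_set(obj, roots, edges):
    """Objects that become unreachable once `obj` is removed (obj included)."""
    return reachable(roots, edges) - reachable(roots, edges, skip=obj)

def retained_size(obj, roots, edges, self_size):
    """Sum of the self sizes of everything in the retained set of `obj`."""
    return sum(self_size[o] for o in retained_set(obj, roots, edges))

# Hypothetical toy heap: root -> a -> b, a -> c, root -> c
edges = {"root": ["a", "c"], "a": ["b", "c"]}
self_size = {"root": 8, "a": 16, "b": 32, "c": 64}

a_set = retained_set("a", ["root"], edges)
a_size = retained_size("a", ["root"], edges, self_size)
```

In this toy heap, c is not part of the retained set of a, because the root still references it directly after a is gone.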

For how retained size is calculated, see: How to analyze application memory in Android (16) - Use AS to view the Android heap

Note: The configuration above contains a deliberate flaw. Look carefully at the positions of the two diamonds in the picture: one is at the beginning and one is at the end. To analyze the heap dump and the call stack of the leak point together, the two diamonds should be as close to each other as possible. The configuration is therefore adjusted as follows:

adb shell perfetto \
  -c - --txt \
  -o /data/misc/perfetto-traces/trace \
<<EOF

buffers: {
    ## Increase the buffer 1000x; otherwise the Perfetto UI fails to parse the trace
    size_kb: 63488000
    fill_policy: DISCARD
}
buffers: {
    size_kb: 2048
    fill_policy: DISCARD
}
data_sources: {
    config {
        name: "android.packages_list"
        target_buffer: 1
    }
}
data_sources: {
    config {
        name: "android.heapprofd"
        target_buffer: 0
        heapprofd_config {
            sampling_interval_bytes: 4096
            process_cmdline: "com.example.test_malloc"
            shmem_size_bytes: 8388608
            heaps: "com.android.art"
            continuous_dump_config {
                ## The first dump happens after 10 s
                dump_phase_ms: 10000
                ## Then dump every 10 s
                dump_interval_ms: 10000
            }
        }
    }
}

data_sources: {
    config {
        name: "android.java_hprof"
        target_buffer: 0
        java_hprof_config {
            process_cmdline: "com.example.test_malloc"
            continuous_dump_config {
                ## The first dump happens after 10 s
                dump_phase_ms: 10000
                ## Then dump every 10 s
                dump_interval_ms: 10000
            }
        }
    }
}
## Total duration becomes 30 s
duration_ms: 30000

EOF

Note: When dumping the heap here, you need to start the APP first and then run Perfetto.

The result is as follows:
[Figure: Perfetto UI timeline of the new recording]

Adjust the timeline and zoom in on the two overlapping diamonds, as shown below:
[Figure: zoomed-in timeline showing the two overlapping diamonds]

In the figure, click the second diamond to display the flame graph from the GC root.

Then came a bolt from the blue: the Java heap dump that Perfetto pulled from my Pixel 3 did not compute the reference chains correctly, so the flame graph did not correctly reflect the leaked memory. In-depth analysis showed that the reference chain between a ClassLoader and the objects it loaded was not handled correctly. As a result, some objects that were reachable from a GC root were treated as unreachable, so objects that had actually leaked were reported as not leaked.

Our goal is to collect the call stack and the heap dump at the same time for logical analysis, so we can sidestep this problem and query the database directly.

Heap dump database table

A Java heap dump involves only three tables:

  • heap_graph_reference: stores references
  • heap_graph_object: stores objects
  • heap_graph_class: stores classes

To show the structure of these tables visually, we next export the trace file to a database and then view it with a database UI tool.

Export database

Use the following command to export the database.

./trace_processor /Users/biaowan/Documents/trace_single_conti -e  ~/Documents/trace_to_sqlite.db

This experiment uses the community edition of DBeaver to view the database. Open the exported database, trace_to_sqlite.db, as shown below:

[Figure: trace_to_sqlite.db opened in DBeaver]

Table description

Before using the tables, here is an explanation of each column:

heap_graph_reference

  • id: the unique id of this reference
  • type: the name of this table, namely heap_graph_reference
  • reference_set_id: the id of the reference set this reference belongs to. For the object that holds this reference, reference_set_id in heap_graph_object equals this value.
  • owned_id: the id of the referenced object, that is, an id in heap_graph_object
  • owner_id: the id of the object that holds this reference
  • field_name: the name of the field holding this reference
  • field_type_name: the type name of that field
  • deobfuscated_field_name: the field name after deobfuscation

heap_graph_object

  • id: the id of this object
  • type: the name of this table
  • upid: the id of the process (Perfetto's unique process id)
  • graph_sample_ts: the sampling time, that is, the time at which this object was dumped
  • self_size: the object's own size
  • native_size: the native size
  • reference_set_id: the id of the set of references this object holds to other objects
  • reachable: whether the object is reachable from a GC root. If it is reachable it cannot be collected, otherwise it can (this field is affected by the bug described above)
  • type_id: the id of the class of this object
  • root_type: if not empty, this object is a GC root

heap_graph_class

  • id: the id of this class
  • type: the name of this table
  • name: the name of this class
  • deobfuscated_name: the name of this class after deobfuscation
  • location: where this class is located
  • classloader_id: the id of the class loader, which is an id in heap_graph_object
  • superclass_id: the id of the parent class, which is an id in this table
  • kind: the kind of class

With these table descriptions in hand, we can write SQL queries to inspect the objects in the heap as needed.
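As a rough illustration of how the three tables relate, the following Python sketch builds a tiny in-memory SQLite database with a simplified subset of the columns described above and joins them to print "owner class, field, referenced class". The rows, ids and class names (com.example.Holder) are invented for the example; the real exported database has more columns and far more rows.

```python
import sqlite3

def build_demo_db():
    """Create a toy database with a simplified subset of the heap graph columns."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE heap_graph_class (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE heap_graph_object (
            id INTEGER PRIMARY KEY, graph_sample_ts INTEGER,
            self_size INTEGER, reference_set_id INTEGER, type_id INTEGER);
        CREATE TABLE heap_graph_reference (
            id INTEGER PRIMARY KEY, reference_set_id INTEGER,
            owner_id INTEGER, owned_id INTEGER, field_name TEXT);
        -- Hypothetical data: one Holder object that references one int[]
        INSERT INTO heap_graph_class VALUES (1, 'com.example.Holder'), (2, 'int[]');
        INSERT INTO heap_graph_object VALUES (10, 100, 16, 0, 1), (11, 100, 4096, NULL, 2);
        INSERT INTO heap_graph_reference VALUES (0, 0, 10, 11, 'com.example.Holder.data');
    """)
    return db

def describe_references(db):
    """Resolve every reference to (owner class, field, owned class, owned size)."""
    return db.execute("""
        SELECT owner_cls.name, ref.field_name, owned_cls.name, owned.self_size
        FROM heap_graph_reference AS ref
        JOIN heap_graph_object AS owner ON owner.id = ref.owner_id
        JOIN heap_graph_object AS owned ON owned.id = ref.owned_id
        JOIN heap_graph_class AS owner_cls ON owner_cls.id = owner.type_id
        JOIN heap_graph_class AS owned_cls ON owned_cls.id = owned.type_id
        ORDER BY ref.id
    """).fetchall()

references = describe_references(build_demo_db())
```

The same join can be run in DBeaver against the exported database by copying only the SELECT statement.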

Next, we briefly check which objects occupy the most memory in the second heap dump.

Check the memory usage in the second heap

Note: Why use the second heap dump? In our recording, the first dump is taken ten seconds after Perfetto starts running, and the next one ten seconds after that. The first dump is not used because the time gap between its call stack sample and its heap dump is large.

Pick the correct timestamp

Use the SQL statement as follows:

select * from heap_graph_object group by graph_sample_ts;

[Figure: query result with one row per graph_sample_ts]

The figure shows three sampling times. They correspond to the three heap dump diamonds on the timeline; the remaining diamonds are call stack samples.

What we want to analyze is the second sampling time: 398831516584184
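If you prefer to pick the timestamp from a script instead of reading it off the DBeaver grid, a sketch along the following lines should work against the exported SQLite file. The table here is a toy stand-in with invented timestamps (100, 200, 300); only the column names graph_sample_ts and self_size come from the real schema.

```python
import sqlite3

def sample_timestamps(db):
    """One row per heap dump: (timestamp, object count, total self size)."""
    return db.execute(
        "SELECT graph_sample_ts, COUNT(*), SUM(self_size) "
        "FROM heap_graph_object "
        "GROUP BY graph_sample_ts ORDER BY graph_sample_ts"
    ).fetchall()

# Toy data standing in for the exported trace database.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE heap_graph_object "
    "(id INTEGER, graph_sample_ts INTEGER, self_size INTEGER)"
)
db.executemany(
    "INSERT INTO heap_graph_object VALUES (?, ?, ?)",
    [(1, 100, 16), (2, 100, 32), (3, 200, 16),
     (4, 200, 64), (5, 200, 8), (6, 300, 4)],
)

dumps = sample_timestamps(db)
second_ts = dumps[1][0]  # the second heap dump, as used in this article
```

For the real file, replace the in-memory connection with sqlite3.connect on the path of the exported database.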

Group the objects in the heap dump at that timestamp by class, sum their sizes, and sort the result in descending order.

select class.id,sum(object.self_size) as totalSize,class.name 
  from heap_graph_class as class inner join heap_graph_object as object
  on class.id=object.type_id 
  where object.graph_sample_ts=398831516584184
  group by class.id
  order by totalSize desc;

The results are as follows:
[Figure: total size per class in the second heap dump]

As the figure shows, the type occupying the most space is int[], followed by String.

Note: This query includes both objects that can be collected and objects that cannot. Because of the Perfetto bug, the reachable field cannot be used to tell them apart. You could write a script that recursively processes the ClassLoader reference relationships and then corrects the reachable field in the database. But our task is to find the relationship between the leaked object and the call stack, and the leaked object has in fact already been identified with the methods from the previous two articles. See:

  • How to analyze application memory in android (17) - use MAT to view the Android heap: http://t.csdn.cn/c3BfM
  • How to analyze application memory in android (16) - use AS to view the Android heap: http://t.csdn.cn/xYGoA

The query results above alone are not enough to conclude that the leaked object is int[]. Fortunately, AS and MAT can assist with the attribution. Once Perfetto itself is improved (so that the reachable field is correct), leak points can also be found using Perfetto alone.
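For readers who want to try repairing the reachable field themselves, here is a hedged Python sketch of the traversal skeleton. It walks owner_id to owned_id edges from every object whose root_type is set and writes the result back. How exactly the missing ClassLoader edges should be derived is not spelled out in this article, so the sketch only exposes a hypothetical extra_edges parameter where such edges could be supplied; the toy rows and the root type string are placeholders.

```python
import sqlite3
from collections import defaultdict, deque

def recompute_reachable(db, sample_ts, extra_edges=()):
    """Ids of objects in one dump that are reachable from a GC root."""
    objects = db.execute(
        "SELECT id, root_type FROM heap_graph_object WHERE graph_sample_ts = ?",
        (sample_ts,),
    ).fetchall()
    ids = {oid for oid, _ in objects}

    edges = defaultdict(list)
    rows = db.execute(
        "SELECT owner_id, owned_id FROM heap_graph_reference "
        "WHERE owned_id IS NOT NULL"
    ).fetchall()
    # extra_edges is an assumption: a place to add edges the dump is missing.
    for owner, owned in list(rows) + list(extra_edges):
        if owner in ids and owned in ids:
            edges[owner].append(owned)

    seen = set()
    queue = deque(oid for oid, root_type in objects if root_type is not None)
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(child for child in edges[node] if child not in seen)
    return seen

def write_back(db, sample_ts, reachable_ids):
    """Overwrite the reachable column for one dump with the recomputed values."""
    db.execute(
        "UPDATE heap_graph_object SET reachable = 0 WHERE graph_sample_ts = ?",
        (sample_ts,),
    )
    db.executemany(
        "UPDATE heap_graph_object SET reachable = 1 WHERE id = ?",
        [(oid,) for oid in reachable_ids],
    )
    db.commit()

# Toy data: object 1 is a root, 1 -> 2 is recorded, 2 -> 4 is a "missing" edge.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE heap_graph_object (
        id INTEGER PRIMARY KEY, graph_sample_ts INTEGER,
        reachable INTEGER, root_type TEXT);
    CREATE TABLE heap_graph_reference (
        id INTEGER PRIMARY KEY, owner_id INTEGER, owned_id INTEGER);
    INSERT INTO heap_graph_object VALUES
        (1, 100, 1, 'ROOT_PLACEHOLDER'), (2, 100, 0, NULL),
        (3, 100, 0, NULL), (4, 100, 0, NULL);
    INSERT INTO heap_graph_reference VALUES (1, 1, 2);
""")
plain = recompute_reachable(db, 100)
patched = recompute_reachable(db, 100, extra_edges=[(2, 4)])
write_back(db, 100, patched)
```

Work on a copy of the exported database when trying this, since write_back modifies the table in place.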

Because the previous two articles and this one all use the same test app, we have already attributed the memory leak to int[]. The next step is to connect this leak with the call stack.

Simple association of leaked objects with call stacks

The section above established that the leaked object is int[]. If, in the same period, one call site in the call stack is executed most often, or allocates the most memory, we can make a simple logical connection and conclude that this call site caused the memory leak.

Click the diamond closest to the second heap dump, as shown below:
[Figure: the call stack diamond closest to the second heap dump]

Because int[] takes up a lot of space, we view the call stacks by total allocation size, as follows:
[Figure: flame graph of call stacks by total allocation size]

From the figure, we can see that the doText() call sites account for almost 99% of the allocated size. There is no doubt that the int[] leak is caused by doText(). There are, however, two doText() call sites, accounting for about 60% and about 40% respectively, so both of them have caused memory leaks.

Looking at the flame graph then gives the entire call stack and the calling thread.

Think about it: it all seems simple, but are the two time ranges really aligned? Do they really cover the same period?

In fact, the Java heap dump contains everything in the heap from the start of the program up to the dump point, whereas the heapprofd data (the Java call stack here) only covers the period from when Perfetto starts running up to the recording point. The following picture illustrates this:
[Figure: timeline comparing the period covered by the heap dump with the period covered by heapprofd]

Solution: Readers may think of starting Perfetto before the app, so that both recordings start when the app starts. However, in practice the app must already be running for the Java heap to be captured, so this approach does not work. To really solve the problem, we record again and then perform a differential comparison on the call stack and on the heap separately.

Use differential comparison to solve the remaining difficult-to-locate problems

Differential comparison eliminates the interference caused by the mismatched start times described above, and also the interference caused by too many threads and too many call stacks. The steps are as follows.

Re-record memory data and call stack for a longer period of time

Change the total duration in the configuration above to 60 s and record again. The result is shown in the figure below.

[Figure: Perfetto UI timeline of the 60 s recording]

As the figure shows, we sampled about 6 sets of data. We now select the 40 s and 50 s samples for differential analysis. Any other pair could be used as well.

Differential analysis: find what grew the most between the two heaps

  1. First find the sampling times, as follows:
select * from heap_graph_object group by graph_sample_ts;

[Figure: query result with one row per graph_sample_ts]

Based on the results, we choose the two times in the figure above: 462051768285433 and 462061780815388

  2. Subtract the heaps at the two times, leaving the objects added between 40 s and 50 s.
    To get a feel for the two snapshots, you can first run the following statement.
select class.id,object.graph_sample_ts,
    sum(object.self_size) as totalSize,class.name 
  from heap_graph_class as class inner join heap_graph_object as object
  on class.id=object.type_id 
  where object.graph_sample_ts=462051768285433 or 
     object.graph_sample_ts=462061780815388
  group by class.id,object.graph_sample_ts
  order by totalSize desc;

[Figure: total size per class at the two sampling times]

The figure shows the total size of each class's objects at the two sampling times.

Next, subtract the 40 s heap from the 50 s heap, as follows:

select t2.totalSize - COALESCE(t1.totalSize,0) as diff,t2.name
from (
   select class.id,object.graph_sample_ts,
      sum(object.self_size) as totalSize,class.name 
   from heap_graph_class as class inner join heap_graph_object as object
   on class.id=object.type_id 
   where object.graph_sample_ts=462061780815388
   group by class.id,object.graph_sample_ts
   order by totalSize desc
) as t2 left join (
   select class.id,object.graph_sample_ts,
      sum(object.self_size) as totalSize,class.name 
   from heap_graph_class as class inner join heap_graph_object as object
   on class.id=object.type_id 
   where object.graph_sample_ts=462051768285433
   group by class.id,object.graph_sample_ts
   order by totalSize desc
) as t1 on t2.name = t1.name
order by diff desc

[Figure: per-class size difference between the two heap dumps]

To compute percentages, I calculated the total first and wrote it directly into the query (the constant 2119766.0), as follows:

select (t2.totalSize - COALESCE(t1.totalSize,0))/2119766.0 
   as percentage,t2.totalSize - COALESCE(t1.totalSize,0) as diff,t2.name
from (
   select class.id,object.graph_sample_ts,
      sum(object.self_size) as totalSize,class.name 
   from heap_graph_class as class inner join heap_graph_object as object
   on class.id=object.type_id 
   where object.graph_sample_ts=462061780815388
   group by class.id,object.graph_sample_ts
   order by totalSize desc
) as t2 left join (
   select class.id,object.graph_sample_ts,
      sum(object.self_size) as totalSize,class.name 
   from heap_graph_class as class inner join heap_graph_object as object
   on class.id=object.type_id 
   where object.graph_sample_ts=462051768285433
   group by class.id,object.graph_sample_ts
   order by totalSize desc
) as t1 on t2.name = t1.name
order by percentage desc

As shown below:
[Figure: per-class difference with percentage column]

The figure shows that 97% of the added data is int[]. Next, we only need to find which call site was called most often, or allocated the most memory, between 40 s and 50 s. That call site is most likely where this 97% of int[] comes from.
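The hard-coded total can be avoided by letting a script sum the differences. The following Python sketch is one way to do that, under two assumptions of mine: classes are grouped by name rather than by id (so same-named classes are merged), and the percentage is taken over the sum of all per-class differences. The toy rows are invented and do not reproduce the numbers in the screenshots.

```python
import sqlite3

DIFF_SQL = """
WITH per_class AS (
    SELECT class.name AS name, object.graph_sample_ts AS ts,
           SUM(object.self_size) AS total_size
    FROM heap_graph_class AS class
    JOIN heap_graph_object AS object ON class.id = object.type_id
    WHERE object.graph_sample_ts IN (:t1, :t2)
    GROUP BY class.name, object.graph_sample_ts
)
SELECT later.name,
       later.total_size - COALESCE(earlier.total_size, 0) AS diff
FROM per_class AS later
LEFT JOIN per_class AS earlier
       ON earlier.name = later.name AND earlier.ts = :t1
WHERE later.ts = :t2
ORDER BY diff DESC
"""

def heap_growth(db, t1, t2):
    """Per-class growth between dump t1 and dump t2 as (name, diff, share)."""
    rows = db.execute(DIFF_SQL, {"t1": t1, "t2": t2}).fetchall()
    total = sum(diff for _, diff in rows)
    return [(name, diff, diff / total if total else 0.0) for name, diff in rows]

# Toy data: two dumps at hypothetical timestamps 100 and 200.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE heap_graph_class (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE heap_graph_object (
        id INTEGER PRIMARY KEY, graph_sample_ts INTEGER,
        self_size INTEGER, type_id INTEGER);
    INSERT INTO heap_graph_class VALUES (1, 'int[]'), (2, 'String'), (3, 'Foo');
    INSERT INTO heap_graph_object VALUES
        (1, 100, 1000, 1), (2, 100, 200, 2),
        (3, 200, 1000, 1), (4, 200, 3000, 1), (5, 200, 300, 2), (6, 200, 60, 3);
""")
growth = heap_growth(db, 100, 200)
```

With the real database, pass the two graph_sample_ts values chosen above instead of 100 and 200.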

Differential analysis of call stacks in the same time period

Before looking at call stack differences, we need to know the tables in the database that relate to the call stack. There are three:

  • heap_profile_allocation: stores allocations
  • stack_profile_frame: stores stack frame names
  • stack_profile_callsite: stores call sites

There are other tables as well, but they are not relevant to this analysis, so I won't go into details. For information about all tables, please refer to: https://perfetto.dev/docs/analysis/sql-tables

Call stack table description

heap_profile_allocation

  • id: unique id
  • type: the name of this table
  • ts: the sampling time
  • upid: the id of the process
  • heap_name: the heap name
  • callsite_id: the call site id, which is an id in stack_profile_callsite
  • count: the number of allocations. A positive number is the number of allocations at the call site, and a negative number is the number of frees at the call site.
  • size: the allocated size, which can also be positive or negative. A positive number is the size allocated, and a negative number is the size freed.

stack_profile_frame

  • id: unique id
  • type: the name of this table
  • name: the function name
  • mapping: the library this function is mapped from, such as a .so or .dex file; this is an id in stack_profile_mapping
  • rel_pc: the pc value relative to the mapped library
  • symbol_set_id: the id of the symbol set for this function name, that is, an id in stack_profile_symbol
  • deobfuscated_name: the deobfuscated name

stack_profile_callsite

  • id: unique id
  • type: the name of this table
  • depth: the distance of this call site from the root (outermost) frame of the call stack. Each additional nested function call increases the depth by one.
  • parent_id: the call site id of the parent function of this call site, that is, an id in stack_profile_callsite
  • frame_id: the frame id, which is an id in stack_profile_frame

View each sampling time

Use the following command to view.

select *,sum(count) as totalCount ,sum (size) as totalSize 
from heap_profile_allocation group by ts;

[Figure: one row per ts with totalCount and totalSize]

As the figure shows, the data is divided into 6 sampling times, which correspond to the six diamonds on the timeline. In the flame graph, each diamond represents all allocations from the start of recording up to the time of that diamond.

In the database, however, each sampling time only holds the data captured since the previous sampling time. Therefore, to view the data between 40 s and 50 s, you only need to look at the 50 s data, that is, the second-to-last row.
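The difference between the two views can be shown with a few lines of Python. The numbers below are invented per-interval totals of the kind the query above returns; summing them up to a given row gives what, according to the description above, a diamond at that time displays in the flame graph.

```python
from itertools import accumulate

# Hypothetical (ts in seconds, bytes allocated since the previous sample) rows,
# shaped like the result of grouping heap_profile_allocation by ts.
interval_rows = [(10, 500), (20, 300), (30, 700), (40, 650), (50, 2100), (60, 400)]

def cumulative_totals(rows):
    """Running total per sample: everything allocated from the start up to ts."""
    timestamps = [ts for ts, _ in rows]
    running = accumulate(size for _, size in rows)
    return list(zip(timestamps, running))

def interval_total(rows, ts):
    """Bytes allocated between the previous sample and ts (the row itself)."""
    return dict(rows)[ts]

diamond_view = cumulative_totals(interval_rows)
between_40_and_50 = interval_total(interval_rows, 50)
```

So for the 40 s to 50 s window no subtraction is needed on the database side; the row at 50 s already is the difference.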

Check the call details of 40s and 50s

For ease of understanding, the following statement lists both the 40 s and the 50 s samples:

select * 
from (
   select * 
   from heap_profile_allocation 
   where ts = 462061299971174 
   order by count desc
) as t1
union all 
select * 
from (
   select * 
   from heap_profile_allocation 
   where ts = 462071449972185 
   order by count desc
) as t2

[Figure: heap_profile_allocation rows for the two sampling times]

The figure above lists the 30-40 s interval and the 40-50 s interval. Adding up the memory allocated by each call site shows that call site 701 accounts for 82% of all allocations between 40 s and 50 s. We can therefore boldly conclude that the memory leak between 40 s and 50 s is caused by allocation point 701.
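The "little calculation" can also be scripted. Below is a minimal Python sketch that computes each call site's share of the bytes allocated in one interval. It assumes that only positive size rows should count (negative rows are frees), and the rows are made-up numbers chosen so that call site 701 comes out at 82%, mirroring the result above rather than reproducing the real data.

```python
import sqlite3

def callsite_shares(db, ts):
    """(callsite_id, allocated bytes, share of the interval) sorted by bytes."""
    rows = db.execute(
        "SELECT callsite_id, SUM(size) AS bytes "
        "FROM heap_profile_allocation "
        "WHERE ts = ? AND size > 0 "
        "GROUP BY callsite_id ORDER BY bytes DESC",
        (ts,),
    ).fetchall()
    total = sum(allocated for _, allocated in rows)
    if not total:
        return []
    return [(cs, allocated, allocated / total) for cs, allocated in rows]

# Toy stand-in for the exported database; ts 50 is a hypothetical timestamp.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE heap_profile_allocation "
    "(id INTEGER PRIMARY KEY, ts INTEGER, callsite_id INTEGER, "
    "count INTEGER, size INTEGER)"
)
db.executemany(
    "INSERT INTO heap_profile_allocation (ts, callsite_id, count, size) "
    "VALUES (?, ?, ?, ?)",
    [(50, 701, 82, 8200), (50, 518, 10, 1000), (50, 42, 8, 800),
     (50, 701, -5, -500),   # a free, ignored by the size > 0 filter
     (40, 518, 30, 3000)],  # belongs to the previous interval
)
shares = callsite_shares(db, 50)
```

Sorting by count instead of bytes gives the "called most often" variant of the same check.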

View the call stack of the 701 allocation point

Use the following recursive query to view the entire call stack of call site 701:

WITH RECURSIVE RecursiveCTE AS (
    SELECT id, parent_id,frame_id
    FROM stack_profile_callsite
    WHERE id = 701
    UNION ALL
    SELECT origin.id, origin.parent_id,origin.frame_id
    FROM stack_profile_callsite origin
    JOIN RecursiveCTE r ON r.parent_id = origin.id
)
SELECT result.id,result.parent_id,frame.name 
FROM RecursiveCTE as result inner join stack_profile_frame as frame 
   on frame.id=result.frame_id 
order by result.id desc;

As shown below:
[Figure: the call stack of call site 701]

The figure shows directly that the int[] objects leaked between 40 s and 50 s are allocated by the call stack above.
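The recursive query above simply follows parent_id from the call site up to the root. For readers less used to recursive CTEs, the same walk can be sketched in plain Python over two dictionaries. The ids 699 and 700 and all frame names except doText are hypothetical placeholders, not values from the real trace.

```python
def call_stack(callsites, frames, callsite_id):
    """Frame names from the given call site up to the root.

    callsites maps id -> (parent_id, frame_id); frames maps id -> name.
    """
    names = []
    seen = set()
    current = callsite_id
    while current is not None and current not in seen:
        seen.add(current)  # guards against an accidental parent cycle
        parent_id, frame_id = callsites[current]
        names.append(frames[frame_id])
        current = parent_id
    return names

# Hypothetical rows shaped like stack_profile_callsite / stack_profile_frame.
callsites = {699: (None, 1), 700: (699, 2), 701: (700, 3)}
frames = {
    1: "java.lang.Thread.run",
    2: "com.example.test_malloc.Worker.run",
    3: "com.example.test_malloc.MainActivity.doText",
}

stack = call_stack(callsites, frames, 701)
```

The two dictionaries can be filled from the database with one SELECT per table, which is often easier to debug than the recursive form.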

Note that the same analysis applied to call site 518 leads to the same conclusion, except that its allocations occur between 30 s and 40 s. The steps are identical, so no further example is given.

This concludes the introduction to memory analysis with Perfetto.

Summary of memory analysis methods

With that, the memory analysis of Android has been fully introduced. Now let's summarize all the previous articles:

Native articles

  1. Tool 0, xdd: can only view raw memory at arbitrary locations
  2. Tool 1, gdb: can view registers and memory at any location, analyze core dumps, and inspect the stack but not the heap
  3. Tool 2, lldb: can view registers and memory at any location, analyze core dumps, and inspect the stack but not the heap
  4. Tool 3, custom malloc: can only view the heap, and the scope is small, almost only code you compile yourself
  5. Tool 4, malloc hook: can view all heap allocations
  6. Tool 5, malloc statistics and libmemunreachable: can view all heap allocations
  7. Tool 6, malloc debug and libc callback: can view all heap allocations
  8. Tool 7, ASan/HWASan: can only view heap allocations on Linux, not on Android. It is listed here only for completeness.
  9. Tool 8, perfetto: can only view heap memory allocations

Java articles

  1. Tool 0, jdb: view stack frames, local variables, locks, objects
  2. Tool 1, Java debugger for VS Code: view stack, local variables, objects
  3. Tool 2, Android Studio: view heap, object references, retained size, call stack
  4. Tool 3, MAT: view heap, object references, retained size, and perform differential analysis between heaps
  5. Tool 4, Perfetto: view heap, object references, retained size, call stack, and perform differential analysis on both heap and call stack

End of this series.

Origin blog.csdn.net/xiaowanbiao123/article/details/132248383