ANR analysis process

1. Basic knowledge of ANR

1.1. Cause of occurrence

In one sentence: an ANR (Application Not Responding) occurs when the application fails to finish a required piece of work within the time limit the system allows.

1.2. ANR classification

Classified by the scenario in which it occurs:

  • Input event: not processed within 5 seconds.
  • Service: processing timeout, 20 s in the foreground and 200 s in the background.
  • BroadcastReceiver: processing timeout, 10 s in the foreground and 60 s in the background.
  • ContentProvider: execution timeout, relatively rare.

Classified by cause:

  • The main thread performs time-consuming operations, such as complex layout or I/O.
  • The main thread is blocked by the peer of a binder call.
  • The main thread is blocked on a synchronization lock held by a child thread.
  • Binder resources are exhausted, so the main thread cannot communicate with system_server.
  • The process cannot get system resources (CPU, RAM, I/O).

From a process perspective:

  • The problem lies in the current process:
    the main thread itself is doing time-consuming work, or its message queue contains time-consuming messages;
    the main thread is blocked by another thread of the same process.
  • The problem lies in a remote process, usually reached through a binder call, a socket, or another communication mechanism.

2. Log analysis of ANR

2.1. Log classification

When an ANR occurs, a bugreport is usually captured:

adb bugreport xxx

Most importantly, the generated bugreport contains the ANR trace. The trace can also be pulled on its own:

adb pull /data/anr/traces.txt xxx

A complete bugreport contains the following logs, all of which are critical for analyzing ANR issues:

| Log name | Purpose | Command |
| --- | --- | --- |
| system.log | Contains the time at which the ANR occurred, CPU information from before the ANR, and a large amount of output from system services. | adb logcat -b system |
| main.log | Contains what the application itself printed before the ANR, which helps judge whether the application was behaving abnormally. It also contains GC output, which shows how fast memory is being reclaimed and whether the system is low on memory or suffering from memory fragmentation. | adb logcat -b main |
| event.log | Contains the application lifecycle information printed by AMS and WMS, which shows how fast windows are created and how focus changes. | adb logcat -b events |
| kernel.log | Contains kernel output. LowMemoryKiller kills, memory fragmentation, low memory, and mmc driver exceptions can all be found here. | none |

So how do we read these logs? See Case 1 below.

2.2. Case 1: A slow SharedPreferences (SP) write causes an application ANR

Start by searching the log for the keyword "ANR in", which gives the most direct information:

06-16 16:16:28.590  1853  2073 E ActivityManager: ANR in com.android.camera (com.android.camera/.Camera)
06-16 16:16:28.590  1853  2073 E ActivityManager: PID: 27661
06-16 16:16:28.590  1853  2073 E ActivityManager: Reason: Input dispatching timed out (com.android.camera/com.android.camera.Camera, Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago.  Wait queue length: 24.  Wait queue head age: 5511.1ms.)
06-16 16:16:28.590  1853  2073 E ActivityManager: Load: 16.25 / 29.48 / 38.33
06-16 16:16:28.590  1853  2073 E ActivityManager: CPU usage from 0ms to 8058ms later:
06-16 16:16:28.590  1853  2073 E ActivityManager:   58% 291/mediaserver: 51% user + 6.7% kernel / faults: 2457 minor 4 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   27% 317/mm-qcamera-daemon: 21% user + 5.8% kernel / faults: 15965 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.4% 288/debuggerd: 0% user + 0.3% kernel / faults: 21615 minor 87 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   17% 27661/com.android.camera: 10% user + 6.8% kernel / faults: 2412 minor 34 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   16% 1853/system_server: 10% user + 6.4% kernel / faults: 1754 minor 87 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   10% 539/sensors.qcom: 7.8% user + 2.6% kernel / faults: 16 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   4.4% 277/surfaceflinger: 1.8% user + 2.6% kernel / faults: 14 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   4% 203/mmcqd/0: 0% user + 4% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   2.6% 3510/com.android.phone: 1.9% user + 0.6% kernel / faults: 1148 minor 8 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   2.1% 2902/com.android.systemui: 1.6% user + 0.4% kernel / faults: 1272 minor 32 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   1.6% 3110/com.miui.whetstone: 1.6% user + 0% kernel / faults: 2614 minor 22 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.8% 99/kswapd0: 0% user + 0.8% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   1.4% 217/jbd2/mmcblk0p25: 0% user + 1.4% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   1.4% 223/logd: 0.7% user + 0.7% kernel / faults: 4 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.9% 12808/kworker/0:1: 0% user + 0.9% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.8% 35/kworker/u:2: 0% user + 0.8% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0% 3222/com.miui.sysbase: 0% user + 0% kernel / faults: 1314 minor 12 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.8% 3446/com.android.nfc: 0.4% user + 0.3% kernel / faults: 1223 minor 9 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.7% 10866/kworker/u:1: 0% user + 0.7% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.6% 642/mdss_fb0: 0% user + 0.6% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.6% 29336/kworker/u:7: 0% user + 0.6% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.4% 6/kworker/u:0: 0% user + 0.4% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.4% 22924/kworker/u:6: 0% user + 0.4% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.3% 4421/mpdecision: 0% user + 0.3% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.2% 276/servicemanager: 0.1% user + 0.1% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.2% 289/rild: 0.2% user + 0% kernel / faults: 20 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 4161/mcd: 0% user + 0% kernel / faults: 9 minor 1 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 3/ksoftirqd/0: 0% user + 0.1% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 5/kworker/0:0H: 0% user + 0.1% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 7/kworker/u:0H: 0% user + 0.1% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0% 215/flush-179:0: 0% user + 0% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 321/displayfeature: 0.1% user + 0% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 368/irq/33-cpubw_hw: 0% user + 0.1% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 403/qmuxd: 0% user + 0.1% kernel / faults: 60 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   0% 3491/com.xiaomi.finddevice: 0% user + 0% kernel / faults: 706 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.1% 29330/ksoftirqd/1: 0% user + 0.1% kernel
06-16 16:16:28.590  1853  2073 E ActivityManager: 96% TOTAL: 56% user + 29% kernel + 6.3% iowait + 4.1% softirq

When we look at an ANR, is the trace in front of us really the original scene? If a lot of information is written out when the ANR occurs and CPU and I/O resources are tight at that moment, the log may be written late, possibly 10 to 20 seconds after the event, so stay alert to this. Let's take the example above and go through it line by line:

06-16 16:16:28.590  1853  2073 E ActivityManager: ANR in com.android.camera (com.android.camera/.Camera)

This line shows that the ANR was logged at 06-16 16:16:28.590 in the process com.android.camera, specifically in the component com.android.camera/.Camera. 1853 is the pid of system_server and 2073 is the tid of its ActivityManager thread, which is a system thread. The event log has a matching entry; search for the keyword am_anr:

06-16 16:16:20.536  1853  2073 I am_anr  : [0,27661,com.android.camera,952745541,Input dispatching timed out (com.android.camera/com.android.camera.Camera, Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago.  Wait queue length: 24.  Wait queue head age: 5511.1ms.)] 

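From this event you can also determine the time, type, pid, and process name of the ANR. Its timestamp, 16:16:20.536, is about 8 seconds earlier than the "ANR in" line (16:16:28.590), which illustrates the output delay mentioned above.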
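When many bugreports have to be triaged, the am_anr entry is easy to pick apart mechanically. The following is a minimal sketch, not an Android API: the class name AmAnrEvent is hypothetical, and the field order (user, pid, package, flags, reason) is assumed from the sample line above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Android): parses one am_anr event-log line. */
final class AmAnrEvent {
    // Logcat prefix (time, 2-3 numeric id columns, level) followed by the bracketed payload.
    private static final Pattern LINE = Pattern.compile(
            "^(\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}\\.\\d{3})(?:\\s+\\d+){2,3}\\s+I\\s+am_anr\\s*:\\s*\\[(.*)\\]\\s*$");

    final String time;        // when the event was logged
    final int user;           // assumed meaning: Android user id
    final int pid;            // pid of the process that did not respond
    final String packageName;
    final String reason;

    private AmAnrEvent(String time, int user, int pid, String packageName, String reason) {
        this.time = time;
        this.user = user;
        this.pid = pid;
        this.packageName = packageName;
        this.reason = reason;
    }

    /** Returns the parsed event, or null if the line is not an am_anr entry. */
    static AmAnrEvent parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        // The reason text itself contains commas, so split into at most five fields.
        String[] f = m.group(2).split(",", 5);
        if (f.length < 5) {
            return null;
        }
        return new AmAnrEvent(m.group(1), Integer.parseInt(f[0]), Integer.parseInt(f[1]), f[2], f[4]);
    }

    public static void main(String[] args) {
        AmAnrEvent e = parse("06-16 16:16:20.536  1853  2073 I am_anr  : "
                + "[0,27661,com.android.camera,952745541,Input dispatching timed out (shortened)]");
        System.out.println(e.time + " pid=" + e.pid + " " + e.packageName + ": " + e.reason);
    }
}
```

Comparing the time field of this event with the timestamp of the "ANR in" line gives a quick estimate of how late the main log was written.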

06-16 16:16:28.590  1853  2073 E ActivityManager: PID: 27661 

This line gives the pid of the ANR process, 27661. As a special case, a pid of 0 means that the process was killed by LowMemoryKiller or crashed before the ANR was reported. The process could then no longer receive the system broadcast or key event, which is why the ANR was raised.

06-16 16:16:28.590  1853  2073 E ActivityManager: Reason: Input dispatching timed out (com.android.camera/com.android.camera.Camera, Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago.  Wait queue length: 24.  Wait queue head age: 5511.1ms.) 

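This line gives the reason for the ANR: Input dispatching timed out. The wait queue head age of 5511.1 ms shows that the oldest input event had been waiting longer than the 5-second limit.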

06-16 16:16:28.590  1853  2073 E ActivityManager: Load: 16.25 / 29.48 / 38.33 

This line gives the CPU load. On Linux, the uptime command reports the same load averages:

uptime
 20:09:54 up 71 days, 10:48, 1 user, load average: 0.99, 0.78, 0.86

So what does load mean? The three numbers after Load are the system's average load over the last 1, 5, and 15 minutes. For a single CPU, an average load of 0 means the CPU is completely idle and an average load of 1 means it is saturated, so load tells us whether the system is overloaded. A vivid metaphor: picture the CPU as a bridge with a single lane that every vehicle must use. A load of 0 means there is not a single car on the bridge. A load of 0.5 means half of the bridge is occupied. A load of 1.0 means every section of the bridge is occupied and the bridge is "full". A load of 2.0 means the bridge is full (100%) and as many vehicles again are waiting to get onto it. The capacity of the bridge is the maximum workload of the CPU; the vehicles are the processes waiting to be handled by the CPU.

For a single core, the rule of thumb is:

  • When the load stays above 0.7, start investigating so that the situation does not get worse.
  • When the load stays above 1.0, you must find a way to bring it down.
  • When the load reaches 5.0, the system has a serious problem.

If only the 1-minute load is above 1.0 and the other two values are below 1.0, the condition is temporary and not serious.
If the 15-minute load is above 1.0 (after dividing by the number of CPU cores), the problem is persistent rather than temporary. The 15-minute load is therefore the main indicator of whether the machine is running normally.
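The rule of thumb above can be sketched in a few lines. This is only an illustration: the class LoadCheck and its labels are hypothetical, and the core count is something you supply yourself, since the "Load:" line does not contain it.

```java
/** Hypothetical helper: reads the "Load:" line of an ANR report and applies the rule of thumb per core. */
final class LoadCheck {
    /** Parses "... Load: 16.25 / 29.48 / 38.33" into {1 min, 5 min, 15 min}. */
    static double[] parse(String line) {
        int at = line.indexOf("Load:");
        if (at < 0) {
            throw new IllegalArgumentException("no Load: field in line");
        }
        String[] parts = line.substring(at + "Load:".length()).split("/");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected three load values");
        }
        double[] load = new double[3];
        for (int i = 0; i < 3; i++) {
            load[i] = Double.parseDouble(parts[i].trim());
        }
        return load;
    }

    /** Divides the load by the core count, then applies the 0.7 / 1.0 / 5.0 thresholds. */
    static String classify(double load, int cores) {
        double perCore = load / cores;
        if (perCore >= 5.0) {
            return "serious";
        }
        if (perCore > 1.0) {
            return "overloaded";
        }
        if (perCore > 0.7) {
            return "investigate";
        }
        return "ok";
    }

    public static void main(String[] args) {
        double[] load = parse("06-16 16:16:28.590  1853  2073 E ActivityManager: Load: 16.25 / 29.48 / 38.33");
        // Assumption for illustration only: an eight-core device.
        System.out.println("15-minute load on 8 cores: " + classify(load[2], 8));
    }
}
```

For the camera ANR, even an assumed eight-core device would have a 15-minute load of nearly 5 per core, so the system was under sustained heavy load.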

Current phones use multi-core CPUs, commonly with eight cores, which multiplies the CPU's processing capacity by 8. The time each core has spent running can be read from /sys/devices/system/cpu/cpu%d/cpufreq/stats/time_in_state, where %d is the index of the CPU core. The file records how long the core has run at each frequency from boot until the file is read, in units of 10 ms.

Use adb shell cat /sys/devices/system/cpu/cpu1/cpufreq/stats/time_in_state to view it for cpu1:
frequency (kHz)   time (10 ms)
652800 1813593
1036800 46484
1401600 521974
1689600 2956667
1843200 83065
1958400 53516
2016000 251693
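The raw tick counts are easier to read as percentages. The sketch below, with the hypothetical class name TimeInState, turns the text of a time_in_state file into the share of time spent at each frequency; it assumes the two-column layout shown above and skips any non-numeric header line.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: converts time_in_state text into percent of time per frequency. */
final class TimeInState {
    /** Returns frequency -> percent of total time, in file order. */
    static Map<Long, Double> share(String text) {
        Map<Long, Long> ticks = new LinkedHashMap<>();
        long total = 0;
        for (String line : text.split("\\r?\\n")) {
            String[] p = line.trim().split("\\s+");
            // Skip headers and anything that is not "<frequency> <time>".
            if (p.length != 2 || !p[0].matches("\\d+") || !p[1].matches("\\d+")) {
                continue;
            }
            long t = Long.parseLong(p[1]);
            ticks.put(Long.parseLong(p[0]), t);
            total += t;
        }
        Map<Long, Double> out = new LinkedHashMap<>();
        for (Map.Entry<Long, Long> e : ticks.entrySet()) {
            out.put(e.getKey(), total == 0 ? 0.0 : 100.0 * e.getValue() / total);
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "652800 1813593\n1036800 46484\n1401600 521974\n1689600 2956667\n"
                + "1843200 83065\n1958400 53516\n2016000 251693\n";
        for (Map.Entry<Long, Double> e : share(sample).entrySet()) {
            System.out.printf("%d kHz: %.1f%%%n", e.getKey(), e.getValue());
        }
    }
}
```

In the sample above, the core spent roughly half of its time at 1689600 kHz and about a third at the lowest frequency.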

For more about load, see "Understanding Linux System Load"; there is no need to go deeper than this here.

06-16 16:16:28.590  1853  2073 E ActivityManager: CPU usage from 0ms to 8058ms later:
06-16 16:16:28.590  1853  2073 E ActivityManager:   58% 291/mediaserver: 51% user + 6.7% kernel / faults: 2457 minor 4 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   27% 317/mm-qcamera-daemon: 21% user + 5.8% kernel / faults: 15965 minor
06-16 16:16:28.590  1853  2073 E ActivityManager:   0.4% 288/debuggerd: 0% user + 0.3% kernel / faults: 21615 minor 87 major
06-16 16:16:28.590  1853  2073 E ActivityManager:   17% 27661/com.android.camera: 10% user + 6.8% kernel / faults: 2412 minor 34 major
....
06-16 16:16:28.590  1853  2073 E ActivityManager: 96% TOTAL: 56% user + 29% kernel + 6.3% iowait + 4.1% softirq
.....

This part of the log shows the CPU usage of the busiest processes around the time of the ANR; user is time spent in user space and kernel is time spent in kernel space. The following rules of thumb generally apply:

  • If kswapd0 has high CPU usage, the whole system runs slowly, which causes all kinds of ANRs. Forward the issue to "Memory Optimization" and ask them to optimize it.
  • High logd CPU usage can also cause system freezes and ANRs, because every process that writes a log is blocked and runs extremely slowly.
  • If vold uses too much CPU, it may cause system freezes and ANRs. Ask whoever is responsible for storage to investigate first.
  • If sensors.qcom uses too much CPU, it may cause lag. Ask the system team to investigate.
  • If the application's own CPU usage is high, the problem is very likely in the application.
  • If the system CPU usage is not high but the main thread is waiting for a lock, the problem is very likely in the application.
  • If the application is in the D state when the ANR occurs and its last operation is refrigerator, the application has been frozen, which is normally the result of power-consumption optimization.
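The first few rules can be turned into a quick scan of the CPU usage block. This is a rough sketch: the class CpuUsageScan is hypothetical, the watch list is taken from the rules above, and the threshold is an assumption you should tune, since the rules give no numbers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: flags watch-list processes in the "CPU usage" block of an ANR report. */
final class CpuUsageScan {
    // Matches "<total>% <pid>/<name>: <user>% user + <kernel>% kernel".
    private static final Pattern ROW = Pattern.compile(
            "([\\d.]+)%\\s+(\\d+)/(.+?):\\s+([\\d.]+)% user \\+ ([\\d.]+)% kernel");

    // Process names taken from the rules of thumb in the text.
    private static final Set<String> WATCH =
            new HashSet<>(Arrays.asList("kswapd0", "logd", "vold", "sensors.qcom"));

    /** Returns the names of watch-list processes whose total CPU is at least minPercent. */
    static List<String> suspects(List<String> lines, double minPercent) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            Matcher m = ROW.matcher(line);
            if (!m.find()) {
                continue; // header, TOTAL line, or unrelated log output
            }
            double total = Double.parseDouble(m.group(1));
            String name = m.group(3);
            if (WATCH.contains(name) && total >= minPercent) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "06-16 16:16:28.590  1853  2073 E ActivityManager:   10% 539/sensors.qcom: 7.8% user + 2.6% kernel / faults: 16 minor",
                "06-16 16:16:28.590  1853  2073 E ActivityManager:   0.8% 99/kswapd0: 0% user + 0.8% kernel",
                "06-16 16:16:28.590  1853  2073 E ActivityManager:   1.4% 223/logd: 0.7% user + 0.7% kernel / faults: 4 minor");
        // 5% is an arbitrary threshold chosen for this illustration.
        System.out.println(suspects(lines, 5.0));
    }
}
```

A hit is only a hint about where to look next; the trace file is still needed to confirm what the main thread was actually doing.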

With the log above we have the basic facts of the ANR. To find where the thread is blocked we need the trace file, which is usually in the anr directory. Search it for the stack of the main thread:

----- pid 27661 at 2017-06-16 16:16:20 -----
Cmd line: com.android.camera
"main" prio=5 tid=1 Waiting
 | group="main" sCount=1 dsCount=0 obj=0x75a4b5c8 self=0xb4cf6500
 | sysTid=27661 nice=-10 cgrp=default sched=0/0 handle=0xb6f6cb34
 | state=S schedstat=( 11242036155 8689191757 38520 ) utm=895 stm=229 core=0 HZ=100
 | stack=0xbe4ea000-0xbe4ec000 stackSize=8MB
 | held mutexes=
 at java.lang.Object.wait!(Native method)
 - waiting on <0x09e6a059> (a java.lang.Object)
 at java.lang.Thread.parkFor$(Thread.java:1220)
 - locked <0x09e6a059> (a java.lang.Object)
 at sun.misc.Unsafe.park(Unsafe.java:299)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:810)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:970)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1278)
 at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:203)
 at android.app.SharedPreferencesImpl$EditorImpl$1.run(SharedPreferencesImpl.java:366)
 at android.app.QueuedWork.waitToFinish(QueuedWork.java:88)
 at android.app.ActivityThread.handleStopActivity(ActivityThread.java:3605)
 at android.app.ActivityThread.access$1300(ActivityThread.java:153)
 at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1399)
 at android.os.Handler.dispatchMessage(Handler.java:102)
 at android.os.Looper.loop(Looper.java:154)
 at android.app.ActivityThread.main(ActivityThread.java:5528)
 at java.lang.reflect.Method.invoke!(Native method)
 at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:740)
 at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:630)

The meaning of some of the fields:

| Field | Meaning |
| --- | --- |
| tid=1 | Thread number inside the runtime |
| sysTid=27661 | Kernel thread id. For the main thread it is the same as the process id. |
| Waiting | Thread state as seen by the runtime. The state= field is the kernel-level thread state; state=D means the thread is blocked in the kernel. |
| nice | The smaller the nice value, the higher the priority. Because this is the main thread, nice=-10 here, which is a very high priority. |
| schedstat | The three numbers in brackets are, in order, Running time, Runnable time, and Switch count. Running time: time spent running on the CPU, in ns. Runnable time: time spent waiting in the run queue (RQ), in ns. Switch count: number of CPU scheduling switches. |
| utm | Time the thread has run in user mode, in jiffies |
| stm | Time the thread has run in kernel mode, in jiffies |
| sCount | Number of times this thread has been suspended |
| dsCount | Number of times the thread has been suspended by the debugger. While a process is being debugged, sCount is reset to 0; after debugging ends, sCount increases again as the thread is suspended normally, but dsCount is not reset to 0, so dsCount can be used to tell whether the thread has been debugged. |
| self | Address of the thread object itself |
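The schedstat, utm, and stm fields become more telling once they share a unit. The sketch below uses a hypothetical class SchedStat and assumes the field layout of the trace line shown above. It converts jiffies to milliseconds using the HZ value on the same line and computes how much of the thread's schedulable time was spent waiting in the run queue.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reads schedstat/utm/stm/HZ from one line of an ANR trace. */
final class SchedStat {
    private static final Pattern FIELDS = Pattern.compile(
            "schedstat=\\(\\s*(\\d+)\\s+(\\d+)\\s+(\\d+)\\s*\\)\\s+utm=(\\d+)\\s+stm=(\\d+)\\s+core=\\d+\\s+HZ=(\\d+)");

    final long runningNs;   // time on the CPU
    final long runnableNs;  // time waiting in the run queue
    final long switches;    // number of scheduling switches
    final long userMs;      // utm converted from jiffies to ms
    final long kernelMs;    // stm converted from jiffies to ms

    private SchedStat(long runningNs, long runnableNs, long switches, long userMs, long kernelMs) {
        this.runningNs = runningNs;
        this.runnableNs = runnableNs;
        this.switches = switches;
        this.userMs = userMs;
        this.kernelMs = kernelMs;
    }

    /** Returns the parsed values, or null if the line does not contain the fields. */
    static SchedStat parse(String line) {
        Matcher m = FIELDS.matcher(line);
        if (!m.find()) {
            return null;
        }
        long hz = Long.parseLong(m.group(6));
        if (hz <= 0) {
            return null;
        }
        return new SchedStat(
                Long.parseLong(m.group(1)),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)) * 1000 / hz,
                Long.parseLong(m.group(5)) * 1000 / hz);
    }

    /** Fraction of (running + runnable) time that the thread spent waiting for a CPU. */
    double runnableShare() {
        long total = runningNs + runnableNs;
        return total == 0 ? 0.0 : (double) runnableNs / total;
    }

    public static void main(String[] args) {
        SchedStat s = parse(" | state=S schedstat=( 11242036155 8689191757 38520 ) utm=895 stm=229 core=0 HZ=100");
        System.out.println("running ms=" + s.runningNs / 1_000_000
                + " user ms=" + s.userMs + " kernel ms=" + s.kernelMs
                + " runnable share=" + s.runnableShare());
    }
}
```

For the camera trace, utm plus stm comes to about 11.2 s, which agrees with the Running time in schedstat. A large runnable share suggests the thread was often ready to run but waiting for a CPU, which points at system load rather than at the thread's own code.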

Next, the possible thread states:

| State | Value | Description |
| --- | --- | --- |
| THREAD_ZOMBIE | 0 | TERMINATED |
| THREAD_RUNNING | 1 | RUNNABLE or running now |
| THREAD_TIMED_WAIT | 2 | TIMED_WAITING in Object.wait() |
| THREAD_MONITOR | 3 | BLOCKED on a monitor |
| THREAD_WAIT | 4 | WAITING in Object.wait() |
| THREAD_INITIALIZING | 5 | allocated, not yet running |
| THREAD_STARTING | 6 | started, not yet on thread list |
| THREAD_NATIVE | 7 | off in a JNI native method |
| THREAD_VMWAIT | 8 | waiting on a VM resource |
| THREAD_SUSPENDED | 9 | suspended, usually by GC or debugger |

So how do we solve this problem? From the background above and the trace file, we know the main thread is blocked at:

 at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:203)
 at android.app.SharedPreferencesImpl$EditorImpl$1.run(SharedPreferencesImpl.java:366)
 at android.app.QueuedWork.waitToFinish(QueuedWork.java:88)
 at android.app.ActivityThread.handleStopActivity(ActivityThread.java:3605)
 at android.app.ActivityThread.access$1300(ActivityThread.java:153)

Let’s look at QueuedWork.waitToFinish first

77    /**
78     * Finishes or waits for async operations to complete.
79     * (e.g. SharedPreferences$Editor#startCommit writes)
80     *
81     * Is called from the Activity base class's onPause(), after
82     * BroadcastReceiver's onReceive, after Service command handling,
83     * etc.  (so async work is never lost)
84     */
85    public static void waitToFinish() {
86        Runnable toFinish;
          // Run each pending finisher, which waits for that task to complete
87        while ((toFinish = sPendingWorkFinishers.poll()) != null) {
88            toFinish.run();
89        }
90    }

QueuedWork.waitToFinish is called after an Activity's onPause, after a BroadcastReceiver's onReceive, and after Service command handling, to make sure asynchronous work has completed. It takes each pending finisher from sPendingWorkFinishers and runs it, which blocks until the corresponding work is done. Now look at the apply method of SharedPreferencesImpl's editor, which adds a finisher for the pending file-system write to that QueuedWork queue:

361        public void apply() {
362            final MemoryCommitResult mcr = commitToMemory();
363            final Runnable awaitCommit = new Runnable() {
364                    public void run() {
365                        try {
366                            mcr.writtenToDiskLatch.await();
367                        } catch (InterruptedException ignored) {
368                        }
369                    }
370                };
371
                // Add the finisher that waits for the disk write to QueuedWork's pending queue
372            QueuedWork.add(awaitCommit);
373            ... ... ... ... ... ...
388        }
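How a fast apply() can still stall the main thread is easier to see in a small stand-alone imitation. The following plain-Java sketch is not the Android source: ApplyStallDemo, its queue, and the simulated write delay are all hypothetical stand-ins for SharedPreferencesImpl, QueuedWork, and the disk write.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Plain-Java imitation of the apply()/waitToFinish() interaction; not the Android source. */
final class ApplyStallDemo {
    // Stand-in for QueuedWork's list of pending finishers.
    private static final ConcurrentLinkedQueue<Runnable> PENDING = new ConcurrentLinkedQueue<>();

    // Single daemon thread standing in for the thread that writes to disk.
    private static final ExecutorService WRITER = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "disk-writer");
        t.setDaemon(true);
        return t;
    });

    /** Mimics apply(): returns at once, but registers a finisher that waits for the write. */
    static void apply(long writeMillis) {
        CountDownLatch writtenToDisk = new CountDownLatch(1);
        Runnable awaitCommit = () -> {
            try {
                writtenToDisk.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        PENDING.add(awaitCommit);
        WRITER.execute(() -> {
            try {
                Thread.sleep(writeMillis); // simulated slow file-system write
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            PENDING.remove(awaitCommit);
            writtenToDisk.countDown();
        });
    }

    /** Mimics waitToFinish(): the caller blocks until every pending write is done. */
    static long waitToFinish() {
        long start = System.nanoTime();
        Runnable finisher;
        while ((finisher = PENDING.poll()) != null) {
            finisher.run();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        apply(300);
        long applyMs = (System.nanoTime() - start) / 1_000_000;
        long waitMs = waitToFinish();
        System.out.println("apply took " + applyMs + " ms, waitToFinish took " + waitMs + " ms");
    }
}
```

Running main shows apply returning almost immediately while waitToFinish absorbs the whole write delay on the calling thread, which is the same shape as the CountDownLatch.await frame in the trace.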

Although apply itself returns quickly, the main thread waits for the file-system write to complete when the Activity is paused or stopped. In other words, apply does not block the calling thread, but it shifts the waiting to the main thread. If the write is slow and the SP operations issued by an activity, service, or broadcast receiver have not finished by the end of the lifecycle callback, the main thread blocks and an ANR results. At this point in the analysis it is clearly a system problem, and the app can do little about it. Fortunately, this problem has been mitigated on Xiaomi phones; the solution is not disclosed here. In general, the rules for reading traces are:

  • If an ANR occurs but the corresponding process cannot be found in the trace, check whether the Android Runtime shut down because the application crashed. If it did, look into the reason for the shutdown.
  • If the main thread of the ANR application is executing getContentProvider, it is requesting another application's ContentProvider. Check what the host process of the target ContentProvider is doing.
  • If the main thread is performing database operations or network requests, the problem is most likely in the application itself.
  • If the main thread is waiting for a lock held by another thread, and that thread is performing database operations or network requests, the problem is in the application itself.

This case was only meant to introduce the basic process of ANR analysis. To summarize the routine:

  • Grab the bugreport, search for "ANR in", and note the time and process of the ANR.
  • Find the trace of the main thread of that process and locate where it is blocked.
  • Analyze the source code and work out a fix.

Of course, if these steps alone locate the cause of the ANR, we were fairly lucky; most of the time it is not that simple.

The ANR analyzed above was caused by a system problem. You might now think: my app was doing nothing and still got an ANR, so from now on I can simply blame the system. No. Each problem still needs to be analyzed on its own, and blaming the system requires evidence. Where does the evidence come from? Again, from the log. Continue to section 3, the system's time-consuming analysis facilities.

3. System time-consuming analysis plan

The system already logs some timing information itself, and some phone manufacturers add further enhancements to the log. Here are some of the more common ones.

3.1. binder_sample

  • A. Function description: monitors how long binder transactions on the main thread of each process take. When a transaction exceeds the threshold, information about the target call is logged. By default the threshold is 1000 ms.
  • B. Log format: 52004 binder_sample (descriptor|3),(method_num|1|5),(time|1|3),(blocking_package|3),(sample_percent|1|6)
  • C. Log example:
2754 2754 I binder_sample: [android.app.IActivityManager,35,2900,android.process.media,5]

From the log above, we can conclude that:

  1. The main thread is 2754.
  2. It called the android.app.IActivityManager interface.
  3. The method code is 35 (i.e. STOP_SERVICE_TRANSACTION).
  4. The call took 2900 ms.
  5. The package where the blocking occurred is android.process.media. The last parameter is the sample percentage, which is of little value.
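The five fields can be pulled out of a binder_sample line with a few lines of code. This is a minimal sketch with a hypothetical class name, BinderSample, following the field order given in the log format above.

```java
/** Hypothetical helper: parses the payload of one binder_sample event-log line. */
final class BinderSample {
    final String descriptor;      // binder interface that was called
    final int methodCode;         // transaction code of the method
    final int durationMs;         // how long the call took
    final String blockingPackage; // package in which the blocking occurred
    final int samplePercent;

    private BinderSample(String[] f) {
        descriptor = f[0];
        methodCode = Integer.parseInt(f[1]);
        durationMs = Integer.parseInt(f[2]);
        blockingPackage = f[3];
        samplePercent = Integer.parseInt(f[4]);
    }

    /** Returns the parsed sample, or null if the line is not a binder_sample entry. */
    static BinderSample parse(String line) {
        int tag = line.indexOf("binder_sample:");
        if (tag < 0) {
            return null;
        }
        int open = line.indexOf('[', tag);
        int close = line.lastIndexOf(']');
        if (open < 0 || close < open) {
            return null;
        }
        String[] f = line.substring(open + 1, close).split(",");
        return f.length == 5 ? new BinderSample(f) : null;
    }

    /** True if the call took longer than the given number of milliseconds. */
    boolean slowerThan(int thresholdMs) {
        return durationMs > thresholdMs;
    }

    public static void main(String[] args) {
        BinderSample s = parse(
                "2754 2754 I binder_sample: [android.app.IActivityManager,35,2900,android.process.media,5]");
        System.out.println(s.descriptor + " code " + s.methodCode + " took " + s.durationMs + " ms");
    }
}
```

Filtering a whole event log with slowerThan and the pid of the ANR process quickly shows whether slow binder calls cluster around the time of the ANR.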

3.2. dvm_lock_sample

  • A. Function description: when a thread has been blocked waiting for a lock for longer than the threshold, the current state of the lock is logged.
  • B. Log format: 20003 dvm_lock_sample (process|3),(main|1|5),(thread|3),(time|1|3),(file|3),(line|1|5),(ownerfile|3),(ownerline|1|5),(sample_percent|1|6)
  • C. Log example:
dvm_lock_sample: [system_server,1,Binder_9,1500,ActivityManagerService.java,6403,-,1448,0]

This means that the Binder_9 thread of system_server, at line 6403 of ActivityManagerService.java, was waiting for the AMS lock, which was held at line 1448 of the same file (the "-" in the ownerfile field), so Binder_9 was blocked for 1500 ms.
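The same reading can be produced automatically. The sketch below uses a hypothetical class LockSample and follows the field order in the log format above; it assumes, as in the example, that "-" in the ownerfile field means the owner is in the same file as the waiter.

```java
/** Hypothetical helper: turns one dvm_lock_sample entry into a readable sentence. */
final class LockSample {
    /** Returns a description of the lock wait, or null if the line cannot be parsed. */
    static String describe(String line) {
        int tag = line.indexOf("dvm_lock_sample:");
        if (tag < 0) {
            return null;
        }
        int open = line.indexOf('[', tag);
        int close = line.lastIndexOf(']');
        if (open < 0 || close < open) {
            return null;
        }
        // process, main, thread, time, file, line, ownerfile, ownerline, sample_percent
        String[] f = line.substring(open + 1, close).split(",");
        if (f.length != 9) {
            return null;
        }
        // Assumption from the example: "-" means the owner is in the same file as the waiter.
        String ownerFile = f[6].equals("-") ? f[4] : f[6];
        return f[0] + ":" + f[2] + " waited " + f[3] + " ms at " + f[4] + ":" + f[5]
                + " for a lock held at " + ownerFile + ":" + f[7];
    }

    public static void main(String[] args) {
        System.out.println(describe(
                "dvm_lock_sample: [system_server,1,Binder_9,1500,ActivityManagerService.java,6403,-,1448,0]"));
    }
}
```

The owner location is usually the more interesting half of the sentence, because it names the code that held the lock for too long.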

3.3. binder starved

  • A. Function description: when the binder thread pool of a process such as system_server is used up and has no idle thread, binder communication is starved. If the starvation lasts longer than a threshold, information is logged.
  • B. Cloud control parameter: persist.sys.binder.starvation (default value 16 ms)
  • C. Log example:
1232 1232 "binder thread pool (16 threads) starved for 100 ms"
  • D. Log analysis: the binder thread pool of the system_server process was full for 100 ms.

With this information it is usually possible to tell whether the cause of a problem lies in the system or in the app. See Case 2 below.

3.4. Case 2: Excessive binder calls lead to an application ANR

Search for "ANR in":

08-28 18:54:00.110  1000  1825  1848 E ActivityManager: ANR in com.jeejen.family (com.jeejen.family/com.jeejen.home.launcher.ShoppingActivity)
08-28 18:54:00.110  1000  1825  1848 E ActivityManager: PID: 20576
08-28 18:54:00.110  1000  1825  1848 E ActivityManager: Reason: Input dispatching timed out (com.jeejen.family/com.jeejen.home.launcher.WelcomeActivity, Waiting to send non-key event because the touched window has not finished processing certain input events that were delivered to it over 500.0ms ago.  Wait queue length: 2.  Wait queue head age: 10064.4ms.)
08-28 18:54:00.110  1000  1825  1848 E ActivityManager: Parent: com.jeejen.family/com.jeejen.home.launcher.WelcomeActivity
08-28 18:54:00.110  1000  1825  1848 E ActivityManager: Load: 1.25 / 1.1 / 1.37
08-28 18:54:00.110  1000  1825  1848 E ActivityManager: CPU usage from 5166ms to 0ms ago (2018-08-28 18:53:51.270 to 2018-08-28 18:53:56.436):
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   7.7% 1825/system_server: 5.6% user + 2.1% kernel / faults: 1329 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   3.6% 20683/com.jeejen.family:pushcenter_pushservice: 3% user + 0.5% kernel / faults: 542 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   2.7% 4114/cnss_diag: 1.9% user + 0.7% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   2.1% 422/kworker/u16:7: 0% user + 2.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   1.9% 20830/com.jeejen.family:store: 1.3% user + 0.5% kernel / faults: 199 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   1.7% 20608/com.jeejen.family:pushcenter: 1.1% user + 0.5% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   1.5% 725/[email protected]: 0.7% user + 0.7% kernel / faults: 1 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.9% 3538/com.android.systemui: 0.7% user + 0.1% kernel / faults: 11 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.5% 241/crtc_commit:111: 0% user + 0.5% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.5% 419/kworker/u16:4: 0% user + 0.5% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.5% 786/surfaceflinger: 0.5% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.3% 185/IPCRTR_dsps_sme: 0% user + 0.3% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.3% 730/[email protected]: 0.1% user + 0.1% kernel / faults: 28 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.3% 820/dsps_IPCRTR: 0% user + 0.3% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.3% 1147/msm_irqbalance: 0.1% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.3% 4113/sugov:0: 0% user + 0.3% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 10/rcuop/0: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 18/ksoftirqd/1: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0% 34/ksoftirqd/3: 0% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0% 53/rcuop/5: 0% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0% 61/rcuop/6: 0% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 242/crtc_event:111: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 538/ueventd: 0.1% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 577/jbd2/sda22-8: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 591/logd: 0.1% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 719/[email protected]: 0.1% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 928/thermal-engine: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 3490/cds_mc_thread: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 3491/cds_ol_rx_threa: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 3680/com.android.phone: 0% user + 0.1% kernel / faults: 16 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 4248/com.miui.daemon: 0.1% user + 0% kernel / faults: 4 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 4488/com.miui.powerkeeper: 0.1% user + 0% kernel / faults: 10 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 5545/com.lbe.security.miui: 0% user + 0.1% kernel / faults: 6 minor
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 6490/kworker/u17:2: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 7535/kworker/u16:15: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0% 7723/kworker/3:5: 0% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 15111/kworker/1:0: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 15138/kworker/3:0: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0% 19857/kworker/0:3: 0% user + 0% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager:   0.1% 20492/kworker/5:3: 0% user + 0.1% kernel
08-28 18:54:00.110  1000  1825  1848 E ActivityManager: 3.8% TOTAL: 2% user + 1.1% kernel + 0% iowait + 0.3% irq + 0.1% softirq

Following the routine above, everything here looks fairly normal. The ANR was logged at about 08-28 18:54:00.110. Next, look at the trace of the main thread.

----- pid 20576 at 2018-08-28 18:53:56 -----
Cmd line: com.jeejen.family
"main" prio=5 tid=1 Native
| group="main" sCount=1 dsCount=0 flags=1 obj=0x77ffca18 self=0xecfce000
| sysTid=20576 nice=-10 cgrp=default sched=0/0 handle=0xf0bf2494
| state=S schedstat=( 628294395 402363898 957 ) utm=42 stm=20 core=4 HZ=100
| stack=0xff5fe000-0xff600000 stackSize=8MB
| held mutexes=
kernel: (couldn't read /proc/self/task/20576/stack)
native: #00 pc 00053cfc /system/lib/libc.so (__ioctl+8)
native: #01 pc 00021cd3 /system/lib/libc.so (ioctl+30)
native: #02 pc 0003d3f5 /system/lib/libbinder.so (android::IPCThreadState::talkWithDriver(bool)+204)
native: #03 pc 0003dde3 /system/lib/libbinder.so (android::IPCThreadState::waitForResponse(android::Parcel*, int*)+26)
native: #04 pc 0003713d /system/lib/libbinder.so (android::BpBinder::transact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+36)
native: #05 pc 000c3cf1 /system/lib/libandroid_runtime.so (android_os_BinderProxy_transact(_JNIEnv*, _jobject*, int, _jobject*, _jobject*, int)+200)
at android.os.BinderProxy.transactNative(Native method)
at android.os.BinderProxy.transact(Binder.java:1127)
at android.net.wifi.IWifiManager$Stub$Proxy.getConnectionInfo(IWifiManager.java:1441)
at android.net.wifi.WifiManager.getConnectionInfo(WifiManager.java:1778)
at org.chromium.net.NetworkChangeNotifierAutoDetect$WifiManagerDelegate.getWifiInfoLocked(NetworkChangeNotifierAutoDetect.java:28)
at org.chromium.net.NetworkChangeNotifierAutoDetect$WifiManagerDelegate.getWifiSsid(NetworkChangeNotifierAutoDetect.java:22)
- locked <0x0f4edae7> (a java.lang.Object)
at org.chromium.net.NetworkChangeNotifierAutoDetect.getCurrentNetworkState(NetworkChangeNotifierAutoDetect.java:67)
at org.chromium.net.NetworkChangeNotifierAutoDetect.<init>(NetworkChangeNotifierAutoDetect.java:21)
at org.chromium.net.NetworkChangeNotifier.setAutoDetectConnectivityStateInternal(NetworkChangeNotifier.java:61)

The main thread appears to be blocked in a Binder call, and the interface being called is IWifiManager.getConnectionInfo(). Because it is a Binder call, check binder_sample.

08-28 18:54:01.384 10171 20576 20576 I binder_sample: [android.net.wifi.IWifiManager,24,16004,com.jeejen.family,100]
08-28 18:54:04.868 10171 20576 20576 I binder_sample: [android.net.wifi.IWifiManager,24,3479,com.jeejen.family,100]
08-28 18:56:12.712 10171 21885 21885 I binder_sample: [android.net.wifi.IWifiManager,24,8963,com.jeejen.family,100]
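
In each binder_sample entry the bracketed fields are the interface descriptor, the method code, the call duration in milliseconds, a package name, and the sampling percentage. As a rough illustration of how such lines could be filtered, here is a minimal sketch; BinderSampleParser and its methods are hypothetical names, not part of any Android tool, and the 5000 ms threshold is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: hypothetical helper for reading binder_sample event-log lines. */
class BinderSampleParser {

    /** One parsed binder_sample entry. */
    static final class Sample {
        final String timestamp;    // "MM-DD HH:MM:SS.mmm" as printed by logcat
        final String descriptor;   // Binder interface descriptor
        final int methodCode;      // transaction code of the called method
        final long durationMs;     // how long the call took, in milliseconds
        final String packageName;  // package name recorded in the entry
        final int samplePercent;   // sampling percentage recorded in the entry

        Sample(String timestamp, String descriptor, int methodCode,
               long durationMs, String packageName, int samplePercent) {
            this.timestamp = timestamp;
            this.descriptor = descriptor;
            this.methodCode = methodCode;
            this.durationMs = durationMs;
            this.packageName = packageName;
            this.samplePercent = samplePercent;
        }
    }

    // Timestamp, anything, then "binder_sample: [descriptor,code,timeMs,package,percent]".
    private static final Pattern LINE = Pattern.compile(
            "(\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}\\.\\d{3}).*binder_sample: "
            + "\\[([^,]+),(\\d+),(\\d+),([^,]+),(\\d+)\\]\\s*");

    /** Returns the parsed entry, or null if the line is not a binder_sample line. */
    static Sample parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new Sample(m.group(1), m.group(2), Integer.parseInt(m.group(3)),
                Long.parseLong(m.group(4)), m.group(5), Integer.parseInt(m.group(6)));
    }

    /** Keeps only the entries whose duration is at least thresholdMs. */
    static List<Sample> slowCalls(List<String> lines, long thresholdMs) {
        List<Sample> slow = new ArrayList<>();
        for (String line : lines) {
            Sample s = parse(line);
            if (s != null && s.durationMs >= thresholdMs) {
                slow.add(s);
            }
        }
        return slow;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "08-28 18:54:01.384 10171 20576 20576 I binder_sample: [android.net.wifi.IWifiManager,24,16004,com.jeejen.family,100]",
                "08-28 18:54:04.868 10171 20576 20576 I binder_sample: [android.net.wifi.IWifiManager,24,3479,com.jeejen.family,100]",
                "08-28 18:56:12.712 10171 21885 21885 I binder_sample: [android.net.wifi.IWifiManager,24,8963,com.jeejen.family,100]");
        for (Sample s : slowCalls(lines, 5000)) {
            System.out.println(s.timestamp + " " + s.descriptor
                    + " code=" + s.methodCode + " took " + s.durationMs + " ms");
        }
    }
}
```

With the three entries above and a 5000 ms threshold, the 16004 ms and 8963 ms calls are reported and the 3479 ms call is filtered out.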

Around the time of the ANR, Binder calls through the IWifiManager interface did take a long time (16004 ms, 3479 ms and 8963 ms). So is this a system problem? Take a look at the code on the peer side, in system_server.

    /**
     * See {@link android.net.wifi.WifiManager#getConnectionInfo()}
     * @return the Wi-Fi information, contained in {@link WifiInfo}.
     */
    @Override
    public WifiInfo getConnectionInfo() {
        enforceAccessPermission();
        mLog.trace("getConnectionInfo uid=%").c(Binder.getCallingUid()).flush();
        /*
         * Make sure we have the latest information, by sending
         * a status request to the supplicant.
         */
        return mWifiStateMachine.syncRequestConnectionInfo();
    }

    public WifiInfo syncRequestConnectionInfo() {
        WifiInfo result = new WifiInfo(mWifiInfo);
        return result;
    }

getConnectionInfo in WifiService calls syncRequestConnectionInfo in WifiStateMachine directly, and this implementation does not block. Is the Binder thread pool full, then? The trace does not show that either. So what is going on? We tried to reproduce the problem, and fortunately it was relatively easy to reproduce.

09-04 18:24:29.182 D/WifiStateMachine( 1312): syncRequestConnectionInfo/in SSID: MIOffice-5G, BSSID: 70:3a:0e:2c:bb:f1, MAC: 80:ad:16:4c:0b:fe, Supplicant state: COMPLETED, RSSI: -44, Link speed: 400Mbps, Frequency: 5180MHz, Net ID: 0, Metered hint: false, score: 60
09-04 18:24:29.182 D/WifiStateMachine( 1312): syncRequestConnectionInfo/out SSID: MIOffice-5G, BSSID: 70:3a:0e:2c:bb:f1, MAC: 80:ad:16:4c:0b:fe, Supplicant state: COMPLETED, RSSI: -44, Link speed: 400Mbps, Frequency: 5180MHz, Net ID: 0, Metered hint: false, score: 60

The log was flooded with the output above: the minimalist launcher app called this interface 160 times in one minute, so system_server could not respond to the app in time, which caused the app's own ANR. ANRs caused by Binder calls are very common, because any Binder call carries a risk of being blocked. Where possible, make such calls asynchronously. Also, do not make a large number of Binder calls in a short period of time: this can cause problems in the app itself or, in the worst case, make the system Watchdog restart the device.
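
One way to avoid a burst of identical Binder calls is to reuse a recent result for a short time instead of asking system_server again. The sketch below illustrates the idea in plain Java. ThrottledValue, its loader and the 5-second lifetime are all hypothetical; in a real app the loader would wrap the Binder call and the refresh would run off the main thread.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical helper: reuses the result of an expensive call for ttlMs milliseconds. */
class ThrottledValue<T> {
    private final Supplier<T> loader;    // the expensive call, e.g. a wrapper around a Binder call
    private final long ttlMs;            // how long a loaded value stays fresh
    private final LongSupplier clockMs;  // injected clock, so the behaviour can be tested
    private T cached;
    private long loadedAtMs;
    private boolean loaded;

    ThrottledValue(Supplier<T> loader, long ttlMs, LongSupplier clockMs) {
        this.loader = loader;
        this.ttlMs = ttlMs;
        this.clockMs = clockMs;
    }

    /** Returns the cached value, reloading it only when it is older than ttlMs. */
    synchronized T get() {
        long now = clockMs.getAsLong();
        if (!loaded || now - loadedAtMs >= ttlMs) {
            cached = loader.get();
            loadedAtMs = now;
            loaded = true;
        }
        return cached;
    }

    /** Simulates evenly spaced reads and returns how many of them reached the loader. */
    static int simulate(int calls, long intervalMs, long ttlMs) {
        final long[] now = {0};
        final int[] loads = {0};
        ThrottledValue<String> value = new ThrottledValue<>(() -> {
            loads[0]++;
            return "result-" + loads[0];
        }, ttlMs, () -> now[0]);
        for (int i = 0; i < calls; i++) {
            now[0] = i * intervalMs;
            value.get();
        }
        return loads[0];
    }

    public static void main(String[] args) {
        // 160 reads spread over one minute (one every 375 ms), values kept for 5 seconds.
        System.out.println("loader calls: " + simulate(160, 375, 5000));
    }
}
```

In this simulation, 160 reads in one minute reach the loader only 12 times. The trade-off is that callers may see a value that is up to ttlMs old.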

3.5. Case 3: Broadcast timeout causes App ANR

Let’s continue with Case 3. Following the routine above, first check the event log for the time the ANR occurred.

12-17 06:02:14.463  1566  1583 I am_anr  : [0,8769,com.android.updater,952680005,Broadcast of Intent { act=android.intent.action.BOOT_COMPLETED flg=0x9000010 cmp=com.android.updater/.BootCompletedReceiver (has extras) }]

The am_anr entry gives the time the ANR occurred: 12-17 06:02:14.463. Continue through the log.

12-17 06:02:00.370  1566  1583 W BroadcastQueue: Timeout of broadcast BroadcastRecord{21ef8c2 u0 android.intent.action.BOOT_COMPLETED} - receiver=android.os.BinderProxy@2a6c365, started 60006ms ago
12-17 06:02:00.370  1566  1583 W BroadcastQueue: Receiver during timeout: ResolveInfo{5a8283a com.android.updater/.BootCompletedReceiver m=0x108000}
12-17 06:02:00.370  1566  1583 I am_broadcast_discard_app: [0,35584194,android.intent.action.BOOT_COMPLETED,49,ResolveInfo{5a8283a com.android.updater/.BootCompletedReceiver m=0x108000}]

However, the broadcast timeout was already logged at 12-17 06:02:00.370, which shows that the time in the event log is only approximate and can lag when system resources are tight. Since the ANR was triggered by the android.intent.action.BOOT_COMPLETED broadcast, we can follow that clue.
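
The two timestamps can be related with simple arithmetic. The sketch below (class and method names are hypothetical) subtracts the "started 60006ms ago" figure from the time of the timeout entry to estimate when the broadcast was dispatched, and measures how far the am_anr entry lags behind the timeout. It assumes both timestamps fall on the same day.

```java
import java.time.Duration;
import java.time.LocalTime;

/** Illustrative arithmetic on the timestamps in the BroadcastQueue and am_anr entries. */
class BroadcastTimeline {

    /** Time the broadcast was dispatched, from the timeout entry and its "started N ms ago". */
    static LocalTime broadcastStart(LocalTime timeoutLogged, long startedAgoMs) {
        return timeoutLogged.minus(Duration.ofMillis(startedAgoMs));
    }

    /** How many milliseconds the am_anr entry lags behind the timeout entry. */
    static long lagMs(LocalTime timeoutLogged, LocalTime amAnrLogged) {
        return Duration.between(timeoutLogged, amAnrLogged).toMillis();
    }

    public static void main(String[] args) {
        LocalTime timeout = LocalTime.parse("06:02:00.370"); // BroadcastQueue timeout entry
        LocalTime amAnr = LocalTime.parse("06:02:14.463");   // am_anr entry in the event log

        System.out.println("broadcast dispatched at " + broadcastStart(timeout, 60006));
        System.out.println("am_anr lag: " + lagMs(timeout, amAnr) + " ms");
    }
}
```

Run as written, this places the dispatch of the broadcast at 06:01:00.364 and the am_anr lag at 14093 ms.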

12-17 06:01:00.383  1566  3524 I ActivityManager: Start proc 8769:com.android.updater/9802 for broadcast com.android.updater/.BootCompletedReceiver caller=null

The process for the broadcast was started at 12-17 06:01:00.383.

12-17 06:01:36.721  8769  8769 D BootCompletedReceiver: onReceive android.intent.action.BOOT_COMPLETED
12-17 06:02:14.725 8769 8769 D UpdateService: onCreate

At 12-17 06:01:36.721 the onReceive method of BootCompletedReceiver is called back. onReceive then starts UpdateService, and UpdateService.onCreate is called at 12-17 06:02:14.725. This raises two preliminary questions.

First, the process for the broadcast was started at 12-17 06:01:00.383, but onReceive was only called back at 12-17 06:01:36.721 (the timeout fired at 12-17 06:02:00.370). Why did the receiver get the BOOT_COMPLETED broadcast 36 seconds after the process started? That is abnormal in itself. Second, why did it take about 38 more seconds, more than a minute after the process started, before UpdateService.onCreate ran? At this point the app developers felt they could not take the analysis any further and concluded that it was most likely a system problem. Looking at the CPU statistics, they believed that some process was using too much CPU, and posted the following log.

12-17 06:02:19.286  1566  1583 E ActivityManager: ANR in com.android.updater
12-17 06:02:19.286  1566  1583 E ActivityManager: PID: 8769
12-17 06:02:19.286  1566  1583 E ActivityManager: Reason: Broadcast of Intent { act=android.intent.action.BOOT_COMPLETED flg=0x9000010 cmp=com.android.updater/.BootCompletedReceiver (has extras) }
12-17 06:02:19.286  1566  1583 E ActivityManager: Load: 0.0 / 0.0 / 0.0
12-17 06:02:19.286  1566  1583 E ActivityManager: CPU usage from 0ms to 18846ms later (2017-12-17 06:02:00.379 to 2017-12-17 06:02:19.224):
12-17 06:02:19.286  1566  1583 E ActivityManager:   195% 6142/com.immomo.momo: 195% user + 0% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   2.3% 8170/com.tencent.mm: 2.3% user + 0% kernel / faults: 448 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.7% 1566/system_server: 0.4% user + 0.3% kernel / faults: 150 minor 1 major
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.4% 90/kworker/u16:3: 0% user + 0.4% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.3% 4704/com.tencent.mm:push: 0.1% user + 0.2% kernel / faults: 116 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.3% 8769/com.android.updater: 0.2% user + 0.1% kernel / faults: 1600 minor 2 major
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.2% 4790/com.tencent.mm:patch: 0.2% user + 0% kernel / faults: 748 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.2% 329/mmc-cmdqd/0: 0% user + 0.2% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.2% 5429/com.tencent.mm:push: 0% user + 0.1% kernel / faults: 17 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.2% 5435/com.tencent.mm:patch: 0.2% user + 0% kernel / faults: 82 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.2% 8712/com.tencent.mm:exdevice: 0.1% user + 0% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 432/logd: 0.1% user + 0% kernel / faults: 4 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 844/msm_irqbalance: 0% user + 0.1% kernel / faults: 4 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 7580/kworker/u16:2: 0% user + 0.1% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 7/rcu_preempt: 0% user + 0.1% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 1240/zygote: 0% user + 0.1% kernel / faults: 84 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0% 3216/com.xiaomi.simactivate.service: 0% user + 0% kernel / faults: 5 minor
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 8645/kworker/7:0: 0% user + 0.1% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0.1% 8730/kworker/4:2: 0% user + 0.1% kernel
12-17 06:02:19.286  1566  1583 E ActivityManager:   0% 45/rcuop/4: 0% user + 0% kernel

However, a CPU usage of 195% is not high. On a multi-core device each core can reach 100%, so eight cores give a maximum of 800%. Also, Load: 0.0 / 0.0 / 0.0 says that the load averages over the past 1, 5 and 15 minutes are all 0, as if the system had stopped, so this log does not look entirely reliable. Xiaomi phones, however, have enhanced ANR monitoring, which outputs the following log.

12-17 06:02:14.693  8769  8769 W MIUI-BLOCK-MONITOR: The msg { when=-36s107ms what=113 obj=ReceiverData{intent=Intent { act=android.intent.action.BOOT_COMPLETED flg=0x9000010 cmp=com.android.updater/.BootCompletedReceiver (has extras) } packageName=com.android.updater resultCode=0 resultData=null resultExtras=null} target=android.app.ActivityThread$H planTime=1513461660613 dispatchTime=1513461696720 finishTime=0 } took 74080ms and took 37973ms after dispatch. 

We additionally record the time points of each status of each Message to facilitate our analysis.

  • when: the time from when the message was due to run to when the ANR occurred
  • planTime: the time point at which the message was scheduled to run
  • dispatchTime: the time point at which the message actually started to run
  • finishTime: the time point at which the message finished

Here -when - (dispatchTime - planTime) = 36107 ms - 36107 ms = 0, so what does this mean? It means the whole delay recorded in when was spent waiting: message 113 sat in the main thread's Looper queue for about 36 seconds before it was dispatched. There is no main thread trace in this log, but that does not matter here, because the question is what ran before this message was dispatched. So what was the main thread doing during those 36 seconds? More logs from the same process give the answer.

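
The numbers in the MIUI-BLOCK-MONITOR record can be cross-checked with a few lines of arithmetic. This is a minimal sketch with hypothetical names; the UTC+8 offset used to convert epoch milliseconds to the log's wall-clock time is an assumption about the device's time zone.

```java
import java.time.Instant;
import java.time.LocalTime;
import java.time.ZoneOffset;

/** Illustrative cross-check of the planTime / dispatchTime values in the record above. */
class MessageTiming {

    /** Time the message spent waiting in the queue before it was dispatched. */
    static long queueWaitMs(long planTimeMs, long dispatchTimeMs) {
        return dispatchTimeMs - planTimeMs;
    }

    /** Converts epoch milliseconds to wall-clock time at the given UTC offset (assumed). */
    static LocalTime toLocalTime(long epochMs, int utcOffsetHours) {
        return Instant.ofEpochMilli(epochMs)
                .atOffset(ZoneOffset.ofHours(utcOffsetHours))
                .toLocalTime();
    }

    public static void main(String[] args) {
        long planTime = 1513461660613L;      // planTime from the record
        long dispatchTime = 1513461696720L;  // dispatchTime from the record
        long totalMs = 74080;                // "took 74080ms"
        long afterDispatchMs = 37973;        // "took 37973ms after dispatch"

        System.out.println("queue wait: " + queueWaitMs(planTime, dispatchTime) + " ms");
        System.out.println("total - after dispatch: " + (totalMs - afterDispatchMs) + " ms");
        System.out.println("planned at " + toLocalTime(planTime, 8));
        System.out.println("dispatched at " + toLocalTime(dispatchTime, 8));
    }
}
```

The queue wait comes out at 36107 ms, which matches when=-36s107ms and also the difference between the two durations in the record (74080 ms - 37973 ms). Under the UTC+8 assumption, dispatchTime corresponds to 06:01:36.720, one millisecond before the onReceive line logged by BootCompletedReceiver.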
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: The binder call took 3973ms.
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: java.lang.Throwable
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.os.AnrMonitor.checkBinderCallTime(AnrMonitor.java:591)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.os.BinderProxy.transact(Binder.java:623)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.content.pm.IPackageManager$Stub$Proxy.getApplicationInfo(IPackageManager.java:2658)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.ApplicationPackageManager.getApplicationInfoAsUser(ApplicationPackageManager.java:340)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.ApplicationPackageManager.getApplicationInfo(ApplicationPackageManager.java:333)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at miui.core.ManifestParser.create(SourceFile:64)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at miui.core.SdkManager.start(SourceFile:186)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at java.lang.reflect.Method.invoke(Native Method)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at miui.external.a.abx()
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at miui.external.a.attachBaseContext()
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.Application.attach(Application.java:193)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.Instrumentation.newApplication(Instrumentation.java:1009)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.Instrumentation.newApplication(Instrumentation.java:993)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.LoadedApk.makeApplication(LoadedApk.java:800)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.ActivityThread.handleBindApplication(ActivityThread.java:5471)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.ActivityThread.-wrap2(ActivityThread.java)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1584)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.os.Handler.dispatchMessage(Handler.java:102)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.os.Looper.loop(Looper.java:163)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at android.app.ActivityThread.main(ActivityThread.java:6221)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at java.lang.reflect.Method.invoke(Native Method)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:904)
12-17 06:01:29.334 8769 8769 W MIUI-BLOCK-MONITOR: at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:794)

During those 36 seconds the process was still executing bindApplication. Given that, the system was most likely in poor shape at the time. The analysis also shows that onReceive of BootCompletedReceiver runs on the main thread, and that starting the Service takes time as well. When a lot of time is being spent there, we can also consider specifying a Handler when registering a receiver, so that onReceive runs on a worker thread (see the source code for how to do this).

4. Analysis routines for ANR problems

  • Grab the bugreport, search for "ANR in", check the time and the process involved, and see whether the CPU load shows any problem.
  • Find the trace of the main thread for that process and locate where it is blocked. If it is a Binder call, check what the peer side is doing; if it is a time-consuming operation, move it off the main thread. If you suspect that the system itself is slow, check binder_sample, dvm_lock_sample and similar entries. Whether there is a lot of GC and whether lmk is killing processes frequently also indicate the health of the system.
  • Analyze and solve the problem based on the source code.
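
The first step of this routine is easy to script. The following is a minimal sketch, not a complete bugreport parser: AnrReportScanner and its fields are hypothetical names, and it only understands the logcat line layout shown in the cases above. It pulls out the time, process, PID, reason and load of the first "ANR in" report, plus the process with the highest CPU percentage.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative scanner for the "ANR in" block that ActivityManager writes to the log. */
class AnrReportScanner {

    /** Summary of one ANR report; fields stay null or -1 when not found. */
    static final class Report {
        String time;
        String process;
        int pid = -1;
        String reason;
        String load;
        String topCpuProcess;
        double topCpuPercent = -1;
    }

    // "MM-DD HH:MM:SS.mmm <numeric columns> E ActivityManager: <message>"
    private static final Pattern AM_LINE = Pattern.compile(
            "(\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}\\.\\d{3})(?:\\s+\\d+)+ E ActivityManager: (.*)");
    // "  195% 6142/com.immomo.momo: 195% user + 0% kernel"
    private static final Pattern CPU_LINE = Pattern.compile(
            "\\s*([0-9.]+)% (\\d+)/(.+?): [0-9.]+% user.*");

    /** Returns the first ANR report found in the lines, or null if there is none. */
    static Report scan(List<String> lines) {
        Report r = null;
        for (String line : lines) {
            Matcher m = AM_LINE.matcher(line);
            if (!m.matches()) {
                continue;
            }
            String msg = m.group(2);
            if (msg.startsWith("ANR in ")) {
                if (r != null) {
                    break; // a second report starts here; keep only the first
                }
                r = new Report();
                r.time = m.group(1);
                r.process = msg.substring("ANR in ".length()).trim();
            } else if (r == null) {
                continue; // ignore ActivityManager errors before the report
            } else if (msg.startsWith("PID: ")) {
                r.pid = Integer.parseInt(msg.substring("PID: ".length()).trim());
            } else if (msg.startsWith("Reason: ")) {
                r.reason = msg.substring("Reason: ".length()).trim();
            } else if (msg.startsWith("Load: ")) {
                r.load = msg.substring("Load: ".length()).trim();
            } else {
                Matcher c = CPU_LINE.matcher(msg);
                if (c.matches()) {
                    double percent = Double.parseDouble(c.group(1));
                    if (percent > r.topCpuPercent) {
                        r.topCpuPercent = percent;
                        r.topCpuProcess = c.group(3);
                    }
                }
            }
        }
        return r;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "12-17 06:02:19.286  1566  1583 E ActivityManager: ANR in com.android.updater",
                "12-17 06:02:19.286  1566  1583 E ActivityManager: PID: 8769",
                "12-17 06:02:19.286  1566  1583 E ActivityManager: Load: 0.0 / 0.0 / 0.0",
                "12-17 06:02:19.286  1566  1583 E ActivityManager:   195% 6142/com.immomo.momo: 195% user + 0% kernel",
                "12-17 06:02:19.286  1566  1583 E ActivityManager:   0.3% 4704/com.tencent.mm:push: 0.1% user + 0.2% kernel / faults: 116 minor");
        Report r = scan(lines);
        System.out.println(r.time + " ANR in " + r.process + " (pid " + r.pid + "), load " + r.load);
        System.out.println("top CPU: " + r.topCpuProcess + " at " + r.topCpuPercent + "%");
    }
}
```

A summary like this only tells you where to look. The main thread trace and the source code are still needed to find the actual cause.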

This article only records a few cases and the method of analysis. The guiding idea is to find out why the main thread was blocked during the period before the ANR, which is generally not hard to master. It does not go into the underlying mechanisms, such as how the ANR trace is dumped, how the system decides that an ANR has occurred, or what to do when the ANR trace is invalid. ANR problems can be a headache because the trace is not necessarily the scene of the crime. Some phone manufacturers have strengthened ANR monitoring so that more information is output, which makes ANR analysis more efficient. Engineers who work on ROMs also tend to find ANR problems easier, because they read the system source code as part of their work.

Origin blog.csdn.net/weixin_47465999/article/details/129664103