Android Profiler: ANR

Official Android documentation: ANR

If an Android app's UI thread is blocked for too long, an "Application Not Responding" (ANR) error can be triggered. If the app is in the foreground, the system displays a dialog box to the user, as shown in Figure 1. The ANR dialog box provides the user with the option to force quit the application.

Figure 1. ANR dialog box displayed to the user

ANRs are a problem because the app's main thread, which is responsible for updating the interface, cannot process user input events or draw, which frustrates the user. For more information about your app's main thread, see Processes and Threads.

An ANR will be triggered for your app when any of the following conditions occur:

  • Input dispatch timeout: Your app has not responded to an input event (such as a key press or screen touch) within 5 seconds.
  • Executing service: A service declared by your app cannot finish executing Service.onCreate() and Service.onStartCommand() or Service.onBind() within a few seconds.
  • Service.startForeground() not called: Your app uses Context.startForegroundService() to start a new service in the foreground, but the service does not call startForeground() within 5 seconds.
  • Broadcast of intent: A BroadcastReceiver has not finished executing within a set amount of time. If the app has any activity in the foreground, this timeout is 5 seconds.

If your app encounters ANRs, you can use the guidance in this article to diagnose and fix the problem.
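
The timeouts above all come down to a thread that does not get to its queued work in time. As a rough illustration of that idea (not how the system itself detects ANRs), the following plain-Java sketch probes a thread by submitting a no-op task and checking whether it runs within a deadline. A single-thread executor stands in for the main thread, and every name in it (ResponsivenessProbe, isResponsive, runDemo) is hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Sketch of a responsiveness probe; all names here are hypothetical. */
final class ResponsivenessProbe {

    /** Returns true if the monitored thread ran a no-op task within timeoutMillis. */
    static boolean isResponsive(ExecutorService monitored, long timeoutMillis) {
        Future<?> probe = monitored.submit(() -> { });
        try {
            probe.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            probe.cancel(false); // Drop the probe; the thread is still busy.
            return false;
        } catch (ExecutionException e) {
            return true; // The probe ran, so the queue is being drained.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Probes an idle thread, then the same thread while it is stuck in a long task. */
    static boolean[] runDemo() {
        // Assumption: a single-thread executor plays the role of the main thread.
        ExecutorService fakeMainThread = Executors.newSingleThreadExecutor();
        try {
            boolean idle = isResponsive(fakeMainThread, 2_000);
            fakeMainThread.submit(() -> sleep(3_000)); // Simulated slow handler.
            boolean busy = isResponsive(fakeMainThread, 200);
            return new boolean[] {idle, busy};
        } finally {
            fakeMainThread.shutdownNow(); // Interrupts the simulated slow handler.
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        boolean[] result = runDemo();
        System.out.println("idle thread responsive: " + result[0]);
        System.out.println("busy thread responsive: " + result[1]);
    }
}
```

The probe only reports that the thread was busy, not why; the diagnostic techniques later in this article are still needed to find the cause.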

Detect the problem

If you have already published your app, you can use Android Vitals to view information about your app's ANRs. You can use other tools to detect ANRs in the field, but note that, unlike Android Vitals, third-party tools cannot report ANRs on older versions of Android (Android 10 and lower).

Android Vitals

Android Vitals helps you monitor and improve your app's ANR rate. Android Vitals measures several ANR rates:

  • ANR rate: The percentage of daily active users who experienced any type of ANR.
  • User-perceived ANR rate: The percentage of daily active users who experienced at least one user-perceived ANR. Currently, only ANRs of type "Input dispatching timed out" are considered user-perceived.
  • Multiple ANR rate: The percentage of daily active users who experienced at least two ANRs.

A "daily active user" is a unique user who uses your app on a single day on a single device, possibly over multiple sessions. If a user uses your app on more than one device in a day, each device counts toward the number of active users for that day. If multiple users use the same device in a day, they are counted as one active user.

The user-perceived ANR rate is a core vital, which means it affects your app's visibility on Google Play. It is important because the ANRs it counts always occur while the user is interacting with the app, which causes the most disruption.

Play defines two bad behavior thresholds for this metric:

  • Overall bad behavior threshold: Across all device models, at least 0.47% of daily active users experience a user-perceived ANR.
  • Per-device bad behavior threshold: On a single device model, at least 8% of daily active users experience a user-perceived ANR.

If your app exceeds the overall bad behavior threshold, it may become less visible on all devices. If your app exceeds the per-device bad behavior threshold on certain devices, it may become less visible on those devices, and a warning may appear in your store listing.
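
To make the arithmetic behind these thresholds concrete, here is a small Java sketch that turns user counts into a percentage and compares it with the two values quoted above. The class and method names are hypothetical, the user counts in main are made up for illustration, and this is not how Play computes the metric internally.

```java
/** Sketch: compares a user-perceived ANR rate with the thresholds quoted in the text. */
final class AnrThresholds {
    // Threshold values taken from the text, as percentages of daily active users.
    static final double OVERALL_THRESHOLD_PERCENT = 0.47;
    static final double PER_DEVICE_THRESHOLD_PERCENT = 8.0;

    /** Percentage of daily active users who hit at least one user-perceived ANR. */
    static double ratePercent(long usersWithAnr, long dailyActiveUsers) {
        if (dailyActiveUsers <= 0) {
            return 0.0; // No active users means there is no measurable rate.
        }
        return 100.0 * usersWithAnr / dailyActiveUsers;
    }

    /** True if the rate across all device models is at or above the overall threshold. */
    static boolean exceedsOverall(long usersWithAnr, long dailyActiveUsers) {
        return ratePercent(usersWithAnr, dailyActiveUsers) >= OVERALL_THRESHOLD_PERCENT;
    }

    /** True if the rate on one device model is at or above the per-device threshold. */
    static boolean exceedsPerDevice(long usersWithAnr, long dailyActiveUsers) {
        return ratePercent(usersWithAnr, dailyActiveUsers) >= PER_DEVICE_THRESHOLD_PERCENT;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 50 of 10,000 users overall, 30 of 500 on one device model.
        System.out.println("overall rate %: " + ratePercent(50, 10_000));
        System.out.println("exceeds overall: " + exceedsOverall(50, 10_000));
        System.out.println("exceeds per-device: " + exceedsPerDevice(30, 500));
    }
}
```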

Android Vitals can alert you through the Play Console when your app is exhibiting excessive ANRs.

To learn how Google Play collects Android vitals data, see the Play Console documentation.

Diagnosing ANRs

There are several common patterns to look for when diagnosing ANRs:

  • The app is doing slow operations involving I/O on the main thread.
  • The app is doing a long calculation on the main thread.
  • The main thread is making a synchronous binder call to another process, and that process is taking a long time to return.
  • The main thread is blocked waiting for a synchronized block while a long operation runs on another thread.
  • The main thread is in a deadlock with another thread, either in the same process or through a binder call. The main thread is not just waiting for a long operation to finish; it is in a deadlock situation. For more information, see Deadlock on Wikipedia.

The following techniques can help you determine the cause of your ANRs.

HealthStats

HealthStats captures total user and system time, CPU time, network, wireless device statistics, screen on/off time, and wake-up alarms to provide information about app health. This helps measure overall CPU usage and power consumption.

Debug

Debug helps you inspect Android apps during development, including tracing and allocation counts, to identify jank and lag in your app. You can also use Debug to obtain runtime and native memory counters and memory metrics, which can help determine the memory footprint of a specific process.

ApplicationExitInfo

ApplicationExitInfo is available in Android 11 (API level 30) or higher and provides information about the reason why the app exited. Such causes include ANR, out of memory, app crashes, high CPU usage, user interruption, system interruption, or runtime permission changes.

StrictMode

Using StrictMode helps you find accidental I/O operations on the main thread while you are developing your app. You can use StrictMode at the application level or the activity level.

Enable background ANR dialog

Android displays the ANR dialog box for apps that take too long to process broadcast messages only if Show all ANRs is enabled in the device's Developer Options. Therefore, the background ANR dialog box is not always displayed to the user, but the app may still experience performance issues.

TraceView

You can use TraceView to capture traces of your running app while you go through its use cases, and to identify the places where the main thread is busy. To learn how to use TraceView, see Analyzing Performance with TraceView and dmtracedump.

Pull a traces file

Android stores trace information when an ANR occurs. On older OS releases, there is a single /data/anr/traces.txt file on the device. On newer OS releases, there are multiple /data/anr/anr_* files. You can use Android Debug Bridge (adb) as root to access ANR traces from a device or emulator:

adb root
adb shell ls /data/anr
adb pull /data/anr/<filename>

You can get a bug report from a physical device using the "Generate bug report" developer option on the device or the adb bugreport command on your development machine. For more information, see Obtaining and reading bug reports.

Solve the problem

After you identify the problem, you can use the tips in this section to resolve common problems.

Slow code execution on the main thread

Identify the places in your code where the app's main thread is busy for more than 5 seconds. Look for suspicious use cases in your app and try to reproduce the ANR.

For example, Figure 2 shows the TraceView timeline where the main thread was busy for more than 5 seconds.

Figure 2. TraceView timeline shows a busy main thread

Figure 2 shows that most of the offending code occurs in the onClick(View) handler, as shown in the following code example:

@Override
public void onClick(View view) {
    // This task runs on the main thread.
    BubbleSort.sort(data);
}

In this case, you should move the work that runs on the main thread to a worker thread. The Android framework includes classes that help move tasks to worker threads. For more information, see Worker threads.
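
As a framework-independent sketch of that idea, the following plain-Java example runs a sort on a worker executor and hands the result to a second executor that stands in for the main thread. The class OffloadSort and the method sortAsync are hypothetical names, and Arrays.sort replaces the article's BubbleSort so the example is self-contained. On Android, the callback executor could plausibly be the one returned by Context.getMainExecutor() (API level 28 and higher), but that wiring is an assumption and is not shown here.

```java
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Sketch: keep slow work off the thread that must stay responsive. */
final class OffloadSort {

    /** Sorts a copy of data on the given worker executor. */
    static CompletableFuture<int[]> sortAsync(int[] data, Executor worker) {
        int[] copy = Arrays.copyOf(data, data.length); // Leave the caller's array untouched.
        return CompletableFuture.supplyAsync(() -> {
            Arrays.sort(copy); // The slow part runs on the worker.
            return copy;
        }, worker);
    }

    public static void main(String[] args) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        // Assumption: this executor plays the role of the main thread.
        ExecutorService fakeMainThread = Executors.newSingleThreadExecutor();
        try {
            sortAsync(new int[] {5, 3, 9, 1}, worker)
                    // Only the cheap "update the UI" step runs on the main-thread stand-in.
                    .thenAcceptAsync(
                            sorted -> System.out.println(Arrays.toString(sorted)),
                            fakeMainThread)
                    .join(); // Blocking here is only so the demo can exit cleanly.
        } finally {
            worker.shutdown();
            fakeMainThread.shutdown();
        }
    }
}
```

The click handler itself would return immediately after calling sortAsync; the join() call exists only to keep the demo process alive until the result is printed.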

I/O on the main thread

Performing I/O operations on the main thread is a common cause of slow operations on the main thread, and slow operations on the main thread can cause ANRs. It is recommended to move all I/O operations to a worker thread, as shown in the previous section.

Examples of I/O operations include network and storage operations. For more information, see Perform network operations and Save data.

Lock contention

In some cases, the work that causes the ANR is not performed directly on the application's main thread. An ANR may occur if a worker thread holds a lock on a resource that is required by the main thread to complete its work.

For example, Figure 4 shows a TraceView timeline in which most of the work is performed on worker threads.

Figure 4. TraceView timeline showing work being performed on worker threads

If users still encounter ANRs in this situation, check the state of the main thread in Android Device Monitor. Typically, the main thread is in the RUNNABLE state when it is ready to update the interface and is generally responsive.

If the main thread cannot resume execution, however, it is in the BLOCKED state and cannot respond to events. Android Device Monitor displays this state as "monitoring" or "waiting", as shown in Figure 5.

Figure 5. Main thread in "monitoring" state

The following trace excerpt was captured while the app's main thread was blocked waiting for a resource. It shows the worker thread that holds the lock:

...
"AsyncTask #2" prio=5 tid=18 Runnable
  | group="main" sCount=0 dsCount=0 obj=0x12c333a0 self=0x94c87100
  | sysTid=25287 nice=10 cgrp=default sched=0/0 handle=0x94b80920
  | state=R schedstat=( 0 0 0 ) utm=757 stm=0 core=3 HZ=100
  | stack=0x94a7e000-0x94a80000 stackSize=1038KB
  | held mutexes= "mutator lock"(shared held)
  at com.android.developer.anrsample.BubbleSort.sort(BubbleSort.java:8)
  at com.android.developer.anrsample.MainActivity$LockTask.doInBackground(MainActivity.java:147)
  - locked <0x083105ee> (a java.lang.Boolean)
  at com.android.developer.anrsample.MainActivity$LockTask.doInBackground(MainActivity.java:135)
  at android.os.AsyncTask$2.call(AsyncTask.java:305)
  at java.util.concurrent.FutureTask.run(FutureTask.java:237)
  at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:243)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607)
  at java.lang.Thread.run(Thread.java:761)
...
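
When a traces file is long, it can help to scan it mechanically for lock owners. The following Java sketch is one possible way to do that, assuming thread headers and "- locked" lines look like the ones in the excerpt above. The class TraceLockScanner and its regular expressions are hypothetical and cover only those two line shapes, not the full traces format.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: lists which thread holds which lock in an ANR trace excerpt. */
final class TraceLockScanner {
    // Matches headers such as: "AsyncTask #2" prio=5 tid=18 Runnable
    // The leading quote is optional because some copies of a trace lose it.
    private static final Pattern THREAD_HEADER =
            Pattern.compile("^\"?(.+?)\"(?: daemon)? prio=\\d+ tid=\\d+ (\\w+)");
    // Matches lines such as: - locked <0x083105ee> (a java.lang.Boolean)
    private static final Pattern LOCKED =
            Pattern.compile("^\\s*- locked <(0x[0-9a-f]+)> \\(a ([\\w.$]+)\\)");

    /** Maps "lockId type" to "threadName (state)" for every lock found. */
    static Map<String, String> findLockHolders(String trace) {
        Map<String, String> holders = new LinkedHashMap<>();
        String currentThread = null;
        for (String line : trace.split("\\R")) {
            Matcher header = THREAD_HEADER.matcher(line);
            if (header.find()) {
                currentThread = header.group(1) + " (" + header.group(2) + ")";
                continue;
            }
            Matcher locked = LOCKED.matcher(line);
            if (currentThread != null && locked.find()) {
                holders.put(locked.group(1) + " " + locked.group(2), currentThread);
            }
        }
        return holders;
    }

    public static void main(String[] args) {
        String trace = String.join("\n",
                "\"AsyncTask #2\" prio=5 tid=18 Runnable",
                "  at com.android.developer.anrsample.BubbleSort.sort(BubbleSort.java:8)",
                "  - locked <0x083105ee> (a java.lang.Boolean)");
        findLockHolders(trace).forEach(
                (lock, thread) -> System.out.println(lock + " held by " + thread));
    }
}
```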

Reviewing trace information can help you find code that is blocking the main thread. The following code holds a lock that causes the main thread to block in the previous trace:

@Override
public void onClick(View v) {
    // The worker thread holds a lock on lockedResource
    new LockTask().execute(data);

    synchronized (lockedResource) {
        // The main thread requires lockedResource here
        // but it has to wait until LockTask finishes using it.
    }
}

public class LockTask extends AsyncTask<Integer[], Integer, Long> {
    @Override
    protected Long doInBackground(Integer[]... params) {
        synchronized (lockedResource) {
            // This is a long-running operation, which makes
            // the lock last for a long time
            BubbleSort.sort(params[0]);
        }
        // The result is unused in this example.
        return null;
    }
}

Another example is an app's main thread waiting for a result from a worker thread, as shown in the following code. Note that using wait() and notify() is not a recommended pattern in Kotlin, which has its own mechanisms for handling concurrency. When using Kotlin, prefer the Kotlin-specific mechanisms where possible.

public void onClick(View v) {
    WaitTask waitTask = new WaitTask();
    synchronized (waitTask) {
        try {
            waitTask.execute(data);
            // Wait for this worker thread's notification
            waitTask.wait();
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can see the interruption.
            Thread.currentThread().interrupt();
        }
    }
}

class WaitTask extends AsyncTask<Integer[], Integer, Long> {
    @Override
    protected Long doInBackground(Integer[]... params) {
        synchronized (this) {
            BubbleSort.sort(params[0]);
            // Finished, notify the main thread
            notify();
        }
        // The result is unused in this example.
        return null;
    }
}

Other situations can also block the main thread, including threads that use Lock, Semaphore, resource pools (such as a pool of database connections), or other mutual exclusion (mutex) mechanisms.

You should evaluate the locks that your app holds on resources in general, but if you want to avoid ANRs, look first at the locks held on resources that the main thread needs.

Make sure that locks are held for as little time as possible or, better still, evaluate whether the app needs the lock in the first place. If you are using the lock to decide when to update the interface based on the progress of a worker thread, use mechanisms such as onProgressUpdate() and onPostExecute() to communicate between the worker and main threads.
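
One way to keep the hold time short, sketched below in plain Java, is to take the lock only long enough to copy the shared data and then do the slow work on the copy. The class ShortLockHold and its methods are hypothetical, and Collections.sort stands in for the long-running operation. Whether copying is acceptable depends on the size of the data and on whether a slightly stale snapshot is good enough for your use case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch: hold the lock for the copy only, not for the slow operation. */
final class ShortLockHold {
    private final Object lock = new Object();
    private final List<Integer> shared = new ArrayList<>();

    /** Cheap operation; a main thread calling this is only ever blocked briefly. */
    void add(int value) {
        synchronized (lock) {
            shared.add(value);
        }
    }

    /** Intended for a worker thread: snapshots under the lock, sorts outside it. */
    List<Integer> sortedSnapshot() {
        List<Integer> copy;
        synchronized (lock) {
            copy = new ArrayList<>(shared); // Short critical section.
        }
        Collections.sort(copy); // Slow work runs without holding the lock.
        return copy;
    }

    public static void main(String[] args) {
        ShortLockHold holder = new ShortLockHold();
        holder.add(3);
        holder.add(1);
        holder.add(2);
        System.out.println(holder.sortedSnapshot());
    }
}
```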

Deadlock

A deadlock occurs when a thread enters a wait state because the required resource is held by another thread that is also waiting for a resource held by the first thread. If the application's main thread is in this situation, an ANR is likely to occur.

The deadlock phenomenon is well studied in the field of computer science, and there are currently several deadlock prevention algorithms that can be used to avoid deadlocks.

For more information, see Deadlocks and Deadlock Prevention Algorithms on Wikipedia.
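
One widely used prevention technique is to always acquire locks in a fixed global order, so that two threads can never each hold the lock the other needs. The following Java sketch illustrates it with two resources that are updated from opposite directions. The names (OrderedLocking, Resource, move, runDemo) are hypothetical, and the sketch assumes every resource has a unique id.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Sketch: avoid deadlock by always locking the resource with the lower id first. */
final class OrderedLocking {

    static final class Resource {
        final int id; // Assumed unique; it defines the global lock order.
        final ReentrantLock lock = new ReentrantLock();
        int value;

        Resource(int id, int value) {
            this.id = id;
            this.value = value;
        }
    }

    /** Moves amount from one resource to another, locking in id order. */
    static void move(Resource from, Resource to, int amount) {
        Resource first = from.id <= to.id ? from : to;
        Resource second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.value -= amount;
                to.value += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    /** Runs opposing moves on two threads; without ordering this pattern could deadlock. */
    static int[] runDemo(int iterations) {
        Resource a = new Resource(1, 1_000);
        Resource b = new Resource(2, 1_000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) move(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) move(b, a, 1);
        });
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {a.value, b.value};
    }

    public static void main(String[] args) {
        int[] values = runDemo(10_000);
        System.out.println("a=" + values[0] + ", b=" + values[1]);
    }
}
```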

Slow broadcast receivers

Apps can respond to broadcast messages, such as airplane mode being enabled or disabled or a change in connectivity status, by means of broadcast receivers. An ANR occurs when an app takes too long to process a broadcast message.

An ANR occurs in the following cases:

  • A broadcast receiver has not finished executing its onReceive() method within a considerable amount of time.
  • A broadcast receiver calls goAsync() and fails to call finish() on the PendingResult object.

Your app should only perform short operations in the onReceive() method of a BroadcastReceiver. However, if your app requires more complex processing as a result of a broadcast message, you should defer that task to an IntentService.

You can use tools such as TraceView to identify whether a broadcast receiver is performing a long-running operation on your app's main thread. For example, Figure 6 shows the timeline for a broadcast receiver that took approximately 100 seconds to process messages on the main thread.

Figure 6. TraceView timeline showing BroadcastReceiver work on the main thread

This behavior can be caused by performing a long-running operation in the onReceive() method of the BroadcastReceiver, as shown in the following example:

@Override
public void onReceive(Context context, Intent intent) {
    // This is a long-running operation
    BubbleSort.sort(data);
}

In such cases, we recommend moving long-running operations to the IntentService as it uses worker threads to perform its work. The following code shows how to use IntentService to handle long-running operations:

@Override
public void onReceive(Context context, Intent intent) {
    // The task now runs on a worker thread.
    Intent intentService = new Intent(context, MyIntentService.class);
    context.startService(intentService);
}

public class MyIntentService extends IntentService {
    public MyIntentService() {
        // IntentService requires a name for its worker thread.
        super("MyIntentService");
    }

    @Override
    protected void onHandleIntent(@Nullable Intent intent) {
        BubbleSort.sort(data);
    }
}

Because the app now uses an IntentService, the long-running operation is executed on a worker thread instead of the main thread. Figure 7 shows the work deferred to the worker thread in the TraceView timeline.

Figure 7. TraceView timeline showing broadcast messages processed on worker threads

Your broadcast receiver can use goAsync() to signal to the system that it needs more time to process the message. However, you must then call finish() on the PendingResult object. The following example shows how to call finish() to let the system recycle the broadcast receiver and avoid an ANR:

final PendingResult pendingResult = goAsync();
new AsyncTask<Integer[], Integer, Long>() {
    @Override
    protected Long doInBackground(Integer[]... params) {
        // This is a long-running operation
        BubbleSort.sort(params[0]);
        pendingResult.finish();
        // The result is unused in this example.
        return null;
    }
}.execute(data);

However, if the broadcast is running in the background, moving the code from the slow broadcast receiver to another thread and using goAsync() will not solve the ANR problem. ANR timeouts still apply.
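
Because a missing finish() call is itself listed as a cause of ANRs, it can be worth making the call unconditional. The plain-Java sketch below shows one way to do that with try/finally when the work is handed to an executor. The Pending interface is a hypothetical stand-in for BroadcastReceiver.PendingResult, and FinishAlways and runThenFinish are illustrative names, not part of any Android API.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Sketch: guarantee that finish() runs on every path, including failures. */
final class FinishAlways {

    /** Hypothetical stand-in for BroadcastReceiver.PendingResult. */
    interface Pending {
        void finish();
    }

    /** Runs work on the executor and calls finish() whether or not work throws. */
    static void runThenFinish(Executor executor, Runnable work, Pending pending) {
        executor.execute(() -> {
            try {
                work.run();
            } finally {
                pending.finish(); // Reached even if work.run() throws.
            }
        });
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        runThenFinish(
                worker,
                () -> System.out.println("long-running work done"),
                () -> System.out.println("finish() called"));
        worker.shutdown();
        worker.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

This only ensures that finish() is eventually called; as noted above, the ANR timeout still applies to the time the work takes.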

GameActivity

The GameActivity library has reduced ANRs in case studies of games and apps written in C or C++. If you replace your existing native activity with GameActivity, you can reduce UI thread blocking and prevent some ANRs from occurring.

To learn more about ANR, see Keep your app responsive. To learn more about threads, see Thread performance.

Origin blog.csdn.net/sinat_31057219/article/details/132451008