[Android] How Android ANRs are generated and how to analyze them

Preface

ANRs have always been among the harder Android problems to solve: they are hard to reproduce, and even once reproduced they are not easy to analyze. This article walks through how an ANR is generated and how to locate its cause from the log files. Monitoring ANRs in production is also fairly complicated; after reading this article, client-side ANR monitoring solutions (such as WeChat's Matrix) should be easier to follow.

When an ANR occurs, the system shows a dialog like the following:
[Figure: ANR dialog]

What is an ANR

ANR (Application Not Responding) means the application has not responded for a long time, and a dialog appears on screen (as shown above). It is not a runtime exception and cannot be caught with try/catch. The dialog is shown by the system_server process, so the app process cannot detect it at the Java layer (it can be detected at the native layer, as analyzed later). When an ANR occurs, system_server prints a log to Logcat and writes a more detailed log to a file in the /data/anr directory (usually called the ANR trace file).

Causes of ANR

An ANR on Android is generally triggered by one of the following:

Service Timeout: a foreground service does not finish executing within 20s, or a background service within 200s;
BroadcastQueue Timeout: a foreground broadcast is not handled within 10s, or a background broadcast within 60s;
ContentProvider Timeout: publishing a content provider times out after 10s;
InputDispatching Timeout: an input event (key or touch) is not handled within 5s.
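
For quick reference, the thresholds in the list above can be captured in a small lookup. This is only an illustrative sketch: the enum AnrKind and its constant names are hypothetical and are not framework classes; the values are the ones listed above.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical lookup of the ANR thresholds listed above; not a framework class. */
enum AnrKind {
    SERVICE_FOREGROUND(20),
    SERVICE_BACKGROUND(200),
    BROADCAST_FOREGROUND(10),
    BROADCAST_BACKGROUND(60),
    CONTENT_PROVIDER_PUBLISH(10),
    INPUT_DISPATCHING(5);

    private final long timeoutSeconds;

    AnrKind(long timeoutSeconds) {
        this.timeoutSeconds = timeoutSeconds;
    }

    long timeoutMillis() {
        return TimeUnit.SECONDS.toMillis(timeoutSeconds);
    }

    /** True if work that has already run for elapsedMillis has reached this threshold. */
    boolean isExceededBy(long elapsedMillis) {
        return elapsedMillis >= timeoutMillis();
    }

    public static void main(String[] args) {
        for (AnrKind kind : AnrKind.values()) {
            System.out.println(kind + " -> " + kind.timeoutMillis() + " ms");
        }
    }
}
```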

The most common case is the fourth: an input event, usually a touch event, is not handled in time. Why does it time out? Usually because the main thread is blocked, for example by a time-consuming task, heavy computation, a deadlock, or a sleep.

How an ANR is generated

Service Timeout means that a lifecycle callback of a Service component, such as onCreate, did not finish in time. The following uses the Service onCreate callback as an example of how an ANR is generated.

onCreate is called as a result of startService, so we start from startService. The code below is based on Android SDK 29.
The call chain is as follows:

Context.startService
ContextImpl.startService
ActivityManagerService.startService
ActiveServices.startServiceLocked
ActiveServices.startServiceInnerLocked
ActiveServices.bringUpServiceLocked
ActiveServices.realStartServiceLocked

Next, focus on the code of ActiveServices.realStartServiceLocked:

    private final void realStartServiceLocked(ServiceRecord r,
            ProcessRecord app, boolean execInFg) throws RemoteException {
        ...
        // This call sends a message delayed by 20 seconds
        bumpServiceExecutingLocked(r, execInFg, "create");
        ...
        try {
            ...
            // Tell the app process to create the Service; this ends up calling the onCreate lifecycle callback
            app.thread.scheduleCreateService(r, r.serviceInfo,
            ...
        } catch (DeadObjectException e) {
        ...

bumpServiceExecutingLocked is the function that sends the delayed message:

    private final void bumpServiceExecutingLocked(ServiceRecord r, boolean fg, String why) {
        ...
        scheduleServiceTimeoutLocked(r.app);
        ...
    }

Next, look at scheduleServiceTimeoutLocked:

    void scheduleServiceTimeoutLocked(ProcessRecord proc) {
        // Obtain the delayed message
        Message msg = mAm.mHandler.obtainMessage(
                ActivityManagerService.SERVICE_TIMEOUT_MSG);
        msg.obj = proc;
        // Send the delayed message: 20 seconds for a foreground service, 200 seconds for a background one
        mAm.mHandler.sendMessageDelayed(msg,
                proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
    }

The constants SERVICE_TIMEOUT and SERVICE_BACKGROUND_TIMEOUT are defined as follows:

    // Path: com.android.server.am.ActiveServices.java

    // How long we wait for a service to finish executing.
    static final int SERVICE_TIMEOUT = 20*1000;

    // How long we wait for a service to finish executing.
    static final int SERVICE_BACKGROUND_TIMEOUT = SERVICE_TIMEOUT * 10;

As shown, the ANR timeout is 20 seconds for a foreground service and 10 times that, 200 seconds, for a background service.

Next, see how ActivityThread in the app process handles the create-service task:

    private void handleCreateService(CreateServiceData data) {
        Service service = null;
        try {
            java.lang.ClassLoader cl = packageInfo.getClassLoader();
            service = packageInfo.getAppFactory()
                    .instantiateService(cl, data.info.name, data.intent);
        } catch (Exception e) {
            ...
        }

        try {
            ...
            ContextImpl context = ContextImpl.createAppContext(this, packageInfo);
            context.setOuterContext(service);

            Application app = packageInfo.makeApplication(false, mInstrumentation);
            service.attach(context, this, data.info.name, data.token, app,
                    ActivityManager.getService());
            // Key point: call the onCreate lifecycle callback
            service.onCreate();
            mServices.put(data.token, service);
            try {
                // Key point: notify AMS that the Service has been created;
                // this removes the delayed message from the handler
                ActivityManager.getService().serviceDoneExecuting(
                        data.token, SERVICE_DONE_EXECUTING_ANON, 0, 0);
            } catch (RemoteException e) {
                throw e.rethrowFromSystemServer();
            }
            ...

After service.onCreate returns, AMS is notified that the service has been created.

ActivityManagerService.serviceDoneExecuting then calls ActiveServices.serviceDoneExecutingLocked:

    private void serviceDoneExecutingLocked(ServiceRecord r, boolean inDestroying, boolean finishing) {
        ...
        // Remove the delayed message sent earlier
        mAm.mHandler.removeMessages(ActivityManagerService.SERVICE_TIMEOUT_MSG, r.app);
        ...
    }

As shown, if the Service's onCreate finishes within 20 seconds, the delayed message is removed from the handler and never runs.

If onCreate does not finish within 20 seconds, the delayed message sent earlier is delivered.

This is the message that triggers ANR handling.
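
The "send a delayed message, then remove it when the work finishes" pattern described above can be sketched in plain Java. This is a simplified model under stated assumptions, not the framework code: the classes ServiceTimeoutDemo and Watcher are hypothetical, a ScheduledExecutorService stands in for the AMS handler, and the timeout is scaled down from 20 s to a few hundred milliseconds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

final class ServiceTimeoutDemo {
    /** Hypothetical stand-in for the AMS handler: arms a timeout, cancels it on completion. */
    static final class Watcher {
        private final ScheduledExecutorService handler =
                Executors.newSingleThreadScheduledExecutor();
        private final Map<String, ScheduledFuture<?>> pending = new ConcurrentHashMap<>();
        private final long timeoutMillis;
        private final Consumer<String> onTimeout;

        Watcher(long timeoutMillis, Consumer<String> onTimeout) {
            this.timeoutMillis = timeoutMillis;
            this.onTimeout = onTimeout;
        }

        /** Plays the role of scheduleServiceTimeoutLocked: post the delayed message. */
        void scheduleTimeout(String token) {
            pending.put(token, handler.schedule(() -> {
                pending.remove(token);
                onTimeout.accept(token); // reached only if nobody cancelled in time
            }, timeoutMillis, TimeUnit.MILLISECONDS));
        }

        /** Plays the role of serviceDoneExecutingLocked: remove the delayed message. */
        void doneExecuting(String token) {
            ScheduledFuture<?> message = pending.remove(token);
            if (message != null) {
                message.cancel(false);
            }
        }

        void shutdown() {
            handler.shutdownNow();
        }
    }

    /** Runs a fast and a slow "onCreate" and returns the tokens that timed out. */
    static List<String> simulate(long timeoutMillis, long fastMillis, long slowMillis) {
        List<String> timedOut = new CopyOnWriteArrayList<>();
        Watcher watcher = new Watcher(timeoutMillis, timedOut::add);
        try {
            runOnCreate(watcher, "fast", fastMillis);
            runOnCreate(watcher, "slow", slowMillis);
        } finally {
            watcher.shutdown();
        }
        return new ArrayList<>(timedOut);
    }

    private static void runOnCreate(Watcher watcher, String token, long workMillis) {
        watcher.scheduleTimeout(token);   // arm the timeout before the callback runs
        try {
            Thread.sleep(workMillis);     // stands in for Service.onCreate()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        watcher.doneExecuting(token);     // finished: remove the delayed message
    }

    public static void main(String[] args) {
        // Scaled down: a 200 ms timeout instead of 20 s.
        System.out.println("Timed out: " + simulate(200, 20, 500));
    }
}
```

Note that, as in the real flow, the timeout does not interrupt the slow callback; it only reports that the deadline passed while the callback was still running.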

MainHandler handles the delayed SERVICE_TIMEOUT_MSG message as follows:

//com.android.server.am.ActivityManagerService
final class MainHandler extends Handler {
    public MainHandler(Looper looper) {
        super(looper, null, true);
    }

    @Override
    public void handleMessage(Message msg) {
        switch (msg.what) {
            ...
            case SERVICE_TIMEOUT_MSG: {
                mServices.serviceTimeout((ProcessRecord)msg.obj);
            ...
            }
            ...
        }
    }

mServices.serviceTimeout is implemented as follows:

    void serviceTimeout(ProcessRecord proc) {
        ...
        proc.appNotResponding(null, null, null, null, false,
        ...
    }

So when a Service ANR occurs, execution reaches ProcessRecord.appNotResponding.

Tracing the other ANR types shows that they also end up in ProcessRecord.appNotResponding. For example, the input event timeout:

//com.android.server.am.ActivityManagerService
    /**
     * Handle input dispatching timeouts.
     * @return whether input dispatching should be aborted or not.
     */
    boolean inputDispatchingTimedOut(ProcessRecord proc, String activityShortComponentName,
            ApplicationInfo aInfo, String parentShortComponentName,
            WindowProcessController parentProcess, boolean aboveSystem, String reason) {
        if (checkCallingPermission(FILTER_EVENTS) != PackageManager.PERMISSION_GRANTED) {
            throw new SecurityException("Requires permission " + FILTER_EVENTS);
        }

        final String annotation;
        if (reason == null) {
            annotation = "Input dispatching timed out";
        } else {
            annotation = "Input dispatching timed out (" + reason + ")";
        }

        if (proc != null) {
            synchronized (this) {
                if (proc.isDebugging()) {
                    return false;
                }

                if (proc.getActiveInstrumentation() != null) {
                    Bundle info = new Bundle();
                    info.putString("shortMsg", "keyDispatchingTimedOut");
                    info.putString("longMsg", annotation);
                    finishInstrumentationLocked(proc, Activity.RESULT_CANCELED, info);
                    return true;
                }
            }
            // An input event timeout also ends up in ProcessRecord.appNotResponding
            proc.appNotResponding(activityShortComponentName, aInfo,
                    parentShortComponentName, parentProcess, aboveSystem, annotation);
        }

        return true;
    }

The timeout handling for broadcasts and content providers follows the same pattern and is not analyzed one by one here.

In short, all roads lead to ProcessRecord.appNotResponding: every type of ANR eventually arrives at this function.

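Handling ANRs

ANR handling consists of the following steps:

Collect the IDs of the processes whose stacks need to be dumped
Notify each of these processes to dump its thread stacks; the output goes to the /data/anr directory
Print the Logcat log
Show the ANR dialog for a foreground process; show no dialog for a background process

The details are as follows: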

//com.android.server.am.ProcessRecord
   void appNotResponding(String activityShortComponentName, ApplicationInfo aInfo,
          String parentShortComponentName, WindowProcessController parentProcess,
          boolean aboveSystem, String annotation) {
      // Collect the IDs of the processes whose stacks need to be dumped:
      // firstPids, lastPids and nativeProcs
      ArrayList<Integer> firstPids = new ArrayList<>(5);
      SparseArray<Boolean> lastPids = new SparseArray<>(20);

      synchronized (mService) {
          ...
          // In case we come through here for the same app before completing
          // this one, mark as anring now so we will bail out.
          setNotResponding(true);

          // Dump thread traces as quickly as we can, starting with "interesting" processes.
          firstPids.add(pid);

          // Don't dump other PIDs if it's a background ANR
          if (!isSilentAnr()) {
              int parentPid = pid;
              if (parentProcess != null && parentProcess.getPid() > 0) {
                  parentPid = parentProcess.getPid();
              }
              if (parentPid != pid) firstPids.add(parentPid);

              if (MY_PID != pid && MY_PID != parentPid) firstPids.add(MY_PID);

              for (int i = getLruProcessList().size() - 1; i >= 0; i--) {
                  ProcessRecord r = getLruProcessList().get(i);
                  if (r != null && r.thread != null) {
                      int myPid = r.pid;
                      if (myPid > 0 && myPid != pid && myPid != parentPid && myPid != MY_PID) {
                          if (r.isPersistent()) {
                              firstPids.add(myPid);
                              if (DEBUG_ANR) Slog.i(TAG, "Adding persistent proc: " + r);
                          } else if (r.treatLikeActivity) {
                              firstPids.add(myPid);
                              if (DEBUG_ANR) Slog.i(TAG, "Adding likely IME: " + r);
                          } else {
                              lastPids.put(myPid, Boolean.TRUE);
                              if (DEBUG_ANR) Slog.i(TAG, "Adding ANR proc: " + r);
                          }
                      }
                  }
              }
          }
      }
      // Start assembling the Logcat log
      // Log the ANR to the main log.
      StringBuilder info = new StringBuilder();
      info.setLength(0);
      info.append("ANR in ").append(processName);
      if (activityShortComponentName != null) {
          info.append(" (").append(activityShortComponentName).append(")");
      }
      info.append("\n");
      info.append("PID: ").append(pid).append("\n");
      if (annotation != null) {
          info.append("Reason: ").append(annotation).append("\n");
      }
      if (parentShortComponentName != null
              && parentShortComponentName.equals(activityShortComponentName)) {
          info.append("Parent: ").append(parentShortComponentName).append("\n");
      }

      ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);

      // Collect the IDs of the native processes that need to be dumped
      // don't dump native PIDs for background ANRs unless it is the process of interest
      String[] nativeProcs = null;
      if (isSilentAnr()) {
          for (int i = 0; i < NATIVE_STACKS_OF_INTEREST.length; i++) {
              if (NATIVE_STACKS_OF_INTEREST[i].equals(processName)) {
                  nativeProcs = new String[] { processName };
                  break;
              }
          }
      } else {
          nativeProcs = NATIVE_STACKS_OF_INTEREST;
      }

      int[] pids = nativeProcs == null ? null : Process.getPidsForCommands(nativeProcs);
      ArrayList<Integer> nativePids = null;

      if (pids != null) {
          nativePids = new ArrayList<>(pids.length);
          for (int i : pids) {
              nativePids.add(i);
          }
      }
      // Key point: start dumping the stacks
      // For background ANRs, don't pass the ProcessCpuTracker to
      // avoid spending 1/2 second collecting stats to rank lastPids.
      File tracesFile = ActivityManagerService.dumpStackTraces(firstPids,
              (isSilentAnr()) ? null : processCpuTracker, (isSilentAnr()) ? null : lastPids,
              nativePids);

      String cpuInfo = null;
      if (isMonitorCpuUsage()) {
          mService.updateCpuStatsNow();
          synchronized (mService.mProcessCpuTracker) {
              cpuInfo = mService.mProcessCpuTracker.printCurrentState(anrTime);
          }
          info.append(processCpuTracker.printCurrentLoad());
          info.append(cpuInfo);
      }

      info.append(processCpuTracker.printCurrentState(anrTime));

      // Write the log to Logcat
      Slog.e(TAG, info.toString());
      if (tracesFile == null) {
          // There is no trace file, so dump (only) the alleged culprit's threads to the log
          Process.sendSignal(pid, Process.SIGNAL_QUIT);
      }

      synchronized (mService) {
          ...
          // A background process is killed directly, without showing the ANR dialog
          if (isSilentAnr() && !isDebugging()) {
              kill("bg anr", true);
              return;
          }
          // Mark the app process as being in the ANR state
          // Set the app's notResponding state, and look up the errorReportReceiver
          makeAppNotRespondingLocked(activityShortComponentName,
                  annotation != null ? "ANR " + annotation : "ANR", info.toString());

          // mUiHandler can be null if the AMS is constructed with injector only. This will only
          // happen in tests.
          // Show the ANR dialog
          if (mService.mUiHandler != null) {
              // Bring up the infamous App Not Responding dialog
              Message msg = Message.obtain();
              msg.what = ActivityManagerService.SHOW_NOT_RESPONDING_UI_MSG;
              msg.obj = new AppNotRespondingDialog.Data(this, aInfo, aboveSystem);

              mService.mUiHandler.sendMessage(msg);
          }
      }
  }
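
The first step of appNotResponding, ordering the processes to dump, can be isolated as a small sketch. The code below is a simplified model, not the framework implementation: AnrPidCollector, Proc, and Result are hypothetical names, and a parentPid equal to the ANR pid is used to mean "no separate parent process".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AnrPidCollector {
    /** Hypothetical, simplified view of a process record. */
    static final class Proc {
        final int pid;
        final boolean persistent;
        final boolean treatLikeActivity;

        Proc(int pid, boolean persistent, boolean treatLikeActivity) {
            this.pid = pid;
            this.persistent = persistent;
            this.treatLikeActivity = treatLikeActivity;
        }
    }

    static final class Result {
        final List<Integer> firstPids = new ArrayList<>();
        final List<Integer> lastPids = new ArrayList<>();
    }

    /** lru is ordered from least to most recently used and is walked from the end. */
    static Result collect(int anrPid, int parentPid, int systemPid,
                          boolean silentAnr, List<Proc> lru) {
        Result result = new Result();
        result.firstPids.add(anrPid);            // the ANR process is always dumped first
        if (silentAnr) {
            return result;                       // background ANR: dump nothing else
        }
        if (parentPid != anrPid) {
            result.firstPids.add(parentPid);
        }
        if (systemPid != anrPid && systemPid != parentPid) {
            result.firstPids.add(systemPid);
        }
        for (int i = lru.size() - 1; i >= 0; i--) {
            Proc p = lru.get(i);
            if (p.pid <= 0 || p.pid == anrPid || p.pid == parentPid || p.pid == systemPid) {
                continue;                        // already handled above
            }
            if (p.persistent || p.treatLikeActivity) {
                result.firstPids.add(p.pid);     // "interesting" processes go first
            } else {
                result.lastPids.add(p.pid);      // everything else is dumped last, if at all
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Proc> lru = Arrays.asList(
                new Proc(2014, true, false),
                new Proc(1002, false, false),
                new Proc(5232, false, false));
        Result r = collect(5232, 5232, 594, false, lru);
        System.out.println("firstPids=" + r.firstPids + " lastPids=" + r.lastPids);
    }
}
```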

Next, see how ActivityManagerService dumps the stacks:

  File tracesFile = ActivityManagerService.dumpStackTraces(firstPids,
                (isSilentAnr()) ? null : processCpuTracker, (isSilentAnr()) ? null : lastPids,
                nativePids);

The ActivityManagerService.dumpStackTraces function:

//com.android.server.am.ActivityManagerService
 public static File dumpStackTraces(ArrayList<Integer> firstPids,
           ProcessCpuTracker processCpuTracker, SparseArray<Boolean> lastPids,
           ArrayList<Integer> nativePids) {
       ArrayList<Integer> extraPids = null;

       Slog.i(TAG, "dumpStackTraces pids=" + lastPids + " nativepids=" + nativePids);

       // Measure CPU usage as soon as we're called in order to get a realistic sampling
       // of the top users at the time of the request.
       if (processCpuTracker != null) {
           processCpuTracker.init();
           try {
               Thread.sleep(200);
           } catch (InterruptedException ignored) {
           }

           processCpuTracker.update();
           ...
       // Create the ANR output file: ANR_TRACE_DIR = "/data/anr";
       final File tracesDir = new File(ANR_TRACE_DIR);
       // Each set of ANR traces is written to a separate file and dumpstate will process
       // all such files and add them to a captured bug report if they're recent enough.
       maybePruneOldTraces(tracesDir);

       // NOTE: We should consider creating the file in native code atomically once we've
       // gotten rid of the old scheme of dumping and lot of the code that deals with paths
       // can be removed.
       File tracesFile = createAnrDumpFile(tracesDir);
       if (tracesFile == null) {
           return null;
       }
       // The file has been created; start dumping
       dumpStackTraces(tracesFile.getAbsolutePath(), firstPids, nativePids, extraPids);
       return tracesFile;
   }

ActivityManagerService.dumpStackTraces:

 //com.android.server.am.ActivityManagerService
 public static void dumpStackTraces(String tracesFile, ArrayList<Integer> firstPids,
            ArrayList<Integer> nativePids, ArrayList<Integer> extraPids) {

        Slog.i(TAG, "Dumping to " + tracesFile);

        // We don't need any sort of inotify based monitoring when we're dumping traces via
        // tombstoned. Data is piped to an "intercept" FD installed in tombstoned so we're in full
        // control of all writes to the file in question.

        // We must complete all stack dumps within 20 seconds.
        long remainingTime = 20 * 1000;

        // First collect all of the stacks of the most important pids.
        if (firstPids != null) {
            int num = firstPids.size();
            for (int i = 0; i < num; i++) {
                Slog.i(TAG, "Collecting stacks for pid " + firstPids.get(i));
                final long timeTaken = dumpJavaTracesTombstoned(firstPids.get(i), tracesFile,
                                                                remainingTime);

                remainingTime -= timeTaken;
                if (remainingTime <= 0) {
                    Slog.e(TAG, "Aborting stack trace dump (current firstPid=" + firstPids.get(i) +
                           "); deadline exceeded.");
                    return;
                }

                if (DEBUG_ANR) {
                    Slog.d(TAG, "Done with pid " + firstPids.get(i) + " in " + timeTaken + "ms");
                }
            }
        }

        // Next collect the stacks of the native pids
        if (nativePids != null) {
            for (int pid : nativePids) {
                Slog.i(TAG, "Collecting stacks for native pid " + pid);
                final long nativeDumpTimeoutMs = Math.min(NATIVE_DUMP_TIMEOUT_MS, remainingTime);

                final long start = SystemClock.elapsedRealtime();
                Debug.dumpNativeBacktraceToFileTimeout(
                        pid, tracesFile, (int) (nativeDumpTimeoutMs / 1000));
                final long timeTaken = SystemClock.elapsedRealtime() - start;

                remainingTime -= timeTaken;
                if (remainingTime <= 0) {
                    Slog.e(TAG, "Aborting stack trace dump (current native pid=" + pid +
                        "); deadline exceeded.");
                    return;
                }

                if (DEBUG_ANR) {
                    Slog.d(TAG, "Done with native pid " + pid + " in " + timeTaken + "ms");
                }
            }
        }

        // Lastly, dump stacks for all extra PIDs from the CPU tracker.
        if (extraPids != null) {
            for (int pid : extraPids) {
                Slog.i(TAG, "Collecting stacks for extra pid " + pid);

                final long timeTaken = dumpJavaTracesTombstoned(pid, tracesFile, remainingTime);

                remainingTime -= timeTaken;
                if (remainingTime <= 0) {
                    Slog.e(TAG, "Aborting stack trace dump (current extra pid=" + pid +
                            "); deadline exceeded.");
                    return;
                }

                if (DEBUG_ANR) {
                    Slog.d(TAG, "Done with extra pid " + pid + " in " + timeTaken + "ms");
                }
            }
        }
        Slog.i(TAG, "Done dumping");
    }
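
The shared 20-second deadline in dumpStackTraces is a common budgeting pattern: every step receives what is left of the budget, and the whole loop aborts once the budget is used up. A minimal sketch follows; DumpBudget and DumpStep are hypothetical names, and the step durations are simulated numbers rather than real dumps.

```java
import java.util.Arrays;
import java.util.List;

final class DumpBudget {
    /** A dump step: receives the time still available and returns the time it used (ms). */
    interface DumpStep {
        long run(long remainingMillis);
    }

    /** Runs steps in order; returns how many finished before the shared budget ran out. */
    static int runAll(long budgetMillis, List<DumpStep> steps) {
        long remaining = budgetMillis;
        int completed = 0;
        for (DumpStep step : steps) {
            long taken = step.run(remaining);
            remaining -= taken;
            if (remaining <= 0) {
                // Deadline exceeded: abort and skip all later steps.
                return completed;
            }
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) {
        // Simulated durations: the third step pushes the total past 20 s,
        // so the fourth step is never started.
        List<DumpStep> steps = Arrays.asList(
                remaining -> 8_000L,
                remaining -> 9_000L,
                remaining -> 5_000L,
                remaining -> 1_000L);
        System.out.println("Steps completed within budget: " + runAll(20_000L, steps));
    }
}
```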

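As shown, dumping traces uses two functions, dumpJavaTracesTombstoned and Debug.dumpNativeBacktraceToFileTimeout, for the Java layer and the native layer respectively. The native layer is handled by calling the android.os.Debug class directly; the Java layer goes through dumpJavaTracesTombstoned. Let's look at the Java layer first.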

ActivityManagerService.dumpJavaTracesTombstoned:

    /**
     * Dump java traces for process {@code pid} to the specified file. If java trace dumping
     * fails, a native backtrace is attempted. Note that the timeout {@code timeoutMs} only applies
     * to the java section of the trace, a further {@code NATIVE_DUMP_TIMEOUT_MS} might be spent
     * attempting to obtain native traces in the case of a failure. Returns the total time spent
     * capturing traces.
     */
    private static long dumpJavaTracesTombstoned(int pid, String fileName, long timeoutMs) {
        final long timeStart = SystemClock.elapsedRealtime();
        boolean javaSuccess = Debug.dumpJavaBacktraceToFileTimeout(pid, fileName,
                (int) (timeoutMs / 1000));
        if (javaSuccess) {
            // Check that something is in the file, actually. Try-catch should not be necessary,
            // but better safe than sorry.
            try {
                long size = new File(fileName).length();
                if (size < JAVA_DUMP_MINIMUM_SIZE) {
                    Slog.w(TAG, "Successfully created Java ANR file is empty!");
                    javaSuccess = false;
                }
            } catch (Exception e) {
                Slog.w(TAG, "Unable to get ANR file size", e);
                javaSuccess = false;
            }
        }
        if (!javaSuccess) {
            Slog.w(TAG, "Dumping Java threads failed, initiating native stack dump.");
            if (!Debug.dumpNativeBacktraceToFileTimeout(pid, fileName,
                    (NATIVE_DUMP_TIMEOUT_MS / 1000))) {
                Slog.w(TAG, "Native stack dump failed!");
            }
        }

        return SystemClock.elapsedRealtime() - timeStart;
    }

This in turn calls Debug.dumpJavaBacktraceToFileTimeout to perform the dump.

Now look at the Debug class:

//android.os.Debug
  /**
     * Append the Java stack traces of a given native process to a specified file.
     *
     * @param pid pid to dump.
     * @param file path of file to append dump to.
     * @param timeoutSecs time to wait in seconds, or 0 to wait forever.
     * @hide
     */
    public static native boolean dumpJavaBacktraceToFileTimeout(int pid, String file,
                                                                int timeoutSecs);

    /**
     * Append the native stack traces of a given process to a specified file.
     *
     * @param pid pid to dump.
     * @param file path of file to append dump to.
     * @param timeoutSecs time to wait in seconds, or 0 to wait forever.
     * @hide
     */
    public static native boolean dumpNativeBacktraceToFileTimeout(int pid, String file,
                                                                  int timeoutSecs);

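So dumping traces ultimately comes down to these two functions of the android.os.Debug class: dumpJavaBacktraceToFileTimeout and dumpNativeBacktraceToFileTimeout.

Both methods are declared native, so we need to look at the Android source code.

Note that both methods are marked @hide, so they cannot be called from app code.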

How the native layer dumps traces

Searching the Android source for the C++ code behind dumpJavaBacktraceToFileTimeout leads to frameworks/base/core/jni/android_os_Debug.cpp, which contains the implementation:

frameworks/base/core/jni/android_os_Debug.cpp

static jboolean android_os_Debug_dumpJavaBacktraceToFileTimeout(JNIEnv* env, jobject clazz,
        jint pid, jstring fileName, jint timeoutSecs) {
    const bool ret = dumpTraces(env, pid, fileName, timeoutSecs, kDebuggerdJavaBacktrace);
    return ret ? JNI_TRUE : JNI_FALSE;
}

Following the calls leads to the debuggerd_trigger_dump function in system/core/debuggerd/client/debuggerd_client.cpp:

bool debuggerd_trigger_dump(pid_t tid, DebuggerdDumpType dump_type, unsigned int timeout_ms,
                            unique_fd output_fd) {
  ...
  // Send the signal.
  const int signal = (dump_type == kDebuggerdJavaBacktrace) ? SIGQUIT : BIONIC_SIGNAL_DEBUGGER;
  sigval val = {.sival_int = (dump_type == kDebuggerdNativeBacktrace) ? 1 : 0};
  if (sigqueue(pid, signal, val) != 0) {
    log_error(output_fd, errno, "failed to send signal to pid %d", pid);
    return false;
  }
  ...
}

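When a Java backtrace is requested, this function uses sigqueue (bionic/libc/bionic/signal.cpp) to send a SIGQUIT signal to the target process.

Next, look at where SIGQUIT is received.

Every app process has a SignalCatcher thread dedicated to handling SIGQUIT; see art/runtime/signal_catcher.cc: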

void* SignalCatcher::Run(void* arg) {
  SignalCatcher* signal_catcher = reinterpret_cast<SignalCatcher*>(arg);
  ...
  // Set up mask with signals we want to handle.
  SignalSet signals;
  signals.Add(SIGQUIT);
  signals.Add(SIGUSR1);

  while (true) {
    int signal_number = signal_catcher->WaitForSignal(self, signals);
    if (signal_catcher->ShouldHalt()) {
      runtime->DetachCurrentThread();
      return nullptr;
    }

    switch (signal_number) {
    case SIGQUIT:
      signal_catcher->HandleSigQuit();
      break;
    case SIGUSR1:
      signal_catcher->HandleSigUsr1();
      break;
    default:
      LOG(ERROR) << "Unexpected signal %d" << signal_number;
      break;
    }
  }
}

Once SIGQUIT is caught, it is handed to the HandleSigQuit function:

void SignalCatcher::HandleSigQuit() {
  Runtime* runtime = Runtime::Current();
  std::ostringstream os;
  os << "\n"
      << "----- pid " << getpid() << " at " << GetIsoDate() << " -----\n";

  DumpCmdLine(os);

  // Note: The strings "Build fingerprint:" and "ABI:" are chosen to match the format used by
  // debuggerd. This allows, for example, the stack tool to work.
  std::string fingerprint = runtime->GetFingerprint();
  os << "Build fingerprint: '" << (fingerprint.empty() ? "unknown" : fingerprint) << "'\n";
  os << "ABI: '" << GetInstructionSetString(runtime->GetInstructionSet()) << "'\n";

  os << "Build type: " << (kIsDebugBuild ? "debug" : "optimized") << "\n";

  runtime->DumpForSigQuit(os);

  if ((false)) {
    std::string maps;
    if (android::base::ReadFileToString("/proc/self/maps", &maps)) {
      os << "/proc/self/maps:\n" << maps;
    }
  }
  os << "----- end " << getpid() << " -----\n";
  Output(os.str());
}

In the middle it calls DumpForSigQuit in art/runtime/runtime.cc, which collects more detailed information, including the thread stacks.

void Runtime::DumpForSigQuit(std::ostream& os) {
  // Print backtraces first since they are important do diagnose ANRs,
  // and ANRs can often be trimmed to limit upload size.
  thread_list_->DumpForSigQuit(os);
  GetClassLinker()->DumpForSigQuit(os);
  GetInternTable()->DumpForSigQuit(os);
  GetJavaVM()->DumpForSigQuit(os);
  GetHeap()->DumpForSigQuit(os);
  oat_file_manager_->DumpForSigQuit(os);
  if (GetJit() != nullptr) {
    GetJit()->DumpForSigQuit(os);
  } else {
    os << "Running non JIT\n";
  }
  DumpDeoptimizations(os);
  TrackedAllocators::Dump(os);
  GetMetrics()->DumpForSigQuit(os);
  os << "\n";

  BaseMutex::DumpAll(os);

  // Inform anyone else who is interested in SigQuit.
  {
    ScopedObjectAccess soa(Thread::Current());
    callbacks_->SigQuit();
  }
}

An ANR prints a lot of information; see the relevant source code for details.
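
To get a feel for what such a dump contains, the header and footer written by HandleSigQuit can be imitated on a desktop JVM. This is an analogy only, not how ART produces its output: JvmThreadDump is a hypothetical class, the per-thread line format is invented for illustration, and it relies on the standard Thread.getAllStackTraces and ProcessHandle APIs (Java 9 or later).

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Map;

final class JvmThreadDump {
    private static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Builds a dump of all live threads, framed like the sections of an ANR trace file. */
    static String dump() {
        long pid = ProcessHandle.current().pid();
        StringBuilder os = new StringBuilder();
        os.append("\n----- pid ").append(pid)
          .append(" at ").append(LocalDateTime.now().format(TIMESTAMP))
          .append(" -----\n");

        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            // Illustrative per-thread header: name, priority, and state.
            os.append('"').append(thread.getName()).append('"')
              .append(" prio=").append(thread.getPriority())
              .append(' ').append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                os.append("  at ").append(frame).append('\n');
            }
            os.append('\n');
        }

        os.append("----- end ").append(pid).append(" -----\n");
        return os.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```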

This completes the analysis of the whole flow, from an ANR occurring to its logs being printed.

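ANR analysis method

Now that we know how an ANR comes about, let's see how to locate its cause once one has occurred.
As described above, an ANR is logged in two places: in Logcat, and in a trace file under the /data/anr/ directory.

Below, two scenarios are used to reproduce an ANR: one caused by a time-consuming operation, the other caused by a deadlock.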

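Scenario 1: ANR caused by a time-consuming operation

For simplicity, the main thread is just put to sleep for 10s.

The Activity has a button; tapping it puts the main thread to sleep for 10s, as in the code below, which clearly leads to an ANR.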

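import android.os.Bundle
import android.os.SystemClock
import android.widget.Button
import androidx.appcompat.app.AppCompatActivity

class AnrTestActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_anr_test)
        this.findViewById<Button>(R.id.button).setOnClickListener {
            // Block the main thread for 10 seconds to provoke an ANR
            SystemClock.sleep(10000)
        }
    }
}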

Tap the button twice in a row; about 5s later the ANR dialog appears.
[Figure: ANR dialog]
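
One way to picture why the second tap times out is a single-threaded queue: the first tap's handler occupies the only thread, so the second tap cannot be handled until the handler returns. The sketch below is a rough model under that assumption and is not how the real input dispatcher is implemented; InputTimeoutDemo is a hypothetical class, and the durations are scaled down from seconds to milliseconds.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class InputTimeoutDemo {
    /** Returns true if the second event is not handled within timeoutMillis. */
    static boolean secondEventTimesOut(long handlerBlockMillis, long timeoutMillis) {
        // A single-threaded executor stands in for the app's main thread.
        ExecutorService mainThread = Executors.newSingleThreadExecutor();
        try {
            // First tap: its click handler blocks the "main thread".
            mainThread.submit(() -> sleepQuietly(handlerBlockMillis));
            // Second tap: queued behind the first one.
            Future<?> secondTap = mainThread.submit(() -> { });
            try {
                secondTap.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return false;                 // handled in time
            } catch (TimeoutException e) {
                return true;                  // this is where an ANR would be reported
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            } catch (ExecutionException e) {
                return false;
            }
        } finally {
            mainThread.shutdownNow();
        }
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Scaled down: the handler blocks for 1000 ms, the dispatcher waits 500 ms.
        System.out.println("Second tap timed out: " + secondEventTimesOut(1_000, 500));
    }
}
```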
The Logcat output is as follows:

2022-10-02 15:38:00.505 594-5381/system_process E/ActivityManager: ANR in com.devnn.demo (com.devnn.demo/.AnrTestActivity)
    PID: 5232
    Reason: Input dispatching timed out (f99e8bb com.devnn.demo/com.devnn.demo.AnrTestActivity (server) is not responding. Waited 5008ms for MotionEvent(deviceId=8, source=0x00005002, displayId=0, action=DOWN, actionButton=0x00000000, flags=0x00000000, metaState=0x00000000, buttonState=0x00000000, classification=NONE, edgeFlags=0x00000000, xPrecision=22.8, yPrecision=12.8, xCursorPosition=nan, yCursorPosition=nan, pointers=[0: (804.9, 1173.9)]), policyFlags=0x62000000)
    Parent: com.devnn.demo/.AnrTestActivity
    Load: 0.05 / 0.01 / 0.0
    ----- Output from /proc/pressure/memory -----
    some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    full avg10=0.00 avg60=0.00 avg300=0.00 total=0
    ----- End output from /proc/pressure/memory -----
    
    CPU usage from 158257ms to 0ms ago (2022-10-02 15:35:18.256 to 2022-10-02 15:37:56.513):
      6.2% 279/[email protected]: 0.3% user + 5.9% kernel
      2.2% 292/[email protected]: 0% user + 2.1% kernel
      1.6% 594/system_server: 0.3% user + 1.3% kernel / faults: 1085 minor
      1.4% 300/[email protected]: 0% user + 1.4% kernel
      0.4% 277/android.hardware.audio.service.ranchu: 0% user + 0.4% kernel / faults: 10 minor
      0.2% 371/audioserver: 0% user + 0.2% kernel / faults: 4 minor
      0.2% 5232/com.devnn.demo: 0% user + 0.2% kernel / faults: 272 minor
      0.2% 318/surfaceflinger: 0% user + 0.2% kernel
      0% 16/ksoftirqd/1: 0% user + 0% kernel
      0% 365/adbd: 0% user + 0% kernel
      0% 477/llkd: 0% user + 0% kernel
      0% 872/[email protected]: 0% user + 0% kernel
      0% 10/rcu_preempt: 0% user + 0% kernel
      0% 2014/com.android.systemui: 0% user + 0% kernel / faults: 39 minor
      0% 9/ksoftirqd/0: 0% user + 0% kernel
      0% 1002/com.android.phone: 0% user + 0% kernel / faults: 100 minor
      0% 3645/kworker/0:2-events_power_efficient: 0% user + 0% kernel
      0% 157/logd: 0% user + 0% kernel
      0% 427/libgoldfish-rild: 0% user + 0% kernel / faults: 16 minor
      0% 3270/kworker/1:1-mm_percpu_wq: 0% user + 0% kernel
      0% 159/servicemanager: 0% user + 0% kernel
      0% 160/hwservicemanager: 0% user + 0% kernel
      0% 478/hostapd_nohidl: 0% user + 0% kernel
      0% 5346/kworker/u4:0-events_unbound: 0% user + 0% kernel
      0% 11/migration/0: 0% user + 0% kernel
      0% 15/migration/1: 0% user + 0% kernel
      0% 164/qemu-props: 0% user + 0% kernel
      0% 188/jbd2/dm-5-8: 0% user + 0% kernel
      0% 269/statsd: 0% user + 0% kernel
      0% 342/logcat: 0% user + 0% kernel
      0% 418/media.metrics: 0% user + 0% kernel / faults: 1 minor
      0% 442/[email protected]: 0% user + 0% kernel
      0% 761/wpa_supplicant: 0% user + 0% kernel
      0% 3615/logcat: 0% user + 0% kernel
      0% 5068/kworker/u4:1-phy0: 0% user + 0% kernel
    1.9% TOTAL: 0.1% user + 1.7% kernel + 0% softirq
    CPU usage from 20ms to 335ms later (2022-10-02 15:37:56.533 to 2022-10-02 15:37:56.848):
      22% 594/system_server: 15% user + 7.5% kernel / faults: 161 minor
        22% 5381/AnrConsumer: 7.5% user + 15% kernel
      6.9% 279/[email protected]: 0% user + 6.9% kernel
        6.9% 1215/[email protected]: 0% user + 6.9% kernel
      3.5% 292/[email protected]: 0% user + 3.5% kernel
    18% TOTAL: 8.6% user + 10% kernel

Note that you need to select the system_process process in Logcat.

The Logcat output shows that the process with pid 5232 timed out while handling an input event. This log is printed by the ProcessRecord.appNotResponding method analyzed above.

Next, let's look at what the logs in the /data/anr/ directory contain.

Each trace file is the log of a single ANR. Every ANR produces a new trace file, and the file is named after the time at which it occurred.
The trace file is structured: overall, it is printed one process at a time.

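An ANR is not necessarily caused by the app process itself; it may be caused by a related process. For that reason the information for all related processes is written to the same file, laid out roughly as follows.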

----- pid 5232 at 2022-10-02 15:37:56 -----
(detailed log for process 5232)
----- end 5232 -----

----- pid 594 at 2022-10-02 15:37:57 -----
(detailed log for process 594)
----- end 594 -----

----- pid xxx at xxxx-xx-xx xx:xx:xx -----
(detailed log for process xxx)
----- end xxx -----

The first process in the file is the one in which the ANR occurred, usually the app process.
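The per-process layout above lends itself to simple tooling. The following is a minimal sketch, assuming the `----- pid N at ... -----` and `----- end N -----` delimiters appear exactly as shown in this article; `AnrTraceSplitter` and `ProcessSection` are hypothetical names for illustration, not part of any Android API, and the sample input in `main` is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Splits an ANR trace dump into per-process sections (illustrative helper only). */
final class AnrTraceSplitter {

    /** One "----- pid N at ... -----" to "----- end N -----" block. */
    static final class ProcessSection {
        final int pid;
        final String timestamp;
        final String body;

        ProcessSection(int pid, String timestamp, String body) {
            this.pid = pid;
            this.timestamp = timestamp;
            this.body = body;
        }
    }

    private static final Pattern START =
            Pattern.compile("^----- pid (\\d+) at (.+?) -----$");
    private static final Pattern END =
            Pattern.compile("^----- end (\\d+) -----$");

    /** Returns the sections in file order, so index 0 is the process that hit the ANR. */
    static List<ProcessSection> split(String trace) {
        List<ProcessSection> sections = new ArrayList<>();
        int pid = -1;
        String timestamp = null;
        StringBuilder body = null; // non-null only while inside a section
        for (String line : trace.split("\\R")) {
            Matcher start = START.matcher(line.trim());
            Matcher end = END.matcher(line.trim());
            if (start.matches()) {
                pid = Integer.parseInt(start.group(1));
                timestamp = start.group(2);
                body = new StringBuilder();
            } else if (end.matches() && body != null) {
                sections.add(new ProcessSection(pid, timestamp, body.toString()));
                body = null;
            } else if (body != null) {
                body.append(line).append('\n');
            }
        }
        return sections;
    }

    public static void main(String[] args) {
        // Made-up sample that mimics the layout described in the article.
        String sample = String.join("\n",
                "----- pid 5232 at 2022-10-02 15:37:56 -----",
                "Cmd line: com.devnn.demo",
                "----- end 5232 -----",
                "",
                "----- pid 594 at 2022-10-02 15:37:57 -----",
                "Cmd line: system_server",
                "----- end 594 -----");
        for (ProcessSection s : split(sample)) {
            System.out.println(s.pid + " @ " + s.timestamp);
        }
    }
}
```

Because the sections are returned in file order, the first element is the process to inspect first when triaging a trace.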

Because the content is so long (the whole trace file is over 700 KB), only the key information for the app process is excerpted below.

Each process section begins with summary information, including the process ID, the time of the ANR, and the process name.

----- pid 5232 at 2022-10-02 15:37:56 -----
Cmd line: com.devnn.demo
Build fingerprint: 'Android/sdk_phone_x86_64/generic_x86_64:11/RSR1.210722.012/7758210:userdebug/test-keys'
ABI: 'x86_64'
Build type: optimized
Zygote loaded classes=15740 post zygote classes=1289
Dumping registered class loaders
#0 dalvik.system.PathClassLoader: [], parent #1
#1 java.lang.BootClassLoader: [], no parent
#2 dalvik.system.PathClassLoader: [/data/app/~~Qnj80NrB3yjtX87JepktGQ==/com.devnn.demo-FWP2tIJA7Ec1qoJefwnc0A==/base.apk:/data/app/~~Qnj80NrB3yjtX87JepktGQ==/com.devnn.demo-FWP2tIJA7Ec1qoJefwnc0A==/base.apk!classes10.dex:/data/app/~~Qnj80NrB3yjtX87JepktGQ==/com.devnn.demo-FWP2tIJA7Ec1qoJefwnc0A==/base.apk!classes11.dex:/data/app/~~Qnj80NrB3yjtX87JepktGQ==/com.devnn.demo-FWP2tIJA7Ec1qoJefwnc0A==/base.apk!classes6.dex:/data/app/~~Qnj80NrB3yjtX87JepktGQ==/com.devnn.demo-FWP2tIJA7Ec1qoJefwnc0A==/base.apk!classes2.dex:/data/app/~~Qnj80NrB3yjtX87JepktGQ==/com.devnn.demo-FWP2tIJA7Ec1qoJefwnc0A==/base.apk!classes3.dex:/data/app/~~Qnj80NrB3yjtX87JepktGQ==/com.devnn.demo-FWP2tIJA7Ec1qoJefwnc0A==/base.apk!classes8.dex], parent #1
Done dumping class loaders
Classes initialized: 526 in 694.025ms
Intern table: 31792 strong; 523 weak
JNI: CheckJNI is on; globals=639 (plus 37 weak)
Libraries: libandroid.so libaudioeffect_jni.so libcompiler_rt.so libicu_jni.so libjavacore.so libjavacrypto.so libjnigraphics.so libmedia_jni.so libopenjdk.so librs_jni.so libsfplugin_ccodec.so libsoundpool.so libstats_jni.so libwebviewchromium_loader.so (14)
Heap: 46% free, 11MB/21MB; 75810 objects
// some content omitted here

The second part lists the state and stack of every thread in the process, and it is the part we need to focus on:


suspend all histogram:	Sum: 74.854ms 99% C.I. 0.005ms-43.315ms Avg: 3.742ms Max: 44.394ms
DALVIK THREADS (21):
"Signal Catcher" daemon prio=10 tid=4 Runnable
  | group="system" sCount=0 dsCount=0 flags=0 obj=0x12c40b10 self=0x7fada5a4af50
  | sysTid=5242 nice=-20 cgrp=top-app sched=0/0 handle=0x7fac275adcf0
  | state=R schedstat=( 21716542 2041235 2 ) utm=0 stm=2 core=0 HZ=100
  | stack=0x7fac274b6000-0x7fac274b8000 stackSize=995KB
  | held mutexes= "mutator lock"(shared held)
  native: #00 pc 000000000054da9e  /apex/com.android.art/lib64/libart.so (art::DumpNativeStack(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, int, BacktraceMap*, char const*, art::ArtMethod*, void*, bool)+126)
  native: #01 pc 000000000069615c  /apex/com.android.art/lib64/libart.so (art::Thread::DumpStack(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, bool, BacktraceMap*, bool) const+380)
  native: #02 pc 00000000006b7320  /apex/com.android.art/lib64/libart.so (art::DumpCheckpoint::Run(art::Thread*)+1088)
  native: #03 pc 00000000006b064d  /apex/com.android.art/lib64/libart.so (art::ThreadList::RunCheckpoint(art::Closure*, art::Closure*)+557)
  native: #04 pc 00000000006af729  /apex/com.android.art/lib64/libart.so (art::ThreadList::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, bool)+1817)
  native: #05 pc 00000000006aec28  /apex/com.android.art/lib64/libart.so (art::ThreadList::DumpForSigQuit(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)+824)
  native: #06 pc 00000000006470d9  /apex/com.android.art/lib64/libart.so (art::Runtime::DumpForSigQuit(std::__1::basic_ostream<char, std::__1::char_traits<char> >&)+201)
  native: #07 pc 000000000065ceb6  /apex/com.android.art/lib64/libart.so (art::SignalCatcher::HandleSigQuit()+1766)
  native: #08 pc 000000000065bc85  /apex/com.android.art/lib64/libart.so (art::SignalCatcher::Run(void*)+357)
  native: #09 pc 00000000000c7d2a  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+58)
  native: #10 pc 000000000005f0c7  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+55)
  (no managed stack frames)

"main" prio=5 tid=1 Sleeping
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x71fb36a8 self=0x7fada5a477b0
  | sysTid=5232 nice=-10 cgrp=top-app sched=0/0 handle=0x7faecb97d4f8
  | state=S schedstat=( 5775317077 4230577099 871 ) utm=286 stm=291 core=0 HZ=100
  | stack=0x7ffc29566000-0x7ffc29568000 stackSize=8192KB
  | held mutexes=
  at java.lang.Thread.sleep(Native method)
  - sleeping on <0x06059c02> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:442)
  - locked <0x06059c02> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:358)
  at android.os.SystemClock.sleep(SystemClock.java:131)
  at com.devnn.demo.AnrTestActivity.onCreate$lambda-0(AnrTestActivity.kt:17)
  at com.devnn.demo.AnrTestActivity.lambda$UpadNwrDNzrVyNaTI0ysWoH569M(AnrTestActivity.kt:-1)
  at com.devnn.demo.-$$Lambda$AnrTestActivity$UpadNwrDNzrVyNaTI0ysWoH569M.onClick(lambda:-1)
  at android.view.View.performClick(View.java:7448)
  at android.view.View.performClickInternal(View.java:7425)
  at android.view.View.access$3600(View.java:810)
  at android.view.View$PerformClick.run(View.java:28305)
  at android.os.Handler.handleCallback(Handler.java:938)
  at android.os.Handler.dispatchMessage(Handler.java:99)
  at android.os.Looper.loop(Looper.java:223)
  at android.app.ActivityThread.main(ActivityThread.java:7656)
  at java.lang.reflect.Method.invoke(Native method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:592)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:947)

"perfetto_hprof_listener" prio=10 tid=5 Native (still starting up)
  | group="" sCount=1 dsCount=0 flags=1 obj=0x0 self=0x7fada5a4cb20
  | sysTid=5243 nice=-20 cgrp=top-app sched=0/0 handle=0x7fac274afcf0
  | state=S schedstat=( 3314219 3983561 6 ) utm=0 stm=0 core=0 HZ=100
  | stack=0x7fac273b8000-0x7fac273ba000 stackSize=995KB
  | held mutexes=
  native: #00 pc 00000000000b1ec5  /apex/com.android.runtime/lib64/bionic/libc.so (read+5)
  native: #01 pc 000000000001cb70  /apex/com.android.art/lib64/libperfetto_hprof.so (void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, ArtPlugin_Initialize::$_29> >(void*)+288)
  native: #02 pc 00000000000c7d2a  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+58)
  native: #03 pc 000000000005f0c7  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+55)
  (no managed stack frames)
  //... other threads omitted

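The first thread is the Signal Catcher daemon thread, which catches the SIGQUIT signal. This also shows that the thread belongs to the app process. The second thread is our app's main thread: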

"main" prio=5 tid=1 Sleeping
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x71fb36a8 self=0x7fada5a477b0
  | sysTid=5232 nice=-10 cgrp=top-app sched=0/0 handle=0x7faecb97d4f8
  | state=S schedstat=( 5775317077 4230577099 871 ) utm=286 stm=291 core=0 HZ=100
  | stack=0x7ffc29566000-0x7ffc29568000 stackSize=8192KB
  | held mutexes=
  at java.lang.Thread.sleep(Native method)
  - sleeping on <0x06059c02> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:442)
  - locked <0x06059c02> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:358)
  at android.os.SystemClock.sleep(SystemClock.java:131)
  at com.devnn.demo.AnrTestActivity.onCreate$lambda-0(AnrTestActivity.kt:17)
  at com.devnn.demo.AnrTestActivity.lambda$UpadNwrDNzrVyNaTI0ysWoH569M(AnrTestActivity.kt:-1)
  at com.devnn.demo.-$$Lambda$AnrTestActivity$UpadNwrDNzrVyNaTI0ysWoH569M.onClick(lambda:-1)
  at android.view.View.performClick(View.java:7448)
  at android.view.View.performClickInternal(View.java:7425)

You can see that the main thread cannot respond to input events because it is sleeping.

The first line of each thread entry has a fixed format:

"main" prio=5 tid=1 Sleeping

The first field is the thread name, the second is its priority, the third is the thread ID, and the fourth is the thread state.

The key piece of information here is the thread state. You can usually tell what caused the ANR just by looking at it. Here the state is Sleeping, so the next step is to read the stack below the header to find the exact location in the code.
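To make the four fields concrete, here is a minimal sketch that parses a thread header line with a regular expression. It assumes the format seen in the traces in this article, including the optional `daemon` marker and an optional trailing note such as `(still starting up)`; `ThreadHeader` is a hypothetical helper, not a platform class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parsed form of a thread header line from an ANR trace (illustrative helper only). */
final class ThreadHeader {
    // Expected shape: "name" [daemon] prio=N tid=N State [(extra note)]
    private static final Pattern HEADER = Pattern.compile(
            "^\"(.+?)\"( daemon)? prio=(\\d+) tid=(\\d+) (\\w+)(?: \\(.*\\))?$");

    final String name;
    final boolean daemon;
    final int priority;
    final int tid;
    final String state;

    private ThreadHeader(String name, boolean daemon, int priority, int tid, String state) {
        this.name = name;
        this.daemon = daemon;
        this.priority = priority;
        this.tid = tid;
        this.state = state;
    }

    /** Returns the parsed header, or null if the line is not a thread header. */
    static ThreadHeader parse(String line) {
        Matcher m = HEADER.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new ThreadHeader(
                m.group(1),
                m.group(2) != null,
                Integer.parseInt(m.group(3)),
                Integer.parseInt(m.group(4)),
                m.group(5));
    }

    public static void main(String[] args) {
        ThreadHeader h = parse("\"main\" prio=5 tid=1 Sleeping");
        System.out.println(h.name + " is " + h.state); // prints "main is Sleeping"
    }
}
```

Stack frames and attribute lines (those starting with `at`, `-`, `|` or `native:`) do not match the pattern, so `parse` returns null for them and a caller can use it to find where each thread entry starts.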

Next, let's look at an example of an ANR caused by a deadlock.

Scenario 2: Deadlock leads to ANR

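    // Called from a click listener on the main thread; Log is android.util.Log.
    private fun clickTest() {
        val obj1 = Object()
        val obj2 = Object()

        Thread {
            synchronized(obj1) {
                Thread.sleep(100)
                // The worker thread already holds obj1's lock and now wants obj2's lock
                synchronized(obj2) {
                    Log.i("AnrTest", "sub")
                }
            }
        }.start()

        synchronized(obj2) {
            Thread.sleep(100)
            // The main thread already holds obj2's lock and now wants obj1's lock
            synchronized(obj1) {
                Log.i("AnrTest", "main")
            }
        }
    }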

The Logcat output is as follows. It again reports that the app failed to respond to an input event.

2022-10-02 16:30:14.001 594-5956/system_process E/ActivityManager: ANR in com.devnn.demo (com.devnn.demo/.AnrTestActivity)
    PID: 5906
    Reason: Input dispatching timed out (1313584 com.devnn.demo/com.devnn.demo.AnrTestActivity (server) is not responding. Waited 5007ms for MotionEvent(deviceId=8, source=0x00005002, displayId=0, action=DOWN, actionButton=0x00000000, flags=0x00000000, metaState=0x00000000, buttonState=0x00000000, classification=NONE, edgeFlags=0x00000000, xPrecision=22.8, yPrecision=12.8, xCursorPosition=nan, yCursorPosition=nan, pointers=[0: (721.0, 1641.9)]), policyFlags=0x62000000)
    Parent: com.devnn.demo/.AnrTestActivity
    Load: 0.8 / 0.67 / 0.39
    ----- Output from /proc/pressure/memory -----
    some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    full avg10=0.00 avg60=0.00 avg300=0.00 total=0
    ----- End output from /proc/pressure/memory -----
    
    CPU usage from 285508ms to 0ms ago (2022-10-02 16:25:25.779 to 2022-10-02 16:30:11.287):
      8.1% 279/[email protected]: 0.6% user + 7.4% kernel
      4.4% 292/[email protected]: 0.3% user + 4.1% kernel
      4.3% 594/system_server: 1.4% user + 2.8% kernel / faults: 19536 minor
      2.8% 318/surfaceflinger: 0.3% user + 2.4% kernel / faults: 871 minor
      2% 300/[email protected]: 0% user + 1.9% kernel
      0.5% 2014/com.android.systemui: 0% user + 0.5% kernel / faults: 4342 minor
      0.4% 365/adbd: 0% user + 0.4% kernel / faults: 946 minor
      0.2% 1152/com.android.launcher3: 0% user + 0.2% kernel / faults: 50 minor
      0.2% 157/logd: 0% user + 0.2% kernel / faults: 13 minor
      0.2% 277/android.hardware.audio.service.ranchu: 0% user + 0.1% kernel / faults: 5 minor
      0.2% 10/rcu_preempt: 0% user + 0.2% kernel
      0.1% 1002/com.android.phone: 0% user + 0% kernel / faults: 1267 minor

The specific cause cannot be seen in Logcat, so we have to turn to the trace file.

----- pid 5906 at 2022-10-02 16:30:11 -----
Cmd line: com.devnn.demo
Build fingerprint: 'Android/sdk_phone_x86_64/generic_x86_64:11/RSR1.210722.012/7758210:userdebug/test-keys'
ABI: 'x86_64'
Build type: optimized

... (irrelevant content omitted)


"main" prio=5 tid=1 Blocked
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x71fb36a8 self=0x7fada5a477b0
  | sysTid=5906 nice=-10 cgrp=top-app sched=0/0 handle=0x7faecb97d4f8
  | state=S schedstat=( 2792813804 2053378730 782 ) utm=161 stm=117 core=0 HZ=100
  | stack=0x7ffc29566000-0x7ffc29568000 stackSize=8192KB
  | held mutexes=
  at com.devnn.demo.AnrTestActivity.clickTest(AnrTestActivity.kt:48)
  - waiting to lock <0x026f6b14> (a java.lang.Object) held by thread 2
  - locked <0x0188dfbd> (a java.lang.Object)
  at com.devnn.demo.AnrTestActivity.onCreate$lambda-1(AnrTestActivity.kt:21)
  at com.devnn.demo.AnrTestActivity.lambda$W1-GSjdjbC-dtyUoueoTRdjL4Es(AnrTestActivity.kt:-1)
  at com.devnn.demo.-$$Lambda$AnrTestActivity$W1-GSjdjbC-dtyUoueoTRdjL4Es.onClick(lambda:-1)
  at android.view.View.performClick(View.java:7448)
  at android.view.View.performClickInternal(View.java:7425)
  at android.view.View.access$3600(View.java:810)
  at android.view.View$PerformClick.run(View.java:28305)
  at android.os.Handler.handleCallback(Handler.java:938)
  at android.os.Handler.dispatchMessage(Handler.java:99)
  at android.os.Looper.loop(Looper.java:223)
  at android.app.ActivityThread.main(ActivityThread.java:7656)
  at java.lang.reflect.Method.invoke(Native method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:592)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:947)

You can see that the main thread's state is Blocked.

  - waiting to lock <0x026f6b14> (a java.lang.Object) held by thread 2
  - locked <0x0188dfbd> (a java.lang.Object)

The stack shows that the main thread is trying to acquire the lock on object 0x026f6b14, which is held by thread 2. At the same time, the main thread holds the lock on object 0x0188dfbd.

Next, look at the stack of thread 2:

"Thread-5" prio=5 tid=2 Blocked
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x12db7fc0 self=0x7fada5a55630
  | sysTid=5953 nice=0 cgrp=top-app sched=0/0 handle=0x7fabdc49fcf0
  | state=S schedstat=( 1560220 17477159 3 ) utm=0 stm=0 core=0 HZ=100
  | stack=0x7fabdc39c000-0x7fabdc39e000 stackSize=1043KB
  | held mutexes=
  at com.devnn.demo.AnrTestActivity.clickTest$lambda-4(AnrTestActivity.kt:39)
  - waiting to lock <0x0188dfbd> (a java.lang.Object) held by thread 1
  - locked <0x026f6b14> (a java.lang.Object)
  at com.devnn.demo.AnrTestActivity.lambda$A4lEoLZVf4n-xUBZSqj2v3ihIqw(AnrTestActivity.kt:-1)
  at com.devnn.demo.-$$Lambda$AnrTestActivity$A4lEoLZVf4n-xUBZSqj2v3ihIqw.run(lambda:-1)
  at java.lang.Thread.run(Thread.java:923)

Thread 2 is also in the Blocked state. It is waiting for the lock on object 0x0188dfbd, which is held by thread 1, while itself holding the lock on object 0x026f6b14.

Each thread holds the lock the other one needs, so neither can proceed. This is an ANR caused by a deadlock.
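The manual reasoning above (follow each `waiting to lock ... held by thread N` line to the holder and see whether the chain loops back) can be sketched in code. This is only an illustration under the assumption that thread entries look like the ones quoted in this article; `DeadlockFinder` is a hypothetical helper and the sample in `main` is abbreviated from the trace above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Finds a lock cycle in the thread section of an ANR trace (illustrative helper only). */
final class DeadlockFinder {
    // Thread header, e.g.: "main" prio=5 tid=1 Blocked
    private static final Pattern HEADER =
            Pattern.compile("^\".+?\".*? tid=(\\d+) ");
    // e.g.:   - waiting to lock <0x026f6b14> (a java.lang.Object) held by thread 2
    private static final Pattern WAITING =
            Pattern.compile("- waiting to lock <0x[0-9a-fA-F]+>.* held by thread (\\d+)");

    /** Maps each blocked thread's tid to the tid of the thread holding the lock it wants. */
    static Map<Integer, Integer> waitForGraph(String threadDump) {
        Map<Integer, Integer> graph = new LinkedHashMap<>();
        int currentTid = -1;
        for (String line : threadDump.split("\\R")) {
            Matcher header = HEADER.matcher(line);
            Matcher waiting = WAITING.matcher(line);
            if (header.find()) {
                currentTid = Integer.parseInt(header.group(1));
            } else if (currentTid != -1 && waiting.find()) {
                graph.put(currentTid, Integer.parseInt(waiting.group(1)));
            }
        }
        return graph;
    }

    /** Returns the tids that form a cycle, or an empty list if there is none. */
    static List<Integer> findCycle(Map<Integer, Integer> graph) {
        for (Integer start : graph.keySet()) {
            List<Integer> path = new ArrayList<>();
            Integer node = start;
            // Follow "waits for" edges until the chain ends or revisits a thread.
            while (node != null && !path.contains(node)) {
                path.add(node);
                node = graph.get(node);
            }
            if (node != null) {
                return new ArrayList<>(path.subList(path.indexOf(node), path.size()));
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        String dump = String.join("\n",
                "\"main\" prio=5 tid=1 Blocked",
                "  - waiting to lock <0x026f6b14> (a java.lang.Object) held by thread 2",
                "  - locked <0x0188dfbd> (a java.lang.Object)",
                "",
                "\"Thread-5\" prio=5 tid=2 Blocked",
                "  - waiting to lock <0x0188dfbd> (a java.lang.Object) held by thread 1",
                "  - locked <0x026f6b14> (a java.lang.Object)");
        System.out.println(findCycle(waitForGraph(dump))); // prints [1, 2]
    }
}
```

Each thread waits on at most one monitor at a time, so a plain map from waiter to holder is enough here and no general graph library is needed.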

Thread states in the trace file

Looking at the thread states in trace files, you will see that a thread can be in a number of states:

"Signal Catcher" daemon prio=10 tid=4 Runnable
"RenderThread" daemon prio=7 tid=21 Native
"DefaultDispatcher-worker-1" daemon prio=5 tid=22 TimedWaiting
"main" prio=5 tid=1 Blocked
"main" prio=5 tid=1 Sleeping
"main" prio=5 tid=1 MONITOR

These are the main states. Several of them are defined in the Thread class, but what are Native and MONITOR?

First, recall the thread states defined in the Thread class:

//java.lang.Thread
public class Thread implements Runnable {

    public enum State {
        /**
         * Thread state for a thread which has not yet started.
         */
        NEW,

        /**
         * Thread state for a runnable thread.  A thread in the runnable
         * state is executing in the Java virtual machine but it may
         * be waiting for other resources from the operating system
         * such as processor.
         */
        RUNNABLE,

        /**
         * Thread state for a thread blocked waiting for a monitor lock.
         * A thread in the blocked state is waiting for a monitor lock
         * to enter a synchronized block/method or
         * reenter a synchronized block/method after calling
         * {@link Object#wait() Object.wait}.
         */
        BLOCKED,

        /**
         * Thread state for a waiting thread.
         * A thread is in the waiting state due to calling one of the
         * following methods:
         * <ul>
         *   <li>{@link Object#wait() Object.wait} with no timeout</li>
         *   <li>{@link #join() Thread.join} with no timeout</li>
         *   <li>{@link LockSupport#park() LockSupport.park}</li>
         * </ul>
         *
         * <p>A thread in the waiting state is waiting for another thread to
         * perform a particular action.
         *
         * For example, a thread that has called <tt>Object.wait()</tt>
         * on an object is waiting for another thread to call
         * <tt>Object.notify()</tt> or <tt>Object.notifyAll()</tt> on
         * that object. A thread that has called <tt>Thread.join()</tt>
         * is waiting for a specified thread to terminate.
         */
        WAITING,

        /**
         * Thread state for a waiting thread with a specified waiting time.
         * A thread is in the timed waiting state due to calling one of
         * the following methods with a specified positive waiting time:
         * <ul>
         *   <li>{@link #sleep Thread.sleep}</li>
         *   <li>{@link Object#wait(long) Object.wait} with timeout</li>
         *   <li>{@link #join(long) Thread.join} with timeout</li>
         *   <li>{@link LockSupport#parkNanos LockSupport.parkNanos}</li>
         *   <li>{@link LockSupport#parkUntil LockSupport.parkUntil}</li>
         * </ul>
         */
        TIMED_WAITING,

        /**
         * Thread state for a terminated thread.
         * The thread has completed execution.
         */
        TERMINATED;
    }
}

VMThread holds the mapping between the native thread states and the Java ones:

//VMThread.java
    /**
     * Holds a mapping from native Thread statuses to Java one. Required for
     * translating back the result of getStatus().
     */
    static final Thread.State[] STATE_MAP = new Thread.State[] {
        Thread.State.TERMINATED,     // ZOMBIE
        Thread.State.RUNNABLE,       // RUNNING
        Thread.State.TIMED_WAITING,  // TIMED_WAIT
        Thread.State.BLOCKED,        // MONITOR
        Thread.State.WAITING,        // WAIT
        Thread.State.NEW,            // INITIALIZING
        Thread.State.NEW,            // STARTING
        Thread.State.RUNNABLE,       // NATIVE
        Thread.State.WAITING,        // VMWAIT
        Thread.State.RUNNABLE        // SUSPENDED
    };

As you can see, NATIVE corresponds to RUNNABLE and MONITOR corresponds to BLOCKED.
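As a small illustration of that table, the sketch below turns it into a lookup by native state name. The names and their Java equivalents are copied from the STATE_MAP comments quoted above; the case-insensitive lookup and the `NativeStateMapper` class are assumptions made for this example, and state names that are not in the quoted table (such as Sleeping) simply yield an empty result.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Looks up the Java Thread.State for a native state name (illustrative helper only). */
final class NativeStateMapper {
    private static final Map<String, Thread.State> STATE_MAP = new HashMap<>();

    static {
        // Entries mirror the VMThread.STATE_MAP table quoted in the article.
        STATE_MAP.put("ZOMBIE", Thread.State.TERMINATED);
        STATE_MAP.put("RUNNING", Thread.State.RUNNABLE);
        STATE_MAP.put("TIMED_WAIT", Thread.State.TIMED_WAITING);
        STATE_MAP.put("MONITOR", Thread.State.BLOCKED);
        STATE_MAP.put("WAIT", Thread.State.WAITING);
        STATE_MAP.put("INITIALIZING", Thread.State.NEW);
        STATE_MAP.put("STARTING", Thread.State.NEW);
        STATE_MAP.put("NATIVE", Thread.State.RUNNABLE);
        STATE_MAP.put("VMWAIT", Thread.State.WAITING);
        STATE_MAP.put("SUSPENDED", Thread.State.RUNNABLE);
    }

    /** Returns the Java state for a native state name, or empty if the name is not in the table. */
    static Optional<Thread.State> toJavaState(String nativeState) {
        return Optional.ofNullable(STATE_MAP.get(nativeState.toUpperCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        System.out.println(toJavaState("Native"));  // prints Optional[RUNNABLE]
        System.out.println(toJavaState("MONITOR")); // prints Optional[BLOCKED]
    }
}
```

Returning an Optional rather than throwing keeps the helper usable on traces that contain state names the table does not list.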

That concludes this introduction to how ANRs are generated and how to analyze them.


Origin blog.csdn.net/devnn/article/details/127138547