Android ANR Trigger Mechanisms (Part 2)

The previous article walked through how a Service ANR is triggered; this one covers the other three ANR trigger paths.

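1. ANR triggered by BroadcastReceiver

A BroadcastReceiver timeout fires when BroadcastQueue.BroadcastHandler, which runs on the ActivityManager thread, receives the BROADCAST_TIMEOUT_MSG message.

There are two broadcast queues, foreground and background. The timeout is 10s for foreground broadcasts and 60s for background broadcasts.

Using a broadcast takes two steps, registering a receiver and sending the broadcast:

① Registering a receiver

Whether registerReceiver is called from an Activity or a Service, the call ends up in ContextImpl's registerReceiver method:

ContextImpl.java: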

@Override

public Intent registerReceiver(BroadcastReceiver receiver, IntentFilter filter, String broadcastPermission, Handler scheduler) {

return registerReceiverInternal(receiver, getUserId(), filter, broadcastPermission, scheduler, getOuterContext(), 0);

}

This method calls registerReceiverInternal:

private Intent registerReceiverInternal( BroadcastReceiver receiver, int userId, IntentFilter filter, String broadcastPermission, Handler scheduler, Context context, int flags) {

……

final Intent intent = ActivityManager.getService().registerReceiver( mMainThread.getApplicationThread(), mBasePackageName, rd, filter, broadcastPermission, userId, flags);

…

}

Here AMS's registerReceiver method is called:

AMS.java:

public Intent registerReceiver(IApplicationThread caller, String callerPackage, IIntentReceiver receiver, IntentFilter filter, String permission, int userId, int flags) {

mRegisteredReceivers.put(receiver.asBinder(), rl); // the receiver information is stored here

}

So registering a receiver is a chain of calls that ends with the BroadcastReceiver's information being stored in AMS.

② Sending a broadcast

sendBroadcast is called on ContextImpl:

ContextImpl.java:

@Override

public void sendBroadcast(Intent intent) {

ActivityManager.getService().broadcastIntent( mMainThread.getApplicationThread(), intent, resolvedType, null, Activity.RESULT_OK, null, null, null, AppOpsManager.OP_NONE, null, false, false, getUserId());

}

Sending a broadcast calls AMS's broadcastIntent method:

AMS.java:

public final int broadcastIntent( IApplicationThread caller, Intent intent, String resolvedType, IIntentReceiver resultTo, int resultCode, String resultData, Bundle resultExtras, String[] requiredPermissions, int appOp, Bundle bOptions, boolean serialized, boolean sticky, int userId) {

int res = broadcastIntentLocked(callerApp, callerApp != null ? callerApp.info.packageName : null, intent, resolvedType, resultTo, resultCode, resultData, resultExtras, requiredPermissions, appOp, bOptions, serialized, sticky, callingPid, callingUid, userId);

}

This method in turn calls broadcastIntentLocked:

final int broadcastIntentLocked(ProcessRecord callerApp, String callerPackage, Intent intent, String resolvedType, IIntentReceiver resultTo, int resultCode, String resultData, Bundle resultExtras, String[] requiredPermissions, int appOp, Bundle bOptions, boolean ordered, boolean sticky, int callingPid, int callingUid, int userId) {

//...

queue.scheduleBroadcastsLocked();

//...

}

Inside broadcastIntentLocked, BroadcastQueue's scheduleBroadcastsLocked method is called, which sends a BROADCAST_INTENT_MSG message:

public void scheduleBroadcastsLocked() {

mHandler.sendMessage( mHandler.obtainMessage(BROADCAST_INTENT_MSG, this));

}

The message is sent through BroadcastHandler and, once received, is handled in its handleMessage method:

BroadcastQueue.java:

private final class BroadcastHandler extends Handler {

@Override

public void handleMessage(Message msg) {

switch (msg.what) {

case BROADCAST_INTENT_MSG:

processNextBroadcast(true);

break;

}

handleMessage calls processNextBroadcast:

final void processNextBroadcast(boolean fromMsg) {

synchronized (mService) {

processNextBroadcastLocked(fromMsg, false);

}

This calls processNextBroadcastLocked:

final void processNextBroadcastLocked(boolean fromMsg, boolean skipOomAdj) {

// (plant the bomb) set the timeout

setBroadcastTimeoutLocked(timeoutTime);

// invoke onReceive(); if it does not finish within the delay, an ANR is triggered (the bomb goes off)

deliverToRegisteredReceiverLocked(r, filter, r.ordered, recIdx);

// (defuse the bomb) onReceive finished in time, so cancel the delayed message

cancelBroadcastTimeoutLocked();

}

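These three calls, made while a broadcast is being delivered, are the important ones. setBroadcastTimeoutLocked sets the timeout, which plants the bomb. deliverToRegisteredReceiverLocked ends up invoking onReceive(); if that does not finish within the delay, an ANR is triggered and the bomb goes off. cancelBroadcastTimeoutLocked() cancels the delayed message, which defuses the bomb.

Let's look at each in turn:

① Planting the bomb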

final void setBroadcastTimeoutLocked(long timeoutTime) {

Message msg = mHandler.obtainMessage( BROADCAST_TIMEOUT_MSG, this);

mHandler.sendMessageAtTime(msg, timeoutTime);

}

static final int BROADCAST_FG_TIMEOUT = 10*1000;

static final int BROADCAST_BG_TIMEOUT = 60*1000;

When a broadcast is delivered, a delayed BROADCAST_TIMEOUT_MSG message (the bomb) is posted at the same time.
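How long the fuse is depends on which queue the broadcast went to. The sketch below shows one way that choice could look. It assumes the queue is picked from the intent's Intent.FLAG_RECEIVER_FOREGROUND bit; the class and method names are hypothetical, and only the two timeout values come from the constants shown above.

```java
// Sketch only: BroadcastTimeouts and timeoutForFlags are hypothetical names, not AOSP code.
final class BroadcastTimeouts {
    // Assumed to match android.content.Intent.FLAG_RECEIVER_FOREGROUND.
    static final int FLAG_RECEIVER_FOREGROUND = 0x10000000;

    // Values taken from BROADCAST_FG_TIMEOUT and BROADCAST_BG_TIMEOUT in the text.
    static final long BROADCAST_FG_TIMEOUT_MS = 10 * 1000L;
    static final long BROADCAST_BG_TIMEOUT_MS = 60 * 1000L;

    /** Returns the per-receiver timeout for a broadcast with the given intent flags. */
    static long timeoutForFlags(int intentFlags) {
        boolean foreground = (intentFlags & FLAG_RECEIVER_FOREGROUND) != 0;
        return foreground ? BROADCAST_FG_TIMEOUT_MS : BROADCAST_BG_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        System.out.println(timeoutForFlags(FLAG_RECEIVER_FOREGROUND));
        System.out.println(timeoutForFlags(0));
    }
}
```

A sender that needs the shorter fuse would set the foreground flag on its intent; everything else falls into the background queue.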

② Setting off the bomb

private void deliverToRegisteredReceiverLocked( BroadcastRecord r, BroadcastFilter filter, boolean ordered, int index) {

performReceiveLocked(filter.receiverList.app, filter.receiverList.receiver, new Intent(r.intent), r.resultCode, r.resultData, r.resultExtras, r.ordered, r.initialSticky, r.userId);

...

}

This method calls performReceiveLocked:

BroadcastQueue.java:

void performReceiveLocked(ProcessRecord app, IIntentReceiver receiver, Intent intent, int resultCode, String data, Bundle extras, boolean ordered, boolean sticky, int sendingUser) throws RemoteException {

…

app.thread.scheduleRegisteredReceiver( receiver, intent, resultCode, data, extras, ordered, sticky, sendingUser, app.repProcState);

}

This calls scheduleRegisteredReceiver on the ApplicationThread inside ActivityThread:

ActivityThread$ApplicationThread.java:

public void scheduleRegisteredReceiver( IIntentReceiver receiver, Intent intent, int resultCode, String dataStr, Bundle extras, boolean ordered, boolean sticky, int sendingUser, int processState) throws RemoteException {

…

receiver.performReceive(intent, resultCode, dataStr, extras, ordered, sticky, sendingUser);

}

This calls performReceive on InnerReceiver, an inner class of LoadedApk.ReceiverDispatcher:

@Override

public void performReceive(Intent intent, int resultCode, String data, Bundle extras, boolean ordered, boolean sticky, int sendingUser) {

final LoadedApk.ReceiverDispatcher rd;

rd.performReceive(intent, resultCode, data, extras, ordered, sticky, sendingUser);

}

This in turn calls performReceive on LoadedApk.ReceiverDispatcher:

public void performReceive(Intent intent, int resultCode, String data, Bundle extras, boolean ordered, boolean sticky, int sendingUser) {

if (intent == null || !mActivityThread.post(args.getRunnable())) {

…

}

Here post is called on mActivityThread, a field of ReceiverDispatcher of type Handler, to send off a runnable.

public final Runnable getRunnable() {

return () -> {

// invoke onReceive

receiver.onReceive(mContext, intent);

…

}

onReceive is invoked here.

③ Defusing the bomb

Once onReceive has finished, cancelBroadcastTimeoutLocked is called to cancel the delayed message, which defuses the bomb.

BroadcastQueue.java:

final void cancelBroadcastTimeoutLocked() {

mHandler.removeMessages( BROADCAST_TIMEOUT_MSG, this);

}

The timeout-check message has now been removed, so no ANR is triggered.

If the work does not finish in time, the BROADCAST_TIMEOUT_MSG message gets handled:

@Override

public void handleMessage(Message msg) {

switch (msg.what) {

case BROADCAST_TIMEOUT_MSG:

synchronized (mService) {

broadcastTimeoutLocked(true);

}

break;

}

broadcastTimeoutLocked is called to trigger the ANR:

final void broadcastTimeoutLocked(boolean fromMsg) {

...

mHandler.post(new AppNotResponding(app, anrMessage));

}

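To summarize the BroadcastReceiver ANR flow:

① When a broadcast is delivered, a delayed BROADCAST_TIMEOUT_MSG message is posted through the Handler (planting the bomb).

② If onReceive finishes within the delay, that delayed message is cancelled (defusing the bomb).

③ Otherwise the Handler's handleMessage method runs to process the timeout message (the bomb goes off).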
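The three steps above can be tried outside Android with a plain Java timer. This is a minimal sketch under stated assumptions: a ScheduledExecutorService stands in for the Handler, a boolean flag stands in for the ANR, and TimeoutBombDemo and its methods are hypothetical names rather than anything in BroadcastQueue.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only: a stand-in for the Handler-based timeout logic, not AOSP code.
final class TimeoutBombDemo {
    /** Runs work under a timeout; returns true if the timeout fired before the work finished. */
    static boolean runWithTimeout(Runnable work, long timeoutMs) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicBoolean timedOut = new AtomicBoolean(false);
        try {
            // Plant the bomb: comparable to posting the delayed BROADCAST_TIMEOUT_MSG.
            ScheduledFuture<?> bomb =
                    timer.schedule(() -> timedOut.set(true), timeoutMs, TimeUnit.MILLISECONDS);
            // Comparable to onReceive() running in the app.
            work.run();
            // Defuse: comparable to removeMessages(BROADCAST_TIMEOUT_MSG). No effect if it already fired.
            bomb.cancel(false);
        } finally {
            timer.shutdownNow();
        }
        return timedOut.get();
    }

    static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("fast receiver timed out: " + runWithTimeout(() -> sleepQuietly(10), 500));
        System.out.println("slow receiver timed out: " + runWithTimeout(() -> sleepQuietly(600), 100));
    }
}
```

The point of the design is that the check costs nothing when the work is fast: the delayed task is simply removed before it ever runs.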

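2. ANR triggered by ContentProvider

A ContentProvider timeout fires when AMS.MainHandler, which runs on the ActivityManager thread, receives the CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG message.

The ContentProvider timeout is 10s.

1) Step 1: planting the bomb

ContentProvider registration begins as the process starts. Once the process has been created, if it has ContentProviders, the flow enters the process hosting AMS and calls attachApplicationLocked():

ActivityManagerService.java: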

private final boolean attachApplicationLocked( IApplicationThread thread, int pid) {

boolean normalMode = mProcessesReady || isAllowedWhileBooting(app.info);

List<ProviderInfo> providers = normalMode ? generateApplicationProvidersLocked(app) : null;

// (plant the bomb) if the app process has providers that are still launching, send CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG after 10s

if (providers != null && checkAppInLaunchingProvidersLocked(app)) {

Message msg = mHandler.obtainMessage( CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG);

msg.obj = app;

mHandler.sendMessageDelayed(msg, CONTENT_PROVIDER_PUBLISH_TIMEOUT);

}

static final int CONTENT_PROVIDER_PUBLISH_TIMEOUT = 10*1000; //10s

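When the Application is bound, the code checks whether there are ContentProviders; if so, it posts the delayed CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG message (the bomb) through the Handler.

2) Step 2: defusing the bomb

After AT.installContentProviders() finishes installing, it calls AMS.publishContentProviders(). In other words, once the providers are published successfully, the bomb is defused: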

public final void publishContentProviders( IApplicationThread caller, List<ContentProviderHolder> providers) {

// remove the message once the publish succeeds

if (wasInLaunchingProviders) {

mHandler.removeMessages(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG, r);

}

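3) Step 3: setting off the bomb

If the delayed message is not removed within the allotted time, the bomb goes off. CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG is handled in AMS.MainHandler: when the countdown ends, the delayed message is delivered to that Handler's thread. MainHandler is an inner class of AMS.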

final class MainHandler extends Handler {

public void handleMessage(Message msg) {

switch (msg.what) {

case CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG:

ProcessRecord app = (ProcessRecord)msg.obj;

synchronized ( ActivityManagerService.this) {

processContentProviderPublishTimedOutLocked(app);

}

break;

}

private final void processContentProviderPublishTimedOutLocked(ProcessRecord app) {

cleanupAppInLaunchingProvidersLocked(app, true); // remove the dead provider

mProcessList.removeProcessLocked(app, false, true, "timeout publishing content providers"); // remove the corresponding entry from mProcessList

}

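To summarize the ContentProvider ANR flow:

① When the app process starts, any ContentProvider it has gets registered, and a delayed message (the bomb) is posted during registration.

② If the ContentProvider is published successfully before the delay expires, the delayed message is removed (defusing the bomb).

③ Otherwise the bomb goes off and the timeout handling shown above runs, cleaning up the provider and removing the process.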
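The provider flow can also be modelled without real timers by passing the clock in explicitly, which keeps the outcome easy to reason about. This is only a sketch: ProviderPublishWatchdog, its method names, and the use of a process name as the key are assumptions made for illustration, and only the 10s value comes from CONTENT_PROVIDER_PUBLISH_TIMEOUT above.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a hypothetical model of the publish timeout, not the AMS implementation.
final class ProviderPublishWatchdog {
    static final long CONTENT_PROVIDER_PUBLISH_TIMEOUT_MS = 10 * 1000L;

    // Process name -> time (ms) at which its pending publish timeout expires.
    private final Map<String, Long> deadlines = new HashMap<>();

    /** Plant the bomb: only processes with providers still launching get a deadline. */
    void onAttachApplication(String process, boolean hasLaunchingProviders, long nowMs) {
        if (hasLaunchingProviders) {
            deadlines.put(process, nowMs + CONTENT_PROVIDER_PUBLISH_TIMEOUT_MS);
        }
    }

    /** Defuse: the providers were published, so the pending timeout is dropped. */
    void onPublishContentProviders(String process) {
        deadlines.remove(process);
    }

    /** Returns true if the deadline has passed, i.e. the process would be cleaned up and removed. */
    boolean checkTimeout(String process, long nowMs) {
        Long deadline = deadlines.get(process);
        if (deadline == null || nowMs < deadline) {
            return false;
        }
        deadlines.remove(process);
        return true;
    }
}
```

For instance, a process attached at time 0 and published at 3000ms never trips the check, while one that is still unpublished at 10000ms does.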

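3. The InputDispatching timeout mechanism

Timeout detection for input is quite different from that for service, broadcast, and provider. To follow the input path, first look at what two key threads do:

① The InputReader thread reads input events through EventHub (which watches the /dev/input directory). As soon as it sees an input event, it places the event in InputDispatcher's mInBoundQueue and notifies InputDispatcher to process it;

② The InputDispatcher thread delivers the input events it receives to the target application window. Delivery uses three event queues: mInBoundQueue holds input events sent over by InputReader; outBoundQueue holds input events about to be delivered to the target application window; waitQueue holds input events that have been delivered to the target application but that the application has not yet finished handling.

So InputReader keeps watching EventHub for input events and hands them to InputDispatcher. InputDispatcher calls dispatchOnce() to start delivering events to the corresponding View, and ANR monitoring starts from this dispatch. The ANR window for InputDispatcher runs from the window lookup in findFocusedWindowTargetsLocked() to the reset in resetANRTimeoutsLocked().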

void InputDispatcher::dispatchOnce() {

...

// call dispatchOnceInnerLocked

dispatchOnceInnerLocked(&nextWakeupTime);

}

void InputDispatcher::dispatchOnceInnerLocked( nsecs_t* nextWakeupTime) {

nsecs_t currentTime = now();

...

resetANRTimeoutsLocked(); // reset the flag

switch (mPendingEvent->type) {

case EventEntry::TYPE_KEY:

// key event

done = dispatchKeyLocked(currentTime, typedEntry, &dropReason, nextWakeupTime);

break;

case EventEntry::TYPE_MOTION:

done = dispatchMotionLocked(currentTime, typedEntry, &dropReason, nextWakeupTime);

break;

default:

ALOG_ASSERT(false);

break;

}

void InputDispatcher::resetANRTimeoutsLocked() {

// set mInputTargetWaitCause to INPUT_TARGET_WAIT_CAUSE_NONE

mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_NONE;

mInputTargetWaitApplicationToken.clear();

}

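Before dispatching, resetANRTimeoutsLocked() is called to reset the mInputTargetWaitCause flag to INPUT_TARGET_WAIT_CAUSE_NONE. Then, depending on the type of event being delivered, the matching window is looked up; for the KEY type, for example, dispatchKeyLocked() is called.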

bool InputDispatcher::dispatchKeyLocked( nsecs_t currentTime, KeyEntry*entry, DropReason*dropReason, nsecs_t*nextWakeupTime) {

// find the target window

int32_t injectionResult = findFocusedWindowTargetsLocked(currentTime, entry, inputTargets, nextWakeupTime);

// dispatch the event to the target window

dispatchEventLocked(currentTime, entry, inputTargets);

return true;

}

int32_t InputDispatcher::findFocusedWindowTargetsLocked(nsecs_t currentTime, const EventEntry* entry, std::vector<InputTarget>& inputTargets, nsecs_t* nextWakeupTime) {

...

// check why the window cannot take input

reason = checkWindowReadyForMoreInputLocked(currentTime, focusedWindowHandle, entry, "focused");

if (!reason.empty()) {

// call handleTargetsNotReadyLocked()

injectionResult = handleTargetsNotReadyLocked(currentTime, entry, focusedApplicationHandle, focusedWindowHandle, nextWakeupTime, reason.c_str());

goto Unresponsive;

}

...

return injectionResult;

}

int32_t InputDispatcher::handleTargetsNotReadyLocked(nsecs_t currentTime, const EventEntry* entry, const sp<InputApplicationHandle>& applicationHandle, const sp<InputWindowHandle>& windowHandle, nsecs_t* nextWakeupTime, const char* reason) {

// resetANRTimeoutsLocked set mInputTargetWaitCause to INPUT_TARGET_WAIT_CAUSE_NONE

if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY) {

// DEFAULT_INPUT_DISPATCHING_TIMEOUT is 5s

nsecs_t timeout;

if (windowHandle != nullptr) {

timeout = windowHandle->getDispatchingTimeout(DEFAULT_INPUT_DISPATCHING_TIMEOUT);

} else if (applicationHandle != nullptr) {

timeout = applicationHandle->getDispatchingTimeout(DEFAULT_INPUT_DISPATCHING_TIMEOUT);

} else {

timeout = DEFAULT_INPUT_DISPATCHING_TIMEOUT;

}

// this branch cannot be entered again until resetANRTimeoutsLocked is called next

mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY;

// current time plus 5s

mInputTargetWaitTimeoutTime = currentTime + timeout;

mInputTargetWaitTimeoutExpired = false;

mInputTargetWaitApplicationToken.clear();

}

if (mInputTargetWaitTimeoutExpired) {

return INPUT_EVENT_INJECTION_TIMED_OUT;

}

if (currentTime >= mInputTargetWaitTimeoutTime) {

// the current time is past the 5s deadline, so run the ANR handling in onANRLocked()

onANRLocked(currentTime, applicationHandle, windowHandle, entry->eventTime, mInputTargetWaitStartTime, reason);

return INPUT_EVENT_INJECTION_PENDING;

} else {

// Force poll loop to wake up when timeout is due.

if (mInputTargetWaitTimeoutTime < *nextWakeupTime){

*nextWakeupTime = mInputTargetWaitTimeoutTime;

}

return INPUT_EVENT_INJECTION_PENDING;

}

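When an event is first dispatched, resetANRTimeoutsLocked sets the flag to INPUT_TARGET_WAIT_CAUSE_NONE, so the first pass sets a timeout 5s in the future and changes the flag to INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY. If, when the next event arrives, the current time is past that 5s deadline, onANRLocked() is called and an ANR is raised.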
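The lazy check described above can be sketched in a few lines of Java with an explicit clock. This is a simplified model, not the native code: InputWaitTracker and its Result enum are hypothetical, a boolean replaces mInputTargetWaitCause, and the sketch returns a distinct ANR value where the real method calls onANRLocked() and still returns a pending result.

```java
// Sketch only: a hypothetical Java model of handleTargetsNotReadyLocked(), not the C++ implementation.
final class InputWaitTracker {
    static final long DEFAULT_INPUT_DISPATCHING_TIMEOUT_MS = 5_000L;

    enum Result { PENDING, ANR }

    // Stands in for mInputTargetWaitCause == INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY.
    private boolean waiting = false;
    // Stands in for mInputTargetWaitTimeoutTime.
    private long waitTimeoutTimeMs;

    /** Models resetANRTimeoutsLocked(): called when a new pending event is picked up. */
    void reset() {
        waiting = false;
    }

    /** Called each time a dispatch attempt finds the target window not ready. */
    Result onTargetNotReady(long nowMs) {
        if (!waiting) {
            // First miss since the last reset: record the deadline once.
            waiting = true;
            waitTimeoutTimeMs = nowMs + DEFAULT_INPUT_DISPATCHING_TIMEOUT_MS;
        }
        // The deadline is only examined when a dispatch attempt happens.
        return nowMs >= waitTimeoutTimeMs ? Result.ANR : Result.PENDING;
    }
}
```

Nothing in this model fires on its own at the 5s mark; the deadline is only compared against the clock when a later dispatch attempt comes through.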

void InputDispatcher::onANRLocked(nsecs_t currentTime, const sp<InputApplicationHandle>&applicationHandle, const sp<InputWindowHandle>&windowHandle, nsecs_t eventTime, nsecs_t waitStartTime, const char*reason) {

float dispatchLatency = (currentTime - eventTime) * 0.000001f;

float waitDuration = (currentTime - waitStartTime) * 0.000001f;

// collect information about the ANR scene

time_t t = time(nullptr);

struct tm tm;

localtime_r(&t, &tm);

char timestr[64];

strftime(timestr, sizeof(timestr), "%F %T", &tm);

mLastANRState.clear();

mLastANRState += INDENT "ANR:\n";

mLastANRState += StringPrintf( INDENT2"Time: %s\n", timestr);

mLastANRState += StringPrintf( INDENT2"Window: %s\n", getApplicationWindowLabel(applicationHandle, windowHandle).c_str());

mLastANRState += StringPrintf( INDENT2"DispatchLatency: %0.1fms\n", dispatchLatency);

mLastANRState += StringPrintf( INDENT2"WaitDuration: %0.1fms\n", waitDuration);

mLastANRState += StringPrintf( INDENT2"Reason: %s\n", reason);

// dump the state

dumpDispatchStateLocked(mLastANRState);

// add the ANR command to commandQueue

CommandEntry* commandEntry = postCommandLocked(&InputDispatcher::doNotifyANRLockedInterruptible);

commandEntry->inputApplicationHandle = applicationHandle;

commandEntry->inputChannel = windowHandle != nullptr ? getInputChannelLocked(windowHandle->getToken()) : nullptr;

commandEntry->reason = reason;

}

The next time InputDispatcher.dispatchOnce runs, it first executes the commands in commandQueue; here InputDispatcher::doNotifyANRLockedInterruptible has been put on that queue.

void InputDispatcher::doNotifyANRLockedInterruptible(CommandEntry* commandEntry) {

mLock.unlock();

// mPolicy refers to NativeInputManager

nsecs_t newTimeout = mPolicy->notifyANR(commandEntry->inputApplicationHandle, commandEntry->inputChannel ? commandEntry->inputChannel->getToken() : nullptr, commandEntry->reason);

mLock.lock();

resumeAfterTargetsNotReadyTimeoutLocked(newTimeout, commandEntry->inputChannel);

}

mPolicy->notifyANR goes through JNI and ends up in InputManagerService.notifyANR():

private long notifyANR(InputApplicationHandle inputApplicationHandle, IBinder token, String reason) {

return mWindowManagerCallbacks.notifyANR( inputApplicationHandle, token, reason);

}

mWindowManagerCallbacks here is an InputManagerCallback object.

InputManagerCallback.java:

public long notifyANR(InputApplicationHandle inputApplicationHandle, IBinder token, String reason) {

final long startTime = SystemClock.uptimeMillis();

try {

return notifyANRInner( inputApplicationHandle, token, reason);

} finally {

}

private long notifyANRInner( InputApplicationHandle inputApplicationHandle, IBinder token, String reason) {

...

// call AMS's inputDispatchingTimedOut() method

long timeout = mService.mAmInternal.inputDispatchingTimedOut(windowPid, aboveSystem, reason);

return 0; // abort dispatching

}

This finally reaches AMS.inputDispatchingTimedOut():

AMS.java:

long inputDispatchingTimedOut(int pid, final boolean aboveSystem, String reason) {

if (checkCallingPermission(FILTER_EVENTS) != PackageManager.PERMISSION_GRANTED) {

throw new SecurityException("Requires permission " + FILTER_EVENTS);

}

ProcessRecord proc;

long timeout;

synchronized (this) {

synchronized (mPidsSelfLocked) {

proc = mPidsSelfLocked.get(pid);

}

timeout = proc != null ? proc.getInputDispatchingTimeout() : KEY_DISPATCHING_TIMEOUT_MS;

}

// call inputDispatchingTimedOut

if (inputDispatchingTimedOut(proc, null, null, null, null, aboveSystem, reason)) {

return -1;

}

return timeout;

}

boolean inputDispatchingTimedOut( ProcessRecord proc, String activityShortComponentName, ApplicationInfo aInfo, String parentShortComponentName, WindowProcessController parentProcess, boolean aboveSystem, String reason) {

// call appNotResponding

mAnrHelper.appNotResponding(proc, activityShortComponentName, aInfo, parentShortComponentName, parentProcess, aboveSystem, annotation);

return true;

}

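In the end, appNotResponding() is still what gets executed.

Note that the input timeout does not necessarily explode the moment time is up. Whether it should explode is only checked while later reported events are being processed, which makes it more like sweeping for mines. The flow is as follows:

① The InputReader thread listens through EventHub for input events reported from below. As soon as it receives one, it places the event in mInBoundQueue and wakes the InputDispatcher thread;

② InputDispatcher starts dispatching the input event and sets the starting time for the mine.

It first checks whether an event is already being processed (mPendingEvent). If not, it takes the event at the head of mInBoundQueue, assigns it to mPendingEvent, and resets the ANR timeout; otherwise it neither takes an event from mInBoundQueue nor resets the timeout.

It then checks whether the window is ready (checkWindowReadyForMoreInputLocked). If either of the following holds:

1) for a key input event, outboundQueue or waitQueue is not empty;

2) for a non-key input event, waitQueue is not empty and the event at its head has been waiting for more than 500ms;

then it enters the mine-sweeping state (checking whether the previous event still being processed has timed out) and ends this round of dispatch. Otherwise it continues to step ③. Once the application window is ready, mPendingEvent is moved to outBoundQueue.

③ When outBoundQueue is not empty and the connection to the application's end of the channel is healthy, the event is taken out of outboundQueue and placed in waitQueue;

④ InputDispatcher tells the target application's process over a socket that it can start working;

⑤ At initialization the app already created, by default, a socketpair for two-way communication with the control center. The app's foreman (the main thread) receives the input event and forwards it layer by layer to the target window for handling;

⑥ When the foreman finishes the job, it reports completion to the control center over the socket, and the control center removes the event from waitQueue.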
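The two readiness conditions in step ② can be written as a small predicate. This is a sketch based only on the description above: WindowReadiness, mustWait, and the constant name are hypothetical, and the strict "more than 500ms" comparison is an assumption about the boundary case.

```java
// Sketch only: hypothetical names; this mirrors the prose rules, not the InputDispatcher source.
final class WindowReadiness {
    // From the text: a non-key event waits only if the head of waitQueue has waited over 500ms.
    static final long NON_KEY_HEAD_WAIT_LIMIT_MS = 500L;

    /**
     * Returns true if the window is not ready for more input, which is when
     * the dispatcher would enter the mine-sweeping state instead of dispatching.
     */
    static boolean mustWait(boolean isKeyEvent, int outboundQueueSize,
                            int waitQueueSize, long headWaitMs) {
        if (isKeyEvent) {
            // Key events wait for everything already queued or in flight.
            return outboundQueueSize > 0 || waitQueueSize > 0;
        }
        // Other events wait only if the app has been sitting on the oldest event for too long.
        return waitQueueSize > 0 && headWaitMs > NON_KEY_HEAD_WAIT_LIMIT_MS;
    }
}
```

Read this way, key events are the stricter case: any unfinished earlier event holds them back, while motion events keep streaming unless the app has been stuck on the oldest one for over half a second.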

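Why is the input timeout a mine sweep rather than a timed explosion? Because for input, even if handling one event takes longer than the timeout, no ANR is triggered as long as the user produces no further input events. Mine sweeping here means that, while the input system is still processing some slow event, every later input event checks whether the previous event being processed has timed out (entering the mine-sweeping state), that is, whether the current time is more than the timeout past the point at which the last input event was dispatched. If the previous input event has already been handled, the ANR timeout is reset and nothing explodes.

At this point the causes of ANRs for service, broadcast, provider, and input are all clear. Next, let's see how ANR information is collected.

4. The appNotResponding flow

Whatever the kind of ANR, it ends up calling ProcessRecord's appNotResponding method:

ProcessRecord.java: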

void appNotResponding(String activityShortComponentName, ApplicationInfo aInfo, String parentShortComponentName, WindowProcessController parentProcess, boolean aboveSystem, String annotation) {

ArrayList<Integer> firstPids = new ArrayList<>(5);

SparseArray<Boolean> lastPids = new SparseArray<>(20);

mWindowProcessController.appEarlyNotResponding(annotation, () -> kill("anr", true));

// ANR time; note that the stacks collected when the ANR fires are not necessarily the stacks that caused the ANR

long anrTime = SystemClock.uptimeMillis();

if (isMonitorCpuUsage()) {

mService.updateCpuStatsNow();

}

synchronized (mService) {

// record the ANR in the event log

EventLog.writeEvent( EventLogTags.AM_ANR, userId, pid, processName, info.flags, annotation);

}

// record the ANR in the main log

StringBuilder info = new StringBuilder();

info.setLength(0);

info.append("ANR in ").append(processName);

if (activityShortComponentName != null) {

info.append(" (").append( activityShortComponentName).append(")");

}

info.append("\n");

info.append("PID: ").append(pid).append("\n");

if (annotation != null) {

info.append("Reason: ").append( annotation).append("\n");

}

if (parentShortComponentName != null && parentShortComponentName.equals(activityShortComponentName)) {

info.append("Parent: ").append( parentShortComponentName).append("\n");

}

// create the CPU tracker object

ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);

// collect stack traces

File tracesFile = ActivityManagerService.dumpStackTraces(firstPids, (isSilentAnr()) ? null : processCpuTracker, (isSilentAnr()) ? null : lastPids, nativePids);

String cpuInfo = null;

// add CPU information

if (isMonitorCpuUsage()) {

mService.updateCpuStatsNow();

synchronized (mService.mProcessCpuTracker) {

cpuInfo = mService.mProcessCpuTracker.printCurrentState(anrTime);

}

info.append( processCpuTracker.printCurrentLoad());

info.append(cpuInfo);

}

info.append( processCpuTracker.printCurrentState(anrTime));

}

StatsLog.write(StatsLog.ANR_OCCURRED, uid, processName, activityShortComponentName == null ? "unknown": activityShortComponentName, annotation, (this.info != null) ? (this.info.isInstantApp() ? StatsLog.ANROCCURRED__IS_INSTANT_APP__TRUE : StatsLog.ANROCCURRED__IS_INSTANT_APP__FALSE) : StatsLog.ANROCCURRED__IS_INSTANT_APP__UNAVAILABLE, isInterestingToUserLocked() ? StatsLog.ANROCCURRED__FOREGROUND_STATE__FOREGROUND : StatsLog.ANROCCURRED__FOREGROUND_STATE__BACKGROUND, getProcessClassEnum(),(this.info != null) ? this.info.packageName : "");

final ProcessRecord parentPr = parentProcess != null ? (ProcessRecord) parentProcess.mOwner : null;

// save the traces file and the CPU usage information to dropbox, i.e. the data/system/dropbox directory

mService.addErrorToDropBox("anr", this, processName, activityShortComponentName, parentShortComponentName, parentPr, annotation, cpuInfo, tracesFile, null);

if (mWindowProcessController.appNotResponding(info.toString(), () -> kill("anr", true), () -> {

synchronized (mService) {

mService.mServices.scheduleServiceTimeoutLocked(this);

}

})) {

return;

}

makeAppNotRespondingLocked( activityShortComponentName, annotation != null ? "ANR " + annotation : "ANR", info.toString());

if (mService.mUiHandler != null) {

Message msg = Message.obtain();

msg.what = ActivityManagerService.SHOW_NOT_RESPONDING_UI_MSG;

msg.obj = new AppNotRespondingDialog.Data(this, aInfo, aboveSystem);

// send the ANR dialog message

mService.mUiHandler.sendMessage( msg);

}

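When an ANR occurs, the following happen in order:

① The ANR reason is written to the EventLog. In other words, the am_anr entry in the EventLog is the record closest to the moment the ANR was triggered;

② The traces of every thread in the list of important processes are collected and written out; this step is fairly time-consuming;

③ The current CPU usage of each process and the CPU load are written out;

④ The traces file and the CPU usage information are saved to dropbox, i.e. the data/system/dropbox directory;

⑤ Depending on the process type, the process is either killed silently in the background or a dialog is shown to the user.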

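5. Summary

① When an ANR occurs, the path always reaches AMS.appNotResponding(), with provider as the exception.

② Timeout lengths

For foreground services, the timeout is SERVICE_TIMEOUT = 20s;

For background services, the timeout is SERVICE_BACKGROUND_TIMEOUT = 200s;

For foreground broadcasts, the timeout is BROADCAST_FG_TIMEOUT = 10s;

For background broadcasts, the timeout is BROADCAST_BG_TIMEOUT = 60s;

For ContentProvider, the timeout is CONTENT_PROVIDER_PUBLISH_TIMEOUT = 10s;

③ Timeout detection

Service timeout detection:

If the operation does not complete in time to remove the delayed message, an ANR is triggered;

BroadcastReceiver timeout detection:

If the total execution time of an ordered broadcast exceeds 2 * number of receivers * timeout, an ANR is triggered;

If any single receiver of an ordered broadcast runs longer than the timeout, an ANR is triggered;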

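④ For Service, Broadcast, and Input, an ANR always ends up calling AMS.appNotResponding;

For provider, an ANR can occur during publish as the process starts; in that case the process is killed directly and the related information is cleaned up, with no ANR dialog shown. The appNotRespondingViaProvider() path does go through appNotResponding(); it is not covered here because it is rarely used and its timeout is defined by the caller.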
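As a worked example of the ordered-broadcast rule in ③, the total budget follows directly from the formula 2 * number of receivers * timeout. The helper below is a hypothetical illustration of that arithmetic only; the class and method names are made up for this sketch.

```java
// Sketch only: OrderedBroadcastBudget is a hypothetical helper for the arithmetic in the summary.
final class OrderedBroadcastBudget {
    /** Total time an ordered broadcast may take: 2 * number of receivers * per-receiver timeout. */
    static long totalTimeoutMs(int numReceivers, long perReceiverTimeoutMs) {
        return 2L * numReceivers * perReceiverTimeoutMs;
    }

    public static void main(String[] args) {
        // Three receivers on the foreground queue (10s each): 2 * 3 * 10s = 60s.
        System.out.println(totalTimeoutMs(3, 10_000L));
        // The same three receivers on the background queue (60s each): 2 * 3 * 60s = 360s.
        System.out.println(totalTimeoutMs(3, 60_000L));
    }
}
```

So an ordered broadcast can hit an ANR in two ways: one receiver overrunning its own timeout, or the whole chain overrunning this total even when each receiver stays under its individual limit.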
