[Android] ANR principle analysis (Service)

Foreword

ANR stands for Application Not Responding.

The Android system expects certain events to be completed within a fixed time window. If no response arrives within that window, or the response takes too long, an ANR is raised.

All ANR-related messages are scheduled by system_server and then dispatched to the application process, which does the actual processing. If the application times out while processing a message, the system collects some system state, such as CPU/IO usage and the process's call stacks, reports that an ANR occurred in the user's process, and pops up the ANR dialog box.

The ANR mechanism can be divided into two parts:

  • Monitoring mechanism: each ANR type, such as Broadcast, Service, and InputEvent, has its own monitoring mechanism
  • Reporting mechanism: once an ANR is detected, the ANR dialog box must be displayed and the log written

Service ANR principle

Android detects Service timeouts by posting delayed messages through the Handler mechanism.

A Service runs on the main thread of the application. If a Service executing in a foreground process takes more than 20 seconds, an ANR is triggered. If a Service executing in a background process takes more than 200 seconds, an ANR is triggered.
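
The two limits above can be written down as a tiny selection function. This is only a minimal sketch: the class name ServiceTimeouts and the method timeoutFor are hypothetical names made up for illustration, and the two values simply mirror the 20-second and 200-second limits stated above.

```java
final class ServiceTimeouts {
    // Limits described in the text, in milliseconds.
    static final long SERVICE_TIMEOUT = 20_000;
    static final long SERVICE_BACKGROUND_TIMEOUT = 200_000;

    /** Returns how long a Service may execute before an ANR is triggered. */
    static long timeoutFor(boolean executingInForeground) {
        return executingInForeground ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT;
    }

    public static void main(String[] args) {
        System.out.println("foreground limit: " + timeoutFor(true) + " ms");
        System.out.println("background limit: " + timeoutFor(false) + " ms");
    }
}
```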

Troubleshooting a Service ANR takes two steps:

  • First, check whether any Service life cycle function performs a time-consuming operation
  • If nothing is found in the application-layer code, check the system state at the time, such as CPU usage and system service status, to determine whether the process that hit the ANR was affected by abnormal system behavior
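
For the first step, the usual fix is to keep life cycle functions short and push slow work onto another thread. The following is a rough plain-Java sketch of that idea, not Android code: OffloadingService, onStartCommandLike and onDestroyLike are hypothetical stand-ins for a real Service and its callbacks, and a CountDownLatch plays the role of slow I/O.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Plain-Java stand-in for a Service; it does not extend android.app.Service. */
final class OffloadingService {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /**
     * Plays the role of a life cycle callback such as onStartCommand:
     * it only queues the slow task and returns right away.
     */
    Future<?> onStartCommandLike(Runnable slowTask) {
        return worker.submit(slowTask);
    }

    /** Plays the role of onDestroy: stops accepting new work. */
    void onDestroyLike() {
        worker.shutdown();
    }

    public static void main(String[] args) throws Exception {
        OffloadingService service = new OffloadingService();
        CountDownLatch slowIo = new CountDownLatch(1);

        Future<?> pending = service.onStartCommandLike(() -> {
            try {
                slowIo.await(); // simulated time-consuming operation
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // The callback has already returned although the work is still running.
        System.out.println("work done when callback returned: " + pending.isDone());

        slowIo.countDown(); // let the simulated operation finish
        pending.get();      // wait for the worker thread
        System.out.println("work done after waiting: " + pending.isDone());
        service.onDestroyLike();
    }
}
```

Because the callback returns immediately, the thread that delivered it stays free to handle further messages while the worker thread finishes the task.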

So how does the Service ANR actually work? Next, we will analyze it from the perspective of the source code.

Android Service startup process

Let's first analyze the startup process of Service.

// Click the button - start the service
fun clickToBindService(view: View) {
    Intent(this, MyNewService::class.java).also {
        startService(it)
    }
}

The button click handler calls startService to start a Service. Tracing that call leads to ContextImpl.startService:

// ContextImpl.java
@Override
public ComponentName startService(Intent service) {
    warnIfCallingFromSystemProcess();
    return startServiceCommon(service, false, mUser);
}

Execution then continues in the ContextImpl.startServiceCommon method:

// ContextImpl.java
private ComponentName startServiceCommon(Intent service, boolean requireForeground,
        UserHandle user) {
    // ...
        ComponentName cn = ActivityManager.getService().startService(
                mMainThread.getApplicationThread(), service,
                service.resolveTypeIfNeeded(getContentResolver()), requireForeground,
                getOpPackageName(), getAttributionTag(), user.getIdentifier());
    // ...
}

As you can see in ContextImpl.startServiceCommon, the startService method of ActivityManagerService (AMS) is called. Tracing into it:

// ActivityManagerService.java
@Override
public ComponentName startService(IApplicationThread caller, Intent service,
        String resolvedType, boolean requireForeground, String callingPackage,
        String callingFeatureId, int userId)
        throws TransactionTooLargeException {
    // ...
            res = mServices.startServiceLocked(caller, service,
                    resolvedType, callingPid, callingUid,
                    requireForeground, callingPackage, callingFeatureId, userId);
    // ...
        return res;
    }
}

This calls mServices.startServiceLocked. mServices is an ActiveServices object, so execution continues in the ActiveServices.startServiceLocked method:

// ActiveServices.java
ComponentName startServiceLocked(IApplicationThread caller, Intent service, String resolvedType,
        int callingPid, int callingUid, boolean fgRequired, String callingPackage,
        @Nullable String callingFeatureId, final int userId)
        throws TransactionTooLargeException {
    return startServiceLocked(caller, service, resolvedType, callingPid, callingUid, fgRequired,
            callingPackage, callingFeatureId, userId, false, null);
}

In ActiveServices, after a series of calls, the ActiveServices.realStartServiceLocked method is reached:

private void realStartServiceLocked(ServiceRecord r, ProcessRecord app,
        IApplicationThread thread, int pid, UidRecord uidRecord, boolean execInFg,
        boolean enqueueOomAdj) throws RemoteException {
    // ...
    // Set the ANR timeout: ANR monitoring starts before the Service is actually started
    bumpServiceExecutingLocked(r, execInFg, "create", null /* oomAdjReason */);
    // ...
    // Start the Service
    thread.scheduleCreateService(r, r.serviceInfo,
                mAm.compatibilityInfoForPackage(r.serviceInfo.applicationInfo),
                app.mState.getReportedProcState());
    // ...
    // Invoke the Service's other methods, such as onStartCommand
    sendServiceArgsLocked(r, execInFg, true);
}

In the realStartServiceLocked method, ApplicationThread.scheduleCreateService is called (not traced in detail here), and execution finally reaches the scheduleCreateService method in ActivityThread.java, as follows:

// ActivityThread.java
public final void scheduleCreateService(IBinder token,
        ServiceInfo info, CompatibilityInfo compatInfo, int processState) {
    updateProcessState(processState, false);
    CreateServiceData s = new CreateServiceData();
    s.token = token;
    s.info = info;
    s.compatInfo = compatInfo;
    // Send the message that creates the Service through the handler
    sendMessage(H.CREATE_SERVICE, s);
}

In ActivityThread.scheduleCreateService, a CREATE_SERVICE message is sent through the handler to create the Service. When this message is received, the ActivityThread.handleCreateService method runs and finally calls service.onCreate(), after which the rest of the Service life cycle is executed, as shown below:

// ActivityThread.java
public void handleMessage(Message msg) {
    if (DEBUG_MESSAGES) Slog.v(TAG, ">>> handling: " + codeToString(msg.what));
    switch (msg.what) {
        // ...
        case CREATE_SERVICE:
            handleCreateService((CreateServiceData)msg.obj);
            break;
    }
    // ...
}

// Execution finally arrives here and calls service.onCreate()
private void handleCreateService(CreateServiceData data) {
    // ...
            service.onCreate();
    // ...
}
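
To make the message hand-off concrete, here is a toy single-threaded message loop in plain Java. It is a simplified sketch rather than the real ActivityThread code: MiniMainLoop, its Message class, the serviceName field and the numeric message codes are all made up for illustration. What it shows is that onCreate and onStartCommand are just queued messages, handled one after another on the same thread.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy message loop; a stand-in for the handler on the application's main thread. */
final class MiniMainLoop {
    // Hypothetical message codes, chosen only for this sketch.
    static final int CREATE_SERVICE = 1;
    static final int SERVICE_ARGS = 2;

    private static final class Message {
        final int what;
        final String serviceName;

        Message(int what, String serviceName) {
            this.what = what;
            this.serviceName = serviceName;
        }
    }

    private final Queue<Message> queue = new ArrayDeque<>();
    final List<String> log = new ArrayList<>();

    /** Enqueues a message; nothing is executed yet. */
    void sendMessage(int what, String serviceName) {
        queue.add(new Message(what, serviceName));
    }

    /** Drains the queue in order, one message at a time. */
    void loop() {
        Message msg;
        while ((msg = queue.poll()) != null) {
            handleMessage(msg);
        }
    }

    private void handleMessage(Message msg) {
        switch (msg.what) {
            case CREATE_SERVICE:
                log.add(msg.serviceName + ".onCreate");
                break;
            case SERVICE_ARGS:
                log.add(msg.serviceName + ".onStartCommand");
                break;
            default:
                log.add("unknown message " + msg.what);
                break;
        }
    }

    public static void main(String[] args) {
        MiniMainLoop mainLoop = new MiniMainLoop();
        mainLoop.sendMessage(CREATE_SERVICE, "MyNewService");
        mainLoop.sendMessage(SERVICE_ARGS, "MyNewService");
        mainLoop.loop();
        System.out.println(mainLoop.log);
    }
}
```

Since the messages are handled strictly in order, a slow handler for one message delays every message queued behind it, which is why a time-consuming life cycle function can end in an ANR.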

That is the Service startup process; overall it is fairly straightforward.

Service timeout monitoring

The Service timeout monitoring mechanism can be found inside the Service startup process.

While tracing the startup process above, we reached the ActiveServices.realStartServiceLocked method, which starts ANR monitoring by calling bumpServiceExecutingLocked. The code is as follows:

// ActiveServices.java
private boolean bumpServiceExecutingLocked(ServiceRecord r, boolean fg, String why,
        @Nullable String oomAdjReason) {
    // ...
    scheduleServiceTimeoutLocked(r.app);
    // ...
}

The bumpServiceExecutingLocked method in turn calls scheduleServiceTimeoutLocked:

void scheduleServiceTimeoutLocked(ProcessRecord proc) {
    if (proc.mServices.numberOfExecutingServices() == 0 || proc.getThread() == null) {
        return;
    }
    Message msg = mAm.mHandler.obtainMessage(
            ActivityManagerService.SERVICE_TIMEOUT_MSG);
    msg.obj = proc;
    // Send the SERVICE_TIMEOUT_MSG message after the specified delay
    // Service executing in a foreground process: SERVICE_TIMEOUT = 20s
    // Service executing in a background process: SERVICE_BACKGROUND_TIMEOUT = 200s
    mAm.mHandler.sendMessageDelayed(msg, proc.mServices.shouldExecServicesFg()
            ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
}
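
The arm-then-remove pattern behind this delayed message can be tried out with a toy handler that runs on a virtual clock. This is a sketch under stated assumptions, not framework code: TimeoutWatchdog, advanceTo and the anrReports list are hypothetical, the message code is an arbitrary value, and processes are represented by plain strings. Only the idea is taken from the code above: a message is sent with a delay, and it only fires if nobody removes it first.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy handler driven by a virtual clock instead of real time. */
final class TimeoutWatchdog {
    static final int SERVICE_TIMEOUT_MSG = 1;   // arbitrary value for this sketch
    static final long SERVICE_TIMEOUT = 20_000; // foreground limit from the text, in ms

    private static final class Pending {
        final int what;
        final Object obj;
        final long when;

        Pending(int what, Object obj, long when) {
            this.what = what;
            this.obj = obj;
            this.when = when;
        }
    }

    private final List<Pending> pending = new ArrayList<>();
    final List<String> anrReports = new ArrayList<>();
    private long now = 0; // virtual time in ms

    /** Arms the timeout: the message becomes due delayMs after the current virtual time. */
    void sendMessageDelayed(int what, Object obj, long delayMs) {
        pending.add(new Pending(what, obj, now + delayMs));
    }

    /** Disarms the timeout, as when the Service finishes in time. */
    void removeMessages(int what, Object obj) {
        pending.removeIf(p -> p.what == what && p.obj.equals(obj));
    }

    /** Moves the virtual clock forward and delivers every message that is due. */
    void advanceTo(long timeMs) {
        now = timeMs;
        Iterator<Pending> it = pending.iterator();
        while (it.hasNext()) {
            Pending p = it.next();
            if (p.when <= now) {
                it.remove();
                anrReports.add("service timeout in " + p.obj);
            }
        }
    }

    public static void main(String[] args) {
        TimeoutWatchdog handler = new TimeoutWatchdog();

        // procA finishes after 5 s, so its message is removed before it is due.
        handler.sendMessageDelayed(SERVICE_TIMEOUT_MSG, "procA", SERVICE_TIMEOUT);
        handler.advanceTo(5_000);
        handler.removeMessages(SERVICE_TIMEOUT_MSG, "procA");

        // procB never finishes, so its message is still queued when it becomes due.
        handler.sendMessageDelayed(SERVICE_TIMEOUT_MSG, "procB", SERVICE_TIMEOUT);
        handler.advanceTo(30_000);

        System.out.println(handler.anrReports); // prints [service timeout in procB]
    }
}
```

A virtual clock keeps the example deterministic; a real Handler delivers the message on its looper thread once the delay has elapsed.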

If the SERVICE_TIMEOUT_MSG message is not removed within the specified time, it is handled in ActivityManagerService:

// ActivityManagerService.java
final class MainHandler extends Handler {
    // ...
    @Override
    public void handleMessage(Message msg) {
        switch (msg.what) {
        // ...
        case SERVICE_TIMEOUT_MSG: {
            mServices.serviceTimeout((ProcessRecord) msg.obj);
        } break;
        case SERVICE_FOREGROUND_TIMEOUT_MSG: {
            mServices.serviceForegroundTimeout((ServiceRecord) msg.obj);
        } break;
        // ...
        }
    }
}

Here the mServices.serviceTimeout() method is called. mServices refers to ActiveServices, so we step into the ActiveServices.serviceTimeout method:

// ActiveServices.java
void serviceTimeout(ProcessRecord proc) {
    String anrMessage = null;
    synchronized(mAm) {
        if (proc.isDebugging()) {
            // The application is being debugged; ignore the timeout
            return;
        }
        final ProcessServiceRecord psr = proc.mServices;
        if (psr.numberOfExecutingServices() == 0 || proc.getThread() == null) {
            return;
        }
        final long now = SystemClock.uptimeMillis();
        final long maxTime =  now -
                (psr.shouldExecServicesFg() ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
        ServiceRecord timeout = null;
        long nextTime = 0;
        // Walk through all executing services, looking for one that has timed out
        for (int i = psr.numberOfExecutingServices() - 1; i >= 0; i--) {
            ServiceRecord sr = psr.getExecutingServiceAt(i);
            if (sr.executingStart < maxTime) {
                timeout = sr;
                break;
            }
            if (sr.executingStart > nextTime) {
                nextTime = sr.executingStart;
            }
        }
        // Check whether the process whose Service timed out is in the LRU process list;
        // if it is not, this ANR is ignored
        if (timeout != null && mAm.mProcessList.isInLruListLOSP(proc)) {
            Slog.w(TAG, "Timeout executing service: " + timeout);
            StringWriter sw = new StringWriter();
            PrintWriter pw = new FastPrintWriter(sw, false, 1024);
            pw.println(timeout);
            timeout.dump(pw, "    ");
            pw.close();
            mLastAnrDump = sw.toString();
            mAm.mHandler.removeCallbacks(mLastAnrDumpClearer);
            mAm.mHandler.postDelayed(mLastAnrDumpClearer, LAST_ANR_LIFETIME_DURATION_MSECS);
            anrMessage = "executing service " + timeout.shortInstanceName;
        } else {
            Message msg = mAm.mHandler.obtainMessage(
                    ActivityManagerService.SERVICE_TIMEOUT_MSG);
            msg.obj = proc;
            mAm.mHandler.sendMessageAtTime(msg, psr.shouldExecServicesFg()
                    ? (nextTime+SERVICE_TIMEOUT) : (nextTime + SERVICE_BACKGROUND_TIMEOUT));
        }
    }
    // If a Service has timed out, call appNotResponding to report the ANR
    if (anrMessage != null) {
        mAm.mAnrHelper.appNotResponding(proc, anrMessage);
    }
}
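
The scan at the heart of serviceTimeout can be pulled out into a small pure function to make the arithmetic easier to follow. This is a simplified sketch: TimeoutScan, Result and the long[] of start times are hypothetical stand-ins for the ServiceRecord list, while the loop itself follows the code above (walk backwards, stop at the first entry that started before maxTime, otherwise remember the latest start time).

```java
final class TimeoutScan {
    /** Outcome of one scan over the executing services. */
    static final class Result {
        final int timedOutIndex; // index of the timed-out entry, or -1 if none
        final long nextTime;     // latest start time seen before the scan stopped

        Result(int timedOutIndex, long nextTime) {
            this.timedOutIndex = timedOutIndex;
            this.nextTime = nextTime;
        }
    }

    /** executingStart holds each service's start time; now and timeout use the same unit (ms). */
    static Result scan(long[] executingStart, long now, long timeout) {
        final long maxTime = now - timeout; // anything started before this has run too long
        long nextTime = 0;
        for (int i = executingStart.length - 1; i >= 0; i--) {
            if (executingStart[i] < maxTime) {
                return new Result(i, nextTime);
            }
            if (executingStart[i] > nextTime) {
                nextTime = executingStart[i];
            }
        }
        return new Result(-1, nextTime);
    }

    public static void main(String[] args) {
        long timeout = 20_000;

        // now = 25 s, so maxTime = 5 s: the service started at 1 s has timed out.
        Result hit = scan(new long[] {1_000, 15_000}, 25_000, timeout);
        System.out.println("timed-out index: " + hit.timedOutIndex);

        // Nothing started before 5 s: the check is repeated at nextTime + timeout.
        Result miss = scan(new long[] {10_000, 15_000}, 25_000, timeout);
        System.out.println("re-check at: " + (miss.nextTime + timeout) + " ms");
    }
}
```

In the second case no entry is older than maxTime, so the timeout message is simply posted again for nextTime plus the timeout, which is what the else branch above does with sendMessageAtTime.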

As the end of this method shows, if an ANR has occurred, the AnrHelper.appNotResponding method is called to report it.

That is all for this article. The next article will continue to explore, from the perspective of the source code, how ANRs are caused by input event processing timeouts.

Origin blog.csdn.net/yang553566463/article/details/125300720