Can an app never crash? A showdown between Xiaoguang and me

Author: zz building blocks

Preface

When it comes to intercepting exceptions, everyone knows that Thread.setDefaultUncaughtExceptionHandler can intercept the uncaught exceptions that occur in an app so that they can be handled.

So I had a half-baked idea...


Let my app never crash

Since we can intercept crashes, why not intercept every exception in the app and simply not kill the process? Wouldn't a crash-free app give users a great experience?

  • Someone shook his head in disagreement.

"Buddy, a crash is there for you to fix, not to cover up!"

  • I fanned myself a few times. I was a little nervous, but I pretended to be calm and said:

"Brother, you can upload the exception to your own server for analysis. You still get the cause of the crash, and the user's app does not crash because of it. Isn't that good?"

  • Xiaoguang said angrily:

"There is definitely a problem with this. It doesn't sound like it will fly. Hmph, I'll try it myself."


Xiaoguang's experiment

So Xiaoguang (the small-time blogger Jimu) followed an article he found on the Internet and wrote the following code to catch exceptions:

// Define CrashHandler
class CrashHandler private constructor(): Thread.UncaughtExceptionHandler {
    private var context: Context? = null
    fun init(context: Context?) {
        this.context = context
        Thread.setDefaultUncaughtExceptionHandler(this)
    }

    override fun uncaughtException(t: Thread, e: Throwable) {}

    companion object {
        val instance: CrashHandler by lazy(mode = LazyThreadSafetyMode.SYNCHRONIZED) {
            CrashHandler() }
    }
}

// Initialize it in the Application
class MyApplication : Application(){
    override fun onCreate() {
        super.onCreate()
        CrashHandler.instance.init(this)
    }
}

// Trigger exceptions in an Activity
class ExceptionActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_exception)

        // Button 1: throw on the main thread
        btn.setOnClickListener {
            throw RuntimeException("main thread exception")
        }
        // Button 2: throw on a child thread
        btn2.setOnClickListener {
            thread {
                throw RuntimeException("child thread exception")
            }
        }
    }
}

Xiaoguang wrote the whole thing in one go. To test his conjecture, he set up two cases that trigger exceptions: a crash on a child thread and a crash on the main thread.

  • Run the app and tap button 2 to trigger the child-thread exception:

"Hey, it really has no effect. The app keeps running normally."

  • Then tap button 1 to trigger the main-thread exception:

"Hey, it's frozen. A few more taps and it goes straight to ANR."

"Sure enough, there is a problem. But why only on the main thread? I have to figure this out before I confront my buddy."


Xiaoguang's thinking (analysis of the exception source code)

First, a quick primer on exceptions in Java, which are divided into runtime exceptions and non-runtime exceptions:

  • Runtime exceptions. RuntimeException and its subclasses are unchecked exceptions, typically system errors or program logic errors, such as the NullPointerException and IndexOutOfBoundsException we often run into. When such an exception goes uncaught, the Java runtime terminates the thread and prints the exception, and on Android the default handler then kills the process, which is what we usually call a crash.

  • Non-runtime exceptions. These are the checked exceptions: Exception and its subclasses other than RuntimeException. They must be handled in the program, otherwise it will not compile. Examples are NoSuchFieldException and IllegalAccessException.

OK, so when an uncaught RuntimeException is thrown, the thread that threw it stops. If that thread is the main thread, the main thread stops, so the app freezes, cannot respond, and after a while hits an ANR. A crash on a child thread does not affect the main thread, that is, the UI thread, so the user can keep using the app normally.
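To make the "only the throwing thread stops" point concrete, here is a small plain-JVM sketch. It is not Android code: the class name UncaughtDemo, the method name, and the thread name are made up for illustration, and it uses a per-thread handler only so that the demo does not touch global state.

```java
import java.util.concurrent.atomic.AtomicReference;

// Plain-JVM sketch (no Android classes). All names here are hypothetical.
class UncaughtDemo {
    // Starts a worker that throws, waits for it to die, and returns what the
    // handler observed.
    static String runCrashingWorker() {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            throw new RuntimeException("worker thread exception");
        }, "worker-1");
        // A per-thread handler keeps the demo from touching global state.
        worker.setUncaughtExceptionHandler(
                (t, e) -> seen.set(t.getName() + ": " + e.getMessage()));
        worker.start();
        try {
            worker.join(); // returns once the worker thread has terminated
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(runCrashingWorker());
        // Reaching this line shows the calling thread survived the worker's death.
        System.out.println("main thread is still running");
    }
}
```

The worker is gone after the exception, but the thread that started it carries on. On Android the same thing happens to a child thread once the process-killing default handler is out of the way.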

This seems to make sense.

Wait, why does the app stop crashing once setDefaultUncaughtExceptionHandler is set?

We have to start from the source code of exception handling:

Normally, all threads used in an application belong to the same thread group. As long as any thread in the group has an uncaught exception, the Java virtual machine calls the uncaughtException() method of that thread's thread group.

// ThreadGroup.java
  private final ThreadGroup parent;

    public void uncaughtException(Thread t, Throwable e) {
        if (parent != null) {
            parent.uncaughtException(t, e);
        } else {
            Thread.UncaughtExceptionHandler ueh =
                Thread.getDefaultUncaughtExceptionHandler();
            if (ueh != null) {
                ueh.uncaughtException(t, e);
            } else if (!(e instanceof ThreadDeath)) {
                System.err.print("Exception in thread \""
                                 + t.getName() + "\" ");
                e.printStackTrace(System.err);
            }
        }
    }

parent is the parent thread group of the current group, so whichever group receives the call, it ends up in this method on the root group. Looking at the rest of the code, it obtains the default exception handler through getDefaultUncaughtExceptionHandler and then calls its uncaughtException method. So let's go find the UncaughtExceptionHandler that the system installs by default.
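The dispatch path described above can be observed on a plain JVM. The following is only a sketch under that assumption (DispatchDemo and its method are hypothetical names): a thread with no handler of its own reports its ThreadGroup as the handler, and the exception ends up in the default handler.

```java
// Plain-JVM sketch; DispatchDemo and dispatch() are hypothetical names.
class DispatchDemo {
    static String dispatch() {
        StringBuffer log = new StringBuffer();
        Thread.UncaughtExceptionHandler previous =
                Thread.getDefaultUncaughtExceptionHandler();
        Thread.setDefaultUncaughtExceptionHandler(
                (t, e) -> log.append("default handler got: ").append(e.getMessage()));
        try {
            Thread worker = new Thread(() -> {
                throw new IllegalStateException("boom");
            });
            // With no per-thread handler set, the thread's ThreadGroup acts
            // as the handler.
            boolean viaGroup =
                    worker.getUncaughtExceptionHandler() instanceof ThreadGroup;
            worker.start();
            worker.join();
            return "handled by group: " + viaGroup + ", " + log;
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        } finally {
            // Restore whatever handler was installed before the demo.
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch());
    }
}
```

The group forwards to the default handler exactly as in the ThreadGroup.uncaughtException source quoted above, which is why replacing the default handler is enough to change what happens on every thread.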

This takes us back to how an app process starts. As I have said before, all Android processes are forked from the zygote process, and when a new process starts, the zygoteInit method is called to do some initialization for the application:

    public static final Runnable zygoteInit(int targetSdkVersion, String[] argv, ClassLoader classLoader) {
        if (RuntimeInit.DEBUG) {
            Slog.d(RuntimeInit.TAG, "RuntimeInit: Starting application from zygote");
        }

        Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "ZygoteInit");
        // Redirect log streams
        RuntimeInit.redirectLogStreams();
        // Common configuration initialization
        RuntimeInit.commonInit();
        // Zygote initialization
        ZygoteInit.nativeZygoteInit();
        // Application-specific initialization
        return RuntimeInit.applicationInit(targetSdkVersion, argv, classLoader);
    }

The exception handler is set up in this common configuration initialization method:

    protected static final void commonInit() {
        if (DEBUG) Slog.d(TAG, "Entered RuntimeInit!");

        // Set the exception handlers
        LoggingHandler loggingHandler = new LoggingHandler();
        Thread.setUncaughtExceptionPreHandler(loggingHandler);
        Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler(loggingHandler));

        // Set the time zone
        TimezoneGetter.setInstance(new TimezoneGetter() {
            @Override
            public String getId() {
                return SystemProperties.get("persist.sys.timezone");
            }
        });
        TimeZone.setDefault(null);

        // Configure logging
        LogManager.getLogManager().reset();
        //***    

        initialized = true;
    }

Found it: the application's default exception handler is KillApplicationHandler.

private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
        private final LoggingHandler mLoggingHandler;

        public KillApplicationHandler(LoggingHandler loggingHandler) {
            this.mLoggingHandler = Objects.requireNonNull(loggingHandler);
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            try {
                ensureLogging(t, e);
                //...    
                // Bring up crash dialog, wait for it to be dismissed
                ActivityManager.getService().handleApplicationCrash(
                        mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
            } catch (Throwable t2) {
                if (t2 instanceof DeadObjectException) {
                    // System process is dead; ignore
                } else {
                    try {
                        Clog_e(TAG, "Error reporting crash", t2);
                    } catch (Throwable t3) {
                        // Even Clog_e() fails!  Oh well.
                    }
                }
            } finally {
                // Try everything to make sure this process goes away.
                Process.killProcess(Process.myPid());
                System.exit(10);
            }
        }

        private void ensureLogging(Thread t, Throwable e) {
            if (!mLoggingHandler.mTriggered) {
                try {
                    mLoggingHandler.uncaughtException(t, e);
                } catch (Throwable loggingThrowable) {
                    // Ignored.
                }
            }
        }
}

Seeing this, Xiaoguang smiled with relief: caught you. In the uncaughtException callback, handleApplicationCrash is called to handle the exception, and then the finally block destroys the process ("Try everything to make sure this process goes away"). That is why the program crashes.

The crash dialog we usually see on the phone is shown from handleApplicationCrash. Not only Java crashes but also the native crashes and ANRs we usually run into eventually reach handleApplicationCrash for crash handling.

Some of you may also have noticed that the constructor takes a LoggingHandler, and that the uncaughtException callback also calls this LoggingHandler's uncaughtException method. Is this LoggingHandler what produces the crash log we see when a crash happens? Let's go in and take a look:

private static class LoggingHandler implements Thread.UncaughtExceptionHandler {
        public volatile boolean mTriggered = false;

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            mTriggered = true;
            if (mCrashing) return;

            if (mApplicationObject == null && (Process.SYSTEM_UID == Process.myUid())) {
                Clog_e(TAG, "*** FATAL EXCEPTION IN SYSTEM PROCESS: " + t.getName(), e);
            } else {
                StringBuilder message = new StringBuilder();
                message.append("FATAL EXCEPTION: ").append(t.getName()).append("\n");
                final String processName = ActivityThread.currentProcessName();
                if (processName != null) {
                    message.append("Process: ").append(processName).append(", ");
                }
                message.append("PID: ").append(Process.myPid());
                Clog_e(TAG, message.toString(), e);
            }
        }
    }

    private static int Clog_e(String tag, String msg, Throwable tr) {
        return Log.printlns(Log.LOG_ID_CRASH, Log.ERROR, tag, msg, tr);
    }

Isn't that exactly it? Information about the crash, such as the thread, process, process ID, and cause, is printed through Log. Here is a picture of a crash log:

OK, back on track. By calling setDefaultUncaughtExceptionHandler we installed our own crash handler, which replaced the one the application had set before. Since our handler does nothing, the program naturally does not crash. Here is a summary picture.
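The "replaced the previous handler" part is easy to check on a plain JVM. In the sketch below (hypothetical names, no Android classes), a lambda stands in for the platform's KillApplicationHandler. The last step shows a delegating variant that is not what the CrashHandler above does: it is included only to show that the old handler runs again if, and only if, the new one calls it explicitly.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Plain-JVM sketch; ReplaceHandlerDemo and its methods are hypothetical names.
class ReplaceHandlerDemo {
    // Throws on a fresh thread and waits for that thread to die.
    private static void crashOnWorker() {
        Thread worker = new Thread(() -> {
            throw new RuntimeException("child thread exception");
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
    }

    static List<String> demo() {
        List<String> calls = new CopyOnWriteArrayList<>();
        Thread.UncaughtExceptionHandler original =
                Thread.getDefaultUncaughtExceptionHandler();
        try {
            // Stand-in for the platform handler (KillApplicationHandler on Android).
            Thread.UncaughtExceptionHandler system =
                    (t, e) -> calls.add("system handler");
            Thread.setDefaultUncaughtExceptionHandler(system);

            // Installing our own handler replaces it. Nothing delegates, so
            // "system handler" is not recorded for this crash.
            Thread.setDefaultUncaughtExceptionHandler(
                    (t, e) -> calls.add("custom handler"));
            crashOnWorker();

            // Delegating variant: do our own work, then call the saved handler.
            Thread.setDefaultUncaughtExceptionHandler((t, e) -> {
                calls.add("report");
                system.uncaughtException(t, e);
            });
            crashOnWorker();
            return calls;
        } finally {
            // Restore whatever handler was installed before the demo.
            Thread.setDefaultUncaughtExceptionHandler(original);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

There is only one default handler slot, so whoever sets it last decides whether the process-killing logic ever runs.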


Xiaoguang came to confront me again

  • Having figured all this out, Xiaoguang came to me again:

"Buddy, take a look. This is the demo I wrote and the notes I put together. Your approach simply doesn't work: if the main thread crashes, it's game over. I told you there was a problem."

  • I kept pretending to be calm:

"Brother, I forgot to mention it last time. Adding the UncaughtExceptionHandler alone is not enough. You need one more piece of code; I'll send it to you. Go back and try it."

    Handler(Looper.getMainLooper()).post {
        while (true) {
            try {
                Looper.loop()
            } catch (e: Throwable) {
            }
        }
    }

"This... can this really work?"


Xiaoguang's experiment again

Xiaoguang added the code above to the program (in Application.onCreate) and ran it again:

Whoa, it really works. After triggering the main-thread crash, the app can still be used normally. What is the principle behind this?


Xiaoguang's thinking again (the idea of intercepting main-thread crashes)

We all know that the main thread maintains a Handler mechanism: a Looper is created and initialized when the application starts, and its loop method is called to start processing the message loop. While the app is running, everything the main thread does, such as click events and list scrolling, is processed in this loop. In essence, messages are added to the MessageQueue, and the loop takes them out of the queue one by one and processes them. When there is no message to process, it relies on the epoll mechanism to suspend and wait to be woken up. Here is my condensed version of loop:

    public static void loop() {
        final Looper me = myLooper();
        final MessageQueue queue = me.mQueue;
        for (;;) {
            Message msg = queue.next(); 
            msg.target.dispatchMessage(msg);
        }
    }

An endless loop that keeps fetching messages and processing them. Now look back at the code we just added:

    Handler(Looper.getMainLooper()).post {
        while (true) {
            // Intercept main-thread exceptions
            try {
                Looper.loop()
            } catch (e: Throwable) {
            }
        }
    }

We use a Handler to post a Runnable to the main thread. Inside that Runnable is an infinite loop, and inside the infinite loop we call Looper.loop() to read messages. From then on, all subsequent main-thread messages are processed by our loop call, so if any of them throws, the exception is caught here. And because we wrote a while loop, a new Looper.loop() call starts after each exception is caught. This way the main thread's Looper can always keep reading messages, and the main thread can always keep running.
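The same trick can be tried outside Android with a toy message loop. This is a simplified model, not the real Looper: MiniLooper, its queue, and its method names are invented for the sketch, and unlike the real loop() it returns when the queue is empty instead of blocking.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy stand-in for Looper/MessageQueue; all names are hypothetical.
class MiniLooper {
    private final Queue<Runnable> queue = new ArrayDeque<>();
    final List<String> log = new ArrayList<>();

    void post(Runnable message) {
        queue.add(message);
    }

    // Simplified loop(): takes messages until the queue is empty. An
    // exception thrown by a message unwinds out of this method.
    void loop() {
        Runnable message;
        while ((message = queue.poll()) != null) {
            message.run();
        }
    }

    // The guard from the article: re-enter loop() after every failure.
    void guardedLoop() {
        while (!queue.isEmpty()) {
            try {
                loop();
            } catch (Throwable e) {
                log.add("caught: " + e.getMessage());
            }
        }
    }

    static List<String> demo() {
        MiniLooper looper = new MiniLooper();
        // Like Handler(Looper.getMainLooper()).post { ... }: the guard is
        // itself just a message on the queue.
        looper.post(looper::guardedLoop);
        looper.post(() -> looper.log.add("message 1"));
        looper.post(() -> {
            throw new RuntimeException("main thread exception");
        });
        looper.post(() -> looper.log.add("message 3"));
        looper.loop(); // the outer loop that the framework would have started
        return looper.log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The outer loop only ever runs one message, the guard. Everything after that is dispatched by the inner loop, so the throwing message is caught and message 3 is still processed.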

When words are not clear enough, a picture helps:

Meanwhile, the earlier CrashHandler logic keeps child-thread crashes from affecting the app, so with both pieces of code added, everything stays alive.

But Xiaoguang was still not convinced. He thought of another kind of crash...


Xiaoguang experimented again

class Test2Activity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_exception)

        throw RuntimeException("main thread exception")
    }
}

Hey, I'll throw an exception for you right inside onCreate. Run it and see:

It's black. Yes, a black screen.


The final dialogue (Cockroach library ideas)

  • Seeing this, I took the initiative to find Xiaoguang:

"This case really is more troublesome. If an exception is thrown directly in an Activity lifecycle method, the UI is never drawn and the Activity cannot start properly, so the screen goes white or black. This seriously hurts the user experience, so I recommend killing the app outright, because other modules are likely to be affected too. Or, if the Activity is not very important, you can just finish that Activity."

  • Xiaoguang asked thoughtfully:

"So how do you tell that a crash happened during the lifecycle?"

"Through reflection. Borrowing the idea from the open-source Cockroach library: since Activity lifecycle callbacks are all driven by messages handled by the main thread's Handler, we can use reflection to replace the Callback in that Handler, that is, ActivityThread.mH.mCallback. Then we wrap the message for each lifecycle event in a try-catch to catch the exception, and after that we can either finish the Activity or kill the process."

The main code:

        Field mhField = activityThreadClass.getDeclaredField("mH");
        mhField.setAccessible(true);
        final Handler mhHandler = (Handler) mhField.get(activityThread);
        Field callbackField = Handler.class.getDeclaredField("mCallback");
        callbackField.setAccessible(true);
        callbackField.set(mhHandler, new Handler.Callback() {
            @Override
            public boolean handleMessage(Message msg) {
                if (Build.VERSION.SDK_INT >= 28) {
                    // Lifecycle handling on API 28 and later
                    final int EXECUTE_TRANSACTION = 159;
                    if (msg.what == EXECUTE_TRANSACTION) {
                        try {
                            mhHandler.handleMessage(msg);
                        } catch (Throwable throwable) {
                            // Kill the process or finish the Activity
                        }
                        return true;
                    }
                    return false;
                }

                // Lifecycle handling before API 28
                switch (msg.what) {
                    case RESUME_ACTIVITY:
                        // onRestart, onStart, and onResume are dispatched here
                        try {
                            mhHandler.handleMessage(msg);
                        } catch (Throwable throwable) {
                            sActivityKiller.finishResumeActivity(msg);
                            notifyException(throwable);
                        }
                        return true;

Only part of the code is posted, but the principle should be clear: by replacing the Callback of the main thread's Handler, we catch exceptions thrown during the lifecycle.
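If the reflection part feels abstract, here is a plain-JVM sketch of the same move. FakeHandler is a made-up stand-in for android.os.Handler that only models how dispatchMessage consults mCallback before handleMessage. The real ActivityThread.mH is a hidden field, and access to it may be restricted on newer Android versions, so treat this as a model of the idea rather than code for a real app.

```java
import java.lang.reflect.Field;

// Plain-JVM model of swapping Handler.mCallback; all names are hypothetical.
class CallbackSwapDemo {
    interface Callback {
        boolean handleMessage(int what);
    }

    // Stand-in for android.os.Handler: the callback gets the first look at a
    // message, and handleMessage runs only if the callback returns false.
    static class FakeHandler {
        private Callback mCallback; // private, so we need reflection to set it

        void handleMessage(int what) {
            if (what == 159) { // 159 plays the role of EXECUTE_TRANSACTION
                throw new RuntimeException("lifecycle exception");
            }
        }

        void dispatchMessage(int what) {
            if (mCallback != null && mCallback.handleMessage(what)) {
                return;
            }
            handleMessage(what);
        }
    }

    static String demo() {
        FakeHandler handler = new FakeHandler();
        StringBuilder log = new StringBuilder();
        // The guard calls the original handleMessage inside try-catch and
        // returns true so the handler does not process the message twice.
        Callback guard = what -> {
            try {
                handler.handleMessage(what);
            } catch (Throwable t) {
                log.append("caught: ").append(t.getMessage());
            }
            return true;
        };
        try {
            Field callbackField = FakeHandler.class.getDeclaredField("mCallback");
            callbackField.setAccessible(true);
            callbackField.set(handler, guard);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        handler.dispatchMessage(159);
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Returning true from the callback is what keeps the handler from running handleMessage a second time outside the try-catch, which matches the return true in the snippet above.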

The next step is what to do after catching the exception: either kill the process or finish the Activity.

  • Kill the process; this should be familiar to everyone:
  Process.killProcess(Process.myPid())
  exitProcess(10)
  • Finish the Activity

Here we need to look at the finish flow of an Activity. To keep it short, take the Android 29 source code as an example.

    private void finish(int finishTask) {
        if (mParent == null) {

            if (false) Log.v(TAG, "Finishing self: token=" + mToken);
            try {
                if (resultData != null) {
                    resultData.prepareToLeaveProcess(this);
                }
                if (ActivityTaskManager.getService()
                        .finishActivity(mToken, resultCode, resultData, finishTask)) {
                    mFinished = true;
                }
            } 
        } 

    }

    @Override
    public final boolean finishActivity(IBinder token, int resultCode, Intent resultData,
            int finishTask) {
        return mActivityTaskManager.finishActivity(token, resultCode, resultData, finishTask);
    }    

From the Activity finish source code we can see that the call eventually reaches ActivityTaskManagerService's finishActivity method. It takes four parameters, and the most important one, the one that identifies an Activity, is the token. So let's go find the token in the source code.

Since we catch the exception in the handleMessage callback, the only parameter available is the Message, so we start from there. Go back to the source code that handles the message we just intercepted and look for clues:

 class H extends Handler {
        public void handleMessage(Message msg) {
            switch (msg.what) {
                case EXECUTE_TRANSACTION: 
                    final ClientTransaction transaction = (ClientTransaction) msg.obj;
                    mTransactionExecutor.execute(transaction);
                    break;              
            }        
        }
    }

    public void execute(ClientTransaction transaction) {
        final IBinder token = transaction.getActivityToken();
        executeCallbacks(transaction);
        executeLifecycleState(transaction);
        mPendingActions.clear();
        log("End resolving transaction");
    }    

You can see how the Handler processes the EXECUTE_TRANSACTION message in the source code: it takes msg.obj, which is a ClientTransaction instance, and calls the execute method with it. And in the execute method... hey, isn't that the token?

(I found it rather quickly. The source code for starting and destroying an Activity is not today's focus, so I am skipping over it.)

Now that we have the token, we can finish the Activity through reflection:

    private void finishMyCatchActivity(Message message) throws Throwable {
        ClientTransaction clientTransaction = (ClientTransaction) message.obj;
        IBinder binder = clientTransaction.getActivityToken();

        Method getServiceMethod = ActivityManager.class.getDeclaredMethod("getService");
        Object activityManager = getServiceMethod.invoke(null);

        Method finishActivityMethod = activityManager.getClass().getDeclaredMethod("finishActivity", IBinder.class, int.class, Intent.class, int.class);
        finishActivityMethod.setAccessible(true);
        finishActivityMethod.invoke(activityManager, binder, Activity.RESULT_CANCELED, null, 0);
    }

Phew, finally done. But Xiaoguang still looked at me, puzzled:

"I'd better go read the source code of the Cockroach library myself."

"Oh, come on..."


Summary

Today was mainly about one thing: how to catch exceptions in an app and keep it from crashing, so that users get the best possible experience. The main methods are:

  • Post a message to the main thread that wraps Looper.loop in a try-catch. This catches main-thread exceptions, and calling Looper.loop again after each one lets the main thread keep processing messages.
  • Exceptions on child threads can be intercepted with Thread.setDefaultUncaughtExceptionHandler, and a child thread stopping is not noticeable to the user.
  • Exceptions thrown during the lifecycle can be caught by replacing ActivityThread.mH.mCallback, after which the Activity is finished through its token or the process is killed. This method has to be adapted to the source code of each SDK version, so use it with caution. See the source code of the Cockroach library for details.

Some of you may ask: why keep the program from crashing at all? In what situations would we need to do this?

In fact, there are many cases where an exception is unpredictable, or where it is almost imperceptible to the user, such as:

  • Some bugs in the system
  • Some bugs in third-party libraries
  • Some bugs brought by mobile phones of different manufacturers

In cases like these, we can use this technique to sacrifice that part of the app's functionality in order to keep the app as a whole stable.



Origin blog.csdn.net/ajsliu1233/article/details/111009178