Android native crash handling: case studies

1. Background

mPaaS[1] for Android currently uses the Crash SDK to handle crashes. The Crash SDK is a powerful crash log collection SDK for the Android platform: it has a very high crash capture rate and produces complete, detailed crash logs, which are very helpful for tracking down and fixing problems. In day-to-day operations and maintenance we often encounter crashes whose cause cannot be read directly from the crash stack, especially native (non-Java) crashes. This article shows how to use Crash SDK logs to analyze such crashes under the mPaaS framework.

2. Crash log analysis tool

For mPaaS users, the crash information exported from the crash analysis platform on MAS is the raw log, which is hard to read directly. Users can install the Chrome plug-in LogAnalyzer, which turns the log text generated by the Crash SDK into a clearly formatted HTML page. Its features include:

1) Highlighting key information in the log, using different colors to distinguish fields;
2) An outline of the overall log structure, for jumping quickly to key content;
3) Hints for common crash causes.

After installing the Chrome plug-in, the following configuration is still required:

1. Change the crash file extension to .txt

Files downloaded from MAS have the extension .dat by default. Change it to .txt, otherwise LogAnalyzer will not recognize the file.
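When many logs have been downloaded, the rename can be scripted. The following is a minimal sketch, assuming the .dat files sit together in one local directory; the function name `rename_dat_to_txt` is hypothetical and not part of MAS or LogAnalyzer.

```python
from pathlib import Path


def rename_dat_to_txt(directory):
    """Rename every *.dat file in `directory` to *.txt and return the new paths."""
    renamed = []
    for dat_file in sorted(Path(directory).glob("*.dat")):
        txt_file = dat_file.with_suffix(".txt")
        if txt_file.exists():
            # Never overwrite an existing file; leave the .dat file untouched.
            continue
        dat_file.rename(txt_file)
        renamed.append(txt_file)
    return renamed
```

Calling `rename_dat_to_txt("crash_logs")` on a hypothetical download directory returns the list of renamed files, which can then be dragged into Chrome.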

2. Modify the plug-in configuration

Because of Chrome's default permission restrictions, no Chrome plug-in can access file URLs unless this is explicitly allowed. Perform the following steps in Chrome:

1) Open the Chrome plug-in management page chrome://extensions/
2) Find the LogAnalyzer plug-in and click "Details" to open its settings;
3) Enable the "Allow access to file URLs" option;
4) Open or refresh the log page, and LogAnalyzer will take effect.

3. Result

After you drag the log file directly into Chrome, the plug-in on the right of the page takes effect and the fields of the crash log are displayed in different colors.

The first time a log is opened, LogAnalyzer shows its usage instructions. After that, the crash log is displayed in the normal view.

3. Crash analysis example

In day-to-day operations and maintenance we often see crashes in native (non-Java) modules such as UC. In many of these cases UC is only where the crash finally surfaces, not the root cause. The following cases, taken from the UC kernel crashes we see most frequently, show how they were handled.

1. A Java null pointer causes UC to crash

The crash point section of the log (customer APK details have been hidden) gives no clue by itself, so we continue through the log.

In the logcat section, first locate the keyword "begin to generate native report", which marks the point where the crash log started to be generated. The lines just before it in the logcat section contain an exception stack trace. The stack shows that the precreate operation triggered a null pointer in the underlying layer, which caused an initialization exception and then the crash. The workaround is to temporarily turn off pre-creation so that the crash is avoided.

This case shows that:

1) The cause of a native crash is not necessarily in the native module; it may be a Java exception.
2) Searching for "begin to generate native report" leads to the logcat output around the crash, which helps reconstruct the context in which the crash happened.
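The search for the marker can also be done outside the browser. The sketch below assumes the log has been read into a string and that the marker text appears verbatim on one line, as in the logs described here; the function name and the `before` parameter are hypothetical.

```python
MARKER = "begin to generate native report"


def context_before_marker(log_text, before=50):
    """Return up to `before` lines preceding the first marker line, plus the marker line.

    Returns an empty list when the marker does not occur in the log.
    """
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        if MARKER in line:
            start = max(0, index - before)
            return lines[start:index + 1]
    return []
```

The returned lines are the logcat output leading up to the crash report, which is where the Java exception stack in this case was found.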

2. An upper-layer OOM causes UC to crash

First, check the crash point in the reported log. The crash is in RenderThread, which gives no clue by itself.

Next, find "begin to generate native report" in the logcat section. Around it there are a large number of OOM error logs from the underlying layer, so the crash is almost certainly caused by OOM. The remaining question is what triggered the OOM.

The memory section of the crash log makes the cause clearer: the VmSize of the process has almost reached its maximum. For a 32-bit process, the maximum VmSize available to an app is 3 GB; when the 32-bit process runs on a 64-bit CPU, VmSize can exceed 3 GB and approach 4 GB. Because the kernel needs part of the address space and ROM versions differ, the practical limit varies, but the following pattern holds: on Android 8.1.0 and later, most native OOM crashes occur with VmSize between 3.5 GB and 3.9 GB, a fairly narrow range. The next step is to track down and fix the source of the OOM.
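This check can be expressed as a small helper. It is only a sketch: it assumes the memory section contains a line in the same form as Linux's /proc/<pid>/status (for example `VmSize:  3984588 kB`), which may not match the exact Crash SDK layout, and it uses the 3.5 GB lower bound quoted above as a hypothetical warning threshold.

```python
import re

# Matches lines such as "VmSize:   3984588 kB" (the /proc/<pid>/status format).
VMSIZE_PATTERN = re.compile(r"VmSize:\s*(\d+)\s*kB", re.IGNORECASE)

# Lower end of the 3.5-3.9 GB range in which most native OOM crashes were observed.
NEAR_LIMIT_GB = 3.5


def vmsize_gb(memory_text):
    """Return VmSize in GB, or None if no VmSize line is present."""
    match = VMSIZE_PATTERN.search(memory_text)
    if match is None:
        return None
    return int(match.group(1)) / (1024 * 1024)


def near_address_space_limit(memory_text, threshold_gb=NEAR_LIMIT_GB):
    """Return True when VmSize is at or above the warning threshold."""
    size = vmsize_gb(memory_text)
    return size is not None and size >= threshold_gb
```

A result of `True` only indicates that virtual address space exhaustion is a plausible cause; the OOM messages in logcat are still needed to confirm it.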

3. An FD closed by mistake causes UC to crash

From the crash point in the log we can tell only that the signal is SIGILL, which suggests the process may have crashed deliberately, and that the code ILL_ILLOPC indicates an illegal opcode.

We then look at the logs around "begin to generate native report" in the logcat section, which largely confirm that the FD used by UC was closed by other code.

UC subsequently provided a toolkit with FDscan. After reproducing the problem, we found that the input stream object returned to UC through the shouldIntercept callback had been closed by another module. UC then found the FD closed when it tried to use it and triggered its crash handling. The final fix was for the customer to correct the other module that was wrongly closing the FD.
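The failure mode can be reproduced in miniature. The sketch below is not UC's code or FDscan; it only illustrates, with a plain pipe, what the owner of a file descriptor sees when some other code has already closed it. The function name is hypothetical.

```python
import errno
import os


def read_after_foreign_close():
    """Return the errno seen when reading an FD that other code already closed."""
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"payload")
    # Simulates another module closing an FD it does not own.
    os.close(read_fd)
    try:
        # The legitimate owner now uses the FD and fails.
        os.read(read_fd, 7)
    except OSError as error:
        return error.errno
    finally:
        os.close(write_fd)
    return None
```

In this single-threaded sketch the read fails with `errno.EBADF`. In a real multi-threaded app the FD number may already have been reused for a different file, so the owner can end up operating on the wrong file instead of getting an error, which is why such bugs are hard to locate without an FD tracking tool.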

4. Summary

To summarize the cases above: when a native module crashes and the cause is not visible in the crash stack itself, don't worry. Search for "begin to generate native report" to find the crash context and read the logcat output leading up to the crash; it usually yields useful clues. For OOM-type problems, also check the memory statistics recorded in the log.

This article is original content from Alibaba Cloud and may not be reproduced without permission.

Origin blog.csdn.net/weixin_43970890/article/details/112978794