Android: Troubleshooting tools and monitoring solutions for page freezes

Author: A Bowl of Clear Soup Noodles

Foreword

Freezes are a well-worn topic, but users are sensitive to them and feel them directly. So what causes a freeze, how should one be defined, and how do we troubleshoot it? And when a freeze happens for a user online but cannot be reproduced offline, how do we obtain the information needed to locate the problem?

What is a freeze

Generally speaking, 12 FPS (frames per second) is the minimum standard: below 12 FPS the picture is no longer perceived as continuous. Above 60 FPS the human eye can hardly distinguish any further improvement, so 60 FPS is an important indicator of how smooth an interface is. However, a low FPS does not necessarily mean that a freeze has occurred, although the FPS is certainly not high when one does. A freeze is instead measured by the degree of frame drop over a period of time.
At 60 FPS, one frame takes 1000ms/60 = 16.6667ms. If severe frame drops make the FPS extremely low within a 1000ms window, the picture appears stuck.
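The arithmetic above can be written down as a small sketch. It assumes a fixed refresh rate, and FrameBudget and its method names are hypothetical helpers for illustration, not part of any Android API.

```java
// Minimal sketch: frame budget and missed-deadline arithmetic.
// FrameBudget and its methods are hypothetical names, not an Android API.
final class FrameBudget {

    // Time available to produce one frame at the given refresh rate, in ms.
    static double frameBudgetMs(int refreshRateHz) {
        return 1000.0 / refreshRateHz;
    }

    // Number of VSYNC deadlines missed by a frame that took frameTimeMs.
    // A frame that fits inside the budget misses none.
    static int droppedFrames(double frameTimeMs, int refreshRateHz) {
        double budget = frameBudgetMs(refreshRateHz);
        if (frameTimeMs <= budget) {
            return 0;
        }
        return (int) (frameTimeMs / budget);
    }
}
```

For instance, a frame that takes 40ms at 60Hz runs past the deadlines at roughly 16.7ms and 33.3ms, so two frames are dropped.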

Freeze Troubleshooting Tools

There are many tools for troubleshooting freezes, and most of them fall into two types:
instrument: Record every function call within a period of time, then analyze the call sequence from that period to find what needs optimizing.
sample: Observe only selected function calls, or sample them periodically, use this limited information to infer suspicious points, and then refine the analysis further.

1. CPU Profiler

CPU Profiler is part of the Android Profiler that ships with Android Studio, alongside the memory, network and energy profilers. It displays execution time and call stacks graphically and covers all threads. We can trace a specific section of code with the methods of the Debug class:

Debug.startMethodTracing("xxx"); 
...
Debug.stopMethodTracing();

A xxx.trace file is then generated in the Android/data/packagename/files folder on the SD card. Opening it in Android Studio shows the function calls and their time cost, along with the frame rate, CPU usage and the other threads during that period, so we can locate the exact spot that needs adjusting.

However, the tool itself introduces a lot of performance overhead, so its results sometimes do not reflect the real situation.

2. Systrace

systrace is a performance analysis tool introduced in Android 4.1. It is built on ftrace, the Linux tracing facility, which amounts to placing performance probes (instrumentation points) at key positions in the system. Android wraps ftrace in atrace and adds probes of its own, such as Graphics, Activity Manager, Dalvik VM and System Server. systrace only monitors the time cost of specific system calls, so it belongs to the sample type and its performance overhead is very low. However, it does not analyze the time cost of application code out of the box, which limits its usefulness.
Usage:
systrace.py -t 10 sched gfx view wm am app webview -a com.example.android_kt_wandroid

You can also insert trace sections in code, before and after the part you want to analyze:

Trace.beginSection(name);
...
Trace.endSection();

The systrace tool generates a .systrace file. Open chrome://tracing in the Chrome browser and click Load to open the file, which also shows the state of the CPU, frames and threads.

The Alerts panel on the right lists suggestions about dropped frames.

To sum up: whichever freeze troubleshooting tool is used, we end up with stack information from the moment of the freeze and the CPU state at that time. Most freezes are then relatively easy to locate, for example the main thread running a time-consuming task, reading a very large file, or executing a network request.

Freeze Monitoring Solutions

Sometimes the tools above are of no use at all. For example, a user reports that page xxx freezes repeatedly, but when I follow the steps the user describes, everything runs smoothly. In that case we need information from the user's device at the moment of the freeze, such as the system version, CPU load, network environment and application data. Away from the scene, the problem is hard to reproduce locally and therefore hard to solve.

1. Message queue

This method relies on the main thread's Looper to monitor the execution time of each dispatchMessage call, which is how BlockCanary is implemented:

public static void loop() {
    ...
    for (;;) {
        ...
        // This must be in a local variable, in case a UI event sets the logger
        Printer logging = me.mLogging;
        if (logging != null) {
            logging.println(">>>>> Dispatching to " + msg.target + " " +
                    msg.callback + ": " + msg.what);
        }
        msg.target.dispatchMessage(msg);
        if (logging != null) {
            logging.println("<<<<< Finished to " + msg.target + " " + msg.callback);
        }
        ...
    }
}

All main-thread tasks are executed inside dispatchMessage(msg), so we can use a custom Printer to record the time before and after each dispatchMessage(msg) call. Meanwhile, another thread obtains and records the main thread's stack through Thread#getStackTrace. When a dispatchMessage(msg) call exceeds a threshold, we notify that thread to hand over the recorded stack information for analysis.
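The timing half of this approach can be sketched as follows. This is an illustration under stated assumptions, not BlockCanary's actual code: DispatchMonitor, its threshold and the slowMessages list are hypothetical, the Printer interface is a local stand-in for android.util.Printer so the sketch runs on a plain JVM, and the clock is injected so the logic can be exercised without a real Looper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Local stand-in for android.util.Printer (same single method).
interface Printer {
    void println(String x);
}

// Minimal sketch of a dispatch-time monitor driven by Looper's log lines.
// DispatchMonitor and slowMessages are hypothetical names.
final class DispatchMonitor implements Printer {
    private final long thresholdMs;
    private final LongSupplier clockMs;  // injected clock, in milliseconds
    private long startMs = -1;
    final List<String> slowMessages = new ArrayList<>();

    DispatchMonitor(long thresholdMs, LongSupplier clockMs) {
        this.thresholdMs = thresholdMs;
        this.clockMs = clockMs;
    }

    @Override
    public void println(String x) {
        if (x.startsWith(">>>>> Dispatching")) {
            // Logged right before dispatchMessage(msg): remember the start time.
            startMs = clockMs.getAsLong();
        } else if (x.startsWith("<<<<< Finished") && startMs >= 0) {
            // Logged right after dispatchMessage(msg): compute the elapsed time.
            long costMs = clockMs.getAsLong() - startMs;
            startMs = -1;
            if (costMs > thresholdMs) {
                // A real monitor would fetch the sampled main-thread stacks here.
                slowMessages.add(x + " took " + costMs + " ms");
            }
        }
    }
}
```

On Android, a monitor like this would implement the real android.util.Printer and be installed with Looper.getMainLooper().setMessageLogging(...), with SystemClock.uptimeMillis() as a plausible clock.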

2. Choreographer

This method relies on the Choreographer module. The Android system sends a VSYNC signal every 16.6ms (at 60Hz) to notify the interface to redraw and render, so each synchronization period of 16.6ms corresponds to one frame. The SDK provides the Choreographer class and its callbacks. In theory, the interval between two consecutive callbacks should be 16.6ms. If it is longer, we consider that frames were dropped, and we use this interval to decide whether a freeze has occurred.
We register a FrameCallback with Choreographer while another thread records the main thread's stack. Each time doFrame is called for a VSYNC event, we register the callback again, which indirectly measures the interval between two VSYNC events. When the interval exceeds the threshold, the recorded stack is taken for analysis.
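The interval check can be sketched as below. FrameIntervalMonitor, dropThreshold and the public counters are hypothetical names. On Android the class would implement Choreographer.FrameCallback and re-post itself from doFrame, but here doFrame is called by hand so the logic runs on a plain JVM.

```java
// Minimal sketch of frame-interval monitoring.
// FrameIntervalMonitor and its fields are hypothetical names.
final class FrameIntervalMonitor {
    private final long frameIntervalNanos;  // e.g. 16_666_666 ns at 60Hz
    private final int dropThreshold;        // dropped frames that count as a freeze
    private long lastFrameNanos = -1;
    int lastDropped = 0;   // frames dropped before the most recent callback
    int freezeCount = 0;   // how many times the threshold was reached

    FrameIntervalMonitor(long frameIntervalNanos, int dropThreshold) {
        this.frameIntervalNanos = frameIntervalNanos;
        this.dropThreshold = dropThreshold;
    }

    // Same signature as Choreographer.FrameCallback#doFrame.
    void doFrame(long frameTimeNanos) {
        if (lastFrameNanos >= 0) {
            long elapsed = frameTimeNanos - lastFrameNanos;
            // Round to whole intervals to tolerate small timestamp jitter.
            long intervals = Math.round((double) elapsed / frameIntervalNanos);
            lastDropped = (int) Math.max(0, intervals - 1);
            if (lastDropped >= dropThreshold) {
                // A real monitor would take the recorded main-thread stack here.
                freezeCount++;
            }
        }
        lastFrameNanos = frameTimeNanos;
        // On Android: Choreographer.getInstance().postFrameCallback(this);
    }
}
```

Counting dropped frames instead of comparing against a flat 16.6ms keeps the check independent of the refresh rate, as long as the interval passed in matches the display.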

3. Instrumentation

Both methods above need a separate thread to record the method stack, which consumes system resources. They can also mislead, because the function running when the stack is captured may not be the one that actually took the time.
For example, suppose three functions run within one message and the whole message takes 3000ms. By the time the stack is captured, function A (1500ms) and function B (1000ms) have already finished, so we only see function C (500ms), which is not the real culprit. We therefore need the time cost of every function.

With a custom Transform we can get all class files at compile time, use ASM to insert calls at the entry and exit of every method, and then retrieve the method information at runtime once a freeze is detected.

// Before instrumentation
void func(){
    dosomething();
}

// After instrumentation
void func(){
    Trace.i(x);
    dosomething();
    Trace.o(x);
}
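What the inserted calls might do at runtime can be sketched as follows. MethodTracer is a hypothetical stand-in for the Trace class in the snippet above, not Matrix's implementation. It assumes single-threaded use, takes an injected clock, and records inclusive cost (a method's time includes its callees).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Minimal sketch of the runtime side of method instrumentation.
// MethodTracer is a hypothetical name; single-threaded use is assumed.
final class MethodTracer {
    private final LongSupplier clockMs;                      // injected clock
    private final Deque<long[]> stack = new ArrayDeque<>();  // {methodId, enterTime}
    final Map<Integer, Long> totalCostMs = new HashMap<>();

    MethodTracer(LongSupplier clockMs) {
        this.clockMs = clockMs;
    }

    // Inserted at method entry.
    void i(int methodId) {
        stack.push(new long[] {methodId, clockMs.getAsLong()});
    }

    // Inserted at method exit: add the elapsed time to the method's total.
    void o(int methodId) {
        long[] top = stack.pop();
        if (top[0] != methodId) {
            throw new IllegalStateException("unbalanced trace for method " + methodId);
        }
        totalCostMs.merge(methodId, clockMs.getAsLong() - top[1], Long::sum);
    }

    // ID of the method with the largest accumulated cost, or -1 if none.
    int slowestMethod() {
        int slowest = -1;
        long max = -1;
        for (Map.Entry<Integer, Long> e : totalCostMs.entrySet()) {
            if (e.getValue() > max) {
                max = e.getValue();
                slowest = e.getKey();
            }
        }
        return slowest;
    }
}
```

With per-method totals, the A/B/C case described earlier resolves correctly: function A shows up with 1500ms even though only function C is still on the stack when the freeze is detected.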

There are also many details and optimization points:

  1. Avoid an explosion of the method count. Insert the same function at the entry and exit of every method, and assign each method a unique ID at compile time to pass as the parameter.
  2. Filter simple functions. Skip trivial functions such as a direct return or i++, and support a configurable blacklist. Functions that are called very frequently should be added to the blacklist to reduce the performance cost of the whole solution.
  3. Reduce compilation time. Instrumentation can run asynchronously on a thread pool, combined with Future.
  4. Choose how method information is stored at runtime. Save the method information as an int value, similar to the design of MeasureSpec.
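The last point, packing several fields into one primitive in the style of MeasureSpec, can be sketched as below. The text mentions an int, whereas this sketch uses a long so that a millisecond timestamp also fits. That choice, the TraceRecord name, and the bit layout (1 flag bit, 20 ID bits, 43 time bits) are all assumptions made for illustration.

```java
// Minimal sketch of packing one trace record into a single long.
// TraceRecord and the 1/20/43 bit layout are assumptions, not a known format.
final class TraceRecord {
    private static final int TIME_BITS = 43;
    private static final int ID_BITS = 20;
    private static final long TIME_MASK = (1L << TIME_BITS) - 1;
    private static final long ID_MASK = (1L << ID_BITS) - 1;

    // Highest bit: entry/exit flag; next 20 bits: method ID; low 43 bits: time.
    static long pack(boolean isEnter, int methodId, long timeMs) {
        long record = isEnter ? (1L << 63) : 0L;
        record |= (methodId & ID_MASK) << TIME_BITS;
        record |= timeMs & TIME_MASK;
        return record;
    }

    static boolean isEnter(long record) {
        return (record >>> 63) == 1L;
    }

    static int methodId(long record) {
        return (int) ((record >>> TIME_BITS) & ID_MASK);
    }

    static long timeMs(long record) {
        return record & TIME_MASK;
    }
}
```

Storing records in a primitive array this way avoids allocating an object per method call, which matters when every instrumented method writes two records.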

The Trace Canary module in WeChat's open source project matrix has already solved the problems above. If you are interested, take a look at its source code.

Summary

We have now covered several freeze monitoring methods, which can solve some of our problems. In fact, many freezes are not hard to fix. The harder part is finding the freeze points quickly and using auxiliary information to identify the real cause.


Origin blog.csdn.net/weixin_61845324/article/details/132581930