Performance Optimization: How to Completely Solve the Jank Caused by SharedPreferences

Background

After our ANR monitoring platform went live, it collected a large number of ANR logs from production. The flame graphs showed the main thread blocked in QueuedWork-related functions. This article explains why this happens and how to fix it.

The solution described in this article is available on GitHub as a reference implementation: https://github.com/Knight-ZXW/SpWaitKiller

How SP tasks block the main thread and cause ANR

First, a brief introduction to the QueuedWork class. QueuedWork is meant to execute and track process-wide tasks, but at present it mainly schedules the asynchronous tasks of SharedPreferences (SP). When the apply method of SharedPreferences is called, the pending change to the SP file is converted into a task and handed to QueuedWork for execution through the QueuedWork.queue method.

To guarantee that these tasks have finished before certain key actions take place (such as a page jump that starts an Activity), a waiting mechanism was added. It works roughly as follows:

• An apply call on SP creates two Runnable objects: a work task (the work Runnable) that performs the actual file write, and a waiting task (awaitCommit). When the work task finishes, it notifies the waiting task through a CountDownLatch [1]; awaitCommit does little more than wait on that latch.
• The work task is sent to an asynchronous thread for execution through the queue method of QueuedWork.
• The waiting task (awaitCommit) is added to a waiting queue inside QueuedWork through the addFinisher function of QueuedWork.
• Finally, at certain key points in the system, for example when ActivityThread executes handleStopActivity, waitToFinish is called to make sure these asynchronous tasks have completed.
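The mechanism described above can be reproduced off-device with a small simulation. This is only a sketch: MiniQueuedWork, MiniEditor and ApplyDemo are hypothetical stand-ins written for this article, not the framework classes, and the disk write is faked with Thread.sleep.

```java
import java.util.LinkedList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified stand-in for android.app.QueuedWork (hypothetical, not the framework class).
class MiniQueuedWork {
    private static final Object sLock = new Object();
    private static final LinkedList<Runnable> sFinishers = new LinkedList<>();
    // One daemon thread plays the role of QueuedWork's background thread.
    private static final ExecutorService sWorker =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "mini-queued-work");
                t.setDaemon(true);
                return t;
            });

    static void queue(Runnable work) {
        sWorker.execute(work);
    }

    static void addFinisher(Runnable finisher) {
        synchronized (sLock) { sFinishers.add(finisher); }
    }

    static void removeFinisher(Runnable finisher) {
        synchronized (sLock) { sFinishers.remove(finisher); }
    }

    // Runs every pending finisher on the calling thread, so the caller
    // blocks until the corresponding writes have completed.
    static void waitToFinish() {
        while (true) {
            Runnable finisher;
            synchronized (sLock) { finisher = sFinishers.poll(); }
            if (finisher == null) break;
            finisher.run();
        }
    }
}

// Simplified stand-in for the Editor behind SharedPreferences.apply().
class MiniEditor {
    static final AtomicInteger writesCompleted = new AtomicInteger();

    static void apply(long simulatedIoMillis) {
        CountDownLatch writtenToDisk = new CountDownLatch(1);
        // Waiting task: only waits for the latch.
        Runnable awaitCommit = () -> {
            try {
                writtenToDisk.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        MiniQueuedWork.addFinisher(awaitCommit);
        // Work task: the fake file write, then the latch is released.
        MiniQueuedWork.queue(() -> {
            try {
                Thread.sleep(simulatedIoMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            writesCompleted.incrementAndGet();
            writtenToDisk.countDown();
            MiniQueuedWork.removeFinisher(awaitCommit);
        });
    }
}

class ApplyDemo {
    public static void main(String[] args) {
        long start = System.nanoTime();
        MiniEditor.apply(300);           // returns immediately
        long afterApply = System.nanoTime();
        MiniQueuedWork.waitToFinish();   // blocks until the fake write is done
        long afterWait = System.nanoTime();
        System.out.println("apply returned after ms: " + (afterApply - start) / 1_000_000);
        System.out.println("waitToFinish returned after ms: " + (afterWait - start) / 1_000_000);
    }
}
```

In the simulation, apply itself is cheap; the cost shows up later, on whichever thread calls waitToFinish. On a device that thread is the main thread.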

If system resources (CPU, IO) are tight at that moment, or many asynchronous tasks have been submitted, onStop can take a long time to return, resulting in an ANR.

In addition, on Android 8.0 and above, Google changed the implementation of waitToFinish. Besides waiting for all asynchronous tasks to complete, it now calls processPendingWork on the current thread, which takes the tasks in QueuedWork that have not run yet and executes them directly. The reason for the change is that waitToFinish is usually called on the main thread, and the main thread has a higher priority than QueuedWork's internal thread. Moving the unexecuted tasks onto the main thread is therefore meant to get them done faster.

Solving the SP blocking problem

Replacing the finishers queue object via reflection

There are several ways to solve the blocking caused by SP. One is to use bytecode instrumentation to redirect the application's SP calls to MMKV or another, more efficient key-value storage library. Another, proposed by ByteDance in a published article [2], is to replace the sFinishers object inside the QueuedWork class with a proxy, so that the queue always appears empty when waitToFinish runs.


sFinishers.poll is called in exactly one place in the whole class, inside waitToFinish. The object can therefore be replaced with a proxy whose poll function always returns null, without affecting any other code path.

The sFinishers object is a different class depending on the Android version:

• Versions below Android 8.0 use ConcurrentLinkedQueue
• Android 8.0 and later use LinkedList

Taking Android 8.0 and above as an example, first create a proxy class that overrides the implementation of poll.

Then swap it in for the original object via reflection.
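A minimal sketch of these two steps follows. It is not the exact code from the repository. To keep it runnable on a plain JVM, it assumes a hypothetical stand-in class, FakeQueuedWork, with a non-final sFinishers field. As far as I know the real field on Android 8.0+ is declared final, and ART still allows it to be rewritten after setAccessible(true), while a desktop JVM does not. NoWaitFinishers and FinisherHook are likewise names invented for this example.

```java
import java.lang.reflect.Field;
import java.util.LinkedList;

// Hypothetical stand-in for android.app.QueuedWork so the sketch runs off-device.
class FakeQueuedWork {
    // Assumption: non-final here so that a desktop JVM lets us replace it.
    private static LinkedList<Runnable> sFinishers = new LinkedList<>();

    static void addFinisher(Runnable finisher) {
        sFinishers.add(finisher);
    }

    // Same shape as the finisher loop in waitToFinish: poll until null.
    static void waitToFinish() {
        Runnable finisher;
        while ((finisher = sFinishers.poll()) != null) {
            finisher.run();
        }
    }
}

// Proxy queue: behaves like a LinkedList except that poll() never
// hands out a finisher, so waitToFinish has nothing to wait for.
class NoWaitFinishers extends LinkedList<Runnable> {
    @Override
    public Runnable poll() {
        return null;
    }
}

class FinisherHook {
    // Replaces the static sFinishers field of the given class with the proxy.
    @SuppressWarnings("unchecked")
    static void install(Class<?> queuedWorkClass) {
        try {
            Field field = queuedWorkClass.getDeclaredField("sFinishers");
            field.setAccessible(true);
            LinkedList<Runnable> original = (LinkedList<Runnable>) field.get(null);
            NoWaitFinishers proxy = new NoWaitFinishers();
            proxy.addAll(original); // keep finishers that were already registered
            field.set(null, proxy);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not replace sFinishers", e);
        }
    }
}
```

On a device, install would presumably be called once, early in Application startup, with the class obtained through Class.forName("android.app.QueuedWork"). For versions below 8.0 the proxy would need to extend ConcurrentLinkedQueue instead.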

Dealing with processPendingWork calls

As mentioned earlier, on Android 8.0 and above waitToFinish calls processPendingWork directly on the current thread before it runs the waiting finishers queue.

[Figure: relationship between the main thread and the asynchronous worker thread at runtime]

processPendingWork can therefore run either on the main thread or on the asynchronous thread. On Android 8.0 to 11.0, a call to processPendingWork can block at two points:

1. The asynchronous worker thread is already inside processPendingWork and holds the sProcessingWork lock. When the main thread calls processPendingWork, it cannot acquire sProcessingWork and has to wait for the lock.

2. The main thread acquires the sProcessingWork lock, calls clone, and finds unexecuted tasks in the sWork queue. These tasks then run directly on the main thread. If IO is slow at that moment, the main thread blocks on IO and may even ANR.

For these two reasons, it is not enough for the proxied clone function to return an empty queue only on the main thread: if the asynchronous thread is inside processPendingWork and running slowly, the main thread still waits for the lock. The final approach is that the proxy's clone function returns an empty queue no matter which thread calls it. Calls to processPendingWork then never block each other, because processPendingWork no longer does any real work. The tasks still have to run somewhere, so we obtain the Looper of QueuedWork's internal Handler through reflection, create a new Handler on it, and submit the tasks in sWork to that Handler. The work still runs on the background thread, and the main thread is never blocked.
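The idea can be sketched as follows for the 8.0 to 11.0 code path. This is an off-device approximation under stated assumptions: QueuedWorkO is a hypothetical stand-in that mirrors the clone-then-clear shape of processPendingWork, CloneProxyWork is a name invented here, and a java.util.concurrent.Executor takes the place of the Handler created on QueuedWork's Looper.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.Executor;

// Hypothetical stand-in mirroring the 8.0-11.0 shape of processPendingWork.
class QueuedWorkO {
    static final Object sLock = new Object();
    static final Object sProcessingWork = new Object();
    static LinkedList<Runnable> sWork = new LinkedList<>();

    static void queue(Runnable work) {
        synchronized (sLock) { sWork.add(work); }
    }

    @SuppressWarnings("unchecked")
    static void processPendingWork() {
        synchronized (sProcessingWork) {
            LinkedList<Runnable> work;
            synchronized (sLock) {
                work = (LinkedList<Runnable>) sWork.clone();
                sWork.clear();
            }
            for (Runnable w : work) {
                w.run();
            }
        }
    }
}

// Proxy for sWork: clone() hands the pending tasks to a background executor
// and returns an empty list, so the caller of processPendingWork runs nothing.
class CloneProxyWork extends LinkedList<Runnable> {
    // Assumption: on a device this would be a Handler bound to QueuedWork's Looper.
    private final Executor target;

    CloneProxyWork(Executor target) {
        this.target = target;
    }

    @Override
    public Object clone() {
        List<Runnable> pending = new ArrayList<>(this); // snapshot of queued tasks
        clear();
        for (Runnable task : pending) {
            target.execute(task);
        }
        return new LinkedList<Runnable>(); // nothing left for the caller to run
    }
}
```

Usage, assuming the proxy is installed before any task is queued: set QueuedWorkO.sWork to a new CloneProxyWork backed by a single-threaded executor. Because clone is only called while sLock is held, the snapshot and the clear do not race with queue in this sketch.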


Note that because of the hidden API restrictions, the sWork member variable can only be accessed through reflection in apps whose targetSdkVersion is 28 or lower. To make this work in an app that targets a version greater than 28, you need to get around the system's hidden API restrictions. The HiddenApiBypass [3] library provided by LSPosed can be used for this.

In addition, this part of the code changed again in Android 12. Instead of using clone and clear to take a copy of the collection, it now simply replaces the sWork reference with a new list.

As a result, replacing the clone function no longer works, and because the object that sWork points to changes every time processPendingWork is called, replacing sWork with a proxy cannot be a one-time operation. So we keep looking for a point that can be hooked. Consider the loop

for (Runnable v: work)

At the bytecode level this loop is compiled into a call to iterator(). The earlier trick can therefore be moved into the iterator function, which returns an empty iterator. In other words, the scheme changes from proxying clone to proxying iterator, and sWork has to be replaced with a fresh proxy again after every call to iterator.
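A sketch of the iterator-based variant, under the same caveats as before. QueuedWorkS is a hypothetical stand-in that mirrors the swap-the-reference shape described for Android 12, IteratorProxyWork is an invented name, and an Executor again replaces the Handler. On a device, sLock and sWork would have to be reached through reflection rather than direct field access.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.concurrent.Executor;

// Hypothetical stand-in mirroring the Android 12 shape of processPendingWork.
class QueuedWorkS {
    static final Object sLock = new Object();
    static LinkedList<Runnable> sWork = new LinkedList<>();

    static void queue(Runnable work) {
        synchronized (sLock) { sWork.add(work); }
    }

    static void processPendingWork() {
        LinkedList<Runnable> work;
        synchronized (sLock) {
            work = sWork;
            sWork = new LinkedList<>(); // the reference is swapped, no clone()
        }
        if (work.size() > 0) {
            for (Runnable w : work) { // compiles to work.iterator()
                w.run();
            }
        }
    }
}

// Proxy for sWork: iterator() hands the tasks off and re-installs a new proxy.
class IteratorProxyWork extends LinkedList<Runnable> {
    private final Executor target;

    private IteratorProxyWork(Executor target) {
        this.target = target;
    }

    // Puts a fresh proxy into sWork, keeping tasks queued in the meantime.
    static void install(Executor target) {
        synchronized (QueuedWorkS.sLock) {
            IteratorProxyWork proxy = new IteratorProxyWork(target);
            proxy.addAll(QueuedWorkS.sWork);
            QueuedWorkS.sWork = proxy;
        }
    }

    @Override
    public Iterator<Runnable> iterator() {
        // toArray and clear walk the list nodes directly and do not call iterator().
        for (Runnable task : toArray(new Runnable[0])) {
            target.execute(task);
        }
        clear();
        install(target); // sWork now points at a plain list, so proxy it again
        return Collections.emptyIterator();
    }
}
```

The re-install happens under sLock so that a task queued between the reference swap and the re-install is copied into the new proxy instead of being lost.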

Finally

The solution above does not take much code, so I put together a project on GitHub that simulates and fixes the ANR caused by blocked QueuedWork tasks, for reference: https://github.com/Knight-ZXW/SpWaitKiller . Before shipping it, test the business features that use SP. For example, if components in different processes depend on the same SP file, note that we have removed the flush of SP file changes that used to happen when an Activity stops. If you jump to a component in another process that depends on configuration values changed through SP just before the jump, that component may read stale values.

Also, judging from the other context information collected with the ANRs, the fact that a blocked SP operation triggered the ANR does not mean SP is the root cause. For example, a shortage of physical memory and frequent swap operations can slow down normal IO, which slows down the flushing of SP to disk and eventually leads to the ANR.


Origin blog.csdn.net/weixin_61845324/article/details/131807756