Debunking five myths and two doubts about Android performance optimization

In recent years, the community has accumulated a number of misunderstandings about Android performance optimization. In the spirit of myth-busting, this article uses concrete performance testing tools to analyze these claims against real cases and compare the measured results. It also draws on real data from scenarios Android developers encounter in everyday coding to make one point: always run the necessary performance tests before you actually optimize.

Myth 1: Kotlin performs worse than Java

The Google Drive team has fully migrated its application from Java to Kotlin. The refactoring touched more than 170 files and more than 16,000 lines of code, spanning more than 40 build targets. Among the metrics the team monitored, the first was startup time; the test results were as follows:

As shown, using Kotlin had no substantial impact on performance: the Google team observed no noticeable performance difference across the benchmarks. There was a slight increase in compile time and compiled code size, but both stayed within 2%, which is negligible. Thanks to Kotlin's concise syntax, the team's line count dropped by about 25%, and the code became more readable and maintainable.

It is also worth mentioning that with Kotlin we can still use code-shrinking tools such as R8 to optimize the code further.

Myth 2: Getters and setters are expensive

Fearing performance degradation, some developers access public fields directly instead of writing getter and setter methods, as in the following code, where the getFoo() method is the getter for the field foo:

public class ToyClass {
   public int foo;
   public int getFoo() { return foo; }
}

ToyClass tc = new ToyClass();

Fetching the field directly via tc.foo clearly breaks object-oriented encapsulation. As for performance, we used Jetpack Benchmark to compare tc.getFoo() and tc.foo on a Pixel 3 running Android 10. The library warms up the code before measuring, and the final stable results were as follows:

The getter performs essentially the same as direct field access. This result is not surprising: the Android Runtime (ART) inlines all trivial getters, so the code executed after JIT or AOT compilation is identical. For the same reason, even though Kotlin accesses properties through getters and setters by default, performance does not suffer. If you use Java, then unless you have a special need, you should not break the encapsulation of your code this way.
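The point above can be sanity-checked even off-device. The sketch below is not Jetpack Benchmark, just a plain-Java illustration of the two access paths; real measurements should use androidx.benchmark on a device, since naive timing loops are distorted by JIT warm-up and dead-code elimination. The class and field names mirror the ToyClass example above.

```java
// Plain-Java sketch: direct field access vs. an inlinable getter produce
// identical results; on ART the getter is inlined, so the compiled code
// is the same for both paths.
public class GetterVsField {
    static class ToyClass {
        public int foo;
        public int getFoo() { return foo; }
    }

    public static void main(String[] args) {
        ToyClass tc = new ToyClass();
        tc.foo = 7;

        long direct = 0, viaGetter = 0;
        for (int i = 0; i < 1_000_000; i++) direct += tc.foo;         // direct field access
        for (int i = 0; i < 1_000_000; i++) viaGetter += tc.getFoo(); // getter access
        System.out.println(direct == viaGetter); // prints true: same result either way
    }
}
```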

Myth 3: Lambdas are slower than inner classes

Lambdas (especially combined with the Stream API) are a very convenient syntax that allows for very concise code. The following code sums a field across an array of objects, using the Stream API's map-reduce operations:

ArrayList<ToyClass> array = build();

int sum = array.stream().map(tc -> tc.foo).reduce(0, (a, b) -> a + b);

The first lambda will convert the object to an integer, and the second lambda will add the two resulting values.
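For reference, a self-contained, runnable version of the lambda snippet might look like the following. The ToyClass definition and the build() helper are hypothetical fill-ins (the article does not show them); only the map-reduce line comes from the text above.

```java
import java.util.ArrayList;
import java.util.List;

public class LambdaSum {
    // Minimal stand-in for the article's ToyClass.
    static class ToyClass {
        int foo;
        ToyClass(int foo) { this.foo = foo; }
    }

    // Hypothetical build() producing a small sample list.
    static List<ToyClass> build() {
        List<ToyClass> array = new ArrayList<>();
        for (int i = 1; i <= 4; i++) array.add(new ToyClass(i));
        return array;
    }

    public static void main(String[] args) {
        // First lambda maps each object to its foo field; second adds the values.
        int sum = build().stream().map(tc -> tc.foo).reduce(0, (a, b) -> a + b);
        System.out.println(sum); // 1 + 2 + 3 + 4 = 10
    }
}
```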

In the following code, we replace the lambda expression with an inner class:

ToyClassToInteger toyClassToInteger = new ToyClassToInteger();

SumOp sumOp = new SumOp();

int sum = array.stream().map(toyClassToInteger).reduce(0, sumOp);

Here there are two inner classes: toyClassToInteger, which converts the object to an integer, and SumOp, which performs the summation.
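The article does not show the bodies of these two classes; one plausible sketch, implementing the standard functional interfaces that Stream.map and Stream.reduce expect, would be:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;

public class InnerClassSum {
    static class ToyClass {
        int foo;
        ToyClass(int foo) { this.foo = foo; }
    }

    // Hypothetical definition: maps a ToyClass to its foo field.
    static class ToyClassToInteger implements Function<ToyClass, Integer> {
        @Override public Integer apply(ToyClass tc) { return tc.foo; }
    }

    // Hypothetical definition: adds two partial sums.
    static class SumOp implements BinaryOperator<Integer> {
        @Override public Integer apply(Integer a, Integer b) { return a + b; }
    }

    public static void main(String[] args) {
        List<ToyClass> array = new ArrayList<>();
        for (int i = 1; i <= 4; i++) array.add(new ToyClass(i));

        ToyClassToInteger toyClassToInteger = new ToyClassToInteger();
        SumOp sumOp = new SumOp();

        int sum = array.stream().map(toyClassToInteger).reduce(0, sumOp);
        System.out.println(sum); // same result as the lambda version
    }
}
```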

Syntactically, the first example with lambdas is clearly more elegant and readable. So what about the performance difference? We ran Jetpack Benchmark again on the Pixel 3 and saw no performance difference:

As can be seen from the figure, we also defined a separate external (top-level) class for comparison, and again found no difference in performance. The reason is that lambda expressions are ultimately compiled down to anonymous-inner-class-style constructs. So for conciseness and readability, lambda expressions are the first choice in this scenario.

Myth 4: Object allocation is expensive, so use object pools

Android ships with a modern memory allocator and garbage collector. As the figure below shows, almost every release has improved object allocation in some way.

Garbage-collection performance has improved significantly between versions, and today garbage collection has little impact on an application's smoothness. The figure below shows Google's improvements to object collection with the generational concurrent collector in Android 10; Android 11 brings further significant improvements.

In GC benchmarks such as H2, throughput improved by more than 170%, and in real applications such as Google Sheets it increased by 68%.

The claim that garbage collection is inefficient and allocation is burdensome leads to the idea that the less garbage we create, the less work the collector has to do. So instead of creating new objects on every use, we could maintain a pool of frequently used objects and acquire instances from it, roughly as follows:

Pool<A> pool = new Pool<>(50);   // a hypothetical fixed-capacity pool

void foo() {
   A a = pool.acquire();         // reuse an instance instead of allocating
   // ... use a ...
   pool.release(a);              // return it to the pool for later reuse
}

The details are omitted here; in general, a pool is defined, objects are acquired from it, and eventually released back.
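A minimal concrete pool, filling in the details the article omits, might look like the sketch below. This is a hypothetical implementation in plain Java (androidx.core.util.Pools offers a similar acquire/release API on Android); it is not thread-safe and does no state reset, both of which a production pool would need.

```java
import java.util.ArrayDeque;

public class PoolDemo {
    // Minimal fixed-capacity object pool sketch (hypothetical, not androidx's).
    static final class SimplePool<T> {
        private final ArrayDeque<T> free = new ArrayDeque<>();
        private final int maxSize;

        SimplePool(int maxSize) { this.maxSize = maxSize; }

        // Returns a pooled instance, or null if the pool is empty.
        T acquire() { return free.poll(); }

        // Accepts the object back unless the pool is already full.
        boolean release(T obj) {
            if (free.size() < maxSize) { free.push(obj); return true; }
            return false;
        }
    }

    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(50);

        StringBuilder sb = pool.acquire();
        if (sb == null) sb = new StringBuilder(); // pool empty: fall back to allocation
        sb.append("work");
        sb.setLength(0);                          // reset state before returning it
        pool.release(sb);

        System.out.println(pool.acquire() == sb); // prints true: the instance is reused
    }
}
```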

To test this scenario we used microbenchmarks measuring two things: the cost of allocating objects versus acquiring them from the pool, and the CPU overhead of garbage collection, to determine whether GC affects application performance.

For this case we looped the allocation code thousands of times on a Pixel 2 XL running Android 10. Since performance can differ for small and large objects, we also added varying numbers of fields to simulate different object sizes. The final overhead results are as follows:

The CPU overhead of garbage collection came out as follows:


As the graph shows, the difference between standard allocation and pooled objects is small; however, for garbage collection of larger objects, the pooled solution is slightly more expensive.

This result is not unexpected: pooling increases the application's memory footprint, so the application suddenly holds on to more memory. Even though pooling reduces how often garbage collection runs, each collection pass becomes more expensive, because the collector has to walk more live memory to determine which objects to collect and which to keep.

So, whether objects should be pooled or not depends mainly on the needs of the application. If code complexity is not considered, pooled objects have the following disadvantages:

  • Increased memory usage
  • Objects live longer
  • Requires a well-built object-pool mechanism

However, pooling may indeed pay off for large, expensive-to-allocate objects. The key is to test adequately before choosing a solution.

Myth 5: Profiling in debug mode is fine

It is very convenient to profile the application while debugging. After all, we usually code in debug mode, and the reasoning goes that even if profiling a debug build is inaccurate, we can iterate faster and be more efficient. In fact, this is not the case.

To test this myth, we measured common Activity-related operations; the results are shown below:

In some tests (such as deserialization), debuggability has no impact on performance. However, some results differ by 50% or more, and we even found cases that ran 100% slower, because the runtime applies almost no optimizations to code in debug mode, so it behaves very differently from the code users run on a production device.

Profiling in debug mode can therefore point you in misleading optimization directions, wasting time optimizing things that do not need it.

Doubts

Now that we know to avoid the five myths above, let's look at some questions that are less obvious in daily development but that we often wonder about. The actual results may differ considerably from what we expect.

Doubt 1: Does Multidex affect application performance?

Today's APK files keep growing, and large applications often exceed the dex format's limit on the number of method references, which is worked around with the Multidex scheme.

The questions are: how many more methods can then be referenced, and if the application contains many dex files, does that affect performance? In many cases we use Multidex not because the application is too large, but to split dex files along feature boundaries to make team development easier.

To test the impact of multiple dex files on performance, we used the Calculator application, which by default contains only a single dex file, and split it into five dex files along package boundaries to simulate such a split.

First, we tested application startup performance; the results are as follows:

So splitting the dex file had no effect here. For other apps there may be a slight overhead depending on the app's size and how it is split. However, as long as you split dex files sensibly and don't end up with hundreds of them, the impact on startup time should be small.

Next is the size and memory consumption of the APK:

As you can see, both the APK size and the app's runtime memory footprint increase slightly, because when the app is split into multiple dex files, each file duplicates some symbol-table and cache data.

However, we can minimize this by reducing the dependencies between dex files. In this test the split was done by hand; with tools such as R8 and D8 we can analyze the project structure properly and minimize those dependencies. These tools split dex files automatically and help avoid common mistakes, such as creating more dex files than necessary or failing to place all startup classes in the main dex file. If you do a custom dex split, make sure to analyze the result properly.

Doubt 2: Does dead code matter?

One benefit of a runtime with a JIT compiler like ART is that code can be analyzed and optimized as it runs. There is a theory that code never profiled by the interpreter/JIT system was likely never executed. To test this, we examined the ART profiles generated by Google apps and found that a lot of code was never profiled by the JIT, suggesting that much of it never actually ran on the device.

Several kinds of code may never be profiled:

  • Error-handling code, which we hope rarely runs.
  • Compatibility code, which does not execute on all devices, especially on Android 5+ devices.
  • Code for rarely used features.

Still, judging from the resulting distribution, applications carry a lot of unnecessary code. R8 can remove it quickly, easily, and for free, reducing this overhead. Beyond that, we can package the application as an Android App Bundle, a format that delivers only the code and resources a specific device needs to run the application.

Summary

In this article we analyzed five myths and two doubts about Android performance optimization. In some cases the data were not clear-cut; what matters is running proper performance tests before optimizing and changing code.

There are already many tools that help analyze and evaluate how to optimize applications, such as the profilers in Android Studio, which also provide battery and network monitoring. You can go deeper with tools such as Perfetto and Systrace, which show in more detail what happens during application startup or execution.

Jetpack Benchmark takes away all the complexity of monitoring and benchmarking. Google strongly recommends using it in continuous integration to track performance and observe how the application behaves as features are added. And one last reminder: do not profile application performance in debug mode.


Origin blog.csdn.net/weixin_61845324/article/details/131648407