iOS Performance Optimization: Analysis & Application

1. Performance indicators

An app's performance indicators fall mainly into CPU, GPU, memory, battery consumption, and network loading. Network loading is covered later in this article. Battery consumption is largely determined by CPU, GPU, and network activity, so those three serve as the underlying indicators.

1. CPU occupancy

An iOS app runs as a single process and involves no cross-process communication (apart from app extensions).

1.1 Thread usage

Threads and inter-thread communication both cost CPU time. The more threads an app starts, the higher its CPU usage. Data shared between threads has to be protected by locks to stay thread-safe, and waiting on those locks also lengthens the life cycle of the threads.

Note when using threads:

  • Avoid heavy locking inside concurrent queues. Where a lock is unavoidable, keep the locked code as short and simple as possible, or use a serial queue to achieve synchronization directly.
  • Give special-purpose work its own queues; SDWebImage, for example, uses separate queues for storage (serial), decoding (serial), and download (concurrent).
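The serial-queue approach mentioned above can be sketched as follows. This is only a minimal illustration: the class name `SafeCounter` and the queue label are made up for the example, and a real component would choose its own granularity.

```objc
#import <Foundation/Foundation.h>

// Hypothetical thread-safe counter: a private serial queue replaces explicit locks.
@interface SafeCounter : NSObject
- (void)incrementKey:(NSString *)key;
- (NSInteger)countForKey:(NSString *)key;
@end

@implementation SafeCounter {
    dispatch_queue_t _queue;   // serial: runs one block at a time
    NSMutableDictionary<NSString *, NSNumber *> *_counts;
}

- (instancetype)init {
    if ((self = [super init])) {
        _queue = dispatch_queue_create("com.example.safecounter", DISPATCH_QUEUE_SERIAL);
        _counts = [NSMutableDictionary dictionary];
    }
    return self;
}

- (void)incrementKey:(NSString *)key {
    // Writes are enqueued asynchronously, so callers never block.
    dispatch_async(_queue, ^{
        self->_counts[key] = @(self->_counts[key].integerValue + 1);
    });
}

- (NSInteger)countForKey:(NSString *)key {
    __block NSInteger result = 0;
    // Reads run synchronously on the same queue, after all earlier writes.
    dispatch_sync(_queue, ^{
        result = self->_counts[key].integerValue;
    });
    return result;
}
@end
```

All access to `_counts` goes through one queue, so no lock is needed. Note that `countForKey:` must not be called from a block already running on `_queue`, because `dispatch_sync` onto the current serial queue deadlocks.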

1.2 Time-consuming method calls

Common time-consuming scenarios are as follows.

  • Object creation: creating an object allocates memory and sets up its properties. Some classes are especially expensive to create, such as NSDateFormatter and NSCalendar.
  • Layout calculation: computing view layout costs CPU time, and how much depends on the complexity of the layout logic at runtime.
  • Image drawing: this usually means drawing into a canvas with the functions prefixed with CG, then creating a picture from the canvas and displaying it.
  • Picture decoding: when a picture is assigned to UIImageView or CALayer.contents, the data in the CGImage is decoded before the CALayer is committed to the GPU (see "5 kinds of picture thumbnail technology and performance discussion of iOS").
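For the object-creation case, a common remedy is to create the expensive object once and reuse it. The sketch below assumes a fixed timestamp format; the function names `SharedTimestampFormatter` and `TimestampString` are hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: build the formatter once and reuse it on every call.
static NSDateFormatter *SharedTimestampFormatter(void) {
    static NSDateFormatter *formatter;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        formatter = [[NSDateFormatter alloc] init];
        formatter.locale = [NSLocale localeWithLocaleIdentifier:@"en_US_POSIX"];
        formatter.dateFormat = @"yyyy-MM-dd HH:mm:ss";   // example format only
    });
    return formatter;
}

static NSString *TimestampString(NSDate *date) {
    return [SharedTimestampFormatter() stringFromDate:date];
}
```

The shared formatter should be treated as read-only after creation; changing its format from several call sites would reintroduce the problems the cache is meant to avoid.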

1.3 I/O operations  

I/O operations are the reading, writing, and updating of files. Disk I/O is far slower than the CPU and memory. When a file is read or written, most of the cost is the I/O itself, with a small share of CPU and memory consumption.

Because I/O is slow, methods that perform it take noticeably longer while the app is running. Files are therefore usually read and written on background threads so that the main thread is not blocked. File size and file count affect how much thread resource is needed, which ultimately determines the CPU overhead.
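As a rough sketch of keeping file reads off the main thread, the helper below reads on a background queue and hands the result back on the main queue. The name `ReadFileAsync` and the choice of the utility QoS class are assumptions for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: read a file in the background, deliver the result on the main queue.
static void ReadFileAsync(NSString *path,
                          void (^completion)(NSData *data, NSError *error)) {
    dispatch_async(dispatch_get_global_queue(QOS_CLASS_UTILITY, 0), ^{
        NSError *error = nil;
        // Memory-map the file when it is safe to do so, which avoids copying large files.
        NSData *data = [NSData dataWithContentsOfFile:path
                                              options:NSDataReadingMappedIfSafe
                                                error:&error];
        dispatch_async(dispatch_get_main_queue(), ^{
            completion(data, error);   // data is nil when the read failed
        });
    });
}
```

When many files are involved, a single serial I/O queue is often a better fit than one global-queue block per file, since it bounds the number of threads in use.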

1.4 CPU usage analysis

The CPU detection tool that comes with Xcode:

Third-party open source CPU detection components:

  • Didi's DoraemonKit is an efficiency platform for the entire life cycle of pan-front-end product development.

2. GPU rendering - FPS

FPS is short for Frames Per Second, the number of frames drawn per second. It is closely related to what we often call the "refresh rate" (in Hz), which sets the upper bound on the frame rate a screen can show. FPS measures how much information is used to store and display moving images. The more frames per second, the smoother the picture; the lower the FPS, the more the interface stutters. The value therefore measures, to a certain extent, how well the application performs at image drawing and rendering. The standard screen refresh rate on iOS is 60 Hz (60 times per second); ProMotion displays can go up to 120 Hz. Page rendering optimizations are listed below by scenario.
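A simple in-app FPS reading can be obtained by counting CADisplayLink callbacks, which is roughly what open source FPS components do. The class below is a minimal sketch under that assumption; the name `FPSMonitor` and the one-second window are illustrative choices.

```objc
#import <UIKit/UIKit.h>

// Hypothetical FPS monitor: counts CADisplayLink callbacks over one-second windows.
@interface FPSMonitor : NSObject
@property (nonatomic, copy) void (^onUpdate)(double fps);
- (void)start;
- (void)stop;
@end

@implementation FPSMonitor {
    CADisplayLink *_link;
    NSUInteger _frameCount;
    CFTimeInterval _windowStart;
}

- (void)start {
    if (_link) return;
    _frameCount = 0;
    _windowStart = 0;
    // CADisplayLink retains its target; -stop must be called to break the cycle.
    _link = [CADisplayLink displayLinkWithTarget:self selector:@selector(tick:)];
    // Common modes keep the link firing while a scroll view is tracking.
    [_link addToRunLoop:[NSRunLoop mainRunLoop] forMode:NSRunLoopCommonModes];
}

- (void)stop {
    [_link invalidate];
    _link = nil;
}

- (void)tick:(CADisplayLink *)link {
    if (_windowStart == 0) {            // first callback only opens the window
        _windowStart = link.timestamp;
        return;
    }
    _frameCount++;
    CFTimeInterval elapsed = link.timestamp - _windowStart;
    if (elapsed < 1.0) return;
    double fps = _frameCount / elapsed;
    _frameCount = 0;
    _windowStart = link.timestamp;
    if (self.onUpdate) self.onUpdate(fps);
}
@end
```

Because the link runs on the main run loop, a blocked main thread delays the callbacks, and the reported value drops accordingly.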

The FPS detection tool that comes with Xcode:

Third-party open source FPS detection components:

  • Didi's DoraemonKit is an efficiency platform for the entire life cycle of pan-front-end product development.

3. Memory

The memory discussed here is mainly the memory cache; memory management is not covered in depth. If you are interested, see my earlier post, "IOS memory management".

Every iPhone has a fixed amount of physical memory, the 2 GB or 4 GB of RAM quoted in hardware specs. The system itself uses part of it, and running apps share the rest.

Unlike Android, iOS has no fixed per-app memory allocation rule, so a running app can sometimes use hundreds of megabytes or even more than 1 GB. Such unrestrained memory consumption triggers memory warnings, however, and eventually leads to the process being killed.

Memory usage scenarios:

  • Temporary/local: memory requested temporarily and released after use, such as the data source cache of a secondary page.
  • Static/global: static memory, that is, constants and objects declared with static, const, or extern (singleton objects, global arrays).

Memory caching strategy: MemoryCache

  • Plain cache: NSDictionary, NSArray, NSSet, or NSPointerArray / NSMapTable / NSHashTable (which support weak references).
  • Cache with an eviction policy: LRU, LFU, or NSCache (LFU takes precedence over LRU).
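As a small illustration of the NSCache option, the sketch below wraps a shared cache with a count limit and a cost limit. The helper names and the limit values are assumptions picked for the example, not recommendations.

```objc
#import <Foundation/Foundation.h>

// Hypothetical shared data cache built on NSCache.
static NSCache<NSString *, NSData *> *SharedDataCache(void) {
    static NSCache<NSString *, NSData *> *cache;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        cache = [[NSCache alloc] init];
        cache.countLimit = 100;                    // illustrative value
        cache.totalCostLimit = 50 * 1024 * 1024;   // illustrative value, about 50 MB
    });
    return cache;
}

static void CacheData(NSData *data, NSString *key) {
    // The byte length is used as the cost so that large entries are accounted for.
    [SharedDataCache() setObject:data forKey:key cost:data.length];
}

static NSData *CachedData(NSString *key) {
    return [SharedDataCache() objectForKey:key];   // nil if missing or already evicted
}
```

NSCache is thread-safe and may evict entries on its own, so callers should always be prepared for a miss; the two limits are hints to the cache rather than strict guarantees.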

The memory detection tool that comes with Xcode:

Third-party open source memory monitoring components:

  • Facebook's FBMemoryProfiler, which analyzes iOS memory usage and detects retain cycles; it only covers Objective-C.
  • Tencent's OOMDetector, which provides OOM monitoring, large-allocation monitoring, and memory leak detection, and supports monitoring C++ objects, malloc memory blocks, and VM memory.

2. Scenario application

1. Startup

A cold start on iOS is divided into pre-main and main, that is, the parts before and after the entry to the main function. Plenty of material on this is available online, so only a brief overview is given here. For the details, ByteDance's official blog posts are recommended: Douyin-iOS Startup Optimization Principles, Douyin-iOS Startup Optimization Practical Articles, and Douyin-Based on The solution to binary file rearrangement.

1.1 Pre-main

1) Specific process

  • Dyld: the dynamic linker. After the kernel has prepared the program, dyld takes care of the rest of the work.
  • Load dylibs: load the dynamic libraries. On iOS, dynamic libraries are dylibs and dynamic frameworks; static libraries are .a files and static frameworks.
  • Rebase: adjust the pointers inside the image. The image is read into memory and decrypted and verified page by page to ensure it has not been tampered with, so the cost is mainly I/O.
  • Bind: look up the symbol table and set the pointers that point outside the image. The cost is mainly CPU computation.
  • Objc: read all classes and register the class objects in the global table; read all categories and attach them to their classes; check selector uniqueness.
  • Initializers: dyld starts running the program's initializers. It calls the +load method of each Objective-C class and category, calls C/C++ constructor functions, and creates C++ static global variables of non-basic types.

2) Optimization strategy

  • Find unused dylibs and reduce the number of dylibs.
  • Use binary reordering to reduce the page faults incurred while pages are loaded.
  • Reduce the number of Objective-C classes, methods (selectors), and categories.
  • Do as little as possible in +load, and defer the work to +initialize wherever you can.
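Moving work from +load to +initialize might look like the sketch below. The class `ThemeManager` and its setup method are hypothetical; the point is only where the work is triggered.

```objc
#import <Foundation/Foundation.h>

@interface ThemeManager : NSObject   // hypothetical class
@end

@implementation ThemeManager

// Before: +load runs for every class during pre-main, whether or not the class is used.
// + (void)load { [self registerDefaultThemes]; }

// After: +initialize runs lazily, just before the class receives its first message.
+ (void)initialize {
    // +initialize is also invoked on behalf of subclasses that do not override it.
    if (self != [ThemeManager class]) return;
    [self registerDefaultThemes];
}

+ (void)registerDefaultThemes {
    // Placeholder for the setup work that used to live in +load.
}
@end
```

This only suits work that is not needed before the class is first used; method swizzling that must be in place from the start is one case that usually stays in +load.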

1.2 Main

1) Specific process

2) Optimization strategy

  • If SDK registration is time-consuming, register SDKs asynchronously and concurrently; SDKs used only by certain secondary pages can be loaded lazily.
  • Avoid running too many API requests serially at startup; streamline them as far as possible.
  • Avoid expensive operations right after startup, such as frequent I/O reads and writes, data decoding, and other time-consuming method calls.
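The first strategy can be sketched in an app delegate as follows. The three `setUp...` methods are placeholders standing in for real SDK calls, and whether a given SDK may be initialized off the main thread has to be checked against that SDK's documentation.

```objc
#import <UIKit/UIKit.h>

@interface AppDelegate : UIResponder <UIApplicationDelegate>
@end

@implementation AppDelegate

- (BOOL)application:(UIApplication *)application
    didFinishLaunchingWithOptions:(NSDictionary<UIApplicationLaunchOptionsKey, id> *)launchOptions {
    // Needed before the first screen: stays on the main thread, kept short.
    [self setUpCrashReporting];

    // Not needed for the first frame: registered in the background.
    dispatch_async(dispatch_get_global_queue(QOS_CLASS_UTILITY, 0), ^{
        [self setUpAnalytics];
    });
    return YES;
}

// Used only by a secondary page: initialized on first use, exactly once.
- (void)ensureMapSDKReady {
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        [self setUpMapSDK];
    });
}

- (void)setUpCrashReporting { /* hypothetical SDK call */ }
- (void)setUpAnalytics      { /* hypothetical; must be safe off the main thread */ }
- (void)setUpMapSDK         { /* hypothetical SDK call */ }

@end
```

The secondary page would call `ensureMapSDKReady` before it first touches the SDK, so the cost is paid only by users who actually open that page.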

2. Pages

2.1 Native page - rendering principle

1) View rendering

A view is displayed by its layer; the view itself mainly handles events along the touch responder chain. UIView provides the drawing API drawRect:, in which you can obtain the graphics context and draw. Call setNeedsDisplay to trigger a redraw.

After subviews are added to a view, drawing happens and layoutSubviews is called in a callback of the main run loop. When a subview's layout changes, drawing and layoutSubviews again happen in a main run loop callback. Several changes made within the same run loop pass therefore result in a single call; layoutSubviews is called multiple times only when the changes fall in different run loop passes.

As mentioned above, a view is essentially a layer. A layer has contents, which point to a buffer known as the backing store. Core Animation provides the rendering engine, and underneath it is GPU rendering implemented on OpenGL ES. The process is roughly as follows:

  1. Initialize the context EAGLContext for drawing;
  2. Create frame buffer and render buffer, set the width and height of the canvas;
  3. Add attachments, such as color attachments or depth attachments;
  4. Switch to the frame buffer and draw in the frame buffer;
  5. Switch to the screen buffer and read the information in the frame buffer;
  6. Draw to the screen; delete the buffers when the container is deallocated.

  

2) GPU off-screen rendering

  • On-screen rendering means the GPU renders into the screen buffer currently used for display.
  • Off-screen rendering means the GPU opens a new buffer outside the current screen buffer and renders there.

The main costs of off-screen rendering are creating the new buffer and switching back and forth between the screen buffer and the off-screen buffer.

On iOS, off-screen rendering is mainly triggered by certain layer properties. Common ones are mask, clipping (masksToBounds), group opacity, shadow, rasterization (shouldRasterize), and rounded corners (cornerRadius combined with clipping). Off-screen rendering makes the app's interaction less smooth, for example in a complex list mixing text and images, so avoid it where possible and achieve the effect by other means.
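Two common ways of getting the same visual result by other means are sketched below: baking rounded corners into the bitmap instead of clipping the layer, and giving a shadow an explicit path. The helper names are hypothetical, and UIGraphicsImageRenderer requires iOS 10 or later.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper: draw the image into a rounded clip once, instead of
// setting cornerRadius + masksToBounds on the layer that displays it.
static UIImage *RoundedImage(UIImage *image, CGSize size, CGFloat radius) {
    UIGraphicsImageRenderer *renderer = [[UIGraphicsImageRenderer alloc] initWithSize:size];
    return [renderer imageWithActions:^(UIGraphicsImageRendererContext *context) {
        CGRect rect = CGRectMake(0, 0, size.width, size.height);
        [[UIBezierPath bezierPathWithRoundedRect:rect cornerRadius:radius] addClip];
        [image drawInRect:rect];
    }];
}

// Hypothetical helper: an explicit shadowPath tells Core Animation the shadow's
// shape, so it does not have to derive it from the layer's contents.
static void ApplyShadow(UIView *view) {
    view.layer.shadowColor = UIColor.blackColor.CGColor;
    view.layer.shadowOpacity = 0.3f;
    view.layer.shadowOffset = CGSizeMake(0, 2);
    view.layer.shadowPath = [UIBezierPath bezierPathWithRect:view.bounds].CGPath;
}
```

The rounded image can be produced once, cached, and reused by list cells. The shadow path has to be updated whenever the view's bounds change.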

2.2 Native page - complex layout

Complex layouts on native pages generally come in two common forms:

  • List pages with varied cell styles, such as Weibo, Space, and Moments. They are characterized by low cell reuse and complex layer elements inside each cell.
  • Graphics-drawing pages such as stock candlestick charts, image editing, and dynamic charts. Their main characteristic is that graphics must be drawn and displayed quickly on a fixed canvas according to the data source and the scenario's requirements.

2) Low reuse list

  • Off-screen rendering: mask, clip, opacity, shadow, rasterize, and cornerRadius all cause off-screen rendering. At a low refresh frequency this does not drop frames; the stutter mainly appears when the list scrolls quickly.
  • View overdrawing: use CALayer instead of UIView for non-interactive elements in the cell; reduce layer nesting and the number of layers; use asynchronous rendering, drawing complex layer elements into a bitmap on a background thread and then switching back to the main thread to display it.
  • Data loading: use lazy loading (fetch when needed) or preloading (fetch in advance), depending on the scenario; fetch and process data on background threads (I/O operations, image transcoding, and so on).

3) Frequent canvas redrawing

  • Unify event sources: route timer and touch event sources through a single entry point so that they do not trigger layer redraws too frequently.
  • Reduce partial refreshes: refresh at the rate at which the overall data source changes, and reduce the frequency of local refreshes (similar to unifying the event sources).
  • Reduce layer nesting: in graphics-drawing scenes, use fewer layers and less nesting, and draw with the more efficient Core Graphics API.
  • Use asynchronous drawing: draw complex layer elements into a bitmap on a background thread, then switch back to the main thread to display it.
/* Asynchronous drawing; works best on views that redraw frequently (drawing apps, UITableViewCell, and so on) */
- (void)drawsAsynchronously:(void(^)(CGContextRef context))drawsBlock imageBlock:(void(^)(UIImage* image))imageBlock {
    /* Draw on a background thread; the final refresh still happens on the UI thread */
    CGSize size = self.bounds.size;
    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        UIGraphicsBeginImageContext(size);
        CGContextRef context = UIGraphicsGetCurrentContext();
        drawsBlock(context);
        UIImage *resultImage = UIGraphicsGetImageFromCurrentImageContext();
        UIGraphicsEndImageContext();
        dispatch_async(dispatch_get_main_queue(), ^{
            imageBlock(resultImage);
        });
    });

    /* Alternatively, use the system-provided property for asynchronous drawing:
     self.layer.drawsAsynchronously
     */
}

2.2 Native page - animation effect

UI animation often carries a relatively high performance cost. The most common animations in iOS projects are frame animation and Core Animation. Frame animation is implemented either by assigning frame images to a UIImageView or by using a GIF component.

UIImageView animation suits scenes with a small number of frames: it skips GIF parsing and configures the frame images directly.

GIF playback is expensive in CPU and memory (file parsing -> caching -> timer -> decoding and display). You can use FLAnimatedImage or YYImage (local files) or SDWebImage (network), all of which optimize GIF rendering. FLAnimatedImage, for instance, optimizes GIF rendering in three ways: asynchronous parsing of the GIF data, the use of CADisplayLink, and a caching strategy based on the size of the GIF data (see the picture below).

Even with these optimizations, GIFs remain costly when they have many frames or large frame images, especially on pages that render several GIFs at once. Lottie solves this problem well. It is a cross-platform animation framework for mobile and web: designers use the Bodymovin plug-in to export an animation as JSON, which is then rendered on mobile and web.

Animation conflicts also cause noticeable stutter. For example, if a view controller brings up the keyboard immediately while it is being pushed, the transition stutters or the keyboard's pop-up animation is lost. Bringing up the keyboard asynchronously avoids this.

Core Animation includes basic, keyframe, group, and transition animations, all of which can be implemented directly with the APIs the system provides.
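The asynchronous call mentioned above can be as small as deferring becomeFirstResponder by one main-queue hop. The view controller below is a hypothetical example; the frame values are arbitrary.

```objc
#import <UIKit/UIKit.h>

@interface SearchViewController : UIViewController   // hypothetical page
@property (nonatomic, strong) UITextField *textField;
@end

@implementation SearchViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    self.textField = [[UITextField alloc] initWithFrame:CGRectMake(16, 100, 280, 40)];
    [self.view addSubview:self.textField];
}

- (void)viewDidAppear:(BOOL)animated {
    [super viewDidAppear:animated];
    // Defer the keyboard to a later main-queue turn so that it does not
    // compete with the push transition.
    __weak typeof(self) weakSelf = self;
    dispatch_async(dispatch_get_main_queue(), ^{
        [weakSelf.textField becomeFirstResponder];
    });
}

@end
```

The weak reference keeps the block from messaging a controller that was popped before the block ran.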

2.3 Web pages

1) Long white screen time

  • Resource localization: a common problem with web pages is a long white screen, because the HTML, the CDN resource files, and the page's network requests have to load in sequence. Loading a local H5 resource package, or intercepting CDN resource requests and mapping them to local files, shortens the white screen. For a concrete implementation, see "H5 resource localization strategy - IOS".
  • Skeleton screen: a loading spinner is shown while the page fetches network data, but when the API responds slowly the spinner just keeps turning. This is where a skeleton screen comes in. After the page has loaded its web resources, a skeleton generated when the web page is packaged shows the general structure of the page in advance (see "Vue page skeleton screen injection practice"); alternatively, placeholders set on each UI component show the structure in advance.

2) Picture display

  • Compress on upload: this reduces network loading time and the rendering cost of large images.
  • Image placeholder: this prevents the page from jumping when the image finishes loading.

2.4 Network Acceleration

1) Image loading supports WebP

WebP is an image file format that provides both lossy and lossless (reversible) compression. It is derived from the VP8 video codec. It was developed by Google after its purchase of On2 Technologies and is released under the terms of the BSD license.

The specific implementation process:

  1. The server supports serving pictures as WebP;
  2. Hook the file download API and append the suffix '.webp' to the image URL;
  3. Load the WebP resource file;
  4. SDWebImage provides a WebP decoder; just register it when the app starts;
  5. The WebP data is decoded into a bitmap, and the picture is displayed.
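Steps 2 and 4 above can be sketched as follows. The code assumes SDWebImage 5.x together with its separate SDWebImageWebPCoder package; the function names `WebPURL` and `RegisterWebPCoder` are hypothetical, and the suffix rule has to match whatever the server actually expects.

```objc
#import <Foundation/Foundation.h>
#import <SDWebImage/SDWebImage.h>
#import <SDWebImageWebPCoder/SDWebImageWebPCoder.h>

// Step 2 (hypothetical helper): append ".webp" to the path, leaving the query untouched.
static NSURL *WebPURL(NSURL *url) {
    NSURLComponents *components = [NSURLComponents componentsWithURL:url
                                             resolvingAgainstBaseURL:NO];
    NSString *path = components.path;
    if (path.length == 0 ||
        [path.pathExtension.lowercaseString isEqualToString:@"webp"]) {
        return url;   // nothing to rewrite
    }
    components.path = [path stringByAppendingString:@".webp"];
    return components.URL ?: url;
}

// Step 4: register the WebP coder once, for example at app launch.
static void RegisterWebPCoder(void) {
    [[SDImageCodersManager sharedManager] addCoder:[SDImageWebPCoder sharedCoder]];
}
```

With the coder registered, an image view can load `WebPURL(originalURL)` through the usual `sd_setImageWithURL:` call.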

2) HttpDNS resolution

HttpDNS resolves domain names over the HTTP protocol, replacing the existing UDP-based DNS protocol. The resolution request is sent directly to Alibaba Cloud's HTTPDNS server, bypassing the carrier's Local DNS and avoiding the domain name hijacking and inaccurate scheduling that Local DNS can cause.

HttpDNS resolves the existing domain name into an IP address, and network access then goes through a direct IP connection. Most apps on the market implement this with the SDKs provided by Alibaba Cloud and Tencent Cloud.

HTTPDNS_Domain Name Resolution_Domain Name Hijacking Prevention_Development and O&M-Alibaba Cloud

Mobile resolution HttpDNS_Mobile Internet domain name resolution_Domain name anti-hijacking- Tencent Cloud

The specific implementation process:

  1. Redirect the request through NSURLProtocol;
  2. Obtain IP information after domain name resolution;
  3. Replace the domain name of the original request URL with the IP;
  4. Resend the request to achieve IP direct connection.
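Step 3 above, rewriting the request, can be sketched with Foundation alone. The helper name is hypothetical, and the IP address is assumed to come from an HTTPDNS lookup done elsewhere (for example through a vendor SDK).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: swap the URL's host for a resolved IP and keep the
// original domain in the Host header.
static NSURLRequest *RequestByReplacingHost(NSURLRequest *request, NSString *ip) {
    NSURLComponents *components = [NSURLComponents componentsWithURL:request.URL
                                             resolvingAgainstBaseURL:NO];
    NSString *originalHost = components.host;
    if (originalHost.length == 0 || ip.length == 0) {
        return request;   // nothing to rewrite; fall back to normal DNS
    }

    components.host = ip;
    NSMutableURLRequest *rewritten = [request mutableCopy];
    rewritten.URL = components.URL;
    // The server still needs the original domain to route the request.
    [rewritten setValue:originalHost forHTTPHeaderField:@"Host"];
    return rewritten;
}
```

In an NSURLProtocol subclass this would typically be applied before the request is re-sent. For HTTPS, the certificate must still be validated against the original domain rather than the IP, which needs extra handling in the session's authentication challenge callback.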

3) Use network caching + request data compression + screen-by-screen loading of API data

3. Compile and package

1. Compile and package optimization

After a project has been iterated on for a long time, Run/Archive grows from a few minutes at the start to ten or twenty minutes. Part of this comes down to the Mac hardware in use; the rest is due to the growing complexity of the project structure, or to bloat caused by poor design.

  • Project configuration, Build Settings: set Optimization Level, Debug Information Format, and Build Active Architecture Only (see "how iOS speeds up compilation").
  • Use CocoaPods to avoid circular references, ease version iteration, and configure static linking.
  • Reduce the number of dynamic libraries, or convert module code into static libraries (.a + bundle / framework).
  • Check for redundant resources, either by inspecting the project code or with a third-party checking tool, and remove the redundant files.
  • Compress and merge resources: compress images with TinyPNG, merge asset catalogs, and merge Objective-C classes (too many utility classes, over-split APIs, libraries with duplicated functionality, and so on); avoid excessive encapsulation.
  • Avoid overusing the PCH. Anything imported in the PCH is visible to every Objective-C class in the project, so whenever a class referenced in the PCH changes, the next run takes a long time to build. Put only relatively stable, widely used declarations in the PCH.

2. Package size optimization

Apps with many native features end up with a fairly large ipa after a period of iteration: tens of megabytes, or even one or two hundred. At that point package size optimization needs to be considered. Two official ByteDance blog posts are recommended: Toutiao iOS installation package size optimization, and Douyin package iOS installation package size optimization.

  • Check for redundant resources, either by inspecting the project code or with a third-party checking tool, and remove the redundant files.
  • Host larger built-in resource files in the cloud and deliver them on demand instead, including pictures, audio and video files, and so on.
  • Compress and merge resources: compress images with TinyPNG, merge asset catalogs, and merge Objective-C classes (too many utility classes, over-split APIs, libraries with duplicated functionality, and so on); avoid excessive encapsulation.
  • Project configuration, Build Settings: set Asset Optimization to space and Link-Time Optimization to Incremental.

4. APP Stability

1. Crash problem

1.1 Collection/Measurement

  • Integrate Bugly or Fabric (better timeliness and accuracy) to collect and analyze crashes.
  • Use Instruments (Zombies), which comes with Xcode, to detect zombie objects; this is mainly for measurement before the app is released.
  • Use Xcode's Organizer (Crashes) to view crash logs reported by users and analyze the app after it has been released.

1.2 Common crash optimization

  • Data fault tolerance: for cases such as out-of-bounds array access and unexpected object types in dictionaries, the common practice is to add category methods to NSArray and NSDictionary, or to use aspect-oriented programming to run fault-tolerance logic before the original IMP is called.
  • System API exceptions: every major iOS release requires a full compatibility test of the app, so that compatibility problems caused by the new system can be fixed.
  • Page stack exceptions: pushing and popping pages too frequently corrupts the navigation stack. Limiting the push/pop frequency in BaseNavigationVC is enough; if you do not use a base class, implement it with a hook.
  • Missing methods or properties: replacing parts of a system UI component's structure can crash when the system calls a property or method that the replacement lacks. Adding the property or method to the class at the corresponding level fixes it (for example, when replacing UITabBar's internal UITabBarItem element, the QFTabBarItem class needs image and title properties).
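The category approach to data fault tolerance can be sketched as below. The method and function names (`safe_objectAtIndex:`, `SafeStringForKey`) are hypothetical; the `safe_` prefix is only there to avoid colliding with system selectors.

```objc
#import <Foundation/Foundation.h>

@interface NSArray<ObjectType> (SafeAccess)   // hypothetical category
// Returns nil instead of throwing NSRangeException when the index is out of bounds.
- (ObjectType)safe_objectAtIndex:(NSUInteger)index;
@end

@implementation NSArray (SafeAccess)
- (id)safe_objectAtIndex:(NSUInteger)index {
    return index < self.count ? self[index] : nil;
}
@end

// Hypothetical helper: returns the value only if the container really is a
// dictionary and the value really is a string (for example after JSON parsing).
static NSString *SafeStringForKey(id container, NSString *key) {
    if (![container isKindOfClass:[NSDictionary class]]) return nil;
    id value = ((NSDictionary *)container)[key];
    return [value isKindOfClass:[NSString class]] ? value : nil;
}
```

Returning nil hides the bad data from the caller, so it is worth logging or reporting such cases in debug builds rather than swallowing them silently.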

2. Stuttering problem

2.1 Collection/Measurement

  • Integrate Bugly or Firebase Performance Monitoring to collect and analyze freezes.
  • Use Instruments (Core Animation / Time Profiler), which comes with Xcode, to measure FPS and find time-consuming APIs.
  • Use the DoraemonKit debug tool to collect freeze information in debug mode.

2.2 Common freeze optimization

  • Time-consuming method optimization: this covers data encoding and decoding, time-consuming system APIs, I/O operations, heavy traversal logic, and other operations that block the UI thread. These were described in detail above and are not repeated here.
  • Page rendering optimization: see the 'Pages' section above for details.

Attachment & code:   IOS performance optimization-extended graph.xmind , MemoryCache , APM-realization of simple functions

Origin blog.csdn.net/z119901214/article/details/120403321