Baidu APP iOS Package Size 50M Optimization Practice (2): Image Optimization


1. Introduction

The previous article covered why package size optimization matters, what an installation package contains and how it is generated, how the package sizes of major apps in China and abroad compare, and Baidu APP's technical approach to package size optimization along with its benefits. This article focuses on image optimization. After decompressing the IPA package, we found that the images in the Asset Catalog and bundles of Baidu APP total 94 MB, which makes them our key optimization target.

Earlier article in this series:

"Baidu APP iOS Package Size 50M Optimization Practice (1): Overview"

Baidu APP optimizes image resources in three ways:

First, unused image removal. As versions iterate, some images are no longer referenced anywhere yet still ship in the IPA package. Finding and deleting them has the highest ROI of all package size optimizations: the impact is confined to a single component and the quality risk is controllable. The key is to improve the accuracy of unused image detection.

Second, Asset Catalog optimization. Images managed by the Asset Catalog can be processed by App Thinning in the App Store, so users download only the image resources that match the resolution of their device, which reduces the download size.

Third, HEIC image optimization. Compared with PNG, JPEG, and WebP, the HEIC encoding format produces the smallest files. In terms of decoding efficiency, HEIC is hardware decoded and is faster than WebP.

2. Unused image optimization

2.1 Overview of the scheme

First, collect all image resources. Then use a script to collect the static strings that may reference an image from Objective-C, Swift, xib, storyboard, html, js, css, json, and plist files. Diff the two sets to find unreferenced images. Finally, apply a secondary filter for the common string-splicing cases. The more cases are covered, the higher the accuracy, though ROI should also be considered.

2.2 Get all images

Work from the source code of each library and use a script to find every image and the library it belongs to, which makes it easier to assign and carry out the later optimization work. Working from binary libraries or an IPA package without source code causes a lot of trouble: images inside Assets.car are hard to extract, and you learn only the name of an image, not the library it belongs to. With the source code available, a script can traverse the directories recursively to collect all images, and the image path tells you the owning library. The reference code is as follows:

import os

IMAGE_EXTENSIONS = {".png", ".webp", ".gif", ".jpeg", ".jpg"}

def findAllPictures(path):
    """Recursively print every image file found under path."""
    for name in os.listdir(path):
        child = os.path.join(path, name)
        if os.path.isfile(child):
            # Extension of the file that was just read
            end = os.path.splitext(child)[-1]
            if end.lower() in IMAGE_EXTENSIONS:
                print("file " + child + " extension " + end)
        else:
            # Recurse into the subdirectory
            findAllPictures(child)

2.3 Obtain static strings that may reference images

In this step, the goal is to collect the set of strings in the code that may reference images. For example, in an Objective-C .m file we often load the image account_login to create a UIImageView object with the code below. Filtering the contents of Objective-C .m files with the regular expression @"(.*?)" yields the set of all strings that may load images.

imgView.image = [UIImage imageNamed:@"account_login"];

For Swift files, we usually load the image account_login with the following code. The syntax is different, so for Swift files the regular expression should be "(.*?)".

let imageView = UIImageView(frame: CGRect(x: 100, y: 10, width: 200, height: 200))
imageView.image = UIImage(named:"account_login.jpg")
self.view.addSubview(imageView)

For html files, we usually load images with the following code, and the regular expression should be img\s+src=["'](.*?)["'].

<html>
<body>
<img src="image path" alt="text description" width=** height=**>
</body>
</html>

Different file types load images in different ways: Objective-C, Swift, xib, storyboard, html, js, plist, json, and css all differ. The table below lists the regular expressions used to extract image references from common file types.

[Table: regular expressions for extracting image references, by file type]
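
As a rough illustration of this step, the sketch below applies one pattern per file type to a piece of source text. Only the three file types discussed above are covered. The Objective-C pattern comes from the text; the Swift and HTML patterns are written in the same non-greedy form, which is an assumption, and the names PATTERNS and extract_strings are hypothetical.

```python
import re

# One pattern per file extension. The .m pattern is the one given in the text;
# the .swift and .html patterns are assumed to follow the same non-greedy form.
PATTERNS = {
    ".m": re.compile(r'@"(.*?)"'),
    ".swift": re.compile(r'"(.*?)"'),
    ".html": re.compile(r'img\s+src=["\'](.*?)["\']'),
}

def extract_strings(source_text, extension):
    """Return the set of string literals that might name an image."""
    pattern = PATTERNS.get(extension)
    if pattern is None:
        # File type not covered by this sketch
        return set()
    return set(pattern.findall(source_text))
```

Calling extract_strings on the contents of every source file and taking the union of the results gives the set of candidate image references for the whole project.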

2.4 Get unreferenced images

Section 2.2 gives us all the images in the project, and Section 2.3 gives us all the static strings in the code that may reference an image. Any image whose name is not in that string set is probably unreferenced.
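
The diff described above can be sketched as follows. This is a minimal illustration rather than the script Baidu APP uses: the function name find_unreferenced is hypothetical, and it assumes code may refer to an image with or without its extension and without the @2x/@3x scale suffix.

```python
import os

def find_unreferenced(image_paths, referenced_strings):
    """Return the image paths whose name never appears as a static string."""
    unreferenced = []
    for path in image_paths:
        file_name = os.path.basename(path)            # account_login@2x.png
        stem, extension = os.path.splitext(file_name)  # account_login@2x, .png
        base = stem.split("@")[0]                     # account_login
        # Assumed spellings under which code might reference this image
        candidates = {file_name, stem, base, base + extension}
        if candidates.isdisjoint(referenced_strings):
            unreferenced.append(path)
    return unreferenced
```

The result is only a list of candidates; the string-splicing cases in the next section still have to be filtered out before anything is deleted.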

2.5 Secondary filtering for common cases of string splicing

The steps above find unreferenced images by exact string matching. In real code, however, image names are often built by string splicing, so a secondary filter is needed. The first common case is dark mode: when the app supports it, there are usually two sets of images for the same position, distinguished by suffixes such as light/dark or day/night. The second common case is image sequences with numeric suffixes: on iOS, a UIImageView can play a frame animation from multiple images whose names consist of a string plus a dynamic numeric suffix. The filter needs to cover the following suffixes: _%d, _%ld, _%zd, _%lu.
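
A minimal sketch of such a secondary filter is shown below. It assumes that theme suffixes are appended to a base name that appears in the code as a static string, and that frame sequences are built from a format string ending in one of the suffixes listed above. The function name and the exact suffix spellings (_light, _dark, _day, _night) are illustrative assumptions.

```python
import re

THEME_SUFFIXES = ("_light", "_dark", "_day", "_night")   # assumed spellings
FORMAT_SUFFIXES = ("_%d", "_%ld", "_%zd", "_%lu")        # from the text

def is_spliced_reference(image_name, referenced_strings):
    """Return True if image_name could be produced by string splicing."""
    # Case 1: theme suffix added at run time, e.g. "tab_home" + "_dark"
    for suffix in THEME_SUFFIXES:
        if image_name.endswith(suffix):
            if image_name[:-len(suffix)] in referenced_strings:
                return True
    # Case 2: frame sequence, e.g. "loading_%d" producing "loading_12"
    match = re.fullmatch(r"(.*)_\d+", image_name)
    if match:
        prefix = match.group(1)
        return any(prefix + fmt in referenced_strings for fmt in FORMAT_SUFFIXES)
    return False
```

Images for which this returns True are dropped from the unreferenced list, since they are probably loaded through a spliced name.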

3. Asset Catalog image optimization

3.1 Background

Asset Catalog is a resource management tool provided by Xcode and introduced with iOS 7. It stores and centrally manages the resources of all sizes scattered across a project, including but not limited to images, sprites, textures, ARKit resources, and PDFs. Images and other resources that used to live in a bundle can be moved into the Asset Catalog, and Xcode compiles them into a single Assets.car file.

Before Asset Catalog, images were usually placed directly in the project bundle, which has several drawbacks. First, it wastes space: different devices need different resolutions, so the bundle holds both the @2x and @3x versions of the same image. Second, compression can only be applied file by file; there is no unified compression. Third, information is redundant: every image stores its own metadata and other attributes, so many resources of the same type repeat the same information and waste space.

3.2 Advantages of Asset Catalog

Asset Catalog addresses these bundle problems with improvements in package size, unified image compression, resource management, and I/O efficiency. The details follow:

First, package size reduction: Asset Catalog delivers tailored resources to different device types (different resolutions) and to devices of the same type with different configurations (different storage and memory). With bundles, both the @2x and @3x images must be shipped, so two copies of the same image end up on the user's phone. With Asset Catalog, a user downloading the app receives only the resources that match the hardware of the device. For example, an iPhone 8 user downloads only the @2x images and an iPhone 13 user downloads only the @3x images, which significantly reduces the download size.

Second, unified lossless image compression: by default, Asset Catalog applies lossless compression to all the images it contains. The method is Apple Deep Pixel Image Compression, a compression scheme introduced by Apple that selects the best algorithm according to the color characteristics of each image and improves the compression ratio by 15~20%. The WWDC 2018 session Optimizing App Assets (https://developer.apple.com/videos/play/wwdc2018/227/) describes it in detail. Different kinds of images are optimized differently: one kind is simple images, such as icons with relatively simple colors and design; the other is complex images. Apple Deep Pixel Image Compression optimizes these two kinds in different ways. The larger the image, the more pronounced the benefit of using Asset Catalog, and unified compression further helps reduce package size. The figure below shows the compression ratio improvement that Apple reports for Apple Deep Pixel Image Compression.

[Figure: compression ratios reported by Apple for Apple Deep Pixel Image Compression]

Third, convenient resource management: images placed directly in the project directory end up scattered throughout the IPA package after the build. Images managed by Asset Catalog in xcassets are compressed into a single Assets.car file at build time.

Fourth, efficient I/O: loading an image from the Asset Catalog takes two orders of magnitude less time than loading it from an ordinary bundle. The compiled Assets.car file contains a BOM file that provides the rendition, renditionKey, and attribute values needed for loading. A rendition is the term CoreUI.framework uses for one variant of an image resource, such as @2x or @3x. Each rendition has a corresponding renditionKey, and the attribute looked up through the renditionKey holds properties such as the image resolution and its vertical and horizontal size.

Images managed by Asset Catalog are loaded through imageNamed. Because the resource information in the car file is generated when Xcode compiles the project, once the car file has been parsed the renditionKey and attribute can be obtained directly from the image name and the image read without redundant operations. Images managed by bundles, by contrast, incur many extra operations and take much longer. There are two ways to load a bundle image. The first is imageNamed, which looks in Assets.car first. Since the image is not in Assets.car, canGetRenditionWithKey is called repeatedly while trying to obtain the rendition and renditionKey, and finally the image properties and image data are read again through mmap to build the rendition and renditionKey, which takes the most time overall. The second is imageWithContentsOfFile, which does not query Assets.car and involves no rendition or renditionKey operations and no caching. Only reading the image attributes and image binary costs time, but because nothing such as the image attributes is cached, it is still slow.
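
To make the first advantage (package size reduction) concrete, the sketch below compares what a device receives when every scale ships in a bundle with what it receives when only the matching scale is delivered. The function name and all sizes are hypothetical, and the model is deliberately simplified: it assumes every image exists at the scale of the device.

```python
def download_size(variants, device_scale):
    """variants maps image name -> {scale: size in bytes}.

    Returns (bytes shipped when every scale is in a bundle,
             bytes shipped when only the matching scale is delivered).
    """
    bundle_total = sum(sum(sizes.values()) for sizes in variants.values())
    thinned_total = sum(sizes.get(device_scale, 0) for sizes in variants.values())
    return bundle_total, thinned_total

# Hypothetical sizes in bytes for two images at @2x and @3x
example = {
    "icon": {2: 200, 3: 300},
    "background": {2: 1000, 3: 2000},
}
bundle_bytes, thinned_bytes = download_size(example, device_scale=2)
```

With these made-up numbers, a @2x device receives 1200 bytes instead of 3500.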

3.3 Assets.car generation process

Specifically, at the Asset Catalog build step, actool, the tool that builds the Asset Catalog, first decodes each PNG image in the catalog into bitmap data and then encodes and compresses it with its own compression algorithm to produce the Assets.car file. This explains why an image added to the Asset Catalog in JPG format ends up in PNG format in the generated Assets.car file.

3.4 Working with Assets.car

For images managed by Asset Catalog, Xcode does not simply copy them at build time; it packages all resources into an Assets.car file. This is a compressed file that cannot be unpacked directly. It can be analyzed with assetutil, a tool that ships with Xcode, using the following command:

sudo xcrun --sdk iphoneos assetutil --info ./Assets.car > ./AssetsInfo.json

AssetsInfo.json lists the properties of each image, but the images themselves cannot be extracted this way.

[Figure: sample contents of AssetsInfo.json]

To extract the images from a car file, we recommend the open source tool Asset Catalog Tinkerer, available on GitHub: https://github.com/insidegui/AssetCatalogTinkerer


3.5 Compression algorithm of Asset Catalog

The assetutil tool introduced in Section 3.4 reports the compression algorithm of each image: the Compression field shows which algorithm was used. In practice we found that actool supports deepmap2, deepmap_lzfse, zip, lzfse, and palette_img. Which one is used depends on many factors, such as the characteristics of the image itself, the Xcode version used for the build, the minimum iOS version supported by the framework, and the build configuration (Asset Catalog Compiler - Options > Optimization). Given these factors, actool selects the algorithm with the best compression ratio, and all of these algorithms are lossless.
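
To see which algorithms a given Assets.car actually uses, the assetutil output can be summarized with a few lines of script. The sketch below assumes the output is a JSON array of objects and that image entries carry the Compression field mentioned above; entries without that field are skipped. The function name is hypothetical.

```python
import json
from collections import Counter

def summarize_compression(info_json_text):
    """Count entries per compression algorithm in assetutil --info output."""
    entries = json.loads(info_json_text)
    counts = Counter()
    for entry in entries:
        # Skip entries that do not describe a compressed rendition
        compression = entry.get("Compression")
        if compression is not None:
            counts[compression] += 1
    return counts
```

Passing the text of the assetutil output to summarize_compression returns a Counter keyed by algorithm name, which makes it easy to compare builds made with different Xcode versions or settings.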


3.6 Do not apply lossless compression beforehand

Developers should not apply lossless compression before putting images into the Asset Catalog. A lossless compressor shrinks an image by changing how it is encoded and does not change the decoded bitmap data. As Section 3.3 explains, when generating Assets.car, actool first decodes each image to bitmap data and then encodes and compresses it. Because the bitmap data that actool receives is unchanged by lossless compression, lossless compression cannot reduce the package size. If the PNG images delivered by UI designers are going into the Asset Catalog, do not compress them losslessly.
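
The reasoning can be illustrated with a small experiment. The sketch below uses zlib (DEFLATE, the algorithm PNG itself uses) as a stand-in for a lossless PNG optimizer; it is not actool or any real optimizer, and the pixel bytes are made up. Two encodings of different effort decode to exactly the same bytes, so a tool that decodes first sees no difference between them.

```python
import zlib

# Stand-in for decoded bitmap data (hypothetical pixel bytes)
raw_bitmap = bytes(range(256)) * 64

# Two lossless encodings of the same bitmap, at different effort levels
fast_encoding = zlib.compress(raw_bitmap, 1)
best_encoding = zlib.compress(raw_bitmap, 9)

def decoded(encoding):
    """What a decode-first tool would recover from the encoded file."""
    return zlib.decompress(encoding)

# Both encodings decode to the identical bitmap
same_input_for_actool = decoded(fast_encoding) == decoded(best_encoding) == raw_bitmap
print(same_input_for_actool)
```

Whatever effort went into the smaller encoding is discarded at the decode step, which is why only changes to the bitmap itself (lossy compression) can affect the size of Assets.car.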

3.7 Moving multi-scale bundle images into the Asset Catalog

Apple introduced Asset Catalog with iOS 7 in 2013, and it has supported resource management since iOS 9. Older code (especially code written before 2016) manages images in bundles. We therefore developed scripts that find the multi-scale images in bundles and then moved those images into the Asset Catalog. For the images involved, this can cut their contribution to the download size by about half. The reference script is as follows:

import os

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".webp")

def find_all_bundle_pic(app_package_path, all_pic_list):
    """
    Collect every image that lives inside a .bundle into all_pic_list.
    """
    for child_file in os.listdir(app_package_path):
        child_path = os.path.join(app_package_path, child_file)
        # isfile returns True only for an existing regular file;
        # bundles and ordinary folders return False and are traversed.
        if os.path.isfile(child_path):
            if child_path.endswith(IMAGE_SUFFIXES) and ".bundle" in child_path:
                all_pic_list.append(child_path)
        else:
            find_all_bundle_pic(child_path, all_pic_list)

def find_opt_pic(all_picture_list, final_opt_pic_list):
    """
    Find bundle images that exist at more than one scale (1x/2x/3x).
    """
    all_pictures = set(all_picture_list)
    for picture in all_picture_list:
        if picture.endswith("@2x.png"):
            prefix_2x = picture[:-len("@2x.png")]
            # Exact-name match for the 1x and 3x siblings of this 2x image
            siblings = [p for p in (prefix_2x + ".png", prefix_2x + "@3x.png")
                        if p in all_pictures]
            if siblings:
                final_opt_pic_list.append(picture)
                final_opt_pic_list.extend(siblings)

3.8 Lossy image compression can reduce the size of Assets.car

As Section 3.6 shows, lossless compression does not reduce the size of the Asset Catalog, but lossy compression does reduce the size of Assets.car. Because Asset Catalog already compresses and optimizes images itself, the gain from lossy compression is less pronounced than the gain from moving bundle images into the Asset Catalog. Commonly used lossy compression tools are TinyPNG and pngquant.

TinyPNG is a web-based tool that converts 24-bit PNG images to 8-bit by merging similar colors, and strips unnecessary metadata. It is convenient for compressing single images (https://tinypng.com/), but batch compression is prone to problems such as upload failures, and the tool does not support custom compression settings.

pngquant is an open source lossy PNG compression library available as both a command-line tool and a source library. It converts 24-bit or 32-bit RGBA PNG images to 8-bit PNG images while preserving the alpha channel, which significantly reduces PNG file size. pngquant runs locally from scripts (download: https://pngquant.org), which suits batch compression. It supports custom compression quality and settings: with the quality set below 90, its compression rate is higher than that of TinyPNG. Because it is open source and customizable, pngquant is the first choice for image compression in Baidu APP.
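
Batch compression with pngquant can be scripted along the following lines. This is a sketch under stated assumptions, not the script Baidu APP uses: it assumes pngquant is installed and on the PATH, the function names are hypothetical, and the 65-80 quality range is an arbitrary example value rather than a recommendation from this article.

```python
import subprocess

def pngquant_command(png_path, quality_min=65, quality_max=80):
    """Build a pngquant command line that rewrites png_path in place."""
    return [
        "pngquant",
        f"--quality={quality_min}-{quality_max}",
        "--skip-if-larger",   # keep the original if the result is not smaller
        "--force",            # allow overwriting an existing file
        "--ext=.png",         # write the output under the same file name
        png_path,
    ]

def compress_all(png_paths):
    """Run pngquant on each file; return the paths it exited successfully for."""
    compressed = []
    for path in png_paths:
        result = subprocess.run(pngquant_command(path))
        if result.returncode == 0:
            compressed.append(path)
    return compressed
```

Because the files are overwritten in place, it is safest to run this on a clean working copy so that the changes can be reviewed and reverted.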

4. HEIC image encoding optimization

4.1 Advantages of HEIC image encoding

HEIC (High Efficiency Image Coding) is an image coding format that greatly improves the compression rate and reduces storage usage. It was developed by the Moving Picture Experts Group (MPEG) and is defined in MPEG-H Part 12 (ISO/IEC 23008-12). Since iOS 11 and macOS High Sierra (10.13), Apple has used HEIC as the default image storage format. HEIC images have the following characteristics:

High compression rate: the compression rate of HEIC images is 1.5 times higher than that of JPEG images, 3 times higher than that of PNG images, and 3 times higher than that of GIF images.

Saves storage: HEIC images take 20% less storage than JPEG images, 50% less than PNG images, and 80% less than GIF images.

High decoding efficiency: on iOS, HEIC is hardware decoded, which is efficient. It is 100 times faster than WebP (software decoding), though slightly slower than JPEG.

Preserves the original image quality: HEIC images are encoded with HEVC (H.265), which retains the original image quality.

Supports lossless magnification: HEIC images can be enlarged to twice their size without distortion.

Color processing: HEIC images can automatically adjust brightness, contrast, and saturation according to the brightness distribution of the pixels, to better restore the true colors of the image.

Good system compatibility: HEIC has been the default image storage format since iOS 11, so every system from iOS 11 onward supports HEIC images. Baidu APP, however, still supports iOS 10, so what about those users? In practice we found that on iOS 10, a HEIC image placed in xcassets still displays normally. To find out why, we unpacked the Assets.car file with Asset Catalog Tinkerer and found that, for iOS 10, the HEIC images in xcassets are converted to PNG at build time. Asset Catalog thus solves the HEIC compatibility problem.

4.2 How to generate HEIC image

To convert a PNG to HEIC with the built-in macOS feature, right-click the image, choose Quick Actions > Convert Image, select HEIF as the format, and keep the original size.


The optimization results are shown below: the original PNG image on the left is 1.6 MB, and the HEIC image on the right is 106 KB.

[Figure: original PNG image (left) and converted HEIC image (right)]

4.3 How to use HEIC images

4.3.1 Must be used in Asset Catalog

HEIC images must be placed in the Asset Catalog before they can be used, and the bundle method does not support loading HEIC images.


The method of loading and using HEIC images is the same as that of ordinary asset images, as follows:

imgView.image = [UIImage imageNamed:@"account_login"];

4.3.2 HEIC is noticeably smaller only for large images

In theory, a HEIC image is one third the size of the PNG version. In practice we found that the saving is pronounced for large images, but for small images, especially those under 10 KB, the HEIC file may be larger than the PNG. We therefore do not recommend HEIC encoding for small images.
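
This rule of thumb is easy to encode in a conversion script. The sketch below is illustrative only: the function names are hypothetical, the 10 KB threshold is taken from the observation above, and the sizes would come from os.path.getsize in a real script.

```python
MIN_HEIC_SIZE_BYTES = 10 * 1024   # threshold from the text: skip images under 10 KB

def heic_candidates(png_sizes, min_size=MIN_HEIC_SIZE_BYTES):
    """png_sizes maps path -> PNG size in bytes; return the paths worth converting."""
    return sorted(path for path, size in png_sizes.items() if size >= min_size)

def keep_heic(png_size, heic_size):
    """After conversion, keep the HEIC file only if it is actually smaller."""
    return heic_size < png_size
```

Checking the converted size with keep_heic as well catches the images above the threshold that still do not shrink.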

4.3.3 Do not compress PNG images with alpha channels

In practice we found that when an original PNG image, especially one with an alpha channel, is lossily compressed (with TinyPNG or ImageOptim) and then converted to HEIC, it renders as a green screen on iOS 12, 13, and 14. Because of this compatibility issue, PNG images with alpha channels should not be lossily compressed before conversion to HEIC.
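
A batch script can apply this rule by checking for an alpha channel before compressing. The sketch below reads the color type from the IHDR chunk at the start of a PNG file, as laid out in the PNG specification. The function name is hypothetical, and the check is deliberately simple: palette images that carry transparency in a tRNS chunk are not detected.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_has_alpha_channel(header):
    """header: at least the first 26 bytes of a PNG file."""
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # Byte 25 is the IHDR color type: 4 = grayscale + alpha, 6 = RGB + alpha
    color_type = header[25]
    return color_type in (4, 6)
```

In a real script the header would be read with open(path, "rb").read(26), and images for which the function returns True would be excluded from lossy compression.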

5. Summary

Image optimization is the most important part of package size optimization. Over two quarters of work, Baidu APP cleaned up its existing images and saved 9.75 MB. It then established image usage guidelines and an unused image detection pipeline to keep newly added images under control.

This article described the unused image detection scheme, Asset Catalog image optimization, and HEIC image optimization in detail. Later articles will cover the principles and implementation of the other optimization types, so stay tuned.


References:

[1] How to use Asset: https://developer.apple.com/library/archive/documentation/Xcode/Reference/xcode_ref-Asset_Catalog_Format/index.html#//apple_ref/doc/uid/TP40015170-CH18-SW1

[2] Asset introduction: https://help.apple.com/xcode/mac/current/#/dev10510b1f7

[3] WWDC2018: Optimizing App Assets: https://developer.apple.com/videos/play/wwdc2018/227/

[4] TinyPNG: https://tinypng.com/

[5] pngquant: https://pngquant.org

[6] HEIC picture introduction: https://mobiletrans.wondershare.com/heic-convert/what-is-heic-file.html


Origin my.oschina.net/u/4939618/blog/8694837