iOS image loading speed optimization

FastImageCache is an open source library developed by the Path team to speed up image loading and rendering, so that image-based lists scroll more smoothly. Let's take a look at how it does this.

First, the optimization points

When iOS loads an image from disk and displays it on screen with UIImageView, it goes through the following steps:

  1. Data is copied from disk into a kernel buffer.
  2. Data is copied from the kernel buffer into user space.
  3. A UIImageView is created and the image data is assigned to it.
  4. If the image data is an undecoded PNG/JPG, it is decoded into bitmap data.
  5. CATransaction captures the change to the UIImageView layer tree.
  6. On the main thread's run loop, the CATransaction is committed and image rendering starts:
    • If the data is not byte-aligned, Core Animation makes another copy of the data to align it.
    • The GPU processes the bitmap data and renders it.

FastImageCache optimizes three of these steps: step 2, step 4, and the first sub-step of step 6:

  1. It uses mmap memory mapping, which eliminates the copy from kernel space to user space in step 2 above.
  2. It caches the decoded bitmap data on disk, so the next time the image is read from disk the decoding in step 4 is skipped.
  3. It generates byte-aligned data, so Core Animation does not have to copy the data again when rendering (the first sub-step of step 6).

Next, these three optimizations and their implementation are described in detail.

Second, memory mapping

When we read a file from disk in the usual way, the high-level API calls eventually use the read() system call. The kernel reads the disk data into a kernel buffer, and the data is then copied from the kernel buffer into user-space memory. This involves a time-consuming memory copy, and once the read completes the entire file's data sits in user memory, taking up the process's memory.

FastImageCache reads and writes files a different way: it uses mmap to map the file into the process's virtual address space, so that positions in the file correspond to virtual memory addresses and the file can be manipulated just like memory. It is as if the whole file had been loaded into memory, yet until the data is actually used it consumes no physical memory and causes no disk reads or writes. Only when the data is really used, that is, when the image is about to be rendered on screen, does the virtual memory system load the corresponding blocks from disk into physical memory through the page-fault mechanism, after which rendering proceeds. Reading a file this way saves one copy of the data from the kernel buffer to user space, so it is very efficient.
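
To make the idea concrete, here is a minimal sketch in C of mapping a file read-only with POSIX mmap. The helper names map_file_readonly and unmap_file are hypothetical and are not taken from FastImageCache; the sketch only illustrates the general technique described above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not part of FastImageCache): map a whole file
 * read-only into the process's address space. Returns the mapped address
 * and stores the length in *out_len, or returns NULL on failure. */
const void *map_file_readonly(const char *path, size_t *out_len) {
    assert(path != NULL && out_len != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    /* No file data is read here; pages are loaded from disk on first access. */
    void *addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (addr == MAP_FAILED) return NULL;

    *out_len = (size_t)st.st_size;
    return addr;
}

/* Release a mapping obtained from map_file_readonly. */
void unmap_file(const void *addr, size_t len) {
    munmap((void *)addr, len);
}
```

A caller would treat the returned pointer as an ordinary byte array and call unmap_file when the data is no longer needed. Unlike read(), no user-space buffer has to be allocated and filled up front.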

Third, image decoding

The images we normally use are JPG/PNG files. Their data is not bitmap data but encoded, compressed data, which must be decoded into bitmap data before it can be rendered to the screen. This decoding is fairly time-consuming. There is no GPU hardware decoding, so it can only be done on the CPU, and by default iOS decodes images on the main thread. Many libraries have solved the decoding problem, but because decoded images are large they generally do not cache them to disk. SDWebImage, for example, moves decoding from the main thread to a background thread so that the expensive decode does not occupy main-thread time.

FastImageCache also decodes images on a background thread; the difference is that it caches the decoded image to disk. Because decoded images are large, FastImageCache does a good deal of work to manage the cached data, as described in the implementation section below. The large size of the cached images is also one reason for reading files through memory mapping. For small files memory mapping offers little advantage, since little memory is copied and the copy occupies little user memory; the larger the file, the greater the advantage of memory mapping.

Fourth, byte alignment

When the image data is not byte-aligned, Core Animation copies it before rendering. No official documentation describes this copying behavior. The simulator and Instruments have a feature that highlights "copied images", but it appears to be buggy: even when an image is not highlighted as copied during rendering, a call to the CA::Render::copy_image method can still be seen in the call stack.

[Figure: call stack showing the call to CA::Render::copy_image]

What is byte alignment? As I understand it, for performance the lower layers do not render an image pixel by pixel but fetch the data block by block. When a block is fetched, part of that contiguous memory may turn out to hold not image data but other data in memory, and reading past the image's bounds could mix strange things into the output. So before rendering, Core Animation copies and processes the data so that every block consists entirely of image data, padding any incomplete block with blank data.

[Figure: rough illustration of the blocks read during rendering; "pixel" is image pixel data, "data" is other data in memory]

The block size is presumably related to the CPU cache line: 32 bytes on ARMv7 and 64 bytes on the A9. On the A9, Core Animation presumably reads and renders data in 64-byte blocks, so aligning the image data to 64 bytes avoids the extra copy and patch-up by Core Animation. This is exactly what FastImageCache's byte alignment does.

Fifth, implementation

FastImageCache stores images of the same type and size in a single file and retrieves an individual image by its offset within the file, much like CSS sprites on the web. This file is called an ImageTable here. The main purpose is to manage the image cache uniformly and control its size; all of FastImageCache's data management takes place within individual ImageTables.

[Figure: overall data structure of the implementation]

Some additional explanation:

5.1 ImageTable

Each ImageFormat corresponds to one ImageTable. The ImageFormat specifies the rendering format, size, and other information for the images in the ImageTable. All image data in an ImageTable uses the uniform size defined by the ImageFormat, so every image is the same size.

Each ImageTable is one physical file, and a separate file stores the ImageTable's meta information.

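Images use an entityUUID as their unique identifier. It is defined by the user and is usually a hash of the image URL. The indexMap in the ImageTable meta records the mapping entityUUID -> entryIndex, so an image's entityUUID can be looked up in the indexMap to find the position of its cached data in the ImageTable.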

5.2 ImageTableEntry

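The actual data in an ImageTable consists of ImageTableEntry records. Each entry has two parts: the aligned image data, and meta information that stores the image's UUID and the source image's UUID, which is used to verify that the image data is correct.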

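Entry data is aligned to the memory page size, and its length is an integer multiple of the page size. This ensures that loading an image through virtual memory page faults touches as few memory pages as possible.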
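
The page-size rounding can be sketched in a few lines of C. The function names and the layout (image bytes followed by a metadata block) are assumptions made for illustration, not FastImageCache's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Round n up to the next multiple of alignment (alignment must be > 0). */
size_t round_up(size_t n, size_t alignment) {
    assert(alignment > 0);
    return (n + alignment - 1) / alignment * alignment;
}

/* Hypothetical layout: image bytes followed by a small metadata block,
 * padded so that the whole entry is a whole number of pages. */
size_t entry_length(size_t image_length, size_t metadata_length, size_t page_size) {
    return round_up(image_length + metadata_length, page_size);
}

/* Page size of the running system, as reported by POSIX sysconf. */
size_t system_page_size(void) {
    long ps = sysconf(_SC_PAGESIZE);
    return ps > 0 ? (size_t)ps : 4096; /* 4096 is only a fallback assumption */
}
```

For example, with 4096-byte pages, a 160000-byte bitmap plus 32 bytes of metadata is padded to 163840 bytes, which is exactly 40 pages.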

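The image data is byte-aligned, so Core Animation does not need to copy it when using it. Concretely, when the bitmap context is created with CGBitmapContextCreate, the bytesPerRow parameter is passed as a multiple of 64.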
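
As a rough illustration, the row stride can be computed as below. The helper names are hypothetical; only the 64-byte figure and the bytesPerRow parameter of CGBitmapContextCreate come from the text.

```c
#include <assert.h>
#include <stddef.h>

enum { ROW_ALIGNMENT = 64 }; /* alignment recommended in the text */

/* Smallest multiple of 64 that can hold one row of pixels. */
size_t aligned_bytes_per_row(size_t pixel_width, size_t bytes_per_pixel) {
    assert(pixel_width > 0 && bytes_per_pixel > 0);
    size_t row = pixel_width * bytes_per_pixel;
    return (row + ROW_ALIGNMENT - 1) / ROW_ALIGNMENT * ROW_ALIGNMENT;
}

/* Total bitmap buffer size when every row uses the padded stride. */
size_t aligned_bitmap_length(size_t pixel_width, size_t pixel_height,
                             size_t bytes_per_pixel) {
    return aligned_bytes_per_row(pixel_width, bytes_per_pixel) * pixel_height;
}
```

A 200-pixel-wide image at 4 bytes per pixel needs 800 bytes per row, which is padded to 832. That padded value is what would be passed as bytesPerRow, and the buffer grows from 160000 to 166400 bytes for a 200-pixel-high image.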

5.3 Chunk

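Between the ImageTable and the Entry data there is an additional layer, the Chunk. A chunk is a logical division of the data: N entries form one chunk, and mmap operates in units of chunks, with one mmap call per chunk mapping that chunk's contents into virtual memory. Why the extra layer? As I understand it, it allows flexible control over the size and number of mmap calls: mapping the whole ImageTable would map too large a file into virtual memory, while mapping each entry separately would require too many calls.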
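
The arithmetic for finding an entry through its chunk might look like the following sketch. The struct and function names are hypothetical, and the layout assumes entries are stored back to back with a fixed, page-aligned length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one ImageTable file layout. */
typedef struct {
    size_t entry_length;      /* bytes per entry, a multiple of the page size */
    size_t entries_per_chunk; /* N entries form one chunk */
} table_layout;

typedef struct {
    size_t chunk_index;     /* which chunk holds the entry */
    size_t chunk_offset;    /* file offset where that chunk starts (mmap offset) */
    size_t chunk_length;    /* number of bytes to map for the chunk */
    size_t offset_in_chunk; /* where the entry starts inside the mapped chunk */
} entry_location;

/* Translate an entry index into the chunk to map and the offset inside it. */
entry_location locate_entry(table_layout layout, size_t entry_index) {
    assert(layout.entry_length > 0 && layout.entries_per_chunk > 0);
    entry_location loc;
    loc.chunk_length = layout.entry_length * layout.entries_per_chunk;
    loc.chunk_index = entry_index / layout.entries_per_chunk;
    loc.chunk_offset = loc.chunk_index * loc.chunk_length;
    loc.offset_in_chunk =
        (entry_index % layout.entries_per_chunk) * layout.entry_length;
    return loc;
}
```

Because each entry length is a multiple of the page size, every chunk_offset is also page-aligned, which mmap requires of its offset argument.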

5.4 Cache management

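The user can define the maximum number of images cached in an ImageTable. When a new image needs to be cached and the limit has not been exceeded, the file is extended one chunk at a time and entries are written sequentially. If the limit has been exceeded, the least recently used entry is replaced. This is implemented by inserting the image's UUID at the head of the MRUEntries array every time an image is used. MRUEntries keeps the image UUIDs in most-recently-used order, so the last one in the array is the least recently used. When a replaced image is needed again, it goes through the whole process once more: fetch the source image, decode it, and store it.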
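
The replacement policy can be sketched as a small most-recently-used list. This is an illustration only: it uses integer ids in place of UUID strings, a fixed hypothetical limit of 4, and names (mru_list, mru_touch) that do not come from FastImageCache.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_ENTRIES = 4 }; /* hypothetical cache limit */

typedef struct {
    int ids[MAX_ENTRIES]; /* image identifiers, most recently used first */
    size_t count;
} mru_list;

/* Record a use of id. Returns the evicted id, or -1 if nothing was evicted. */
int mru_touch(mru_list *list, int id) {
    assert(list != NULL && id >= 0);
    int evicted = -1;

    /* Find id if it is already tracked. */
    size_t pos = list->count;
    for (size_t i = 0; i < list->count; i++) {
        if (list->ids[i] == id) { pos = i; break; }
    }

    if (pos == list->count) { /* id is new */
        if (list->count == MAX_ENTRIES) {
            evicted = list->ids[MAX_ENTRIES - 1]; /* last = least recently used */
            pos = MAX_ENTRIES - 1;
        } else {
            list->count++;
        }
    }

    /* Shift ids[0..pos-1] one slot toward the tail and put id at the head. */
    memmove(&list->ids[1], &list->ids[0], pos * sizeof list->ids[0]);
    list->ids[0] = id;
    return evicted;
}
```

When mru_touch returns an id other than -1, the cache would reuse that image's entry slot for the new image.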

Sixth, usage

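FastImageCache is well suited to caching same-sized images for the cells of a tableView. Its advantage is that it greatly speeds up the first load of these images from disk. But it has two obvious drawbacks: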

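  1. It takes up a lot of space. Decoded bitmaps are cached to disk, and bitmaps are large: a 100*100 image on a 2x Retina device needs 200*200*4 bytes/pixel = 160,000 bytes, about 156 KB. This is also why FastImageCache goes to such lengths to limit cache size.
  2. The interface is unfriendly, and cached image sizes must be predefined. FastImageCache cannot plug seamlessly into UIImageView the way SDWebImage does. Using it requires configuring the ImageTable, defining sizes, and manually supplying source images, and each kind of entity image needs its own FICEntity model, which complicates the logic.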
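
The size estimate in the first point can be reproduced with a short helper. The function name is hypothetical, and the calculation ignores any row padding added for alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for the decoded bitmap of an image whose size is given in
 * points, at the given screen scale, ignoring any row padding. */
size_t decoded_bitmap_bytes(size_t width_pt, size_t height_pt,
                            size_t scale, size_t bytes_per_pixel) {
    assert(scale > 0 && bytes_per_pixel > 0);
    return (width_pt * scale) * (height_pt * scale) * bytes_per_pixel;
}
```

For a 100*100 image at 2x and 4 bytes per pixel this gives 160000 bytes; the same image at 3x would need 360000 bytes.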

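FastImageCache is already an extreme optimization. When optimizing image loading and rendering, first consider low-cost, high-return optimizations, such as using CALayer instead of UIImageView, reducing GPU work (removing transparency, pixel alignment), decoding images on a background thread, and avoiding offscreen rendering. Only if image rendering still has performance problems after all the other optimizations are in place should you consider FastImageCache to further improve first-load performance. The byte-alignment optimization, however, can be applied to a project directly without FastImageCache: just set the bitmap context's bytesPerRow to a multiple of 64 when decoding images.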

Article

Optimizing iOS image loading speed to the limit: an analysis of FastImageCache


Origin www.cnblogs.com/dins/p/ios-tu-pian-jia-zai-su-du-you-hua.html