Lossy compression algorithm for floating point data with complete C code

A few years ago, while working on the algorithms for a photo-retouching app,

I thought about compressing the 3D LUT preset data,

mainly to improve the user experience.

There are plenty of open-source resources on the 3D LUT algorithm, so I won't go over the basics here.

If you are interested, take a look at the relevant implementation in the FFmpeg project.

My first contact with the 3D LUT algorithm was in 2014, when I reverse-engineered the VSCO Cam film algorithm.

Of course, at first I had no idea that the algorithm was a 3D LUT.

I just kept writing one version after another and optimizing the algorithm,

until one day it struck me that one of the constants was very strange.

Later, after spending some time looking at 3D LUT data, the algorithm felt very familiar,

and after that, of course, I understood what was going on.

Back then, while building the app, I considered compressing the preset resources.

Because the project was rushed, I used an LZ compression algorithm, and naturally the compression ratio was not high.

As a result, the preset files were too large and took up a fair share of the app's resource size.

I had intended to write a floating-point compression algorithm, but it was put off and never followed up.

Many people are curious about how film filter algorithms are implemented.

There are many versions circulating on the Internet. As a senior security researcher, I will describe the general picture.

In the early days, most apps used a 2D LUT to simulate the VSCO Cam effect.

The idea is fairly simple: build a 2D color map and interpolate in it, generally a 512*512*3 color table.

There is a concrete implementation in GPUImage. If you are interested, have a look; I won't expand on it here.
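For readers who want a feel for the 2D LUT idea without opening GPUImage, here is a minimal sketch in C. It assumes the commonly used layout in which the 512x512 image is an 8x8 grid of 64x64 tiles, with blue selecting the tile and red and green indexing inside it. The function names (lut_coords, lut_make_identity, lut_apply) are made up for this example, and a real filter would interpolate between neighbouring entries instead of taking the nearest one.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum { LUT_SIZE = 512, TILE = 64, TILES_PER_ROW = 8 };

/* Map an 8-bit RGB color to the (x, y) pixel of a 512x512 lookup image.
   Assumed layout: blue picks one of 64 tiles, red is the x offset and
   green the y offset inside that tile. */
static void lut_coords(uint8_t r, uint8_t g, uint8_t b, int *x, int *y) {
    int ri = r >> 2, gi = g >> 2, bi = b >> 2;   /* 256 levels -> 64 levels */
    *x = (bi % TILES_PER_ROW) * TILE + ri;
    *y = (bi / TILES_PER_ROW) * TILE + gi;
}

/* Build an identity table: looking a color up returns roughly the same color.
   Returns NULL if the allocation fails; the caller frees the table. */
static uint8_t *lut_make_identity(void) {
    uint8_t *lut = malloc((size_t) LUT_SIZE * LUT_SIZE * 3);
    if (!lut) return NULL;
    for (int bi = 0; bi < 64; ++bi)
        for (int gi = 0; gi < 64; ++gi)
            for (int ri = 0; ri < 64; ++ri) {
                int x = (bi % TILES_PER_ROW) * TILE + ri;
                int y = (bi / TILES_PER_ROW) * TILE + gi;
                uint8_t *p = lut + ((size_t) y * LUT_SIZE + x) * 3;
                p[0] = (uint8_t) (ri * 255 / 63);
                p[1] = (uint8_t) (gi * 255 / 63);
                p[2] = (uint8_t) (bi * 255 / 63);
            }
    return lut;
}

/* Nearest-neighbour lookup (no interpolation, to keep the sketch short). */
static void lut_apply(const uint8_t *lut, const uint8_t in[3], uint8_t out[3]) {
    int x, y;
    assert(lut != NULL);
    lut_coords(in[0], in[1], in[2], &x, &y);
    const uint8_t *p = lut + ((size_t) y * LUT_SIZE + x) * 3;
    out[0] = p[0];
    out[1] = p[1];
    out[2] = p[2];
}
```

Replacing the identity table with a color-graded image of the same layout turns lut_apply into a filter.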

I still have a copy of the original VSCO Cam film algorithm.

In recent years deep learning has become popular, and when I was running forward propagation (inference) on mobile phones,

I ran into a similar problem again:

model quantization, model compression, and so on.

The idea of model quantization is actually quite simple: for example, quantize 32-bit values to 16 bits,

or to 8 bits, trading precision for some performance improvement and a smaller footprint.

That single operation, quantization, both improves performance and shrinks the model.

So it is definitely a good solution.
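As a rough sketch of what 8-bit quantization means, the code below maps floats to int8 with a single scale factor (symmetric linear quantization). This is only one of several possible schemes, and the helper names quantize_int8 and dequantize_int8 are hypothetical, not taken from any framework.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Quantize n floats to int8 using q = round(x / scale), where
   scale = max|x| / 127. Returns the scale needed to dequantize. */
static float quantize_int8(const float *src, int8_t *dst, size_t n) {
    assert(src != NULL && dst != NULL);
    float max_abs = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float a = src[i] < 0.0f ? -src[i] : src[i];
        if (a > max_abs) max_abs = a;
    }
    float scale = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;
    for (size_t i = 0; i < n; ++i) {
        float v = src[i] / scale;
        v = v >= 0.0f ? v + 0.5f : v - 0.5f;   /* round half away from zero */
        if (v > 127.0f) v = 127.0f;            /* guard against rounding overshoot */
        if (v < -127.0f) v = -127.0f;
        dst[i] = (int8_t) v;                   /* the cast truncates toward zero */
    }
    return scale;
}

/* Recover approximate floats: x ~= q * scale. */
static void dequantize_int8(const int8_t *src, float *dst, size_t n, float scale) {
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = (float) src[i] * scale;
}
```

Each weight now takes one byte instead of four, and the reconstruction error per value is at most about half of one scale step.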

Memory mapping can also be considered on iOS:

map the file on disk into the process's address space to reduce memory usage.

Of course, this approach needs support from the operating system and its file system.
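On POSIX systems, iOS included, memory mapping is usually done with mmap. Here is a minimal sketch, assuming the weights are stored as a flat file of raw 32-bit floats; map_weights and unmap_weights are illustrative names, not part of any existing library.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a file of raw floats read-only into memory.
   Returns NULL on failure; on success *count receives the number of floats. */
static const float *map_weights(const char *path, size_t *count) {
    assert(path != NULL && count != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return NULL;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED) return NULL;
    *count = (size_t) st.st_size / sizeof(float);
    return (const float *) p;
}

/* Release a mapping obtained from map_weights. */
static void unmap_weights(const float *weights, size_t count) {
    if (weights) munmap((void *) weights, count * sizeof(float));
}
```

Pages are only read from disk when they are touched, and the kernel can drop them again under memory pressure, so the weights do not have to be copied into heap memory up front.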

Either way, I once again ran into the problem of compressing floating-point data.

The vast majority of deep learning models today store their weights as 32-bit floating point.

Strangely, though, I have hardly seen anyone compress floating-point data.

Could it be that floating-point data really cannot be compressed?!

No. Back when I was working on image algorithms, I was particularly interested in frequency-domain algorithms,

because this way of changing one's perspective is truly ingenious.

After reading a lot of material online, I feel that few people explain the frequency domain in plain language,

so let me shamelessly give it a try.

The core of the frequency domain is frequency: the data follows some regular, repeating pattern.

It is a bit like counting. For example, eight 8s can be written out as 88888888, shortened to 88 (eight of the digit 8), or, if the convention is agreed in advance, recorded simply as 8.

That is frequency. So what is the frequency domain?

The frequency domain is a way of describing data by how strongly each specific frequency appears in it.

In other words, it is counting according to an agreed set of rules.

In the Fourier transform or the cosine transform, the "transform" is really just such a representation.

It is a bit like you and your girlfriend agreeing on a secret signal, such as a wink: no words needed, you both understand.

Well, that's enough explanation for now; any further and it would no longer be suitable for children.
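To make the counting analogy concrete, here is a small sketch of an 8-point DCT-II with orthonormal scaling. Fed eight 8s, it produces a single non-zero number (the first coefficient) followed by seven zeros, which is the "record it once" idea in transform form. The cosine values come from a small table so that no math library is needed; dct8 and cos_pi16 are names chosen for this example and are not the routines from the project's dct.h.

```c
#include <assert.h>
#include <stddef.h>

/* cos(m * pi / 16) for m = 0..8; all other angles follow by symmetry. */
static const double COS_Q[9] = {
    1.0, 0.98078528040323044, 0.92387953251128674, 0.83146961230254524,
    0.70710678118654752, 0.55557023301960222, 0.38268343236508977,
    0.19509032201612827, 0.0
};

/* cos(m * pi / 16) for any non-negative integer m. */
static double cos_pi16(int m) {
    m %= 32;                                    /* the period is 2*pi */
    if (m > 16) m = 32 - m;                     /* cos(2*pi - x) = cos(x) */
    return m > 8 ? -COS_Q[16 - m] : COS_Q[m];   /* cos(pi - x) = -cos(x) */
}

/* Orthonormal 8-point DCT-II:
   out[k] = a(k) * sum_n in[n] * cos((2n + 1) * k * pi / 16),
   with a(0) = sqrt(1/8) and a(k) = sqrt(2/8) = 0.5 otherwise. */
static void dct8(const double in[8], double out[8]) {
    assert(in != NULL && out != NULL && in != out);
    for (int k = 0; k < 8; ++k) {
        double sum = 0.0;
        for (int n = 0; n < 8; ++n)
            sum += in[n] * cos_pi16((2 * n + 1) * k);
        out[k] = sum * (k == 0 ? 0.35355339059327376 : 0.5);
    }
}
```

For the input {8, 8, 8, 8, 8, 8, 8, 8} the first coefficient is 64 / sqrt(8), about 22.63, and the remaining seven are zero up to rounding.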

The most classic compression algorithm of this kind is JPEG, a format everyone knows.

There are up-and-coming formats such as WebP and FLIF,

but JPEG, like MP3, has become the default hallmark of an era.

JPEG is a compression algorithm based on the 8x8 DCT.

I won't go into the details. If you are interested, read a JPEG codec,

for example: https://github.com/cpuimage/TinyJPEG

That was a long detour. So, is it possible to do lossy compression of floating-point data based on the 8x8 DCT?

The answer is yes. It is that simple and crude.
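The dct.h used by the sample code further down is not reproduced in this post, so purely as an illustration, here is a naive (slow) 8x8 DCT and its inverse written straight from the definition. The names dct8x8, idct8x8 and count_significant are assumptions made for this sketch; the project itself uses a much faster implementation.

```c
#include <assert.h>
#include <stddef.h>

/* cos(m * pi / 16) for m = 0..8; all other angles follow by symmetry. */
static const double COS_Q[9] = {
    1.0, 0.98078528040323044, 0.92387953251128674, 0.83146961230254524,
    0.70710678118654752, 0.55557023301960222, 0.38268343236508977,
    0.19509032201612827, 0.0
};

static double cos_pi16(int m) {
    m %= 32;
    if (m > 16) m = 32 - m;
    return m > 8 ? -COS_Q[16 - m] : COS_Q[m];
}

/* Orthonormal DCT-II basis function k evaluated at sample n. */
static double basis(int k, int n) {
    return (k == 0 ? 0.35355339059327376 : 0.5) * cos_pi16((2 * n + 1) * k);
}

/* Forward 2D DCT of one 8x8 block stored row by row (not in place). */
static void dct8x8(const float in[64], float out[64]) {
    assert(in != NULL && out != NULL && in != out);
    for (int u = 0; u < 8; ++u)
        for (int v = 0; v < 8; ++v) {
            double sum = 0.0;
            for (int r = 0; r < 8; ++r)
                for (int c = 0; c < 8; ++c)
                    sum += in[r * 8 + c] * basis(u, r) * basis(v, c);
            out[u * 8 + v] = (float) sum;
        }
}

/* Inverse 2D DCT: rebuilds the block from its coefficients (not in place). */
static void idct8x8(const float in[64], float out[64]) {
    assert(in != NULL && out != NULL && in != out);
    for (int r = 0; r < 8; ++r)
        for (int c = 0; c < 8; ++c) {
            double sum = 0.0;
            for (int u = 0; u < 8; ++u)
                for (int v = 0; v < 8; ++v)
                    sum += in[u * 8 + v] * basis(u, r) * basis(v, c);
            out[r * 8 + c] = (float) sum;
        }
}

/* Count coefficients whose magnitude exceeds eps. */
static int count_significant(const float *coef, int n, float eps) {
    int count = 0;
    for (int i = 0; i < n; ++i)
        if (coef[i] > eps || coef[i] < -eps) ++count;
    return count;
}
```

For a block filled with the ramp 0..63, only a handful of the 64 coefficients are noticeably different from zero, and the round trip through idct8x8 differs from the input only by float rounding. Long runs of near-zero values are the kind of regularity a general-purpose compressor can exploit.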

 

Data length: 8*8*8 floats (512 values, 2048 bytes)

The data is filled with the values 0 to 511 in order.

Here are the reference results:

zlib compression:
miniz.c version: 10.0.2
Compressed from 2048 to 730 bytes
Decompressed from 730 to 2048 bytes

DCT + zlib compression:
miniz.c version: 10.0.2
Compressed from 2048 to 116 bytes
Decompressed from 116 to 2048 bytes

When the data follows a pattern that the DCT captures well, the compression ratio of DCT + zlib is extremely high.

If the data is random with no regularity, plain zlib achieves the better ratio in most cases.

JPEG encoding has one more trick, which is the color space:

RGB is converted to YCbCr to get a higher compression ratio.

Of course, that loses some information. But I am getting carried away,

so I'll stop here.

The complete sample code follows:

#ifdef __cplusplus
extern "C" {
#endif

#include <stdio.h>
#include <stdlib.h>
#include <string.h> // memcmp, memcpy
#include <stdint.h>
#include "miniz.h"
#include "dct.h"

int test_miniz(const unsigned char *s_pStr, uLong data_len) {
    int cmp_status;
    uLong src_len = data_len;
    uLong cmp_len = compressBound(src_len);
    uLong uncomp_len = src_len;
    uint8_t *pCmp, *pUncomp;
    printf("miniz.c version: %s\n", MZ_VERSION);
    // Allocate buffers to hold compressed and uncompressed data.
    pCmp = (mz_uint8 *) malloc((size_t) cmp_len);
    pUncomp = (mz_uint8 *) malloc((size_t) src_len);
    if ((!pCmp) || (!pUncomp)) {
        printf("Out of memory!\n");
        return EXIT_FAILURE;
    }
    // Compress the string.
    cmp_status = compress(pCmp, &cmp_len, (const unsigned char *) s_pStr, src_len);
    if (cmp_status != Z_OK) {
        printf("compress() failed!\n");
        free(pCmp);
        free(pUncomp);
        return EXIT_FAILURE;
    }
    printf("Compressed from %u to %u bytes\n", (mz_uint32) src_len, (mz_uint32) cmp_len);
    // Decompress.
    cmp_status = uncompress(pUncomp, &uncomp_len, pCmp, cmp_len);
    if (cmp_status != Z_OK) {
        printf("uncompress failed!\n");
        free(pCmp);
        free(pUncomp);
        return EXIT_FAILURE;
    }
    printf("Decompressed from %u to %u bytes\n", (mz_uint32) cmp_len, (mz_uint32) uncomp_len);
    // Ensure uncompress() returned the expected data.
    if ((uncomp_len != src_len) || (memcmp(pUncomp, s_pStr, (size_t) src_len))) {
        printf("Decompression failed!\n");
        free(pCmp);
        free(pUncomp);
        return EXIT_FAILURE;
    }
    free(pCmp);
    free(pUncomp);
    printf("Success.\n");
    return EXIT_SUCCESS;

}

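int test_dct_miniz(float *data, uLong len) {
    // Number of complete 8x8 blocks (64 floats each).
    uLong nCount = len / 64;
    float *in_data = data;
    // Forward DCT on each block, in place.
    for (uLong i = 0; i < nCount; i++) {
        DCT(in_data, in_data);
        in_data += 64;
    }
    // Compress the DCT coefficients with miniz and report the sizes.
    int status = test_miniz((const unsigned char *) data, len * sizeof(float));
    // Inverse DCT restores an approximation of the original data.
    float *out_data = data;
    for (uLong i = 0; i < nCount; i++) {
        IDCT(out_data, out_data);
        out_data += 64;
    }
    return status;
}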

int main(int argc, char *argv[]) {
    printf("float data loss compression algorithm base DCT 8X8.\n");
    printf("DCT implementation by Thomas G. Lane.\n");
    printf("miniz implementation by Rich Geldreich.\n");
    //http://developer.download.nvidia.com/SDK/9.5/Samples/vidimaging_samples.html#gpgpu_dct
    printf("blog:http://cpuimage.cnblogs.com/\n");
    int is_debug_output = 1;
    const uLong data_len = 8 * 8 * 8; // 8 blocks of 8x8 floats
    float test_for_miniz[data_len];
    float test_for_dct[data_len];
    for (int i = 0; i < data_len; ++i) {
        test_for_miniz[i] = i;
    }
    memcpy(test_for_dct, test_for_miniz, data_len * sizeof(float));
    printf("\nonly miniz:\n");
    test_miniz((const unsigned char *) test_for_miniz, data_len * sizeof(float));
    printf("\nwith dct:\n");
    test_dct_miniz(test_for_dct, data_len);

    if (is_debug_output) {
        for (int i = 0; i < data_len; ++i) {
            if (test_for_miniz[i] != test_for_dct[i]) {
                printf("index %d: %f != %f \n", i, test_for_miniz[i], test_for_dct[i]);
            }
        }
    }

    printf("\n press any key to exit.\n");
    return EXIT_SUCCESS;
}

#ifdef __cplusplus
}
#endif

Project address: https://github.com/cpuimage/DCT_8X8

 

In addition, I would like to thank the anonymous reader who sent a tip of RMB 1 during the May Day holiday.

Every trickle adds up to a river~

 

If you have any other questions or needs, please contact me by email.

 

The email address is: 
[email protected]

 
