Lossy vs Lossless Compression

Data compression

Data compression is the process of reducing the storage size of data or a file so that it occupies less space on disk. It works by modifying, reorganizing, encoding, or otherwise transforming the data into a smaller representation. Simply put, it converts files in a way that minimizes their size. Data compression is also known as bit-rate reduction or source coding.

Why is data compression needed? There are two main reasons for this:

  1. Storage: It reduces the amount of disk space needed to store the data.
  2. Time: Smaller files take less time to transfer.

Lossy compression

Lossy compression is a technique that permanently discards part of the data. It can reduce file size considerably, ideally without any noticeable loss of quality. Once a file has been compressed this way, it cannot be restored to its original form, because the discarded data is gone. The technique is most useful when exact fidelity is not critical and saving disk space matters more.
Lossy compression is a poor choice when the quality of the file is critical, or when the data will be analyzed or processed further, since every later step then works from degraded data. It is generally used for images, audio, and video. A well-tuned lossy codec can discard a large share of the data without most users being able to tell.
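
To make the irreversibility concrete, here is a minimal Python sketch of one common lossy step, quantization. The function names and the step size of 16 are illustrative assumptions, not taken from any particular codec.

```python
def quantize(samples, step=16):
    """Map each sample to the index of its bucket (lossy: many inputs share one index)."""
    return [s // step for s in samples]


def dequantize(indices, step=16):
    """Reconstruct an approximation by returning the centre of each bucket."""
    return [i * step + step // 2 for i in indices]


original = [0, 3, 17, 18, 200, 203, 255]
restored = dequantize(quantize(original))

print(restored)              # approximations of the original values
print(restored == original)  # False: the exact values cannot be recovered
```

Values such as 17 and 18 land in the same bucket, so after decoding there is no way to tell them apart. A larger step shrinks the data further and increases the error.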

Lossless compression

Lossless compression is a technique that removes only redundancy from the data, not information. It also reduces file size, but usually not as much as lossy compression. In return, a compressed file can be restored exactly to its original form, so the quality of the data is not affected.
Lossless compression is less useful when the main goal is to free up as much storage space as possible. It is the right choice when the data will be analyzed or processed further, because nothing is lost. It preserves the original by removing only redundant data. The technique is often used for text files, sensitive documents, and classified information.
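
The round trip can be checked with Python's standard `zlib` module (DEFLATE, a lossless method). This is only a small sketch; the sample text is an assumption chosen to contain a lot of repetition.

```python
import zlib

# Highly repetitive input, so there is plenty of redundancy to remove.
text = b"the same phrase repeats, the same phrase repeats, " * 20

packed = zlib.compress(text, 9)    # 9 = strongest compression level
unpacked = zlib.decompress(packed)

print(len(text), "->", len(packed), "bytes")
print(unpacked == text)            # True: every byte is recovered
```

Data with little repetition, such as already-compressed files, shrinks far less, which is why lossless ratios depend heavily on the input.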

The difference between lossy and lossless compression

| Basis | Lossy compression | Lossless compression |
| --- | --- | --- |
| Definition | Discards part of the data to reduce file size considerably, ideally without noticeable loss | Removes only redundancy; reduces file size, but to a lesser extent |
| Compression ratio | High | Low |
| File quality | Lower | High (unchanged) |
| Data eliminated | Data judged imperceptible, which may include data that is needed later | Only redundant data |
| Recovery | Cannot be restored to the original form | Can be restored to the original form |
| Information loss | Some information is lost | No information is lost |
| Data adjustment | More data fits in the same space | Less data fits in the same space |
| Distortion | Possible | None |
| Data holding capacity | More | Less |
| Algorithms used | Transform coding, DCT, DWT, fractal compression, RSSMS | RLE, LZW, arithmetic coding, Huffman coding, Shannon-Fano coding |
| File types | JPEG, MP3, MP4, MKV, OGG, etc. | RAW, BMP, PNG, GIF, WAV, FLAC, ALAC, etc. |

Which One to Use?

Although both are types of data compression, they are useful in different situations. Lossy compression reduces file size the most, so it helps when large amounts of data must be stored, for example in databases, and exact fidelity is not required. For web pages, the smaller files also load faster.
However, once lossy compression is complete, the original data is no longer available for later analysis, and the file cannot be restored to its original form because data has been discarded.
Unlike lossy compression, lossless compression involves no data loss. It neither affects the quality of the data nor reduces its size as drastically. The content can be restored exactly and processed further. This approach suits anyone who needs to access the data again without compromising its quality.

Final Words

Both lossy and lossless compression reduce the size of data, each in its own way. Lossy compression saves space by discarding part of the data; lossless compression discards nothing. Lossless techniques are therefore good at preserving the original data, while lossy techniques are not. Knowing the difference helps in choosing the appropriate method for each kind of file, for example in database management.

Lossy compression, lossless compression (picture, audio, video)

Lossless compression: It compresses the file's representation rather than its content. As with other data files, the idea is to optimize how the data is stored, using an algorithm to represent repeated information more compactly. The file can be restored completely without affecting its content. For digital images, this means no image detail is lost.

The basic principle of lossless image compression is that identical color information only needs to be saved once. Software that compresses an image first determines which areas are the same and which are different. Regions of repeated data (such as blue sky) compress well, because only the start and end points of the blue run need to be recorded. But the blue may have different shades, and the sky may be partly obscured by trees, mountains, or other objects, all of which must be recorded separately. In essence, lossless compression removes duplicate data, which can greatly reduce the size of the image on disk. It does not reduce the image's memory footprint, however, because the software restores every pixel when the image is read from disk. The decoded size depends on the pixel dimensions and bit depth, so reducing it means lowering the resolution or color depth, not merely choosing a different compression method.
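
The "record the start and end of each run" idea is run-length encoding. A minimal Python sketch, with color names standing in for pixel values (an assumption made for readability):

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse runs of equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


# One row of an image: sky, a tree in front of it, then more sky.
row = ["blue"] * 6 + ["green"] * 2 + ["blue"] * 4
encoded = rle_encode(row)

print(encoded)                     # [('blue', 6), ('green', 2), ('blue', 4)]
print(rle_decode(encoded) == row)  # True
```

Twelve pixels become three pairs, and decoding rebuilds all twelve, which is also why the decoded image takes the same amount of memory as before.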

Lossy compression: It changes the image itself. When the image is saved, more of the brightness information is retained, while hue and saturation information is merged with that of the surrounding pixels. Different merging ratios give different compression ratios. Because the amount of information is reduced, the compression ratio can be high, and the image quality decreases accordingly.
Lossy compression reduces the space an image takes up on disk, usually without a noticeably detrimental effect on its appearance on screen. Human eyes are more sensitive to brightness than to color, so preserving brightness matters more than preserving color; this is the basis of lossy image compression.
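
Keeping brightness at full resolution while merging color information with neighboring pixels is known as chroma subsampling. The Python sketch below averages each 2x2 block of the two color planes; the tiny 4x4 planes and the names `luma`, `cb`, and `cr` are made-up illustrative data, and real codecs differ in how they filter and round.

```python
def subsample_2x2(plane):
    """Replace each 2x2 block of a plane with its average (even dimensions assumed)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


# A 4x4 image split into one brightness plane and two color planes.
luma = [[10, 20, 30, 40], [50, 60, 70, 80], [90, 100, 110, 120], [130, 140, 150, 160]]
cb = [[100, 102, 90, 92], [104, 106, 94, 96], [60, 60, 70, 70], [60, 60, 70, 70]]
cr = [[128] * 4 for _ in range(4)]

cb_small = subsample_2x2(cb)
cr_small = subsample_2x2(cr)

luma_samples = sum(len(r) for r in luma)  # brightness is kept at full resolution
samples_before = 3 * luma_samples
samples_after = luma_samples + sum(len(r) for r in cb_small) + sum(len(r) for r in cr_small)

print(cb_small)                       # [[103, 93], [60, 70]]
print(samples_before, samples_after)  # 48 24
```

Half of the samples are gone before any further coding takes place, and the original color values cannot be recovered from the averages.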

Image file format

Lossy compression formats: JPEG, WMF, WebP (extensions: .jpeg, .jpg, .wmf, .webp)
Lossless compression formats: BMP, PCX, TIFF, GIF, TGA, PNG, RAW (extensions: .bmp, .pcx, .tiff, .gif, .tga, .png, .raw)

Lossy compression detailed format:

  1. WebP is an image format introduced by Google. It is designed to compress web page images effectively without sacrificing format compatibility or visible clarity, thereby speeding up page downloads. Like JPEG, WebP offers lossy compression, in its case built on predictive coding techniques. According to Google, the format's main advantage is efficiency: WebP images were reported to be 40% smaller than JPEG images at the same quality. The catch is that encoding a WebP image was reported to take eight times longer than encoding a JPEG.
  2. JPEG is the most common image format. It is a lossy format that can squeeze images into very little storage space, at the cost of discarding some image data. JPEG is flexible: the quality setting is adjustable, so a file can be compressed at different ratios and levels. The higher the compression ratio, the lower the quality; the lower the compression ratio, the better the quality. JPEG mainly discards high-frequency detail while retaining overall tone and color well. It supports 24-bit true color and is widely used for continuous-tone images. Because its files are small and download quickly, it is supported by all major browsers and is the most popular image format on the web; it is also widely used on CD-ROM reading materials. JPEG is not suitable for simpler pictures that contain few colors, have large areas of similar color, or have sharp differences in brightness.
  3. As the successor to JPEG, JPEG 2000 achieves a compression rate about 30% higher than JPEG and supports both lossy and lossless compression. It was designed to replace the traditional JPEG format, although JPEG 2000 files cannot be read by plain JPEG decoders. JPEG 2000 can be applied in traditional JPEG markets, such as scanners and digital cameras, as well as in fields such as network transmission and wireless communication.
  4. WMF (Windows Metafile Format) is a common metafile format on Windows and is a vector file format. Its files are small and its graphics are stylized: a whole picture is often assembled from independent components, and the result often looks rough. Because it stores drawing commands rather than compressed pixels, the lossy/lossless distinction applies to it only loosely.

Lossless compression detailed format:

  1. BMP is a widely used, device-independent image file format. It stores a bitmap and, apart from the selectable image depth, normally applies no compression, so BMP files take up a lot of space. The image depth of a BMP file can be 1, 4, 8, or 24 bits. Image data is stored scanning from left to right and from bottom to top. BMP can store a single raster image at any of these color depths, from black and white to 24-bit color. The Windows bitmap format is compatible with other Microsoft Windows programs, but it is not suitable for web pages. Overall, its disadvantages outweigh its advantages. For photographic images, use PNG, JPEG, or TIFF instead. BMP files are suitable for Windows wallpapers.
  2. PCX is one of the earliest file formats to support color images, and it supports up to 256 colors. Its designers had the foresight to introduce a color image file format early, which made it very popular. PCX is the image file format of PC Paintbrush. The image depth can be 1, 4, or 8 bits. Because the format dates from the early days, it does not support true color. PCX uses run-length encoding (RLE), and the compressed image data is stored in the file body. Image data must therefore be RLE-encoded when it is written to a PCX file and RLE-decoded when it is read, before it can be displayed or processed. PCX is not supported by web browsers.
  3. TIFF (file extension .tif) is a relatively common image file format developed by Aldus and Microsoft for desktop publishing systems. The format is flexible and defines four classes: TIFF-B for bilevel (black-and-white) images, TIFF-G for grayscale images, TIFF-P for palette-color images, and TIFF-R for RGB true-color images. TIFF supports a variety of encodings, including uncompressed RGB, RLE compression, and JPEG compression. It is the most complicated of the existing image file formats. It is extensible, convenient, and modifiable, and it is supported by image editing programs on platforms such as the IBM PC.
  4. GIF is a lossless, palette-based format built on the LZW algorithm. Its compression rate is generally around 50%. It is not tied to any particular application and is supported by almost all relevant software, including a great deal of public-domain software. GIF image data is compressed with variable-length codes. The image depth ranges from 1 to 8 bits, so a GIF image can contain at most 256 colors. Another feature of GIF is that one file can store multiple images. If the images stored in a file are read out one by one and displayed on screen, they form the simplest kind of animation. GIF decodes quickly, but it does not support an alpha transparency channel.
  5. The TGA format was developed by Truevision of the United States for its display cards. Its file extension is .tga, and it has been accepted by the international graphics and imaging industry. The structure of TGA is relatively simple. It is a common format for graphics and image data, has great influence in the multimedia field, and is a preferred format for transferring computer-generated images to television. Its most distinctive feature is that it can represent irregularly shaped images: image files are normally rectangular, but when a circular, diamond-shaped, or even hollowed-out image is needed, TGA comes in handy. TGA supports compression and uses a lossless algorithm.
  6. PNG was introduced as a newer image file format for the web. It provides lossless compression, with files reported to be about 30% smaller than the equivalent GIF. It also supports 24-bit and 48-bit true color, alpha channel transparency, gamma correction, and interlacing. As a web format, PNG compresses less than JPEG's lossy compression, and unlike GIF it does not support multi-image or animation files. PNG is supported by recent web browsers and by image editors such as Photoshop, which can both read and save it, but older browsers and programs may not support it.
  7. The RAW format contains all the image information produced by the sensor before it enters the camera's image processor. Users can process RAW pictures on a PC with specific software, and many image processing programs can handle the RAW files output by cameras. These programs offer adjustments to the sharpness, white balance, levels, and color of a RAW photo. Because RAW typically holds 12 bits of data per sample, software can recover detail from the highlights or shadows of a RAW image that cannot be found in an 8-bit-per-channel JPEG or TIFF image. Limited compatibility remains the biggest obstacle to the wider adoption of RAW.
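
The GIF entry above relies on LZW. The following is a textbook LZW sketch in Python, not GIF's exact variant: GIF additionally packs codes into variable-width bit fields and uses clear and end-of-information codes, all of which are omitted here. The function names are illustrative.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Textbook LZW: emit one dictionary code per longest known prefix."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # learn the new sequence
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the same dictionary on the fly and expand the codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        else:
            # Code not in the table yet: it must be previous + its own first byte.
            entry = previous + previous[:1]
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)


pixels = b"\x00\x00\x00\x01\x01\x00\x00\x00\x01\x01" * 10  # palette indices
codes = lzw_compress(pixels)
print(len(pixels), "bytes ->", len(codes), "codes")
print(lzw_decompress(codes) == pixels)  # True
```

The decoder never receives the dictionary; it reconstructs it from the codes themselves, which is what keeps the scheme lossless.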

Audio file format

Lossy compression formats: AIFF, MPEG, MP3, MPEG-4, MIDI, WMA
Lossless compression formats: WAV, APE, FLAC (extensions: .wav, .ape, .flac)

Lossy compression detailed format:

  1. AIFF is the standard audio format on Apple computers and is part of the QuickTime technology. Because AIFF is designed to be inclusive, it supports many compression techniques; plain AIFF stores uncompressed PCM audio, and compressed audio uses the AIFF-C variant.
  2. MPEG is a lossy compression format whose biggest advantage is that it achieves a high compression ratio in exchange for very little audible distortion. The MPEG family includes MPEG-1, MPEG-2, MPEG Layer 3, and MPEG-4.
  3. MP3 refers to the audio part of the MPEG standard, the MPEG audio layer. MPEG audio is divided into three layers according to compression quality and encoding complexity, corresponding to three types of sound files: *.mp1, *.mp2, and *.mp3. Note that MPEG audio compression is lossy. MPEG Layer 3 encoding achieves a high compression ratio of 10:1 to 12:1. It keeps the low-frequency part essentially undistorted, but sacrifices the quality of the 12 kHz to 16 kHz high-frequency part of the sound in exchange for file size. A piece of music stored as *.mp3 is generally only about 1/10 the size of the same piece as a .wav file, so its sound quality is inferior to CD or WAV audio. Because MP3 files are small and still sound good, no other audio format could match MP3 when it first appeared, which created good conditions for its spread. The MP3 format is still very popular, and its status as a mainstream audio format is hard to shake.
  4. The MPEG-4 standard is a video compression standard for multimedia applications announced by the Moving Picture Experts Group (MPEG) in October 2000. It uses object-based compression coding. Thanks to its high quality at low bit rates, MPEG-4 has been widely used in image transmission systems such as network multimedia, video conferencing, and multimedia monitoring. Most mature MPEG-4 applications, in China and abroad, are based on PC-level client/server architectures, and there are few applications on embedded systems. Most embedded MPEG-4 decoding systems use commercial embedded operating systems such as Windows CE and VxWorks, which are costly and inflexible. Using embedded Linux as the operating system instead makes development convenient, saves cost, and allows the system to be trimmed to actual needs; it occupies fewer resources and offers strong flexibility, good network performance, and a wider range of applications.
  5. MIDI is heavily used by people who play music; it allows digital synthesizers and other devices to exchange data. The MID file format derives from MIDI. A MID file is not recorded sound but a set of instructions that describe the music and tell the sound card how to reproduce it. MIDI files need only about 5-10 KB per minute of music. MID files are mainly used for original instrumental works, amateur performances of popular songs, game audio tracks, electronic greeting cards, and so on. The playback quality of a .mid file depends entirely on the quality of the sound card. The main use of the .mid format is computer-based composition: a .mid file can be written with music software, or music played on an external sequencer can be input to the computer through the sound card's MIDI port and saved as a .mid file.

Lossless compression detailed format:

  1. A WAV file is a waveform file, an audio storage format introduced by Microsoft and mainly used to store audio on the Windows platform. It stores the binary data of the sound waveform. Because the data is not compressed, WAV files are large. The space occupied by a WAV file is [(sampling frequency × quantization bits × number of channels) ÷ 8] × time (seconds), in bytes. In theory, the higher the sampling frequency and the more quantization bits the better, but the more disk space is required. The common WAV format (CD-quality WAV) has a sampling frequency of 44100 Hz, 16 quantization bits, and two channels. Such a file needs about 10 MB to store one minute of music, which takes up so much space that, apart from professionals (such as recording studios and other settings that demand extremely high sound quality), people generally do not choose WAV to store sound.
  2. APE: the original audio file (WAV) is usually very large. For example, the music on a CD totals about 700 MB; split into individual songs, each file is 20-60 MB. Files this large take up hard disk space and are unsuitable for transmission over the Internet, so they are usually compressed. There are many compression methods, which fall into two categories, lossless and lossy. As an example of lossless compression, the Monkey's Audio software can compress an original music file (WAV) to 50-60% of its original size; the resulting file format is APE. More and more people choose the APE format, and many music enthusiasts exchange APE music over the Internet.
  3. FLAC stands for Free Lossless Audio Codec: audio is compressed without losing any information. The compression is similar to Zip, but FLAC achieves a greater compression ratio on audio because it is designed specifically for the characteristics of audio, and FLAC files can be played in a player just as MP3 files are. FLAC emphasizes decoding speed. Each FLAC data frame contains all the information needed to decode it, so a frame is decoded without reference to previous or subsequent frames. FLAC uses a synchronization code and a CRC (as formats such as MPEG do), so the decoder can seek within the data stream with minimal delay, which also makes streaming possible. Ideal for archiving: FLAC is an open format, and because no data is lost, you can convert it to any other format you need. In addition to the per-frame CRC and an MD5 signature of the audio data, which protect data integrity, flac (the command-line encoder provided by the FLAC project) offers a verify option: while encoding, it immediately decodes the encoded data and compares it with the original input, and it exits with an error as soon as it finds a difference. Easy CD backup: FLAC has a "cue sheet" metadata block that stores the CD's table of contents and the index points of all tracks. You can save a CD to a single file and import its cue sheet, so that one FLAC file records all the information on the CD. If the original CD is damaged, you can use this file to restore an identical copy. Damage resistance: because of FLAC's frame structure, damage to the data stream is limited to the affected frames, and generally only a very short fragment is lost. In many other lossless audio formats, a single point of damage causes the loss of all subsequent data.
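
The WAV size formula given in the WAV entry above can be checked directly. In this small Python sketch the function name is an illustrative assumption, and the result covers only the audio data, not the file header.

```python
def wav_data_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size of uncompressed PCM audio data in bytes (file header not included)."""
    return sample_rate_hz * bits_per_sample * channels // 8 * seconds


# CD-quality audio: 44100 Hz, 16 bits, two channels, one minute.
one_minute = wav_data_bytes(44_100, 16, 2, 60)

print(one_minute)                    # 10584000
print(round(one_minute / 2**20, 2))  # 10.09 (MiB)
```

This matches the "about 10 MB per minute" figure. At the roughly 10:1 ratio quoted for MP3, the same minute would take about 1 MB.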

Video file format

Lossy compression formats: MPEG, AVI, ASF, MOV, WMV, 3GP, FLV/F4V
Lossless compression formats: Lossless compressed video also exists, for example on the Windows operating system, but it is not tied to a particular file format. MP4, for instance, can hold heavily compressed video as well as losslessly compressed video, but most of the video in everyday use is lossy.

Lossy compression detailed format:

  1. The MPEG file format is an international standard for moving-image compression that uses lossy compression to reduce redundant information in moving images. MPEG goes a step further than still-image compression: it keeps the parts that two adjacent pictures have in common and stores them only once, removing from each subsequent image what is redundant with the previous one. The main MPEG standards are MPEG-1, MPEG-2, MPEG-4, MPEG-7, and MPEG-21.
  2. AVI stands for Audio Video Interleave. It encapsulates video and audio in one file and allows audio to play in sync with video. Its advantages are that image quality can be good and that it works across multiple platforms. Its disadvantages are large file size and, worse, non-uniform compression standards: some AVI videos cannot be played because of video encoding problems, or play with odd defects such as a playback position that cannot be adjusted or sound without picture. Users who run into these problems can usually resolve them by downloading the corresponding decoder. Similar to the DVD video format, AVI files support multiple video and audio streams. AVI typically uses lossy compression for video; the compression ratio is relatively high, so although the picture quality is not always very good, the format is still very widely used.
  3. ASF is a compressed file format for watching video programs directly over the Internet. It can be played with the Windows Media Player that comes with Windows. It uses MPEG-4 compression, and both its compression rate and its image quality are very good. Because ASF is a streaming format that can be watched instantly over the Internet, its image quality is slightly worse than VCD but better than RAM, another streaming format.
  4. The MOV format is an audio and video file format developed by Apple for storing common digital media types. When QuickTime (*.mov) is selected as the save type, the animation is saved as a .mov file. QuickTime was originally image and video processing software used by Apple on Mac computers. It provides standard image and digital video formats: it supports the static *.PIC and *.JPG image formats, the dynamic *.MOV format based on the Indeo compression method, and the *.MPG video format based on the MPEG compression method. QuickTime is cross-platform (MacOS/Windows) and has small storage requirements; its MOV files use lossy compression, and the picture is slightly better than that of AVI. The format can now also be handled by non-linear editing software, including Adobe's professional video tools After Effects and Premiere.
  5. The WMV format is a streaming media format introduced by Microsoft as an upgrade and extension of the ASF format. At the same video quality, WMV files are very small, which makes the format well suited to playback and transmission over the Internet.
  6. The 3GP format is a multimedia standard formulated by the Third Generation Partnership Project (3GPP), that is, a video encoding format for 3G streaming media.
  7. The FLV/F4V format is also a video streaming format. Because its files are small and load quickly, it makes watching video over the Internet practical. It solved the problem that, once a video file was imported into Flash, the exported SWF file was too bulky to use well on the Internet, among other shortcomings, and so it came into wide use.
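
The idea described in the MPEG entry above, storing only what differs between adjacent pictures, can be illustrated with a toy frame-differencing sketch in Python. This is a deliberate simplification and not MPEG's actual method: real MPEG encoders use motion-compensated prediction followed by lossy transform coding, whereas this sketch is exact. The function names and the one-dimensional "frames" are assumptions.

```python
def encode_delta(previous, current):
    """Record only the positions whose value changed since the previous frame."""
    return [(i, new) for i, (old, new) in enumerate(zip(previous, current)) if old != new]


def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the recorded changes."""
    frame = list(previous)
    for index, value in delta:
        frame[index] = value
    return frame


frame1 = [5, 5, 5, 5, 9, 9, 5, 5]
frame2 = [5, 5, 5, 5, 5, 9, 9, 5]  # the bright object moved one pixel to the right

delta = encode_delta(frame1, frame2)
print(delta)                                 # [(4, 5), (6, 9)]
print(apply_delta(frame1, delta) == frame2)  # True
```

Only two of the eight values need to be stored for the second frame; the more similar consecutive frames are, the smaller the delta.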

Origin blog.csdn.net/ProgramNovice/article/details/128159793