Chapter 3 Principle and Technology of Image Coding

3.1.1 Statistical properties of images in spatial domain

  1. The concept of statistical properties of images in the spatial domain
  2. Image correlation function
  3. Histogram of the image

 1. Spatial domain statistical characteristics of images

  • The statistical characteristics of an image are the random statistical properties of the image signal itself (luminance, chrominance), or of the values obtained after some processing of it.
  • For example, adjacent pixels within a row, pixels in adjacent rows, and pixels at corresponding positions in successive video frames often have strong correlation.
  • The purpose of image compression coding is to remove the statistical redundancy inherent in image signals.

The image can be regarded as a random field, and therefore has the corresponding random characteristics.

In the spatial domain, a digital image appears as a spatially distributed dot matrix, and its statistical properties reflect the correlation between any two pixels. The similarity between pixels in the statistical-average sense is described mainly by the probability distribution function of the pixel values and by the correlation function that characterizes the relationship between pixels.

2. Image correlation function

 The correlation between adjacent pixels decreases as the distance between the two pixels increases.

3. Histogram of the image

The histogram reflects the probability distribution of pixel values, so it is also called the probability distribution function of the pixels.

Gray histogram (Histogram) is a function of gray level.

The histogram of any given image is unique, but the histogram cannot uniquely determine the image.
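As a quick illustration, the normalized gray histogram described above can be computed in a few lines of NumPy (the function name and the toy 4-level image are ours, not from the text):

```python
import numpy as np

def histogram(img, levels=256):
    """Count how often each gray level occurs, then normalize so the
    histogram sums to 1 (an empirical probability distribution)."""
    counts = np.bincount(img.ravel(), minlength=levels)
    return counts / counts.sum()

img = np.array([[0, 0, 1],
                [1, 1, 2],
                [2, 3, 3]], dtype=np.uint8)
h = histogram(img, levels=4)
print(h)  # roughly [0.2222 0.3333 0.2222 0.2222]
```

Note that two very different images can produce exactly this same histogram, which is the non-uniqueness point made above.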

3.1.2 Statistical properties of images in the frequency domain

1. The frequency domain statistical characteristics of the image

In the frequency domain, the image appears as the distribution of coefficients of different frequency components.

2. Power Spectrum Statistics of TV Signals

The power spectrum energy of the TV signal is mainly concentrated in the low frequency part, and as the frequency increases, its energy becomes smaller and smaller.

3. Statistical characteristics of image difference signal

The adjacent-pixel difference signals within a frame are defined as:

        Same-row difference: \Delta f_{H}(i,j) = f(i,j) - f(i,j-1)

        Same-column difference: \Delta f_{V}(i,j) = f(i,j) - f(i-1,j)

 Inter-frame difference definition:

        Let f_{k}(i,j) denote a pixel in the kth frame, and f_{k-1}(i,j) the pixel at the same geometric position in the (k-1)th frame. Then: d_{t}(i,j)=f_{k}(i,j)-f_{k-1}(i,j).

The statistical characteristics of inter-frame difference signals provide an important basis for inter-frame compression coding of TV signals.

The probability distribution of the adjacent-pixel differences within a frame and of the inter-frame differences is largest at 0.
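This behaviour is easy to observe numerically: for a smooth image the row and column difference signals are small and cluster around 0 (a minimal NumPy sketch; the 3×4 sample image is made up):

```python
import numpy as np

img = np.array([[10, 11, 11, 12],
                [10, 10, 11, 11],
                [ 9, 10, 10, 11]], dtype=np.int16)

# Same-row difference: Delta f_H(i,j) = f(i,j) - f(i,j-1)
dh = img[:, 1:] - img[:, :-1]
# Same-column difference: Delta f_V(i,j) = f(i,j) - f(i-1,j)
dv = img[1:, :] - img[:-1, :]

# Most differences are 0 or +/-1, which is what makes them compressible
print(dh)
print(dv)
```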

3.2.1 Basic principles of predictive coding

 1. Classification of predictive coding

  1. Delta modulation or DM coding method
  2. Differential Pulse Code Modulation or DPCM coding method

2. The principle of predictive coding 

Predictive Coding

Predictive coding uses a model to predict the current pixel value from pixel values at "past" moments. It usually does not encode the signal directly, but encodes the prediction error (the difference between the actual value and the predicted value). When the prediction is accurate and the error is small, coding compression is achieved.

Principle

For the true discrete-amplitude value of a pixel of the image, the correlation with its adjacent pixels is used to predict the likely value of the next pixel; the difference between the two is then computed, quantized, and encoded, thereby achieving compression.

3. Flow chart of predictive coding

 4. Predictive coding error

The prediction process at the receiving end is the same as at the sending end, and the same predictor is used. The signal output at the receiving end is an approximation of the sending-end signal, and the error between the two is

 The final error \Delta _{n}^{'}-\Delta _{n} is the difference between the quantizer output and the prediction error, so the error between the two is determined by the quantizer.

Notice:

  1. Multi-point prediction: an image is a two-dimensional matrix. For the current pixel x, the predicted value is formed from the surrounding pixels, for example x1, x2, and x3.
  2. The first few pixels of each line cannot be predicted and must be encoded in some other way; this is an extra operation required by predictive coding.
  3. The prediction coefficients vary from image to image, but computing them for every image is too troublesome to be practical, so values obtained by earlier work can be used. The international standard for still-image compression (JPEG) gives recommended values for the reference pixels and the prediction coefficients of this method.

 3.2.2 Lossless predictive coding

1. Coding idea

Adjacent pixels carry redundant information, so the current pixel value can be obtained from the previous pixel values (removing inter-pixel redundancy).

 2. Coding process

For the current pixel value fn, the predictor produces a predicted value f^n; the difference between the current value and the predicted value is computed, encoded, and used as the next element of the compressed data stream.

Since the differences are smaller than the original data, fewer bits are needed to encode them, and variable-length coding can be used. In most cases the prediction of fn is generated by a linear combination of the m previous pixels.

3. One-dimensional linear predictor

In one-dimensional linear (row) predictive coding, the predictor is:

 \hat{f}(x,y) = \mathrm{round}\left[\sum_{i=1}^{m} a_{i} f(x,y-i)\right]

where round denotes rounding to the nearest integer, a_{i} are the prediction coefficients (for example 1/m), and y is the row variable.

The first m pixels of each row cannot be encoded by this method; Huffman coding can be used for them, and predictive coding starts from the (m+1)th pixel.
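A minimal sketch of this row predictor, with the first m pixels stored as-is instead of Huffman-coded (the function names and sample row are illustrative, not from the text):

```python
import numpy as np

def dpcm_encode_row(row, m=1):
    """Lossless 1-D predictive coding of one image row.
    The first m pixels are sent as-is (the standard would entropy-code them);
    each later pixel is replaced by its prediction error."""
    row = row.astype(np.int32)
    errors = []
    for n in range(m, len(row)):
        pred = round(row[n - m:n].mean())  # a_i = 1/m, as suggested in the text
        errors.append(int(row[n]) - pred)
    return row[:m].tolist(), errors

def dpcm_decode_row(head, errors, m=1):
    """Run the same predictor at the receiver and add back the errors."""
    out = list(map(int, head))
    for e in errors:
        pred = round(sum(out[-m:]) / m)
        out.append(pred + e)
    return out

row = np.array([100, 102, 103, 103, 105, 110], dtype=np.uint8)
head, errs = dpcm_encode_row(row, m=2)
rec = dpcm_decode_row(head, errs, m=2)
assert rec == row.tolist()  # lossless: exact reconstruction
```

Because there is no quantizer, the reconstruction is exact; the gain comes from the errors being small and suitable for variable-length coding.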

4. Lossless predictive encoding and decoding process

 

 3.2.3 Lossy predictive coding

 Lossy compression coding increases the compression ratio by sacrificing image accuracy.

 Compression ratio for lossy compression methods:

  • When the image compression ratio is greater than 30:1, the image can still be reconstructed
  • When the image compression ratio is 10:1 to 20:1, the reconstructed image is almost indistinguishable from the original image
  • The compression ratio of lossless compression rarely exceeds 3:1

 The fundamental difference between lossy and lossless compression methods is whether there is a quantization module.

1. The basic idea of the quantizer

 The simplest way to reduce the amount of data is to quantize the image to fewer gray levels, achieving compression by reducing the number of gray levels. This quantization is irreversible, so information is lost on decoding.

Before quantization the gray levels run from 0 to 255, 256 levels in all. During quantization, the gray levels between 0 and s1 are quantized to t1, those in the next interval to t2, and so on. In this way 256 gray levels are quantized to 4 levels, the data volume drops greatly, and image information is inevitably lost.
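A minimal sketch of such a gray-level quantizer, assuming uniform thresholds s_k and bin-midpoint representative values t_k (the text does not fix the thresholds, so these are illustrative choices):

```python
import numpy as np

def quantize_levels(img, n_levels=4):
    """Map 256 gray levels down to n_levels representative values.
    Thresholds split [0, 255] evenly; each bin maps to its midpoint t_k."""
    step = 256 // n_levels                         # 64 for 4 levels
    idx = np.asarray(img, dtype=np.int32) // step  # which bin each pixel falls in
    t = idx * step + step // 2                     # representative value (bin midpoint)
    return t.astype(np.uint8)

img = np.array([0, 63, 64, 200, 255], dtype=np.uint8)
print(quantize_levels(img))  # [ 32  32  96 224 224]
```

The mapping is many-to-one, which is why it cannot be inverted: 0 and 63 both become 32.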

2. The basic idea of lossy prediction

Quantize the prediction error of lossless predictive coding; further compression is achieved by eliminating visual (psychovisual) redundancy.

Introducing the step of quantization on the basis of lossless predictive coding , the lossy predictive coding is obtained:

fn' is the recovered image at the receiving end
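The scheme above, with the quantizer placed inside the prediction loop, can be sketched as follows (previous-pixel predictor and uniform error quantizer are illustrative choices). The encoder predicts from the *reconstructed* values, so the decoder, which runs the same recursion, stays in sync and the error does not accumulate:

```python
import numpy as np

def lossy_dpcm_row(row, step=4):
    """Lossy DPCM on one row: quantize the prediction error, and drive the
    predictor with the reconstructed values fn' (sketch, previous-pixel
    predictor)."""
    rec = [int(row[0])]                  # first pixel sent as-is
    codes = []
    for x in map(int, row[1:]):
        e = x - rec[-1]                  # prediction error against reconstruction
        q = int(np.round(e / step))      # uniform quantizer
        codes.append(q)
        rec.append(rec[-1] + q * step)   # the decoder performs exactly this step
    return codes, rec

row = np.array([100, 105, 106, 120], dtype=np.uint8)
codes, rec = lossy_dpcm_row(row, step=4)
print(codes, rec)
```

The reconstruction rec differs from the input by at most half a quantizer step per sample; making step larger raises compression and loss together.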

3. Lossy predictive coding process

 4. Lossy predictive decoding process

3.3.1 Basic Principles of Orthogonal Transform Coding

  1. The basic principle of transform coding, the basic principle of orthogonal transform coding
  2. Properties of Orthogonal Transform Coding

The key technology of orthogonal transformation:

The purpose of data compression is achieved by eliminating the correlation in the source sequence.

Predictive coding: coding in the spatial domain

Transform coding: coding in the transform domain

1. The basic principle of transform coding

 The most important type of transformation coding method is the orthogonal transformation coding method (or function transformation coding method):

 Image encoding and decoding process

 2. Properties of Orthogonal Transform Coding

  1. Entropy-preserving property
  2. Energy-preserving property (the energy is equal in the spatial domain and in the transform domain)
  3. Energy redistribution and concentration
  4. Decorrelation property (after the transform, the correlation between pixels becomes smaller)
  • In summary, an orthogonal transform turns the correlated spatial domain of an image into an energy-preserving, energy-concentrated, and decorrelated transform domain.
  • If transform coefficients are used instead of spatial sample codes for transmission, only the energy-concentrated part of the transform coefficients needs to be coded, so that the code rate required for digital image transmission or storage can be compressed. 

3.3.2 Discrete Fourier Transform (DFT)

  1.  Advantages of DFT
  2. One-dimensional DFT
  3. Two-dimensional DFT
  4. Properties of 2D DFT
  5. Application of DFT in Image Processing

        In the orthogonal transformation of image signals, there are mainly discrete Fourier transform (DFT), discrete cosine transform (DCT), discrete Walsh transform (DWT), discrete Hadamard transform (DHT) and so on.

Advantages of image signal orthogonal transformation:

        Images contain a large amount of data. Processing them directly in the spatial domain requires heavy computation, and the cost rises sharply as the number of image samples grows, making real-time processing difficult. An orthogonal transform converts the input image from the spatial domain to the frequency domain, where convolution or correlation in the spatial domain reduces to multiplication in the frequency domain. This greatly reduces the amount of computation, increases processing speed, and can make otherwise infeasible real-time processing possible.

1. Advantages of DFT 

 2. One-dimensional DFT

         The Fourier transform is usually written in complex form, that is, F(u)=R(u)+jI(u), where R(u) and I(u) are the real and imaginary parts of F(u) respectively.

        The Fourier transform can also be written in exponential form, namely F\left ( u \right )=|F(u)|e^{j\varphi (u)}, where |F(u)| is the modulus of F(u):

|F(u)| = \sqrt{R^{2}(u) + I^{2}(u)}, \qquad \varphi(u) = \arctan\frac{I(u)}{R(u)}

\left | F(u) \right | is often called the spectrum of f(x) or the Fourier magnitude spectrum, and \varphi \left ( u \right ) is the phase spectrum of f(x).
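These relations can be checked directly with NumPy's FFT (a small numeric example of our own, not from the text):

```python
import numpy as np

f = np.array([1.0, 2.0, 3.0, 4.0])
F = np.fft.fft(f)        # F(u) = R(u) + jI(u)
magnitude = np.abs(F)    # |F(u)| = sqrt(R^2 + I^2), the Fourier magnitude spectrum
phase = np.angle(F)      # phi(u), the phase spectrum

print(magnitude)  # [10.  2.828...  2.  2.828...]
```

F(0) is the sum of the samples (here 10), and for a real input the magnitudes are symmetric, as conjugate symmetry requires.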

3. Two-dimensional DFT

 1. Definition of two-dimensional DFT

  • The complex form of the 2D Fourier transform, that is: F(u,v)=R(u,v)+jI(u,v)
  • The Fourier spectrum of the two-dimensional Fourier transform, namely: |F(u,v)| = \sqrt{R^{2}(u,v) + I^{2}(u,v)}
  • The phase spectrum of the 2D Fourier transform, namely: \varphi(u,v) = \arctan\frac{I(u,v)}{R(u,v)}

4. Properties of two-dimensional DFT 

  • Separability
  • Translation property
  • Rotation invariance
  • Linearity
  • Conjugate symmetry
  • Scalability
  • Convolution theorem

 1. Separability

 For the forward transformation , it can be decomposed into the following two formulas:

         Each of the formulas above is a one-dimensional discrete Fourier transform, so the two-dimensional discrete Fourier transform F(u,v) can be obtained by first taking the one-dimensional DFT of f(x,y) along each row and then taking the one-dimensional DFT along each column.
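Separability can be verified numerically (a small NumPy check; the 4×4 test array is arbitrary):

```python
import numpy as np

f = np.arange(16, dtype=float).reshape(4, 4)

# 2-D DFT computed directly
F2 = np.fft.fft2(f)

# Same result by separability: 1-D DFT along every row, then along every column
F_sep = np.fft.fft(np.fft.fft(f, axis=1), axis=0)

assert np.allclose(F2, F_sep)
```

This row-then-column factorization is what lets an N×N 2-D DFT be computed with 2N one-dimensional transforms instead of one large double sum.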

 2. Translational properties

 3. Rotation invariance

         This indicates that if the discrete function is rotated by an angle \Theta _{0} in the spatial domain, its discrete Fourier transform is rotated by the same angle in the transform domain.

 DFT spectrum analysis:

         In the Fourier-transformed image, the middle part is bright; this is the low-frequency part, where the spectral energy is concentrated, and the frequency increases toward the outside.

5. Application of DFT in image processing

  •  Application of DFT in Image Filtering

 In the image after the DFT, the low-frequency part lies in the middle and the frequency increases toward the outside; the required high-pass or low-pass filtering can therefore be carried out by selecting the appropriate region of coefficients.

  •  Application of DFT in Image Compression

         The transform coefficients represent the amplitude at each frequency point. Before the wavelet transform was proposed, the DFT was used for compression coding. Since high frequencies reflect details and low frequencies reflect the general appearance of the scene, high-frequency coefficients can be set to 0, exploiting the insensitivity of the human eye; some high-frequency coefficients are discarded when the resulting error is acceptable.

  • Application of DFT in convolution

Convolution in the time (spatial) domain becomes multiplication of spectra in the frequency domain

3.3.3 Discrete Cosine Transform DCT

  1. One-dimensional Discrete Cosine Transform DCT
  2. Two-dimensional Discrete Cosine Transform DCT
  3. Spectrogram Analysis of DCT
  4. Coefficients of Orthogonal Transformation

         One of the biggest problems with the Fourier transform is that its coefficients are complex numbers, which amounts to twice the data of a real-valued description (a complex number is expressed as a real part plus j times an imaginary part). A transform that achieves the same effect with less data is therefore desirable, and this motivates the DCT.

1. One-dimensional discrete cosine transform

 Complex numbers do not appear in the definitions of forward and inverse transformations.

2. Two-dimensional discrete cosine transform

 3. Spectrum analysis of DCT

 4. Properties of Orthogonal Transformation

3.3.4 Still Image Compression Standard JPEG 

  1. Introduction to the JPEG Standard
  2. Key Technology Introduction
    1. lossless compression coding
    2. DCT-based sequential coding mode

1. Introduction to JPEG standard

          JPEG (Joint Photographic Experts Group) is the abbreviation of the Joint Photographic Experts Group, a committee engaged in the formulation of still image compression standards. JPEG is now also used to represent the still image compression standard.

 There are several operating modes to choose from when designing and using:

Lossless compression coding mode : This mode guarantees accurate restoration of all sample data of the digital image without any distortion compared with the original digital image.

DCT-based sequential coding mode : It is based on DCT transformation, and compresses and codes the original image data in order from left to right and from top to bottom. When the image is restored, it is also carried out in accordance with the above sequence.

DCT-based progressive coding mode : Based on DCT transformation, but using multiple scans to encode image data, it is carried out in a gradual and cumulative manner from coarse to fine. When decoding, you can first see the general appearance of the image on the screen, and then gradually refine it until it is fully restored.

DCT-based layered coding mode : the image is coded at multiple resolutions, starting from a low resolution and gradually increasing the resolution until it equals that of the original image. Decoding reconstructs the image by the same progression.

Compression process of JPEG standard

 2. Introduction of key technologies

      (1) Lossless compression coding

                JPEG selects differential pulse code modulation (DPCM) as the method of lossless compression coding

The predictor adopts the 3-neighborhood prediction method, using three adjacent sampling points (A, B, and C) to predict the current coded sampling point X

There are many prediction methods. In the lossless coding mode, 7 predictors are provided for users to choose:

         (2) DCT-based sequential coding mode

  1. Data units
  2. 8×8 DCT transform
  3. Quantization
  4. DC and AC coefficient scanning
  5. Entropy coding

 1. Data unit

        Before encoding, each component of the input image (for a color image stored in RGB space, the R, G, and B components) is divided into non-overlapping 8×8 sub-blocks; the 64 values in a block form a data unit (DU). If the number of rows or columns of the image is not a multiple of 8, the bottom row or the rightmost column is replicated until it is.

        Although JPEG can compress the usual RGB components, the compression effect is better in the luma/chroma space (YUV space).

        The conversion between RGB and YUV is not included in the codec, but is done by the application as needed before encoding and after decoding.

 2. 8×8 DCT

 JPEG uses 8*8 sub-image blocks for two-dimensional discrete cosine transform:

 After the 8*8 f(x,y) is transformed by DCT, we can get the coefficients in the DCT domain, which are represented by F(u,v).

After the DCT transformation, the energy will be transferred, generally concentrated on a few low-frequency coefficients in the upper left corner.

 When f(x,y) is an 8-bit pixel, its values range from 0 to 255, so the DC coefficient F(0,0) can range from 0 to 2040.

Before the transform, the sampled image data are converted from unsigned to signed integers: the integers in the range [0, 2^{8}-1] are mapped to integers in the range [-2^{8-1}, 2^{8-1}-1] (that is, the range 0~255 is mapped to -128~127). The conversion simply subtracts 2^{8-1} from the input data.
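Putting the level shift and the 8×8 DCT together (a minimal sketch using a direct orthonormal DCT-II matrix; the constant 8×8 test block is ours). For a constant block all AC coefficients are 0 and the DC coefficient is 8 times the shifted mean:

```python
import numpy as np

def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block, in direct matrix form."""
    N = block.shape[0]
    n = np.arange(N)
    C = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N)) * np.sqrt(2 / N)
    C[0, :] /= np.sqrt(2)          # DC basis row gets the 1/sqrt(2) factor
    return C @ block @ C.T

block = np.full((8, 8), 130.0)     # constant test block
shifted = block - 128              # level shift: subtract 2^(8-1) = 128
F = dct2(shifted)
print(round(F[0, 0], 3))           # DC coefficient: 8 * mean(shifted) = 16.0
```

With this orthonormal normalization, the energy-preservation property of Section 3.3.1 holds exactly: sum(shifted**2) equals sum(F**2).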

3. Quantization

 Quantification process:

 Dequantization process:

         After the color image is converted to the YUV space, it has a luminance component Y and chrominance components U and V. Different components use different quantization tables: there is a luminance quantization step table and a chrominance quantization step table.

After quantization, the DCT coefficient matrix becomes sparse, and most of the high-frequency component coefficients located in the lower right corner of the matrix are quantized to 0.

The purpose of quantization is to make F(u,v) more sparse and energy more concentrated.
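The quantize/dequantize step can be sketched as follows (the uniform step table Q and the coefficient values here are made up; the real JPEG tables use a different step per frequency and per component):

```python
import numpy as np

# Forward quantization: Fq(u,v) = round(F(u,v) / Q(u,v))
# Dequantization:      F'(u,v) = Fq(u,v) * Q(u,v)
Q = np.full((8, 8), 16)            # illustrative uniform step table
F = np.zeros((8, 8))
F[0, 0], F[0, 1], F[1, 0] = 236.0, -22.0, 5.0

Fq = np.round(F / Q).astype(int)   # small high-frequency coefficients become 0
F_rec = Fq * Q                     # dequantized approximation at the decoder

print(Fq[0, 0], Fq[0, 1], Fq[1, 0])  # 15 -1 0
```

Note how the small coefficient 5.0 is quantized to 0 and is therefore lost; this is the irreversible step that makes the coefficient matrix sparse.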

4. DC coefficient and AC coefficient scan

        After the DCT, the DC coefficient reflects the average value of the 64 pixels in the 8×8 sub-block and carries an important part of the total energy of the image, so the DC coefficient and the remaining 63 AC coefficients are coded separately.

To encode the DC coefficients: To encode the AC coefficients:

 5. Entropy coding

         There are two entropy coding methods suggested by JPEG: Huffman coding and adaptive binary arithmetic coding. The former uses the Huffman code table, and the latter uses the conditional code table of the arithmetic code.

        When coding, DC coefficients and AC coefficients use different Huffman coding tables, and luminance and chrominance also require different Huffman coding tables, so a total of 4 coding tables are required.

(1) DC coefficient encoding

        The form of " prefix code (SSSS) + tail code " is adopted: the prefix code indicates the effective number of digits of the tail code (set to B), and standard Huffman coding is used; the tail code directly adopts B-bit natural binary code.

        For the 8-bit precision JPEG basic system, the value range of the prefix code SSSS is 0~11, and the SSSS code table has 12 items in total.

 After finding out the digits of its prefix code word and tail code from the table according to the amplitude range of Diff, the tail code word can be directly written according to the following rules:

 (2) AC coefficient encoding

After quantization and zigzag scanning, there will be more 0s in the AC coefficients, so the run-length encoding of 0 coefficients is adopted.

        First, the sequence of AC coefficients is represented as 00...0X, 00...0X, ..., where X is a non-zero value. A run of 0s followed by a non-zero value forms a basic coding unit.

        A series of 0s in the basic coding unit can be represented by the number of runs, that is, the number of 0s, followed by a prefix code + tail code similar to DC coefficient coding, forming a "zero run length/category/non-zero value" structure .
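A minimal sketch of the zigzag scan and the zero-run grouping of the AC coefficients (the (run, value) pairs here stand in for the full "zero run length/category/non-zero value" Huffman stage, and the block contents are illustrative):

```python
import numpy as np

def zigzag_indices(n=8):
    """Zigzag scan order for an n x n block (anti-diagonals, alternating)."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        order.extend(diag if s % 2 else diag[::-1])
    return order

def rle_ac(block):
    """Run-length pairs (zero_run, nonzero_value) over the 63 AC coefficients."""
    ac = [block[i, j] for i, j in zigzag_indices(len(block))][1:]  # skip DC
    pairs, run = [], 0
    for v in ac:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append((0, 0))  # end-of-block marker (EOB), as in JPEG
    return pairs

block = np.zeros((8, 8), dtype=int)
block[0, 0], block[0, 1], block[1, 0], block[2, 0] = 15, -2, -1, 3
print(rle_ac(block))  # [(0, -2), (0, -1), (0, 3), (0, 0)]
```

The long tail of 60 zeros collapses into the single EOB pair, which is where most of the compression of a sparse quantized block comes from.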

Several Code Tables in AC Coefficient Coding

 Eg:

 3.3.5 JPEG codec example

1. Coding

 2. Decoding

 The decoder dequantizes the received DCT coefficients to obtain:

Perform the inverse DCT process, and add 128 to each element to obtain the reconstruction block:

 3. Coding analysis

 

 3.3.6 JPEG encoding extension system

  1.  Progressive Coding Mode Based on DCT
  2.  Layered Coding Mode Based on DCT
  3. JPEG encoding summary

(1) DCT-based progressive coding mode 

        The compression coding algorithm used in this mode is the same as the DCT-based sequential coding mode, but the coding of each image component needs to be scanned multiple times , and a part of the DCT quantization coefficients are transmitted in each scan .

  • The first scan only performs rough compression, and the rough image is transmitted at a fast speed, and the receiver can reconstruct a lower-quality but recognizable image based on this.
  • In the next few scans, the image is compressed more carefully. At this time, only some additional information is transmitted, and the quality of the reconstructed image is gradually improved by the receiver after receiving it.
  • This is done step by step until all image information is processed.

How progressive encoding works:

  1. Spectral selection: each scan of the DCT coefficients compresses, encodes, and transmits only the coefficients of certain frequency bands among the 64 coefficients. Subsequent scans encode and transmit the remaining bands until all coefficients have been processed.
  2. Successive approximation: encoding proceeds gradually from the most significant bits of the DCT coefficients toward the least significant bits.

 (2) DCT-based layered coding mode 

The operating mode of layered coding divides an original image, by spatial resolution, into multiple lower-resolution images for "pyramid" coding.

The process of layered encoding:

  1. Reduce the resolution of the original image in layers.
  2. Compress the reduced-resolution image with any one of lossless predictive coding, DCT-based sequential coding, or DCT-based progressive coding.
  3. Decode the low-resolution image and reconstruct it.
  4. Use interpolation and filtering to raise the resolution of the reconstructed image to that of the next layer.
  5. Use the up-sampled image as a prediction of the image at that layer, and encode the difference between the two with any of the three methods above (lossless predictive coding, DCT-based sequential coding, or DCT-based progressive coding).
  6. Repeat steps (3), (4), and (5) until the image reaches the resolution of the original.

(3) JPEG encoding summary

The JPEG standard stipulates that the JPEG algorithm structure consists of three main parts:

  1. Independent lossless compression coding: linear predictive coding and Huffman coding (or arithmetic coding) are used to guarantee that the reconstructed image is identical to the original (zero mean-square error).
  2. Basic system: provides the simplest image encoding/decoding capability, realizing lossy compression whose damage is subjectively hard to detect. It uses 8×8 DCT, linear quantization, Huffman coding, and similar techniques, and supports only the sequential operating mode.
  3. Extended system: expands the basic system with a set of functions, such as binary arithmetic entropy coding, the progressive operating mode, and the progressive lossless coding mode. It is an extension or enhancement of the basic system and therefore must also include the basic system.

A quality control factor Q is defined in the JPEG standard. During quantization this factor multiplies the step sizes in the quantization table to give the actual quantization step sizes. By using the Q factor to change the quantization step, the coding quality or coding bit rate can be controlled to meet the requirements of users or channels.

3.4.1 Skip white block coding (WBS)

  1. Introduction to Binary Image Coding
  2. Skip white block coding (WBS coding)
    1. One-dimensional WBS coding
    2. Two-dimensional WBS coding
    3. Adaptive WBS Coding

(1) Introduction to binary image coding

         A binary image has only two brightness values, so each pixel is represented by one bit, with "0" for black and "1" for white (or vice versa); this is usually called direct coding. In direct coding, the number of symbols representing a frame equals the number of pixels in the image.

        The most common and typical form of communication for transmitting binary images is fax.

Common encoding methods:

  • Run length coding (RLC coding)
  • Skip white block coding (WBS coding)
  • block code

(2) Skip White Block Code (WBS)

The basic idea of white block skipping (WBS) encoding:

In practice, most binary images have a white background, and black pixels only occupy a small part of the image pixels. Therefore, if the white area can be skipped and only the black pixels are coded, the transmission bit rate can be reduced .

1. One-dimensional WBS coding

        One-dimensional WBS encoding divides each scan line into segments of N pixels each. If all N pixels in a segment are white, the segment is represented by the single 1-bit codeword 1. If the segment is not all white, even if it contains only one black pixel, it is represented by an (N+1)-bit codeword: the first bit is 0, and the remaining N bits code the pixels directly, 1 for white and 0 for black.
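A minimal sketch of 1-D WBS on one scan line (white = 1, black = 0; the code is returned as a bit string for readability):

```python
def wbs_encode_line(line, N=4):
    """One-dimensional white block skipping: an all-white segment of N pixels
    becomes the single bit '1'; any other segment becomes '0' followed by the
    N raw bits (white = 1, black = 0)."""
    bits = []
    for k in range(0, len(line), N):
        seg = line[k:k + N]
        if all(p == 1 for p in seg):
            bits.append("1")
        else:
            bits.append("0" + "".join(str(p) for p in seg))
    return "".join(bits)

# 12 pixels, with a single black pixel in the middle segment
line = [1] * 4 + [1, 0, 1, 1] + [1] * 4
print(wbs_encode_line(line, N=4))  # '1' + '01011' + '1' = '1010111'
```

Here 12 direct-coded bits shrink to 7; the gain grows with the fraction of all-white segments.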

 2. Two-dimensional WBS coding

        Two-dimensional WBS coding divides the image to be transmitted into blocks of M×N pixels each. An all-white block is represented by the 1-bit codeword 1; a block that is not all white is represented by an (MN+1)-bit codeword, where the first bit is 0 and the remaining MN bits code the pixels directly.

3. Adaptive WBS coding

         Adaptive WBS coding varies the size of the pixel block (the block is not fixed, but can be made larger or smaller) according to the local structure or statistical characteristics of the image, further improving the compression. Adaptive WBS coding can greatly reduce the number of bits representing an image, but the adaptation increases the complexity of the equipment.

3.4.2 Block encoding

  1. Block Code (BTC) Principle
  2. BTC encoding steps
  3. example
  4. Choice of decision threshold

(1) BTC coding principle 

         Block truncation coding (BTC) divides the image into sub-blocks of equal size, finds two representative brightness values in each sub-block to approximately represent the brightness of every pixel in the block, and forms a bitmap indicating which of the two values each pixel takes.

(2) BTC encoding steps

  1. First divide the image into n×n (usually n = 4) non-overlapping blocks;
  2. Then determine the decision threshold (which can be the mean \overline{x} and the variance \sigma), the two representative brightness values a and b, and the bitmap B;
  3. Then send \overline{x}, \sigma, and B to the receiving end;
  4. The receiving end restores the image from B and a, b.

 How to choose a and b?

Let \overline{x} be the average brightness of all pixels in the block, \sigma ^{2} the variance, and m the total number of pixels:

\overline{x} = \frac{1}{m}\sum_{i=1}^{m} x_{i}, \qquad \sigma^{2} = \frac{1}{m}\sum_{i=1}^{m}\left ( x_{i}-\overline{x} \right )^{2}

 

         According to the principle of BTC, the two brightness values a and b should approximately represent the brightness of each pixel in the block. Requiring the mean and variance to be unchanged before and after encoding determines a and b: with q the number of pixels whose value is not less than the threshold \overline{x},

a = \overline{x} - \sigma\sqrt{\frac{q}{m-q}}, \qquad b = \overline{x} + \sigma\sqrt{\frac{m-q}{q}}
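A minimal BTC sketch for one block, using the standard moment-preserving choice of a and b (it preserves the block mean and variance, as the text requires; it assumes the block is not constant so that q and m−q are both non-zero, and the 2×2 block is illustrative):

```python
import numpy as np

def btc_encode(block):
    """BTC for one block: threshold at the mean, then pick a (for the 'low'
    pixels) and b (for the 'high' pixels) so that mean and variance are
    preserved. q = number of pixels >= mean, m = total pixels."""
    x = block.astype(float).ravel()
    m, mean, sigma = x.size, x.mean(), x.std()
    bitmap = (x >= mean).astype(int)
    q = int(bitmap.sum())
    a = mean - sigma * np.sqrt(q / (m - q))  # low representative value
    b = mean + sigma * np.sqrt((m - q) / q)  # high representative value
    return bitmap.reshape(block.shape), a, b

def btc_decode(bitmap, a, b):
    return np.where(bitmap == 1, b, a)

block = np.array([[2, 9], [12, 15]], dtype=float)
bitmap, a, b = btc_encode(block)
rec = btc_decode(bitmap, a, b)
# Mean and variance are preserved by construction
assert np.isclose(rec.mean(), block.mean())
assert np.isclose(rec.std(), block.std())
```

The transmitted data per 4×4 block would be one bitmap (16 bits) plus the mean and standard deviation, far less than 16 raw pixels.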

        

 (3)Eg:

 Block truncation coding is not information-preserving, so some information loss is inevitable. However, two important statistical characteristics of the block, average brightness and contrast, are kept unchanged: its mean and variance are the same before and after encoding.

 (4) Choice of decision threshold

 In order to keep more information of the image before and after encoding from being lost, an appropriate decision threshold must be selected.

 In the BTC basic coding method, the mean and variance are chosen as the judgment criterion.

 In fact, other selection criteria are also possible, for example keeping the third-order moments unchanged before and after encoding.

 In that case q is obtained by keeping the third-order moment unchanged. Although the amount of computation increases, more information is preserved before and after encoding.

 3.4.3 Bit-plane coding

 Coding principle: bit-plane coding decomposes a gray-level image into a series of binary images and then compresses each binary image with a binary compression method.

More effective than the Huffman method for correlated sources.

There are two main steps: bit-plane decomposition and bit-plane encoding .

1. Bit plane decomposition 

(1) Binary decomposition

         For an image whose gray values are represented with multiple bits, each bit can be regarded as a binary plane, also called a bit plane.

        An image with 8-bit gray values has 8 bit planes; bit plane 0 denotes the lowest-order plane and bit plane 7 the highest-order plane.
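Bit-plane decomposition is a few lines of NumPy (the 2×2 test image is illustrative):

```python
import numpy as np

def bit_planes(img, bits=8):
    """Decompose a gray image into binary bit planes.
    Plane 0 is the least significant bit, plane bits-1 the most significant."""
    img = np.asarray(img, dtype=np.uint8)
    return [(img >> k) & 1 for k in range(bits)]

img = np.array([[0, 255], [128, 3]], dtype=np.uint8)
planes = bit_planes(img)
print(planes[7])  # MSB plane: [[0 1] [1 0]]
print(planes[0])  # LSB plane: [[0 1] [0 1]]
```

Summing the planes back with their bit weights reconstructs the image exactly, so the decomposition itself is lossless.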

 Disadvantages of bit-plane decomposition:

Small changes in pixels may have a more obvious impact on the complexity of the bit plane. 

 (2) Gray-code decomposition

 The high-order bit planes of an image carry a large number of visible relevant details, and the low-order planes are distributed with some fine details.

A grayscale code bit-plane is less complex than a corresponding binary bit-plane.
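The standard binary-to-Gray conversion g = b XOR (b >> 1) shows why: two gray values that differ by 1 always yield Gray codes differing in a single bit, so small pixel changes disturb fewer planes (small NumPy sketch, our own example values):

```python
import numpy as np

def to_gray_code(img):
    """Convert binary-coded gray values to Gray code: g = b XOR (b >> 1)."""
    img = np.asarray(img, dtype=np.uint8)
    return img ^ (img >> 1)

# 127 (01111111) and 128 (10000000) differ in all 8 binary bits,
# but their Gray codes differ in only one bit
g = to_gray_code(np.array([127, 128], dtype=np.uint8))
print(g)  # [ 64 192]
```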

2. Bit-plane coding 

  • Bit-plane encoding: that is, the encoding of the decomposed bit-plane image
  • The main methods are constant area coding and 1-D run-length coding, which aim to eliminate inter-pixel redundancy
  • 1-D run-length (one-dimensional run-length) and 2-D run-length are the basis for the techniques used in two binary image compression standards (G3 and G4) used in facsimile machines.

(1) Constant area coding (CAC)

         Special codewords are used to represent connected regions that are all 0 or all 1.

        The image is divided into all-black, all-white, or mixed blocks of m×n pixels. The category that occurs most frequently is assigned the 1-bit codeword 0; the other two categories are assigned the 2-bit codewords 10 and 11. Note that the codeword of a mixed block serves only as a prefix; the m×n-bit pattern of the block is still appended.

        The pixels in a constant block, which originally required m×n bits (1 bit each), are now represented by only 1 or 2 bits, achieving compression.

(2) 1-D run length coding (run length coding, RLC)

        Also known as forming codes, it is effective for binary images.

        Basic principle: consecutive symbols with the same value (which form a "run", hence the name run-length coding) are replaced by the symbol value and the run length; a record is made only where the symbols of a row or column change, noting how many times the symbol repeats, thereby compressing the data.

· Run-length coding compresses images with large areas of uniform gray level very effectively; it is not suitable for images with many gray-level changes.
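A minimal run-length codec over one line of a binary image (the (symbol, run) pair representation is illustrative; the fax standards G3/G4 additionally Huffman-code the run lengths):

```python
def rle_encode(symbols):
    """1-D run-length coding: replace each run of identical symbols
    with a (symbol, run_length) pair."""
    pairs = []
    for s in symbols:
        if pairs and pairs[-1][0] == s:
            pairs[-1][1] += 1          # extend the current run
        else:
            pairs.append([s, 1])       # start a new run
    return [tuple(p) for p in pairs]

def rle_decode(pairs):
    return [s for s, n in pairs for _ in range(n)]

line = [1, 1, 1, 1, 0, 0, 1, 1, 1]
codes = rle_encode(line)
print(codes)  # [(1, 4), (0, 2), (1, 3)]
assert rle_decode(codes) == line
```

Nine pixels become three pairs here; on a line with alternating pixels the pairs would outnumber the pixels, which is exactly the unsuitable case noted above.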

 


Origin blog.csdn.net/m0_46303430/article/details/125935888