MATLAB simulation of Huffman-coding compression and decompression of a speech signal, reporting the coded data size, the coding tree, and other metrics

Table of contents

1. Algorithm simulation results

2. Overview of the theory behind the algorithm

3. MATLAB core program

4. Complete algorithm code file


1. Algorithm simulation results

The simulation results obtained with MATLAB 2022a are as follows:

[Figures: simulation result screenshots]


2. Overview of the theory behind the algorithm

        Using Huffman coding for communication can improve channel utilization, shorten transmission time, and reduce transmission cost. To achieve this, the sending end must encode the data before transmission, and the receiving end must decode (recover) the data it receives. On a duplex channel (that is, a channel that carries information in both directions), each end therefore needs a complete encoding/decoding system. The task considered here is to build such a Huffman encoding and decoding system.
        Huffman coding is a variable-length coding (VLC) method proposed by Huffman in 1952. Based entirely on the occurrence probabilities of the symbols, it constructs a prefix-free code with the shortest average codeword length, which is why it is sometimes called optimal coding.

        Suppose a source produces five symbols u1, u2, u3, u4 and u5 with probabilities P1=0.4, P2=0.1, P3=P4=0.2, P5=0.1. First, the symbols are arranged in descending order of probability, as shown in Figure 1. Coding starts from the two symbols with the smallest probabilities: one branch is labeled 0 and the other 1 (here the upper branch is 0 and the lower branch is 1). The probabilities of the two branches are then added, and the merged node is put back in the queue and re-sorted. This step is repeated until the merged probability reaches 1. Parts (a) and (b) of Figure 1 show that, although the two codes have the same average length, the same symbol can receive codewords of different lengths, so the Huffman code is not unique. The ambiguity arises because several branches may have equal probability when the queue is re-sorted. In general, placing a newly merged branch above the other branches of equal probability reduces the variance of the codeword lengths and yields a code closer to a fixed-length code. For this reason the code in Figure 1(a) is preferable to the one in Figure 1(b).
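The merging procedure described above can be sketched in a few lines of MATLAB. This is only an illustrative sketch, not the program used for the simulation: the variable names (codes, groups, w, lenAlt) are assumptions introduced here, and which branch receives 0 or 1 is an arbitrary choice. The second length assignment, lenAlt, is a hand-derived alternative that results from breaking the ties differently; it is included to show that the average length stays the same while the variance changes.

```matlab
% Source from the text: symbols u1..u5 with their probabilities.
p = [0.4 0.1 0.2 0.2 0.1];
n = numel(p);

codes  = repmat({''}, 1, n);   % codeword of each symbol, built back to front
groups = num2cell(1:n);        % symbols under each node still in the queue
w      = p;                    % probability of each node still in the queue

while numel(w) > 1
    [~, order] = sort(w, 'ascend');
    a = order(1);              % least probable node
    b = order(2);              % second least probable node
    for s = groups{a}          % one branch is labeled 1 ...
        codes{s} = ['1' codes{s}];
    end
    for s = groups{b}          % ... and the other branch 0
        codes{s} = ['0' codes{s}];
    end
    merged = [groups{a} groups{b}];
    wsum   = w(a) + w(b);
    keep   = setdiff(1:numel(w), [a b]);
    groups = [groups(keep) {merged}];   % requeue the merged node
    w      = [w(keep) wsum];
end

len    = cellfun(@numel, codes);
avgLen = sum(p .* len);                     % 2.2 bits/symbol for this source
varLen = sum(p .* (len - avgLen).^2);

% A second valid Huffman assignment for the same source (ties broken differently).
lenAlt = [1 4 2 3 4];
avgAlt = sum(p .* lenAlt);                  % also 2.2 bits/symbol
varAlt = sum(p .* (lenAlt - avgAlt).^2);    % 1.36, much more spread out

for s = 1:n
    fprintf('u%d  p=%.1f  code=%s\n', s, p(s), codes{s});
end
fprintf('average length %.2f, variance %.2f (alternative: %.2f, %.2f)\n', ...
        avgLen, varLen, avgAlt, varAlt);
```

Both assignments reach the same average of 2.2 bits per symbol, so the only difference between them is how evenly the codeword lengths are distributed.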

        The codewords of a Huffman code (the codes assigned to the individual symbols) form a prefix-free code: no codeword is the beginning of another codeword. The codewords can therefore be transmitted back to back without separator symbols, and as long as no transmission errors occur, the receiving end can still split the stream into codewords without ambiguity.
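The following sketch shows why the prefix property makes separators unnecessary. The codebook is a hypothetical prefix-free code for the five symbols u1..u5, and msg is an arbitrary test message; neither is taken from the original program. The decoder reads one bit at a time and emits a symbol as soon as the bits collected so far match a codeword.

```matlab
% Hypothetical prefix-free codebook for u1..u5 (an assumption for illustration).
codebook = {'00', '011', '11', '10', '010'};

% Encode: concatenate the codewords with no separator between them.
msg  = [1 3 2 5 4 1];
bits = [codebook{msg}];

% Decode: grow a buffer bit by bit; because no codeword is a prefix of
% another, the first match is always the correct codeword.
decoded = [];
buf = '';
for k = 1:numel(bits)
    buf = [buf bits(k)];
    idx = find(strcmp(buf, codebook), 1);
    if ~isempty(idx)
        decoded = [decoded idx];
        buf = '';
    end
end

disp(bits);
disp(decoded);
```

If a single bit of the stream is corrupted, the decoder may lose codeword alignment, which is why the error-free condition in the text matters.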
        In practical applications, periodic resets are used to limit error propagation and buffering is used to match rates, so the main remaining problem is statistical matching for sources with a small symbol set, such as a fax source with only black (1) and white (0) pixels. Coding such a source one symbol at a time gains nothing, because every codeword is at least one bit long. The symbol set is therefore enlarged by coding run lengths of 0s and 1s instead, where a run length is the number of consecutive identical symbols (for example, the length of a string of consecutive 0s or 1s in a binary sequence). Under the CCITT standard, 2×1728 different run lengths would have to be tabulated, which requires too much storage. Since long runs are very unlikely, CCITT also specifies that a run of length l is written as l=64q+r, where q is called the main code (make-up code) and r the base code (terminating code). A run shorter than 64 is coded with the base code alone. A run of length 64 or more is coded as a main code followed by a base code; when l is an exact multiple of 64, the base code is the one for r=0.
        Both the main codes and the base codes are assigned by the Huffman rule. The resulting scheme is called modified Huffman coding, and its codewords are available as lookup tables. This method is widely used in document facsimile machines.
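As a small illustration of the run-length decomposition l=64q+r, the sketch below extracts the runs of a made-up binary scan line and splits each run length into q and r. The scan line and the variable names (scanLine, runEnd, runLen, runVal) are assumptions for this example, and the sketch does not reproduce the CCITT code tables themselves.

```matlab
% Hypothetical binary scan line: 70 white (0), 3 black (1), 130 white (0) pixels.
scanLine = [zeros(1,70) ones(1,3) zeros(1,130)];

% Run-length extraction: a run ends wherever the next pixel differs.
runEnd = [find(diff(scanLine) ~= 0) numel(scanLine)];
runLen = diff([0 runEnd]);     % length l of each run
runVal = scanLine(runEnd);     % pixel value of each run

% Decomposition l = 64q + r used by the modified Huffman code.
q = floor(runLen / 64);
r = mod(runLen, 64);

for k = 1:numel(runLen)
    fprintf('value %d, l = %3d = 64*%d + %d\n', runVal(k), runLen(k), q(k), r(k));
end
```

With this split, only the values of q and the 64 possible values of r need table entries, instead of one entry per possible run length.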

3. MATLAB core program

% Append the Huffman codeword of every quantized sample to the output stream.
% Vbits is expected to hold quantization levels 0..31; the codeword for
% level k is stored at long_CBs(k+1).
for i = 1 : size(Vbits,2)
    v = Vbits(i);
    % Only integer levels 0..31 have an entry in long_CBs; anything else is skipped.
    if v >= 0 && v <= 31 && v == floor(v)
        s2 = long_CBs(v+1);
        Hencodes = strcat(Hencodes,s2);
    end
end
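To relate the loop above to the coded data size reported by the simulation, the sketch below estimates that size for a 32-level signal. It is a sketch under stated assumptions: the input x is a synthetic decaying tone standing in for the speech samples (the original audio and the way Vbits and long_CBs are produced are not shown in this post), and the names x, Vq, counts, len, codedBits and fixedBits are introduced here for illustration. Only the codeword lengths are computed, since they are all that is needed to obtain the size.

```matlab
% Synthetic stand-in for a speech frame (assumption: the real program reads audio).
t = (0:7999) / 8000;
x = 0.6*sin(2*pi*200*t) .* exp(-3*t) + 0.05*sin(2*pi*1300*t);

% Uniform quantization to the 32 levels 0..31 handled by the loop above.
Vq = round((x - min(x)) / (max(x) - min(x)) * 31);

% Histogram of the levels that actually occur.
counts = accumarray(Vq(:) + 1, 1, [32 1])';
used   = find(counts > 0);

% Huffman codeword lengths: repeatedly merge the two least frequent nodes;
% every symbol under a merged node becomes one bit longer.
len    = zeros(1, 32);
groups = num2cell(used);
w      = counts(used);
while numel(w) > 1
    [~, order] = sort(w, 'ascend');
    a = order(1);
    b = order(2);
    merged = [groups{a} groups{b}];
    len(merged) = len(merged) + 1;
    wsum   = w(a) + w(b);
    keep   = setdiff(1:numel(w), [a b]);
    groups = [groups(keep) {merged}];
    w      = [w(keep) wsum];
end

codedBits = sum(counts .* len);    % size after Huffman coding
fixedBits = 5 * numel(Vq);         % size with a fixed 5 bits per sample
fprintf('fixed: %d bits, Huffman: %d bits, ratio: %.3f\n', ...
        fixedBits, codedBits, codedBits / fixedBits);
```

The Huffman size can never exceed the fixed 5-bit size, and the gap grows as the level histogram becomes less uniform. The codebook itself must also be stored or transmitted, which this estimate leaves out.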

4. Complete algorithm code file



Origin blog.csdn.net/hlayumi1234567/article/details/130374918