1.1 What is Huffman coding
Huffman coding is a lossless data compression method that encodes source symbols (for example, the letters in a file) using a variable-length code table. The table is built from the estimated probability of occurrence of each source symbol: symbols that occur with high probability get short codes, while symbols that occur with low probability get longer ones. This lowers the expected (average) length of the encoded string, which is how the data is compressed without loss.
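To make the "expected length" idea concrete, here is a minimal sketch that computes the average number of bits per symbol for a variable-length code. The symbols, probabilities, and codewords below are hypothetical values chosen only for illustration; they do not come from the example later in this section.

```python
# Hypothetical probabilities and a matching prefix code (illustration only).
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = {"a": "0", "b": "10", "c": "110", "d": "111"}


def average_length(probs, code):
    """Expected bits per symbol: the sum of p(s) * len(code[s]) over all symbols."""
    return sum(p * len(code[s]) for s, p in probs.items())


avg = average_length(probs, code)
fixed = 2  # a fixed-length code for 4 symbols needs 2 bits per symbol
print(avg, fixed)
```

With these assumed values the variable-length code averages fewer bits per symbol than the fixed-length one, because the most probable symbol gets the shortest codeword.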
1.2 Coding steps
- Initialization: sort the symbols by probability, from largest to smallest;
- Merge the two symbols with the smallest probabilities into a new symbol whose probability is the sum of the two;
- Repeat the previous step until a single symbol with probability 1 remains; the merges form a binary tree whose root is this final symbol;
- Assign codes: starting from the root and moving down, label each left branch 0 and each right branch 1 (or, alternatively, left branch 1 and right branch 0).
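The steps above can be sketched in Python roughly as follows. This is one possible implementation, not the only one: it uses a priority queue (`heapq`) to find the two smallest probabilities instead of re-sorting the list after every merge, and it assumes that symbols are strings. The function name `huffman_code` and the sample probabilities are hypothetical.

```python
import heapq
from itertools import count


def huffman_code(probs):
    """Build a Huffman code from a dict {symbol: probability}.

    Assumes symbols are strings; internal tree nodes are (left, right) tuples.
    """
    tiebreak = count()  # unique counter so equal probabilities never compare trees
    heap = [(p, next(tiebreak), sym) for sym, p in probs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        # Degenerate case: a single symbol still needs one bit.
        return {heap[0][2]: "0"}
    # Repeatedly merge the two least probable nodes into a new node.
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)
        p2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tiebreak), (t1, t2)))

    codes = {}

    def walk(tree, prefix):
        # Left branch appends 0, right branch appends 1.
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes


# Hypothetical probabilities, for illustration only.
example = huffman_code({"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1})
print(example)
```

The exact bit patterns depend on how ties are broken and on which branch is labeled 0, so different correct implementations can produce different codewords with the same average length.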
1.3 Examples
The final codes:
Symbol | a1 | a2 | a3 | a4 | a5 | a6 | a7 | a8 |
---|---|---|---|---|---|---|---|---|
Codeword | 00 | 01 | 100 | 101 | 110 | 1110 | 11110 | 11111 |
Code length | 2 | 2 | 3 | 3 | 3 | 4 | 5 | 5 |
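As a quick check of the table above, the following sketch encodes and decodes a short symbol sequence with those codewords and verifies that no codeword is a prefix of another. The helper names (`encode`, `decode`, `is_prefix_free`) and the sample message are hypothetical; only the codewords come from the table.

```python
# Codewords copied from the table above.
CODE = {
    "a1": "00", "a2": "01", "a3": "100", "a4": "101",
    "a5": "110", "a6": "1110", "a7": "11110", "a8": "11111",
}


def encode(symbols, code=CODE):
    """Concatenate the codeword of each symbol."""
    return "".join(code[s] for s in symbols)


def decode(bits, code=CODE):
    """Read bits left to right, emitting a symbol whenever a codeword completes."""
    inverse = {word: sym for sym, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return out


def is_prefix_free(code):
    """True if no codeword is a proper prefix of another codeword."""
    words = list(code.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)


message = ["a1", "a6", "a3"]  # hypothetical sample message
bits = encode(message)
print(bits, decode(bits), is_prefix_free(CODE))
```

Because the code is prefix-free, the decoder never needs separators between codewords: the first codeword that matches is always the right one.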