Algorithm - Huffman coding

1. Introduction to the algorithm

Huffman coding is a lossless data compression algorithm proposed by David A. Huffman in 1952. It is based on the principle that characters with higher frequency are represented by shorter codes and characters with lower frequency by longer codes, which reduces the total number of bits needed to store the data.

The core idea of Huffman coding is to construct a Huffman tree. First, count how often each character occurs in the data to be compressed. Then build the tree from those frequencies by repeatedly merging the two lowest-frequency nodes into a new parent node whose frequency is their sum, until a single root remains. As a result, a more frequent character is never placed deeper in the tree than a less frequent one. Finally, generate each character's code from the path between the root and that character's leaf, appending 0 for each left branch and 1 for each right branch, so that more frequent characters receive shorter bit strings. Because every character sits at a leaf, no code is a prefix of another, and an encoded bit stream can be decoded unambiguously.
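To make the merging step concrete, here is a minimal sketch that computes only the size of the Huffman-encoded output, without building the tree. It relies on the fact that each merge adds one bit to the code of every character underneath the new parent, so the sum of all merged frequencies equals the total encoded length in bits. The names HuffmanCost and EncodedBitCount are hypothetical, and the sketch assumes .NET 6 or later for PriorityQueue<TElement, TPriority>.

```csharp
using System;
using System.Collections.Generic;

public static class HuffmanCost
{
    // Returns the number of bits a Huffman code needs to encode the text.
    public static int EncodedBitCount(string text)
    {
        // Step 1: count how often each character occurs.
        var counts = new Dictionary<char, int>();
        foreach (char c in text)
        {
            counts.TryGetValue(c, out int n);
            counts[c] = n + 1;
        }

        if (counts.Count == 0)
            return 0;
        if (counts.Count == 1)
            return text.Length; // assumption: a lone symbol gets a one-bit code

        // Step 2: min-heap of frequencies (the element doubles as its priority).
        var heap = new PriorityQueue<int, int>();
        foreach (int f in counts.Values)
            heap.Enqueue(f, f);

        // Step 3: merge the two smallest frequencies until one remains.
        int totalBits = 0;
        while (heap.Count > 1)
        {
            int merged = heap.Dequeue() + heap.Dequeue();
            totalBits += merged; // every symbol under this parent gains one bit
            heap.Enqueue(merged, merged);
        }

        return totalBits;
    }
}
```

For the string "Hello World!" (12 characters, 9 distinct), EncodedBitCount returns 37, compared with 48 bits for a fixed 4-bit code per character. The figure does not depend on how ties between equal frequencies are broken, although the individual codes do.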

2. Code implementation

The following is a simple C# example of implementing Huffman coding:

using System;
using System.Collections.Generic;

public class HuffmanNode
{
    public char Character { get; set; }
    public int Frequency { get; set; }
    public HuffmanNode Left { get; set; }
    public HuffmanNode Right { get; set; }
}

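public class HuffmanTree
{
    // Root of the finished tree; null when there are no characters at all.
    private readonly HuffmanNode root;

    // Note: PriorityQueue<TElement, TPriority> is built in from .NET 6 onward.
    public HuffmanTree(Dictionary<char, int> frequencies)
    {
        // Min-heap keyed on frequency: the least frequent node is dequeued first.
        var priorityQueue = new PriorityQueue<HuffmanNode, int>();

        foreach (var kvp in frequencies)
        {
            var node = new HuffmanNode
            {
                Character = kvp.Key,
                Frequency = kvp.Value
            };

            priorityQueue.Enqueue(node, node.Frequency);
        }

        // Repeatedly merge the two least frequent nodes until one root remains.
        while (priorityQueue.Count > 1)
        {
            var left = priorityQueue.Dequeue();
            var right = priorityQueue.Dequeue();

            var parent = new HuffmanNode
            {
                Frequency = left.Frequency + right.Frequency,
                Left = left,
                Right = right
            };

            priorityQueue.Enqueue(parent, parent.Frequency);
        }

        root = priorityQueue.Count > 0 ? priorityQueue.Dequeue() : null;
    }

    public Dictionary<char, string> GetCodeTable()
    {
        var codeTable = new Dictionary<char, string>();
        TraverseHuffmanTree(root, "", codeTable);
        return codeTable;
    }

    // Depth-first walk: "0" for a left branch, "1" for a right branch.
    private void TraverseHuffmanTree(HuffmanNode node, string code, Dictionary<char, string> codeTable)
    {
        if (node == null)
            return;

        if (node.Left == null && node.Right == null)
        {
            // Leaf: the path from the root is the code. If the input has only
            // one distinct character, the root is the leaf, so use "0" rather
            // than an empty code.
            codeTable[node.Character] = code.Length > 0 ? code : "0";
            return;
        }

        TraverseHuffmanTree(node.Left, code + "0", codeTable);
        TraverseHuffmanTree(node.Right, code + "1", codeTable);
    }
}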

public class Program
{
    public static void Main(string[] args)
    {
        string data = "Hello World!";
        var frequencies = CalculateFrequencies(data);

        var huffmanTree = new HuffmanTree(frequencies);
        var codeTable = huffmanTree.GetCodeTable();

        // Print the code assigned to each character.
        foreach (var kvp in codeTable)
        {
            Console.WriteLine("Character: {0}, Code: {1}", kvp.Key, kvp.Value);
        }
    }

    // Counts how many times each character occurs in the input.
    private static Dictionary<char, int> CalculateFrequencies(string data)
    {
        var frequencies = new Dictionary<char, int>();

        foreach (char c in data)
        {
            if (frequencies.ContainsKey(c))
                frequencies[c]++;
            else
                frequencies[c] = 1;
        }

        return frequencies;
    }
}
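
The program above stops at printing the code table. As a rough sketch of how such a table could then be used, the following helper encodes a string into a sequence of '0' and '1' characters and decodes it again. The class HuffmanCodec and its methods are hypothetical names, not part of the original example, and the bits are kept as a string for readability rather than packed into bytes. Decoding works by reading bits until they match a code, which is unambiguous because no code is a prefix of another.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class HuffmanCodec
{
    // Concatenates the code of every character in the text.
    public static string Encode(string text, Dictionary<char, string> codeTable)
    {
        var bits = new StringBuilder();
        foreach (char c in text)
            bits.Append(codeTable[c]); // throws if c has no code in the table
        return bits.ToString();
    }

    // Reads bits until they form a known code, emits that character, repeats.
    public static string Decode(string bits, Dictionary<char, string> codeTable)
    {
        // Invert the table: code -> character.
        var reverse = new Dictionary<string, char>();
        foreach (var kvp in codeTable)
            reverse[kvp.Value] = kvp.Key;

        var output = new StringBuilder();
        var current = new StringBuilder();
        foreach (char bit in bits)
        {
            current.Append(bit);
            if (reverse.TryGetValue(current.ToString(), out char symbol))
            {
                output.Append(symbol);
                current.Clear();
            }
        }

        if (current.Length > 0)
            throw new ArgumentException("Bit string ends in the middle of a code.");

        return output.ToString();
    }
}
```

With an illustrative prefix-free table that maps 'a' to "0", 'b' to "10" and 'c' to "11", Encode("abac", table) yields "010011", and Decode("010011", table) returns "abac". A table produced by GetCodeTable can be passed in the same way.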

3. Conclusion

Good luck!

Origin blog.csdn.net/sixpp/article/details/134986998