Huffman tree and its codec

Introduction:

  A Huffman tree assigns each character a weight based on how many times it appears in the input, and then derives a code for each character from those weights. The codes can be used to encode and decode strings, which makes the tree useful for compressing and decompressing data.

  So what are the advantages of a Huffman tree?

  1. Characters that appear often (that is, characters with a large weight) get shorter codes than characters that appear rarely. The more often a character appears, the shorter its code, which is what makes the data compress.
  2. No code is a prefix of another code, so decoding is never ambiguous. Suppose the code for a were 00100, the code for b were 001, and the code for c were 00. Then 00100 could be read as a, or as bc. Codes produced by a Huffman tree never run into this problem.
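
The second point can be checked mechanically. The following is a minimal sketch, not part of the program in this article: isPrefixFree is a hypothetical helper that reports whether any code in a set is a prefix of another.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true when no code in the set is a prefix of another code.
// isPrefixFree is a hypothetical helper, not part of the article's program.
bool isPrefixFree(const std::vector<std::string>& codes)
{
    for (std::size_t i = 0; i < codes.size(); i++)
    {
        for (std::size_t j = 0; j < codes.size(); j++)
        {
            if (i == j)
                continue;
            // codes[i] is a prefix of codes[j] if codes[j] starts with it
            if (codes[j].compare(0, codes[i].size(), codes[i]) == 0)
                return false;
        }
    }
    return true;
}
```

Called with the ambiguous set {"00100", "001", "00"} it returns false; called with {"1", "00", "01"}, the codes built later in this article for a, b, c, it returns true.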

How to implement it

  A Huffman codec needs three data structures: a priority queue, which holds the tree's nodes while the tree is being built; a tree, which is used for decoding; and a table, which serves as the code table for encoding. Let's go through the three one by one:

1. Priority queue

  The priority queue stores tree nodes, and each node's priority is the weight of the character it holds. The smaller the weight, the closer to the front the node sits, so the first node in the queue always has the smallest priority, that is, the smallest weight.
[Figure: priority queue]

Data type
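// Headers used by the code in this article
#include <iostream>
#include <cstring>      // strcpy, strlen
#include <cstdlib>      // exit
using namespace std;

// Priority queue; struct TNode is a tree node, introduced below
typedef struct QNode
{
    struct TNode* val;          // the tree node, i.e. the data field
    int priority;               // priority
    struct QNode* next;         // pointer to the next queue node
}*Node;

typedef struct Queue
{
    int size;           // number of nodes in the queue
    struct QNode* front;        // head pointer of the queue
}queue;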
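
To see the ordering in isolation, here is a small sketch that swaps the hand-written linked list for the standard library's std::priority_queue. It is only an illustration under that substitution: popOrder is a hypothetical helper, and ties are broken by character value, which may differ from the linked-list queue in this article.

```cpp
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Returns the symbols in the order a min-priority queue would release them.
// popOrder is a hypothetical helper; ties are broken by character value here.
std::string popOrder(const std::vector<std::pair<char, int>>& items)
{
    typedef std::pair<int, char> Entry;   // (weight, symbol)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;
    for (const auto& item : items)
        pq.push(Entry(item.second, item.first));

    std::string order;
    while (!pq.empty())
    {
        order += pq.top().second;         // smallest weight comes out first
        pq.pop();
    }
    return order;
}
```

For a, b, c with weights 6, 1, 2 the result is "bca": b has the smallest weight and leaves the queue first.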

2. Tree

  Each tree node stores a character plus pointers to its left and right children. Take the figure below as an example. In the textbook's figure the tree nodes also store each character's priority, but that is not actually needed and I found it more complicated, so I left it out of the node. To make things easier to follow, though, I marked the priorities on the figure.
[Figure: example Huffman tree with each node's priority marked]

Data type
// Tree
typedef struct TNode
{
    char data;              // the character
    struct TNode* left;         // left child
    struct TNode* right;                // right child
}*Tree;

3. Table

  The table is the code table: it stores each character together with its code, and is looked up during encoding.
[Figure: the code table]

Data type
// Table
typedef struct BNode
{
    char code[256];             // the code
    char symbol;                // the character
    struct BNode* next;         // pointer to the next entry
}*bNode;

typedef struct Table
{
    struct BNode* first;                // head of the table
    struct BNode* last;             // tail of the table
}*table;

Approach

  For simplicity, the weights here are entered by the user rather than computed from occurrence frequencies, which also happens to be what our assignment asks for. I have also posted code that derives the weights from occurrence frequencies; take a look if you are interested. Because so many data types are involved, it is easy to get dizzy halfway through writing the code, so let's sort out the ideas before we begin:

Suppose we have three characters a, b, c, with weights 6, 1, 2.

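  1. First, create one tree node for each character from the weights the user entered, and store the nodes in the priority queue in ascending order of priority. The concrete implementation is shown later.

  2. Build a tree from the tree nodes stored in the priority queue.

  Dequeue the first two nodes, then create a new tree node whose weight is the sum of the weights of the two dequeued nodes. The new node has no character, so it is not a real tree node; we call it a fake tree node, and we call the nodes that do hold a character real tree nodes.
  Make the two dequeued nodes the left and right children of the new fake tree node: the one with the smaller priority (the one dequeued first) becomes the left child, and the other becomes the right child.

[Figure]
After dequeuing
[Figure]
b and c are real tree nodes; the node at the top, with weight 3, is a fake tree node
[Figure]

  Finally, put the newly created fake tree node back into the queue and repeat until only one node is left in the queue. That node is a fake tree node, and it becomes the root of the Huffman tree.

After the new fake tree node is enqueued
[Figure]
In the end it looks like this
Only one fake tree node is left in the queue, and it is the root of the Huffman tree being built
[Figure]
[Figure]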
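
The merge loop can also be followed on bare weights. The sketch below is an illustration only: mergeCost is a hypothetical helper that uses std::priority_queue in place of the article's queue and adds up the weight of every fake node it creates. When the weights are occurrence counts, that sum equals the total length of the encoded text, because each character's weight is counted once for every level it sits below the root.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Repeats the "dequeue two, enqueue their sum" step on bare weights and
// returns the sum of all merged weights. mergeCost is a hypothetical helper.
int mergeCost(const std::vector<int>& weights)
{
    std::priority_queue<int, std::vector<int>, std::greater<int>> pq(
        weights.begin(), weights.end());
    int total = 0;
    while (pq.size() > 1)
    {
        int first = pq.top();
        pq.pop();
        int second = pq.top();
        pq.pop();
        total += first + second;     // weight of the new fake node
        pq.push(first + second);
    }
    return total;
}
```

For the weights 6, 1, 2 the fake nodes have weights 3 and 9, so the result is 12, which matches 6×1 + 1×2 + 2×2 for code lengths 1, 2, 2.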

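  3. Traverse the whole tree to build a code table. Looking at the tree, the real tree nodes, which are the only ones that carry meaning, are all leaves, so during the traversal we only need to store the code and character of every leaf in the table.
  The rule for building the table is: each step down to a left child appends 0 to the code, and each step down to a right child appends 1. In the figure shown earlier when the tree was introduced, b is 00, c is 01, and a is 1.

Here is the resulting code table
[Figure: the resulting code table]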
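
The three steps above can be condensed into one short sketch. It is not the implementation in this article: it assumes std::priority_queue and std::map instead of the hand-written queue and table, stores nodes in a vector and refers to them by index, and the names SketchNode, collectCodes and buildCodes are hypothetical. Ties between equal weights are broken by node index, which may differ from the linked-list queue used later.

```cpp
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct SketchNode          // hypothetical node type for this sketch only
{
    char data;
    int left;              // index of the left child, -1 for a leaf
    int right;             // index of the right child, -1 for a leaf
};

// Walks the tree; appends '0' going left and '1' going right.
void collectCodes(const std::vector<SketchNode>& nodes, int at,
                  const std::string& prefix, std::map<char, std::string>& out)
{
    if (nodes[at].left < 0 && nodes[at].right < 0)
    {
        out[nodes[at].data] = prefix;    // leaf: record the finished code
        return;
    }
    collectCodes(nodes, nodes[at].left, prefix + '0', out);
    collectCodes(nodes, nodes[at].right, prefix + '1', out);
}

// Builds the code table for the given (symbol, weight) pairs.
std::map<char, std::string> buildCodes(const std::vector<std::pair<char, int>>& items)
{
    std::vector<SketchNode> nodes;
    typedef std::pair<int, int> Entry;   // (weight, node index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;
    for (const auto& item : items)
    {
        nodes.push_back(SketchNode{item.first, -1, -1});
        pq.push(Entry(item.second, (int)nodes.size() - 1));
    }
    std::map<char, std::string> codes;
    if (pq.empty())
        return codes;
    while (pq.size() > 1)
    {
        Entry a = pq.top();              // smallest priority: left child
        pq.pop();
        Entry b = pq.top();              // next smallest: right child
        pq.pop();
        nodes.push_back(SketchNode{'\0', a.second, b.second});   // fake node
        pq.push(Entry(a.first + b.first, (int)nodes.size() - 1));
    }
    collectCodes(nodes, pq.top().second, "", codes);
    return codes;
}
```

With a, b, c weighted 6, 1, 2 this gives a = 1, b = 00, c = 01, the same codes as in the example above.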

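Implementation: building the Huffman tree and creating the code table

  With the approach clear, let's look at the implementation, starting with the operations for creating the queue:

  For convenience I use some C++ syntax, so memory is allocated with new and freed with delete, which play the same role as malloc and free in C. Apart from that, and a few things such as cin/cout and reference parameters, the code reads like C.

 Initializing the queue: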

queue Init_queue()
{
    queue q;
    q.size = 0;
    // An empty queue has no nodes. EnQueue installs the first node itself,
    // so allocating a head node here would only leak it.
    q.front = NULL;
    return q;
}

 Inserting into the queue:

// Insert a tree node, keeping the queue in ascending order of priority
bool EnQueue(queue& q, Tree avl, int weight)
{
    Node newp = new struct QNode;
    newp->val = avl;
    newp->priority = weight;
    if (q.size == 0 || q.front == NULL)         // empty queue
    {
        newp->next = NULL;
        q.front = newp;
        q.size = 1;
        return true;
    }
    else        // non-empty queue: find the insertion point
    {
        if (weight <= q.front->priority)    // not larger than the first node
        {
            newp->next = q.front;
            q.front = newp;
            q.size++;
            return true;
        }
        else    // somewhere after the first node
        {
            Node beforp = q.front;
            while (beforp->next != NULL)
            {
                if (weight <= beforp->next->priority)
                {
                    newp->next = beforp->next;
                    beforp->next = newp;
                    q.size++;
                    return true;
                }
                else
                {
                    beforp = beforp->next;
                }
            }
            // larger than every node: append at the end of the queue
            newp->next = NULL;
            beforp->next = newp;
            q.size++;
            return true;
        }
    }
}

Creating a queue:
The user enters each character and its priority

// Create the queue
queue Create_Queue()
{
    queue q = Init_queue();
    while (1)
    {
        char symbol;
        int weight;
        // Read symbol and weight; stop on bad input or end of input
        if (!(cin >> symbol >> weight))
            break;
        if (weight == 0)  // a weight of 0 (for example "# 0") ends the input
            break;
        Tree t = new struct TNode;
        t->data = symbol;
        t->left = NULL;
        t->right = NULL;
        EnQueue(q, t, weight);
    }
    return q;
}
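
Create_Queue takes the weights from the user. If the weights were instead derived from a piece of text, the counting step could look like the sketch below. This is an assumption about one way to do it, not the author's frequency-based code, and countWeights is a hypothetical helper.

```cpp
#include <map>
#include <string>

// Counts how many times each character occurs in text.
// countWeights is a hypothetical helper, not the author's frequency code.
std::map<char, int> countWeights(const std::string& text)
{
    std::map<char, int> weights;
    for (char ch : text)
        weights[ch]++;          // a missing key starts at 0
    return weights;
}
```

For the text "aaaaaabcc" this yields a = 6, b = 1, c = 2, the same weights as the running example. Each pair could then be handed to EnQueue in place of the user's input.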

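Popping the node with the smallest priority from the queue:

// Pop the node with the smallest priority
Tree Dequeue(queue& q)
{
    if (q.front == NULL)
    {
        cout << "Queue is empty!" << endl;
        exit(1);
    }
    Node p = q.front;
    q.front = p->next;
    Tree e = p->val;
    q.size--;
    delete p;       // allocated with new, so release with delete, not delete[]
    return e;
}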

Tree function: building a tree from the priority queue:

// Tree functions
// Build a tree
Tree Create_Tree(queue& q)
{
    while (q.size > 1)
    {
        int priority = q.front->priority + q.front->next->priority;
        Tree left = Dequeue(q);
        Tree right = Dequeue(q);

        Tree newTNode = new struct TNode;
        newTNode->data = '\0';      // fake tree node: holds no character
        newTNode->left = left;
        newTNode->right = right;

        EnQueue(q, newTNode, priority);
    }
    // The last node left in the queue is the root
    // (Dequeue exits with an error if the queue is empty)
    Tree root = Dequeue(q);
    return root;
}

Table function: creating a table from a tree:

// travel is defined below; declare it here so Create_Table can call it
void travel(Tree root, table& t, char code[256], int k);

// Create a table
table Create_Table(Tree root)
{
    table t = new struct Table;
    t->first = NULL;
    t->last = NULL;
    char code[256];
    int k = 0;
    travel(root, t, code, k);
    return t;
}

The travel function used to build the table:
travel traverses the tree to fill in the table, appending each entry at the tail

void travel(Tree root, table& t, char code[256], int k)
{
    if (root->left == NULL && root->right == NULL)
    {
        code[k] = '\0';

        bNode b = new struct BNode;
        b->symbol = root->data;
        strcpy(b->code, code);
        b->next = NULL;

        // insert at the tail
        if (t->first == NULL)       // empty table
        {
            t->first = b;
            t->last = b;
        }
        else
        {
            t->last->next = b;
            t->last = b;
        }
    }
    if (root->left != NULL)
    {
        code[k] = '0';
        travel(root->left, t, code, k + 1);
    }
    if (root->right != NULL)
    {
        code[k] = '1';
        travel(root->right, t, code, k + 1);
    }
}

Encoding and decoding

  With that, the Huffman tree and the code table are both built. Now let's implement encoding and decoding for a first test of the Huffman tree.

Encoding:
the code table must be passed in for encoding

void EnCode(table t, char* str)
{
    cout << "Encoding............./" << endl;
    int len = strlen(str);
    for (int i = 0; i < len; i++)
    {
        bNode p = t->first;
        while (p != NULL)
        {
            if (p->symbol == str[i])
            {
                cout << p->code;
                break;
            }
            p = p->next;
        }
    }
    cout << endl;
}
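
EnCode walks the linked list once per character and silently skips any character that has no entry in the table. As a point of comparison, the sketch below assumes the table is held in a std::map and reports a missing character instead. encodeWithMap is a hypothetical helper, not part of the program above.

```cpp
#include <map>
#include <string>

// Encodes text with the given table; the result is written to out.
// Returns false if a character has no code in the table.
// encodeWithMap is a hypothetical helper.
bool encodeWithMap(const std::map<char, std::string>& table,
                   const std::string& text, std::string& out)
{
    out.clear();
    for (char ch : text)
    {
        std::map<char, std::string>::const_iterator it = table.find(ch);
        if (it == table.end())
            return false;       // EnCode in the article skips such characters
        out += it->second;
    }
    return true;
}
```

With the table a = 1, b = 00, c = 01, the text "abc" encodes to "10001".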

Decoding:
the Huffman tree must be passed in for decoding

void DeCode(Tree root, char* str)
{
    cout << "DeCode............./" << endl;
    Tree p = root;
    int len = strlen(str);
    for (int i = 0; i < len; i++)
    {
        // Validate the bit before following a pointer
        if (str[i] != '0' && str[i] != '1')
        {
            cout << "The Input String Is Not Encoded correctly !" << endl;
            return;
        }
        // '0' moves to the left child, '1' to the right child
        p = (str[i] == '0') ? p->left : p->right;
        if (p == NULL)      // only possible when the tree is a single leaf
        {
            cout << "The Input String Is Not Encoded correctly !" << endl;
            return;
        }
        // Reaching a leaf completes one character; restart from the root
        if (p->left == NULL && p->right == NULL)
        {
            cout << p->data;
            p = root;
        }
    }
    cout << endl;
}
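
The walk from the root can be tried without any input by wiring up the example tree by hand. The sketch below is an illustration under that assumption: DemoNode stands in for TNode, and decodeBits and decodeExample are hypothetical helpers. Unlike DeCode, it returns the text instead of printing it, and it also reports input that stops in the middle of a code.

```cpp
#include <string>

struct DemoNode                      // hypothetical stand-in for TNode
{
    char data;
    const DemoNode* left;
    const DemoNode* right;
};

// Decodes bits by walking from the root. Returns false on an invalid
// character or if the input ends in the middle of a code.
bool decodeBits(const DemoNode* root, const std::string& bits, std::string& out)
{
    out.clear();
    if (root == nullptr)
        return false;
    const DemoNode* p = root;
    for (char bit : bits)
    {
        if (bit != '0' && bit != '1')
            return false;
        p = (bit == '0') ? p->left : p->right;
        if (p == nullptr)
            return false;
        if (p->left == nullptr && p->right == nullptr)
        {
            out += p->data;          // reached a leaf: one character decoded
            p = root;
        }
    }
    return p == root;                // false if bits stop partway down the tree
}

// Decodes against the article's example tree (a = 1, b = 00, c = 01).
bool decodeExample(const std::string& bits, std::string& out)
{
    const DemoNode b = {'b', nullptr, nullptr};
    const DemoNode c = {'c', nullptr, nullptr};
    const DemoNode a = {'a', nullptr, nullptr};
    const DemoNode inner = {'\0', &b, &c};    // fake node of weight 3
    const DemoNode root = {'\0', &inner, &a}; // fake node of weight 9
    return decodeBits(&root, bits, out);
}
```

decodeExample("10001", out) succeeds and leaves "abc" in out, while "10" fails because the trailing 0 stops at the fake node.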

Test

int main()
{
    queue q = Create_Queue();
    Tree root = Create_Tree(q);
    table t = Create_Table(root);
    char str[256];
    cout << "Enter the characters to encode:" << endl;
    cin >> str;
    EnCode(t, str);
    cout << "Enter the code to decode:" << endl;
    char str1[256];
    cin >> str1;
    DeCode(root, str1);
}

Screenshot of a run:
[Figure: screenshot of a test run]
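
The program above never releases the nodes it allocates with new. If that matters, the tree could be freed before main returns with something like the sketch below. Free_Tree is a hypothetical helper that the program above does not define; the code table would need a similar loop over its linked list.

```cpp
struct TNode                         // same layout as the article's TNode
{
    char data;
    struct TNode* left;
    struct TNode* right;
};

// Frees every node in post-order and returns how many nodes were released.
// Free_Tree is a hypothetical helper; the article's program does not define it.
int Free_Tree(TNode* root)
{
    if (root == nullptr)
        return 0;
    // Release both subtrees first, then the node itself
    int freed = Free_Tree(root->left) + Free_Tree(root->right);
    delete root;
    return freed + 1;
}
```

For the three-character example the tree has five nodes (three real, two fake), so Free_Tree(root) returns 5.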

Origin www.cnblogs.com/vfdxvffd/p/11622261.html