Autumn recruitment is in progress~ Let's review the basics of data structures together (the article is a bit long~ and will be updated continuously)

Table of contents

1 Data structure theory

1.1 Data

1.2 Data structure concepts

1.3 Concept of algorithm

1.3.1 The difference between algorithms and data structures

1.3.2 Comparison of algorithms

1.3.3 Characteristics of the algorithm

1.4 Data structure classification

1.4.1 Logical structure

1.4.2 Physical structure

2 Linear table

2.1 Basic concepts of linear tables

2.2 Sequential storage of linear tables

2.2.1 Design and implementation of linear table sequential storage (dynamic array)

2.2.2 Advantages and Disadvantages

2.3 Linked storage of linear lists (one-way linked lists)

2.3.1 Design and implementation of linked storage of linear lists (single-linked lists)

2.3.2 Advantages and Disadvantages

3 Stack and Queue

3.1 Stack

3.1.1 Basic concepts, features, and operations

 3.1.2 Sequential storage of stack

3.1.3 Stack chain storage

3.1.4 Stack case (application)

3.2 Queue

3.2.1 Basic concepts

 3.2.2 Queue sequential storage

3.2.3 Queue chain storage

3.2.4 Queue implementation in Go

4 Trees and Binary Trees

4.1 Basic concepts of trees

4.2 Tree representation

 4.2.1 Graphical representation

4.2.2 Generalized table representation

4.2.3 Representation of left child and right brother

 4.3 Binary tree concept

4.3.1 Basic concepts

4.3.2 Representation of binary tree

 4.3.3 Binary tree traversal

 4.3.4 Non-recursive traversal of binary trees

4.3.5 Level traversal of binary trees

4.3.6 Application of binary trees: Huffman tree and Huffman coding


1 Data structure theory

1.1 Data

Data: Data are symbols that describe objective things. They are the objects a computer can manipulate, that is, collections of symbols that a computer can recognize, accept as input, and process. Data include not only numeric types such as integers and reals, but also non-numeric types such as characters, sound, images, and video.

1.2 Data structure concepts

Data structures are the way computers store and organize data. A data structure refers to a collection of data elements that have one or more specific relationships with each other. Often, carefully selected data structures can lead to higher operating or storage efficiency. Data structures are often related to efficient retrieval algorithms and indexing techniques.

1.3 Concept of algorithm

An algorithm is a description of the steps that solve a specific problem. In a computer it is represented as a finite sequence of instructions. An algorithm is an independent method and idea for solving a problem.

For algorithms, the language is not important; what matters is the idea.

1.3.1 The difference between algorithms and data structures

The data structure only statically describes the relationship between data elements. Efficient programs need to design and select algorithms based on the data structure.

  • Algorithms are designed to solve real problems.
  • Data structures are the problem carriers that algorithms need to deal with.
  • Data structures and algorithms complement each other.

1.3.2 Comparison of algorithms

Now we need to write a program to find the result of 1 + 2 + 3 + ... + 100. How should you write it?

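Most people would immediately write the following code:

	// Sum the integers 1..n with a loop: n additions.
	// (Snippet: goes inside func main and needs import "fmt".)
	sum := 0
	n := 100
	for i := 1; i <= n; i++ {
		sum += i
	}
	fmt.Printf("%d", sum)

Of course, if Gauss were asked to solve this problem, he might write the following code instead:

	// Closed form n(n+1)/2: a constant number of operations, whatever n is.
	n := 100
	sum := (1 + n) * n / 2
	fmt.Printf("%d", sum)

Clearly, from both a human and a computer perspective, the second algorithm is far more efficient: the loop performs n additions, while the formula needs one addition, one multiplication, and one division regardless of n. A good algorithm makes your program more efficient.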

1.3.3 Characteristics of the algorithm

Algorithms have five basic characteristics: input, output, finiteness, determinism, and feasibility.

  • Input and output: An algorithm has zero or more inputs and at least one output.
  • Finiteness: The algorithm ends by itself after executing a finite number of steps, without infinite loops, and each step is completed within an acceptable time.
  • Determinism: Each step of the algorithm has a definite meaning, with no ambiguity.
  • Feasibility: Each step of the algorithm must be feasible, that is, each step can be completed by executing a finite number of operations.

1.4 Data structure classification

According to different viewpoints, we divide data structures into logical structures and physical structures.

1.4.1 Logical structure

  • Set structure: The data elements in a set structure have no relationship other than belonging to the same set. All data elements are equal. The set relationship in data structures is similar to a set in mathematics, as shown in the figure below.

  • Linear structure: There is a one-to-one relationship between data elements in a linear structure, as shown in the figure.

  • Tree structure: In a tree structure, there is a one-to-many hierarchical relationship between data elements, as shown in the figure.

  • Graph structure: The data elements of a graph structure have many-to-many relationships, as shown in the figure.

1.4.2 Physical structure

  • Sequential storage: Data elements are stored in storage units with consecutive addresses. The logical relationship and the physical relationship of the data are consistent, as shown in the figure.

If every data structure were simple and regular, everything would be easy to handle. In practice, though, there are always people who want to cut into the line or drop out of it, so members are constantly added to and removed from the collection of elements. A sequential storage structure copes poorly with such constant change, so what should we do?

  • Chained storage structure: Data elements are stored in arbitrary storage units, which may or may not be contiguous. Because the storage locations do not reflect the logical relationship between elements, a pointer is used to store the address of the related data element, so that the related data can be found through that address. As shown in the picture.

2 Linear table

2.1 Basic concepts of linear tables

The linear structure is one of the simplest and most commonly used data structures. Its basic characteristic is that the nodes satisfy a linear relationship. The dynamic arrays, linked lists, stacks, and queues discussed in this chapter are all linear structures. What they have in common is that there is exactly one start node and one terminal node, so all of their nodes can be arranged into a linear sequence. However, they are implementations of several different abstract data types, and the difference between them lies mainly in the operations they allow.

A linear table is a finite sequence of zero or more data elements. The data elements are ordered, their number is finite, and they must all have the same type.

Example: Let's first look at a topic that everyone is interested in. Is the list of zodiac signs in a year a linear list? It is: there is a first sign and a last sign, every sign in between has exactly one predecessor and one successor, and the number of signs is finite. As the picture shows:

Properties of linear tables:

  • a0 is the first element of the linear list and has only a successor (no predecessor).
  • an is the last element of the linear list and has only a predecessor (no successor).
  • Every other element ai, apart from a0 and an, has both a predecessor and a successor.
  • A linear table can be accessed item by item, in order.

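Abstract data type definition of a linear table:

    ADT LinearList (List)
    Data
    The data objects of a linear table form the set { a1, a2, ..., an }, and every element has type DataType. Every element except the first, a1, has exactly one direct predecessor, and every element except the last, an, has exactly one direct successor. The relationship between data elements is one-to-one.

    Operation
    // Initialize: create an empty linear table L.
    InitList(*L);
    // Return true if the linear table is empty, otherwise false.
    ListEmpty(L);
    // Clear the linear table.
    ClearList(*L);
    // Return the element at position i of linear table L through e.
    GetElem(L, i, *e);
    // Insert the new element e at position i of linear table L.
    ListInsert(*L, i, e);
    // Delete the element at position i of linear table L and return its value through e.
    ListDelete(*L, i, *e);
    // Return the number of elements in linear table L.
    ListLength(L);
    // Destroy the linear table.
    DestroyList(*L);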

2.2 Sequential storage of linear tables

Linear tables can usually be stored either sequentially or in linked form. Here we mainly discuss the implementation of the sequential storage structure and the corresponding operation algorithms.

Sequential storage is the simplest way to represent a linear table: the elements of the table are stored one after another in a contiguous storage area. A linear table represented this way is also called a sequential table.

2.2.1 Design and implementation of linear table sequential storage (dynamic array)

Operation points:

  • Insert element algorithm
    1. Check that the linear table is valid
    2. Check that the insertion position is valid
    3. Check that there is enough space (grow the storage if not)
    4. Starting from the last element, move every element at or after the insertion position back by one position
    5. Insert the new element
    6. Add 1 to the length of the linear table

  • Get element operation
    1. Check that the linear table is valid
    2. Check that the position is valid
    3. Get the element directly through its array subscript

  • Delete element algorithm
    1. Check that the linear table is valid
    2. Check that the deletion position is valid
    3. Take out the element to be removed
    4. Move each element after the deleted position forward by one position
    5. Decrease the length of the linear table by 1

Note: The capacity of the linear table and the length of the linear table are two different concepts. The capacity is how many elements the allocated storage can hold; the length is how many elements it currently holds.
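The three step lists above can be sketched in Go roughly as follows. This is only a minimal sketch under my own assumptions: the names DynamicArray, NewDynamicArray, Insert, Get, and Delete are made up for illustration, the elements are plain ints, positions are 0-based, and a full table simply doubles its capacity.

```go
package main

import (
	"errors"
	"fmt"
)

// DynamicArray is a hypothetical sequential linear table of ints.
type DynamicArray struct {
	data   []int // backing storage; len(data) is the capacity
	length int   // number of elements actually stored
}

// NewDynamicArray allocates a table with the given initial capacity.
func NewDynamicArray(capacity int) *DynamicArray {
	if capacity < 1 {
		capacity = 1
	}
	return &DynamicArray{data: make([]int, capacity)}
}

func (a *DynamicArray) Len() int { return a.length }
func (a *DynamicArray) Cap() int { return len(a.data) }

// Insert puts v at position pos, shifting later elements back by one.
func (a *DynamicArray) Insert(pos, v int) error {
	if a == nil { // step 1: is the table valid?
		return errors.New("nil table")
	}
	if pos < 0 || pos > a.length { // step 2: is the position valid?
		return errors.New("insert position out of range")
	}
	if a.length == len(a.data) { // step 3: no room, so double the capacity
		bigger := make([]int, 2*len(a.data))
		copy(bigger, a.data)
		a.data = bigger
	}
	for i := a.length; i > pos; i-- { // step 4: shift, starting from the last element
		a.data[i] = a.data[i-1]
	}
	a.data[pos] = v // step 5
	a.length++      // step 6
	return nil
}

// Get returns the element at pos through a direct array subscript.
func (a *DynamicArray) Get(pos int) (int, error) {
	if a == nil || pos < 0 || pos >= a.length {
		return 0, errors.New("position out of range")
	}
	return a.data[pos], nil
}

// Delete removes and returns the element at pos, shifting later elements forward.
func (a *DynamicArray) Delete(pos int) (int, error) {
	if a == nil || pos < 0 || pos >= a.length {
		return 0, errors.New("delete position out of range")
	}
	removed := a.data[pos]
	for i := pos; i < a.length-1; i++ {
		a.data[i] = a.data[i+1]
	}
	a.length--
	return removed, nil
}

func main() {
	arr := NewDynamicArray(2)
	for i := 0; i < 5; i++ {
		arr.Insert(i, i*10) // 0 10 20 30 40
	}
	arr.Insert(1, 99) // 0 99 10 20 30 40
	removed, _ := arr.Delete(0)
	first, _ := arr.Get(0)
	fmt.Println(removed, first, arr.Len(), arr.Cap()) // 0 99 5 8
}
```

The last line of output shows the point of the note: after these operations the length is 5 while the capacity is 8.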

2.2.2 Advantages and Disadvantages

  • Advantages:
  1. No extra space is needed to represent the logical relationships in the linear table.
  2. The element at any valid position in the table can be obtained quickly.
  • Disadvantages:
  1. Insertion and deletion operations require moving a large number of elements.

2.3 Linked storage of linear lists (one-way linked lists)

The biggest disadvantage of the sequential storage (dynamic array) version of the linear table we wrote earlier is that a large number of elements must be moved during insertion and deletion, which obviously takes time. Can we find a way to solve this? Yes: the linked list.

To represent the logical relationship between each data element and its direct successor, each element in a linked list stores, in addition to its own information, information that indicates its direct successor.

Singly linked list:

  • In the linked storage structure of a linear list, if each node contains only one pointer field, the linked list is called a singly linked list.
  • The data elements of the linear table are linked together in their logical order through the pointer field of each node (as shown in the figure).

Concept explanation:

Header node: The first node in the linked list. It contains a pointer to the first data element and some information about the linked list itself.

Data node: A node that represents a data element in the linked list. It contains the information of the data element and a pointer to the next data element.

Tail node: The last data node in the linked list. Its next-element pointer is empty, indicating that it has no successor.

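2.3.1 Design and implementation of linked storage of linear lists (singly linked lists)

  • Insert operation (node is the new node, current is the node just before the insertion position). The order matters: the new node must take over current's successor before current is redirected.

    node->next = current->next;

    current->next = node;

  • Delete operation (ret = current->next is the node being deleted):

    current->next = ret->next;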
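The same two pointer operations, written as a small Go sketch. The type and method names (Node, LinkedList, Insert, Delete, Values) are my own assumptions for illustration; the list keeps a header node as described above and uses 0-based positions.

```go
package main

import "fmt"

// Node is a data node: a value plus a pointer to its successor.
type Node struct {
	Value int
	Next  *Node
}

// LinkedList is a hypothetical singly linked list with a header node.
type LinkedList struct {
	head *Node // header node; head.Next is the first data node
	size int
}

func NewLinkedList() *LinkedList {
	return &LinkedList{head: &Node{}}
}

// Insert links a new node holding v in at position pos.
func (l *LinkedList) Insert(pos, v int) bool {
	if pos < 0 || pos > l.size {
		return false
	}
	current := l.head
	for i := 0; i < pos; i++ { // walk to the node just before pos
		current = current.Next
	}
	node := &Node{Value: v}
	node.Next = current.Next // node->next = current->next
	current.Next = node      // current->next = node
	l.size++
	return true
}

// Delete unlinks the node at position pos and returns its value.
func (l *LinkedList) Delete(pos int) (int, bool) {
	if pos < 0 || pos >= l.size {
		return 0, false
	}
	current := l.head
	for i := 0; i < pos; i++ {
		current = current.Next
	}
	ret := current.Next
	current.Next = ret.Next // current->next = ret->next
	l.size--
	return ret.Value, true
}

// Values collects the data nodes in order.
func (l *LinkedList) Values() []int {
	var out []int
	for n := l.head.Next; n != nil; n = n.Next {
		out = append(out, n.Value)
	}
	return out
}

func main() {
	l := NewLinkedList()
	l.Insert(0, 1)
	l.Insert(1, 3)
	l.Insert(1, 2) // list is now 1 2 3
	v, _ := l.Delete(1)
	fmt.Println(v, l.Values()) // 2 [1 3]
}
```

Note that neither operation moves any existing element; only the pointers around the affected position change.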

2.3.2 Advantages and Disadvantages

  • Advantages:
  1. The capacity of the linked list does not have to be fixed in advance
  2. Insertion and deletion operations do not require moving data elements
  • Disadvantages:
  1. Each data element must also hold the position information of its successor
  2. Getting the element at a given position requires sequential access through the elements before it

3 Stack and Queue

3.1 Stack

3.1.1 Basic concepts, features, and operations

Concept: A stack is first of all a linear list, which means that the stack elements have a linear relationship, that is, predecessor and successor relationships. It is, however, a special linear list: by definition, insertion and deletion are performed only at one end of the table. That end is called the top of the stack; the other end is the bottom of the stack.

Features: Last in, first out. What makes the stack special is that it restricts where insertion and deletion may take place in the linear list: always at the top of the stack. This also means that the bottom of the stack is fixed, and the first element pushed can only sit at the bottom of the stack.

Operations:

  • Inserting into the stack is called pushing (push). It is like loading a bullet into a magazine (as shown in the picture below).
  • Deleting from the stack is called popping (pop). It is like a bullet being ejected from the magazine (as shown in the picture below).

3.1.2 Sequential storage of stack

  • Basic concept

The sequential storage structure of the stack is called the sequential stack for short. It is a sequential table with restricted operations. A sequential stack uses a group of storage units with consecutive addresses to store the data elements in order from the bottom of the stack to the top. An accompanying pointer, top, records the position of the top element of the stack in the sequential table.

  • Design and implementation

Because the stack is a special linear list, the sequential storage of the stack can be implemented on top of a sequential linear list.
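As a rough illustration of the description above, here is a minimal Go sketch of a sequential stack. The names SeqStack, Push, Pop, and Peek are assumptions for this example, the capacity is fixed, and top is an index into the underlying array with -1 meaning "empty".

```go
package main

import "fmt"

// SeqStack is a hypothetical fixed-capacity sequential stack of ints.
type SeqStack struct {
	data []int // data[0] is the bottom of the stack
	top  int   // index of the top element; -1 when the stack is empty
}

func NewSeqStack(capacity int) *SeqStack {
	return &SeqStack{data: make([]int, capacity), top: -1}
}

// Push stores v above the current top; it fails when the stack is full.
func (s *SeqStack) Push(v int) bool {
	if s.top == len(s.data)-1 {
		return false
	}
	s.top++
	s.data[s.top] = v
	return true
}

// Pop removes and returns the top element; it fails when the stack is empty.
func (s *SeqStack) Pop() (int, bool) {
	if s.top == -1 {
		return 0, false
	}
	v := s.data[s.top]
	s.top--
	return v, true
}

// Peek returns the top element without removing it.
func (s *SeqStack) Peek() (int, bool) {
	if s.top == -1 {
		return 0, false
	}
	return s.data[s.top], true
}

func main() {
	s := NewSeqStack(3)
	s.Push(1)
	s.Push(2)
	s.Push(3)
	fmt.Println(s.Push(4)) // false: the stack is full
	v, _ := s.Pop()
	fmt.Println(v) // 3: last in, first out
}
```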

3.1.3 Stack chain storage

  • Basic concept

The chain storage structure of the stack is called the chain stack (linked stack) for short.

Consider the following question:

A stack only inserts and deletes at the top. Should the top of the stack be placed at the head or at the tail of the linked list?

A singly linked list already has a head pointer, and a stack needs a top pointer, so why not combine the two? The better choice is therefore to put the top of the stack at the head of the singly linked list. In addition, once the top of the stack is at the head, the header node that is commonly used in singly linked lists loses its purpose, so a chain stack generally does not need a header node.

  • Design and implementation

The chain stack is a special linear list, so it can be implemented on top of a linked linear list.
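A minimal sketch of that design in Go, with the stack top at the head of the list and no header node. The names LinkStack and stackNode are assumptions made for this example.

```go
package main

import "fmt"

// stackNode is one node of the chain stack.
type stackNode struct {
	value int
	next  *stackNode
}

// LinkStack is a hypothetical chain stack: top doubles as the list's head pointer.
type LinkStack struct {
	top  *stackNode
	size int
}

// Push links a new node in front of the current top.
func (s *LinkStack) Push(v int) {
	s.top = &stackNode{value: v, next: s.top}
	s.size++
}

// Pop unlinks the top node and returns its value; it fails on an empty stack.
func (s *LinkStack) Pop() (int, bool) {
	if s.top == nil {
		return 0, false
	}
	v := s.top.value
	s.top = s.top.next
	s.size--
	return v, true
}

func main() {
	s := &LinkStack{}
	for i := 1; i <= 3; i++ {
		s.Push(i)
	}
	for s.size > 0 {
		v, _ := s.Pop()
		fmt.Print(v, " ") // 3 2 1
	}
	fmt.Println()
}
```

Because both operations touch only the head of the list, neither has to walk the list.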

3.1.4 Stack case (application)

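  • Function call model

The stack is the basis on which the nested function call mechanism is implemented.

  • Nearest match

Almost all compilers can detect whether parentheses match, so how is this paired-symbol detection implemented in a compiler? Take the following string: 5+5*(6)+9/3*1)-(1+3)

Algorithm idea:

Scan starting from the first character
When an ordinary character is encountered, ignore it
When a left parenthesis is encountered, push it onto the stack
When a right parenthesis is encountered, pop the top symbol from the stack and try to match it
Match succeeds: continue with the next character
Match fails: stop immediately and report an error
End:
Success: all characters have been scanned and the stack is empty
Failure: a match failed, or all characters have been scanned but the stack is not empty

  • When you need to detect things that appear in pairs but are not adjacent to each other, you can use the "last in, first out" feature of the stack.
  • The stack is very suitable for situations where "nearest matching" is required.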
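The scanning procedure above can be sketched in Go like this. MatchParens is a name I made up; as a simplifying assumption it only handles round parentheses and keeps the index of each left parenthesis on the stack so that the error position can be reported.

```go
package main

import "fmt"

// MatchParens scans s once. It returns (true, -1) when every parenthesis is
// matched. Otherwise it returns false and an index: either a ')' that has no
// partner, or the most recent '(' that was never closed.
func MatchParens(s string) (bool, int) {
	var stack []int // indices of '(' still waiting for a ')'
	for i := 0; i < len(s); i++ {
		switch s[i] {
		case '(':
			stack = append(stack, i) // push
		case ')':
			if len(stack) == 0 {
				return false, i // nothing to match: stop immediately
			}
			stack = stack[:len(stack)-1] // pop the nearest '('
		}
		// every other character is ignored
	}
	if len(stack) > 0 {
		return false, stack[len(stack)-1]
	}
	return true, -1
}

func main() {
	ok, pos := MatchParens("5+5*(6)+9/3*1)-(1+3)")
	fmt.Println(ok, pos) // false 13
}
```

For the example string, the scan stops at index 13: the right parenthesis after `3*1` arrives when the stack is already empty.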
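
  • Infix expressions and postfix expressions

Postfix expressions are also known as reverse Polish notation, named after the Polish logician Jan Łukasiewicz, whose prefix notation they reverse.

  • Putting the operator after the numbers ===> suits the way a computer operates
  • The mathematical expressions we are used to are called infix expressions ===> suit human thinking habits

Example:

Infix-to-postfix conversion algorithm:

Traverse the numbers and symbols in the infix expression:
For a number: output it directly
For a symbol:
Left parenthesis: push it onto the stack
Operator: compare its priority with that of the symbol on top of the stack
If the symbol on top of the stack has lower priority: push this operator onto the stack
(a left parenthesis on top of the stack is treated as having the lowest priority)
If the symbol on top of the stack does not have lower priority: pop the top symbol and output it, repeat the comparison, and then push this operator onto the stack
Right parenthesis: pop and output the symbols on top of the stack until the matching left parenthesis is found, then discard both the left and the right parenthesis
End of traversal: pop and output all the symbols left in the stack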
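A minimal Go sketch of the conversion rules above. InfixToPostfix and priority are hypothetical names; the sketch assumes non-negative integer operands, the four operators + - * /, round parentheses, and well-formed input.

```go
package main

import (
	"fmt"
	"strings"
)

// priority returns the precedence of an operator. A left parenthesis gets
// the lowest priority so that it stays on the stack until ')' arrives.
func priority(op byte) int {
	switch op {
	case '+', '-':
		return 1
	case '*', '/':
		return 2
	}
	return 0
}

// InfixToPostfix converts an infix expression into a space-separated postfix string.
func InfixToPostfix(expr string) string {
	var out []string
	var stack []byte
	for i := 0; i < len(expr); i++ {
		c := expr[i]
		switch {
		case c >= '0' && c <= '9': // number: output it directly
			j := i
			for j < len(expr) && expr[j] >= '0' && expr[j] <= '9' {
				j++
			}
			out = append(out, expr[i:j])
			i = j - 1
		case c == '(':
			stack = append(stack, c)
		case c == ')': // pop and output until the matching '(' is found
			for len(stack) > 0 && stack[len(stack)-1] != '(' {
				out = append(out, string(stack[len(stack)-1]))
				stack = stack[:len(stack)-1]
			}
			if len(stack) > 0 {
				stack = stack[:len(stack)-1] // discard the '('
			}
		case strings.IndexByte("+-*/", c) >= 0:
			// pop while the top of the stack does not have lower priority
			for len(stack) > 0 && priority(stack[len(stack)-1]) >= priority(c) {
				out = append(out, string(stack[len(stack)-1]))
				stack = stack[:len(stack)-1]
			}
			stack = append(stack, c)
		}
		// spaces and any other characters are skipped
	}
	for len(stack) > 0 { // end of traversal: flush the stack
		out = append(out, string(stack[len(stack)-1]))
		stack = stack[:len(stack)-1]
	}
	return strings.Join(out, " ")
}

func main() {
	fmt.Println(InfixToPostfix("8 + (3 - 1) * 5")) // 8 3 1 - 5 * +
}
```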
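
  • Calculating from a postfix expression

How does a computer calculate from a postfix expression? For example: 8 3 1 - 5 * +

Calculation rules:

Traverse the numbers and symbols in the postfix expression
For a number: push it onto the stack
For a symbol:
Pop the right operand from the stack
Pop the left operand from the stack
Perform the operation given by the symbol
Push the result onto the stack
End of traversal: the only number left in the stack is the result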
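The calculation rules can be sketched in Go as below. EvalPostfix is a made-up name, and the sketch assumes space-separated tokens, integer operands, and integer division.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// EvalPostfix evaluates a space-separated postfix expression over integers.
func EvalPostfix(expr string) (int, error) {
	var stack []int
	for _, tok := range strings.Fields(expr) {
		if n, err := strconv.Atoi(tok); err == nil {
			stack = append(stack, n) // number: push
			continue
		}
		if len(stack) < 2 {
			return 0, fmt.Errorf("operator %q needs two operands", tok)
		}
		// the right operand is popped first, then the left operand
		right, left := stack[len(stack)-1], stack[len(stack)-2]
		stack = stack[:len(stack)-2]
		var res int
		switch tok {
		case "+":
			res = left + right
		case "-":
			res = left - right
		case "*":
			res = left * right
		case "/":
			if right == 0 {
				return 0, errors.New("division by zero")
			}
			res = left / right
		default:
			return 0, fmt.Errorf("unknown token %q", tok)
		}
		stack = append(stack, res) // push the result back
	}
	if len(stack) != 1 {
		return 0, errors.New("malformed expression")
	}
	return stack[0], nil
}

func main() {
	v, err := EvalPostfix("8 3 1 - 5 * +")
	fmt.Println(v, err) // 18 <nil>
}
```

Tracing the example: 3 and 1 give 2, then 2 and 5 give 10, and finally 8 and 10 give 18, which matches the infix form 8 + (3 - 1) * 5.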

3.2 Queue

3.2.1 Basic concepts

A queue is a special, restricted linear list.

A queue is a linear list that allows insertion only at one end and deletion only at the other end.

The queue is a first-in, first-out (FIFO) linear table. The end that allows insertion is the tail of the queue, and the end that allows deletion is the head of the queue. The queue does not allow operations in the middle! Assume the queue is q = (a1, a2, ..., an); then a1 is the head element and an is the tail element. Deletion always starts from a1, and insertion always happens at the tail. This matches our daily habits: whoever is first in line is served first, and whoever arrives last goes to the end of the line. As shown below:

3.2.2 Queue sequential storage

  • Basic concept

  • Sequential circular queue

3.2.3 Queue chain storage

  • Basic concept

The queue is also a special linear table, so the chain storage of a linear table can be used to simulate the chain storage of the queue.

3.2.4 Queue implementation in Go

Code source (found online): Implementation of golang queue

package main

import "fmt"

// node is one element of the queue. value can hold any type; prev and next
// point to the neighbouring nodes.
type node struct {
	value interface{}
	prev  *node
	next  *node
}

// LinkedQueue is the linked list itself: pointers to the head and tail
// nodes, plus the number of elements in the queue.
type LinkedQueue struct {
	head *node
	tail *node
	size int
}

// Size returns the number of elements in the queue.
func (queue *LinkedQueue) Size() int {
	return queue.size
}

// Peek returns the element at the head of the queue without removing it.
// It panics if the queue is empty.
func (queue *LinkedQueue) Peek() interface{} {
	if queue.head == nil {
		panic("Empty Queue!")
	}
	return queue.head.value
}

// Add appends value at the tail. If the queue is empty (tail == nil), the
// new node becomes both head and tail; otherwise it is linked after the old tail.
func (queue *LinkedQueue) Add(value interface{}) {
	newnode := &node{value, queue.tail, nil}
	if queue.tail == nil {
		queue.head = newnode
		queue.tail = newnode
	} else {
		queue.tail.next = newnode
		queue.tail = newnode
	}
	queue.size++
}

// Remove unlinks the node at the head of the queue.
// It panics if the queue is empty.
func (queue *LinkedQueue) Remove() {
	if queue.head == nil {
		panic("Empty queue.")
	}
	firstNode := queue.head
	queue.head = firstNode.next
	if queue.head == nil {
		// The queue is now empty: clear tail too, otherwise a later Add
		// would link onto the removed node and leave head == nil.
		queue.tail = nil
	} else {
		queue.head.prev = nil
	}
	firstNode.next = nil
	firstNode.value = nil
	queue.size--
}

func main() {
	queue := &LinkedQueue{}
	for i := 1; i <= 5; i++ {
		queue.Add(i)
	}
	fmt.Println("Queue size after initialization:", queue.Size())
	fmt.Println("Queue head after initialization:", queue.Peek())
	queue.Remove()
	fmt.Println("Queue size after deletion:", queue.Size())
	fmt.Println("Queue head after deletion:", queue.Peek())
}

Output result:

Queue size after initialization: 5
Queue head after initialization: 1
Queue size after deletion: 4
Queue head after deletion: 2

4 Trees and Binary Trees

4.1 Basic concepts of trees

Tree definition:

A tree is a finite set T of n (n ≥ 0) nodes. A non-empty tree has one and only one node called the root. When n > 1, the remaining nodes are divided into m (m > 0) mutually disjoint finite sets T1, T2, ..., Tm. Each set is itself a tree, called a subtree of the root.

Structural characteristics of trees:

  • Non-linear structure: each node has at most one direct predecessor, but may have multiple direct successors (1:n)
  • The definition of a tree is recursive: there are trees within trees.
  • The tree can be empty, that is, the number of nodes is 0.

Some terms:

  • Root - the root node (no predecessor)
  • Leaf - a terminal node (no successor)
  • Forest - a collection of m disjoint trees (for example, the subtrees left after deleting the root A)
  • Ordered tree - the subtrees of each node are ordered from left to right and cannot be interchanged (left comes first)
  • Unordered tree - the subtrees of each node can exchange positions.
  • Parent - the node one level up (direct predecessor)
  • Child - the node one level down, that is, the root of a subtree of the node (direct successor)
  • Siblings (brothers) - nodes at the same level under the same parent (the children of one node are siblings of each other)
  • Cousins - nodes whose parents are on the same level (but are not the same parent)
  • Ancestors - all nodes on the path from the root down to this node
  • Descendants - any node in the subtrees below the node
  • Node - a data element of the tree
  • Degree of a node - the number of subtrees attached to the node (the number of direct successors)
  • Level of a node - the number of levels from the root to the node (the root node counts as the first level)
  • Terminal node - a node with degree 0, that is, a leaf
  • Branch node - a node with degree greater than 0, that is, a non-leaf node (branch nodes other than the root are also called internal nodes)
  • Degree of the tree - the maximum of the degrees of all nodes (Max{degree of each node})
  • Depth (or height) of the tree - the maximum level among all nodes (Max{level of each node})

The number of nodes in (c) below = 10, the degree of the tree = 3, and the depth of the tree = 3

4.2 Tree representation

4.2.1 Graphical representation

4.2.2 Generalized table representation

Represent the above graph in generalized table notation:

China (Hebei (Baoding, Shijiazhuang), Guangdong (Guangzhou, Dongguan), Shandong (Qingdao, Jinan))

The root is written to the left of the parentheses as the name of the table, and the table itself consists of the forest of the root's subtrees.

4.2.3 Representation of left child and right brother

The left-child right-sibling notation can convert a multi-way tree into a binary tree.

Node structure:

A node has two pointer fields: one points to its first (leftmost) child node, and the other points to its next sibling node.

4.3 Binary tree concept

4.3.1 Basic concepts

  • Definition

A binary tree is a finite set of n (n ≥ 0) nodes. It is either empty (n = 0) or consists of a root node and two disjoint binary trees, called the left subtree and the right subtree.

  • Logical structure

One to two (1:2)

  • Basic features

Each node has at most two subtrees (no node has degree greater than 2);

The order of the left subtree and the right subtree cannot be reversed (it is an ordered tree).

  • Basic forms

  • Properties

Property 1: If the level of the root node is 1, then there are at most $2^{i-1}$ nodes on the i-th level of a binary tree (i > 0).

Property 2: A binary tree with height k has at most $2^k - 1$ nodes (k ≥ 0).

Property 3: For any binary tree, if there are $n_2$ nodes with degree 2, then the number of leaves $n_0$ must be $n_2 + 1$ (i.e. $n_0 = n_2 + 1$).

Property 4: The depth of a complete binary tree with n nodes must be $\lfloor \log_2 n \rfloor + 1$.

Property 5: For a complete binary tree whose nodes are numbered from top to bottom and from left to right starting at 1, the node numbered i has its left child numbered 2i and its right child numbered 2i+1 (when those numbers do not exceed n), and its parent numbered $\lfloor i/2 \rfloor$ (except when i = 1, which is the root).

This property makes it possible to store a complete binary tree sequentially, in an array.
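Properties 4 and 5 are easy to check with a few lines of Go. This is just an illustrative sketch: the function names are mine, the nodes live in a slice whose index 0 is left unused so that numbering starts at 1, and Depth relies on the fact that bits.Len(n) equals $\lfloor \log_2 n \rfloor + 1$ for n > 0.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Index arithmetic for a complete binary tree stored in an array with
// 1-based numbering (Property 5).
func Left(i int) int   { return 2 * i }
func Right(i int) int  { return 2*i + 1 }
func Parent(i int) int { return i / 2 } // integer division = floor(i/2); 0 means "no parent"

// Depth returns floor(log2 n) + 1, the depth of a complete binary tree
// with n nodes (Property 4). It returns 0 for an empty tree.
func Depth(n int) int {
	if n <= 0 {
		return 0
	}
	return bits.Len(uint(n))
}

func main() {
	// nodes[0] is unused; the tree has 6 nodes: A at the root, then B C, then D E F.
	nodes := []byte{0, 'A', 'B', 'C', 'D', 'E', 'F'}
	n := len(nodes) - 1

	i := 2 // node B
	fmt.Printf("%c: left=%c right=%c parent=%c\n",
		nodes[i], nodes[Left(i)], nodes[Right(i)], nodes[Parent(i)])
	// B: left=D right=E parent=A
	fmt.Println("depth:", Depth(n)) // depth: 3
}
```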

What if it is not a complete binary tree? Convert it into a complete binary tree first (for example, by filling the missing positions with empty placeholder nodes).

  • Full binary tree

A binary tree with depth k and $2^k - 1$ nodes.

Features: every level is "full" of nodes.

  • Complete binary tree

Every level except the last has the maximum number of nodes; on the last level, only some nodes on the right are missing.

Understanding: the first k-1 levels are exactly the same as in a full binary tree, and the nodes on the k-th level are packed as far to the left as possible.

4.3.2 Representation of binary tree

Binary trees mainly use linked storage structures. Sequential storage structures are suitable only for complete binary trees and full binary trees; a static linked list can also be used.

  • The sequential storage structure of a binary tree:

  • The linked storage structure of a binary tree:

There are two main linked storage structures for binary trees: the binary linked list and the ternary (three-pointer) linked list.

Binary linked list:

The node structure of the binary linked list is as follows:

// Node data type definition
type BinaryNode struct {
	Ch byte // data stored in the node

	LChild *BinaryNode // left child
	RChild *BinaryNode // right child
}

Ternary linked list:

The node structure of the ternary linked list is as follows:

Each node has three pointer fields: two point to the child nodes (left child and right child), and the third points to the parent of the node.

// Node data type definition
type BinaryNode struct {
	Ch     byte        // data stored in the node
	LChild *BinaryNode // left child
	RChild *BinaryNode // right child
	Parent *BinaryNode // parent node
}

4.3.3 Binary tree traversal

Definition of traversal:

Traversal means visiting every node exactly once, following a certain search route.

Purpose of traversal:

It is the prerequisite for inserting into, deleting from, modifying, searching, and sorting a tree structure, and it is the basis and core of all operations on binary trees.

Traversal methods:

Keep one convention in mind: for every node, the left subtree is always handled before the right subtree.

Note: "pre", "in", and "post" describe whether the visited node D appears before its subtrees, between them, or after them.

From a recursive point of view, the three algorithms are essentially the same: they follow the same path through the tree, and only the moment at which each node is visited differs.

On the path from the starting point of the dotted line to its end point, each node is passed three times.

Visit on the first pass = pre-order traversal

Visit on the second pass = in-order traversal

Visit on the third pass = post-order traversal
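To see that the three traversals differ only in when the node is visited, here is a small recursive sketch in Go. It reuses the BinaryNode layout from 4.3.2; the visit callback, the collect helper, and the five-node sample tree are assumptions added for the example.

```go
package main

import "fmt"

type BinaryNode struct {
	Ch     byte
	LChild *BinaryNode
	RChild *BinaryNode
}

// PreOrder visits the node on the first pass, before either subtree.
func PreOrder(n *BinaryNode, visit func(byte)) {
	if n == nil {
		return
	}
	visit(n.Ch)
	PreOrder(n.LChild, visit)
	PreOrder(n.RChild, visit)
}

// InOrder visits the node on the second pass, between the two subtrees.
func InOrder(n *BinaryNode, visit func(byte)) {
	if n == nil {
		return
	}
	InOrder(n.LChild, visit)
	visit(n.Ch)
	InOrder(n.RChild, visit)
}

// PostOrder visits the node on the third pass, after both subtrees.
func PostOrder(n *BinaryNode, visit func(byte)) {
	if n == nil {
		return
	}
	PostOrder(n.LChild, visit)
	PostOrder(n.RChild, visit)
	visit(n.Ch)
}

// collect runs one traversal and gathers the visited characters into a string.
func collect(walk func(*BinaryNode, func(byte)), root *BinaryNode) string {
	var out []byte
	walk(root, func(ch byte) { out = append(out, ch) })
	return string(out)
}

// sampleTree builds A with children B and C, where B has children D and E.
func sampleTree() *BinaryNode {
	d := &BinaryNode{Ch: 'D'}
	e := &BinaryNode{Ch: 'E'}
	b := &BinaryNode{Ch: 'B', LChild: d, RChild: e}
	c := &BinaryNode{Ch: 'C'}
	return &BinaryNode{Ch: 'A', LChild: b, RChild: c}
}

func main() {
	root := sampleTree()
	fmt.Println(collect(PreOrder, root))  // ABDEC
	fmt.Println(collect(InOrder, root))   // DBEAC
	fmt.Println(collect(PostOrder, root)) // DEBCA
}
```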

4.3.4 Non-recursive traversal of binary trees

The three recursive traversal algorithms for binary trees can also be implemented non-recursively by setting up a stack. Taking in-order traversal as an example, its non-recursive algorithm and the changes of the stack are shown in the figure below.

First, set a flag for each node. The default flag is false. The following process is performed according to the status of the node.

Executing the above process gives the result of a pre-order traversal. To get the other binary tree traversal results, just modify step 2.4.
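The figure with the numbered steps is not reproduced here, so the following Go sketch is only my reading of the flag idea, not necessarily the exact process from the original figure. Each stack entry carries a flag: a node popped with flag false is pushed back with flag true together with its children, and a node popped with flag true is output. The order in which the three entries are pushed back is the only thing that changes between pre-order, in-order, and post-order. The names Traverse and item are assumptions.

```go
package main

import "fmt"

type BinaryNode struct {
	Ch     byte
	LChild *BinaryNode
	RChild *BinaryNode
}

// item pairs a node with its flag: false on first encounter, true once the
// node is ready to be output.
type item struct {
	node *BinaryNode
	flag bool
}

// Traverse walks the tree without recursion. order is "pre", "in", or "post".
func Traverse(root *BinaryNode, order string) string {
	var out []byte
	stack := []item{{root, false}}
	for len(stack) > 0 {
		cur := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if cur.node == nil {
			continue
		}
		if cur.flag { // second encounter: output the node
			out = append(out, cur.node.Ch)
			continue
		}
		self := item{cur.node, true}
		left := item{cur.node.LChild, false}
		right := item{cur.node.RChild, false}
		// The stack is last in, first out, so entries are pushed in the
		// reverse of the order in which they should be processed.
		switch order {
		case "in":
			stack = append(stack, right, self, left)
		case "post":
			stack = append(stack, self, right, left)
		default: // "pre"
			stack = append(stack, right, left, self)
		}
	}
	return string(out)
}

func main() {
	d := &BinaryNode{Ch: 'D'}
	e := &BinaryNode{Ch: 'E'}
	b := &BinaryNode{Ch: 'B', LChild: d, RChild: e}
	c := &BinaryNode{Ch: 'C'}
	root := &BinaryNode{Ch: 'A', LChild: b, RChild: c}

	fmt.Println(Traverse(root, "pre"))  // ABDEC
	fmt.Println(Traverse(root, "in"))   // DBEAC
	fmt.Println(Traverse(root, "post")) // DEBCA
}
```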

4.3.5 Level traversal of binary trees

The level traversal process and queue changes of the binary tree are shown in the figure below.
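Since the figure is not available here, a short Go sketch of level traversal with a queue may help. LevelOrder is a hypothetical name, and a slice stands in for the queue: the root is enqueued first, and every dequeued node enqueues its left and then its right child.

```go
package main

import "fmt"

type BinaryNode struct {
	Ch     byte
	LChild *BinaryNode
	RChild *BinaryNode
}

// LevelOrder visits the nodes level by level, left to right within a level.
func LevelOrder(root *BinaryNode) string {
	if root == nil {
		return ""
	}
	var out []byte
	queue := []*BinaryNode{root}
	for len(queue) > 0 {
		n := queue[0] // dequeue from the head
		queue = queue[1:]
		out = append(out, n.Ch)
		if n.LChild != nil { // enqueue children at the tail
			queue = append(queue, n.LChild)
		}
		if n.RChild != nil {
			queue = append(queue, n.RChild)
		}
	}
	return string(out)
}

func main() {
	d := &BinaryNode{Ch: 'D'}
	e := &BinaryNode{Ch: 'E'}
	b := &BinaryNode{Ch: 'B', LChild: d, RChild: e}
	c := &BinaryNode{Ch: 'C'}
	root := &BinaryNode{Ch: 'A', LChild: b, RChild: c}
	fmt.Println(LevelOrder(root)) // ABCDE
}
```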

4.3.6 Application of binary trees: Huffman tree and Huffman coding

Path length of a binary tree: The sum of the path lengths from the root node to all nodes is called the path length of the binary tree, that is

$$PL = \sum_{i=1}^{n} l_i$$

where n is the number of nodes and $l_i$ is the path length from the root to the i-th node.

Among binary trees with the same number of nodes, a complete binary tree has the shortest path length, but the converse does not hold: a binary tree with the shortest path length is not necessarily a complete binary tree.

Outer path length of a binary tree: The sum of the path lengths from the root node to all leaf nodes is called the outer path length of the binary tree. The total coding length of a coding scheme is the outer path length of the corresponding coding binary tree, and the outer path length of a complete binary tree is the shortest.

Weighted outer path length of a binary tree: When characters have different usage probabilities, the usage probability of each character is used as the value of its leaf node in the binary tree, which is called its weight. The weighted path length from the root to a node X is the product of the weight of X and the path length from the root to X. The sum of the weighted path lengths of all leaf nodes is called the weighted outer path length of the binary tree, that is

$$WPL = \sum_{i=1}^{n} w_i l_i$$

In the formula, n is the number of leaf nodes, $w_i$ is the weight of the i-th leaf node, and $l_i$ is the path length from the root to the i-th leaf node.

Construct a Huffman tree and obtain the Huffman code:

A Huffman tree is defined as a binary tree with the shortest weighted outer path length. Given n leaf nodes and their set of weights, the corresponding Huffman tree is not unique.

The idea of the Huffman tree construction algorithm is to place the leaf nodes with larger weights closer to the root node, so that the resulting binary tree has the smallest weighted outer path length. Concretely, the two trees with the smallest root weights are merged under a new parent again and again until only one tree remains.

Decoding of Huffman coding:

The process of decoding with a Huffman tree: given a binary bit string S, start from the first bit of S and from the root node of the Huffman tree, and match the bits one by one against the 0s and 1s marked on the edges of the tree, going left on 0 and right on 1. A run of consecutive 0s and 1s determines a path from the root to a leaf node. Each time a leaf node is reached, one character has been decoded, and matching starts again from the root.
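The decoding walk can be sketched in Go as follows. The tree here is a tiny hand-built one (A = 0, B = 10, C = 11) chosen only for illustration, not the tree from the weight set discussed below; HuffNode and Decode are assumed names, and the sketch assumes the tree has at least two leaves.

```go
package main

import (
	"errors"
	"fmt"
)

// HuffNode is a node of a coding tree; only leaves carry a character.
type HuffNode struct {
	Ch          byte
	Left, Right *HuffNode
}

// Decode walks the tree bit by bit: left on '0', right on '1'. Every time a
// leaf is reached, its character is emitted and the walk restarts at the root.
func Decode(root *HuffNode, bits string) (string, error) {
	if root == nil {
		return "", errors.New("empty tree")
	}
	var out []byte
	cur := root
	for i := 0; i < len(bits); i++ {
		switch bits[i] {
		case '0':
			cur = cur.Left
		case '1':
			cur = cur.Right
		default:
			return "", fmt.Errorf("invalid bit %q at position %d", bits[i], i)
		}
		if cur == nil {
			return "", errors.New("bit string leaves the tree")
		}
		if cur.Left == nil && cur.Right == nil { // reached a leaf
			out = append(out, cur.Ch)
			cur = root
		}
	}
	if cur != root {
		return "", errors.New("bit string ends in the middle of a code")
	}
	return string(out), nil
}

func main() {
	// Hand-built example tree: A = 0, B = 10, C = 11.
	root := &HuffNode{
		Left: &HuffNode{Ch: 'A'},
		Right: &HuffNode{
			Left:  &HuffNode{Ch: 'B'},
			Right: &HuffNode{Ch: 'C'},
		},
	}
	s, err := Decode(root, "010110")
	fmt.Println(s, err) // ABCA <nil>
}
```

Because no code is a prefix of another, the decoder never needs a separator between characters.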

  • Using a static linked list to store the Huffman tree

The node structure of the static linked list is as follows:

A one-dimensional array is used to store the nodes. For a Huffman tree with n leaf nodes, the array length is 2n-1. The first n elements store the leaf nodes, and the last n-1 elements store the degree-2 nodes. Suppose a Huffman tree is constructed from the weight set {5,29,7,8,14,23,3,11}. The initial state and final state of the node array are as shown in the figure below.

If the weight set {5,29,7,8,14,23,3,11} corresponds to the characters A~H, the Huffman tree and Huffman coding are as shown in the figure below.
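As a hedged sketch of the array layout described above, the Go program below builds a Huffman tree in a slice of 2n-1 nodes. The field names (Weight, Parent, LChild, RChild), the use of -1 for "no link", and the function names are my assumptions, since the original node-structure figure is not available. Which of two equal weights is picked first is also a choice of this sketch, so the individual codes may differ from the figure, but the weighted outer path length does not depend on it.

```go
package main

import "fmt"

// HTNode is one slot of the static linked list; links are array indices.
type HTNode struct {
	Weight                 int
	Parent, LChild, RChild int // -1 means "none"
}

// BuildHuffman stores the n leaves in ht[0:n] and the n-1 merged nodes in
// ht[n:2n-1]. The root ends up in the last slot.
func BuildHuffman(weights []int) []HTNode {
	n := len(weights)
	if n == 0 {
		return nil
	}
	ht := make([]HTNode, 2*n-1)
	for i := range ht {
		ht[i] = HTNode{Parent: -1, LChild: -1, RChild: -1}
	}
	for i, w := range weights {
		ht[i].Weight = w
	}
	for k := n; k < 2*n-1; k++ {
		// Find the two parentless nodes with the smallest weights in ht[0:k].
		s1, s2 := -1, -1
		for i := 0; i < k; i++ {
			if ht[i].Parent != -1 {
				continue
			}
			if s1 == -1 || ht[i].Weight < ht[s1].Weight {
				s2, s1 = s1, i
			} else if s2 == -1 || ht[i].Weight < ht[s2].Weight {
				s2 = i
			}
		}
		ht[s1].Parent, ht[s2].Parent = k, k
		ht[k] = HTNode{Weight: ht[s1].Weight + ht[s2].Weight, Parent: -1, LChild: s1, RChild: s2}
	}
	return ht
}

// Codes reads each leaf's code by climbing from the leaf to the root:
// a left edge contributes 0 and a right edge contributes 1.
func Codes(ht []HTNode, n int) []string {
	codes := make([]string, n)
	for i := 0; i < n; i++ {
		code := ""
		for c, p := i, ht[i].Parent; p != -1; c, p = p, ht[p].Parent {
			if ht[p].LChild == c {
				code = "0" + code
			} else {
				code = "1" + code
			}
		}
		codes[i] = code
	}
	return codes
}

// WPL sums weight * code length over all leaves.
func WPL(weights []int, codes []string) int {
	total := 0
	for i, w := range weights {
		total += w * len(codes[i])
	}
	return total
}

func main() {
	weights := []int{5, 29, 7, 8, 14, 23, 3, 11} // characters A~H
	ht := BuildHuffman(weights)
	codes := Codes(ht, len(weights))
	for i, code := range codes {
		fmt.Printf("%c (weight %d): %s\n", 'A'+i, weights[i], code)
	}
	fmt.Println("WPL =", WPL(weights, codes)) // WPL = 271
}
```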

Origin blog.csdn.net/weixin_41551445/article/details/126329195