A brief discussion of data structures - Introduction to trees

data structure

Data structures are one of the fundamental concepts of computer science. They concern how data is organized and stored so that it can be accessed, managed, and manipulated efficiently. A data structure is a way of describing or representing data: it defines how the data is stored, how its elements relate to one another, and which operations can be performed on it.

Here are some common data structure types:

  1. Tree: a non-linear data structure used to represent data with hierarchical relationships. A common example is the binary tree, in which each node has at most two child nodes.
  2. Graph: a non-linear data structure consisting of any number of nodes and the connections between them. Graphs can be directed or undirected, can contain cycles, and can contain self-loops.
  3. Set: a data structure used to store unique elements. The main operations of a set are adding elements (add) and removing elements (remove).
  4. Hash table (Hash): a data structure used to implement fast lookup. A hash table maps keys to buckets through a hash function, and elements are stored in and looked up from those buckets.
  5. Stack: a last-in-first-out (LIFO) data structure in which elements can only be inserted and deleted at one end. The main operations of a stack are push (adding an element) and pop (removing an element).
  6. Queue: a first-in-first-out (FIFO) data structure in which elements are added at one end and removed at the other. The main operations of a queue are enqueue (adding an element) and dequeue (removing an element).
  7. Array: a linear data structure consisting of a sequence of elements of the same type. Arrays are contiguous in memory, and the element at any position can be accessed by index.
  8. Linked List: a linear data structure consisting of a series of nodes, each of which contains a value and a pointer to the next node. A linked list does not require its elements to be contiguous in memory.
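
To make the LIFO and FIFO behavior in the list above concrete, here is a minimal sketch of an array-backed stack and a circular-buffer queue in C. The names Stack, Queue, CAPACITY and the four functions are made up for this illustration, and the fixed capacity is an assumption chosen only to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CAPACITY 8	/* assumed fixed capacity for this sketch */

/* Stack: elements are pushed and popped at the same end (LIFO). */
typedef struct {
	int items[CAPACITY];
	int top;	/* number of elements currently stored */
} Stack;

bool stack_push(Stack *s, int value)
{
	assert(s != NULL);
	if (s->top == CAPACITY)
		return false;	/* stack is full */
	s->items[s->top++] = value;
	return true;
}

bool stack_pop(Stack *s, int *out)
{
	assert(s != NULL && out != NULL);
	if (s->top == 0)
		return false;	/* stack is empty */
	*out = s->items[--s->top];
	return true;
}

/* Queue: elements enter at the tail and leave at the head (FIFO). */
typedef struct {
	int items[CAPACITY];
	int head;	/* index of the oldest element */
	int count;	/* number of elements currently stored */
} Queue;

bool queue_enqueue(Queue *q, int value)
{
	assert(q != NULL);
	if (q->count == CAPACITY)
		return false;	/* queue is full */
	q->items[(q->head + q->count) % CAPACITY] = value;
	q->count++;
	return true;
}

bool queue_dequeue(Queue *q, int *out)
{
	assert(q != NULL && out != NULL);
	if (q->count == 0)
		return false;	/* queue is empty */
	*out = q->items[q->head];
	q->head = (q->head + 1) % CAPACITY;
	q->count--;
	return true;
}
```

Pushing 1 and then 2 onto the stack pops them back as 2, 1, while enqueuing 1 and then 2 dequeues them as 1, 2.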

These common data structures each have their own advantages and disadvantages, and the appropriate data structure should be selected according to the specific application scenario.

For example, for storing and accessing a large number of elements, arrays and hash tables may be better choices; for data that needs efficient insertion and deletion, linked lists and dynamic arrays may be better choices; for data that needs efficient search, search trees may be a better choice.

Furthermore, data structures and algorithms are often closely related.

Understanding and mastering various data structures and their operation methods can help us design and implement algorithms more effectively and optimize program performance.

Tree

tree definition

A tree is a finite set of n (n>=0) nodes. When n = 0, it is called an empty tree.

Any non-empty tree satisfies the following:

  • There is exactly one specific node called the root.
  • When n>1, the remaining nodes can be divided into m (m>0) disjoint finite sets T1, T2,...,Tm, where each set is itself a tree and is called a subtree of the root.

Obviously, the definition of a tree is recursive, that is, the definition of a tree uses the tree itself. A tree is a recursive data structure.

As a logical structure and a hierarchical structure, a tree has the following two characteristics:

  • The root node of the tree has no predecessor, and all nodes except the root node have one and only one predecessor.
  • All nodes in the tree can have zero or more successors.

Therefore, a tree of n nodes has n-1 edges: every node except the root is connected to its one predecessor by exactly one edge.

basic terminology

Below, we will explain some basic terms and concepts of trees with illustrations.

Node, ancestor, descendant, parent, sibling

Consider node K. Any node on the unique path from root A to node K is called an ancestor of node K. For example, node B is an ancestor of node K, and node K is a descendant of node B. The node E closest to node K on the path is called the parent of K, and K is a child of node E. Root A is the only node in the tree that has no parent. Nodes with the same parent are called siblings (brothers). For example, node K and node L have the same parent E, so K and L are siblings.

Degree of node, degree of tree

The number of children of a node in a tree is called the degree of the node, and the maximum degree of any node in the tree is called the degree of the tree.

For example, the degree of node B is 2, the degree of node D is 3, and the degree of the tree is 3.
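
The two definitions can be sketched in C as follows. This is only an illustration under an assumed encoding: the tree is given as an array in which parent[k] is the index of node k's parent and the root has parent -1. The names node_degree, tree_degree and MAX_NODES are made up for this sketch.

```c
#include <assert.h>

#define MAX_NODES 100	/* assumed upper bound on the number of nodes */

/* Degree of node i: the number of nodes whose parent is i. */
int node_degree(const int parent[], int n, int i)
{
	assert(n >= 0 && n <= MAX_NODES);
	assert(i >= 0 && i < n);
	int degree = 0;
	for (int k = 0; k < n; k++)
		if (parent[k] == i)
			degree++;
	return degree;
}

/* Degree of the tree: the maximum node degree (0 for an empty tree). */
int tree_degree(const int parent[], int n)
{
	assert(n >= 0 && n <= MAX_NODES);
	int count[MAX_NODES] = {0};	/* count[i] = number of children of node i */
	int max = 0;
	for (int k = 0; k < n; k++)
		if (parent[k] >= 0)
			count[parent[k]]++;
	for (int k = 0; k < n; k++)
		if (count[k] > max)
			max = count[k];
	return max;
}
```

For a hypothetical tree with parent array {-1, 0, 0, 0, 1, 1, 3, 3, 3}, node 1 has degree 2, node 3 has degree 3, and the tree has degree 3.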

Branch node/non-terminal node, leaf node/terminal node
  • A node with degree greater than 0 is called a branch node (also called a non-terminal node);
  • A node with degree 0 (no child nodes) is called a leaf node (also called a terminal node).

In branch nodes, the number of branches of each node is the degree of the node.

Node depth, height and level
  • The level of a node is defined starting from the root of the tree: the root is at level 1, its children are at level 2, and so on. Nodes whose parents are on the same level are cousins of each other. In the figure, nodes G, E, F, H, I, and J are cousins of each other.
  • The depth of a node is counted level by level from the root node downward.
  • The height of a node is counted level by level from the leaf nodes upward.
  • The height (or depth) of a tree is the maximum level of any node in the tree. The height of the tree in the figure is 4.
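
The level, node height, and tree height defined above can be sketched in C as follows. As an assumption for this illustration, the tree is encoded as an array in which parent[k] is the index of node k's parent and the root has parent -1; the function names are made up, and the code favors clarity over speed.

```c
#include <assert.h>

/* Level (depth) of node i: the root is at level 1, its children at level 2, ... */
int node_level(const int parent[], int n, int i)
{
	assert(i >= 0 && i < n);
	int level = 1;
	while (parent[i] != -1) {	/* climb toward the root */
		i = parent[i];
		level++;
	}
	return level;
}

/* Height of node i: levels counted upward from the deepest leaf below i
   (a leaf has height 1). */
int node_height(const int parent[], int n, int i)
{
	assert(i >= 0 && i < n);
	int height = 1;
	for (int k = 0; k < n; k++) {
		int steps = 1;	/* nodes visited while climbing from k */
		for (int cur = k; cur != -1; cur = parent[cur], steps++) {
			if (cur == i) {	/* k lies in the subtree rooted at i */
				if (steps > height)
					height = steps;
				break;
			}
		}
	}
	return height;
}

/* Height of the tree: the maximum level of any node (0 for an empty tree). */
int tree_height(const int parent[], int n)
{
	int height = 0;
	for (int i = 0; i < n; i++) {
		int level = node_level(parent, n, i);
		if (level > height)
			height = level;
	}
	return height;
}
```

For a hypothetical tree with parent array {-1, 0, 0, 1, 1, 3}, node 5 is at level 4, node 1 has height 3, and the tree has height 4.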

Ordered trees and unordered trees
  • If the subtrees of every node are ordered from left to right and cannot be interchanged, the tree is called an ordered tree; otherwise it is called an unordered tree.

Assume that the tree in the figure is an ordered tree. If the positions of a node's child nodes are interchanged, it becomes a different tree.

Path and path length

The path between two nodes in the tree is composed of the sequence of nodes passed between the two nodes, and the path length is the number of edges passed on the path.

Note: Since the branches in a tree are directed, that is, from parent to child, a path in the tree runs from top to bottom, and there is no path between two children of the same parent.
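
A small C sketch of this definition is shown below. As an assumption for the illustration, the tree is encoded as an array in which parent[k] is the index of node k's parent and the root has parent -1; path_length is a made-up name. Because paths only run from top to bottom, the function climbs from the lower node and reports -1 when the upper node is never reached.

```c
#include <assert.h>

/* Number of edges on the top-down path from node `from` to node `to`.
   Returns -1 if `from` is not `to` itself or one of its ancestors,
   in which case no path exists. */
int path_length(const int parent[], int n, int from, int to)
{
	assert(from >= 0 && from < n);
	assert(to >= 0 && to < n);
	int edges = 0;
	for (int cur = to; cur != -1; cur = parent[cur]) {
		if (cur == from)
			return edges;
		edges++;	/* one more edge climbed toward the root */
	}
	return -1;
}
```

For a hypothetical tree with parent array {-1, 0, 0, 1, 1, 3}, the path from node 0 to node 5 has length 3, while nodes 3 and 4, two children of node 1, have no path between them.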

Forest

A forest is a set of m (m≥0) disjoint trees.

The concept of a forest is very close to that of a tree: as soon as the root node of a tree is deleted, the tree becomes a forest.

Conversely, if a node is added to m independent trees and these m trees are made subtrees of that node, the forest becomes a tree.
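
The conversion from a forest to a tree can be sketched in C as follows. As an assumption for the illustration, the forest is encoded as an array in which parent[k] is the index of node k's parent and every root has parent -1; count_trees and join_forest are made-up names, and the caller is assumed to provide room for one extra node.

```c
#include <assert.h>

/* Number of trees in the forest: one per node without a parent. */
int count_trees(const int parent[], int n)
{
	assert(n >= 0);
	int roots = 0;
	for (int i = 0; i < n; i++)
		if (parent[i] == -1)
			roots++;
	return roots;
}

/* Adds a new node at index n and makes every existing root its child,
   turning the forest into a single tree. The array must have room for
   n + 1 entries. Returns the new node count. */
int join_forest(int parent[], int n)
{
	assert(n >= 0);
	for (int i = 0; i < n; i++)
		if (parent[i] == -1)
			parent[i] = n;	/* old root becomes a child of the new node */
	parent[n] = -1;	/* the new node is the root of the resulting tree */
	return n + 1;
}
```

Starting from a hypothetical forest {-1, 0, -1, 2, -1} with three trees, join_forest adds node 5 as the common root, after which count_trees reports a single tree.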

tree nature

Trees have the following most basic properties:

tree storage structure

In the process of introducing the following three storage structures, we take the following tree as an example.

parent representation

We assume that the nodes of the tree are stored in a contiguous block of space, and that each node carries an indicator recording the position of its parent in the array. In other words, in addition to knowing who it is, each node also knows where its parent is.

Here, data is the data field, which stores the data of the node, and parent is a pointer field, which stores the index of the node's parent in the array.

The following is the node structure definition code for our parent representation.

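/* Node structure definition for the parent representation of a tree */
#define MAX_TREE_SIZE 100
typedef int TElemType;	// data type of a tree node, int for now

/* Node structure */
typedef struct PTNode{
	TElemType data;	// node data
	int parent;	// index of the parent in the array, -1 for the root
}PTNode;

/* Tree structure */
typedef struct{
	PTNode nodes[MAX_TREE_SIZE];	// array of nodes
	int r, n;	// position of the root and number of nodes
}PTree;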

With such a storage structure, we can easily find the parent of a node through the node's parent field, in O(1) time. When parent is -1, the node is the root of the tree.

But if we want to know which nodes are the children of a node, we have to traverse the entire structure.
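
The difference between the two lookups can be sketched as follows. The structures repeat the parent representation so that the example stands on its own; the helper names pt_parent and pt_children are made up for this illustration.

```c
#include <assert.h>

#define MAX_TREE_SIZE 100
typedef int TElemType;

typedef struct PTNode {
	TElemType data;	/* node data */
	int parent;	/* index of the parent, -1 for the root */
} PTNode;

typedef struct {
	PTNode nodes[MAX_TREE_SIZE];	/* array of nodes */
	int r, n;	/* position of the root and number of nodes */
} PTree;

/* Parent of node i in O(1): just read the stored index. */
int pt_parent(const PTree *t, int i)
{
	assert(i >= 0 && i < t->n);
	return t->nodes[i].parent;
}

/* Children of node i in O(n): scan every node and keep those whose
   parent is i. Writes the indices to out and returns how many were found. */
int pt_children(const PTree *t, int i, int out[])
{
	assert(i >= 0 && i < t->n);
	int count = 0;
	for (int k = 0; k < t->n; k++)
		if (t->nodes[k].parent == i)
			out[count++] = k;
	return count;
}
```

The out array should have room for up to MAX_TREE_SIZE indices, since in the worst case every other node is a child of i.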

child representation

The specific method is: arrange the children of each node in a singly linked list. n nodes then have n child linked lists, and the list of a leaf node is empty. The n head pointers in turn form a linear list, which uses sequential storage and is kept in a one-dimensional array, as shown in the figure.

For this purpose, two node structures are designed:

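One is the child node of the child linked list.

Where:

  • child is the data field, which stores the index of a node in the header array.
  • next is the pointer field, which stores a pointer to the node's next child node.

The other is the header node of the header array.

Where:

  • data is the data field, which stores the data of a node.
  • firstchild is the head pointer field, which stores the head pointer of the node's child linked list.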

Below is the structure definition code for our child representation.

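/* Structure definition for the child representation of a tree */
#define MAX_TREE_SIZE 100
typedef int TElemType;	// data type of a tree node, int for now

/* Child node */
typedef struct CTNode{
	int child;	// index of this child in the header array
	struct CTNode *next;	// next child of the same parent
}*ChildPtr;

/* Header node */
typedef struct{
	TElemType data;	// node data
	ChildPtr firstchild;	// head pointer of the child linked list
}CTBox;

/* Tree structure */
typedef struct{
	CTBox nodes[MAX_TREE_SIZE];	// array of nodes
	int r, n;	// position of the root and number of nodes
}CTree;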

With such a structure, to find the children of a node we only need to walk that node's singly linked list of children, and the siblings of a child are the other entries of the same list. Traversing the entire tree is also very convenient: just loop over the array of header nodes.
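
As a rough sketch of how this structure might be used, the code below appends children and then reads them back. The structures repeat the child representation so that the example stands on its own; the helper names ct_add_child, ct_children and ct_free are made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_TREE_SIZE 100
typedef int TElemType;

typedef struct CTNode {
	int child;	/* index of this child in the header array */
	struct CTNode *next;	/* next child of the same parent */
} *ChildPtr;

typedef struct {
	TElemType data;
	ChildPtr firstchild;	/* head of the child linked list */
} CTBox;

typedef struct {
	CTBox nodes[MAX_TREE_SIZE];
	int r, n;	/* position of the root and number of nodes */
} CTree;

/* Appends node `child` to the end of `parent`'s child list.
   Returns 0 on success, -1 if memory allocation fails. */
int ct_add_child(CTree *t, int parent, int child)
{
	assert(parent >= 0 && parent < t->n);
	assert(child >= 0 && child < t->n);
	ChildPtr node = malloc(sizeof *node);
	if (node == NULL)
		return -1;
	node->child = child;
	node->next = NULL;
	ChildPtr *link = &t->nodes[parent].firstchild;
	while (*link != NULL)	/* walk to the end of the list */
		link = &(*link)->next;
	*link = node;
	return 0;
}

/* Copies the indices of node i's children into out, in list order,
   and returns how many there are. */
int ct_children(const CTree *t, int i, int out[])
{
	assert(i >= 0 && i < t->n);
	int count = 0;
	for (ChildPtr p = t->nodes[i].firstchild; p != NULL; p = p->next)
		out[count++] = p->child;
	return count;
}

/* Releases every child list. */
void ct_free(CTree *t)
{
	for (int i = 0; i < t->n; i++) {
		ChildPtr p = t->nodes[i].firstchild;
		while (p != NULL) {
			ChildPtr next = p->next;
			free(p);
			p = next;
		}
		t->nodes[i].firstchild = NULL;
	}
}
```

Appending at the end keeps the children in insertion order, which matters if the tree is treated as an ordered tree; inserting at the head would be faster but would reverse that order.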

However, there is also a problem: how do we know who the parent of a given node is? That is more troublesome, because it requires traversing the entire tree. Can the parent representation and the child representation be combined? Of course they can; readers can try combining them themselves, so we will not go into details here.
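
One possible way to combine the two representations is sketched below: the header node simply gains a parent field next to its child list. This is only one conceivable design, not one taken from the original text, and the names PCTBox, PCTree, pct_add_child, pct_parent and pct_free are made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_TREE_SIZE 100
typedef int TElemType;

typedef struct CTNode {
	int child;	/* index of this child in the header array */
	struct CTNode *next;	/* next child of the same parent */
} *ChildPtr;

/* Header node with both a parent index and a child list. */
typedef struct {
	TElemType data;
	int parent;	/* index of the parent, -1 for the root */
	ChildPtr firstchild;	/* head of the child linked list */
} PCTBox;

typedef struct {
	PCTBox nodes[MAX_TREE_SIZE];
	int r, n;	/* position of the root and number of nodes */
} PCTree;

/* Appends `child` to `parent`'s child list and records the parent index.
   Returns 0 on success, -1 if memory allocation fails. */
int pct_add_child(PCTree *t, int parent, int child)
{
	assert(parent >= 0 && parent < t->n);
	assert(child >= 0 && child < t->n);
	ChildPtr node = malloc(sizeof *node);
	if (node == NULL)
		return -1;
	node->child = child;
	node->next = NULL;
	ChildPtr *link = &t->nodes[parent].firstchild;
	while (*link != NULL)	/* walk to the end of the list */
		link = &(*link)->next;
	*link = node;
	t->nodes[child].parent = parent;	/* keep both views in sync */
	return 0;
}

/* Parent of node i in O(1). */
int pct_parent(const PCTree *t, int i)
{
	assert(i >= 0 && i < t->n);
	return t->nodes[i].parent;
}

/* Releases every child list. */
void pct_free(PCTree *t)
{
	for (int i = 0; i < t->n; i++) {
		ChildPtr p = t->nodes[i].firstchild;
		while (p != NULL) {
			ChildPtr next = p->next;
			free(p);
			p = next;
		}
		t->nodes[i].firstchild = NULL;
	}
}
```

The price of the O(1) parent lookup is one extra integer per node and the need to update both the parent field and the child list whenever the tree changes.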

child brother representation

Just now we studied the storage structure of a tree from the perspective of parents and from the perspective of children. What if we look at it from the perspective of the siblings of the tree nodes?

Of course, for a hierarchical structure like a tree, studying only the siblings of nodes is not enough. On closer observation, we find that for any node of any tree, its first child, if it exists, is unique, and its right sibling, if it exists, is also unique. Therefore, we set up two pointers, one pointing to the first child of the node and one to the right sibling of the node.

The structure of the node is as follows:

Where:

  • data is the data field, which stores the data of the node.
  • firstchild is a pointer field that stores the address of the node's first child.
  • rightsib is a pointer field that stores the address of the node's right sibling.

This representation brings convenience to finding a child of a node.

The structure definition code is as follows.

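/* Structure definition for the child-sibling representation of a tree */
typedef int TElemType;	// data type of a tree node, int for now

typedef struct CSNode{
	TElemType data;	// node data
	struct CSNode *firstchild, *rightsib;	// first child and right sibling
} CSNode, *CSTree;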
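
To show how children are reached through the two pointers, here is a minimal sketch: the first child is found through firstchild, and each further child through the rightsib of the previous one. The structure repeats the definition above so that the example stands on its own; the helper names cs_new, cs_add_child, cs_nth_child, cs_degree and cs_free are made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef int TElemType;

typedef struct CSNode {
	TElemType data;
	struct CSNode *firstchild, *rightsib;
} CSNode, *CSTree;

/* Allocates a node with no child and no sibling; returns NULL on failure. */
CSNode *cs_new(TElemType data)
{
	CSNode *node = malloc(sizeof *node);
	if (node != NULL) {
		node->data = data;
		node->firstchild = NULL;
		node->rightsib = NULL;
	}
	return node;
}

/* Appends `child` as the last (rightmost) child of `parent`. */
void cs_add_child(CSNode *parent, CSNode *child)
{
	assert(parent != NULL && child != NULL);
	assert(child->rightsib == NULL);
	if (parent->firstchild == NULL) {
		parent->firstchild = child;
		return;
	}
	CSNode *last = parent->firstchild;
	while (last->rightsib != NULL)	/* follow the sibling chain */
		last = last->rightsib;
	last->rightsib = child;
}

/* Returns the k-th child (0-based) of `node`, or NULL if there is none. */
CSNode *cs_nth_child(const CSNode *node, int k)
{
	assert(node != NULL && k >= 0);
	CSNode *c = node->firstchild;
	while (c != NULL && k-- > 0)
		c = c->rightsib;
	return c;
}

/* Degree of `node`: the length of its chain of children. */
int cs_degree(const CSNode *node)
{
	assert(node != NULL);
	int degree = 0;
	for (const CSNode *c = node->firstchild; c != NULL; c = c->rightsib)
		degree++;
	return degree;
}

/* Frees `node` together with its descendants and right siblings. */
void cs_free(CSNode *node)
{
	if (node == NULL)
		return;
	cs_free(node->firstchild);
	cs_free(node->rightsib);
	free(node);
}
```

Note that this structure stores no parent pointer, so finding the parent of a node still requires a search from the root.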

So through this structure, we turned the original tree into something like this:

Isn't this just a binary tree?

Yes, in fact, the biggest advantage of this representation is that it turns a complex tree into a binary tree.
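
The binary-tree view can be illustrated with a short sketch: if firstchild is read as the left pointer and rightsib as the right pointer, properties of the original tree can be computed with ordinary two-branch recursion. The structure repeats the child-sibling definition so that the example stands on its own; the function names cs_count and cs_height are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int TElemType;

typedef struct CSNode {
	TElemType data;
	struct CSNode *firstchild, *rightsib;
} CSNode, *CSTree;

/* Number of nodes reachable from `node`: the node itself, everything below
   it (left branch), and everything in its right siblings (right branch). */
int cs_count(const CSNode *node)
{
	if (node == NULL)
		return 0;
	return 1 + cs_count(node->firstchild) + cs_count(node->rightsib);
}

/* Height of the original tree rooted at `node` (0 for an empty tree).
   Following firstchild goes one level down, so it adds 1; following
   rightsib stays on the same level, so it adds nothing. */
int cs_height(const CSNode *node)
{
	if (node == NULL)
		return 0;
	int below = 1 + cs_height(node->firstchild);
	int beside = cs_height(node->rightsib);
	return below > beside ? below : beside;
}
```

The asymmetry in cs_height is the point of the example: the storage is a binary tree, but the two pointers keep their original meaning of "one level down" and "same level".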


Origin blog.csdn.net/BradenHan/article/details/135258400