[Data structure] Analysis of common data structures


Foreword

Front-end development moves quickly: new technologies, tools, and frameworks appear all the time. As front-end developers, we have to keep learning new skills to stay competitive. Data structures are a core area of computer science, and studying them strengthens the fundamentals that every other skill builds on.

1. What is a data structure

We can look at the definition from two angles:

"A data structure is a data object, together with the relationships that exist among the instances of that object and among the data elements that make up an instance. These relationships can be given by defining related functions."
— "Data Structures, Algorithms and Applications"

In plain terms:
A data structure is the way a set of data is organized so that it can be manipulated and retrieved more efficiently.

2. Common data structures

1. Array

An array is a linear data structure that stores a sequence of elements, each accessed by its index. Almost all programming languages support arrays natively. In many languages an array holds values of a single data type, but a JavaScript array can hold values of different types.
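
As a small illustration of the last point, the following sketch stores values of several types in one JavaScript array. The variable names are made up for this example.

```javascript
// A JavaScript array may hold values of different types.
const mixed = [1, "two", true, null, { id: 5 }];

// typeof reports the runtime type of each element
// (note that typeof null is "object" in JavaScript).
const types = mixed.map((value) => typeof value);
console.log(types); // [ 'number', 'string', 'boolean', 'object', 'object' ]

// Elements are accessed by zero-based index.
console.log(mixed[1]); // two
console.log(mixed.length); // 5
```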

2. Stack

A stack is a last-in-first-out (LIFO) data structure. It is commonly used to implement function calls, expression evaluation, and other operations that must handle the most recent item first.

Features of the stack:

  1. Insertion and deletion are only allowed at the top of the stack.
  2. The most recently added element is at the top of the stack (top), and the first element to enter the stack is at the bottom (bottom).
  3. After every insertion or deletion, the top refers to the most recently added element that is still in the stack.
  4. The stack is first-in-last-out (FILO), which is the same as last-in-first-out (LIFO): the element that entered last is accessed and removed first.

Stacks can be implemented on top of other data structures, such as arrays (static or dynamic) or linked lists. Which implementation to choose depends on the application scenario and performance requirements.

Some common operations on stacks include:

  1. Push (push): add an element to the top of the stack.
  2. Pop (pop): remove the element at the top of the stack and return it.
  3. Peek (peek): return the top element of the stack without removing it.
  4. Is empty (isEmpty): check whether the stack is empty.
  5. Size (size): return the number of elements in the stack.
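
The operations above can be sketched as a small class. This is a minimal illustration that assumes a JavaScript array as the underlying storage, with the end of the array acting as the top of the stack; the class name Stack is chosen for this example.

```javascript
class Stack {
  constructor() {
    this.items = []; // the end of the array is the top of the stack
  }

  // Add an element to the top of the stack.
  push(element) {
    this.items.push(element);
  }

  // Remove and return the top element (undefined if the stack is empty).
  pop() {
    return this.items.pop();
  }

  // Return the top element without removing it.
  peek() {
    return this.items[this.items.length - 1];
  }

  isEmpty() {
    return this.items.length === 0;
  }

  size() {
    return this.items.length;
  }
}

const stack = new Stack();
stack.push(1);
stack.push(2);
stack.push(3);
console.log(stack.peek()); // 3
console.log(stack.pop()); // 3
console.log(stack.size()); // 2
```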

3. Queue

A queue is a first-in-first-out (FIFO) data structure. It is commonly used to implement task scheduling, work queues, and other operations that must handle items in arrival order.

Some features of queues:

  1. Insertion (enqueue) happens at one end of the queue, the tail, and deletion (dequeue) happens at the other end, the head.
  2. The earliest element still in the queue is at the head, and the most recently added element is at the tail.
  3. After every insertion or deletion, the head refers to the element that has been waiting the longest.
  4. The queue is first-in-first-out (FIFO): the element that entered the queue first is accessed and removed first.

Some common operations on queues include:

  1. Enqueue (enqueue): add an element to the tail of the queue.
  2. Dequeue (dequeue): remove the element at the head of the queue and return it.
  3. Front (front): return the element at the head of the queue without removing it.
  4. Is empty (isEmpty): check whether the queue is empty.
  5. Size (size): return the number of elements in the queue.
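
The operations above can be sketched in the same way. This minimal version assumes a JavaScript array as storage, with index 0 as the head; the class name Queue is chosen for this example. Note that Array.prototype.shift has to move the remaining elements, so dequeue costs O(n) here, which is acceptable for a sketch but not for very large queues.

```javascript
class Queue {
  constructor() {
    this.items = []; // index 0 is the head, the end of the array is the tail
  }

  // Add an element to the tail of the queue.
  enqueue(element) {
    this.items.push(element);
  }

  // Remove and return the head element (undefined if the queue is empty).
  dequeue() {
    return this.items.shift();
  }

  // Return the head element without removing it.
  front() {
    return this.items[0];
  }

  isEmpty() {
    return this.items.length === 0;
  }

  size() {
    return this.items.length;
  }
}

const queue = new Queue();
queue.enqueue("first");
queue.enqueue("second");
queue.enqueue("third");
console.log(queue.front()); // first
console.log(queue.dequeue()); // first
console.log(queue.size()); // 2
```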

4. Priority queue

A priority queue (PriorityQueue) is a special queue in which every element carries a priority. Unlike an ordinary queue, it is not strictly first-in-first-out: each removal returns the element with the highest priority, and arrival order usually matters only among elements of equal priority. It is often implemented with a heap, which is why the two terms are sometimes used interchangeably. This makes priority queues well suited to task scheduling problems with priority requirements.

Scenarios similar to priority queues in daily life:
● People who join the line first are served first (buying tickets, checkout).
● Within the line, people with an emergency (special circumstances) can be served first.

A priority queue has the following properties:

  1. Ties: elements with the same priority are typically removed in the order they entered the queue (FIFO), although not every implementation guarantees this.
  2. Priority: each element has a priority, and the element with the highest priority is accessed first.
  3. Insert operation: when an element is inserted, its position in the queue is determined by its priority.
  4. Delete operation: when an element is deleted, the element with the highest priority is found and removed.
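
The properties above can be sketched with a sorted array. This is one simple approach, not the heap-based one: insertion is O(n), and the example assumes that a smaller number means a higher priority. The class name and the sample data are made up for illustration.

```javascript
class PriorityQueue {
  constructor() {
    this.items = []; // kept sorted, highest priority (smallest number) first
  }

  // Insert the element at the position determined by its priority.
  enqueue(element, priority) {
    // Find the first entry with a strictly lower priority (larger number).
    // Equal priorities are skipped, so ties keep their arrival order.
    let index = this.items.findIndex((item) => item.priority > priority);
    if (index === -1) index = this.items.length;
    this.items.splice(index, 0, { element, priority });
  }

  // Remove and return the element with the highest priority.
  dequeue() {
    const entry = this.items.shift();
    return entry === undefined ? undefined : entry.element;
  }

  isEmpty() {
    return this.items.length === 0;
  }

  size() {
    return this.items.length;
  }
}

const waitingRoom = new PriorityQueue();
waitingRoom.enqueue("routine checkup", 3);
waitingRoom.enqueue("emergency", 1);
waitingRoom.enqueue("follow-up", 3);
console.log(waitingRoom.dequeue()); // emergency
console.log(waitingRoom.dequeue()); // routine checkup
console.log(waitingRoom.dequeue()); // follow-up
```

A binary heap would bring insertion and removal down to O(log n), at the cost of a more involved implementation and no built-in tie ordering.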

In programming, priority queues are often used for task scheduling, network traffic control, and similar problems. Understanding their basic operations and application scenarios is very important for writing efficient programs.

5. Linked List

A linked list (Linked List) is a dynamic data structure consisting of nodes (Node). Each node contains data (value) and a pointer to the next node (next). Because nodes can be added or removed wherever needed without moving the other elements, linked lists are very useful in practical programming.

Linked lists are divided into singly linked lists and doubly linked lists.

5.1 Singly linked list

A singly linked list (Singly LinkedList) is a simple form of linked list where each node contains data (value) and a pointer to the next node (next). A singly linked list can only be traversed in one direction, that is, it can only be accessed from the head of the linked list to the tail of the linked list.

Some characteristics of a singly linked list:

  1. The nodes in the linked list are connected by pointers, and each node contains data and a pointer to the next node.
  2. The next pointer of the last node is empty (null), indicating the end of the linked list.
  3. Nodes have no predecessor pointer (prev). The list itself keeps a reference to the first node, the head, which marks the beginning of the linked list.


When inserting into or deleting from a singly linked list, the successor pointer (next) of the preceding node has to be modified. For example, to insert an element in the middle of the linked list, you first find the node just before the insertion position, point the new node's next at that node's successor, and then point that node's next at the new node.
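
The following sketch shows insertion and deletion by position in a singly linked list. The class and method names (ListNode, SinglyLinkedList, insertAt, removeAt, toArray) are chosen for this example and are not taken from any library.

```javascript
class ListNode {
  constructor(value) {
    this.value = value;
    this.next = null; // null marks the end of the list
  }
}

class SinglyLinkedList {
  constructor() {
    this.head = null;
    this.length = 0;
  }

  // Insert value so that it ends up at the given index.
  insertAt(index, value) {
    if (index < 0 || index > this.length) return false;
    const node = new ListNode(value);
    if (index === 0) {
      node.next = this.head;
      this.head = node;
    } else {
      // Walk to the node just before the insertion position.
      let previous = this.head;
      for (let i = 0; i < index - 1; i++) previous = previous.next;
      node.next = previous.next;
      previous.next = node;
    }
    this.length++;
    return true;
  }

  // Remove the node at the given index and return its value.
  removeAt(index) {
    if (index < 0 || index >= this.length) return undefined;
    let removed;
    if (index === 0) {
      removed = this.head;
      this.head = removed.next;
    } else {
      let previous = this.head;
      for (let i = 0; i < index - 1; i++) previous = previous.next;
      removed = previous.next;
      previous.next = removed.next; // bypass the removed node
    }
    this.length--;
    return removed.value;
  }

  // Collect the values from head to tail.
  toArray() {
    const values = [];
    for (let node = this.head; node !== null; node = node.next) {
      values.push(node.value);
    }
    return values;
  }
}

const letters = new SinglyLinkedList();
letters.insertAt(0, "a");
letters.insertAt(1, "c");
letters.insertAt(1, "b"); // insert in the middle
console.log(letters.toArray()); // [ 'a', 'b', 'c' ]
console.log(letters.removeAt(1)); // b
console.log(letters.toArray()); // [ 'a', 'c' ]
```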

5.2 Doubly linked list

A doubly linked list (Doubly LinkedList) is a type of linked list in which each node contains data (value), a pointer to the predecessor node (prev), and a pointer to the successor node (next). Unlike a singly linked list, a doubly linked list can be traversed in both directions, from head to tail and from tail to head.


Some characteristics of doubly linked list:

  1. The nodes in the linked list are connected by pointers, and each node contains data, a pointer to the predecessor node (prev) and a pointer to the successor node (next).
  2. The next pointer of the last node is empty (null), indicating the end of the linked list.
  3. The prev pointer of the first node is empty (null), indicating the beginning of the linked list.
  4. Because each node has two pointers, a node can be inserted next to or removed from a known position without first searching for its predecessor, at the cost of extra memory per node.

When inserting or deleting in a doubly linked list, only the prev and next pointers of the affected node and its neighbors need to be modified. There is no need to walk the list from the head to find the predecessor node, as a singly linked list requires.
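
As a rough sketch of this idea, the list below removes a node in constant time when it is given a reference to that node. The names DoublyNode, DoublyLinkedList, append, remove, toArray, and toArrayReversed are chosen for this example.

```javascript
class DoublyNode {
  constructor(value) {
    this.value = value;
    this.prev = null;
    this.next = null;
  }
}

class DoublyLinkedList {
  constructor() {
    this.head = null;
    this.tail = null;
  }

  // Add a value at the tail and return the new node.
  append(value) {
    const node = new DoublyNode(value);
    if (this.tail === null) {
      this.head = node;
      this.tail = node;
    } else {
      node.prev = this.tail;
      this.tail.next = node;
      this.tail = node;
    }
    return node;
  }

  // Unlink a node that belongs to this list; no search is needed.
  remove(node) {
    if (node.prev !== null) node.prev.next = node.next;
    else this.head = node.next; // node was the head
    if (node.next !== null) node.next.prev = node.prev;
    else this.tail = node.prev; // node was the tail
    node.prev = null;
    node.next = null;
  }

  // Collect values from head to tail.
  toArray() {
    const values = [];
    for (let n = this.head; n !== null; n = n.next) values.push(n.value);
    return values;
  }

  // Collect values from tail to head, using the prev pointers.
  toArrayReversed() {
    const values = [];
    for (let n = this.tail; n !== null; n = n.prev) values.push(n.value);
    return values;
  }
}

const chain = new DoublyLinkedList();
chain.append("a");
const middle = chain.append("b");
chain.append("c");
chain.remove(middle);
console.log(chain.toArray()); // [ 'a', 'c' ]
console.log(chain.toArrayReversed()); // [ 'c', 'a' ]
```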

6. Hash table (Hash)

A hash table (HashTable) is a data structure based on key-value storage that supports fast insertion, deletion, lookup, and update. The core idea is to map each key to a hash address (an index into an underlying array) through a hash function, and then access the data at that address. Hash tables are often used to implement dictionaries, associative arrays, and maps.

Hash tables are usually implemented on top of arrays, and they offer several advantages:
● Hash tables provide very fast insert, delete, and find operations.
● With a good hash function and a table that is not too full, inserting and deleting take close to constant time, O(1), on average, no matter how much data there is.
● Lookups are on average faster than in a balanced tree, which needs O(log n) time.
● Hash tables are generally simpler to code than trees.

The hash table also has disadvantages:
● The data in a hash table is unordered, so the elements cannot be traversed in a fixed order (such as from smallest to largest).
● Normally, keys in a hash table must be unique: the same key cannot be used to store two different elements.

6.1 Some concepts of hash table

  1. Hashing: the process of converting a large number into an index within the range of the array.

  2. Hash function: we usually convert a word into a large number and then hash that number. The function that contains this code is called a hash function.

  3. Hash table: the array into which the data is finally inserted, together with the logic that wraps it, is called a hash table.
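
To make these concepts concrete, here is a minimal hash function sketch. It assumes string keys, uses 31 as the multiplier (a common but arbitrary choice), and applies the modulo at every step so the intermediate number never grows large. The function name hashString is made up for this example.

```javascript
// Convert a string key into an index in the range [0, size - 1].
function hashString(key, size) {
  let code = 0;
  for (let i = 0; i < key.length; i++) {
    // Combine the running value with the next character code,
    // then reduce it so it stays within the array range.
    code = (code * 31 + key.charCodeAt(i)) % size;
  }
  return code;
}

console.log(hashString("cat", 7)); // 3
console.log(hashString("a", 7)); // 6
console.log(hashString("h", 7)); // 6, the same address as "a": a collision
```

The last two calls show that different keys can end up at the same address, which is the situation the next section deals with.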

6.2 Address conflicts

In a hash table, a collision (address conflict) occurs when multiple keys map to the same hash address. A conflict resolution strategy organizes the colliding key-value pairs so that they can still be accessed and updated quickly. There are two common strategies: the chain address method (separate chaining, also called the zipper method) and the open address method (open addressing).

1. Chain address method (zipper method)

The chain address method is a conflict resolution strategy based on linked lists. Each hash address is treated as a bucket, and each bucket holds a linked list. When multiple keys map to the same hash address, their key-value pairs are added to that bucket's list. The advantage is that the table never runs out of slots and degrades gracefully as collisions increase, so it suits scenarios with a high probability of collision. The disadvantages are the extra memory for the list nodes and the worst case: if many keys land in the same bucket, search, insertion, and deletion in that bucket take O(n) time, although the average stays close to O(1) when the keys are spread evenly.
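
A minimal sketch of the chain address method follows. To keep it short, each bucket is a plain JavaScript array of [key, value] pairs standing in for the linked list described above. The class name ChainedHashTable, the default size of 7, and the string-only keys are assumptions of this example.

```javascript
class ChainedHashTable {
  constructor(size = 7) {
    this.size = size;
    // One bucket per hash address; an array stands in for the linked list.
    this.buckets = Array.from({ length: size }, () => []);
  }

  // Map a string key to a bucket index.
  hash(key) {
    let code = 0;
    for (let i = 0; i < key.length; i++) {
      code = (code * 31 + key.charCodeAt(i)) % this.size;
    }
    return code;
  }

  // Insert a new pair, or update the value if the key already exists.
  set(key, value) {
    const bucket = this.buckets[this.hash(key)];
    const pair = bucket.find((entry) => entry[0] === key);
    if (pair) pair[1] = value;
    else bucket.push([key, value]);
  }

  get(key) {
    const pair = this.buckets[this.hash(key)].find((entry) => entry[0] === key);
    return pair ? pair[1] : undefined;
  }

  // Remove the pair with this key; returns true if something was removed.
  delete(key) {
    const bucket = this.buckets[this.hash(key)];
    const index = bucket.findIndex((entry) => entry[0] === key);
    if (index === -1) return false;
    bucket.splice(index, 1);
    return true;
  }
}

const chained = new ChainedHashTable(7);
chained.set("a", 1); // "a" and "h" both hash to address 6
chained.set("h", 2);
console.log(chained.buckets[6].length); // 2
console.log(chained.get("a")); // 1
console.log(chained.get("h")); // 2
```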

2. Open address method

Open addressing stores every key-value pair in the array itself. When a key maps to a hash address that is already occupied, the collision is resolved by probing for another empty address, using techniques such as linear probing, quadratic probing, or double hashing. The advantage is that no extra list structure is needed and operations take O(1) time on average while the table is sparsely filled. The disadvantage is that performance drops sharply as the table fills up, so part of the array must be kept empty, which lowers space utilization.
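
The sketch below illustrates open addressing with linear probing only: on a collision it checks the next address, wrapping around at the end of the array. It assumes string keys and a fixed capacity, and it leaves out deletion and resizing, which need extra care in a real implementation. The class name LinearProbingHashTable is made up for this example.

```javascript
class LinearProbingHashTable {
  constructor(size = 7) {
    this.size = size;
    this.keys = new Array(size).fill(undefined);
    this.values = new Array(size).fill(undefined);
  }

  // Map a string key to its preferred address.
  hash(key) {
    let code = 0;
    for (let i = 0; i < key.length; i++) {
      code = (code * 31 + key.charCodeAt(i)) % this.size;
    }
    return code;
  }

  // Return the slot that holds the key, or the first empty slot on its
  // probe path. Returns -1 if the table is full and the key is absent.
  findSlot(key) {
    let index = this.hash(key);
    for (let step = 0; step < this.size; step++) {
      if (this.keys[index] === undefined || this.keys[index] === key) {
        return index;
      }
      index = (index + 1) % this.size; // linear probing: try the next address
    }
    return -1;
  }

  set(key, value) {
    const index = this.findSlot(key);
    if (index === -1) throw new Error("hash table is full");
    this.keys[index] = key;
    this.values[index] = value;
  }

  get(key) {
    const index = this.findSlot(key);
    if (index === -1 || this.keys[index] !== key) return undefined;
    return this.values[index];
  }
}

const probing = new LinearProbingHashTable(7);
probing.set("a", 1); // stored at address 6
probing.set("h", 2); // also hashes to 6, so it wraps around to address 0
console.log(probing.keys.indexOf("a")); // 6
console.log(probing.keys.indexOf("h")); // 0
console.log(probing.get("h")); // 2
```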

7. Tree

A tree is a nonlinear data structure consisting of nodes and edges. Trees are widely used in computer science, for example to represent organizational structures and file systems, and as the basis of binary search trees.

Key features of the tree include:

  1. Each node can have zero or more child nodes. The one node without a parent is called the root node (root).
  2. The tree structure is hierarchical: there is exactly one path from the root node to any other node.
  3. Each edge connects a parent node to one of its children, and a tree contains no cycles.

Basic terminology, based on the number of child nodes:

  1. The degree of a node is its number of child nodes. Nodes without children are called leaf nodes.
  2. A node with only one child node is called a degree-1 node.
  3. A node with two children is called a degree-2 node.
  4. The degree of a tree is the largest degree of any of its nodes.
  5. The depth of the tree is the number of edges on the longest path from the root node to a leaf node.
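
These terms can be checked on a small example. The sketch below assumes a general tree in which each node keeps an array of children; the names TreeNode, treeDegree, and treeDepth are made up for illustration.

```javascript
class TreeNode {
  constructor(value) {
    this.value = value;
    this.children = [];
  }

  // Create a child node, attach it, and return it.
  addChild(value) {
    const child = new TreeNode(value);
    this.children.push(child);
    return child;
  }

  // A leaf is a node without children (degree 0).
  isLeaf() {
    return this.children.length === 0;
  }
}

// Degree of the tree: the largest number of children of any node.
function treeDegree(node) {
  const childDegrees = node.children.map((child) => treeDegree(child));
  return Math.max(node.children.length, ...childDegrees);
}

// Depth of the tree: edges on the longest path from this node to a leaf.
function treeDepth(node) {
  if (node.isLeaf()) return 0;
  const childDepths = node.children.map((child) => treeDepth(child));
  return 1 + Math.max(...childDepths);
}

//        root
//      /  |  \
//     a   b   c
//     |
//     d
const root = new TreeNode("root");
const a = root.addChild("a");
root.addChild("b");
root.addChild("c");
a.addChild("d");

console.log(treeDegree(root)); // 3
console.log(treeDepth(root)); // 2
console.log(a.isLeaf()); // false
```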

8. Graph

Graph is a non-linear data structure composed of vertices and edges, which is used to represent various relationships in the real world.

Features of a graph:

  1. A set of vertices: V (Vertex) is usually used to represent a collection of vertices
  2. A set of edges: E (Edge) is usually used to represent the set of edges
  3. An edge is a connection between two vertices
  4. Edges can be directed or undirected. (For example, A — B, usually means undirected. A --> B, usually means directed)

Graphs are widely used, including in network analysis, traffic planning, social networks, and logistics networks. In programming, graphs are usually used to represent and process complex relationships and structures in order to solve practical problems. Common representations of graphs include adjacency lists and adjacency matrices.
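
As an illustration of the adjacency list representation, the sketch below maps each vertex to the list of vertices it connects to. The class name Graph and its method names are assumptions of this example, and a Map is used for storage.

```javascript
class Graph {
  constructor(directed = false) {
    this.directed = directed;
    this.adjacency = new Map(); // vertex -> array of neighboring vertices
  }

  addVertex(vertex) {
    if (!this.adjacency.has(vertex)) this.adjacency.set(vertex, []);
  }

  // Connect two vertices, creating them first if necessary.
  addEdge(from, to) {
    this.addVertex(from);
    this.addVertex(to);
    this.adjacency.get(from).push(to);
    // An undirected edge can be followed in both directions.
    if (!this.directed) this.adjacency.get(to).push(from);
  }

  neighbors(vertex) {
    return this.adjacency.get(vertex) || [];
  }
}

const undirected = new Graph();
undirected.addEdge("A", "B");
undirected.addEdge("A", "C");
console.log(undirected.neighbors("A")); // [ 'B', 'C' ]
console.log(undirected.neighbors("B")); // [ 'A' ]

const directed = new Graph(true);
directed.addEdge("A", "B");
console.log(directed.neighbors("A")); // [ 'B' ]
console.log(directed.neighbors("B")); // []
```

An adjacency list uses memory proportional to the number of vertices plus edges, which suits sparse graphs; an adjacency matrix answers "is there an edge between these two vertices?" in constant time but needs space proportional to the square of the number of vertices.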

Summary

In this article we introduced a variety of data structures, but only briefly. In later articles, we will explain each data structure separately and implement it in JavaScript. If you are looking forward to the follow-up articles, please follow me!

Data structures are a cornerstone of computer science, and they provide us with an efficient way to organize and manipulate data. By learning different types of data structures, we can solve practical problems more efficiently and improve the performance of computer systems. I hope this article can stimulate your interest in data structures and guide you to continuously explore and progress in the process of learning and practice.


Origin blog.csdn.net/m0_63831957/article/details/130767686