Data Structure Series Learning (1) - An Introduction to Data Structure

Table of contents

Introduction:

Study:

Concept:

What is data?

What are data elements?

What is a data item?

What are data objects?

What is a data structure?

What is a logical structure?

What is the physical structure (storage structure)?

Logical structure classification:

Set:

Linear structure (linear table):

Linear table:

Features of linear tables:

Linear tables are divided into two types according to the storage method of data:

Tree structure:

Graph structure (network structure):

Time Complexity:

Constant order O(1):

Linear order O(n):

Square order O(n^2):

Logarithmic order O(logn):

Summary:

References:


Introduction:

In my earlier study of the C language, I worked systematically through sequential structure design, selection structure design, loop structure design, arrays, pointers, and structures. With that, my study of C is essentially complete.

C language learning catalog and classic problem analysis:

C language - four methods to solve the Yang Hui triangle problem

C language - detailed explanation of multi-layer for loop

C language - August 5 - Structures and Variables

C language - July 31 - Summary of pointers and typedef keywords

C language - August 1 - Recursion and Dynamic Memory Management

C language - July 21 - In-depth pointers

C language - July 19 - Learning of Pointers

C language - July 18 - Learning of two-dimensional arrays

Last night I took my first class in a systematic course on data structures. Data structures is a course that plays a pivotal role in computer science. The articles I wrote before covered scattered knowledge points from data structures, such as various sorting algorithms, but I have never worked step by step through structures such as sets, queues, trees, and graphs. Starting with this article, I will summarize what I learn about data structures in each article of this series, so that I can quickly recall my thinking when I review these topics in the future.

Study:

To learn data structures, we first need to know the meaning of each term, so that we can describe the components of a program to others in precise, professional language.

Concept:

What is data?

Data is the set of symbols that describe objective things: symbols that a computer can recognize, accept as input, and process. Data includes not only numeric types such as integers and real numbers, but also non-numeric types such as characters, sounds, images, and videos.

What are data elements?

A data element is a basic unit that makes up data and has a certain meaning. It is usually treated as a whole in a computer, and is also called a record.

What is a data item?

A data element can consist of several data items. A data item is the smallest indivisible unit of data.

What are data objects?

A data object is a collection of data elements of the same nature, which is a subset of data.

What is a data structure?

A data structure is a collection of data elements that have one or more specific relationships with each other.

The data structure is divided into logical structure and physical structure:

What is a logical structure?

The logical structure is the abstract relationship between data elements. It has nothing to do with their physical addresses.

What is the physical structure (storage structure)?

The physical structure is how data elements and the relationships between them are actually stored in memory.

Let's look at the classification of logical structures in detail:

Logical structure classification:

According to Yan Weimin's "Data Structure (C Language Edition)", no data element exists in isolation; there is always some relationship between elements. This relationship between data elements is called structure. Depending on the characteristics of the relationship, there are usually the following four structures:

Set:

There is no relationship between data elements in the structure other than the "belongs to the same set" relationship.

The sets we studied in the first compulsory high school mathematics course are a typical example. For example, take a set C whose elements are {1,2,3,4,5,6}.

Set structure diagram:

Linear structure (linear table):

There is a one-to-one relationship between the data elements in the structure.

Stacks, queues, linear tables, and one-dimensional arrays are typical examples. For example, take an array ar with 5 elements: int ar[5] = {1,2,3,4,5}; The five integer elements of ar are in a one-to-one relationship: for element 3, the previous element is 2 and the next element is 4.

Linear table:

A linear table has exactly one start node and exactly one end node. Every node except the start node has exactly one direct predecessor, and every node except the end node has exactly one direct successor. It can be written as {a1, a2, a3, a4, ..., a(n-1), a(n)}.

Features of linear tables:

1. A unique head

2. A unique tail

3. Except for the head and tail nodes, every node has exactly one predecessor and one successor.

4. One-to-one logical relationship: linear table, stack, queue, string, one-dimensional array.

Linear tables are divided into two types according to the storage method of data:

Sequence table: logically adjacent data elements are stored in physically adjacent storage units.

Linked list: data nodes are logically adjacent, but not necessarily physically adjacent.

Linear structure diagram:

Tree structure:

There is a one-to-many relationship between the data elements in the structure.

A family tree is a typical real-life example of a tree structure: parents have two children, and each child goes on to have a family of their own. A mind map is another good example.

Tree structure diagram:

Graph structure (network structure):

There is a many-to-many relationship between the data elements in the structure.

A road map is a good example: when we drive somewhere, we often consult a map, and the highways across the country form a dense, crisscrossing network.

Graph structure diagram:

Time Complexity:

An algorithm's efficiency is analyzed in two respects: time efficiency and space efficiency.

Time complexity: the relationship between the number of basic operations an algorithm performs and the problem size n, generally written as T(n) = O(f(n)).

When analyzing an algorithm, the total number of statement executions T(n) is a function of the problem size n. We analyze how T(n) changes with n and determine its order of magnitude. The time complexity of the algorithm, that is, its time measure, is written as T(n) = O(f(n)), where f(n) is a function of the problem size n. This means that as n increases, the execution time of the algorithm grows at the same rate as f(n). This is called the asymptotic time complexity of the algorithm, or time complexity for short.

In general, as n increases, the algorithm with the slowest growth of T(n) is the optimal algorithm.

Deriving the big O order:

Step 1: Replace all additive constants in the run-count function with the constant 1;

Step 2: In the modified run-count function, keep only the highest-order term;

Step 3: If the highest-order term exists and is not 1, remove the constant that multiplies it.

After these three steps, the big O order is obtained.
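As a worked illustration of the three steps, take a run-count function invented purely for this example, f(n) = 3n^2 + 2n + 5:

```latex
\begin{align*}
f(n) &= 3n^2 + 2n + 5 \\
\text{Step 1 (additive constant becomes 1):}\quad & 3n^2 + 2n + 1 \\
\text{Step 2 (keep the highest-order term):}\quad & 3n^2 \\
\text{Step 3 (drop the constant factor):}\quad & n^2 \\
T(n) &= O(n^2)
\end{align*}
```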

Constant order O(1):

#include <stdio.h>
int main()
{
    int a = 10, b = 20;  // executed once
    int sum = a + b;     // executed once
    printf("%d\n", sum); // executed once
    return 0;            // normal exit
}

Obviously, this is a sequential structure with four statements in total. Since return 0 is just the normal exit of the program, the run-count function is f(n) = 3. Following the method for deriving the big O order, we replace the constant 3 with 1 and keep the highest-order term. This function has no term in n at all, so the final time complexity of the program is: O(1)

At the same time, we also call O(1) a constant order.

Linear order O(n):

#include <stdio.h>
int main()
{
    int arr[] = {1, 2, 3, 4, 5};
    int n = sizeof(arr) / sizeof(arr[0]); // problem size
    int temp = 0;                         // accumulator
    for (int i = 0; i < n; i++) {
        temp += arr[i];                   // executed n times
    }
    printf("%d\n", temp);
    return 0;
}

In this program, I define an accumulator temp and the loop condition i < n. The loop starts from 0 and its body executes n times.

So the run-count function is f(n) = n. Following the method for deriving the big O order, the first step replaces additive constants with 1, but this function has no constant term, so we go straight to the second step. The highest-order term is n, and its coefficient is already 1, so nothing more needs to be removed. The final big O order is: T(n) = O(n)

At the same time, we also call O(n) linear order.

Square order O(n^2):

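#include <stdio.h>
int main()
{
    int arr[] = {1, 2, 3, 4, 5};
    int n = sizeof(arr) / sizeof(arr[0]); // problem size
    int temp = 0;                         // accumulator
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            temp += arr[i];               // executed n * n times
        }
    }
    printf("%d\n", temp);
    return 0;
}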

The earlier article "C language - detailed explanation of multi-layer for loop" explained the logic of nested for loops. Each time the outer loop condition holds, control enters the inner loop, which runs to completion (n times). Control then returns to the outer loop, the outer loop variable is incremented by 1, and the inner loop runs another n times. The outer loop itself runs n times, so the run-count function is: f(n) = n^2

Following the method for deriving the big O order, we first replace additive constants with 1, but this function has none, so we move to the next step and keep the highest-order term n^2. The third step removes the constant that multiplies the highest-order term, but nothing multiplies n^2 here, so the final big O order is: T(n) = O(n^2)

At the same time, we also call O(n^2) the square order.

Logarithmic order O(logn):

#include<stdio.h>
int main()
{
    int n = 0;
    scanf("%d",&n);
    int x = 1;
    for(int i = 1;i < n;i*=2)
    {
        ++x;
        printf("x = %d,i = %d\n",x,i);
    }
    return 0;
}

In this program, the integer variable n is used in the loop condition and its value is supplied by the user. An integer x is also defined. Each iteration increments x and prints it, and after each iteration the loop variable i becomes twice its previous value. For example, when I enter 10 for n, the program prints:

x = 2,i = 1
x = 3,i = 2
x = 4,i = 4
x = 5,i = 8

Let's analyze the running steps of the program now:

The first iteration: i = 1. We increment x and print it; x is now 2.

The second iteration: the loop update i * 2 makes i equal to 2. x is incremented and printed again; x is now 3.

The third iteration: the loop update i * 2 makes i equal to 4. x is incremented and printed again; x is now 4.

The fourth iteration: the loop update i * 2 makes i equal to 8. x is incremented and printed again; x is now 5.

The fifth check: the loop update i * 2 makes i equal to 16. Since 16 > 10, the loop variable no longer satisfies the loop condition, and the loop exits.

In summary, the loop body executes four times in total before i fails the condition i < n.

Let's analyze the relationship between i and n again:

The values of i in the four iterations are 2^0, 2^1, 2^2, and 2^3, and the loop ends when 2^4 exceeds n. The loop condition and the loop update together determine the number of iterations. With rounding down, the number of times the loop body runs is:

Number of program runs = $\lfloor \log_2 10 \rfloor + 1 = 3 + 1 = 4$

Note: $\log_2 10 \approx 3.32$ is not a whole number, and rounding down discards the fractional remainder. When n is an exact power of the base, the loop body runs exactly $\log_2 n$ times, with no + 1.

Now suppose n becomes 100. Under the same rounding, the largest power of 2 below 100 is 2^6 = 64, so the loop body should run 6 + 1 = 7 times. We run the program to verify:

The result is correct. Now, if I change the loop update from i *= 2 (i = i * 2) to i *= 3 (i = i * 3), how many times will the loop body run? It's very simple: we just change the base of the logarithm from 2 to 3, that is:

Number of program runs = $\lfloor \log_3 100 \rfloor + 1 = 4 + 1 = 5$

Note: $\log_3 100 \approx 4.19$ is not a whole number, and rounding down discards the fractional remainder.

Run the program to verify:

The result is correct. 

That is to say, we can express the number of times this kind of program runs as a mathematical relationship, and that relationship is a logarithmic function. Expressed in big O order, the time complexity of this type of program is summarized as:

T(n) = O(logn)

At the same time, we also call O(logn) the logarithmic order.

So far we have analyzed the common time complexities, but programs vary endlessly and there are many less common big O orders. Here is a table describing time complexities:

This table contains several less common big O orders, such as the exponential order and the nlogn order. Both have very typical examples in data structures and algorithms: computing the Fibonacci sequence recursively has time complexity O(2^n), and the two-way merge sort algorithm has time complexity O(nlogn) in both the worst case and the best case.

Summary:

Data structures is one of the core undergraduate computer science courses. This is the first article in my data structure series. It marks the formal transition from studying the C language to studying data structures and introduces the basic concepts: data, data element, data item, data object, data structure, the physical and logical structures, and the classification of logical structures. It then introduces the concept of time complexity and derives the common big O orders. With that, the introduction to data structures is essentially complete. In later articles I will cover typical data structures such as the sequence table, singly linked list, and doubly linked list in detail, and implement them in code.

References:

Yan Weimin - "Data Structure (C Language Edition)" - Tsinghua University Press

Cheng Jie - "Big Talk Data Structure" - Tsinghua University Press

Origin blog.csdn.net/weixin_45571585/article/details/127258129