[Data structure] Summary of data structure knowledge points

1 Introduction

  • Logically, data structures are divided into linear structures and nonlinear structures
  • The representation of a data structure in computer memory is called the storage structure of the data
  • A data item is the smallest unit of data, and a data element is the basic unit of data
  • The data processed by a computer generally has some internal connection, meaning that certain relationships exist between data elements
  • The purpose of algorithm analysis is to analyze the efficiency of an algorithm so that it can be improved
  • When storing data, not only the value of each data element but also the relationships between data elements must be stored
  • The main task of algorithm analysis is to analyze the relationship between the execution time of the algorithm and the problem size
  • The efficiency of operations on data depends on the storage structure adopted
  • The amount of computation an algorithm requires is called the time complexity of the algorithm
  • When contiguous storage is allocated, the addresses of the storage units must be contiguous
  • The logical structure of data refers to the logical relationship between data elements
  • The representation of a data structure in a computer is called a storage structure
  • The physical structure of data includes the sequential storage structure and the linked storage structure
  • The logical structures of data include four types: set, linear structure, tree, and graph. Tree structures and graph structures are collectively referred to as nonlinear structures
  • Sequential storage places logically adjacent elements in physically adjacent storage units; in linked storage, the logical relationship between nodes is represented by pointer fields.
  • The study of data structures covers the logical structure and the physical structure of data and the relationship between them, defines the operations appropriate to each structure, and designs the corresponding algorithms.
  • The execution time of the algorithm is a function of the problem size n.
  • Briefly describe the relationship between the logical structure of data and the storage structure.

    Answer: The logical structure and the storage structure of data are closely related. The storage structure not only stores the data elements in the computer but also expresses the logical relationships between them. The logical structure is independent of the computer, while the storage structure is the representation of the data elements and their relationships in the computer. A logical structure can usually have several storage structures; for example, a linear structure can be represented by a sequential storage structure or by a linked storage structure.

  • What is the difference between data structures and data types?

    Answer: A data structure is a collection of data elements that have one or more specific relationships with each other. It generally covers three aspects: the logical structure of the data, its storage structure, and the operations on the data. A data type is a set of values together with a set of operations defined on those values. A data structure focuses on the relationships between elements, while a data type focuses on the characteristics of individual data items.

  • When the logical structure of the data has been selected to solve a certain problem, what aspects should be considered when choosing the storage structure of the data?

    Answer: Two aspects are usually considered: first, the space complexity of the implementation; second, the time complexity of the algorithm. If the amount of storage needed is hard to determine in advance, a linked storage structure should be chosen; otherwise a sequential storage structure should be chosen. If insertions and deletions are frequent, a linked storage structure should be chosen; otherwise a sequential storage structure should be chosen.
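As a rough illustration of the point that one logical structure can have several storage structures, the sketch below holds the same linear sequence once in a contiguous array and once in a singly linked list. The names `SeqList`, `Node`, `seq_get`, and `link_get` are hypothetical and chosen only for this example; indices are 0-based and are assumed to be in range.

```c
#include <stddef.h>

#define MAXSIZE 100

/* Sequential storage: logically adjacent elements are physically adjacent. */
typedef struct {
    int elem[MAXSIZE];
    int length;
} SeqList;

/* Linked storage: the logical order is carried by the next pointers. */
typedef struct Node {
    int data;
    struct Node *next;
} Node;

/* Return the i-th element (0-based) of a sequential list: O(1). */
int seq_get(const SeqList *s, int i) {
    return s->elem[i];
}

/* Return the i-th element (0-based) of a linked list without a head node: O(i). */
int link_get(const Node *first, int i) {
    while (i-- > 0)
        first = first->next;
    return first->data;
}
```

Both functions answer the same logical question ("what is element i?"), but only the sequential version can jump straight to the element; the linked version has to follow the pointer chain.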

Linear Lists

  • A linked list has these characteristics: insertion and deletion do not require moving elements, storage space need not be estimated in advance, and the space required is proportional to the length
  • If the most common operation is to access the i-th node and its predecessor, sequential storage saves the most time
  • A linear list is a finite sequence of n data elements (n >= 0)
  • The sequential storage structure of a linear list is a random-access storage structure
  • The purpose of adding a head node to a singly linked list is to make operations easier to implement
  • A singly linked list without a head node (head pointer h) is empty when h==NULL
  • A singly linked list with a head node (head pointer h) is empty when h->next==NULL
  • A circular doubly linked list with a head node (head pointer L) is empty when L->next==L
  • The tail node (pointed to by p) of a non-empty circular singly linked list (head pointer head) satisfies p->next==head
  • If the most common operations on a linked list are inserting a node at the end and deleting the tail node, a circular doubly linked list with a head node saves the most time.
  • If the most common operations on a linear list are accessing the element with a given index and inserting and deleting at the end of the list, sequential storage saves the most time.
  • If a linear list of length n uses sequential storage, inserting a new element at the i-th position takes O(n) time
  • For a sequentially stored linear list, accessing a node takes O(1) time, and inserting or deleting a node takes O(n) time
  • For a linear list stored as a linked list, accessing the i-th node takes O(n) time
  • The tail node p of a circular linked list H satisfies p->next==H
  • Inserting an element before the i-th element of a sequential list of length n requires moving n-i+1 elements backward.
  • Deleting the i-th element of a sequential list of length n requires moving n-i elements forward.
  • The role of the head node in a singly linked list is to simplify the insertion and deletion algorithms.
  • To delete a specified node from a singly linked list, the direct predecessor of that node must be found.
  • Nodes in a singly linked list must be accessed sequentially by following the pointer fields.
  • Each node in a doubly linked list has two pointer fields, one pointing to its direct predecessor and one pointing to its direct successor.
  • In a circular doubly linked list, deleting the last node takes O(1) time.
  • Searching a linear list for a given value takes O(n) time.
  • A sequential list is built from n data elements. If every element is inserted at the tail of the list by calling the insertion algorithm, the whole process takes O(n) time; if every element is inserted at the head of the list, the whole process takes O(n^2) time.
  • In a doubly linked list, the tail pointer can be used instead of the head pointer.
  • Building a sequential list or a singly linked list from n data elements takes O(n) time at best and O(n^2) time at worst.
  • Finding the length of a linear list takes O(1) time with sequential storage and O(n) time with linked storage.
  • In a singly linked list with a head node, is inserting or deleting at the head the same process as inserting or deleting at other positions? It is the same.
  • In a singly linked list without a head node, is inserting or deleting at the head the same process as inserting or deleting at other positions? It is not the same.
  • Describe the characteristics of the storage methods of sequential lists and linked lists.

    Answer: A sequential list allocates contiguous storage units for the data, and the data elements are stored in those units in their logical order, so logically adjacent elements are also physically adjacent. This allows random access to the elements of the linear list, that is, accessing an element takes O(1) time.

    The storage units allocated to a linked list need not be contiguous. The logical relationship between data elements is represented by the pointer field of each node, so the elements of the linear list can only be accessed sequentially.

  • If insertions and deletions are performed frequently on a linear list, which storage structure should the list use, and why?

    Answer: The linear list should use a linked storage structure. With linked storage, inserting or deleting a data element does not require moving other elements; the logical relationship between elements is changed simply by modifying the pointer fields of the nodes involved.

  • In a singly linked list, a circular doubly linked list, and a circular singly linked list, if only a pointer p to some node is known and the head pointer is not, can node p be deleted from the list? If so, what is the time complexity in each case?

    Answer: To delete node p, its predecessor must be found and the predecessor's pointer field changed to point to the successor of p, after which node p is released. This is not possible in a (non-circular) singly linked list, because the predecessor of p cannot be found without the head pointer. It is possible in both the circular doubly linked list and the circular singly linked list. Deleting node p takes O(n) time in a circular singly linked list and O(1) time in a circular doubly linked list.
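The difference between the two circular cases can be sketched as follows. This is a minimal sketch rather than a definitive implementation: `DNode`, `SNode`, `dlist_remove`, and `clist_remove` are hypothetical names, the ring is assumed to hold at least two nodes, and releasing the removed node is left to the caller.

```c
typedef struct DNode {
    int data;
    struct DNode *prior, *next;
} DNode;

typedef struct SNode {
    int data;
    struct SNode *next;
} SNode;

/* Circular doubly linked list: the predecessor is p->prior, so unlinking is O(1). */
void dlist_remove(DNode *p) {
    p->prior->next = p->next;
    p->next->prior = p->prior;
}

/* Circular singly linked list: walk around the ring until the node whose
   successor is p is found, then bypass p. This takes O(n) time. */
void clist_remove(SNode *p) {
    SNode *pre = p;
    while (pre->next != p)
        pre = pre->next;
    pre->next = p->next;
}
```

In a non-circular singly linked list the loop in `clist_remove` would run off the end of the list without ever reaching the nodes before p, which is why the deletion is impossible there without the head pointer.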

  • What is the effect of adding a head node to a linked list?

    Answer: First, in a linked list with a head node, every element node has a predecessor, so inserting or deleting a node at any position only requires modifying the pointer field of the preceding node. Without a head node, inserting before the first node or deleting the first node must modify the head pointer, so the algorithms are more complicated than with a head node.

    Second, in a linked list with a head node, the head pointer is fixed after initialization. Except for the operation that destroys the list, no algorithm modifies the head pointer, which reduces the chance of errors.
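A minimal sketch of why the head node simplifies insertion: with a head node every position has a predecessor, so a single routine covers all positions and the head pointer never changes. `LNode`, `list_init`, `list_is_empty`, and `list_insert` are hypothetical names introduced for this example; positions are counted from 1.

```c
#include <stdlib.h>

typedef struct LNode {
    int data;
    struct LNode *next;
} LNode;

/* Create an empty list consisting of the head node only.
   Returns NULL if allocation fails. */
LNode *list_init(void) {
    LNode *h = malloc(sizeof *h);
    if (h) h->next = NULL;
    return h;
}

/* Empty test for a list with a head node. */
int list_is_empty(const LNode *h) {
    return h->next == NULL;
}

/* Insert x as the i-th element (1-based). Returns 1 on success, 0 on failure. */
int list_insert(LNode *h, int i, int x) {
    LNode *pre = h;                 /* the predecessor of position 1 is the head node */
    for (int k = 1; k < i && pre; k++)
        pre = pre->next;
    if (i < 1 || !pre) return 0;    /* position out of range */
    LNode *s = malloc(sizeof *s);
    if (!s) return 0;
    s->data = x;
    s->next = pre->next;            /* same two pointer updates at every position */
    pre->next = s;
    return 1;
}
```

Inserting at position 1 takes the same code path as inserting anywhere else; without a head node, that case would need a separate branch that updates the head pointer.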

Given a linear list stored as a singly linked list with a head node, write an algorithm to find the length of the singly linked list.

Solution: The basic idea of the algorithm: starting from the node after the head node, traverse every node of the singly linked list and add 1 to a counter for each node visited.

/* Node of a singly linked list; a linklist points to the head node. */
typedef struct node { int data; struct node *next; } node, *linklist;

int listlength(linklist L)
{
    int length = 0;
    linklist p = L->next;   /* first data node, after the head node */
    while (p) {
        length++;
        p = p->next;
    }
    return length;
}

Given a sequential list L whose elements are arranged in increasing order, design an algorithm that inserts an element with value x so that the list remains in increasing order, with space complexity O(1).

/* L is passed by reference (textbook notation) so the changes are visible to the caller. */
void insertsq(sqlist &L, elemtype x)
{
    int n = L.length - 1;
    /* shift every element greater than x one position to the right */
    while (n >= 0 && x < L.elem[n]) {
        L.elem[n + 1] = L.elem[n];
        n--;
    }
    L.elem[n + 1] = x;   /* assumes the array has room for one more element */
    L.length++;
}

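Write an algorithm to remove all elements with value x from a sequential list.

void delallsq(sqlist &L, elemtype x)
{
    int i = 0, j = 0;   /* i: next write position, j: read position */
    while (j < L.length) {
        if (L.elem[j] != x)
            L.elem[i++] = L.elem[j];   /* keep elements that differ from x */
        j++;
    }
    L.length = i;
}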

Stacks and Queues

  • A stack operates on the first-in, last-out principle
  • The first-in, first-out property of a queue means that the last element inserted into the queue is always the last to be deleted
  • Compared with a sequential stack, a linked stack has the obvious advantage that it usually does not become full
  • What stacks and queues have in common is that insertion and deletion are allowed only at the ends
  • The characteristic of a stack is last in, first out, and the characteristic of a queue is first in, first out
  • The front of a linked queue represented by a singly linked list is at the head of the linked list
  • If the input sequence is ABC and the output sequence is CBA, the stack operations performed are push, push, push, pop, pop, pop
  • Stacks are used in recursive calls, function calls, and expression evaluation
  • For an algorithm that checks whether the left and right parentheses in an expression are paired, a stack is the best data structure.
  • If a circular queue is stored in A[0..M-1], the enqueue operation updates rear=(rear+1)%M
  • If a circular queue is stored in A[0..M-1], the dequeue operation updates front=(front+1)%M
  • If the maximum capacity of a circular queue is M, the queue is empty when rear==front
  • If the maximum capacity of a circular queue is M, the queue is full when (rear+1)%M==front
  • The circular queue is introduced to overcome the false overflow of the sequential queue.
  • There are three ways to distinguish an empty circular queue from a full one: leave one element slot unused, set an empty/full flag, or use a counter to record the number of elements in the queue.
  • The difference between a stack and a queue is that a stack allows insertion and deletion only at one end of the list, while a queue allows insertion only at one end and deletion only at the other end.
  • Assuming the stack uses sequential storage and already holds i-1 elements, pushing the i-th element onto the stack takes O(1) time.
  • If a stack is represented by a singly linked list without a head node, the operation that creates an empty stack is top=NULL.
  • If the maximum length of the stack is difficult to estimate, it is better to use a linked stack.
  • Why is the stack a last-in-first-out table?

    Answer: Because a stack allows insertion and deletion only at one end of the list, the element pushed most recently is always the first to be popped, so the stack is a last-in, first-out list.

  • For a stack whose input sequence is A, B, C, try to give all possible output sequences.

    Answer: The possible pop sequences are: ABC, ACB, BAC, BCA, CBA. (CAB is not possible, because A lies below B on the stack when C is popped.)

  • What is queue overflow? What is false overflow? What method solves the false overflow problem, and how does it work?

    Answer: Queue overflow occurs when, in the sequential storage allocated to the queue, every unit already holds an element and an insertion is attempted.

    False overflow is the situation in which some storage units in the space allocated to the queue are unoccupied, yet a new data element cannot enter the queue under the operation rules.

    False overflow is solved by using the storage space allocated to the queue cyclically (a circular queue). The basic principle is to take the head and tail indices of the queue modulo the length of the storage space allocated to the queue. That is:

    Enqueue operation: Q.rear=(Q.rear+1)%MSize

    Dequeue operation: Q.front=(Q.front+1)%MSize
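The modulo updates above can be put together into a small circular queue. This is only a sketch under stated assumptions: it uses the "leave one slot unused" convention to tell full from empty, and `CQueue`, `MSIZE`, and the `cq_*` functions are hypothetical names.

```c
#define MSIZE 8   /* size of the array; one slot is always left unused */

typedef struct {
    int data[MSIZE];
    int front;    /* index of the front element */
    int rear;     /* index one past the last element */
} CQueue;

void cq_init(CQueue *q) { q->front = q->rear = 0; }

int cq_empty(const CQueue *q) { return q->front == q->rear; }

int cq_full(const CQueue *q) { return (q->rear + 1) % MSIZE == q->front; }

/* Enqueue x. Returns 0 on true overflow (queue full), 1 otherwise. */
int cq_enqueue(CQueue *q, int x) {
    if (cq_full(q)) return 0;
    q->data[q->rear] = x;
    q->rear = (q->rear + 1) % MSIZE;    /* wrap around instead of running off the end */
    return 1;
}

/* Dequeue into *x. Returns 0 if the queue is empty, 1 otherwise. */
int cq_dequeue(CQueue *q, int *x) {
    if (cq_empty(q)) return 0;
    *x = q->data[q->front];
    q->front = (q->front + 1) % MSIZE;
    return 1;
}
```

Because `rear` wraps back to index 0, slots freed by earlier dequeues are reused, which is exactly what removes false overflow. The cost of this convention is that at most `MSIZE - 1` elements can be stored.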

  • A queue can be implemented with a circular singly linked list, with either only a head pointer or only a tail pointer. Analyze which scheme is more suitable.

    Answer: When a circular linked list represents the queue, setting only a tail pointer is more appropriate. With a tail pointer, a new node can be inserted directly after the tail node, and the front node is reached as the successor of the tail node, so both enqueue and dequeue take O(1) time. If only a head pointer is set, dequeue takes O(1) time but enqueue takes O(n) time, because the tail node must be found first.
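The tail-pointer scheme might look like the following sketch. The names `QNode`, `lq_enqueue`, and `lq_dequeue` are hypothetical, the list is assumed to have no head node, and an empty queue is represented by a NULL tail pointer.

```c
#include <stdlib.h>

typedef struct QNode {
    int data;
    struct QNode *next;
} QNode;

/* The queue is identified by its tail pointer only; rear->next is the front.
   An empty queue is represented by rear == NULL. */

/* Enqueue x after the tail in O(1).
   Returns the new tail, or the old tail if allocation fails. */
QNode *lq_enqueue(QNode *rear, int x) {
    QNode *s = malloc(sizeof *s);
    if (!s) return rear;
    s->data = x;
    if (!rear) {
        s->next = s;              /* single node points to itself */
    } else {
        s->next = rear->next;     /* new tail points to the front */
        rear->next = s;
    }
    return s;
}

/* Dequeue the front node into *x in O(1). The queue must be non-empty.
   Returns the new tail (NULL if the queue became empty). */
QNode *lq_dequeue(QNode *rear, int *x) {
    QNode *front = rear->next;
    *x = front->data;
    if (front == rear)
        rear = NULL;              /* removed the only node */
    else
        rear->next = front->next;
    free(front);
    return rear;
}
```

Neither operation contains a loop, which is the O(1) behavior the answer describes; with only a head pointer, the enqueue would first have to walk to the tail.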

  • Briefly describe the similarities and differences between linear lists, stacks, and queues.

    Answer: Stacks and queues are both linear lists with restricted operation positions, that is, the positions at which insertion and deletion may occur are limited. A stack is a linear list that allows insertion and deletion only at one end, and is therefore a last-in, first-out list. A queue is a linear list that allows insertion at one end and deletion at the other end, and is therefore a first-in, first-out list. A general linear list allows insertion and deletion at any position.
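As one illustration of last-in, first-out behavior, the parenthesis check mentioned in this section can be sketched with a small sequential stack. The function name `brackets_match` and the limit `STACK_MAX` are assumptions made for this example, and handling three bracket kinds goes slightly beyond the plain parentheses named in the text.

```c
#define STACK_MAX 256

/* Return 1 if every bracket in s is properly paired and nested, else 0.
   Expressions nested deeper than STACK_MAX are rejected. */
int brackets_match(const char *s) {
    char stack[STACK_MAX];
    int top = 0;                             /* number of elements on the stack */
    for (; *s; s++) {
        char c = *s;
        if (c == '(' || c == '[' || c == '{') {
            if (top == STACK_MAX) return 0;  /* sequential stack is full */
            stack[top++] = c;                /* push the opening bracket */
        } else if (c == ')' || c == ']' || c == '}') {
            if (top == 0) return 0;          /* nothing to pair with */
            char open = stack[--top];        /* pop the most recent opener */
            if ((c == ')' && open != '(') ||
                (c == ']' && open != '[') ||
                (c == '}' && open != '{'))
                return 0;
        }
    }
    return top == 0;                         /* leftover openers are unmatched */
}
```

The most recently opened bracket must be the first one closed, which is why a stack fits and a queue would not.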

Strings

  • A string is a finite sequence of zero or more characters
  • A string is a special kind of linear list; what makes it special is that each data element is a single character
  • Two strings are equal if their lengths are equal and the characters at every corresponding position are equal
  • Given two strings p and q, the operation of finding the first occurrence of q in p is called pattern matching
  • An empty string is a string of length 0.
  • A subsequence made up of any consecutive characters in a string is called a substring of that string.
  • If s="abcd", then after executing the statement s2=Substr(s,2,2), s2="bc".
  • A blank string is a string consisting of one or more space characters; its length equals the number of space characters it contains.
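The substring and pattern-matching operations in the list above can be sketched as follows. This is a brute-force sketch, not the KMP algorithm; `substr` and `index_of` are hypothetical helpers, and positions are assumed to be 1-based to agree with the Substr(s,2,2) example.

```c
#include <string.h>

/* Copy the substring of s that starts at 1-based position pos and has length
   len into out (which must hold len + 1 bytes).
   Returns 1 on success, 0 if the requested range lies outside s. */
int substr(const char *s, int pos, int len, char *out) {
    int n = (int)strlen(s);
    if (pos < 1 || len < 0 || pos - 1 + len > n) return 0;
    memcpy(out, s + pos - 1, (size_t)len);
    out[len] = '\0';
    return 1;
}

/* Brute-force pattern matching: return the 1-based position of the first
   occurrence of q in p, or 0 if q does not occur in p. */
int index_of(const char *p, const char *q) {
    int n = (int)strlen(p), m = (int)strlen(q);
    for (int i = 0; i + m <= n; i++) {
        int j = 0;
        while (j < m && p[i + j] == q[j]) j++;   /* compare character by character */
        if (j == m) return i + 1;
    }
    return 0;
}
```

With these definitions, `substr("abcd", 2, 2, buf)` leaves "bc" in `buf`, and an empty string and a blank string differ in length: `strlen("")` is 0, while a string of three spaces has length 3.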

Arrays

  • The difference between a one-dimensional array and a linear list is that the former has a fixed length and the latter has a variable length
  • The relationship among the elements of a multidimensional array is linear
  • For an array A[8][10] in which each element occupies 3 storage units, the number of storage units needed to store the array is 240
  • For an array A[8][10] stored in row-major order, in which each element occupies 3 storage units and the base address is SA, the starting address of element A[7][5] is SA+225
  • If an n*n symmetric matrix uses compressed storage, the number of elements stored in memory is n*(n+1)/2
  • Suppose A is an n*n symmetric matrix compressed into a one-dimensional array B[0..n(n+1)/2-1]; then the position in B of an element a(i,j) in the lower triangle (i >= j, indices starting at 1) is i(i-1)/2+j-1
  • There are two common compressed storage methods for sparse matrices: the triple table and the orthogonal (cross) linked list
  • For a 10*10 symmetric matrix A, compressed and stored in row-major order with each element occupying one storage unit and the address of a(1,1) equal to 1, the address of a(8,5) is 33
  • Suppose the array A[50][80] has base address 2000, each element occupies 2 storage units, and the array is stored in row-major order. Answer the following questions:

    (1) How many elements does the array have?

    (2) How many storage units does the array occupy?

    (3) What is the storage address of the array element a[30][30]?

    Answer:

    (1) The array has 50*80=4000 elements.

    (2) The array occupies 4000*2=8000 storage units.

    (3) loc(30,30)=2000+(30*80+30)*2=2000+4860=6860
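The address calculations in this section follow two small formulas, sketched below. `row_major_addr` and `sym_index` are hypothetical helper names; the first assumes 0-based indices and row-major order, and the second assumes 1-based indices with the lower triangle stored row by row.

```c
/* Address of a[i][j] in a row-major array with `cols` columns, 0-based
   indices, base address `base`, and `size` storage units per element. */
long row_major_addr(long base, int cols, int size, int i, int j) {
    return base + ((long)i * cols + j) * size;
}

/* Index in B[0..n(n+1)/2-1] of element a(i,j) of a symmetric matrix whose
   lower triangle is stored row by row; i and j are 1-based. */
int sym_index(int i, int j) {
    if (i < j) { int t = i; i = j; j = t; }   /* a(i,j) == a(j,i) */
    return i * (i - 1) / 2 + j - 1;
}
```

For the worked examples above, `row_major_addr(2000, 80, 2, 30, 30)` evaluates to 6860 and `row_major_addr(0, 10, 3, 7, 5)` to 225, the offset from SA. `sym_index(8, 5)` is 32, which corresponds to address 33 when a(1,1) is stored at address 1.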

Trees and Binary Trees 

  • A complete binary tree of depth k has at least 2^(k-1) nodes and at most 2^k-1 nodes.
  • In a binary tree, if the number of nodes with degree 0 is n0 and the number of nodes with degree 2 is n2, then n0=n2+1.
  • Level i of a binary tree has at most 2^(i-1) nodes; a full binary tree of depth k has a total of 2^k-1 nodes, of which 2^(k-1) are leaf nodes.
  • A binary tree with n nodes stored as a binary linked list has a total of n+1 null pointer fields.
  • The two main differences between a tree and a binary tree are: the maximum degree of a node in a tree is unlimited, while the maximum degree of a node in a binary tree is 2; and the children of a tree node are not distinguished as left and right, while the children of a binary tree node are.
  • Any node in a tree may have 0 or more child nodes, and every node except the root has exactly 1 parent node.
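The identities n0=n2+1 and "n nodes give n+1 null pointer fields" can be checked on any concrete tree with a short traversal. The sketch below assumes a binary linked list node type; `BiTNode`, `TreeStats`, and `tree_stats` are hypothetical names.

```c
#include <stddef.h>

typedef struct BiTNode {
    int data;
    struct BiTNode *lchild, *rchild;
} BiTNode;

typedef struct {
    int n0, n1, n2;   /* number of nodes of degree 0, 1, and 2 */
    int nulls;        /* number of null pointer fields */
} TreeStats;

/* Accumulate degree counts and null-pointer counts by a preorder traversal.
   The caller must zero *st before the first call. */
void tree_stats(const BiTNode *t, TreeStats *st) {
    if (!t) return;
    int degree = (t->lchild != NULL) + (t->rchild != NULL);
    if (degree == 0) st->n0++;
    else if (degree == 1) st->n1++;
    else st->n2++;
    st->nulls += 2 - degree;      /* each missing child is one null field */
    tree_stats(t->lchild, st);
    tree_stats(t->rchild, st);
}
```

The second identity follows from counting pointer fields: n nodes have 2n fields, and n-1 of them are used to point at the non-root nodes, leaving n+1 null.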

Graphs

  • In an undirected graph, the sum of the degrees of all vertices is twice the number of edges.
  • An undirected graph with n vertices has at most n(n-1)/2 edges.
  • A strongly connected directed graph with n vertices (n > 1) has at least n edges.
  • A directed graph with n vertices has at most n(n-1) edges.
  • The adjacency matrix representation of a graph is unique, but the adjacency list representation is not unique.
  • An undirected graph with 10 vertices has at most 45 edges.
  • In a directed graph with n vertices, the degree of each vertex can be at most 2(n-1), since both the in-degree and the out-degree can be at most n-1.
  • Given a directed graph represented by an adjacency matrix, the in-degree of the i-th vertex is found by counting the non-zero elements in the i-th column.
  • Given the adjacency matrix of a directed graph, all arcs starting from the i-th vertex are deleted by setting every element of the i-th row to 0.
  • For an undirected graph with n vertices represented by an adjacency matrix: the number of edges is found by counting the elements whose value is 1 in the adjacency matrix and dividing by 2; whether two vertices are joined by an edge is determined by checking whether the corresponding matrix element is 1; and the degree of a vertex is found by counting the elements whose value is 1 in the row corresponding to that vertex.
  • In terms of storage space, which is better for dense graphs and for sparse graphs: the adjacency matrix or the adjacency list?

    Answer: In terms of storage space, an adjacency matrix is better for a dense graph, and an adjacency list is better for a sparse graph.

  • When an adjacency matrix is used to represent a graph, is the number of matrix elements related to the number of vertices? Is it related to the number of edges? Why?

    Answer: When a graph is represented by an adjacency matrix, the number of matrix elements depends directly on the number of vertices and has nothing to do with the number of edges. If the number of vertices is n, the adjacency matrix has n^2 elements.
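The adjacency matrix rules in this section translate almost directly into loops over rows and columns. The sketch below assumes a 0/1 matrix with at most `MAXV` vertices and 0-based vertex numbers; `MAXV`, `edge_count`, `degree`, `in_degree`, and `clear_out_arcs` are hypothetical names.

```c
#define MAXV 10

/* Number of edges of an undirected graph: every edge appears twice in the
   matrix, so count the nonzero entries and halve the total. */
int edge_count(int n, int adj[][MAXV]) {
    int ones = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            ones += adj[i][j] != 0;
    return ones / 2;
}

/* Degree of vertex v in an undirected graph: nonzero entries in row v. */
int degree(int n, int adj[][MAXV], int v) {
    int d = 0;
    for (int j = 0; j < n; j++)
        d += adj[v][j] != 0;
    return d;
}

/* In-degree of vertex v in a directed graph: nonzero entries in column v. */
int in_degree(int n, int adj[][MAXV], int v) {
    int d = 0;
    for (int i = 0; i < n; i++)
        d += adj[i][v] != 0;
    return d;
}

/* Delete all arcs leaving vertex v in a directed graph: clear row v. */
void clear_out_arcs(int n, int adj[][MAXV], int v) {
    for (int j = 0; j < n; j++)
        adj[v][j] = 0;
}
```

Note that the matrix always occupies `MAXV * MAXV` entries no matter how many edges exist, which matches the answer above about storage depending only on the number of vertices.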

Sorting

  • If each step takes an element from the unordered sublist and inserts it into the proper position in the ordered sublist, the method is called insertion sort; if each step selects the smallest or largest element from the unordered sublist and exchanges it to one end of the ordered sublist, the method is called direct selection sort.
  • If each step compares two elements indirectly through a reference element (pivot) and exchanges them when they are out of the required order, the method is called quick sort; the method that repeatedly merges two adjacent ordered lists into one ordered list is called merge sort.
  • Quick sort adopts the idea of bisection (divide and conquer), and heap sort organizes the data in a complete binary tree structure.
  • Direct selection sort on a list of n elements requires n(n-1)/2 key comparisons.
  • Between heap sort and quick sort, if the original records are nearly in forward or reverse order, heap sort is chosen; if the original records are unordered, quick sort is chosen.
  • Between insertion sort and selection sort, if the initial data is basically in forward order, insertion sort is chosen; if the initial data is basically in reverse order, selection sort is chosen.
  • Among heap sort, quick sort, and merge sort: if only storage space is considered, heap sort should be chosen first, then quick sort, and merge sort last; if only the fastest average speed is considered, quick sort should be chosen; if the worst case must be fast and memory must be saved, heap sort should be chosen.
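The comparison counts behind the insertion-versus-selection advice can be observed with instrumented versions of the two sorts. This is a sketch for illustration; `selection_sort` and `insertion_sort` are hypothetical names, and both sort an `int` array in increasing order while returning the number of key comparisons performed.

```c
/* Direct selection sort; returns the number of key comparisons performed. */
long selection_sort(int a[], int n) {
    long cmp = 0;
    for (int i = 0; i < n - 1; i++) {
        int min = i;
        for (int j = i + 1; j < n; j++) {
            cmp++;
            if (a[j] < a[min]) min = j;
        }
        if (min != i) { int t = a[i]; a[i] = a[min]; a[min] = t; }
    }
    return cmp;
}

/* Direct insertion sort; returns the number of key comparisons performed. */
long insertion_sort(int a[], int n) {
    long cmp = 0;
    for (int i = 1; i < n; i++) {
        int x = a[i], j = i - 1;
        while (j >= 0) {
            cmp++;
            if (a[j] <= x) break;   /* found the insertion point */
            a[j + 1] = a[j];        /* shift the larger element right */
            j--;
        }
        a[j + 1] = x;
    }
    return cmp;
}
```

Selection sort always performs n(n-1)/2 comparisons regardless of the input order, whereas insertion sort needs only n-1 comparisons on input that is already in forward order and n(n-1)/2 on input in reverse order.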

Origin blog.csdn.net/weixin_46601559/article/details/126848329