Software Designer (Data Structures)

data structure

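  • Time complexity (all non-recursive unless noted)

    • O(1) < O(log₂n) < O(n) < O(n·log₂n) < O(n²) < O(n³)
      • Addition rule: when terms are added, keep only the highest-order term and set its coefficient to 1, e.g. 4n³+2n²+4n+2 → O(n³)
      • Multiplication rule: when terms are multiplied, multiply the complexities and reduce constant factors to 1
      • Order of evaluation: parentheses first, then multiplication, then addition
      • O(log₂n) typically comes from a while loop that doubles its counter (x = x*2)
      • O(n) typically comes from a for loop that increments its counter (x++)
    • Time complexity (recursive) = number of recursive calls × time complexity of each call (when every call has similar complexity)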
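
The two loop shapes mentioned above can be compared by counting iterations. This is a minimal sketch; the function names `doubling_steps` and `linear_steps` are hypothetical and chosen only for illustration.

```python
def doubling_steps(n):
    """Count iterations of a loop that doubles x (x = x * 2) until x reaches n."""
    steps, x = 0, 1
    while x < n:
        x *= 2
        steps += 1
    return steps


def linear_steps(n):
    """Count iterations of a plain counting loop (the x++ style)."""
    steps = 0
    for _ in range(n):
        steps += 1
    return steps
```

For n = 1024 the doubling loop runs 10 times (log₂1024) while the counting loop runs 1024 times.
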
  • Space complexity

    • Non-recursive: O(1), O(n), O(n²); recursive: O(log₂n), O(n·log₂n)
    • Look at the storage the code declares: a single variable is O(1), a one-dimensional array of n elements is O(n), a two-dimensional n×n array is O(n²)
  • Asymptotic notation
    [Figure: asymptotic notation]

  • Master method for recurrences
    [Figure: master method]

    • A faster running time means a lower time complexity
  • Linear structures

    • Linear list
      • The simplest, most basic and most commonly used linear structure; it can use sequential storage or linked storage
      • Definition: a linear list is a finite sequence of n (n >= 0) elements.
      • In a non-empty linear list, every element except the first has exactly one direct predecessor, and every element except the last has exactly one direct successor
      • A list with a single element has that element as both first and last, with no predecessor or successor.
      • Sequential storage
        • The data elements of the list are stored one after another in a group of storage units with consecutive addresses. Each unit holds data only
        • Advantage: elements can be accessed randomly, so lookup by position is O(1)
        • Disadvantage: insertion and deletion require moving elements
          • Insertion moves n/2 elements on average, time complexity O(n)
          • Deletion moves (n-1)/2 elements on average, time complexity O(n)
      • Linked storage
        • Data elements are stored in nodes linked by pointers. Each node is: data field | pointer field
        • Singly linked list with (or without) a head node: insertion, deletion and search are O(n) on average; no elements are moved (average movement 0)
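
The n/2 average for insertion into a sequentially stored list can be checked by counting element moves. The sketch below uses a Python list as the storage area; `insert_with_moves` and `average_insert_moves` are hypothetical helper names.

```python
def insert_with_moves(table, index, value):
    """Insert value at a 0-based index by shifting elements right; return how many moved."""
    table.append(None)                       # grow the storage by one slot
    moved = 0
    for k in range(len(table) - 1, index, -1):
        table[k] = table[k - 1]              # shift one element to the right
        moved += 1
    table[index] = value
    return moved


def average_insert_moves(n):
    """Average moves over the n + 1 possible insert positions of an n-element list."""
    total = sum(insert_with_moves(list(range(n)), i, -1) for i in range(n + 1))
    return total / (n + 1)
```

With n = 10, `average_insert_moves(10)` gives 5.0, matching n/2.
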
  • Stack

    • A linear data structure in which data can be stored and retrieved at one end only
    • Features: first in, last out (FILO). The end where insertion and deletion take place is the top of the stack; the other end is the bottom. A stack with no data elements is an empty stack
    • Used to implement calls and returns of functions or procedures, including recursive calls
    • Pushing and popping require no traversal
  • Queue

    • A first-in, first-out (FIFO) linear list: elements may only be inserted at one end (the rear) and deleted at the other end (the front).
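
The difference between the stack and the queue shows up in the order elements come back out. A minimal sketch, using a Python list as the stack and `collections.deque` as the queue; the function names are hypothetical.

```python
from collections import deque


def stack_order(items):
    """Push every item, then pop them all: last in, first out."""
    stack = []
    for item in items:
        stack.append(item)                 # push onto the top
    return [stack.pop() for _ in range(len(stack))]


def queue_order(items):
    """Enqueue every item at the rear, then dequeue from the front: first in, first out."""
    queue = deque()
    for item in items:
        queue.append(item)                 # insert at the rear
    return [queue.popleft() for _ in range(len(queue))]
```
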
  • String

    • A finite sequence consisting only of characters; it is a linear list.
    • Spaces count towards the length. A string containing no characters is the empty string (length zero).
    • Substring: any run of consecutive characters in the string, of any length. For 'abc', 'ab' and 'bc' are substrings; 'ac' is not
    • String pattern matching
      • Main string (length n): bbcabem; pattern string (length m): abe
      • Naive pattern matching: compare the first character of the pattern with the first character of the main string. On a match, go on to the second characters; on a mismatch, restart the pattern against the second character of the main string, and so on. A successful match returns the starting position in the main string. A failed match returns 0 if positions are numbered from 1, or -1 if they are numbered from 0.
      • Time complexity: best O(m) or O(1); worst O(m*n), from (n-m+1)*m comparisons; average O(n+m)
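
The naive matching procedure described above can be sketched as follows. The function name `naive_match` is hypothetical; it uses the convention of positions numbered from 1, so a failed match returns 0.

```python
def naive_match(main, pattern):
    """Return the 1-based start of the first match of pattern in main, or 0 if none."""
    n, m = len(main), len(pattern)
    for start in range(n - m + 1):         # every alignment where the pattern still fits
        k = 0
        while k < m and main[start + k] == pattern[k]:
            k += 1
        if k == m:                         # all m characters matched
            return start + 1
    return 0
```

For the example strings in the notes, `naive_match("bbcabem", "abe")` returns 4.
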
    • KMP
      • Prefixes: b bb bbc bbca bbcab (taking bbcabe from the main string above as the example)
      • Suffixes: e be abe cabe bcabe
      • next value of the i-th character = length of the longest equal prefix and suffix of the substring of characters 1 to i-1, plus 1
        • Special case: next[1] = 0
        • Example: for the string ababa, the fifth next value is next[5] = 3
        • Time complexity O(n+m)
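
The next values can be computed without listing every prefix and suffix by hand. This is a sketch of the usual textbook computation under the 1-based convention used above (next[1] = 0); the function name `kmp_next` is hypothetical.

```python
def kmp_next(pattern):
    """Return the next values for positions 1..m of the pattern, as a list."""
    m = len(pattern)
    nxt = [0] * (m + 1)                    # nxt[0] is unused padding for 1-based indexing
    i, j = 1, 0                            # i: current position, j: candidate prefix position
    while i < m:
        if j == 0 or pattern[i - 1] == pattern[j - 1]:
            i += 1
            j += 1
            nxt[i] = j
        else:
            j = nxt[j]                     # fall back to a shorter equal prefix
    return nxt[1:]
```

For ababa this gives [0, 1, 1, 2, 3], so the fifth next value is 3.
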
  • First address of an array
    [Figure: first address of an array]

  • Matrix

    • An n*n matrix stores n² elements
    • Symmetric matrix
      • Aij = Aji, so only the main diagonal and the lower triangle need to be stored
      • Stored by row with subscripts starting from 0: when i >= j, Aij is at position (i+1)i/2 + j + 1 (positions counted from 1); when j > i, use Aij = Aji and compute the position of Aji
    • Tridiagonal matrix
      • Only the three central diagonals hold elements; the triangles on either side are all 0 and need not be stored
      • Stored by row with subscripts starting from 0: Aij is at position 2i + j + 1 (positions counted from 1)
    • Sparse matrix
      • The matrix is very large but holds very few non-zero elements
      • The triple sequential table and the orthogonal (cross) linked list are its compressed storage methods
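
The two position formulas for compressed storage can be written as small functions. A sketch, assuming subscripts i and j start from 0 and positions in the one-dimensional store are counted from 1; the function names are hypothetical.

```python
def symmetric_position(i, j):
    """Position of A[i][j] when only the lower triangle is stored row by row."""
    if i < j:
        i, j = j, i                        # A[i][j] == A[j][i], so use the lower-triangle entry
    return (i + 1) * i // 2 + j + 1


def tridiagonal_position(i, j):
    """Position of A[i][j] when only the three diagonals are stored row by row."""
    if abs(i - j) > 1:
        raise ValueError("element lies outside the three diagonals and is not stored")
    return 2 * i + j + 1
```
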
  • Tree

    • The degree of a node is its number of child nodes; a node with degree 0 is a leaf node.
    • The maximum number of levels is the height of the tree; the maximum node degree is the degree of the tree
    • Properties
      [Figure: properties of trees]
  • Binary tree

    • The degree of any node is at most 2. A tree with 0 nodes is an empty tree. The definition is recursive, and the subtrees of a node are distinguished as left subtree and right subtree

    • Properties
      [Figure: properties of binary trees]

      • Complete binary tree: its nodes can be numbered consecutively from 1 to n
        [Figure: complete binary tree]
    • Sequential storage

      • The nodes of the binary tree are stored in a group of storage units with consecutive addresses, arranged as a linear sequence
      • Worst case: a tree of depth k with only k nodes still needs 2^k - 1 storage units
    • Linked storage

      • A node holds the data element, a pointer to the root of its left subtree, a pointer to the root of its right subtree and, optionally, a pointer to its parent, so the tree can be stored as a binary linked list or a ternary linked list (with parent pointers). The head pointer of the list points to the root node of the binary tree
      • A binary linked list with n nodes has n+1 null pointer fields
      • A ternary linked list with n nodes has n+2 null pointer fields
    • Binary tree traversal algorithms

      • Pre-order traversal (root, left, right)
      • In-order traversal (left, root, right)
      • Post-order traversal (left, right, root)
      • Level-order traversal (from the first level to the last, each level from left to right)
      • To derive the other sequences (that is, to rebuild the tree), the in-order sequence is required, together with any one of the others.
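
The four traversal orders, and the rebuilding of a tree from its pre-order and in-order sequences, can be sketched as below. The `Node` class, the function names and the example tree are all hypothetical, and `rebuild` assumes every node value is distinct.

```python
from collections import deque


class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right


def preorder(node):
    return [] if node is None else [node.value] + preorder(node.left) + preorder(node.right)


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.value] + inorder(node.right)


def postorder(node):
    return [] if node is None else postorder(node.left) + postorder(node.right) + [node.value]


def level_order(root):
    order = []
    pending = deque([root] if root is not None else [])
    while pending:
        node = pending.popleft()
        order.append(node.value)
        pending.extend(c for c in (node.left, node.right) if c is not None)
    return order


def rebuild(pre, ino):
    """Rebuild a tree from its pre-order and in-order sequences (distinct values assumed)."""
    if not pre:
        return None
    split = ino.index(pre[0])              # the root splits the in-order sequence in two
    return Node(pre[0],
                rebuild(pre[1:split + 1], ino[:split]),
                rebuild(pre[split + 1:], ino[split + 1:]))


# Hypothetical example: A is the root, B and C are its children, D and E are children of B.
example_tree = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
```

For this tree the pre-order sequence is A B D E C, the in-order sequence is D B E A C and the post-order sequence is D E B C A.
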
    • Balanced binary tree

      • For every node, the absolute value of the height difference between its left and right subtrees does not exceed 1
      • A complete binary tree is always a balanced binary tree; leaf nodes always satisfy the condition
    • Binary sort tree (binary search tree)

      • The key of the root is greater than every key in its left subtree and less than every key in its right subtree. The left and right subtrees are themselves binary sort trees
      • In-order traversal yields an ordered (ascending) sequence
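
That an in-order traversal of a binary sort tree yields an ordered sequence can be seen with a small insertion routine. A sketch; `BSTNode`, `bst_insert`, `bst_inorder` and the sample keys are hypothetical, and duplicate keys are simply ignored here.

```python
class BSTNode:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def bst_insert(root, key):
    """Insert key below root and return the (possibly new) root."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    elif key > root.key:
        root.right = bst_insert(root.right, key)
    return root                            # equal keys are ignored in this sketch


def bst_inorder(root):
    return [] if root is None else bst_inorder(root.left) + [root.key] + bst_inorder(root.right)


def build_bst(keys):
    root = None
    for key in keys:
        root = bst_insert(root, key)
    return root
```
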
    • Optimal binary tree (Huffman tree)

      • The class of trees with the shortest weighted path length (WPL)
      • Constructing an optimal binary tree (the tree is not unique, but the WPL is the same)
        • Scanning from front to back, pick the two smallest weights
        • Make the smaller one the left child and the larger one the right child, then append their sum to the end of the list
        • When weights are equal, pick in front-to-back order
      • It contains only nodes of degree 0 and degree 2; with n leaves the total number of nodes is 2n-1
    • Huffman coding

      • Left branch 0, right branch 1; encoded length = sum of (code length × frequency) over all characters; it is a greedy strategy
      • Compression ratio
        • (fixed-length encoding - Huffman encoding) / fixed-length encoding
        • Fixed-length code of x bits: the smallest x with 2^x >= number of distinct characters
        • Huffman encoding length: code length × frequency of occurrence, summed over the characters
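
The compression ratio can be worked out in code. Note that this sketch swaps in a priority queue (`heapq`) for the front-to-back scan described above; the tree it builds may differ in shape, but the weighted path length is the same. The function names are hypothetical, and at least two distinct characters are assumed.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Map each character to the length in bits of its Huffman code."""
    tie = count()                          # tie-breaker so the heap never compares dicts
    heap = [(w, next(tie), {ch: 0}) for ch, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)    # the two smallest weights
        w2, _, d2 = heapq.heappop(heap)
        merged = {ch: depth + 1 for ch, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]


def compression_ratio(freqs):
    """(fixed-length bits - Huffman bits) / fixed-length bits."""
    lengths = huffman_code_lengths(freqs)
    huffman_bits = sum(lengths[ch] * w for ch, w in freqs.items())
    bits_per_char = (len(freqs) - 1).bit_length()   # smallest x with 2**x >= number of characters
    fixed_bits = bits_per_char * sum(freqs.values())
    return (fixed_bits - huffman_bits) / fixed_bits
```

With the made-up frequencies a:45, b:13, c:12, d:16, e:9, f:5, the Huffman encoding takes 224 bits against 300 bits for a 3-bit fixed-length code, a ratio of 76/300.
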
    • Threaded binary tree

      • To record each node's predecessor and successor in a given traversal order, the null pointer fields are used to store them
  • Graph

    • With n vertices, an undirected complete graph has n(n-1)/2 edges and a directed complete graph has n(n-1) arcs
    • In both directed and undirected graphs, the sum of the degrees = number of edges * 2
    • Undirected connected graph: every pair of vertices is connected by a path
      • at least n-1 edges
      • at most n(n-1)/2 edges
    • Directed strongly connected graph: every pair of vertices can reach each other in both directions
      • at least n arcs
      • at most n(n-1) arcs
    • Path length: the number of edges the path passes through
    • Adjacency matrix: records the relationship between vertices; suits dense graphs (many edges)
      • For an undirected graph the matrix is symmetric; the number of non-zero entries in row (or column) i is the degree of vertex i, and there are 2e non-zero entries in total
      • For a directed graph the matrix is generally asymmetric; the non-zero entries in a row give the out-degree and those in a column give the in-degree, and there are e non-zero entries in total
    • Adjacency list: suits sparse graphs (few edges)
      • map network map
    • Graph traversal: starting from some vertex, visit every vertex exactly once
      • Depth-first search (DFS) uses a stack; breadth-first search (BFS) uses a queue
        • The traversal sequence is not unique
        • Time complexity for a directed graph: O(n²) with an adjacency matrix, O(n+e) with an adjacency list
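
The pairing of DFS with a stack and BFS with a queue can be sketched on a graph stored as an adjacency list (a dict mapping each vertex to its neighbours). The function names and the sample graph are hypothetical, and the order produced is only one of the valid sequences.

```python
from collections import deque


def dfs(adj, start):
    """Depth-first traversal using an explicit stack."""
    visited, order, stack = set(), [], [start]
    while stack:
        v = stack.pop()
        if v in visited:
            continue
        visited.add(v)
        order.append(v)
        stack.extend(reversed(adj[v]))     # reversed so the first-listed neighbour is popped first
    return order


def bfs(adj, start):
    """Breadth-first traversal using a queue."""
    visited, order, queue = {start}, [], deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in visited:
                visited.add(w)
                queue.append(w)
    return order


# Hypothetical undirected graph with edges 1-2, 1-3, 2-4 and 3-4.
sample_graph = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3]}
```

Starting from vertex 1, this DFS visits 1, 2, 4, 3 and this BFS visits 1, 2, 3, 4.
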
    • Topological sort
      • AOV network (a directed acyclic graph)
        • If there is a directed path from vertex Vi to vertex Vj, then Vi is a predecessor of Vj
        • If <Vi,Vj> is an arc, then Vi is the direct predecessor of Vj and Vj is the direct successor of Vi
        • If there is a path from Vi to Vj, there can be no path from Vj to Vi
      • Topologically sorting an AOV network
        • Select a vertex with in-degree 0 in the network and output it
        • Delete that vertex and the arcs associated with it from the network
        • Repeat the previous two steps until no vertex with in-degree 0 remains in the network
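
The three steps above can be sketched as follows. Instead of physically deleting vertices, this version lowers the in-degree of each successor, which has the same effect; the function name and the sample network are hypothetical.

```python
from collections import deque


def topological_sort(vertices, arcs):
    """Return one topological order, or raise ValueError if the graph contains a cycle."""
    indegree = {v: 0 for v in vertices}
    successors = {v: [] for v in vertices}
    for u, v in arcs:                      # each arc is a pair (u, v), meaning <u, v>
        successors[u].append(v)
        indegree[v] += 1
    ready = deque(v for v in vertices if indegree[v] == 0)
    order = []
    while ready:
        u = ready.popleft()                # select a vertex with in-degree 0 and output it
        order.append(u)
        for v in successors[u]:            # "delete" the arcs leaving u
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if len(order) != len(vertices):
        raise ValueError("vertices remain but none has in-degree 0: the graph has a cycle")
    return order
```

For the vertices V1 to V4 with arcs <V1,V2>, <V1,V3>, <V2,V4> and <V3,V4>, the result is V1, V2, V3, V4.
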
  • Searching

    • A search table is a collection of data elements of the same type.
    • A key is the value of some data item of a data element: a primary key identifies one element uniquely, a secondary key may match several. Given a key, the search succeeds if a matching element is found; otherwise it fails and returns a null pointer
    • Static search: sequential search, binary search, block search
    • Dynamic search: binary sort tree, balanced binary tree, B-tree, hash table
    • Average search length of a successful sequential search: (n+1)/2
    • Binary (half-interval) search (the midpoint is normally rounded down)
      • Requires sequential storage with the keys in sorted (ascending) order
      • Average search length: approximately log₂(n+1) - 1
      • At most ⌊log₂n⌋ + 1 comparisons (⌊ ⌋ rounds down)
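
The bound on the number of comparisons can be checked with a binary search that counts its probes. A sketch; the function name is hypothetical, the keys are assumed to be in ascending order, and positions are reported from 1 with 0 meaning "not found".

```python
def binary_search(keys, target):
    """Return (1-based position or 0, number of probes) for an ascending list of keys."""
    low, high, probes = 0, len(keys) - 1, 0
    while low <= high:
        mid = (low + high) // 2            # midpoint rounded down
        probes += 1
        if keys[mid] == target:
            return mid + 1, probes
        if keys[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return 0, probes
```

With 11 keys, no successful search needs more than ⌊log₂11⌋ + 1 = 4 probes.
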
    • Hash table
      • A hash function maps keys to storage addresses. This process is called hashing, and the resulting storage address is the hash address
      • A collision occurs when different keys have the same hash function value and so map to the same address; such keys are called synonyms
      • Collisions are inevitable and can only be reduced as far as possible
      • Hash function construction: division-remainder method, H(key) = key % m = address
        • m is generally a prime number close to, but not greater than, n (for n elements)
        • Wherever possible, let every component of the key contribute to the hash
      • Handling collisions
        • Open addressing
          • Hi = (H(key) + di) % m, where m is the table length; linear probing uses di = 1, 2, 3, ...
          • Quadratic probing uses di = 1, -1, 4, -4, 9, -9, ..., k², -k²
        • Chaining: add a link field to each record to store the address of the next record with the same hash function value
      • Hash table lookup
        • Efficiency depends on: the hash function, the collision-handling method, and the load factor α of the hash table
          • α = number of records in the table / length of the hash table
          • The smaller α is, the lower the chance of collision; the larger α is, the higher the chance.
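
Open addressing with linear probing can be sketched as below, assuming the usual form Hi = (H(key) + di) % m with H(key) = key % m and di = 1, 2, 3, .... The function names and the sample keys are hypothetical.

```python
def build_hash_table(keys, m):
    """Place integer keys in a table of length m using key % m and linear probing."""
    table = [None] * m
    for key in keys:
        for d in range(m):                 # d = 0 is the home address, then d = 1, 2, ...
            address = (key % m + d) % m
            if table[address] is None:
                table[address] = key
                break
        else:
            raise OverflowError("hash table is full")
    return table


def load_factor(table):
    """Number of records stored divided by the table length."""
    return sum(slot is not None for slot in table) / len(table)
```

With m = 13 and the keys 12, 25, 38, 15, 16, the first three keys all hash to address 12, so 25 wraps round to address 0 and 38 to address 1.
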
    • Heap
      • Max-heap (large-top heap): ki >= k2i and ki >= k2i+1
      • Min-heap (small-top heap): ki <= k2i and ki <= k2i+1
      • Draw the heap as a binary tree in sequential storage to see how a max-heap (or min-heap) is built
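
The max-heap condition and one common way of building a max-heap (sifting down from the last non-leaf node) can be sketched as follows. Index 0 of the list is left unused so that the children of position i sit at 2i and 2i+1, as in the conditions above; the function names are hypothetical.

```python
def is_max_heap(keys):
    """Check that keys[i] >= keys[2i] and keys[i] >= keys[2i+1]; keys[0] is unused."""
    n = len(keys) - 1
    return all(keys[i] >= keys[c]
               for i in range(1, n + 1)
               for c in (2 * i, 2 * i + 1) if c <= n)


def sift_down(keys, i, n):
    """Move keys[i] down until both of its children are no larger."""
    while 2 * i <= n:
        child = 2 * i
        if child + 1 <= n and keys[child + 1] > keys[child]:
            child += 1                     # pick the larger child
        if keys[i] >= keys[child]:
            break
        keys[i], keys[child] = keys[child], keys[i]
        i = child


def build_max_heap(values):
    """Return a 1-based max-heap (with a None placeholder at index 0)."""
    keys = [None] + list(values)
    n = len(values)
    for i in range(n // 2, 0, -1):         # from the last non-leaf node up to the root
        sift_down(keys, i, n)
    return keys
```
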
  • Sorting

    • Rearranges records by their keys so that, after sorting, the keys are in increasing (or decreasing) order

    • Internal sorting: the records being sorted are all held in memory

    • External sorting
      [Figure: external sorting]

    • Counting sort suits sequences whose values fall in a small range, such as only the digits 0-9; it counts the occurrences of each value

    • Straight insertion sort does not fix an element in its final position on each pass; simple selection sort does

    • Heap sort and quicksort (divide and conquer) fix an element in its final position on each pass; merge sort (divide and conquer) does not
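
The counting sort mentioned in this section can be sketched for the case where every value is a digit from 0 to 9. The function name is hypothetical.

```python
def counting_sort_digits(values):
    """Sort a sequence whose values are all integers in the range 0-9."""
    counts = [0] * 10
    for v in values:
        counts[v] += 1                     # count the occurrences of each digit
    result = []
    for digit, occurrences in enumerate(counts):
        result.extend([digit] * occurrences)
    return result
```
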

Origin blog.csdn.net/weixin_45113182/article/details/128679086