[Data Structures] Sorting and Searching Review

Searching and sorting

Searching

  1. Static lookup tables vs. dynamic lookup tables
  2. Internal search (in memory) vs. external search (on external storage)
  3. Average search length (ASL)
    The average search length ASL is defined as
    ASL = Σ(p_i × c_i) = (1/n) × Σ c_i
    where n is the number of elements in the lookup table, p_i is the probability of searching for the i-th element (unless otherwise specified, every element is assumed equally likely to be sought, i.e. p_i = 1/n for 1 ≤ i ≤ n), and c_i is the number of key comparisons needed to find the i-th element.
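With equal probabilities the formula collapses to a simple average of the c_i. A tiny sketch for a successful sequential search, where finding the i-th element costs i comparisons (asl_sequential is an illustrative name):

```cpp
// ASL of a successful sequential search over n equally likely keys:
// c_i = i, so ASL = (1/n) * (1 + 2 + ... + n) = (n + 1) / 2.
double asl_sequential(int n) {
    double sum = 0.0;
    for (int i = 1; i <= n; i++)
        sum += i;        // c_i = i comparisons for the i-th element
    return sum / n;      // equal probabilities p_i = 1/n
}
```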

Linear table lookup

sequential search

A "sentinel" can be used. In short, it is a boundary guard placed at one end of the table; it is optional (the search works with or without it, the sentinel just removes one bounds check per step).
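A minimal sketch of sentinel-based sequential search, assuming the elements occupy R[1..n] and slot R[0] is reserved for the sentinel (the layout and the name SeqSearch are illustrative):

```cpp
#include <vector>

// Sequential search with a sentinel: copy the key into R[0] so the loop
// needs no separate "i >= 1" bounds check. Returns 0 when the key is absent.
int SeqSearch(std::vector<int>& R, int k) {
    R[0] = k;                    // the sentinel guarantees the loop terminates
    int i = (int)R.size() - 1;
    while (R[i] != k)            // no explicit bounds test: R[0] stops us
        i--;
    return i;                    // 0 means "not found"
}
```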

Binary search

Precondition: the table must be sorted.
"Binary" means splitting the search range in two at each step.

//Non-recursive binary search
int BinSearch1(vector<int>& R,int k) {
    int n=R.size();
    int low=0,high=n-1;
    while (low<=high){
        //looking for an exact match
        int mid=(low+high)/2;
        if (k==R[mid])
            return mid;//the textbook version returns the answer right here
        if (k<R[mid])
            high=mid-1;
        else
            low=mid+1;
    }
    return -1;//not found: note this must sit inside the function body
}
//I prefer this style myself (a lower-bound search; cc[] and cnt[] are global arrays)
int tofind(int x,int l,int r) {
    int mid;
    while(l<r) {
        //this branch must stay consistent with the final check on r below
        mid=(l+r)/2;
        if(cc[mid]>=x) r=mid;//r converges onto the first position with cc[r]>=x
        else l=mid+1;
    }
    if(cc[r]==x)
        return cnt[r];
    else return 0;
}
Block search

Doesn't it look like a set-associative cache?
The block number serves as the index.
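A minimal block-search sketch, assuming fixed-size blocks of B elements and an index table idx[] holding each block's largest key (all names are illustrative):

```cpp
#include <algorithm>
#include <vector>

// Block (indexed sequential) search: scan the index to pick the candidate
// block, then search sequentially inside that block only.
int BlockSearch(const std::vector<int>& R, const std::vector<int>& idx,
                int B, int k) {
    int j = 0;
    while (j < (int)idx.size() && idx[j] < k) j++;   // find candidate block
    if (j == (int)idx.size()) return -1;             // larger than every key
    int end = std::min((int)R.size(), (j + 1) * B);
    for (int i = j * B; i < end; i++)                // sequential scan in block
        if (R[i] == k) return i;
    return -1;
}
```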

Tree table lookup - dynamic lookup table

BST

Smaller keys on the left, larger on the right;
an in-order traversal visits the keys in sorted order.

  1. Deleting a node: find the rightmost node in its left subtree (the in-order predecessor) and use it to replace the deleted node.
  2. Inserting a node: just walk down the tree step by step until an empty spot is found.
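A minimal BST insertion sketch illustrating both points, with an in-order traversal to confirm the sorted order (Node/insert/inorder are illustrative names; deletion is omitted for brevity):

```cpp
#include <vector>

// Smaller keys go left, larger go right; insertion "just walks down".
struct Node {
    int key;
    Node *left = nullptr, *right = nullptr;
    Node(int k) : key(k) {}
};

Node* insert(Node* root, int k) {
    if (!root) return new Node(k);           // empty spot found
    if (k < root->key) root->left = insert(root->left, k);
    else if (k > root->key) root->right = insert(root->right, k);
    return root;                             // duplicates are ignored
}

// In-order traversal visits the keys in sorted order.
void inorder(Node* root, std::vector<int>& out) {
    if (!root) return;
    inorder(root->left, out);
    out.push_back(root->key);
    inorder(root->right, out);
}
```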
balanced binary tree

The height balance property of AVL trees: the heights of the left and right subtrees of each node in the tree differ by at most 1.

  1. LL (fixed by a single right rotation)
  2. LR (left-rotate the left child, then right-rotate)
  3. RR (fixed by a single left rotation)
  4. RL (right-rotate the right child, then left-rotate)
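All four cases are repaired with one or two single rotations. A sketch of the right rotation that fixes the LL case (AvlNode/rotateRight are illustrative names; height bookkeeping is omitted for brevity):

```cpp
// Right rotation: the left child b becomes the new subtree root,
// and b's right subtree is re-attached as a's left subtree.
struct AvlNode {
    int key;
    AvlNode *left = nullptr, *right = nullptr;
    AvlNode(int k) : key(k) {}
};

AvlNode* rotateRight(AvlNode* a) {
    AvlNode* b = a->left;
    a->left = b->right;   // b's right subtree moves under a
    b->right = a;         // a becomes b's right child
    return b;             // b is the new root of this subtree
}
```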
Red-black tree - weakly balanced tree

1. Every node is colored either red or black.
2. The root node is black.
3. Every external (NIL) node is black.
4. If a node is red, both of its children are black.
5. For every node, all paths starting from that node contain the same number of black nodes.
Here a "path" specifically means a path from a node down to an external node among its descendants.

B+/B tree

Hash table lookup

Suppose the number of elements to be stored is n, and set aside a contiguous block of m memory units (m ≥ n).
Taking each element's key k_i (0 ≤ i ≤ n-1) as the argument, a hash function h maps k_i to the address (or relative address) h(k_i) of a memory unit,
and the element is stored in that unit.

Collisions

If two different keys k_i and k_j (i ≠ j) satisfy h(k_i) = h(k_j), the phenomenon is called a hash collision.
Elements with different keys but the same hash address are called "synonyms", and this kind of collision is also called a synonym collision. (Yes, that is a real term.)

What collisions depend on

  1. The load factor. The load factor α is the ratio of the number of elements n stored in the hash table to the size of the hash address space m, i.e. α = n/m. The smaller α is, the lower the chance of a collision; but a smaller α also means lower storage utilization.
  2. The hash function used.
  3. The collision-resolution method.
    Resolution methods:
  1. Open addressing:
    (1) Linear probing: try the next slot (+1) in sequence until an empty one is found.
    (2) Quadratic probing: d0 = h(k), di = (d0 ± i^2) mod m (a bit like hill climbing?)
  2. Chaining (the "zipper" method, somewhat like an adjacency list):
    Chaining links all synonyms together in one singly linked list.
    With this method, each slot of the hash table stores not an element itself but the head pointer of that slot's synonym list.
    Since a singly linked list can hold any number of nodes, the load factor α may be greater than 1 or at most 1, depending on the number of synonyms. Usually α = 0.75.
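A minimal chaining sketch, assuming the division hash h(k) = k mod m (ChainHash is an illustrative name):

```cpp
#include <list>
#include <vector>

// Chaining ("zipper method"): each table slot holds a linked list of
// synonyms, so any number of colliding keys fits.
struct ChainHash {
    int m;
    std::vector<std::list<int>> table;
    explicit ChainHash(int slots) : m(slots), table(slots) {}

    void insert(int k) { table[k % m].push_front(k); }  // prepend to the list

    bool find(int k) const {
        for (int x : table[k % m])   // scan only this slot's synonym list
            if (x == k) return true;
        return false;
    }
};
```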

unordered_map
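std::unordered_map is the standard library's ready-made chained hash table; a small usage sketch (countWord is an illustrative name):

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Count how often `target` occurs, using the hash table for average
// O(1) insert and find; no hand-written h(k) is needed.
int countWord(const std::vector<std::string>& words,
              const std::string& target) {
    std::unordered_map<std::string, int> freq;
    for (const auto& w : words) freq[w]++;   // insert/update by key
    auto it = freq.find(target);
    return it == freq.end() ? 0 : it->second;
}
```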

Hash function construction
  1. Direct addressing: h(k) = k + C
  2. Division method: h(k) = k mod p (p a prime, chosen to keep collisions as rare as possible)

Sorting

  1. Internal sorting vs. external sorting
  2. Sorting not based on comparisons: radix sort
  3. If the table to be sorted contains multiple elements with equal keys, and the relative order of those equal-key elements is unchanged after sorting, the method is said to be stable.
    Conversely, if the relative order of equal-key elements can change, the method is said to be unstable.

insertion sort

Insert each element from the unordered region into the ordered region.

direct insertion sort

Stable
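A minimal direct insertion sort sketch (InsertSort is an illustrative name); the strict > comparison is what keeps equal keys in order, i.e. the sort stable:

```cpp
#include <vector>

// Grow the sorted prefix one element at a time, shifting larger
// elements right to make room. Stable; O(n^2) in the worst case.
void InsertSort(std::vector<int>& R) {
    for (int i = 1; i < (int)R.size(); i++) {
        int tmp = R[i], j = i - 1;
        while (j >= 0 && R[j] > tmp) {   // strict > keeps equal keys stable
            R[j + 1] = R[j];             // the shifts dominate the cost
            j--;
        }
        R[j + 1] = tmp;
    }
}
```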

Binary insertion sort

Stable.
The sort is not really faster overall: binary search cuts comparisons, but shifting elements back still takes time.

Shell sort

Split the table into groups,
and swap elements that are far apart!
But within each group it is still insertion sort.
The number of groups (the gap) keeps shrinking.
Roughly O(n^1.58) for common gap sequences.
Unstable.
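A minimal Shell sort sketch using the common halving gap sequence (an illustrative choice; other gap sequences exist):

```cpp
#include <vector>

// Insertion sort over gapped subsequences; the gap halves each pass
// until it reaches 1, when it becomes plain insertion sort.
void ShellSort(std::vector<int>& R) {
    int n = (int)R.size();
    for (int gap = n / 2; gap > 0; gap /= 2)   // groups get smaller
        for (int i = gap; i < n; i++) {
            int tmp = R[i], j = i - gap;
            while (j >= 0 && R[j] > tmp) {     // gapped insertion step
                R[j + gap] = R[j];
                j -= gap;
            }
            R[j + gap] = tmp;
        }
}
```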

Exchange sorts

Bubble Sort

Compare adjacent elements from left to right and swap when they are out of order; repeat the pass again and again.
Only adjacent elements are ever swapped.
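A minimal bubble sort sketch with the usual early-exit flag (an illustrative addition; the plain version simply runs every pass):

```cpp
#include <utility>
#include <vector>

// Repeated adjacent compare-and-swap passes; each pass bubbles the
// largest remaining element to the end.
void BubbleSort(std::vector<int>& R) {
    int n = (int)R.size();
    for (int i = 0; i < n - 1; i++) {
        bool swapped = false;
        for (int j = 0; j < n - 1 - i; j++)
            if (R[j] > R[j + 1]) {            // neighbors out of order
                std::swap(R[j], R[j + 1]);
                swapped = true;
            }
        if (!swapped) break;                  // a clean pass: already sorted
    }
}
```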

Quick sort

Select a pivot (median-of-three is a common choice);
the pointers l and r move toward the middle from the two ends, and whenever the element on one side can stay put, the pointer on the other side moves again (like arranging playing cards).
quick sort algorithm
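A minimal hole-filling quicksort sketch matching the description above (the pivot is taken as the first element for simplicity, rather than by median-of-three):

```cpp
#include <vector>

// Hole-filling partition: lift R[l] out as the pivot, then alternately
// scan from the right and from the left, dropping elements into the hole.
void QuickSort(std::vector<int>& R, int l, int r) {
    if (l >= r) return;
    int i = l, j = r, pivot = R[l];
    while (i < j) {
        while (i < j && R[j] >= pivot) j--;   // scan in from the right
        R[i] = R[j];                          // fill the hole on the left
        while (i < j && R[i] <= pivot) i++;   // scan in from the left
        R[j] = R[i];                          // fill the hole on the right
    }
    R[i] = pivot;                             // pivot lands in its place
    QuickSort(R, l, i - 1);
    QuickSort(R, i + 1, r);
}
```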

selection sort

Simple selection sort
Select the smallest element from the unordered region; the simplest way is to compare elements one by one, e.g. pick the smallest element R[minj] out of the unordered region R[i..n-1].
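A minimal simple selection sort sketch following that description (SelectSort and minj are illustrative names):

```cpp
#include <utility>
#include <vector>

// Scan the unordered region R[i..n-1] for its smallest element and
// swap it into position i.
void SelectSort(std::vector<int>& R) {
    int n = (int)R.size();
    for (int i = 0; i < n - 1; i++) {
        int minj = i;
        for (int j = i + 1; j < n; j++)
            if (R[j] < R[minj]) minj = j;     // one-by-one comparison
        if (minj != i) std::swap(R[i], R[minj]);
    }
}
```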

Heap sort

Unstable
space O(1)
time O(nlogn)
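A minimal heap sort sketch; the max-heap lives inside the array itself, which is why the extra space is O(1):

```cpp
#include <utility>
#include <vector>

// Sift R[i] down within the first n elements of the max-heap.
void sift(std::vector<int>& R, int i, int n) {
    while (2 * i + 1 < n) {
        int c = 2 * i + 1;                     // left child
        if (c + 1 < n && R[c + 1] > R[c]) c++; // pick the larger child
        if (R[i] >= R[c]) break;               // heap property holds
        std::swap(R[i], R[c]);
        i = c;
    }
}

void HeapSort(std::vector<int>& R) {
    int n = (int)R.size();
    for (int i = n / 2 - 1; i >= 0; i--)       // build the heap bottom-up
        sift(R, i, n);
    for (int i = n - 1; i > 0; i--) {
        std::swap(R[0], R[i]);                 // move the max to the end
        sift(R, 0, i);                         // restore the heap
    }
}
```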

merge sort

One you end up writing by hand all year round,
and it is stable.
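A minimal merge sort sketch; the <= in the merge step is exactly what makes it stable:

```cpp
#include <vector>

// Merge the sorted halves R[l..mid] and R[mid+1..r] via a temp buffer.
void Merge(std::vector<int>& R, int l, int mid, int r) {
    std::vector<int> tmp;
    int i = l, j = mid + 1;
    while (i <= mid && j <= r)
        tmp.push_back(R[i] <= R[j] ? R[i++] : R[j++]);  // <= keeps stability
    while (i <= mid) tmp.push_back(R[i++]);
    while (j <= r)   tmp.push_back(R[j++]);
    for (int k = 0; k < (int)tmp.size(); k++) R[l + k] = tmp[k];
}

void MergeSort(std::vector<int>& R, int l, int r) {
    if (l >= r) return;
    int mid = (l + r) / 2;
    MergeSort(R, l, mid);       // sort the left half
    MergeSort(R, mid + 1, r);   // sort the right half
    Merge(R, l, mid, r);        // merge them
}
```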

Radix sort

Least significant digit first (LSD) / most significant digit first (MSD):
the name says which digit is processed first.
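A minimal LSD radix sort sketch for non-negative integers (the ten-bucket layout is an illustrative choice):

```cpp
#include <algorithm>
#include <vector>

// Distribute into 10 buckets by the current digit, least significant
// first; collecting buckets in order keeps each pass stable.
void RadixSort(std::vector<int>& R) {
    int mx = 0;
    for (int x : R) mx = std::max(mx, x);
    for (int exp = 1; mx / exp > 0; exp *= 10) {
        std::vector<std::vector<int>> bucket(10);
        for (int x : R)
            bucket[(x / exp) % 10].push_back(x);   // distribute by digit
        R.clear();
        for (auto& b : bucket)
            for (int x : b) R.push_back(x);        // collect in order
    }
}
```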

Comparing the sorting algorithms

Being able to read the code is enough for most of these; merge sort is the one you must be able to write.

Visualization - a good thing



Origin blog.csdn.net/qq_39440588/article/details/129150838