STL sort source code analysis

Foreword

- This article is compiled from "STL source code analysis".
Although the source code analyzed there is fairly old, the core ideas have not changed much, and the latest source has too many details for me to follow by reading it directly.

Brief introduction

sort accepts two RandomAccessIterators and rearranges all elements in that range in ascending order; a second version lets the user supply a functor as the sort criterion. All STL associative containers sort themselves automatically and do not need sort. stack, queue, and priority_queue have special entry and exit points and do not allow sorting. Of the remaining containers, vector, deque, and list, the iterators of the first two are RandomAccessIterators and are suitable for sort, while the list iterator is a BidirectionalIterator, so sort cannot be applied to a list.
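The two versions of sort and the list restriction can be illustrated with a small sketch. The helper names sorted_descending and sorted_list are made up for this example; std::sort, std::greater, and the list member function sort() are standard library facilities.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <list>
#include <vector>

// Second version of sort: a functor (here std::greater) is the sort criterion.
// Hypothetical helper name, used only for this illustration.
inline std::vector<int> sorted_descending(std::vector<int> v) {
    std::sort(v.begin(), v.end(), std::greater<int>());
    return v;
}

// std::list offers only bidirectional iterators, so std::sort does not
// compile for it; the container's own member function sort() is used instead.
inline std::list<int> sorted_list(std::list<int> l) {
    l.sort();
    return l;
}
```

Calling sorted_descending on {3, 1, 2} yields {3, 2, 1}, and sorted_list on the same values yields {1, 2, 3}.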

The STL sort algorithm uses Quick Sort when the amount of data is large, sorting segment by segment recursively. Once the amount of data in a segment falls below a certain threshold, it switches to Insertion Sort to avoid the extra overhead of too many recursive Quick Sort calls. If the recursion goes too deep, it switches to Heap Sort.

Insertion Sort

template <class RandomAccessIterator>
void __insertion_sort (RandomAccessIterator first,
                        RandomAccessIterator last){
   if(first == last) return;
   for(RandomAccessIterator i= first + 1; i != last; ++i){
      __linear_insert(first,i,value_type(first));
      // [first, i) forms a sorted subrange
   }
}
// helper function for version 1
template <class RandomAccessIterator, class T>
inline void __linear_insert(RandomAccessIterator first,
                            RandomAccessIterator last, T*){
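   T value  = *last; // remember the last element
   if(value < *first){ // the last value is smaller than the head (the head is always the minimum)
      // no comparisons needed: shift the whole range right by one
      copy_backward(first, last, last + 1);
      *first = value;
   }
   else // the last value is not smaller than the head
    __unguarded_linear_insert(last,value);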
}

// helper function for version 1
template <class RandomAccessIterator, class T>
void __unguarded_linear_insert(RandomAccessIterator last, T value){
   RandomAccessIterator next = last;
   --next;
   // inner loop of insertion sort
   // the loop ends as soon as no inversion is found
   while(value < *next){ // an inversion exists
      *last = *next;     // shift the element right
      last = next;       // move the iterator
      --next;            // step one position to the left
   }
   *last = value; // put value into its correct position
}


The function above is named unguarded_x because the inner loop of an ordinary Insertion Sort has to make two checks: whether two adjacent elements form an inversion, and whether the iterator has run out of bounds. Here the caller has already guaranteed that the minimum value sits at the left edge of the subrange, so the bounds check can be dropped and the two checks merge into one. With large amounts of data, the number of comparisons saved is considerable.

The other functions named unguarded_x below are named for the same reason.
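The difference between a guarded and an unguarded inner loop can be sketched on a plain std::vector<int>. This is an index-based illustration, not the SGI code; the function names guarded_insert, unguarded_insert, and insertion_sort are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Guarded inner loop: every step tests the bound AND the ordering.
inline void guarded_insert(std::vector<int>& v, std::size_t i) {
    int value = v[i];
    std::size_t j = i;
    while (j > 0 && value < v[j - 1]) {  // two tests per step
        v[j] = v[j - 1];
        --j;
    }
    v[j] = value;
}

// Unguarded inner loop: only the ordering is tested.
// Precondition (the caller's duty): v[0] is not greater than v[i],
// so the scan is certain to stop before it runs past index 0.
inline void unguarded_insert(std::vector<int>& v, std::size_t i) {
    assert(i > 0 && !(v[i] < v[0]));  // debug-only check of the precondition
    int value = v[i];
    std::size_t j = i;
    while (value < v[j - 1]) {  // one test per step
        v[j] = v[j - 1];
        --j;
    }
    v[j] = value;
}

// Insertion sort in the style of __linear_insert: a new minimum is moved
// to the front directly, everything else goes through the unguarded loop.
inline void insertion_sort(std::vector<int>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        if (v[i] < v[0]) {
            int value = v[i];
            for (std::size_t j = i; j > 0; --j) v[j] = v[j - 1];
            v[0] = value;
        } else {
            unguarded_insert(v, i);
        }
    }
}
```

Because the "new minimum" case is handled separately, unguarded_insert is only ever called when its precondition holds.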

Median-of-Three (three-point median)

// returns the median of a, b, c
template <class T>
inline const T& __median(const T& a, const T& b, const T& c) {
    if (a < b)
        if (b < c)         // a<b<c
            return b;
        else if (a < c) // a < b, b >= c, a < c
            return c;
        else
            return a;
    else if (a < c) // c > a >=b
        return a;
    else if (b < c)  //a >= b, a >= c, b < c
        return c;
    return b;
}
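As a usage sketch, the median of the first, middle, and last elements can serve as the pivot. The names median_of_three and choose_pivot are hypothetical; the comparison logic follows __median above.

```cpp
#include <cassert>
#include <vector>

// Median of three values, using the same comparisons as __median.
template <class T>
const T& median_of_three(const T& a, const T& b, const T& c) {
    if (a < b) {
        if (b < c) return b;     // a < b < c
        return (a < c) ? c : a;  // c <= b, so the median is the larger of a and c
    }
    if (a < c) return a;         // b <= a < c
    return (b < c) ? c : b;      // c <= a, so the median is the larger of b and c
}

// Picks a pivot from the first, middle, and last element of a non-empty vector.
inline int choose_pivot(const std::vector<int>& v) {
    assert(!v.empty());
    return median_of_three(v.front(), v[v.size() / 2], v.back());
}
```

For an already sorted input such as {1, 2, 3, 4, 5} this picks 3, the true median, which is exactly the case where always taking the first element as pivot would partition worst.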

Partitioning (division)

SGI STL provides a partition function whose return value is the first position of the right segment after partitioning:

// version 1
template <class RandomAccessIterator, class T>
RandomAccessIterator __unguarded_partition(RandomAccessIterator first,
    RandomAccessIterator last,
    T pivot) {
    while (true) {
        while (*first < pivot) ++first; // first stops at an element >= pivot
        --last;  // adjust
        while (pivot < *last) --last; // last stops at an element <= pivot
        // the first < last test below only works for random access iterators
        if (!(first < last)) return first; // the iterators have crossed: end the loop
        iter_swap(first, last); // swap the large and small values
        ++first; // adjust
    }
}
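The partition step can be traced on a concrete sequence with an index-based sketch. The function name unguarded_partition (without the leading underscores) is hypothetical, and the sketch assumes the pivot is the median of three elements of the range, which is what keeps both scans inside the range without a bounds check.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Partition of v[first, last) around pivot, mirroring __unguarded_partition.
// Assumption: pivot is a median of three elements of the range.
// Returns the index of the first element of the right part.
inline std::size_t unguarded_partition(std::vector<int>& v, std::size_t first,
                                       std::size_t last, int pivot) {
    assert(first < last);
    while (true) {
        while (v[first] < pivot) ++first;   // stop at an element >= pivot
        --last;
        while (pivot < v[last]) --last;     // stop at an element <= pivot
        if (!(first < last)) return first;  // scans have crossed
        std::swap(v[first], v[last]);
        ++first;
    }
}
```

With v = {5, 3, 8, 1, 9, 2, 7} the median of 5, 1, and 7 is 5. Partitioning around 5 rearranges the vector to {2, 3, 1, 8, 9, 5, 7} and returns 3: everything before index 3 is at most 5, everything from index 3 on is at least 5.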

Threshold

For a small sequence of only a dozen or so elements, a complex algorithm such as Quick Sort, which may require a great deal of work, is not worth it. With small amounts of data, even a simple Insertion Sort may be faster than Quick Sort, because Quick Sort generates many recursive function calls even for tiny sequences.
The algorithm should therefore be evaluated and optimized with a suitable threshold.

final insertion sort

There is never too much optimization, as long as we do not act rashly. If subsequences below a certain size are left in an "almost sorted but not finished" state, and a single Insertion Sort pass at the end completes the sort for all of them at once, the result is generally considered more efficient than sorting every subsequence completely by itself. This is because Insertion Sort performs very well on "almost sorted" sequences.
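Why Insertion Sort likes "almost sorted" input can be made concrete by counting the work it does. The sketch below is an illustration only; insertion_sort_count_shifts is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Insertion sort that also reports how many element shifts it performed.
// Each shift removes exactly one inversion, so the count equals the number
// of inversions in the input.
inline std::size_t insertion_sort_count_shifts(std::vector<int>& v) {
    std::size_t shifts = 0;
    for (std::size_t i = 1; i < v.size(); ++i) {
        int value = v[i];
        std::size_t j = i;
        while (j > 0 && value < v[j - 1]) {
            v[j] = v[j - 1];
            --j;
            ++shifts;
        }
        v[j] = value;
    }
    return shifts;
}
```

On the nearly sorted sequence {2, 1, 4, 3, 6, 5, 8, 7} the function performs 4 shifts; on the reversed sequence {8, 7, 6, 5, 4, 3, 2, 1} it performs 28. After the Quick Sort phase, no element is farther than one small block away from its final position, so the final pass stays close to the cheap case.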

introsort

Improper pivot selection leads to improper partitioning, which makes Quick Sort degrade rapidly. In 1996 David R. Musser proposed a hybrid sorting algorithm, Introspective Sorting, or IntroSort for short. In most cases it behaves the same as the median-of-3 Quick Sort described above, but when partitioning starts to show a tendency toward quadratic behavior, it detects this itself and switches to Heap Sort, keeping its efficiency at Heap Sort's O(N log N).

// version 1
// important: sort() only works with RandomAccessIterators
template <class RandomAccessIterator>
inline void sort(RandomAccessIterator first,
    RandomAccessIterator last) {
    if (first != last) {
        __introsort_loop(first, last, value_type(first), __lg(last - first) * 2);
        __final_insertion_sort(first, last);
    }
}

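// __lg() is used to limit how far partitioning may degenerate:
// it finds the largest k such that 2^k <= n, e.g. n = 7 gives k = 2, n = 20 gives k = 4, n = 8 gives k = 3
template<class Size>
inline Size __lg(Size n) {
    Size k;
    for (k = 0; n > 1; n >>= 1) ++k;
    return k;
}

// With 40 elements, the last argument of __introsort_loop() is 5*2, meaning at most 10 levels of partitioning are allowed.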
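The depth-limit arithmetic can be checked with a small non-template sketch. floor_log2 and depth_limit_for are hypothetical names; the loop is the same shift-and-count idea as __lg().

```cpp
#include <cassert>

// floor(log2(n)) for n >= 1, computed by repeated right shifts.
inline int floor_log2(int n) {
    assert(n >= 1);
    int k = 0;
    for (; n > 1; n >>= 1) ++k;
    return k;
}

// Depth budget that sort() passes to the introsort loop: 2 * floor(log2(n)).
inline int depth_limit_for(int n) { return floor_log2(n) * 2; }
```

For n = 40 the shifts go 40, 20, 10, 5, 2, 1, which is five steps, so floor_log2(40) is 5 and the depth limit is 10.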

The IntroSort algorithm is as follows:

// version 1
// note: many of the iterator operations in this function only work with RandomAccessIterators
template <class RandomAccessIterator, class T, class Size>
void __introsort_loop(RandomAccessIterator first,
    RandomAccessIterator last, T*,
    Size depth_limit) {
    // __stl_threshold is a global constant, defined earlier as const int 16
    while (last - first > __stl_threshold) { // > 16
        if (depth_limit == 0) {             // partitioning has degenerated
            partial_sort(first, last, last); // switch to heap sort
            return;
        }
        --depth_limit;
        // median-of-3 partition: choose a good enough pivot and determine the cut point
        // the cut point ends up in the iterator cut
        RandomAccessIterator cut = __unguarded_partition
        (first, last, T(__median(
            *first,
            *(first + (last - first) / 2),
            *(last - 1)
        )));
        // sort the right part recursively
        __introsort_loop(cut, last, value_type(first), depth_limit);
        last = cut;
        // now go back to the while loop to sort the left part
        // this style is harder to read and is not noticeably more efficient
        // RW STL uses the usual textbook style (recursing directly on the left and right parts), which is easier to read
    }
}

The function first checks the size of the sequence. __stl_threshold is a global integer constant, defined as follows:
const int __stl_threshold = 16;
After checking the number of elements, it checks the partitioning depth. If the depth exceeds the specified limit (explained in the previous section), it switches to partial_sort(), which is implemented with Heap Sort.

After these checks, the procedure is exactly the same as Quick Sort: the pivot is chosen with the median-of-3 method, __unguarded_partition() is called to find the cut point, and IntroSort is then applied recursively to the left and right segments.

When __introsort_loop() finishes, [first, last) consists of many subsequences of at most 16 elements each. Each subsequence is sorted to a considerable degree but not completely (because once the number of elements drops to __stl_threshold, no further sorting is done). Control returns to the parent function sort(), which then calls __final_insertion_sort().
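The whole flow described above can be condensed into one compilable sketch built on standard library pieces. It is an approximation under stated assumptions, not the SGI implementation: the names kThreshold, median3, partition_by, intro_loop, and intro_sort are hypothetical, the heap sort fallback uses std::make_heap and std::sort_heap instead of partial_sort, and the final pass is a plain insertion sort rather than the guarded/unguarded split.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

const int kThreshold = 16;  // hypothetical name standing in for __stl_threshold

// Median of three values (same role as __median).
template <class T>
T median3(const T& a, const T& b, const T& c) {
    return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Partition in the style of __unguarded_partition; relies on pivot being a
// median of three elements of the range so neither scan leaves the range.
template <class It, class T>
It partition_by(It first, It last, const T& pivot) {
    while (true) {
        while (*first < pivot) ++first;
        --last;
        while (pivot < *last) --last;
        if (!(first < last)) return first;
        std::iter_swap(first, last);
        ++first;
    }
}

template <class It>
void intro_loop(It first, It last, int depth_limit) {
    assert(depth_limit >= 0);
    while (last - first > kThreshold) {
        if (depth_limit == 0) {           // partitioning has degenerated
            std::make_heap(first, last);
            std::sort_heap(first, last);  // heap sort, O(N log N) worst case
            return;
        }
        --depth_limit;
        typename std::iterator_traits<It>::value_type pivot =
            median3(*first, *(first + (last - first) / 2), *(last - 1));
        It cut = partition_by(first, last, pivot);
        intro_loop(cut, last, depth_limit);  // right part: recursion
        last = cut;                          // left part: next loop iteration
    }
}

template <class It>
void intro_sort(It first, It last) {
    int depth_limit = 0;
    for (auto n = last - first; n > 1; n >>= 1) depth_limit += 2;  // 2 * floor(log2 n)
    intro_loop(first, last, depth_limit);
    // Final pass: one insertion sort over the almost sorted range.
    for (It i = first; i != last; ++i)
        std::rotate(std::upper_bound(first, i, *i), i, i + 1);
}
```

A call such as intro_sort(v.begin(), v.end()) on a std::vector<int> sorts it in ascending order. Calling intro_loop with a depth limit of 0 exercises only the heap sort branch.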

// version 1
template <class RandomAccessIterator>
void __final_insertion_sort(RandomAccessIterator first,
    RandomAccessIterator last)
{
    if (last - first > __stl_threshold) { // > 16
        __insertion_sort(first, first + __stl_threshold);
        __unguarded_insertion_sort(first + __stl_threshold, last);
    }
    else
        __insertion_sort(first, last);
}

This function first checks whether the number of elements is greater than 16. If not, it calls __insertion_sort() on the whole range. If so, it splits [first, last) into a subsequence of length 16 and a second subsequence with the rest, and calls __insertion_sort() on the first and __unguarded_insertion_sort() on the second. The source of the former was shown earlier; the latter is as follows:

// version 1
template <class RandomAccessIterator>
inline void __unguarded_insertion_sort(RandomAccessIterator first,
    RandomAccessIterator last) {
    __unguarded_insertion_sort_aux(first, last, value_type(first));
}
// version 1
template <class RandomAccessIterator,class T>
void __unguarded_insertion_sort_aux(RandomAccessIterator first,
    RandomAccessIterator last,
    T*) {
    for (RandomAccessIterator i = first; i != last; ++i)
        __unguarded_linear_insert(i, T(*i));
}

That is the most interesting part of the SGI STL sort. For comparison, the source of the RW STL sort() is listed below.
The RW version uses plain Quick Sort, not IntroSort.

RW STL sort()

template <class RandomAccessIterator>
inline void sort(RandomAccessIterator first,
    RandomAccessIterator last)
{
    if (!(first == last))
    {
        __quick_sort_loop(first, last);
        __final_insertion_sort(first, last);
    }
}

template <class RandomAccessIterator>
inline void __quick_sort_loop(RandomAccessIterator first,
    RandomAccessIterator last)
{
    __quick_sort_loop_aux(first, last, _RWSTD_VALUE_TYPE(first));
}

template <class RandomAccessIterator,class T>
void __quick_sort_loop_aux(RandomAccessIterator first,
    RandomAccessIterator last,
    T*)
{
    while (last - first > __stl_threshold)
    {
        //median-of-3 partitioning
        RandomAccessIterator cut = __unguarded_partition(first, last, T(__median(*first, *(first + (last - first) / 2), *(last - 1)));
        if (cut - first >= last - cut)
        {
            __quick_sort_loop(cut, last); // recurse on the right segment
            last = cut;
        }
        else
        {
            __quick_sort_loop(first,cut);
            first = cut;
        }
    }
}
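The loop above always recurses into the smaller of the two parts and keeps iterating on the larger one. One effect of that choice is that the recursion depth stays logarithmic even when the partitions are lopsided. The sketch below illustrates this; it is not the RW code. The name quick_sort_smaller_first is hypothetical, the pivot is simply the middle element, and a three-way split with std::partition replaces __unguarded_partition.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Quick Sort that recurses into the smaller part and loops on the larger one.
// Returns the deepest recursion level reached (the first call is level 1).
template <class It>
int quick_sort_smaller_first(It first, It last, int depth = 1) {
    typedef typename std::iterator_traits<It>::value_type T;
    int max_depth = depth;
    while (last - first > 1) {
        T pivot = *(first + (last - first) / 2);
        // Three-way split: [first, mid1) < pivot, [mid1, mid2) == pivot, [mid2, last) > pivot.
        It mid1 = std::partition(first, last, [&pivot](const T& x) { return x < pivot; });
        It mid2 = std::partition(mid1, last, [&pivot](const T& x) { return !(pivot < x); });
        if (mid1 - first < last - mid2) {
            // Left part is smaller: recurse on it, continue with the right part.
            max_depth = std::max(max_depth, quick_sort_smaller_first(first, mid1, depth + 1));
            first = mid2;
        } else {
            // Right part is smaller (or equal): recurse on it, continue with the left part.
            max_depth = std::max(max_depth, quick_sort_smaller_first(mid2, last, depth + 1));
            last = mid1;
        }
    }
    return max_depth;
}
```

Each recursive call receives less than half of the current range, so for 1024 elements the returned depth cannot exceed 11. The total running time can still be quadratic with bad pivots, which is the problem IntroSort addresses.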


Origin www.cnblogs.com/shcnee/p/12233580.html