Sword Pointing to Offer (剑指 Offer, C++) - JZ41: Median in a data stream (algorithm: sorting)

Author: Zhai Tianbao Steven
Copyright statement: The copyright belongs to the author. For commercial reprint, please contact the author for authorization. For non-commercial reprint, please indicate the source

Problem description:

How do you get the median of a data stream? If an odd number of values has been read from the stream, the median is the middle value after all values are sorted. If an even number of values has been read, the median is the average of the two middle values after sorting. We use the Insert() method to read from the data stream and the GetMedian() method to obtain the median of the data read so far.

Data range: the count of numbers in the data stream satisfies 1≤n≤1000, and each value satisfies 1≤val≤1000

Advanced: space complexity O(n), time complexity O(n log n)

Example:

Input:

[5,2,3,4,1,6,7,0,8]

Return value:

"5.00 3.50 3.00 3.50 3.00 3.50 4.00 3.50 4.00 "

Explanation:

5, 2, 3, ... arrive one by one from the data stream, and the medians obtained after each arrival are 5, (5+2)/2 = 3.5, 3, ...
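To make the example concrete, here is a minimal brute-force sketch that recomputes the median of every prefix by sorting a copy. The helper name prefixMedians and the two-decimal formatting with a trailing space are assumptions made for illustration (the formatting mirrors the sample return value); they are not part of the original problem interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: returns the median of each prefix of `stream`,
// formatted with two decimals and followed by a space.
std::string prefixMedians(const std::vector<int>& stream) {
    std::string out;
    std::vector<int> seen;
    for (int x : stream) {
        seen.push_back(x);
        std::vector<int> sorted = seen;            // brute force: sort a copy
        std::sort(sorted.begin(), sorted.end());
        const std::size_t n = sorted.size();
        const double median = (n % 2 == 1)
            ? static_cast<double>(sorted[n / 2])                 // odd count: middle value
            : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;         // even count: mean of the two middle values
        char buf[32];
        std::snprintf(buf, sizeof(buf), "%.2f ", median);
        out += buf;
    }
    return out;
}
```

Calling prefixMedians({5, 2, 3, 4, 1, 6, 7, 0, 8}) yields the string shown in the return value above.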

Problem-solving ideas:

This is a sorting problem. There are three ways to solve it.

1) Quick Sort

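       Append each value as it arrives. When the median is requested, sort the whole array (std::sort, which is typically implemented as a quicksort variant), then return the middle value for an odd size or the average of the two middle values for an even size.

       Time complexity: O(1) amortized per Insert() and O(n log n) per GetMedian() call, because the array is sorted on every call. Space complexity O(n).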

2) Insertion sort

       Following the idea of insertion sort, find the position where each new value belongs before inserting it, so that the data always stays sorted. The median can then be read directly according to whether the size is odd or even.

       A single insertion costs O(log n) for the binary search plus up to O(n) for shifting elements, so O(n) in the worst case. Over n insertions the total is O(n^2) at worst. In the best case (every value lands at the end, so nothing is shifted) the total is O(n log n), which comes from the binary searches alone. Space complexity O(n).
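The two costs of one insertion can be seen in a small sketch. The helper insertSorted is a hypothetical name used only here; it returns the number of existing elements that had to move, which is the part that makes an insertion O(n) in the worst case.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: inserts `num` into the already-sorted vector `v`
// and returns how many existing elements were shifted right to make room.
std::size_t insertSorted(std::vector<int>& v, int num) {
    // Binary search for the first element that is not less than num: O(log n).
    auto pos = std::lower_bound(v.begin(), v.end(), num);
    // Everything from pos to the end must move one slot: O(n) in the worst case.
    const std::size_t shifted = static_cast<std::size_t>(v.end() - pos);
    v.insert(pos, num);
    return shifted;
}
```

Inserting 5, 2, 3 in that order shifts 0, 1 and 1 elements and leaves the vector as {2, 3, 5}. Feeding values in descending order shifts every existing element on each insertion, which is the quadratic worst case.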

3) Heap sort

       Use two priority queues, a max-heap and a min-heap, to split the data into a smaller half and a larger half. Each insertion rebalances the heaps so that their sizes differ by at most one. The median is then read directly from the heap tops: the top of the max-heap when the count is odd, or the average of the two tops when it is even.

       Time complexity: O(log n) per insertion, so O(n log n) for n insertions; reading the median is O(1). Space complexity O(n).
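The "dynamic balance" can be stated as two invariants: the max-heap holds either the same number of values as the min-heap or exactly one more, and the top of the max-heap never exceeds the top of the min-heap. The sketch below is an illustration under those assumptions; the names TwoHeapMedian, lower, upper and balanced are hypothetical and chosen only for this example.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical sketch of the two-heap idea with an explicit invariant check.
struct TwoHeapMedian {
    std::priority_queue<int> lower;                                        // max-heap: smaller half
    std::priority_queue<int, std::vector<int>, std::greater<int>> upper;   // min-heap: larger half

    void insert(int num) {
        lower.push(num);              // route the value through the smaller half
        upper.push(lower.top());      // hand its largest value to the larger half
        lower.pop();
        if (lower.size() < upper.size()) {   // keep lower at least as large as upper
            lower.push(upper.top());
            upper.pop();
        }
    }

    // True when both invariants described above hold.
    bool balanced() const {
        const bool sizesOk = lower.size() == upper.size() || lower.size() == upper.size() + 1;
        const bool orderOk = lower.empty() || upper.empty() || lower.top() <= upper.top();
        return sizesOk && orderOk;
    }

    // Requires at least one inserted value.
    double median() const {
        return lower.size() > upper.size()
            ? static_cast<double>(lower.top())
            : (lower.top() + upper.top()) / 2.0;
    }
};
```

Routing every value through the max-heap first is one way to guarantee the ordering invariant without comparing num against the tops explicitly; the price is a few extra heap operations per insertion, each still O(log n).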

Test code:

1) Quick Sort

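#include <algorithm>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> v;
    // Insert: append the value; sorting is deferred to GetMedian().
    void Insert(int num)
    {
        v.emplace_back(num);
    }
    // Get the median of the values read so far (assumes at least one value).
    double GetMedian()
    { 
        int size = int(v.size());
        sort(v.begin(), v.end());
        // Handle odd and even sizes separately.
        // Shifting right by 1 is the same as dividing by 2.
        if(size & 1){
            return double(v[size >> 1]);
        }
        else{
            return double(v[size >> 1] + v[(size - 1) >> 1]) / 2;
        }
    }
};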

2) Insertion sort

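#include <algorithm>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> v;
    // Insert: keep v sorted at all times.
    void Insert(int num)
    {
        // Binary search for the position where num belongs.
        auto idx = lower_bound(v.begin(), v.end(), num);
        v.insert(idx, num);
    }
    // Get the median of the values read so far (assumes at least one value).
    double GetMedian()
    { 
        int size = int(v.size());
        // Handle odd and even sizes separately.
        // Shifting right by 1 is the same as dividing by 2.
        if(size & 1){
            return double(v[size >> 1]);
        }
        else{
            return double(v[size >> 1] + v[(size - 1) >> 1]) / 2;
        }
    }
};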

3) Heap sort

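#include <functional>
#include <queue>
#include <vector>
using namespace std;

class Solution {
public:
    priority_queue<int> minp; // max-heap holding the smaller half
    priority_queue<int, vector<int>, greater<int>> maxp; // min-heap holding the larger half
    // Insert
    void Insert(int num)
    {
        // Push the new value into the max-heap first.
        minp.push(num);
        // Move the largest value of the max-heap into the min-heap.
        maxp.push(minp.top()); 
        minp.pop();
        // Rebalance so the max-heap never holds fewer values than the min-heap.
        if (minp.size() < maxp.size()){
            minp.push(maxp.top());
            maxp.pop();
        }
    }
    // Get the median (assumes at least one value has been inserted).
    double GetMedian()
    { 
        return minp.size() > maxp.size() ? static_cast<double>(minp.top()) : static_cast<double>(minp.top() + maxp.top()) / 2;
    }
};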

Origin blog.csdn.net/zhaitianbao/article/details/131697335