Written interview questions: find the median of massive data - Code World

Others 2021-01-25 12:29:53

The median is the number in the middle after sorting. It is a frequent visitor in written tests and interviews. This problem came up twice in T company's intern recruitment and campus recruitment N years ago.

The median of non-massive data

Refer to the previous article: top-K problem and random selection algorithm. Obviously, we can use:

1. Quick sort algorithm

Sort the data, then read the middle element directly.

2. Direct selection algorithm

Repeatedly select the smallest remaining element until the median position is reached.

3. Heap selection algorithm

Use a heap to select elements in order until the median is reached.

4. Random selection algorithm

Partition the data around a randomly chosen pivot and continue only in the part that contains the median.
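As a rough illustration, three of these approaches can be sketched in Python as follows. This is a minimal sketch, not the referenced article's code: the function names are hypothetical, the input is assumed to be a non-empty in-memory list, and for an even number of elements the lower of the two middle values is taken as the median (an assumption, since the text does not specify this case).

```python
import heapq
import random


def median_by_sort(data):
    """Sort a copy and read the middle element: O(n log n)."""
    ordered = sorted(data)
    return ordered[(len(ordered) - 1) // 2]  # lower median (assumption)


def median_by_heap(data):
    """Heap selection: take the k smallest elements; the last one is the median."""
    k = (len(data) - 1) // 2 + 1  # 1-based rank of the lower median
    return heapq.nsmallest(k, data)[-1]


def median_by_quickselect(data):
    """Random selection: partition around a random pivot, keep only one part."""
    items = list(data)
    k = (len(items) - 1) // 2  # 0-based rank of the lower median
    while True:
        pivot = random.choice(items)
        lower = [x for x in items if x < pivot]
        equal_count = sum(1 for x in items if x == pivot)
        if k < len(lower):
            # The median is among the elements smaller than the pivot.
            items = lower
        elif k < len(lower) + equal_count:
            # The median is the pivot itself.
            return pivot
        else:
            # The median is among the larger elements; shift the rank.
            k -= len(lower) + equal_count
            items = [x for x in items if x > pivot]
```

All three return the same value for the same input. The random selection version needs only O(n) time on average, because each round discards the part of the data that cannot contain the median.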

The median of massive data

For massive data, the methods above are impractical, because the contents of a huge file cannot all be loaded into memory. What can we do?

Refer to the previous article on bucket sort. Note that we only need the median, not a full sort. The specific steps are as follows:

Step 1: Create multiple small bucket files, each covering its own range of values. Then assign every element of the massive data set to the bucket whose range contains it, and record the number of elements in each bucket.

Step 2: From the per-bucket counts, work out which bucket contains the median. Because the buckets are ordered by value range, the cumulative counts also give the median's rank inside that bucket. Sort only that bucket to obtain the median of the massive data.
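The two steps can be sketched in Python as follows. This is only a sketch under stated assumptions, not the original author's implementation: the function name `median_of_large_file` and its parameters are hypothetical, the input file is assumed to hold one integer per line, every value is assumed to lie in a known range `[lo, hi)`, buckets have equal width, and the bucket containing the median is assumed to fit in memory. As before, the lower median is used for an even count.

```python
import os


def median_of_large_file(path, lo, hi, num_buckets, workdir):
    """Lower median of the integers in `path` (one per line), all in [lo, hi)."""
    width = -(-(hi - lo) // num_buckets)  # ceiling division: value span per bucket
    bucket_paths = [
        os.path.join(workdir, f"bucket_{i}.txt") for i in range(num_buckets)
    ]
    counts = [0] * num_buckets

    # Step 1: stream the input once, writing each value to its bucket file
    # and counting how many values each bucket receives.
    buckets = [open(p, "w") for p in bucket_paths]
    try:
        with open(path) as src:
            for line in src:
                line = line.strip()
                if not line:
                    continue
                value = int(line)
                idx = (value - lo) // width
                buckets[idx].write(f"{value}\n")
                counts[idx] += 1
    finally:
        for f in buckets:
            f.close()

    total = sum(counts)
    if total == 0:
        raise ValueError("input file contains no numbers")
    target = (total - 1) // 2  # 0-based rank of the lower median

    # Step 2: walk the cumulative counts to find the bucket holding that rank,
    # then load and sort only that bucket.
    seen = 0
    for idx, count in enumerate(counts):
        if seen + count > target:
            with open(bucket_paths[idx]) as f:
                values = sorted(int(line) for line in f)
            return values[target - seen]
        seen += count
```

Only one bucket is ever sorted in memory. If the data is heavily skewed and that bucket is still too large, the same procedure could be applied again to the bucket file with a narrower value range.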

[Figure: schematic diagram of the bucket-based steps above]

The median is actually a special order statistic. Whether the data is non-massive or massive, we can find the median quickly.

Origin blog.csdn.net/stpeace/article/details/108921752