numpy.partition usage

Features

The way np.partition works can be understood as follows: imagine the array sorted in ascending order and take the element that would sit at index kth in that sorted order as a reference element. The array is then split into two parts around it: elements smaller than the reference are placed before it, and elements larger than (or equal to) it are placed after it. This is somewhat similar to one partitioning step of quicksort. See the following example:

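import numpy as np

# 6x4 array of random integers in [1, 13)
result = np.random.randint(1, 13, (6, 4))
print(result)
# Full sort of each column, for comparison
result1 = np.sort(result, axis=0)
print(result1)
# Partition each column around the element at sorted index 2
result2 = np.partition(result, kth=2, axis=0)
print(result2)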

Here we operate along the columns (axis=0).

To check the idea described above, we first compute the fully sorted two-dimensional array result1.

Then, in each column, the element at index 2 of the sorted order (i.e., the third element) serves as the reference, and the column is split into two parts around it.

In this example run, the third element of the sorted first column is 7, so values greater than 7 are placed after it and values less than 7 are placed before it.

Similarly, the third element of the sorted second column is 8.

For the third column, the third element after sorting is 3, and so on.

Note that the elements on either side of the reference element are not sorted among themselves. In the first column, for example, the values after the reference come out in the order 10, 10, 8.
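
The behaviour described above can be checked on a small fixed array. The following is only a minimal sketch: the array values and the variable names (a, part, full and the three flags) are made up for illustration and do not come from the run discussed above.

```python
import numpy as np

# Illustrative data (hypothetical values, 6 rows x 2 columns).
a = np.array([[ 9,  4],
              [ 3, 12],
              [10,  1],
              [ 7,  8],
              [10,  6],
              [ 1, 11]])
k = 2

part = np.partition(a, kth=k, axis=0)
full = np.sort(a, axis=0)

# Row k of the partitioned array matches row k of the fully sorted array.
pivot_ok = np.array_equal(part[k], full[k])

# In each column, every value above row k is <= the reference element
# and every value below row k is >= the reference element.
before_ok = bool(np.all(part[:k] <= part[k]))
after_ok = bool(np.all(part[k + 1:] >= part[k]))

print(part[k])                        # reference element of each column: [7 6]
print(pivot_ok, before_ok, after_ok)  # True True True
```

The rows before and after row k of part may appear in any order; only these three properties are guaranteed.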

np.partition is typically used to find extreme values, such as the k-th smallest or k-th largest element.

Suppose we want to find the second-smallest number in each column. We can do this:

import numpy as np
result = np.random.randint(1, 13,(6,4))
print(result)
result1 = np.sort(result,axis=0)
print(result1)
# Row 1 of the partitioned array holds the second-smallest value of each column
result2 = np.partition(result, kth=1, axis=0)[1]
print(result2)

Of course, this does not take duplicates into account. For example, if the second column contains only the distinct values 2, 4, 6, 11 and 12, one would normally say that the second-smallest value is 4, but if 2 occurs twice the code above returns 2.
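
If duplicates should be ignored, one possible approach is to deduplicate first. The sketch below uses np.unique for this; the helper name second_smallest_distinct and the sample column are hypothetical and only serve as an illustration.

```python
import numpy as np

def second_smallest_distinct(column):
    """Return the second-smallest distinct value of a 1-D array."""
    distinct = np.unique(column)  # sorted, duplicates removed
    if distinct.size < 2:
        raise ValueError("need at least two distinct values")
    return distinct[1]

# Hypothetical column in which the smallest value 2 appears twice.
col = np.array([2, 12, 6, 2, 11, 4])

plain = np.partition(col, kth=1)[1]       # duplicates count: gives 2
distinct = second_smallest_distinct(col)  # duplicates ignored: gives 4
print(plain, distinct)
```

Note that np.unique sorts the data, so this variant gives up the speed advantage of np.partition.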

Similarly, we can select the second-largest value in each column:

import numpy as np
result = np.random.randint(1, 13,(6,4))
print(result)
result1 = np.sort(result,axis=0)
print(result1)
# Row -2 of the partitioned array holds the second-largest value of each column
result2 = np.partition(result, kth=-2, axis=0)[-2]
print(result2)
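
A negative kth counts from the end of the axis, just like a negative index. The small sketch below, using a made-up 4x2 array, shows that kth=-2 and kth=n-2 pick the same row.

```python
import numpy as np

# Illustrative data (hypothetical values, 4 rows x 2 columns).
a = np.array([[ 5,  2],
              [11,  9],
              [ 3, 12],
              [ 8,  7]])
n = a.shape[0]

# kth=-2 refers to the second-to-last position along the axis ...
second_largest = np.partition(a, kth=-2, axis=0)[-2]
# ... which is the same position as kth = n - 2.
same_thing = np.partition(a, kth=n - 2, axis=0)[n - 2]

print(second_largest)  # [8 9]
print(np.array_equal(second_largest, same_thing))  # True
```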

Why

Why use np.partition to find the k-th smallest or largest value? Because it is faster. Internally, numpy.partition does not fully sort the array first. It only puts the k-th element into its sorted position and does not care about the order of the elements before and after it, which saves work. Readers who are interested can look at the source code. In short, np.partition is one option for finding the k-th smallest or largest value.
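
To get a rough feel for the difference, the two approaches can be timed against each other. This is only a sketch: the array size, the value of k, the repetition count and the helper names are arbitrary choices, and the measured times depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
data = rng.integers(0, 1_000_000, size=200_000)  # arbitrary test data
k = 10

def kth_by_sort():
    # Fully sort, then pick the k-th smallest element.
    return np.sort(data)[k]

def kth_by_partition():
    # Only place the k-th smallest element in its sorted position.
    return np.partition(data, k)[k]

# Both approaches must return the same value.
print(kth_by_sort() == kth_by_partition())

t_sort = timeit.timeit(kth_by_sort, number=20)
t_part = timeit.timeit(kth_by_partition, number=20)
print(f"sort: {t_sort:.4f}s  partition: {t_part:.4f}s")
```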

Origin blog.csdn.net/weixin_42001089/article/details/89204112