Discussion of the Top K problem (Java implementations and the scope of application of three methods)
Top K is a favorite topic in written tests and interviews. Below are three implementations and the situations each one suits, based on my own experience.
- Merge
This method suits the case where the Top k must be found across several arrays that are each already sorted. The time complexity is O(k*m), where m is the number of arrays: each of the k output values is chosen by comparing the current heads of all m arrays. The specific implementation is as follows:
    import java.util.Arrays;
    import java.util.List;

    /**
     * Given m arrays, each sorted in decreasing order, find the k largest
     * values across all of them.
     * Suited to the merge method; time complexity O(k*m).
     */
    public class TopKByMerge {
        public int[] getTopK(List<List<Integer>> input, int k) {
            int[] index = new int[input.size()]; // next unread position in each array
            int[] result = new int[k];
            for (int i = 0; i < k; i++) {
                int maxIndex = -1; // array whose current head is the largest seen so far
                for (int j = 0; j < input.size(); j++) {
                    if (index[j] < input.get(j).size()
                            && (maxIndex == -1
                                || input.get(j).get(index[j]) > input.get(maxIndex).get(index[maxIndex]))) {
                        maxIndex = j;
                    }
                }
                if (maxIndex == -1) {
                    // All arrays are exhausted: fewer than k values exist, return what was found.
                    return Arrays.copyOf(result, i);
                }
                result[i] = input.get(maxIndex).get(index[maxIndex]);
                index[maxIndex]++;
            }
            return result;
        }
    }
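The scan above compares the heads of all m arrays for every output value. As a possible alternative (a different technique from the linear scan above, not the author's method), the heads can be kept in a max-heap, which lowers the cost to roughly O(m + k log m). The following is a minimal sketch using `java.util.PriorityQueue`; the class name `TopKMergeSketch` and the method `topK` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

final class TopKMergeSketch {

    /** Returns up to k largest values from lists that are each sorted in decreasing order. */
    static List<Integer> topK(List<List<Integer>> lists, int k) {
        // Each heap entry is {value, listIndex, positionInList}; the largest value is on top.
        PriorityQueue<int[]> heap =
                new PriorityQueue<>((x, y) -> Integer.compare(y[0], x[0]));
        for (int i = 0; i < lists.size(); i++) {
            if (!lists.get(i).isEmpty()) {
                heap.add(new int[] {lists.get(i).get(0), i, 0});
            }
        }
        List<Integer> result = new ArrayList<>();
        while (result.size() < k && !heap.isEmpty()) {
            int[] top = heap.poll();
            result.add(top[0]);
            // Advance in the list the value came from and push its next head, if any.
            int next = top[2] + 1;
            List<Integer> source = lists.get(top[1]);
            if (next < source.size()) {
                heap.add(new int[] {source.get(next), top[1], next});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> input = Arrays.asList(
                Arrays.asList(9, 7, 1),
                Arrays.asList(8, 6, 2),
                Arrays.asList(10, 3));
        System.out.println(topK(input, 4)); // prints [10, 9, 8, 7]
    }
}
```

For a small number of arrays the plain scan is likely just as fast; the heap only pays off when m is large.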
- Quick select (the partition process of quick sort)
This method reuses the partition step of quick sort to find the Top k. The average time complexity is O(n) (the worst case is O(n^2) when the pivots are chosen badly). It is suitable for a single unordered array.
The goal of Quick Select here is to find the kth smallest element; reversing the comparison in the partition step gives the kth largest instead. Select a pivot element and partition the array into two subarrays, with the elements smaller than the pivot on the left:
- If the length of the left subarray is >= k, the kth smallest element must be in the left subarray;
- If the length of the left subarray is k-1, the kth smallest element is the pivot;
- Otherwise, the kth smallest element must be in the right subarray.
Once the pivot lands at position k-1, the first k positions of the array hold the k smallest numbers. The specific Java implementation is as follows:
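    /*
     * Uses the partition process of quick sort to find the k smallest numbers.
     */
    public class TopKByQuickSelect {

        // Lomuto partition: moves values smaller than the pivot (the last element)
        // to the left and returns the pivot's final position.
        int partition(int[] a, int first, int end) {
            int i = first;
            int pivot = a[end];
            for (int j = first; j < end; j++) {
                if (a[j] < pivot) {
                    int temp = a[j];
                    a[j] = a[i];
                    a[i] = temp;
                    i++;
                }
            }
            a[end] = a[i];
            a[i] = pivot;
            return i;
        }

        // Rearranges a so that a[0..k-1] holds the k smallest numbers (unsorted).
        void getTopKMinBySort(int[] a, int first, int end, int k) {
            if (first < end) {
                int partitionIndex = partition(a, first, end);
                if (partitionIndex == k - 1) {
                    return;
                } else if (partitionIndex > k - 1) {
                    getTopKMinBySort(a, first, partitionIndex - 1, k);
                } else {
                    getTopKMinBySort(a, partitionIndex + 1, end, k);
                }
            }
        }

        public static void main(String[] args) {
            int[] a = {2, 20, 3, 7, 9, 1, 17, 18, 0, 4};
            int k = 6;
            new TopKByQuickSelect().getTopKMinBySort(a, 0, a.length - 1, k);
            // Prints the 6 smallest numbers, in no particular order.
            for (int i = 0; i < k; i++) {
                System.out.print(a[i] + " ");
            }
        }
    }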
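The code above collects the k smallest numbers. For the k largest, it should be enough to flip the comparison in the partition so that larger values move to the left. The following is a minimal sketch under two assumptions that are not in the original: it picks the pivot at random (to make the O(n^2) worst case unlikely on sorted input) and it works on a copy so the caller's array is left untouched. The names `TopKLargestSketch`, `partitionDesc` and `topKLargest` are hypothetical.

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

final class TopKLargestSketch {

    /** Lomuto partition in decreasing order: values greater than the pivot end up on the left. */
    static int partitionDesc(int[] a, int first, int end) {
        // Assumption: a random pivot, swapped to the end before partitioning.
        int pivotIndex = ThreadLocalRandom.current().nextInt(first, end + 1);
        int pivot = a[pivotIndex];
        a[pivotIndex] = a[end];
        a[end] = pivot;
        int i = first;
        for (int j = first; j < end; j++) {
            if (a[j] > pivot) {
                int temp = a[j];
                a[j] = a[i];
                a[i] = temp;
                i++;
            }
        }
        a[end] = a[i];
        a[i] = pivot;
        return i;
    }

    /** Returns the k largest values (in no particular order) without modifying the input. */
    static int[] topKLargest(int[] input, int k) {
        if (k <= 0) {
            return new int[0];
        }
        if (k >= input.length) {
            return input.clone();
        }
        int[] a = input.clone();
        int first = 0;
        int end = a.length - 1;
        // Iterative form of the recursion: narrow [first, end] around position k-1.
        while (first < end) {
            int p = partitionDesc(a, first, end);
            if (p == k - 1) {
                break;
            } else if (p > k - 1) {
                end = p - 1;
            } else {
                first = p + 1;
            }
        }
        return Arrays.copyOf(a, k);
    }

    public static void main(String[] args) {
        int[] top = topKLargest(new int[] {2, 20, 3, 7, 9, 1, 17, 18, 0, 4}, 3);
        Arrays.sort(top); // sorted only to make the printed order deterministic
        System.out.println(Arrays.toString(top)); // prints [17, 18, 20]
    }
}
```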
- Min-heap or max-heap
To find the largest K elements, use a min-heap (small root heap); to find the smallest K, use a max-heap (large root heap).
Steps to find the largest K:
- Build a min-heap of K nodes from the first K elements.
- Scan the remaining N-K elements:
  - If an element is larger than the root of the min-heap, overwrite the root with it and sift it down to restore the min-heap.
  - If an element is less than or equal to the root of the min-heap, leave the heap unchanged.
Finding the smallest K is symmetric to finding the largest K. The time complexity is O(n log K) (n: the number of elements), and only K values have to be kept in memory, which makes this method especially suitable for finding the Top K of big data.
    /**
     * Find the K largest values with a min-heap. When the amount of data is
     * large (especially when it cannot all fit in memory), the heap is preferred.
     */
    public class TopKByHeap {

        /** Builds a min-heap of k nodes from the first k elements of a. */
        int[] createHeap(int[] a, int k) {
            int[] result = new int[k];
            for (int i = 0; i < k; i++) {
                result[i] = a[i];
            }
            for (int i = 1; i < k; i++) {
                // Sift result[i] up until its parent is no larger.
                int child = i;
                int temp = result[i];
                while (child > 0 && result[(child - 1) / 2] > temp) {
                    result[child] = result[(child - 1) / 2];
                    child = (child - 1) / 2;
                }
                result[child] = temp;
            }
            return result;
        }

        /** Overwrites the root (the current minimum) with value and sifts it down. */
        void insert(int[] a, int value) {
            a[0] = value;
            int parent = 0;
            while (parent < a.length) {
                int lchild = 2 * parent + 1;
                int rchild = 2 * parent + 2;
                int minIndex = parent;
                if (lchild < a.length && a[parent] > a[lchild]) {
                    minIndex = lchild;
                }
                if (rchild < a.length && a[minIndex] > a[rchild]) {
                    minIndex = rchild;
                }
                if (minIndex == parent) {
                    break;
                }
                int temp = a[parent];
                a[parent] = a[minIndex];
                a[minIndex] = temp;
                parent = minIndex;
            }
        }

        int[] getTopKByHeap(int[] input, int k) {
            k = Math.min(k, input.length); // cannot return more values than exist
            if (k <= 0) {
                return new int[0];
            }
            int[] heap = this.createHeap(input, k);
            for (int i = k; i < input.length; i++) {
                if (input[i] > heap[0]) {
                    this.insert(heap, input[i]);
                }
            }
            return heap;
        }

        public static void main(String[] args) {
            int[] a = {4, 3, 5, 1, 2, 8, 9, 10};
            int[] result = new TopKByHeap().getTopKByHeap(a, 3);
            // Prints 8, 10, 9: the values come out in heap order, not sorted.
            for (int temp : result) {
                System.out.println(temp);
            }
        }
    }
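The hand-written heap above takes the whole array as input. When the data really does not fit in memory, the same idea can be applied to values read one at a time. The following is a minimal sketch that swaps the hand-written heap for the standard `java.util.PriorityQueue` (a min-heap by default); the class name `TopKStreamSketch`, the method `topK`, and the choice of an `Iterator<Integer>` as the data source are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.PriorityQueue;

final class TopKStreamSketch {

    /** Returns the k largest values of the stream in decreasing order, holding at most k in memory. */
    static int[] topK(Iterator<Integer> stream, int k) {
        if (k <= 0) {
            return new int[0];
        }
        PriorityQueue<Integer> minHeap = new PriorityQueue<>(k);
        while (stream.hasNext()) {
            int value = stream.next();
            if (minHeap.size() < k) {
                // Still filling the heap with the first k values.
                minHeap.add(value);
            } else if (value > minHeap.peek()) {
                // Larger than the smallest kept value: replace the root.
                minHeap.poll();
                minHeap.add(value);
            }
        }
        int[] result = new int[minHeap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = minHeap.poll(); // poll yields the smallest first, so fill from the back
        }
        return result;
    }

    public static void main(String[] args) {
        Iterator<Integer> data = Arrays.asList(4, 3, 5, 1, 2, 8, 9, 10).iterator();
        System.out.println(Arrays.toString(topK(data, 3))); // prints [10, 9, 8]
    }
}
```

In a real setting the iterator would wrap a file or network reader; the heap never grows beyond k elements regardless of how long the stream is.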