Sorting in JDK: Source Code Implementation of Arrays.sort

Sorting in Java is not that simple

  How is sorting implemented in the JDK, and which algorithm does it use? Some people will say quicksort, but the JDK's implementation is not that simple. Let's step into the source code of Arrays.sort to find out.

An overview of the overloaded Arrays.sort methods

  Arrays.sort is heavily overloaded and accepts many different parameter lists.
  In terms of scope, the overloads fall into two groups: those that sort the whole array and those that sort only part of it. Partial sorting requires the start and end of the range to be passed in. Intuitively, one might expect the whole-array overload to simply call the partial-sort overload with a start of 0 and an end of length - 1. The source code shows otherwise.
  The whole-array overload does not call the partial-sort overload. Arrays.sort(int[] a) and Arrays.sort(int[] a, int fromIndex, int toIndex) are both just entry points: each calls DualPivotQuicksort.sort and passes in the start and end of the range to be sorted. For the whole-array overload these are simply 0 and length - 1.
  In terms of element type, Arrays.sort can be divided into sorting arrays of primitive types and sorting generic and Object arrays. The source code shows that arrays of primitive types are sorted by calling DualPivotQuicksort.sort, while generic and Object arrays are sorted by a different implementation.
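  From the caller's side, the two kinds of entry point behave as in the following minimal sketch. The class name RangeSortDemo, its helper methods and the sample data are made up for illustration. Note that in the public API toIndex is exclusive, so the range sorted is [fromIndex, toIndex).

```java
import java.util.Arrays;

public class RangeSortDemo {
    // Sorts a copy of the whole array.
    static int[] sortAll(int[] data) {
        int[] copy = data.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Sorts only the elements in [fromIndex, toIndex) of a copy.
    static int[] sortRange(int[] data, int fromIndex, int toIndex) {
        int[] copy = data.clone();
        Arrays.sort(copy, fromIndex, toIndex);
        return copy;
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 9, 1, 7, 2};
        System.out.println(Arrays.toString(sortAll(data)));         // [1, 2, 3, 5, 7, 9]
        System.out.println(Arrays.toString(sortRange(data, 1, 4))); // [5, 1, 3, 9, 7, 2]
    }
}
```

  In the second call only the elements at indexes 1, 2 and 3 are reordered; the rest of the array is left untouched.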

Sorting Arrays of Primitive Data Types

  For arrays of primitive types, Arrays.sort calls the DualPivotQuicksort.sort method. The relevant parts of that method are walked through below.

  At the very beginning of this method there is a check: if the length of the array is < 286, it calls sort(a, left, right, true) directly.
  Inside that method, if the length is < 47, insertion sort is used; if the length is >= 47, quicksort is used (a dual-pivot quicksort, as the class name suggests). This follows from the characteristics of the algorithms: on small arrays, averaged over a large number of tests, insertion sort is faster than quicksort.
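  The idea of switching to insertion sort below a length threshold can be sketched as follows. This is not the JDK code: the class HybridSortSketch is hypothetical, and for brevity it uses a plain single-pivot quicksort where the JDK uses two pivots. Only the threshold value 47 is taken from the text above.

```java
import java.util.Arrays;
import java.util.Random;

public class HybridSortSketch {
    // Threshold taken from the article: ranges shorter than this use insertion sort.
    static final int INSERTION_SORT_THRESHOLD = 47;

    static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    // Sorts a[left..right], both bounds inclusive.
    static void sort(int[] a, int left, int right) {
        int length = right - left + 1;
        if (length < INSERTION_SORT_THRESHOLD) {
            insertionSort(a, left, right);
            return;
        }
        // Simplification: one middle pivot instead of the JDK's two pivots.
        int pivot = a[left + (right - left) / 2];
        int i = left;
        int j = right;
        while (i <= j) {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int tmp = a[i];
                a[i] = a[j];
                a[j] = tmp;
                i++;
                j--;
            }
        }
        sort(a, left, j);   // elements <= pivot
        sort(a, i, right);  // elements >= pivot
    }

    static void insertionSort(int[] a, int left, int right) {
        for (int i = left + 1; i <= right; i++) {
            int key = a[i];
            int j = i - 1;
            // Shift larger elements one slot to the right.
            while (j >= left && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(1000, 0, 10_000).toArray();
        int[] expected = data.clone();
        Arrays.sort(expected);
        sort(data);
        System.out.println(Arrays.equals(data, expected)); // true
    }
}
```

  Because every recursive call re-checks the length, the small sub-ranges produced by partitioning also end up being finished by insertion sort.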
  So what happens when the array length is >= 286? Back in DualPivotQuicksort.sort, the method examines the structure of the array. If the array is basically ordered, merge sort is used. If the elements are arranged chaotically, sort(a, left, right, true) is called; since the length is >= 286, it is also >= 47, so quicksort is performed. This design, too, follows from the characteristics of the algorithms: although quicksort and merge sort have the same average time complexity, merge sort is faster than quicksort on basically ordered arrays and slower on almost unordered ones.
  To sum up, for the sorting of arrays of basic data types, the relationship between the choice of sorting algorithm and the length of the array is as follows:

| Array length | Sorting algorithm used |
| --- | --- |
| length < 47 | insertion sort |
| 47 <= length < 286 | quicksort |
| length >= 286 and the array is basically ordered | merge sort |
| length >= 286 and the array is basically unordered | quicksort |
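  The table can be read as a small decision function. The sketch below is a simplified model, not the JDK code: the class SortStrategySketch and its methods are hypothetical, and the rule used for "basically ordered" (fewer than 67 monotone runs, a limit borrowed from the MAX_RUN_COUNT constant in the JDK 8 source) is an assumption that ignores the JDK's special handling of long runs of equal elements.

```java
public class SortStrategySketch {
    static final int INSERTION_SORT_THRESHOLD = 47;
    static final int QUICKSORT_THRESHOLD = 286;
    // Assumption for this sketch: "basically ordered" means fewer runs than this.
    static final int MAX_RUN_COUNT = 67;

    // Returns the name of the algorithm the table would select for this array.
    static String chooseAlgorithm(int[] a) {
        if (a.length < INSERTION_SORT_THRESHOLD) return "insertion sort";
        if (a.length < QUICKSORT_THRESHOLD) return "quicksort";
        return countRuns(a) < MAX_RUN_COUNT ? "merge sort" : "quicksort";
    }

    // Counts maximal runs that are either non-descending or strictly descending.
    static int countRuns(int[] a) {
        int runs = 0;
        int k = 0;
        while (k < a.length) {
            runs++;
            if (k + 1 == a.length) break; // a single trailing element is its own run
            if (a[k] <= a[k + 1]) {
                do { k++; } while (k + 1 < a.length && a[k] <= a[k + 1]);
            } else {
                do { k++; } while (k + 1 < a.length && a[k] > a[k + 1]);
            }
            k++; // step past the last element of the run just counted
        }
        return runs;
    }

    public static void main(String[] args) {
        int[] ascending = new int[300];
        int[] zigzag = new int[300];
        for (int i = 0; i < 300; i++) {
            ascending[i] = i;
            zigzag[i] = (i % 2 == 0) ? 1 : 0;
        }
        System.out.println(chooseAlgorithm(new int[10]));  // insertion sort
        System.out.println(chooseAlgorithm(new int[100])); // quicksort
        System.out.println(chooseAlgorithm(ascending));    // merge sort
        System.out.println(chooseAlgorithm(zigzag));       // quicksort
    }
}
```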

Sorting Object arrays and generic arrays

  When sorting a generic array, we can choose whether or not to pass in an object of a class that implements the Comparator interface. The two cases lead to the same place: if the comparator passed in is null, sort(T[] a, Comparator c) simply delegates to sort(Object[] a). Consider how Arrays.sort handles Object arrays and generic arrays in the source code.
  Each of these methods contains an if/else that decides which method is finally called (a legacy merge sort can still be requested), and by default JDK 8 chooses TimSort as the sorting algorithm. TimSort is a hybrid sorting algorithm derived from merge sort and insertion sort. In principle it is a merge sort, but insertion sort is used on small segments before they are merged. If no Comparator is passed in when sorting a generic array, the sort(Object[] a) method is called.
  These two overloads of Arrays.sort are only entry points, and the sorting methods they finally call are different: one is ComparableTimSort.sort, the other is TimSort.sort. The following paragraphs go through the difference between the two.
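  The effect of passing or not passing a Comparator can be observed without reading the JDK source. In this small sketch the class NullComparatorDemo and the sample words are made up for illustration; the documented behaviour it relies on is that a null comparator means "use the elements' natural ordering".

```java
import java.util.Arrays;
import java.util.Comparator;

public class NullComparatorDemo {
    // Sorts a copy of words with the given comparator (which may be null).
    static String[] sortWith(String[] words, Comparator<String> c) {
        String[] copy = words.clone();
        Arrays.sort(copy, c); // c == null falls back to natural ordering
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"pear", "fig", "apple"};

        // No comparator: String's own compareTo decides the order.
        System.out.println(Arrays.toString(sortWith(words, null)));
        // [apple, fig, pear]

        // Explicit comparator: order by length instead.
        System.out.println(Arrays.toString(
                sortWith(words, Comparator.comparing(String::length))));
        // [fig, pear, apple]
    }
}
```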

  First, consider the ComparableTimSort.sort method, which is called to sort Object arrays.

  ComparableTimSort.sort calls the countRunAndMakeAscending method and the binarySort method. Both of these methods cast the array elements to the Comparable interface type, because they need to call the compareTo method of the Comparable interface to compare elements. compareTo is the only method defined in the Comparable interface.

  Therefore, if you call Arrays.sort(Object[] o) to sort an Object array whose element class does not implement the Comparable interface, Java considers the objects of that class to be incomparable, and a ClassCastException is thrown.
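  A minimal reproduction of that exception might look like this. The class NotComparableDemo and its nested Point class are hypothetical; the only thing that matters is that Point does not implement Comparable.

```java
import java.util.Arrays;

public class NotComparableDemo {
    // Hypothetical element class that does NOT implement Comparable.
    static class Point {
        final int x;
        Point(int x) { this.x = x; }
    }

    // Returns true if sorting the array fails with a ClassCastException.
    static boolean sortThrowsClassCast(Object[] items) {
        try {
            Arrays.sort(items);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Point[] points = {new Point(2), new Point(1)};
        Integer[] numbers = {2, 1};
        System.out.println(sortThrowsClassCast(points));  // true
        System.out.println(sortThrowsClassCast(numbers)); // false
    }
}
```

  The code compiles without complaint because Point[] is accepted wherever Object[] is expected; the problem only shows up at run time, when the first comparison is attempted.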

  Now consider the TimSort.sort method, which is called to sort generic arrays.
  It also calls a countRunAndMakeAscending method and a binarySort method, but these two are generic versions. This is an old habit of the JDK: in order to support generics, generic versions of many existing methods were written.
  In the generic versions, elements are compared using the parameter c we passed in, that is, by calling the compare method of the object that implements the Comparator interface. The Comparator interface defines many methods, compare among them.
  Therefore, when sorting a generic array, either the element class must implement the Comparable interface (calling Arrays.sort on a T[] without a comparator resolves to Arrays.sort(Object[] a)), or an object of a class that implements the Comparator interface must be passed to Arrays.sort. Otherwise an exception is thrown.
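  For a class that does not implement Comparable, passing a Comparator is enough to make the sort work. In the sketch below the names ComparatorSortDemo, Point and ByX are all invented for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

public class ComparatorSortDemo {
    // Hypothetical element class without a natural ordering.
    static class Point {
        final int x;
        Point(int x) { this.x = x; }
    }

    // The comparison logic lives outside the element class.
    static class ByX implements Comparator<Point> {
        @Override
        public int compare(Point o1, Point o2) {
            return Integer.compare(o1.x, o2.x);
        }
    }

    // Builds points from the given x values, sorts them, returns the sorted x values.
    static int[] sortXs(int... xs) {
        Point[] points = new Point[xs.length];
        for (int i = 0; i < xs.length; i++) {
            points[i] = new Point(xs[i]);
        }
        Arrays.sort(points, new ByX());
        int[] sorted = new int[points.length];
        for (int i = 0; i < points.length; i++) {
            sorted[i] = points[i].x;
        }
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortXs(3, 1, 2))); // [1, 2, 3]
    }
}
```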

  Some people may ask: what about arrays of wrapper classes, such as an Integer array? Why is no exception thrown when they are sorted directly? That is because the wrapper classes all implement the Comparable interface, so their objects are all comparable.

Points to note when using Arrays.sort

  Arrays.sort in the JDK can be seen as an application of the template method design pattern: it encapsulates the steps of the sorting algorithm and leaves the question of how to compare two array elements to the programmer. When sorting a custom class, we can make the class implement the Comparable interface and override its compareTo method. Alternatively, we can create a class that implements the Comparator interface and override its compare method. The logic for comparing two array elements goes into whichever of these two methods we override.
  Comparing the size of two array elements o1 and o2 is nothing more than three results: o1>o2, o1=o2, o1<o2. Therefore, there are three situations for the return value of the compareTo method and the compare method, which are designed for the default ascending order. When o1>o2, return a positive integer, if o1=o2, return 0, and if o1<o2, return a negative integer. For a class that implements the Comparable interface, o1 is this, representing the current class object. If we return the corresponding value according to the above corresponding relationship in the logic of the rewriting method, calling Arrays.sort will get the result in ascending order, and if we reverse the corresponding relationship, we will get the result in descending order.
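  The three return values and the effect of reversing them can be sketched with a small example. The class CompareToDemo, the Student class and the sample data are hypothetical.

```java
import java.util.Arrays;

public class CompareToDemo {
    static class Student implements Comparable<Student> {
        final String name;
        final int score;

        Student(String name, int score) {
            this.name = name;
            this.score = score;
        }

        // "this" plays the role of o1, the argument plays the role of o2.
        @Override
        public int compareTo(Student o2) {
            if (this.score > o2.score) return 1;   // o1 > o2
            if (this.score < o2.score) return -1;  // o1 < o2
            return 0;                              // o1 = o2
        }
    }

    static Student[] sample() {
        return new Student[] {
            new Student("Ann", 82), new Student("Bob", 67), new Student("Cid", 95)
        };
    }

    // Natural ordering as defined by compareTo: ascending by score.
    static String ascending(Student[] students) {
        Student[] copy = students.clone();
        Arrays.sort(copy);
        return names(copy);
    }

    // Swapping the roles of o1 and o2 reverses the relationship: descending by score.
    static String descending(Student[] students) {
        Student[] copy = students.clone();
        Arrays.sort(copy, (o1, o2) -> o2.compareTo(o1));
        return names(copy);
    }

    static String names(Student[] students) {
        StringBuilder sb = new StringBuilder();
        for (Student s : students) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(s.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(ascending(sample()));  // Bob Ann Cid
        System.out.println(descending(sample())); // Cid Ann Bob
    }
}
```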
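  Note that the compareTo and compare methods must actually return a negative integer in the "less than" case and a positive integer in the "greater than" case. A method that only ever returns, say, 1 and 0 breaks the comparison contract, and TimSort may then fail to sort correctly; it can even throw an IllegalArgumentException reporting that the comparison method violates its general contract. So we cannot be lazy here: the positive, negative and zero return values all have to be written out properly.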
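  One way to see why a "lazy" implementation is wrong is to check the symmetry rule from the Comparator documentation: the sign of compare(x, y) must be the opposite of the sign of compare(y, x). The sketch below only checks that rule for a pair of values and does not attempt to provoke a failure inside TimSort. The class ComparatorContractDemo and its members are hypothetical.

```java
import java.util.Comparator;

public class ComparatorContractDemo {
    // "Lazy" comparator: never returns a negative value.
    static final Comparator<Integer> LAZY = (o1, o2) -> o1 > o2 ? 1 : 0;

    // Proper comparator: negative, zero or positive as required.
    static final Comparator<Integer> PROPER = (o1, o2) -> Integer.compare(o1, o2);

    // Checks that sgn(compare(x, y)) == -sgn(compare(y, x)) for one pair of values.
    static boolean isSymmetric(Comparator<Integer> c, int x, int y) {
        return Integer.signum(c.compare(x, y)) == -Integer.signum(c.compare(y, x));
    }

    public static void main(String[] args) {
        System.out.println(isSymmetric(LAZY, 1, 2));   // false
        System.out.println(isSymmetric(PROPER, 1, 2)); // true
    }
}
```

  With the lazy comparator, compare(1, 2) claims the two values are equal while compare(2, 1) claims the first is greater, and a sorting algorithm cannot reconcile those two answers.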

Origin blog.csdn.net/qq_44709990/article/details/122201673