Why is Arrays.equals(char[], char[]) 8 times faster than all the other versions?

BeeOnRope :

Short Story

Based on my tests with a few different Oracle and OpenJDK implementations, it seems that Arrays.equals(char[], char[]) is somehow about 8 times faster than all the other variants for other types.

Figure 1

If the performance of your application is correlated heavily0 with comparing arrays for equality, this means you pretty much want to coerce all your data into char[], just to get this magic performance boost.

Long Story

Recently I was writing some high-performance code that used Arrays.equals(...) to compare keys used to index into a structure. The keys can be quite long, and often differ only in later bytes, so the performance of this method is pretty important.

At one point I was using keys of type char[], but as part of generalizing the service and to avoid some copies from underlying sources of byte[] and ByteBuffer, I changed this to byte[]. Suddenly2, the performance of many basic operations dropped by about 3x. I traced it down the above-mentioned fact: Arrays.equals(char[], char[]) seems to enjoy a special status over all the other Arrays.equals() versions, including the one taking short[] which is semantically identical (and could be implemented with the identical underlying code since signedness doesn't affect the behavior of equals).

So I wrote a JMH benchmark to test all the primitive variants of Arrays.equals(...)1 and the char[] variant crushes all the others, as shown above.

Now, this dominance of the ~8x variety doesn't extend at the same magnitude to smaller or much larger arrays - but it is still faster.

For small arrays, it seems that constant factors start to dominate, and for larger arrays, the L2/L3 or main memory bandwidth starts to come into play (you can already see the latter effect pretty clearly in the earlier figure, where int[] and especially long[] arrays start to degrade in performance at the large sizes). Here's a look at the same test, but with a smaller small array and larger large array:

Figure 2

Here, char[] is still kicking ass, just not as much as before. The per-element times for the small array (only 16 elements) are about double the standard time, probably due to function overheads: at around 0.5 ns/element, the char[] variant still only takes about 7.2 nanoseconds for the whole call, or about 19 cycles on my machine - so a small amount of method overhead cuts into the runtime a lot (also, the benchmark overhead itself is several cycles).

At the big end, cache and/or memory bandwidth is a driving factor - the long[] variant takes almost 2x as long as the int[] variant. The short[] and especially the byte[] variant aren't very effective (their working set still fits in the L3 in my machine).

The difference between char[] and all the others is extreme enough that for applications that rely on array comparison (this isn't actually that unusual for some specific domains), it's worth it to try to get all your data into char[] to take advantage. Huh.

What gives? Does char get special treatment because it underlies some String methods? Is it just another case of the JVM optimizing methods that are hit heavily in benchmarks, and not extending the same (obvious) optimizations to the other primitive types (especially short which is identical here)?


0 ... and that's not even all that crazy - consider various systems which rely, for example, on (lengthy) hash comparison to check if values are equal, or hash maps where the keys are either long or variably sized.

1I didn't include boolean[], float[] and double[] or double in the results to avoid cluttering the graph, but for the record boolean[] and float[] performed the same as int[], while double[] performed the same as long[]. That makes sense based on the underlying size of the types.

2I lie a bit here. The performance presumably changed suddenly but I didn't actually notice until I again ran the benchmarks, after a series of other changes, leading to a painful bisection process where I determined the causal change. This is a great reason to have some type of performance-measuring continuous integration.

apangin :

@Marco13 guess was right. HotSpot JVM has an intrinsic (i.e. special hand-coded implementation) forArrays.equals(char[], char[]), but not for other Arrays.equals methods.

The following JMH benchmark proves that disabling this intrinsic makes char[] array comparsion as slow as short[] array comparison.

@State(Scope.Benchmark)
public class ArrayEquals {
    @Param("100")
    int length;

    short[] s1, s2;
    char[] c1, c2;

    @Setup
    public void setup() {
        s1 = new short[length];
        s2 = new short[length];
        c1 = new char[length];
        c2 = new char[length];
    }

    @Benchmark
    public boolean chars() {
        return Arrays.equals(c1, c2);
    }

    @Benchmark
    @Fork(jvmArgsAppend = {"-XX:+UnlockDiagnosticVMOptions", "-XX:DisableIntrinsic=_equalsC"})
    public boolean charsNoIntrinsic() {
        return Arrays.equals(c1, c2);
    }

    @Benchmark
    public boolean shorts() {
        return Arrays.equals(s1, s2);
    }
}

Results:

Benchmark                     (length)  Mode  Cnt   Score   Error  Units
ArrayEquals.chars                  100  avgt   10  19,012 ± 1,204  ns/op
ArrayEquals.charsNoIntrinsic       100  avgt   10  49,495 ± 0,682  ns/op
ArrayEquals.shorts                 100  avgt   10  49,566 ± 0,815  ns/op

This intrinsic was added long ago in 2008 in the times of aggressive JVM competition. JDK 6 included a special alt-string.jar library which was enabled by -XX:+UseStringCache. I've found a number of calls to Arrays.equals(char[], char[]) from one of these special classes - StringValue.StringCache. The intrinsic was an essential part of this "optimization". In modern JDK there is no more alt-string.jar, but JVM intrinsic is still there (not playing its original role though).

Update

I've tested the same with JDK 9-ea+148, and it appears that _equalsC intrinsic makes very little performance difference.

Benchmark                     (length)  Mode  Cnt   Score   Error  Units
ArrayEquals.chars                  100  avgt   10  18,931 ± 0,061  ns/op
ArrayEquals.charsNoIntrinsic       100  avgt   10  19,616 ± 0,063  ns/op
ArrayEquals.shorts                 100  avgt   10  19,753 ± 0,080  ns/op

Arrays.equals implementation has changed in JDK 9.

Now it calls ArraysSupport.vectorizedMismatch helper method for all types of non-object arrays. Furthermore, vectorizedMismatch is also a HotSpot intrinsic which has hand-written assembly implementation that uses AVX.

Guess you like

Origin http://10.200.1.11:23101/article/api/json?id=427176&siteId=1