Unconventional code in JDK - specific constructions used for unknown reason

Shadov :

I was looking through the JDK source (JDK 12, but this applies to older versions too) and found some odd constructions that I don't understand the reason for. Take Map.computeIfPresent as an example, since it's simple:

default V computeIfPresent(K key, BiFunction<? super K, ? super V, ? extends V> remappingFunction) {
    V oldValue;
    if ((oldValue = this.get(key)) != null) {
        V newValue = remappingFunction.apply(key, oldValue);
        if (newValue != null) {
            this.put(key, newValue);
            return newValue;
        } else {
            this.remove(key);
            return null;
        }
    } else {
        return null;
    }
}

The construction if ((oldValue = this.get(key)) != null) surprised me. I knew it was possible, since it's nothing special, but in normal production code I would consider it a code smell. Why not write it the usual way (V oldValue = this.get(key); followed by a plain null check)? My first thought was that it must be some clever micro-optimization.

I wrote a smaller version to check the bytecode:

int computeIfPresent(int key) {
  Integer oldValue;
  if ((oldValue = get(key)) != null) {
    return oldValue;
  } else {
    return 2;
  }
}

Bytecode output:

int computeIfPresent(int);
  Code:
     0: aload_0
     1: iload_1
     2: invokevirtual #2                  // Method get:(I)Ljava/lang/Integer;
     5: dup
     6: astore_2
     7: ifnull        15
    10: aload_2
    11: invokevirtual #3                  // Method java/lang/Integer.intValue:()I
    14: ireturn
    15: iconst_2
    16: ireturn

Bytecode for 'normal' version with classic variable initialization:

int computeIfPresent(int);
  Code:
     0: aload_0
     1: iload_1
     2: invokevirtual #2                  // Method get:(I)Ljava/lang/Integer;
     5: astore_2
     6: aload_2
     7: ifnull        15
    10: aload_2
    11: invokevirtual #3                  // Method java/lang/Integer.intValue:()I
    14: ireturn
    15: iconst_2
    16: ireturn

The only difference is dup + astore_2 versus astore_2 + aload_2. I would even suspect that the first, supposedly optimized version is worse, because it uses dup and keeps an extra value on the operand stack for no obvious reason. Maybe my example is too simple and the optimization only pays off in a more complicated context.
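For anyone who wants to reproduce the comparison, here is a minimal, self-contained sketch with both variants side by side. The class name AssignInCondition, the backing HashMap, and the method names computeIfPresentInline and computeIfPresentPlain are assumptions made for illustration; only the two method bodies mirror the snippets in the question. Compiling it and running javap -c -p on the class should show listings shaped like the two above, though constant-pool indices such as #2 and #3 will differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder class; only the two method bodies follow the question.
class AssignInCondition {
    private final Map<Integer, Integer> values = new HashMap<>();

    void put(int key, int value) {
        values.put(key, value);
    }

    // Returns null when the key is absent, like Map.get.
    Integer get(int key) {
        return values.get(key);
    }

    // JDK style: the assignment is embedded in the condition.
    int computeIfPresentInline(int key) {
        Integer oldValue;
        if ((oldValue = get(key)) != null) {
            return oldValue;
        } else {
            return 2;
        }
    }

    // Conventional style: declare and initialize first, then test.
    int computeIfPresentPlain(int key) {
        Integer oldValue = get(key);
        if (oldValue != null) {
            return oldValue;
        } else {
            return 2;
        }
    }

    public static void main(String[] args) {
        AssignInCondition demo = new AssignInCondition();
        demo.put(1, 42);
        System.out.println(demo.computeIfPresentInline(1)); // 42
        System.out.println(demo.computeIfPresentPlain(1));  // 42
        System.out.println(demo.computeIfPresentInline(7)); // 2 (key absent)
        System.out.println(demo.computeIfPresentPlain(7));  // 2 (key absent)
    }
}
```

Both methods return the same result for every input; the sketch only makes it easy to inspect the bytecode of each form.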

This is just one simple example, and far from the only occurrence in the JDK. Open HashMap.java and there are tons of fragments like this, sometimes several on the same line:

if ((first = tab[i = (n - 1) & hash]) != null)

Even though the logic is really simple, these constructions force me to stop for a moment and work out what the code actually does.
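To show what that one-liner packs together, here is a small sketch that writes the same logic in a compact and an expanded form. The class NestedAssignment, the String[] table, and both method names are hypothetical stand-ins and are not taken from HashMap.java; the table length is assumed to be a power of two, so that (n - 1) & hash always yields a valid index.

```java
// Hypothetical sketch; a String[] stands in for HashMap's internal bucket table.
class NestedAssignment {

    // Compact form: computes the index, stores it in i, reads the slot,
    // stores it in first, and tests for null, all in one condition.
    static String firstCompact(String[] tab, int hash) {
        String first;
        int n = tab.length;
        int i;
        if ((first = tab[i = (n - 1) & hash]) != null) {
            return i + ":" + first;
        }
        return "empty";
    }

    // Expanded form: one step per statement, same behavior.
    static String firstExpanded(String[] tab, int hash) {
        int n = tab.length;
        int i = (n - 1) & hash; // mask the hash down to a table index
        String first = tab[i];  // read the slot at that index
        if (first != null) {
            return i + ":" + first;
        }
        return "empty";
    }

    public static void main(String[] args) {
        String[] tab = new String[8]; // length assumed to be a power of two
        tab[5] = "entry";
        System.out.println(firstCompact(tab, 13));  // 13 & 7 == 5, prints 5:entry
        System.out.println(firstExpanded(tab, 13)); // prints 5:entry
        System.out.println(firstCompact(tab, 2));   // slot 2 is null, prints empty
        System.out.println(firstExpanded(tab, 2));  // prints empty
    }
}
```

The expanded form takes two more lines, but each statement does exactly one thing.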

What is the real reason for using these constructions? I'm sure it's not just bad code. In my opinion readability suffers a lot, so the benefit must be substantial. Or does the rule "leave small optimizations to the JIT" simply not apply to the JDK, because it has to squeeze out as much performance as possible?

Or is this just taking the rule "initialize variables as late as possible" to the extreme? :)

Brian :

The answer to your question lies in the JVM specification, specifically in exactly the difference you pointed out: the dup instruction (JVMS §6.5.dup). From those docs:

Duplicate the top value on the operand stack and push the duplicated value onto the operand stack.

Looking at the operand stack documentation (JVMS §2.6.2, emphasis added):

A small number of Java Virtual Machine instructions (the dup instructions (§dup) and swap (§swap)) operate on run-time data areas as raw values without regard to their specific types; these instructions are defined in such a way that they cannot be used to modify or break up individual values. These restrictions on operand stack manipulation are enforced through class file verification (§4.10).

Following that one more level deep and looking at the class verification section (JVMS §4.10, emphasis added):

Link-time verification enhances the performance of the run-time interpreter. Expensive checks that would otherwise have to be performed to verify constraints at run time for each interpreted instruction can be eliminated. The Java Virtual Machine can assume that these checks have already been performed.

This shows that these restrictions are validated once at link time, when the JVM loads and links your class file, rather than on every execution of the instruction. So to answer your question:

What is the real reason behind using those constructions?

Let's dissect what the instructions do in each case:

In the first case (with the dup instruction):

invokevirtual stores the result at the top of the operand stack
dup duplicates that so now there are two copies of the result on the top of the stack
astore_2 stores that into local variable #2, which pops one reference off the operand stack
ifnull checks if the top of the operand stack is null, and if so, goes to instruction 15, otherwise continues (we'll assume it's not null)
aload_2 pushes local variable #2 onto the top of operand stack
invokevirtual calls a method on the top value of the operand stack, pops it, then pushes the result
ireturn pops the top value off the operand stack and returns it

In the second case:

invokevirtual stores the result at the top of the operand stack
astore_2 pops the result off the operand stack and stores it in local variable #2
aload_2 pushes local variable #2 onto the top of operand stack
ifnull checks if the top of the operand stack is null, and if so, goes to instruction 15, otherwise continues (we'll assume it's not null)
aload_2 pushes local variable #2 onto the top of operand stack
invokevirtual calls a method on the top value of the operand stack, pops it, then pushes the result
ireturn pops the top value off the operand stack and returns it

So what's the difference? The first version executes dup once and aload_2 once; the second executes aload_2 twice. In practice the difference is next to nothing. If you track the operand stack through these instructions, you'll see that the first version briefly holds one extra value on the operand stack (less than 10 bytes, usually 8 or 4 bytes depending on whether the JVM is 64-bit or 32-bit) but performs one fewer local-variable load from stack memory. The second version keeps the operand stack slightly smaller at that point but performs one extra local-variable load (read: a fetch from memory).

At the end of the day, these optimizations have minimal impact except in applications with extremely little memory, e.g. embedded systems. So for you? Do what's readable.
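To make the stack-size comparison concrete, here is a rough sketch that tallies the operand-stack depth of both listings along the non-null path. The class StackDepthSketch and its peak helper are hypothetical, and the per-instruction deltas are hand-derived from the listings rather than read from a class file. In this particular method the depth already reaches two while this and key are pushed for the call to get, so the peak for the whole method comes out the same in both versions; the extra value from dup only shows up when counting from the point where get has returned.

```java
// Hypothetical sketch: each entry is the net change in operand-stack depth
// caused by one instruction on the non-null path (hand-derived assumption).
class StackDepthSketch {

    static final int[] WITH_DUP = {
        +1, // aload_0: push this
        +1, // iload_1: push key
        -1, // invokevirtual get: pop this and key, push the result
        +1, // dup: duplicate the result
        -1, // astore_2: pop one copy into local variable 2
        -1, // ifnull: pop the other copy and test it
        +1, // aload_2: push local variable 2
         0, // invokevirtual intValue: pop the Integer, push an int
        -1  // ireturn: pop the int and return it
    };

    static final int[] WITH_RELOAD = {
        +1, // aload_0: push this
        +1, // iload_1: push key
        -1, // invokevirtual get: pop this and key, push the result
        -1, // astore_2: pop the result into local variable 2
        +1, // aload_2: push local variable 2
        -1, // ifnull: pop and test
        +1, // aload_2: push local variable 2 again
         0, // invokevirtual intValue: pop the Integer, push an int
        -1  // ireturn: pop the int and return it
    };

    // Highest depth reached when starting at index 'from' with 'startDepth' values on the stack.
    static int peak(int[] deltas, int from, int startDepth) {
        int depth = startDepth;
        int max = startDepth;
        for (int k = from; k < deltas.length; k++) {
            depth += deltas[k];
            max = Math.max(max, depth);
        }
        return max;
    }

    public static void main(String[] args) {
        // Whole method, starting from an empty stack.
        System.out.println(peak(WITH_DUP, 0, 0));    // 2
        System.out.println(peak(WITH_RELOAD, 0, 0)); // 2
        // Counting only from after get has returned (one value on the stack).
        System.out.println(peak(WITH_DUP, 3, 1));    // 2
        System.out.println(peak(WITH_RELOAD, 3, 1)); // 1
    }
}
```

To check this against a real class file, javap -v prints a stack= value for each method, which is the maximum operand-stack depth the compiler recorded.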

When in doubt: "Premature optimization (can be) the root of all evil." Until you know that your code is slow or can prove that it's slow before running it, you're better off writing code that's readable. This hardly falls into the critical 3% of what you should optimize ahead of time.
