I assumed that separating objects that implement different interfaces into several lists and iterating over those lists afterwards would be faster than dumping all objects into a single list and then switching via instanceof. E.g. this:
ArrayList<Visible> visibles = new ArrayList<>();
ArrayList<Highlightable> highlightables = new ArrayList<>();
ArrayList<Selectable> selectables = new ArrayList<>();
// populate the lists
// Visible is an interface, Highlightable is also an interface (extends Visible),
// Selectable extends Highlightable
// (in the full source below these are abstract classes).
// Each of the three types has 3 concrete subclasses,
// to test the situation where the JVM cannot optimize much, as it might with fewer classes
for (Visible e : visibles) {
    vsum += e.visibleValue();
}
for (Highlightable e : highlightables) {
    vsum += e.visibleValue();
    hsum += e.highlightableValue();
}
for (Selectable e : selectables) {
    vsum += e.visibleValue();
    hsum += e.highlightableValue();
    ssum += e.selectableValue();
}
should be faster than
ArrayList<Visible> visibles = new ArrayList<>();
// populate the list
for (Visible e : visibles) {
    if (e instanceof Selectable) {
        vsum += e.visibleValue();
        hsum += ((Selectable) e).highlightableValue();
        ssum += ((Selectable) e).selectableValue();
    } else if (e instanceof Highlightable) {
        vsum += e.visibleValue();
        hsum += ((Highlightable) e).highlightableValue();
    } else {
        vsum += e.visibleValue();
    }
}
However, that doesn't seem to be the case:
Main.separateLists thrpt 30 1546.898 ± 32.312 ops/s
Main.singleListAndInstanceof thrpt 30 1673.733 ± 29.804 ops/s
I added the full source for the benchmark below.
What could be the cause of the instanceof version being faster? Even if we assume that the instanceof instruction is free, both versions should at least be equal in performance (because they still add elements to a list and iterate over them).
package test;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Warmup;
import java.util.ArrayList;
import java.util.Random;
public class Main {
public static void main(String[] args) throws Exception {
org.openjdk.jmh.Main.main(args);
}
@Benchmark
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 15, time = 1)
@Fork(value = 2)
public static long separateLists() {
ArrayList<Visible> visibles = new ArrayList<>(3_500);
ArrayList<Highlightable> highlightables = new ArrayList<>(3_500);
ArrayList<Selectable> selectables = new ArrayList<>(3_500);
Random random = new Random();
for (int i = 0; i < 10_000; i++) {
switch (random.nextInt(9)) {
case 0:
visibles.add(new Visible1(i));
break;
case 1:
highlightables.add(new Highlightable1(i));
break;
case 2:
selectables.add(new Selectable1(i));
break;
case 3:
visibles.add(new Visible2(i));
break;
case 4:
highlightables.add(new Highlightable2(i));
break;
case 5:
selectables.add(new Selectable2(i));
break;
case 6:
visibles.add(new Visible3(i));
break;
case 7:
highlightables.add(new Highlightable3(i));
break;
case 8:
selectables.add(new Selectable3(i));
break;
}
}
long listSize = visibles.size() + highlightables.size() + selectables.size();
long vsum = 0;
long hsum = 0;
long ssum = 0;
for (Visible e : visibles) {
vsum += e.visibleValue();
}
for (Highlightable e : highlightables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
}
for (Selectable e : selectables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
ssum += e.selectableValue();
}
return listSize + vsum * hsum * ssum;
}
@Benchmark
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 15, time = 1)
@Fork(value = 2)
public static long singleListAndInstanceof() {
ArrayList<Visible> visibles = new ArrayList<>(10_000);
Random random = new Random();
for (int i = 0; i < 10_000; i++) {
switch (random.nextInt(9)) {
case 0:
visibles.add(new Visible1(i));
break;
case 1:
visibles.add(new Highlightable1(i));
break;
case 2:
visibles.add(new Selectable1(i));
break;
case 3:
visibles.add(new Visible2(i));
break;
case 4:
visibles.add(new Highlightable2(i));
break;
case 5:
visibles.add(new Selectable2(i));
break;
case 6:
visibles.add(new Visible3(i));
break;
case 7:
visibles.add(new Highlightable3(i));
break;
case 8:
visibles.add(new Selectable3(i));
break;
}
}
long listSize = visibles.size();
long vsum = 0;
long hsum = 0;
long ssum = 0;
for (Visible e : visibles) {
if (e instanceof Selectable) {
vsum += e.visibleValue();
hsum += ((Selectable) e).highlightableValue();
ssum += ((Selectable) e).selectableValue();
} else if (e instanceof Highlightable) {
vsum += e.visibleValue();
hsum += ((Highlightable) e).highlightableValue();
} else {
vsum += e.visibleValue();
}
}
return listSize + vsum * hsum * ssum;
}
}
abstract class Visible {
abstract int visibleValue();
}
abstract class Highlightable extends Visible {
abstract int highlightableValue();
}
abstract class Selectable extends Highlightable {
abstract int selectableValue();
}
class Visible1 extends Visible {
private int v;
Visible1(int v) {
this.v = v;
}
@Override int visibleValue() { return v; }
}
class Highlightable1 extends Highlightable {
private int v;
Highlightable1(int v) {
this.v = v;
}
@Override int visibleValue() { return v*2; }
@Override int highlightableValue() { return v*3; }
}
class Selectable1 extends Selectable {
private int v;
Selectable1(int v) {
this.v = v;
}
@Override int visibleValue() { return v*4; }
@Override int highlightableValue() { return v*5; }
@Override int selectableValue() { return v*6; }
}
class Visible2 extends Visible {
private int v;
Visible2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*7; }
}
class Highlightable2 extends Highlightable {
private int v;
Highlightable2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*8; }
@Override int highlightableValue() { return v*9; }
}
class Selectable2 extends Selectable {
private int v;
Selectable2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*10; }
@Override int highlightableValue() { return v*11; }
@Override int selectableValue() { return v*12; }
}
class Visible3 extends Visible {
private int v;
Visible3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*13; }
}
class Highlightable3 extends Highlightable {
private int v;
Highlightable3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*14; }
@Override int highlightableValue() { return v*15; }
}
class Selectable3 extends Selectable {
private int v;
Selectable3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*16; }
@Override int highlightableValue() { return v*17; }
@Override int selectableValue() { return v*18; }
}
Benchmark output:
Main.separateLists thrpt 600 1690.522 ± 6.570 ops/s
Main.singleListAndInstanceof thrpt 600 1751.375 ± 4.368 ops/s
Main.separateLists:L1-dcache-load-misses:u thrpt 2 2298.258 #/op
Main.singleListAndInstanceof:L1-dcache-load-misses:u thrpt 2 627.451 #/op
Main.separateLists:L1-dcache-loads:u thrpt 2 1217756.290 #/op
Main.singleListAndInstanceof:L1-dcache-loads:u thrpt 2 1135982.650 #/op
Main.separateLists:L1-icache-load-misses:u thrpt 2 113.599 #/op
Main.singleListAndInstanceof:L1-icache-load-misses:u thrpt 2 99.896 #/op
Main.separateLists:L1-icache-loads:u thrpt 2 656048.382 #/op
Main.singleListAndInstanceof:L1-icache-loads:u thrpt 2 694074.004 #/op
Main.separateLists:LLC-load-misses:u thrpt 2 872.681 #/op
Main.singleListAndInstanceof:LLC-load-misses:u thrpt 2 355.666 #/op
Main.separateLists:LLC-loads:u thrpt 2 12036.496 #/op
Main.singleListAndInstanceof:LLC-loads:u thrpt 2 7445.434 #/op
Main.separateLists:LLC-stores:u thrpt 2 15277.223 #/op
Main.singleListAndInstanceof:LLC-stores:u thrpt 2 10649.517 #/op
Main.separateLists:branch-misses:u thrpt 2 22463.763 #/op
Main.singleListAndInstanceof:branch-misses:u thrpt 2 29940.958 #/op
Main.separateLists:branches:u thrpt 2 254018.586 #/op
Main.singleListAndInstanceof:branches:u thrpt 2 275450.951 #/op
Main.separateLists:cycles:u thrpt 2 1988517.839 #/op
Main.singleListAndInstanceof:cycles:u thrpt 2 1921584.057 #/op
Main.separateLists:dTLB-load-misses:u thrpt 2 66.212 #/op
Main.singleListAndInstanceof:dTLB-load-misses:u thrpt 2 64.442 #/op
Main.separateLists:dTLB-loads:u thrpt 2 1217929.340 #/op
Main.singleListAndInstanceof:dTLB-loads:u thrpt 2 1135799.981 #/op
Main.separateLists:iTLB-load-misses:u thrpt 2 4.179 #/op
Main.singleListAndInstanceof:iTLB-load-misses:u thrpt 2 3.876 #/op
Main.separateLists:iTLB-loads:u thrpt 2 656595.175 #/op
Main.singleListAndInstanceof:iTLB-loads:u thrpt 2 693913.010 #/op
Main.separateLists:instructions:u thrpt 2 2273646.245 #/op
Main.singleListAndInstanceof:instructions:u thrpt 2 2045332.939 #/op
Main.separateLists:stalled-cycles-backend:u thrpt 2 773671.154 #/op
Main.singleListAndInstanceof:stalled-cycles-backend:u thrpt 2 619477.824 #/op
Main.separateLists:stalled-cycles-frontend:u thrpt 2 184604.485 #/op
Main.singleListAndInstanceof:stalled-cycles-frontend:u thrpt 2 271938.450 #/op
Main.separateLists:·gc.alloc.rate thrpt 600 217.266 ± 0.846 MB/sec
Main.singleListAndInstanceof:·gc.alloc.rate thrpt 600 222.747 ± 0.556 MB/sec
Main.separateLists:·gc.alloc.rate.norm thrpt 600 202181.035 ± 2.986 B/op
Main.singleListAndInstanceof:·gc.alloc.rate.norm thrpt 600 200083.395 ± 4.720 B/op
Main.separateLists:·gc.churn.PS_Eden_Space thrpt 600 217.792 ± 3.841 MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space thrpt 600 223.528 ± 4.973 MB/sec
Main.separateLists:·gc.churn.PS_Eden_Space.norm thrpt 600 202704.197 ± 3508.997 B/op
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space.norm thrpt 600 200804.794 ± 4414.457 B/op
Main.separateLists:·gc.churn.PS_Survivor_Space thrpt 600 0.095 ± 0.008 MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space thrpt 600 0.091 ± 0.008 MB/sec
Main.separateLists:·gc.churn.PS_Survivor_Space.norm thrpt 600 88.896 ± 7.778 B/op
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space.norm thrpt 600 81.693 ± 7.269 B/op
Main.separateLists:·gc.count thrpt 600 2440.000 counts
Main.singleListAndInstanceof:·gc.count thrpt 600 2289.000 counts
Main.separateLists:·gc.time thrpt 600 4501.000 ms
Main.singleListAndInstanceof:·gc.time thrpt 600 4236.000 ms
UPDATE: Below is the benchmark code and results with the list setup code extracted into separate methods and removed from the measurement. instanceof is slower in that case, as expected; the differences above are probably related to branch prediction issues in the list setup. (While those are interesting too, they should probably go into a separate question.)
package test;
import org.openjdk.jmh.annotations.*;
import java.util.ArrayList;
import java.util.Random;
public class Main {
public static void main(String[] args) throws Exception {
org.openjdk.jmh.Main.main(args);
}
@State(Scope.Thread)
public static class SeparateListsState {
public ArrayList<Visible> visibles;
public ArrayList<Highlightable> highlightables;
public ArrayList<Selectable> selectables;
@Setup(Level.Invocation)
public void doSetup() {
visibles = new ArrayList<>();
highlightables = new ArrayList<>();
selectables = new ArrayList<>();
Random random = new Random(9698426994L + 8879);
for (int i = 0; i < 10_000; i++) {
switch (random.nextInt(9)) {
case 0:
visibles.add(new Visible1(i));
break;
case 1:
highlightables.add(new Highlightable1(i));
break;
case 2:
selectables.add(new Selectable1(i));
break;
case 3:
visibles.add(new Visible2(i));
break;
case 4:
highlightables.add(new Highlightable2(i));
break;
case 5:
selectables.add(new Selectable2(i));
break;
case 6:
visibles.add(new Visible3(i));
break;
case 7:
highlightables.add(new Highlightable3(i));
break;
case 8:
selectables.add(new Selectable3(i));
break;
}
}
}
}
@Benchmark
@BenchmarkMode(Mode.Throughput)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 150, time = 1)
@Fork(value = 2)
public static long separateLists(SeparateListsState state) {
long vsum = 0;
long hsum = 0;
long ssum = 0;
for (Visible e : state.visibles) {
vsum += e.visibleValue();
}
for (Highlightable e : state.highlightables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
}
for (Selectable e : state.selectables) {
vsum += e.visibleValue();
hsum += e.highlightableValue();
ssum += e.selectableValue();
}
return vsum * hsum * ssum;
}
@State(Scope.Thread)
public static class SingleListAndInstanceofState {
public ArrayList<Visible> visibles;
@Setup(Level.Invocation)
public void doSetup() {
visibles = new ArrayList<>();
Random random = new Random(9698426994L + 8879);
for (int i = 0; i < 10_000; i++) {
switch (random.nextInt(9)) {
case 0:
visibles.add(new Visible1(i));
break;
case 1:
visibles.add(new Highlightable1(i));
break;
case 2:
visibles.add(new Selectable1(i));
break;
case 3:
visibles.add(new Visible2(i));
break;
case 4:
visibles.add(new Highlightable2(i));
break;
case 5:
visibles.add(new Selectable2(i));
break;
case 6:
visibles.add(new Visible3(i));
break;
case 7:
visibles.add(new Highlightable3(i));
break;
case 8:
visibles.add(new Selectable3(i));
break;
}
}
}
}
@Benchmark
@BenchmarkMode(Mode.Throughput)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 150, time = 1)
@Fork(value = 2)
public static long singleListAndInstanceof(SingleListAndInstanceofState state) {
long vsum = 0;
long hsum = 0;
long ssum = 0;
for (Visible e : state.visibles) {
if (e instanceof Selectable) {
vsum += e.visibleValue();
hsum += ((Selectable) e).highlightableValue();
ssum += ((Selectable) e).selectableValue();
} else if (e instanceof Highlightable) {
vsum += e.visibleValue();
hsum += ((Highlightable) e).highlightableValue();
} else {
vsum += e.visibleValue();
}
}
return vsum * hsum * ssum;
}
}
abstract class Visible {
abstract int visibleValue();
}
abstract class Highlightable extends Visible {
abstract int highlightableValue();
}
abstract class Selectable extends Highlightable {
abstract int selectableValue();
}
class Visible1 extends Visible {
private int v;
Visible1(int v) {
this.v = v;
}
@Override int visibleValue() { return v; }
}
class Highlightable1 extends Highlightable {
private int v;
Highlightable1(int v) {
this.v = v;
}
@Override int visibleValue() { return v*2; }
@Override int highlightableValue() { return v*3; }
}
class Selectable1 extends Selectable {
private int v;
Selectable1(int v) {
this.v = v;
}
@Override int visibleValue() { return v*4; }
@Override int highlightableValue() { return v*5; }
@Override int selectableValue() { return v*6; }
}
class Visible2 extends Visible {
private int v;
Visible2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*7; }
}
class Highlightable2 extends Highlightable {
private int v;
Highlightable2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*8; }
@Override int highlightableValue() { return v*9; }
}
class Selectable2 extends Selectable {
private int v;
Selectable2(int v) {
this.v = v;
}
@Override int visibleValue() { return v*10; }
@Override int highlightableValue() { return v*11; }
@Override int selectableValue() { return v*12; }
}
class Visible3 extends Visible {
private int v;
Visible3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*13; }
}
class Highlightable3 extends Highlightable {
private int v;
Highlightable3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*14; }
@Override int highlightableValue() { return v*15; }
}
class Selectable3 extends Selectable {
private int v;
Selectable3(int v) {
this.v = v;
}
@Override int visibleValue() { return v*16; }
@Override int highlightableValue() { return v*17; }
@Override int selectableValue() { return v*18; }
}
And the results:
Main.separateLists thrpt 300 4211.552 ± 23.791 ops/s
Main.singleListAndInstanceof thrpt 300 3920.251 ± 15.478 ops/s
Main.separateLists:L1-dcache-load-misses:u thrpt 2 3046.033 #/op
Main.singleListAndInstanceof:L1-dcache-load-misses:u thrpt 2 1089.122 #/op
Main.separateLists:L1-dcache-loads:u thrpt 2 1090745.006 #/op
Main.singleListAndInstanceof:L1-dcache-loads:u thrpt 2 1125243.609 #/op
Main.separateLists:L1-icache-load-misses:u thrpt 2 150.542 #/op
Main.singleListAndInstanceof:L1-icache-load-misses:u thrpt 2 143.304 #/op
Main.separateLists:L1-icache-loads:u thrpt 2 600852.620 #/op
Main.singleListAndInstanceof:L1-icache-loads:u thrpt 2 700771.042 #/op
Main.separateLists:LLC-load-misses:u thrpt 2 1299.520 #/op
Main.singleListAndInstanceof:LLC-load-misses:u thrpt 2 636.764 #/op
Main.separateLists:LLC-loads:u thrpt 2 14408.815 #/op
Main.singleListAndInstanceof:LLC-loads:u thrpt 2 10429.768 #/op
Main.separateLists:LLC-stores:u thrpt 2 18999.178 #/op
Main.singleListAndInstanceof:LLC-stores:u thrpt 2 15370.582 #/op
Main.separateLists:branch-misses:u thrpt 2 22578.062 #/op
Main.singleListAndInstanceof:branch-misses:u thrpt 2 29257.959 #/op
Main.separateLists:branches:u thrpt 2 258026.890 #/op
Main.singleListAndInstanceof:branches:u thrpt 2 284911.889 #/op
Main.separateLists:cycles:u thrpt 2 1915774.770 #/op
Main.singleListAndInstanceof:cycles:u thrpt 2 1974841.023 #/op
Main.separateLists:dTLB-load-misses:u thrpt 2 101.573 #/op
Main.singleListAndInstanceof:dTLB-load-misses:u thrpt 2 99.982 #/op
Main.separateLists:dTLB-loads:u thrpt 2 1090174.103 #/op
Main.singleListAndInstanceof:dTLB-loads:u thrpt 2 1129185.929 #/op
Main.separateLists:iTLB-load-misses:u thrpt 2 4.432 #/op
Main.singleListAndInstanceof:iTLB-load-misses:u thrpt 2 3.955 #/op
Main.separateLists:iTLB-loads:u thrpt 2 600301.665 #/op
Main.singleListAndInstanceof:iTLB-loads:u thrpt 2 703339.482 #/op
Main.separateLists:instructions:u thrpt 2 1974603.052 #/op
Main.singleListAndInstanceof:instructions:u thrpt 2 2040460.093 #/op
Main.separateLists:stalled-cycles-backend:u thrpt 2 808914.974 #/op
Main.singleListAndInstanceof:stalled-cycles-backend:u thrpt 2 685615.056 #/op
Main.separateLists:stalled-cycles-frontend:u thrpt 2 186013.216 #/op
Main.singleListAndInstanceof:stalled-cycles-frontend:u thrpt 2 272207.204 #/op
Main.separateLists:·gc.alloc.rate thrpt 300 346.891 ± 1.166 MB/sec
Main.singleListAndInstanceof:·gc.alloc.rate thrpt 300 358.297 ± 0.614 MB/sec
Main.separateLists:·gc.alloc.rate.norm thrpt 300 310744.294 ± 0.107 B/op
Main.singleListAndInstanceof:·gc.alloc.rate.norm thrpt 300 328992.302 ± 0.110 B/op
Main.separateLists:·gc.churn.PS_Eden_Space thrpt 300 349.387 ± 14.305 MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space thrpt 300 360.039 ± 9.075 MB/sec
Main.separateLists:·gc.churn.PS_Eden_Space.norm thrpt 300 313154.953 ± 13018.012 B/op
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space.norm thrpt 300 330629.833 ± 8345.712 B/op
Main.separateLists:·gc.churn.PS_Survivor_Space thrpt 300 0.092 ± 0.012 MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space thrpt 300 0.094 ± 0.011 MB/sec
Main.separateLists:·gc.churn.PS_Survivor_Space.norm thrpt 300 82.348 ± 10.661 B/op
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space.norm thrpt 300 86.465 ± 10.417 B/op
Main.separateLists:·gc.count thrpt 300 1196.000 counts
Main.singleListAndInstanceof:·gc.count thrpt 300 1235.000 counts
Main.separateLists:·gc.time thrpt 300 2178.000 ms
Main.singleListAndInstanceof:·gc.time thrpt 300 2355.000 ms
As NoDataFound suggests in the comments, you're not just comparing the performance of iterating through the list, you're also comparing the list population methods. You need to pull this part of your code into a setup method; otherwise you're potentially going to be impacted by resize operations on your three ArrayList instances (amongst other things).
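To get a feel for how the resize concern could arise, here is a minimal sketch (not part of the original benchmark) that counts how many elements would land in each of the three lists for an unseeded Random. The class name BucketSizes and the helper bucketSizes are hypothetical; the value 3_500 is the initial capacity used in the question's first version. Whenever a count exceeds that capacity, the corresponding ArrayList would have to grow during population.

```java
import java.util.Random;

final class BucketSizes {
    /** Counts how many of n elements would land in each of the three lists. */
    static int[] bucketSizes(Random random, int n) {
        int[] sizes = new int[3];
        for (int i = 0; i < n; i++) {
            // Mirrors the switch in the question:
            // cases 0,3,6 -> visibles, 1,4,7 -> highlightables, 2,5,8 -> selectables.
            sizes[random.nextInt(9) % 3]++;
        }
        return sizes;
    }

    public static void main(String[] args) {
        int initialCapacity = 3_500;   // capacity from the question's first version
        Random random = new Random();  // unseeded, as in the question
        for (int run = 0; run < 5; run++) {
            for (int size : bucketSizes(random, 10_000)) {
                String note = size > initialCapacity ? " (list would have to grow)" : "";
                System.out.println("run " + run + ": " + size + note);
            }
        }
    }
}
```

The counts differ from run to run, so whether a resize happens inside the measured code is itself random unless the population is moved out of the benchmark method.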
You should also either scrap the use of Random to populate the list, or at least instantiate it with the same seed in both implementations; otherwise the order of elements is not repeatable between the two.
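As a rough illustration of the seeding point, the sketch below (plain Java, no JMH) generates the sequence of switch keys twice from the same seed and checks that both sequences match. The class name SeededPopulation, the helper typeKeys, and the seed value 42L are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class SeededPopulation {
    // Hypothetical seed; any fixed value works as long as both setups share it.
    static final long SEED = 42L;

    /** Returns the switch keys (0..8) that a Random with the given seed produces. */
    static List<Integer> typeKeys(long seed, int count) {
        Random random = new Random(seed);
        List<Integer> keys = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            keys.add(random.nextInt(9));
        }
        return keys;
    }

    public static void main(String[] args) {
        List<Integer> forSeparateLists = typeKeys(SEED, 10_000);
        List<Integer> forSingleList = typeKeys(SEED, 10_000);
        // Same seed, same sequence: both benchmarks would see the same mix of types.
        System.out.println("same order: " + forSeparateLists.equals(forSingleList)); // prints "same order: true"
    }
}
```

With an unseeded `new Random()` in each benchmark method, the two implementations would instead iterate over differently composed data on every invocation.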