Why instanceof and iterating single list is faster than several specialized lists?

Rogach :

I assumed that separating objects that implement different interfaces into several lists and iterating those lists afterwards would be faster than dumping all objects into a single list and then switching via instanceof. E.g. this:

ArrayList<Visible> visibles = new ArrayList<>();
ArrayList<Highlightable> highlightables = new ArrayList<>();
ArrayList<Selectable> selectables = new ArrayList<>();

// populate the lists
// Visible is an interface, Highlightable is also interface (extends Visible),
// Selectable extends Highlightable
// All interfaces have 3 concrete subclasses each,
// to test situations when JVM is not able to optimize too much due to small number of classes

for (Visible e : visibles) {
    vsum += e.visibleValue();
}
for (Highlightable e : highlightables) {
    vsum += e.visibleValue();
    hsum += e.highlightableValue();
}
for (Selectable e : selectables) {
    vsum += e.visibleValue();
    hsum += e.highlightableValue();
    ssum += e.selectableValue();
}

should be faster than

ArrayList<Visible> visibles = new ArrayList<>();

// populate the list

for (Visible e : visibles) {
    if (e instanceof Selectable) {
        vsum += e.visibleValue();
        hsum += ((Selectable) e).highlightableValue();
        ssum += ((Selectable) e).selectableValue();
    } else if (e instanceof Highlightable) {
        vsum += e.visibleValue();
        hsum += ((Highlightable) e).highlightableValue();
    } else {
        vsum += e.visibleValue();
    }
}

However it doesn't seem to be the case:

Main.separateLists             thrpt   30     1546.898 ±     32.312   ops/s
Main.singleListAndInstanceof   thrpt   30     1673.733 ±     29.804   ops/s

I added full source for the benchmark below.

What could be the cause of instanceof version being faster? Even if we assume that isntanceof instruction is free, then both versions should at least be equal in performance (because they still add elements to list and iterate them).

package test;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Warmup;

import java.util.ArrayList;
import java.util.Random;

public class Main {

    public static void main(String[] args) throws Exception {
        org.openjdk.jmh.Main.main(args);
    }

    @Benchmark
    @Warmup(iterations = 5, time = 1)
    @Measurement(iterations = 15, time = 1)
    @Fork(value = 2)
    public static long separateLists() {
        ArrayList<Visible> visibles = new ArrayList<>(3_500);
        ArrayList<Highlightable> highlightables = new ArrayList<>(3_500);
        ArrayList<Selectable> selectables = new ArrayList<>(3_500);

        Random random = new Random();
        for (int i = 0; i < 10_000; i++) {
            switch (random.nextInt(9)) {
                case 0:
                    visibles.add(new Visible1(i));
                    break;
                case 1:
                    highlightables.add(new Highlightable1(i));
                    break;
                case 2:
                    selectables.add(new Selectable1(i));
                    break;

                case 3:
                    visibles.add(new Visible2(i));
                    break;
                case 4:
                    highlightables.add(new Highlightable2(i));
                    break;
                case 5:
                    selectables.add(new Selectable2(i));
                    break;

                case 6:
                    visibles.add(new Visible3(i));
                    break;
                case 7:
                    highlightables.add(new Highlightable3(i));
                    break;
                case 8:
                    selectables.add(new Selectable3(i));
                    break;
            }
        }

        long listSize = visibles.size() + highlightables.size() + selectables.size();

        long vsum = 0;
        long hsum = 0;
        long ssum = 0;
        for (Visible e : visibles) {
            vsum += e.visibleValue();
        }
        for (Highlightable e : highlightables) {
            vsum += e.visibleValue();
            hsum += e.highlightableValue();
        }
        for (Selectable e : selectables) {
            vsum += e.visibleValue();
            hsum += e.highlightableValue();
            ssum += e.selectableValue();
        }

        return listSize + vsum * hsum * ssum;
    }

    @Benchmark
    @Warmup(iterations = 5, time = 1)
    @Measurement(iterations = 15, time = 1)
    @Fork(value = 2)
    public static long singleListAndInstanceof() {
        ArrayList<Visible> visibles = new ArrayList<>(10_000);

        Random random = new Random();
        for (int i = 0; i < 10_000; i++) {
            switch (random.nextInt(9)) {
                case 0:
                    visibles.add(new Visible1(i));
                    break;
                case 1:
                    visibles.add(new Highlightable1(i));
                    break;
                case 2:
                    visibles.add(new Selectable1(i));
                    break;

                case 3:
                    visibles.add(new Visible2(i));
                    break;
                case 4:
                    visibles.add(new Highlightable2(i));
                    break;
                case 5:
                    visibles.add(new Selectable2(i));
                    break;

                case 6:
                    visibles.add(new Visible3(i));
                    break;
                case 7:
                    visibles.add(new Highlightable3(i));
                    break;
                case 8:
                    visibles.add(new Selectable3(i));
                    break;
            }
        }

        long listSize = visibles.size();

        long vsum = 0;
        long hsum = 0;
        long ssum = 0;
        for (Visible e : visibles) {
            if (e instanceof Selectable) {
                vsum += e.visibleValue();
                hsum += ((Selectable) e).highlightableValue();
                ssum += ((Selectable) e).selectableValue();
            } else if (e instanceof Highlightable) {
                vsum += e.visibleValue();
                hsum += ((Highlightable) e).highlightableValue();
            } else {
                vsum += e.visibleValue();
            }
        }

        return listSize + vsum * hsum * ssum;
    }
}

abstract class Visible {
    abstract int visibleValue();
}
abstract class Highlightable extends Visible {
    abstract int highlightableValue();
}
abstract class Selectable extends Highlightable {
    abstract int selectableValue();
}

class Visible1 extends Visible {
    private int v;
    Visible1(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v; }
}
class Highlightable1 extends Highlightable {
    private int v;
    Highlightable1(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*2; }
    @Override int highlightableValue() { return v*3; }
}
class Selectable1 extends Selectable {
    private int v;
    Selectable1(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*4; }
    @Override int highlightableValue() { return v*5; }
    @Override int selectableValue() { return v*6; }
}

class Visible2 extends Visible {
    private int v;
    Visible2(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*7; }
}
class Highlightable2 extends Highlightable {
    private int v;
    Highlightable2(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*8; }
    @Override int highlightableValue() { return v*9; }
}
class Selectable2 extends Selectable {
    private int v;
    Selectable2(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*10; }
    @Override int highlightableValue() { return v*11; }
    @Override int selectableValue() { return v*12; }
}


class Visible3 extends Visible {
    private int v;
    Visible3(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*13; }
}
class Highlightable3 extends Highlightable {
    private int v;
    Highlightable3(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*14; }
    @Override int highlightableValue() { return v*15; }
}
class Selectable3 extends Selectable {
    private int v;
    Selectable3(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*16; }
    @Override int highlightableValue() { return v*17; }
    @Override int selectableValue() { return v*18; }
}

Benchmark output:

Main.separateLists                                             thrpt  600     1690.522 ±    6.570   ops/s
Main.singleListAndInstanceof                                   thrpt  600     1751.375 ±    4.368   ops/s
Main.separateLists:L1-dcache-load-misses:u                     thrpt    2     2298.258               #/op
Main.singleListAndInstanceof:L1-dcache-load-misses:u           thrpt    2      627.451               #/op
Main.separateLists:L1-dcache-loads:u                           thrpt    2  1217756.290               #/op
Main.singleListAndInstanceof:L1-dcache-loads:u                 thrpt    2  1135982.650               #/op
Main.separateLists:L1-icache-load-misses:u                     thrpt    2      113.599               #/op
Main.singleListAndInstanceof:L1-icache-load-misses:u           thrpt    2       99.896               #/op
Main.separateLists:L1-icache-loads:u                           thrpt    2   656048.382               #/op
Main.singleListAndInstanceof:L1-icache-loads:u                 thrpt    2   694074.004               #/op
Main.separateLists:LLC-load-misses:u                           thrpt    2      872.681               #/op
Main.singleListAndInstanceof:LLC-load-misses:u                 thrpt    2      355.666               #/op
Main.separateLists:LLC-loads:u                                 thrpt    2    12036.496               #/op
Main.singleListAndInstanceof:LLC-loads:u                       thrpt    2     7445.434               #/op
Main.separateLists:LLC-stores:u                                thrpt    2    15277.223               #/op
Main.singleListAndInstanceof:LLC-stores:u                      thrpt    2    10649.517               #/op
Main.separateLists:branch-misses:u                             thrpt    2    22463.763               #/op
Main.singleListAndInstanceof:branch-misses:u                   thrpt    2    29940.958               #/op
Main.separateLists:branches:u                                  thrpt    2   254018.586               #/op
Main.singleListAndInstanceof:branches:u                        thrpt    2   275450.951               #/op
Main.separateLists:cycles:u                                    thrpt    2  1988517.839               #/op
Main.singleListAndInstanceof:cycles:u                          thrpt    2  1921584.057               #/op
Main.separateLists:dTLB-load-misses:u                          thrpt    2       66.212               #/op
Main.singleListAndInstanceof:dTLB-load-misses:u                thrpt    2       64.442               #/op
Main.separateLists:dTLB-loads:u                                thrpt    2  1217929.340               #/op
Main.singleListAndInstanceof:dTLB-loads:u                      thrpt    2  1135799.981               #/op
Main.separateLists:iTLB-load-misses:u                          thrpt    2        4.179               #/op
Main.singleListAndInstanceof:iTLB-load-misses:u                thrpt    2        3.876               #/op
Main.separateLists:iTLB-loads:u                                thrpt    2   656595.175               #/op
Main.singleListAndInstanceof:iTLB-loads:u                      thrpt    2   693913.010               #/op
Main.separateLists:instructions:u                              thrpt    2  2273646.245               #/op
Main.singleListAndInstanceof:instructions:u                    thrpt    2  2045332.939               #/op
Main.separateLists:stalled-cycles-backend:u                    thrpt    2   773671.154               #/op
Main.singleListAndInstanceof:stalled-cycles-backend:u          thrpt    2   619477.824               #/op
Main.separateLists:stalled-cycles-frontend:u                   thrpt    2   184604.485               #/op
Main.singleListAndInstanceof:stalled-cycles-frontend:u         thrpt    2   271938.450               #/op
Main.separateLists:·gc.alloc.rate                              thrpt  600      217.266 ±    0.846  MB/sec
Main.singleListAndInstanceof:·gc.alloc.rate                    thrpt  600      222.747 ±    0.556  MB/sec
Main.separateLists:·gc.alloc.rate.norm                         thrpt  600   202181.035 ±    2.986    B/op
Main.singleListAndInstanceof:·gc.alloc.rate.norm               thrpt  600   200083.395 ±    4.720    B/op
Main.separateLists:·gc.churn.PS_Eden_Space                     thrpt  600      217.792 ±    3.841  MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space           thrpt  600      223.528 ±    4.973  MB/sec
Main.separateLists:·gc.churn.PS_Eden_Space.norm                thrpt  600   202704.197 ± 3508.997    B/op
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space.norm      thrpt  600   200804.794 ± 4414.457    B/op
Main.separateLists:·gc.churn.PS_Survivor_Space                 thrpt  600        0.095 ±    0.008  MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space       thrpt  600        0.091 ±    0.008  MB/sec
Main.separateLists:·gc.churn.PS_Survivor_Space.norm            thrpt  600       88.896 ±    7.778    B/op
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space.norm  thrpt  600       81.693 ±    7.269    B/op
Main.separateLists:·gc.count                                   thrpt  600     2440.000             counts
Main.singleListAndInstanceof:·gc.count                         thrpt  600     2289.000             counts
Main.separateLists:·gc.time                                    thrpt  600     4501.000                 ms
Main.singleListAndInstanceof:·gc.time                          thrpt  600     4236.000                 ms

UPDATE: Below is the benchmark code and results with array setup code extracted into separate methods and removed from the measurement. instanceof is slower for that case, as expected - above differences are probably related to branch prediction issues in the list setup. (while those are interesting too, they probably should go into separate question)

package test;

import org.openjdk.jmh.annotations.*;

import java.util.ArrayList;
import java.util.Random;

public class Main {

    public static void main(String[] args) throws Exception {
        org.openjdk.jmh.Main.main(args);
    }

    @State(Scope.Thread)
    public static class SeparateListsState {
        public ArrayList<Visible> visibles;
        public ArrayList<Highlightable> highlightables;
        public ArrayList<Selectable> selectables;

        @Setup(Level.Invocation)
        public void doSetup() {
            visibles = new ArrayList<>();
            highlightables = new ArrayList<>();
            selectables = new ArrayList<>();

            Random random = new Random(9698426994L + 8879);
            for (int i = 0; i < 10_000; i++) {
                switch (random.nextInt(9)) {
                    case 0:
                        visibles.add(new Visible1(i));
                        break;
                    case 1:
                        highlightables.add(new Highlightable1(i));
                        break;
                    case 2:
                        selectables.add(new Selectable1(i));
                        break;

                    case 3:
                        visibles.add(new Visible2(i));
                        break;
                    case 4:
                        highlightables.add(new Highlightable2(i));
                        break;
                    case 5:
                        selectables.add(new Selectable2(i));
                        break;

                    case 6:
                        visibles.add(new Visible3(i));
                        break;
                    case 7:
                        highlightables.add(new Highlightable3(i));
                        break;
                    case 8:
                        selectables.add(new Selectable3(i));
                        break;
                }
            }
        }
    }

    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    @Warmup(iterations = 5, time = 1)
    @Measurement(iterations = 150, time = 1)
    @Fork(value = 2)
    public static long separateLists(SeparateListsState state) {
        long vsum = 0;
        long hsum = 0;
        long ssum = 0;
        for (Visible e : state.visibles) {
            vsum += e.visibleValue();
        }
        for (Highlightable e : state.highlightables) {
            vsum += e.visibleValue();
            hsum += e.highlightableValue();
        }
        for (Selectable e : state.selectables) {
            vsum += e.visibleValue();
            hsum += e.highlightableValue();
            ssum += e.selectableValue();
        }

        return vsum * hsum * ssum;
    }

    @State(Scope.Thread)
    public static class SingleListAndInstanceofState {
        public ArrayList<Visible> visibles;

        @Setup(Level.Invocation)
        public void doSetup() {
            visibles = new ArrayList<>();

            Random random = new Random(9698426994L + 8879);
            for (int i = 0; i < 10_000; i++) {
                switch (random.nextInt(9)) {
                    case 0:
                        visibles.add(new Visible1(i));
                        break;
                    case 1:
                        visibles.add(new Highlightable1(i));
                        break;
                    case 2:
                        visibles.add(new Selectable1(i));
                        break;

                    case 3:
                        visibles.add(new Visible2(i));
                        break;
                    case 4:
                        visibles.add(new Highlightable2(i));
                        break;
                    case 5:
                        visibles.add(new Selectable2(i));
                        break;

                    case 6:
                        visibles.add(new Visible3(i));
                        break;
                    case 7:
                        visibles.add(new Highlightable3(i));
                        break;
                    case 8:
                        visibles.add(new Selectable3(i));
                        break;
                }
            }
        }
    }

    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    @Warmup(iterations = 5, time = 1)
    @Measurement(iterations = 150, time = 1)
    @Fork(value = 2)
    public static long singleListAndInstanceof(SingleListAndInstanceofState state) {
        long vsum = 0;
        long hsum = 0;
        long ssum = 0;
        for (Visible e : state.visibles) {
            if (e instanceof Selectable) {
                vsum += e.visibleValue();
                hsum += ((Selectable) e).highlightableValue();
                ssum += ((Selectable) e).selectableValue();
            } else if (e instanceof Highlightable) {
                vsum += e.visibleValue();
                hsum += ((Highlightable) e).highlightableValue();
            } else {
                vsum += e.visibleValue();
            }
        }

        return vsum * hsum * ssum;
    }
}

abstract class Visible {
    abstract int visibleValue();
}
abstract class Highlightable extends Visible {
    abstract int highlightableValue();
}
abstract class Selectable extends Highlightable {
    abstract int selectableValue();
}

class Visible1 extends Visible {
    private int v;
    Visible1(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v; }
}
class Highlightable1 extends Highlightable {
    private int v;
    Highlightable1(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*2; }
    @Override int highlightableValue() { return v*3; }
}
class Selectable1 extends Selectable {
    private int v;
    Selectable1(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*4; }
    @Override int highlightableValue() { return v*5; }
    @Override int selectableValue() { return v*6; }
}

class Visible2 extends Visible {
    private int v;
    Visible2(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*7; }
}
class Highlightable2 extends Highlightable {
    private int v;
    Highlightable2(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*8; }
    @Override int highlightableValue() { return v*9; }
}
class Selectable2 extends Selectable {
    private int v;
    Selectable2(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*10; }
    @Override int highlightableValue() { return v*11; }
    @Override int selectableValue() { return v*12; }
}


class Visible3 extends Visible {
    private int v;
    Visible3(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*13; }
}
class Highlightable3 extends Highlightable {
    private int v;
    Highlightable3(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*14; }
    @Override int highlightableValue() { return v*15; }
}
class Selectable3 extends Selectable {
    private int v;
    Selectable3(int v) {
        this.v = v;
    }
    @Override int visibleValue() { return v*16; }
    @Override int highlightableValue() { return v*17; }
    @Override int selectableValue() { return v*18; }
}

And the results:

Main.separateLists                                             thrpt  300     4211.552 ±    23.791   ops/s
Main.singleListAndInstanceof                                   thrpt  300     3920.251 ±    15.478   ops/s
Main.separateLists:L1-dcache-load-misses:u                     thrpt    2     3046.033                #/op
Main.singleListAndInstanceof:L1-dcache-load-misses:u           thrpt    2     1089.122                #/op
Main.separateLists:L1-dcache-loads:u                           thrpt    2  1090745.006                #/op
Main.singleListAndInstanceof:L1-dcache-loads:u                 thrpt    2  1125243.609                #/op
Main.separateLists:L1-icache-load-misses:u                     thrpt    2      150.542                #/op
Main.singleListAndInstanceof:L1-icache-load-misses:u           thrpt    2      143.304                #/op
Main.separateLists:L1-icache-loads:u                           thrpt    2   600852.620                #/op
Main.singleListAndInstanceof:L1-icache-loads:u                 thrpt    2   700771.042                #/op
Main.separateLists:LLC-load-misses:u                           thrpt    2     1299.520                #/op
Main.singleListAndInstanceof:LLC-load-misses:u                 thrpt    2      636.764                #/op
Main.separateLists:LLC-loads:u                                 thrpt    2    14408.815                #/op
Main.singleListAndInstanceof:LLC-loads:u                       thrpt    2    10429.768                #/op
Main.separateLists:LLC-stores:u                                thrpt    2    18999.178                #/op
Main.singleListAndInstanceof:LLC-stores:u                      thrpt    2    15370.582                #/op
Main.separateLists:branch-misses:u                             thrpt    2    22578.062                #/op
Main.singleListAndInstanceof:branch-misses:u                   thrpt    2    29257.959                #/op
Main.separateLists:branches:u                                  thrpt    2   258026.890                #/op
Main.singleListAndInstanceof:branches:u                        thrpt    2   284911.889                #/op
Main.separateLists:cycles:u                                    thrpt    2  1915774.770                #/op
Main.singleListAndInstanceof:cycles:u                          thrpt    2  1974841.023                #/op
Main.separateLists:dTLB-load-misses:u                          thrpt    2      101.573                #/op
Main.singleListAndInstanceof:dTLB-load-misses:u                thrpt    2       99.982                #/op
Main.separateLists:dTLB-loads:u                                thrpt    2  1090174.103                #/op
Main.singleListAndInstanceof:dTLB-loads:u                      thrpt    2  1129185.929                #/op
Main.separateLists:iTLB-load-misses:u                          thrpt    2        4.432                #/op
Main.singleListAndInstanceof:iTLB-load-misses:u                thrpt    2        3.955                #/op
Main.separateLists:iTLB-loads:u                                thrpt    2   600301.665                #/op
Main.singleListAndInstanceof:iTLB-loads:u                      thrpt    2   703339.482                #/op
Main.separateLists:instructions:u                              thrpt    2  1974603.052                #/op
Main.singleListAndInstanceof:instructions:u                    thrpt    2  2040460.093                #/op
Main.separateLists:stalled-cycles-backend:u                    thrpt    2   808914.974                #/op
Main.singleListAndInstanceof:stalled-cycles-backend:u          thrpt    2   685615.056                #/op
Main.separateLists:stalled-cycles-frontend:u                   thrpt    2   186013.216                #/op
Main.singleListAndInstanceof:stalled-cycles-frontend:u         thrpt    2   272207.204                #/op
Main.separateLists:·gc.alloc.rate                              thrpt  300      346.891 ±     1.166  MB/sec
Main.singleListAndInstanceof:·gc.alloc.rate                    thrpt  300      358.297 ±     0.614  MB/sec
Main.separateLists:·gc.alloc.rate.norm                         thrpt  300   310744.294 ±     0.107    B/op
Main.singleListAndInstanceof:·gc.alloc.rate.norm               thrpt  300   328992.302 ±     0.110    B/op
Main.separateLists:·gc.churn.PS_Eden_Space                     thrpt  300      349.387 ±    14.305  MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space           thrpt  300      360.039 ±     9.075  MB/sec
Main.separateLists:·gc.churn.PS_Eden_Space.norm                thrpt  300   313154.953 ± 13018.012    B/op
Main.singleListAndInstanceof:·gc.churn.PS_Eden_Space.norm      thrpt  300   330629.833 ±  8345.712    B/op
Main.separateLists:·gc.churn.PS_Survivor_Space                 thrpt  300        0.092 ±     0.012  MB/sec
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space       thrpt  300        0.094 ±     0.011  MB/sec
Main.separateLists:·gc.churn.PS_Survivor_Space.norm            thrpt  300       82.348 ±    10.661    B/op
Main.singleListAndInstanceof:·gc.churn.PS_Survivor_Space.norm  thrpt  300       86.465 ±    10.417    B/op
Main.separateLists:·gc.count                                   thrpt  300     1196.000              counts
Main.singleListAndInstanceof:·gc.count                         thrpt  300     1235.000              counts
Main.separateLists:·gc.time                                    thrpt  300     2178.000                  ms
Main.singleListAndInstanceof:·gc.time                          thrpt  300     2355.000                  ms
Catchwa :

As NoDataFound suggests in the comments, you're not just comparing the performance of iterating through the list, you're also comparing the list population methods. You need to pull this part of your code into a setup method - otherwise you're potentially going to be impacted by resize operations on your three ArrayList instances (amongst other things).

You should also either scrap the use of Random to populate the list, or at least instantiate it with the same seed across both implementations - otherwise you're not creating a repeatable order of elements across both implementations.

Guess you like

Origin http://43.154.161.224:23101/article/api/json?id=120777&siteId=1