Efficiently summing ten million numbers

Foreword

Today I came across an interview question:

Given ten million numbers, how do you sum them efficiently?

Seeing "efficient summation" in the question, my first reaction was the design idea behind the LongAdder class provided by JDK 1.8: sum in segments, then aggregate. That is, start multiple threads, make each one responsible for part of the calculation, and combine the partial results once every thread has finished. The whole process is as follows:
[Figure: Efficient sum]

With the idea in place, let's happily start coding.

Test environment

  • Windows 10
  • 4-core, 4-thread CPU
  • JDK 1.8
  • com.google.guava.guava-25.1-jre.jar
  • lombok

Examples

The question does not say what kind of numbers the ten million are, so I tentatively use random numbers of type int. To compare efficiency, I implemented a single-threaded version and a multi-threaded version to see how much faster multi-threading really is.

Single-threaded version

Summing ten million numbers on a single thread is simple, so here is the code directly.

/**
 * Sum on a single thread
 * @param arr ten million random numbers
 */
public static int singleThreadSum(int[] arr) {
    long start = System.currentTimeMillis();
    int sum = 0;
    int length = arr.length;
    for (int i = 0; i < length; i++) {
        sum += arr[i];
    }
    long end = System.currentTimeMillis();
    log.info("Single-threaded result: {}, elapsed: {} s", sum, (end - start) / 1000.0);
    return sum;
}

Multi-threaded version

The multi-threaded version involves a thread pool (to start multiple threads), CountDownLatch (so the main thread waits for the worker threads to finish), and other tools, so it is slightly more complicated.

// Number of elements summed by each task
private static final int SIZE_PER_TASK = 200000;
// Thread pool
private static ThreadPoolExecutor executor = null;

static {
    // Core pool size: number of CPUs + 1
    int corePoolSize = Runtime.getRuntime().availableProcessors() + 1;
    executor = new ThreadPoolExecutor(corePoolSize, corePoolSize, 3, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
}

/**
 * Sum using multiple threads
 *
 * @param arr ten million random numbers
 * @throws InterruptedException
 */
public static int concurrencySum(int[] arr) throws InterruptedException {
    long start = System.currentTimeMillis();
    LongAdder sum = new LongAdder();
    // Split into tasks. Note: Arrays.asList(int[]) returns a one-element List<int[]>,
    // so partition() produces a single task that holds the whole array.
    List<List<int[]>> taskList = Lists.partition(Arrays.asList(arr), SIZE_PER_TASK);
    // Total number of tasks
    final int taskSize = taskList.size();
    final CountDownLatch latch = new CountDownLatch(taskSize);
    for (int i = 0; i < taskSize; i++) {
        int[] task = taskList.get(i).get(0);
        executor.submit(() -> {
            try {
                for (int num : task) {
                    // Add every number in the task to the shared LongAdder
                    sum.add(num);
                }
            } finally {
                // Task finished: decrement the latch
                latch.countDown();
            }
        });
    }
    // The main thread waits for all worker threads to finish
    latch.await();
    long end = System.currentTimeMillis();
    log.info("Multi-threaded result: {}, elapsed: {} s", sum, (end - start) / 1000.0);
    // Shut down the thread pool
    executor.shutdown();
    return sum.intValue();
}

The code is commented in detail, so I won't repeat it here.
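One detail worth checking in the listing: calling Arrays.asList on an int[] yields a List<int[]> with a single element, so Lists.partition returns one task that contains the whole array. Below is a minimal sketch of splitting by index range instead. The class name RangeSplitSum and the pool created inside the method are assumptions made for illustration and are not part of the original source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of the original EfficientSum class.
class RangeSplitSum {
    static long sum(int[] arr, int sizePerTask) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        try {
            // One Callable per index range [start, end)
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int from = 0; from < arr.length; from += sizePerTask) {
                final int start = from;
                final int end = Math.min(from + sizePerTask, arr.length);
                tasks.add(() -> {
                    long subtotal = 0;
                    for (int i = start; i < end; i++) {
                        subtotal += arr[i];
                    }
                    return subtotal;
                });
            }
            // invokeAll blocks until every task has completed
            long total = 0;
            for (Future<Long> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

With ten million elements and a task size of 200000 this produces 50 tasks, each reading its own slice of the shared array without copying it.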

main method

The main method is simple: it generates ten million random numbers and then calls the two methods.

// How many numbers to sum
private static final int SUM_COUNT = 10000000;

public static void main(String[] args) throws InterruptedException {
    Random random = new Random();
    int[] arr = new int[SUM_COUNT];
    for (int i = 0; i < SUM_COUNT; i++) {
        arr[i] = random.nextInt(200);
    }

    // Multi-threaded version
    concurrencySum(arr);
    // Single-threaded version
    singleThreadSum(arr);
}

On line 8 of the code above, why random.nextInt(200)?
Because 10,000,000 × 200 = 2,000,000,000 < Integer.MAX_VALUE (2,147,483,647), so the accumulated result cannot overflow.
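The bound can be checked with a few lines of arithmetic. This is only a sketch: the class OverflowCheck and its method names are made up for illustration. Since nextInt(bound) returns values in [0, bound), the worst case is every element being bound - 1.

```java
// Hypothetical helper for checking whether an int accumulator is safe.
class OverflowCheck {
    // Largest possible sum when every element equals (bound - 1)
    static long worstCaseSum(int count, int bound) {
        return (long) count * (bound - 1);
    }

    static boolean fitsInInt(int count, int bound) {
        return worstCaseSum(count, bound) <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(worstCaseSum(10_000_000, 200)); // 1990000000
        System.out.println(fitsInInt(10_000_000, 200));    // true
        System.out.println(fitsInInt(10_000_000, 300));    // false
    }
}
```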

Finally it is time to test the efficiency and see what each version is really made of.
Full of confidence, I clicked run and got the following results:

22:13:31.068 [main] INFO com.sicimike.concurrency.EfficientSum - Multi-threaded result: 995523090, elapsed: 0.133 s
22:13:31.079 [main] INFO com.sicimike.concurrency.EfficientSum - Single-threaded result: 995523090, elapsed: 0.006 s

I must have opened it the wrong way...

However

After many runs, including many more runs with adjusted thread pool parameters, the results were always painful to look at.
The multi-threaded version was stable at around 0.130 seconds, and the single-threaded version was stable at around 0.006 seconds.
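A single System.currentTimeMillis() measurement also includes one-off costs such as JIT warm-up and thread start-up, so repeating the measurement is a reasonable sanity check. The following is a rough sketch under that assumption; RepeatTimer and its parameters are hypothetical, and it is not a replacement for a proper benchmark harness.

```java
import java.util.function.ToLongFunction;

// Hypothetical timing helper: untimed warm-up runs, then the best of several timed runs.
class RepeatTimer {
    static long bestNanos(ToLongFunction<int[]> task, int[] data, int warmups, int runs) {
        long sink = 0; // accumulate results so the work is not optimized away
        for (int i = 0; i < warmups; i++) {
            sink += task.applyAsLong(data);
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            sink += task.applyAsLong(data);
            best = Math.min(best, System.nanoTime() - t0);
        }
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // only here to keep `sink` observable
        }
        return best;
    }
}
```

For example, `RepeatTimer.bestNanos(a -> { long s = 0; for (int x : a) s += x; return s; }, arr, 5, 10)` times a plain loop over arr.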

Improving the multi-threaded version

The earlier multi-threaded version used the LongAdder class. Because LongAdder relies heavily on CAS operations under the hood, its efficiency drops to varying degrees when thread contention is heavy. So in the improved multi-threaded version I dropped LongAdder in favor of an approach better suited to this scenario.

/**
 * Sum using multiple threads (improved version)
 *
 * @param arr ten million random numbers
 * @throws InterruptedException
 */
public static int concurrencySum(int[] arr) throws InterruptedException {
    long start = System.currentTimeMillis();
    int sum = 0;
    // Split into tasks. Note: Arrays.asList(int[]) returns a one-element List<int[]>,
    // so partition() produces a single task that holds the whole array.
    List<List<int[]>> taskList = Lists.partition(Arrays.asList(arr), SIZE_PER_TASK);
    // Total number of tasks
    final int taskSize = taskList.size();
    final CountDownLatch latch = new CountDownLatch(taskSize);
    // Plays the role of Cell[] in LongAdder
    int[] result = new int[taskSize];
    for (int i = 0; i < taskSize; i++) {
        int[] task = taskList.get(i).get(0);
        final int index = i;
        executor.submit(() -> {
            try {
                for (int num : task) {
                    // Each worker thread accumulates independently;
                    // every slot of result holds the subtotal of one task
                    result[index] += num;
                }
            } finally {
                latch.countDown();
            }
        });
    }
    // Wait for all worker threads to finish
    latch.await();
    for (int i : result) {
        // Adding up the per-task subtotals gives the final result
        sum += i;
    }
    long end = System.currentTimeMillis();
    log.info("Multi-threaded result: {}, elapsed: {} s", sum, (end - start) / 1000.0);
    // Shut down the thread pool
    executor.shutdown();
    return sum;
}

Running the improved method gave the following results:

22:46:05.085 [main] INFO com.sicimike.concurrency.EfficientSum - Multi-threaded result: 994958790, elapsed: 0.049 s
22:46:05.094 [main] INFO com.sicimike.concurrency.EfficientSum - Single-threaded result: 994958790, elapsed: 0.006 s

After many runs, including many more runs with adjusted thread pool parameters, the results were stable.
The multi-threaded version was stable at around 0.049 seconds, and the single-threaded version was stable at around 0.006 seconds.

Going from 0.133 seconds to 0.049 seconds is roughly a 170% improvement in efficiency (0.133 / 0.049 ≈ 2.7×).

Thoughts

The improved code not only failed to answer why the single-threaded version is faster than the multi-threaded one, it also raised another question:

Why is a casually introduced array faster than even Doug Lea's LongAdder?

Because LongAdder is a general-purpose utility class that strikes a good balance between time and space, it performs well across many scenarios. The result array in this example has one slot per task: however many tasks the ten million numbers are split into, that is the length of the array. Each task's result is stored in its own array entry, so there is no contention, but more space is used. That makes it more time-efficient, which is the classic idea of trading space for time.
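The same trade-off can be illustrated while keeping LongAdder: if each task first sums into a local variable and touches the shared adder only once, the number of CAS operations drops from one per element to one per task. This is a sketch under my own assumptions, not the author's code; the class LocalSubtotalSum and the fixed pool of four threads are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical variant: local subtotal per task, one LongAdder.add per task.
class LocalSubtotalSum {
    static long sum(int[] arr, int sizePerTask) throws InterruptedException {
        int taskCount = (arr.length + sizePerTask - 1) / sizePerTask;
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CountDownLatch latch = new CountDownLatch(taskCount);
        LongAdder total = new LongAdder();
        for (int t = 0; t < taskCount; t++) {
            final int start = t * sizePerTask;
            final int end = Math.min(start + sizePerTask, arr.length);
            pool.submit(() -> {
                try {
                    long local = 0; // thread-private, no contention
                    for (int i = start; i < end; i++) {
                        local += arr[i];
                    }
                    total.add(local); // single update of the shared adder
                } finally {
                    latch.countDown();
                }
            });
        }
        latch.await(); // all subtotals have been added once this returns
        pool.shutdown();
        return total.sum();
    }
}
```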

As for why the single-threaded version is faster than the multi-threaded one, that is not hard to explain. A single thread incurs no context switches, and because accumulation is such a simple operation, each task runs for a very short time, so it is normal for the single thread to be faster.

The stream approach

stream is syntactic sugar provided by JDK 1.8, and it is single-threaded. You can look up stream usage on your own; it is included here mainly as a comparison for the parallel stream below.

public static int streamSum(List<Integer> list) {
    long start = System.currentTimeMillis();
    int sum = list.stream().mapToInt(num -> num).sum();
    long end = System.currentTimeMillis();
    log.info("stream result: {}, elapsed: {} s", sum, (end - start) / 1000.0);
    return sum;
}

The parallelStream approach

As the name suggests, parallelStream is the parallel form of stream.

public static int parallelStreamSum(List<Integer> list) {
    long start = System.currentTimeMillis();
    int sum = list.parallelStream().mapToInt(num -> num).sum();
    long end = System.currentTimeMillis();
    log.info("parallel stream result: {}, elapsed: {} s", sum, (end - start) / 1000.0);
    return sum;
}
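Both stream methods work on a List<Integer>, so mapToInt has to unbox every element. As a point of comparison, a primitive IntStream can be built directly from the int[] with Arrays.stream, which avoids the boxed list altogether. A minimal sketch follows; the class name PrimitiveStreamSum is an assumption and not part of the original source.

```java
import java.util.Arrays;

// Hypothetical helper: sums an int[] through IntStream, without boxing.
class PrimitiveStreamSum {
    static int sequential(int[] arr) {
        return Arrays.stream(arr).sum();
    }

    static int parallel(int[] arr) {
        // parallel() runs the reduction on the common ForkJoinPool
        return Arrays.stream(arr).parallel().sum();
    }
}
```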

The ForkJoin approach

The ForkJoin framework, introduced in JDK 1.7, splits a task into subtasks, computes them, and merges the results.

When we need to run a large number of small tasks, experienced Java developers use a thread pool to execute them efficiently. However, some tasks, such as sorting an array of more than ten million elements, can be executed concurrently in principle, but breaking them into smaller tasks has to happen dynamically while the task is running. In this way, a large task is split into small tasks, small tasks can be split further, and the results are finally merged into the overall result. This is the Fork/Join model.

Using the ForkJoin framework roughly breaks down into two parts: implementing the ForkJoin task and executing it.

Implementing the ForkJoin task

Define a class that extends RecursiveTask (with a return value) or RecursiveAction (no return value) and implement the compute method.
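For the no-return-value case, a RecursiveAction looks much the same except that compute returns void. The following is a small sketch, not taken from the original source: the class DoubleInPlace, its threshold, and the use of ForkJoinPool.commonPool() are all assumptions chosen for illustration. It doubles every element of an array in place.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical RecursiveAction: doubles each element of arr[from, to) in place.
class DoubleInPlace extends RecursiveAction {
    private static final int THRESHOLD = 1_000;
    private final int[] arr;
    private final int from;
    private final int to;

    DoubleInPlace(int[] arr, int from, int to) {
        this.arr = arr;
        this.from = from;
        this.to = to;
    }

    @Override
    protected void compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: do the work directly
            for (int i = from; i < to; i++) {
                arr[i] *= 2;
            }
            return;
        }
        // Otherwise split in half and run both halves
        int mid = from + (to - from) / 2;
        invokeAll(new DoubleInPlace(arr, from, mid), new DoubleInPlace(arr, mid, to));
    }

    static void run(int[] arr) {
        ForkJoinPool.commonPool().invoke(new DoubleInPlace(arr, 0, arr.length));
    }
}
```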

/**
 * ForkJoin task, implemented as
 * a static nested class
 */
static class SicForkJoinTask extends RecursiveTask<Integer> {
    // Start of this subtask's range (inclusive)
    private Integer left;
    // End of this subtask's range (exclusive)
    private Integer right;
    private int[] arr;

    @Override
    protected Integer compute() {
        if (right - left < SIZE_PER_TASK) {
            // The task is small enough: compute directly
            int sum = 0;
            for (int i = left; i < right; i++) {
                sum += arr[i];
            }
            return sum;
        }
        // Otherwise keep splitting the task
        int middle = left + (right - left) / 2;
        SicForkJoinTask leftTask = new SicForkJoinTask(arr, left, middle);
        SicForkJoinTask rightTask = new SicForkJoinTask(arr, middle, right);
        invokeAll(leftTask, rightTask);
        Integer leftResult = leftTask.join();
        Integer rightResult = rightTask.join();
        return leftResult + rightResult;
    }

    public SicForkJoinTask(int[] arr, Integer left, Integer right) {
        this.arr = arr;
        this.left = left;
        this.right = right;
    }
}

Executing the task

The ForkJoin task is executed through the invoke method of ForkJoinPool.

// ForkJoin thread pool
private static final ForkJoinPool forkJoinPool = new ForkJoinPool();

public static int forkJoinSum(int[] arr) {
    long start = System.currentTimeMillis();
    // Execute the ForkJoin task
    Integer sum = forkJoinPool.invoke(new SicForkJoinTask(arr, 0, SUM_COUNT));
    long end = System.currentTimeMillis();
    log.info("forkjoin result: {}, elapsed: {} s", sum, (end - start) / 1000.0);
    return sum;
}

main method

public static void main(String[] args) throws InterruptedException {
    Random random = new Random();
    int[] arr = new int[SUM_COUNT];
    List<Integer> list = new ArrayList<>(SUM_COUNT);
    int currNum = 0;
    for (int i = 0; i < SUM_COUNT; i++) {
        currNum = random.nextInt(200);
        arr[i] = currNum;
        list.add(currNum);
    }

    // Single-threaded
    singleThreadSum(arr);

    // Executor thread pool
    concurrencySum(arr);

    // stream
    streamSum(list);

    // Parallel stream
    parallelStreamSum(list);

    // ForkJoin thread pool
    forkJoinSum(arr);
}

Results

23:19:21.207 [main] INFO com.sicimike.concurrency.EfficientSum - Single-threaded result: 994917205, elapsed: 0.006 s
23:19:21.274 [main] INFO com.sicimike.concurrency.EfficientSum - Multi-threaded result: 994917205, elapsed: 0.062 s
23:19:21.292 [main] INFO com.sicimike.concurrency.EfficientSum - stream result: 994917205, elapsed: 0.018 s
23:19:21.309 [main] INFO com.sicimike.concurrency.EfficientSum - parallel stream result: 994917205, elapsed: 0.017 s
23:19:21.321 [main] INFO com.sicimike.concurrency.EfficientSum - forkjoin result: 994917205, elapsed: 0.012 s

Source

Code Address: EfficientSum.java
If you are interested, download the source code, adjust the parameters, and run it yourself; your results will not necessarily match mine.

Summary

I wrote a whole pile of code, and the original question is still unanswered. Some might say: blogger, are you messing with us?
Indeed, I have not come up with a better way, but thinking through the few questions raised in this post should be worth more than a single interview question.

If anyone knows a better way to optimize this, please let me know.

Reference

Java Fork/Join tasks: are you writing them correctly?


Origin blog.csdn.net/Baisitao_/article/details/103831031