This is the sixth article in the Java multithreading series. The earlier articles in the series are:

Java multithreading - where do threads come from?

Java multithreading - memory model

Java multithreading - volatile

Java multithreading - lock

Java multithreading - CAS

Concurrent programs can be written in much the same style as serial programs. The difficulty is that concurrent programs are nondeterministic: they can fail in far more ways than serial programs, and the failures follow no fixed pattern. This makes them harder to run reliably in production, so developers face a new challenge: exposing as many of these problems as possible during testing, and understanding where the performance bottlenecks are.

Building on the multithreading topics covered so far, this article summarizes the methods and principles needed for testing multithreaded code. The aim is to expose the safety and performance problems of concurrent programs during development, so that multithreaded code runs efficiently and safely.

This article mainly includes the following contents:

1. Concurrent test classification

2. Correctness test

3. Safety testing

4. Performance test

Concurrent test classification

Test process

Concurrent testing has parts in common with serial testing. Both need to verify that the code is correct when run serially, which is the foundation for all later tests. This correctness testing is no different from ordinary serial testing: it checks that the program produces the expected result when executed in a single thread, so we will not go into it here.

For general concurrency testing, we follow this process.

Classification

Concurrency testing can be roughly divided into two categories: safety testing and liveness testing.

Safety can be defined as "nothing bad ever happens", which can also be understood as the program keeping its state consistent. Take the i++ operation: in a single thread, a loop of 20 iterations ends with i=20. If several threads perform 20 increments in total and the result is not 20, the result is wrong and the code has a thread safety problem. To test for this kind of problem, we add a "test point" that verifies atomicity without affecting the correctness of the program, and use it as the pass/fail condition of the test code. How to build such a "test point" is discussed later.
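The i++ problem described above can be reproduced with a small sketch. The class name CounterRaceDemo and its fields are made up for illustration and are not part of the original test code. It assumes that an unsynchronized int counter and an AtomicInteger are incremented side by side, so the atomic counter acts as the reference value.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names are illustrative only.
class CounterRaceDemo {
    static int plainCounter;                                  // unsynchronized shared state
    static final AtomicInteger atomicCounter = new AtomicInteger();

    /** Starts `threads` threads that each increment both counters `perThread` times. */
    static void run(int threads, int perThread) {
        plainCounter = 0;
        atomicCounter.set(0);
        CountDownLatch startGate = new CountDownLatch(1);     // releases all workers together
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    startGate.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < perThread; i++) {
                    plainCounter++;                    // read-modify-write, not atomic
                    atomicCounter.incrementAndGet();   // atomic increment
                }
            });
            workers[t].start();
        }
        startGate.countDown();
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        run(4, 100_000);
        System.out.println("expected = " + 4 * 100_000);
        System.out.println("atomic   = " + atomicCounter.get());
        System.out.println("plain    = " + plainCounter + " (may be lower because of lost updates)");
    }
}
```

The plain counter will not necessarily be wrong on every run, which is exactly why such tests have to be repeated.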

Liveness can be defined as "something good eventually happens": the program always reaches a result, and no thread ends up waiting forever because a method blocks, runs too slowly, or deadlocks.

Performance testing is related to liveness testing. It is mainly measured in three ways: throughput, responsiveness, and scalability.

Throughput: the rate at which a set of concurrent tasks is completed, that is, the number of tasks completed in a given period of time.

Responsiveness: the delay between when a request is made and when it completes (also called latency).

Scalability: how much throughput improves when more resources (CPU, I/O, memory) are added.
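One way to probe blocking behavior is to check that a blocked call can still be woken up. The sketch below is a hypothetical helper, not code from the article: it assumes that take() on an empty ArrayBlockingQueue should block, and that an interrupt should release it. The timeout value is an arbitrary assumption.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper; the class name and timeout are assumptions for illustration.
class TakeBlocksCheck {
    static final long LOCKUP_DETECT_TIMEOUT_MS = 500;

    /** Returns true if take() on an empty queue blocks and then responds to interruption. */
    static boolean takeBlocksWhenEmpty() {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);
        AtomicBoolean returnedNormally = new AtomicBoolean(false);
        Thread taker = new Thread(() -> {
            try {
                queue.take();                  // should block: the queue is empty
                returnedNormally.set(true);    // reaching this line means take() did not block
            } catch (InterruptedException expected) {
                // Expected path: the blocked take() was interrupted
            }
        });
        taker.setDaemon(true);                 // do not keep the JVM alive if the thread hangs
        try {
            taker.start();
            Thread.sleep(LOCKUP_DETECT_TIMEOUT_MS);   // give the thread time to block
            taker.interrupt();
            taker.join(LOCKUP_DETECT_TIMEOUT_MS);     // bounded wait, so the check itself cannot hang
            return !taker.isAlive() && !returnedNormally.get();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("take blocks and is interruptible: " + takeBlocksWhenEmpty());
    }
}
```

The bounded join matters: a liveness check that can itself wait forever would only move the problem into the test.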

Safety testing

As said before, safety testing checks that "nothing bad ever happens", which includes testing for errors caused by data races. To do this we need to find a concurrent "test point" in the code under test and build some additional checks around it. These checks should preferably need no synchronization of their own, so that they do not mask the timing problems they are meant to detect.

We illustrate with an example.

Take ArrayBlockingQueue. This class is a bounded blocking queue that can be used to implement the producer-consumer pattern. To test it for concurrency problems, the important methods to test are put and take. An effective approach is to check that the elements taken out of the queue are the same as the elements put into it. If there is a data safety problem, the values taken out will not match the values put in. For example, with multiple threads running, we compute a checksum of all enqueued elements and a checksum of all dequeued elements; if the two are equal, the test passes.

To cover as many cases as possible, the values put into the queue should be generated randomly, so that every run works on different data. In addition, to make the test fair, all threads should start operating at the same time. Otherwise the first thread may finish before the others start, and the operations would effectively run serially.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class PutTakeTest {
    protected static final ExecutorService pool = Executors.newCachedThreadPool();

    // Barrier: makes a group of threads wait until all have arrived, then releases them together
    protected CyclicBarrier barrier;
    protected final ArrayBlockingQueue<Integer> bb;
    protected final int nTrials, nPairs;
    // Checksum of all enqueued values
    protected final AtomicInteger putSum = new AtomicInteger(0);
    // Checksum of all dequeued values
    protected final AtomicInteger takeSum = new AtomicInteger(0);

    public static void main(String[] args) throws Exception {
        // Queue capacity 10, 10 producer-consumer pairs, 100000 operations per thread
        new PutTakeTest(10, 10, 100000).test();
        pool.shutdown();
    }

    /**
     * @param capacity capacity of the queue (buffer size)
     * @param npairs number of producer-consumer pairs
     * @param ntrials number of operations performed by each thread
     */
    public PutTakeTest(int capacity, int npairs, int ntrials) {
        this.bb = new ArrayBlockingQueue<Integer>(capacity);
        this.nTrials = ntrials;
        this.nPairs = npairs;
        // All producers and consumers, plus the main thread
        this.barrier = new CyclicBarrier(npairs * 2 + 1);
    }

    void test() {
        try {
            for (int i = 0; i < nPairs; i++) {
                pool.execute(new Producer());
                pool.execute(new Consumer());
            }
            barrier.await(); // wait for all threads to be ready
            barrier.await(); // wait for all threads to finish
            System.out.println("result, put==take: " + (putSum.get() == takeSum.get()));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Cheap pseudo-random number generator
    static int xorShift(int y) {
        y ^= (y << 6);
        y ^= (y >>> 21);
        y ^= (y << 7);
        return y;
    }

    // Producer
    class Producer implements Runnable {
        public void run() {
            try {
                // Per-thread seed, so each producer enqueues different values
                int seed = (this.hashCode() ^ (int) System.nanoTime());
                int sum = 0;
                barrier.await();
                for (int i = nTrials; i > 0; --i) {
                    bb.put(seed);
                    sum += seed;
                    seed = xorShift(seed);
                }
                putSum.getAndAdd(sum);
                barrier.await();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }

    // Consumer
    class Consumer implements Runnable {
        public void run() {
            try {
                barrier.await();
                int sum = 0;
                for (int i = nTrials; i > 0; --i) {
                    sum += bb.take();
                }
                takeSum.getAndAdd(sum);
                barrier.await();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }
}

In the program above, the putSum and takeSum variables accumulate the checksums of the put and take data. CyclicBarrier makes all threads start from the same point at the same time. So that each thread enqueues different data, every producer generates its own seed. The following loop is therefore a place where multiple threads are certain to contend:

    for (int i = nTrials; i > 0; --i) {
        bb.put(seed);
        sum += seed;
        seed = xorShift(seed);
    }

If there is a thread safety problem here, the data taken out will differ from the data put in, and the final values of putSum and takeSum will almost certainly differ. Conversely, equal checksums indicate that no elements were lost or duplicated.

Since most errors in concurrent code are low-probability events, the test should be repeated many times to increase the chance of finding them.
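Repeating the checksum test can be automated. The following is a minimal, self-contained sketch, not the article's PutTakeTest: the class RepeatedChecksumTest and its methods are hypothetical, it uses ThreadLocalRandom instead of xorShift for the random values, and it creates a fresh queue and thread pool for every round.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; names and structure are assumptions for illustration.
class RepeatedChecksumTest {

    /** Runs one round and returns true if the put and take checksums match. */
    static boolean runOnce(int capacity, int pairs, int trials) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        AtomicLong putSum = new AtomicLong();
        AtomicLong takeSum = new AtomicLong();
        CyclicBarrier barrier = new CyclicBarrier(pairs * 2 + 1);   // workers plus this thread
        ExecutorService pool = Executors.newFixedThreadPool(pairs * 2);
        try {
            for (int p = 0; p < pairs; p++) {
                pool.execute(() -> {                       // producer
                    try {
                        long sum = 0;
                        barrier.await();                   // start together
                        for (int i = 0; i < trials; i++) {
                            int value = ThreadLocalRandom.current().nextInt();
                            queue.put(value);
                            sum += value;
                        }
                        putSum.addAndGet(sum);
                        barrier.await();                   // signal completion
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                });
                pool.execute(() -> {                       // consumer
                    try {
                        long sum = 0;
                        barrier.await();
                        for (int i = 0; i < trials; i++) {
                            sum += queue.take();
                        }
                        takeSum.addAndGet(sum);
                        barrier.await();
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                });
            }
            barrier.await();   // wait until all workers are ready
            barrier.await();   // wait until all workers have finished
            return putSum.get() == takeSum.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    /** Repeats the round `rounds` times and returns how many rounds failed. */
    static int countFailures(int rounds, int capacity, int pairs, int trials) {
        int failures = 0;
        for (int r = 0; r < rounds; r++) {
            if (!runOnce(capacity, pairs, trials)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failed rounds: " + countFailures(20, 10, 4, 10_000));
    }
}
```

Counting failures rather than stopping at the first one gives a rough idea of how often a defect shows up, which helps when deciding how many rounds to run.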

Performance Testing

Performance testing is usually an extension of functional testing. While there will be overlap between performance testing and functional testing, their goals are different.

First, a performance test should reflect how the tested object is actually used in the application, and measure its throughput under that usage. In addition, limits such as the number of threads and the level of concurrency need to be tuned against measured values, so that the program runs well on the target system.

We extend the PutTakeTest above with the following features:

1. Record the running time of each test run, measured as accurately as possible.

The time is maintained by BarrierTimer, which implements Runnable. Each time the number of waiting threads reaches the count specified for the barrier (CyclicBarrier), the barrier calls it back once: the first call records the start time and the second records the end time.

We use it to record how long a single test run takes. Dividing the total time by the number of operations gives the time for a single operation, from which the throughput of the run can be calculated.

Throughput = 1 s / time for a single operation = number of operations that can be executed per second.

Below is a barrier-based timer.

public class BarrierTimer implements Runnable {
    private boolean started;
    private long startTime, endTime;

    // Called by the barrier each time it trips: first call = start, second call = end
    public synchronized void run() {
        long t = System.nanoTime();
        if (!started) {
            started = true;
            startTime = t;
        } else {
            endTime = t;
        }
    }

    // Resets the timer before a new run
    public synchronized void clear() {
        started = false;
    }

    // Elapsed time of the last run, in nanoseconds
    public synchronized long getTime() {
        return endTime - startTime;
    }
}
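The throughput arithmetic can be sketched as follows, assuming the elapsed time is in nanoseconds, as System.nanoTime() reports it. The class ThroughputMath and its method names are hypothetical and only serve to show the conversion.

```java
// Hypothetical helper; names are illustrative only.
class ThroughputMath {

    /** Average time for one operation, in nanoseconds. */
    static double nanosPerOperation(long totalNanos, long operations) {
        return (double) totalNanos / operations;
    }

    /** Throughput in operations per second: 1 s (1e9 ns) divided by the time per operation. */
    static double opsPerSecond(long totalNanos, long operations) {
        return 1_000_000_000.0 / nanosPerOperation(totalNanos, operations);
    }

    public static void main(String[] args) {
        long totalNanos = 2_000_000_000L;   // a run that took 2 seconds
        long operations = 20_000_000L;      // and completed 20 million operations
        System.out.println(nanosPerOperation(totalNanos, operations) + " ns/op");
        System.out.println(opsPerSecond(totalNanos, operations) + " ops/s");
    }
}
```

In this made-up example each operation takes 100 ns on average, which corresponds to 10 million operations per second.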

2. Run the performance test with different parameter combinations.

Running the test with different parameter combinations gives the throughput for each combination and shows how it scales with the number of threads. PutTakeTest itself only addressed safety.

Let's look at the enhanced version, TimedPutTakeTest. Here we set the capacity of ArrayBlockingQueue to 1, 10, 100, and 1000 in turn, and for each capacity measure the queue's throughput with the thread count (the number of producer-consumer pairs in the code) set to 1, 2, 4, 8, 16, 32, 64, and 128.

import java.util.concurrent.CyclicBarrier;

public class TimedPutTakeTest extends PutTakeTest {
    private BarrierTimer timer = new BarrierTimer();

    public TimedPutTakeTest(int cap, int pairs, int trials) {
        super(cap, pairs, trials);
        // Same barrier as the parent class, but with the timer as the barrier action
        barrier = new CyclicBarrier(nPairs * 2 + 1, timer);
    }

    public void test() {
        try {
            timer.clear();
            for (int i = 0; i < nPairs; i++) {
                pool.execute(new PutTakeTest.Producer());
                pool.execute(new PutTakeTest.Consumer());
            }
            barrier.await(); // all threads ready: the timer records the start time
            barrier.await(); // all threads finished: the timer records the end time
            // Average time per item, in nanoseconds
            long nsPerItem = timer.getTime() / (nPairs * (long) nTrials);
            System.out.print("Throughput: " + nsPerItem + " ns/item");
            System.out.println(" result: " + (putSum.get() == takeSum.get()));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        int tpt = 100000; // trials per thread
        for (int cap = 1; cap <= 1000; cap *= 10) {
            System.out.println("Capacity: " + cap);
            for (int pairs = 1; pairs <= 128; pairs *= 2) {
                TimedPutTakeTest t = new TimedPutTakeTest(cap, pairs, tpt);
                System.out.print("Pairs: " + pairs + "\t");
                t.test();
                System.out.print("\t");
                Thread.sleep(1000);
                t.test(); // run each combination twice
                System.out.println();
                Thread.sleep(1000);
            }
        }
        PutTakeTest.pool.shutdown();
    }
}

The following are our performance test results for ArrayBlockingQueue. My hardware environment is:

CPU: i7, 4 cores, 8 threads

Memory: 16 GB

Hard disk: 110 GB SSD

JDK environment:

java version "1.8.0_45"

Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)

The following points can be seen from the results:

1. When the capacity of ArrayBlockingQueue is 1, throughput does not improve significantly no matter how many threads run concurrently. This is because each thread spends its time blocked, waiting for another thread to perform its operation.

2. When the capacity is increased to 10, 100, and 1000, throughput improves greatly. At 1000 in particular, it peaks at about 9 million operations per second.

3. Throughput peaks when the number of threads reaches 16. Beyond that, adding threads makes throughput fall rather than rise, although not by much. This is because with more threads, most of the time is spent blocking and unblocking.

Comparison of other blocking queues

The following is a side-by-side comparison of several blocking queues: ArrayBlockingQueue, LinkedBlockingQueue, LinkedBlockingDeque, and PriorityBlockingQueue. The hardware environment is the same as above, and the JDK is still 1.8.

The buffer capacity of each queue is 1000. Throughput is then measured with 1, 2, 4, 8, 16, 32, 64, and 128 concurrent threads.
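A comparison like this needs a harness that accepts any BlockingQueue implementation. The sketch below is a simplified assumption of how that could look, not the code used for the measurements: the class QueueComparison is hypothetical, it times only one producer and one consumer, and it does no warm-up. Note that PriorityBlockingQueue is unbounded, so its constructor argument is only an initial capacity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.function.Supplier;

// Hypothetical sketch; names are illustrative only.
class QueueComparison {

    /** Times one producer and one consumer exchanging `items` elements; returns ns per item. */
    static long nsPerItem(BlockingQueue<Integer> queue, int items) {
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < items; i++) {
                    queue.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.setDaemon(true);
        long start = System.nanoTime();
        producer.start();
        try {
            for (int i = 0; i < items; i++) {
                queue.take();              // the calling thread acts as the consumer
            }
            producer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return (System.nanoTime() - start) / items;
    }

    /** The queue implementations under comparison, each created with the given capacity. */
    static Map<String, Supplier<BlockingQueue<Integer>>> candidates(int capacity) {
        Map<String, Supplier<BlockingQueue<Integer>>> map = new LinkedHashMap<>();
        map.put("ArrayBlockingQueue", () -> new ArrayBlockingQueue<>(capacity));
        map.put("LinkedBlockingQueue", () -> new LinkedBlockingQueue<>(capacity));
        map.put("LinkedBlockingDeque", () -> new LinkedBlockingDeque<>(capacity));
        // Unbounded queue: the argument is an initial capacity, not a limit
        map.put("PriorityBlockingQueue", () -> new PriorityBlockingQueue<>(capacity));
        return map;
    }

    public static void main(String[] args) {
        candidates(1000).forEach((name, factory) ->
                System.out.println(name + ": " + nsPerItem(factory.get(), 100_000) + " ns/item"));
    }
}
```

Using a Supplier gives every measurement a fresh queue, so one run cannot leave elements behind for the next.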

From the above data, we can see:

1. On JDK 1.8, ArrayBlockingQueue performs better than LinkedBlockingQueue, although the difference between the two is not large. Before JDK 1.6 it was the other way round: LinkedBlockingQueue performed better than ArrayBlockingQueue.

2. After PriorityBlockingQueue reaches its throughput peak of about 2.9 million operations per second, its performance declines steadily. This is because a priority queue has to keep its elements ordered by priority, and that reordering takes time.

The main purpose of the tests above is to measure the constraints that limit overall throughput when producers and consumers hand data over through a bounded queue's put and take, so many real-world factors are ignored. In addition, because of JIT dynamic compilation, frequently executed bytecode is compiled into machine code while the program runs. The tests therefore need to be warmed up and run enough times for the code to be compiled before the running time is measured.
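A warm-up phase can be sketched as follows. This is a hypothetical helper, not part of the original tests: WarmupTimer runs the task a number of times without measuring, then times several more runs and reports the median. Taking the median is an assumption made here to reduce the effect of outliers. A dedicated benchmark harness such as JMH handles warm-up more rigorously.

```java
import java.util.Arrays;

// Hypothetical helper; the name and the choice of the median are assumptions.
class WarmupTimer {

    /**
     * Runs `task` `warmupRuns` times untimed so the JIT can compile the hot code,
     * then returns the median duration in nanoseconds of `measuredRuns` timed runs.
     */
    static long medianNanos(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run();                               // results are discarded
        }
        long[] samples = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[measuredRuns / 2];
    }

    public static void main(String[] args) {
        // Placeholder workload; in practice this would be one run of the timed test
        Runnable workload = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                System.out.println(sum);              // keeps the loop from being optimized away entirely
            }
        };
        System.out.println("median: " + medianNanos(workload, 10, 11) + " ns");
    }
}
```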

Summary

Testing the correctness of concurrent programs can be particularly difficult, because many failures of concurrent programs are low-probability events that are sensitive to execution timing, load, and other conditions that are hard to reproduce. To find as many of these errors as possible, we need to put more work into analysis and testing. I hope today's introduction gives you some ideas.

Reference:

The test code above is taken from "Java Concurrency in Practice".

java multithreading - concurrent testing