Java multi-threaded programming (3) - thread safety

I. Thread safety

  In general, if a class functions properly in a single-threaded environment, and still functions properly in a multithreaded environment without its users having to make any changes, then we call it thread-safe. Conversely, if a class functions properly in a single-threaded environment but cannot function properly in a multithreaded environment, then the class is not thread-safe. Thus, if using a class can lead to race conditions, the class is not thread-safe; and if a class is thread-safe, it will not give rise to races. The book "Java Concurrency in Practice" gives the following definition of thread safety:

A class is thread-safe if it behaves correctly when accessed from multiple threads, regardless of the scheduling or interleaving of the execution of those threads by the runtime environment, and with no additional synchronization or other coordination on the part of the calling code.

  When we use a class, we have to figure out whether it is thread-safe, because that determines how we use the class correctly. Java standard library classes such as ArrayList, HashMap and SimpleDateFormat are not thread-safe; using them directly in a multithreaded environment may lead to unexpected, even disastrous, results. In general, the API documentation of a Java standard library class indicates whether it is thread-safe (if the documentation does not say, the class may or may not be thread-safe).
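As a concrete illustration, the sketch below shows one common workaround for a non-thread-safe class: giving each thread its own SimpleDateFormat instance via ThreadLocal. The class name DateFormatPerThread, the date pattern and the fixed UTC time zone are our own choices for illustration, not from the original text:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class DateFormatPerThread {
    // SimpleDateFormat is not thread-safe, so sharing one instance across
    // threads can corrupt its internal state. ThreadLocal gives every
    // thread its own private instance instead.
    private static final ThreadLocal<SimpleDateFormat> FMT =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd");
            f.setTimeZone(TimeZone.getTimeZone("UTC"));
            return f;
        });

    public static String format(Date d) {
        return FMT.get().format(d);
    }

    public static void main(String[] args) {
        // Epoch instant 0 formats as 1970-01-01 in UTC.
        System.out.println(format(new Date(0L))); // prints 1970-01-01
    }
}
```

Each thread that calls format obtains its own formatter, so no synchronization is needed even though SimpleDateFormat itself is not thread-safe.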
  From the definition of thread safety we can see that a thread-safe class that works correctly in a multithreaded environment will also work correctly in a single-threaded environment. That being the case, why not simply make all classes thread-safe? Whether to make a class thread-safe is, to some extent, a trade-off or a design decision. On the one hand, whether a class needs to be thread-safe depends on how the class is expected to be used; for example, if an instance of the class is always confined to a single thread, there is no need to make the class thread-safe. On the other hand, making a class thread-safe often incurs additional cost.
  If a class is not thread-safe, we say that using it directly in a multithreaded environment raises thread-safety issues. Thread-safety issues manifest themselves in three aspects: atomicity, visibility and orderliness.

II. Atomicity

  Literally, an atom is indivisible. For an operation that accesses a shared variable, if the operation appears indivisible from the point of view of any thread other than the executing thread, then the operation is an atomic operation, and we say the operation has atomicity. "Indivisible" means that, from the point of view of any thread other than the executing thread, the operation either has already completed or has not yet happened; other threads never "see" the intermediate effects of a partially executed operation.
  An everyday example of an atomic operation is withdrawing cash from an ATM. From the perspective of the ATM software, a withdrawal involves a series of operations: deducting the account balance, dispensing the money, and adding a new transaction record. But from the user's perspective it is a single operation: either it succeeds, we get the cash and the account balance is deducted, so the operation has taken place; or it fails, we get no cash, and it is as if the operation never happened (the balance is not deducted). Unless the ATM software is defective, we never encounter a partial result in which no cash comes out of the dispenser yet our balance has been deducted.
  In general, there are two ways to achieve atomicity in Java. One is to use locks (Lock). A lock is exclusive: it guarantees that a shared variable can be accessed by only one thread at any given time. This rules out the interference and conflicts that arise when multiple threads access the same shared variable at the same time, i.e. it eliminates races. The other is to use the CAS (Compare-and-Swap) instruction provided by the processor. The way CAS achieves atomicity is essentially the same as the way a lock does; the difference is that locks are usually implemented at the software level, whereas CAS is implemented directly at the hardware level (processor and memory), so it can be seen as a kind of "hardware lock".
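To make the CAS approach concrete, here is a minimal sketch of a lock-free counter built on java.util.concurrent.atomic.AtomicInteger, whose compareAndSet method is backed by the processor's CAS instruction. The class name CasCounter and the retry-loop structure are our own illustrative choices:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter {
    private final AtomicInteger count = new AtomicInteger(0);

    // Lock-free increment: read the current value, then try to CAS it to
    // value + 1; if another thread changed it in between, retry.
    public int increment() {
        int old;
        do {
            old = count.get();
        } while (!count.compareAndSet(old, old + 1));
        return old + 1;
    }

    public int get() {
        return count.get();
    }

    public static void main(String[] args) throws InterruptedException {
        CasCounter c = new CasCounter();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) c.increment();
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(c.get()); // prints 4000: no increment is lost
    }
}
```

Because every increment either succeeds atomically or retries, the four threads never lose an update, which a plain `count++` on an int field could not guarantee.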
  In the Java language, a write to a variable of any type other than long and double is atomic; that is, writes to reference variables and to variables of the primitive types (except long and double) are atomic. This is stipulated by the Java Language Specification and implemented by the Java virtual machine. On a 32-bit Java virtual machine, a read or write of a long/double variable may be split into two sub-steps (for example, first writing the low 32 bits, then the high 32 bits), which means that other threads may observe the intermediate result of one thread's write to a long/double variable; in other words, accesses to long/double variables are not atomic in this case. Nevertheless, the Java Language Specification specifically provides that writes to long/double variables declared volatile are atomic. Therefore, we only need to declare with the volatile keyword (introduced further in the next article) any long/double variable that may be accessed by multiple threads in order to guarantee that writes to the variable are atomic.
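As a minimal sketch of this rule, a long field that may be accessed by multiple threads can simply be declared volatile; on a 32-bit JVM this guarantees that its 64-bit writes are never observed as two separate halves. The class and field names here are illustrative:

```java
public class VolatileLongDemo {
    // Without volatile, a write to this 64-bit field could be split into
    // two 32-bit writes on a 32-bit JVM, letting another thread observe
    // a "torn" value. volatile makes the write atomic (and also visible).
    private volatile long value = 0L;

    public void set(long v) { value = v; }
    public long get() { return value; }

    public static void main(String[] args) {
        VolatileLongDemo d = new VolatileLongDemo();
        d.set(0xFFFFFFFF00000000L); // high 32 bits set, low 32 bits clear
        System.out.println(Long.toHexString(d.get())); // prints ffffffff00000000
    }
}
```

A reader thread will see either the old value or the new value in full, never a mix such as 0xFFFFFFFFFFFFFFFF or 0x0000000000000000 produced by observing only one half of the write.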

III. Visibility

  In a multithreaded environment, after one thread updates a shared variable, threads that subsequently access the variable may not be able to read the updated value immediately, and may even never read the updated value. This is another manifestation of thread-safety issues: visibility.
  Consider the following example of a visibility problem:

// Code 2-2
public class VisibilityDemo {
    public static void main(String[] args) {
        UselessThread uselessThread = new UselessThread();
        uselessThread.start();
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        uselessThread.cancel();
    }
}

class UselessThread extends Thread {
    private boolean cancelled = false;

    @Override
    public void run() {
        System.out.println("Task has been started.");
        while (!cancelled) {}
        System.out.println("Task has been cancelled.");
    }

    public void cancel() {
        cancelled = true;
    }
}

  In the above program, the main thread starts the uselessThread thread, at which point that thread outputs "Task has been started."; half a second later, the main thread calls uselessThread's cancel method, i.e. sets uselessThread's cancelled variable to true. In theory, the while loop in uselessThread's run method should then end, and after outputting "Task has been cancelled." the thread should terminate. However, when we run the program, we see the following output:

Task has been started.

  We find that the program never outputs "Task has been cancelled." and keeps running (if this phenomenon does not appear on your machine, add the -server option after the java command). There is only one explanation for this: the while loop in the run method has become an infinite loop. In other words, the value of cancelled read by the uselessThread thread is always false, even though the main thread has already updated the variable to true. A visibility problem has occurred here: the main thread's update to the shared variable cancelled is not visible to the uselessThread thread.
  The visibility problem in the above example arises because the code gives the JIT compiler no hint that the state variable cancelled may be accessed by more than one thread. To avoid repeatedly reading cancelled and thereby improve the efficiency of the code, the JIT compiler may optimize the while loop in the run method into machine code equivalent to the following:

if (!cancelled) {
    while (true) {}
}

  Unfortunately, in this case the optimization results in an infinite loop, which is why the program keeps running and never exits.
  On the other hand, visibility problems are also related to the storage system of the computer. A variable in a program may be allocated to a register rather than to main memory for storage. Each processor has its own registers, and one processor cannot read the contents of another processor's registers. Thus, if two threads running on different processors share a variable that is allocated to a register, a visibility problem arises. Furthermore, even if a shared variable is allocated to main memory, visibility is still not guaranteed, because a processor does not access main memory directly but through its cache subsystem. If the contents of the cache subsystem are not updated, the value the processor reads may still be a stale one, which can also cause visibility problems.

A processor does not deal with main memory directly when performing memory read and write operations; instead, it performs them through registers, caches, the write buffer and the invalidation queue. From this perspective, these components hold copies of the contents of main memory, so for convenience we collectively refer to them as the processor's cache of main memory, or the processor cache for short.

  Although one processor cannot directly read the contents of another processor's cache, a processor can, through the cache coherence protocol, read data from other processors' caches and update the data it reads into its own cache. This process, in which a processor reads data from storage other than its own cache and updates that data into its own cache, is called cache synchronization; the storage involved includes other processors' caches and main memory. Cache synchronization allows a thread running on one processor to read updates made to a shared variable by a thread running on another processor, which safeguards visibility. Therefore, to guarantee visibility, we must make sure that a processor's update to a shared variable is eventually written to that processor's cache or to main memory (rather than staying forever in its write buffer); this process is called flushing the processor cache. Moreover, when a processor reads a shared variable that another processor has since updated, it must synchronize the variable from the other processor's cache or from main memory into its own cache; this process is called refreshing the processor cache. In short, visibility is guaranteed by having the processor that updates a shared variable flush its cache, and the processor that reads the shared variable refresh its cache.
  So how do we guarantee visibility on the Java platform? In fact, the volatile keyword can be used to ensure visibility. For the code shown in Code 2-2, we only need to add the volatile keyword to the declaration of the instance variable cancelled:

private volatile boolean cancelled = false;

  Here, one role played by the volatile keyword is to hint to the JIT compiler that the modified variable may be shared by multiple threads, preventing the JIT compiler from making optimizations that could cause the program to run incorrectly. Its other role is that reading a volatile variable causes the processor to refresh the processor cache, and writing a volatile variable causes the processor to flush the processor cache, thereby safeguarding visibility.
  For a given shared variable, after one thread updates its value, any value that other threads can then read that reflects this update is called a relatively new value of the variable. If no other thread updates the variable between the moment the reading thread reads it and the moment it uses it, then the value the thread reads is called the latest value of the variable. Guaranteeing visibility only means that a thread can read a relatively new value of a shared variable; it does not guarantee that the thread reads the latest value of the variable.
  Regarding visibility, the Java Language Specification defines two guarantees related to thread start and termination:

  1. Updates made by a parent thread to shared variables before it starts a child thread are visible to the child thread;
  2. Updates made by a thread to shared variables before it terminates are visible, after its join method returns, to the thread that called join.
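The two guarantees can be sketched in a few lines. This toy example (our own, not from the original text) relies only on start() and join() for visibility, with no volatile and no locks:

```java
public class StartJoinVisibility {
    static int data = 0; // intentionally neither volatile nor locked

    public static void main(String[] args) throws InterruptedException {
        data = 42; // guarantee 1: written before start(), visible to the child
        Thread child = new Thread(() -> data = data + 1);
        child.start();
        child.join(); // guarantee 2: after join() returns, the child's update is visible
        System.out.println(data); // prints 43
    }
}
```

Without the start()/join() guarantees, neither the child's read of 42 nor the parent's read of 43 would be assured.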

IV. Orderliness

  Orderliness concerns the phenomenon that, under certain circumstances, the memory access operations executed by a thread running on one processor appear out of order to threads running on another processor. "Out of order" means that the order of the memory access operations appears to have changed. Before going further into this concept, we need to introduce the notion of reordering.

Reordering concept

  Sequential structure is a basic structure in programming: it expresses that one operation must be performed before another. Moreover, even when two operations could be executed in either order, the code that expresses them always has a textual order. However, in a multi-core processor environment, this order of execution is not guaranteed: the compiler may change the order of the two operations; the processor may not execute instructions exactly in the order specified by the object code; in addition, an operation executed on one processor may, from the point of view of other processors, appear in an order inconsistent with the order specified by the object code. This phenomenon is called reordering.
  Reordering is an optimization of memory access operations (reads and writes) that can improve the performance of a program without affecting the correctness of single-threaded programs. However, it can affect the correctness of multithreaded programs, i.e. it can lead to thread-safety issues. Like visibility problems, reordering does not necessarily occur.
  There are many potential sources of reordering, including the compiler (on the Java platform this basically means the JIT compiler), the processor, and the storage subsystem (including the write buffer and the cache). To facilitate the following explanation, we first define some terms related to the order of memory operations:

  • Source order: the order of memory access operations specified in the source code.
  • Program order: the order of memory access operations specified by the object code running on a given processor.
  • Execution order: the order in which memory access operations are actually performed on a given processor.
  • Perceived order: the order in which a given processor perceives the memory access operations of that processor and of other processors.

  On this basis, we divide reordering into two kinds, instruction reordering and storage subsystem reordering, as shown in the following table:

| Type | Manifestation | Source |
| --- | --- | --- |
| Instruction reordering | program order differs from source order, or execution order differs from program order | compiler (JIT), processor |
| Storage subsystem reordering | perceived order differs from execution order | cache, write buffer |

Instruction reordering

  When the program order is inconsistent with the source order, or the execution order is inconsistent with the program order, we say that instruction reordering has occurred. Instruction reordering is a phenomenon in which the order of instructions has genuinely been adjusted; the object of this reordering is the instructions themselves.

The Java platform includes two compilers: a static compiler (javac) and a dynamic compiler (the JIT compiler). The former compiles Java source code (.java text files) into bytecode (.class binary files) and is involved at the build stage; the latter dynamically compiles bytecode into native code (machine code) for the Java virtual machine's host machine and is involved while the Java program runs.

  Consider the following program:

// Code 2-3
public class PossibleReordering {
    private static int a;
    private static int b;
    private static int x;
    private static int y;

    public static void main(String[] args) {
        Thread threadA = new Thread(() -> {
            a = 1;
            x = b;
        });
        Thread threadB = new Thread(() -> {
            b = 1;
            y = a;
        });
        threadA.start();
        threadB.start();
        try {
            threadA.join();
            threadB.join();
            System.out.printf("(%d,%d)", x, y);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

  Thread A may finish executing before thread B starts, thread B may finish before thread A starts, and the two may also execute alternately; therefore, what the program eventually outputs is uncertain. However, according to our intuition, the operations in each thread should be executed in the order given by the code; that is, a = 1 should be executed before x = b, and b = 1 before y = a. We can enumerate the interleavings of these few simple operations to analyze the possible final outputs:

| Operation 1 | Operation 2 | Operation 3 | Operation 4 | Result |
| --- | --- | --- | --- | --- |
| a=1 | x=b | b=1 | y=a | (0,1) |
| a=1 | b=1 | x=b | y=a | (1,1) |
| a=1 | b=1 | y=a | x=b | (1,1) |
| b=1 | y=a | a=1 | x=b | (1,0) |
| b=1 | a=1 | y=a | x=b | (1,1) |
| b=1 | a=1 | x=b | y=a | (1,1) |

  It can be seen that, in the absence of proper synchronization, outputs of (1,0), (0,1) and (1,1) are all possible. Strangely, though, the program may also output (0,0), a result that belongs to none of the cases analyzed above. Since there is no data dependency between the two operations within each thread, instruction reordering may occur, and these operations may thus be executed out of order. For example, if reordering causes x = b to execute before a = 1 in thread A, and thread B then runs b = 1 and y = a in full before a = 1 executes, the output is (0,0).

  So reordering may lead to thread-safety issues. Of course, this does not mean that reordering itself is wrong; rather, our program is at fault: it does not use, or does not correctly use, a thread synchronization mechanism. Also, reordering does not necessarily occur: in our runs of the above program, (0,0) appeared only once in roughly 50,000 executions. Nevertheless, we cannot ignore the potential risk posed by reordering.
  In other compiled languages (e.g. C++), the compiler may cause instruction reordering. On the Java platform, the static compiler (javac) basically does not perform instruction reordering, whereas the JIT compiler may.
  The processor may also perform instruction reordering, which makes the execution order inconsistent with the program order. Out-of-order instruction execution by the processor is likewise a form of reordering: when conditions allow, the processor immediately executes a later instruction whose operands are ready, instead of waiting for the data required by an earlier instruction to be fetched. Out-of-order execution can greatly improve the efficiency of the processor. Instruction reordering by the processor does not affect the correctness of single-threaded programs, but it can lead to unexpected results in multithreaded programs.

Storage subsystem reordering

  Main memory (RAM) is a slow device relative to the processor. To avoid being slowed down by it, the processor does not access main memory directly but through its cache. On top of this, modern processors also introduce a write buffer to improve the efficiency of write operations on the cache; some processors (such as Intel's x86 processors) perform all writes to main memory through the write buffer. Here, we refer to the write buffer and the cache collectively as the storage subsystem (which is actually part of the processor).
  Even when a processor executes two memory access operations strictly in program order, under the effect of the storage subsystem the order in which other processors perceive these two operations may still be inconsistent with the program order; that is, the two operations look as if their execution order had changed. This phenomenon is storage subsystem reordering, also called memory reordering.
  The object of instruction reordering is instructions: the order of the instructions is genuinely adjusted. Storage subsystem reordering, by contrast, is a phenomenon rather than an action: it does not genuinely adjust the order of instruction execution, but merely creates the impression that the order was adjusted; the object of this reordering is the result of memory operations.
  From the processor's perspective, a memory read operation essentially loads the data at a specified RAM address (via the cache) into a register, so a memory read operation is generally called a Load; a memory write operation essentially stores data into the memory unit at a specified RAM address, so a memory write operation is generally called a Store. Memory reordering therefore has only the following four possibilities: LoadLoad reordering, StoreStore reordering, LoadStore reordering (a load reordered with a later store) and StoreLoad reordering (a store reordered with a later load).

  Memory reordering may cause thread-safety issues. Suppose two threads running on Processor 0 and Processor 1 each execute their code in some interleaved order, where data and ready are shared variables with initial values 0 and false. The processing logic of the thread on Processor 0 is to update data to 1 (call this S1) and then set the flag ready to true (S2). The processing logic of the thread on Processor 1 is to wait until the value of ready becomes true and then print the value of data.

  Suppose Processor 0 executes S1 and S2 in program order; the results of S1 and S2 are then written to the write buffer in that order. However, on some processors, to improve efficiency, the write buffer does not guarantee that its contents are written to the cache in first-in-first-out order; that is, the result of a write operation that enters the write buffer later may be written to the cache earlier. Thus the result of S2 may be written to the cache before the result of S1, i.e. S1 and S2 appear to have been reordered (memory reordering). As a result, when the thread on Processor 1 reads ready as true, the result of S1 may still be stuck in Processor 0's write buffer, and since one processor cannot read the contents of another processor's write buffer, the value of data read by the thread on Processor 1 is still 0. Clearly, memory reordering here prevents the thread on Processor 1 from achieving its intended goal, i.e. it leads to a thread-safety issue.
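On the Java platform, the data/ready scenario above can be repaired by declaring the flag volatile, which (among other things) forbids the apparent reordering of S1 and S2. A minimal sketch (the class name ReadyFlagDemo is ours; the single-process, two-thread form stands in for the two-processor description above):

```java
public class ReadyFlagDemo {
    static int data = 0;
    static volatile boolean ready = false; // volatile: S2 cannot appear before S1

    public static void main(String[] args) throws InterruptedException {
        // Plays the role of the thread on Processor 1.
        Thread reader = new Thread(() -> {
            while (!ready) { }        // wait until the flag is set
            System.out.println(data); // guaranteed to print 1, never 0
        });
        reader.start();
        // Plays the role of the thread on Processor 0.
        data = 1;     // S1
        ready = true; // S2: volatile write; S1's result is visible once S2 is seen
        reader.join();
    }
}
```

The volatile write to ready establishes a happens-before edge: once the reader sees ready as true, it is guaranteed to also see data as 1.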

Ensuring memory access order

  So how do we avoid the thread-safety problems caused by reordering? Note that we cannot simply disable reordering entirely and force the processor to execute instructions exactly in source order; the performance cost would be too high. But we can selectively prohibit reordering, so that reordering either does not occur or, even if it does occur, does not affect the correctness of the multithreaded program's logic.
  At the bottom level, prohibiting reordering is achieved by invoking the corresponding instructions provided by the processor (memory barriers). Of course, Java, as a cross-platform language, handles such instructions for us; we only need to use the mechanisms provided by the language itself. Both the volatile keyword mentioned earlier and the synchronized keyword can achieve orderliness. We will look more deeply into the volatile and synchronized keywords and how they relate to reordering in subsequent articles.
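As a small illustration of synchronized providing atomicity together with ordering and visibility, here is a sketch of a counter protected by a monitor (the class name SynchronizedCounter is our own):

```java
public class SynchronizedCounter {
    private int count = 0;
    private final Object lock = new Object();

    // synchronized makes the read-modify-write atomic, and the monitor's
    // release/acquire pair also provides the visibility and ordering
    // guarantees: everything done before releasing the lock is visible
    // to the next thread that acquires it.
    public void increment() {
        synchronized (lock) { count++; }
    }

    public int get() {
        synchronized (lock) { return count; }
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedCounter c = new SynchronizedCounter();
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) c.increment(); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 1000; i++) c.increment(); });
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(c.get()); // prints 2000: no update is lost or reordered away
    }
}
```

Unlike volatile, which only addresses visibility and ordering, the monitor also makes the compound `count++` operation atomic.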

Origin www.cnblogs.com/maconn/p/11489606.html