Do you really understand the volatile keyword?

1. Java Memory Model

To understand why volatile ensures visibility, we must first understand what the memory model in Java looks like.

The Java memory model dictates that all variables are stored in main memory. Each thread also has its own working memory, and a thread's working memory holds copies of the variables the thread uses (these copies are made from main memory). All of a thread's operations on variables (reads and assignments) must take place in its working memory. Different threads cannot directly access variables in each other's working memory; variable values are transferred between threads through main memory.

Based on this memory model, multi-threaded programming runs into problems such as "dirty reads" of data.

For a simple example, consider executing the following statement in Java:

i++;

The executing thread must first read the value of i into its own working memory, perform the increment there on the cache line holding i, and then write the result back to main memory; it does not operate on main memory directly.

Now suppose two threads execute this statement at the same time. If the initial value of i is 10, we would expect i to be 12 after both threads finish. But will that be the case?

The following may happen: initially, both threads read the value of i and store it in their respective working memory. Thread 1 then adds 1 and writes the new value of i, 11, back to main memory. At this point, the value of i in thread 2's working memory is still 10, so after adding 1 its value of i is also 11, and thread 2 then writes that value back to main memory.

The final value of i is 11, not 12. This is the well-known cache coherence problem. Variables that are accessed by multiple threads like this are usually called shared variables.
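
To make this lost-update scenario concrete, here is a minimal, hypothetical sketch (class and field names are my own, not from the original article). The final value can be 11 instead of 12 when both threads happen to read the stale value 10:

public class LostUpdateDemo {
    // shared variable: not volatile, not synchronized
    private static int i = 10;

    public static void main(String[] args) throws InterruptedException {
        Runnable increment = () -> {
            // read-modify-write: each thread may work on a stale copy of i
            i = i + 1;
        };
        Thread t1 = new Thread(increment);
        Thread t2 = new Thread(increment);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // Expected 12, but 11 is possible if both threads read i == 10
        System.out.println(i);
    }
}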

So how to ensure that shared variables can be correctly output when accessed by multiple threads?

Before solving this problem, we must first understand three core concepts of concurrent programming: atomicity, visibility, and ordering.

2. Atomicity

1. Definition

Atomicity: one operation, or a group of operations, either executes completely without being interrupted by anything partway through, or does not execute at all.

2. Examples

A very classic example is the bank account transfer problem:

For example, transferring 1,000 yuan from account A to account B must include two operations: subtracting 1,000 yuan from account A, and adding 1,000 yuan to account B.

Just imagine the consequences if these two operations were not atomic. Suppose that after deducting 1,000 yuan from account A, the operation is suddenly terminated: account A has lost 1,000 yuan, but account B never receives the transferred 1,000 yuan.

Therefore, these two operations must be atomic to ensure that there are no unexpected problems.

What happens when the same is reflected in concurrent programming?

As the simplest example, think about what would happen if assignment to a 32-bit variable were not atomic:

i = 9;

Suppose a thread executes this statement, and assume for the moment that assigning to a 32-bit variable takes two steps: writing the lower 16 bits and writing the upper 16 bits.

Then the following could happen: after the lower 16 bits have been written, the thread is suddenly interrupted, and at that moment another thread reads the value of i; it then reads corrupt data.

3. Atomicity in Java

In Java, reads and assignments of variables of primitive data types are atomic operations; that is, these operations cannot be interrupted, they either complete or do not happen at all. (Strictly speaking, the Java memory model allows writes to non-volatile long and double variables to be split into two 32-bit writes, but this is the only exception.)

Although the above sentence seems simple, it is not so easy to understand. Consider the following example:

Please analyze which of the following operations are atomic:

x = 10;        // statement 1
y = x;         // statement 2
x++;           // statement 3
x = x + 1;     // statement 4

At first glance, it may be said that the operations in the above 4 statements are all atomic operations. In fact, only statement 1 is an atomic operation, and the other three statements are not atomic operations.

Statement 1 directly assigns the value 10 to x, which means that the thread executing this statement will directly write the value 10 to the working memory.

Statement 2 actually contains two operations: it first reads the value of x, and then writes that value into y in working memory. Reading the value of x and writing a value into working memory are each atomic, but taken together they are not atomic.

Similarly, x++ and x = x+1 include 3 operations: read the value of x, add 1, and write the new value.

Therefore, in the above four statements, only the operation of statement 1 is atomic.

That is, only simple reads and assignments (and the assignment must assign a literal number to a variable; assigning one variable to another is not atomic) are atomic operations.

As can be seen from the above, the Java memory model only guarantees that basic reads and assignments are atomic operations. If you need atomicity over a larger scope of operations, you can use synchronized or Lock. Since synchronized and Lock guarantee that only one thread executes the protected code block at any moment, there is no atomicity problem within it, and atomicity is thus guaranteed.
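
As an illustration, here is a minimal sketch (the class and method names are hypothetical) of making a compound operation such as count++ atomic, either with synchronized or with a java.util.concurrent.locks.Lock:

import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class Counter {
    private int count = 0;
    private final Lock lock = new ReentrantLock();

    // Option 1: synchronized makes the read-add-write sequence atomic
    public synchronized void incrementSynchronized() {
        count++;
    }

    // Option 2: an explicit Lock achieves the same guarantee
    public void incrementWithLock() {
        lock.lock();
        try {
            count++;
        } finally {
            lock.unlock();   // always release the lock, even on exceptions
        }
    }
}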

3. Visibility

1. Definition

Visibility means that when multiple threads access the same variable, one thread modifies the value of the variable, and other threads can immediately see the modified value.

2. Examples

As a simple example, look at the following code:

//Code executed by thread 1

int i = 0;
i = 10;

//Code executed by thread 2

j = i;

From the above analysis, when thread 1 executes i = 10, it first loads the initial value of i into its working memory and then assigns 10 to it, so the value of i in thread 1's working memory becomes 10, but it is not immediately written back to main memory.

At this point, when thread 2 executes j = i, it first reads the value of i from main memory and loads it into thread 2's working memory. Note that the value of i in main memory is still 0 at this time, so j becomes 0 instead of 10.

This is the visibility problem. After thread 1 modifies the variable i, thread 2 does not immediately see the value modified by thread 1.

3. Visibility in Java

For visibility, Java provides the volatile keyword.

When a shared variable is declared volatile, it is guaranteed that a modified value is flushed to main memory immediately, and when another thread needs to read it, it reads the new value from main memory.

Ordinary shared variables cannot guarantee visibility, because after an ordinary shared variable is modified, it is uncertain when it will be written back to main memory; when another thread reads it, main memory may still hold the old value, so visibility cannot be guaranteed.

In addition, visibility can also be guaranteed through synchronized and Lock: they ensure that only one thread at a time acquires the lock and executes the synchronized code, and that changes to variables are flushed to main memory before the lock is released. So visibility is guaranteed.
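
A minimal sketch (hypothetical class, for illustration only) of using synchronized to make a plain field visible across threads: the lock release in the setter and the lock acquisition in the getter pair up to publish the new value.

public class SharedValue {
    private int value;   // plain field, not volatile

    // Writer thread: changes made inside the synchronized block are
    // flushed to main memory before the lock is released.
    public synchronized void setValue(int newValue) {
        this.value = newValue;
    }

    // Reader thread: acquiring the same lock guarantees it sees the
    // latest value written under that lock.
    public synchronized int getValue() {
        return value;
    }
}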

4. Ordering

1. Definition

Ordering: the program executes in the order in which the code is written.

2. Examples

As a simple example, look at the following code:

int i = 0;
boolean flag = false;

i = 1;                // statement 1
flag = true;          // statement 2

The above code defines an int variable and a boolean variable, and then assigns to both. In terms of code order, statement 1 comes before statement 2, but does the JVM guarantee that statement 1 is actually executed before statement 2? Not necessarily. Why? Because instruction reordering may occur here.

Let's explain what instruction reordering is. Generally speaking, in order to improve program efficiency, the processor may optimize the code it is given. It does not guarantee that the statements execute in exactly the order they appear in the code, but it does guarantee that the final result of the program is consistent with the result of executing the code in order.

For example, in the above code, whether statement 1 or statement 2 executes first has no effect on the final result, so during execution statement 2 may well be executed before statement 1.

Note, however, that although the processor reorders instructions, it guarantees that the final result is the same as if the code had executed in order. So how does it guarantee that? Consider the following example:

int a = 10;    // statement 1
int r = 2;     // statement 2
a = a + 3;     // statement 3
r = a * a;     // statement 4

This code has 4 statements, so one possible execution order is:

statement 2 → statement 1 → statement 3 → statement 4

Could the execution order instead be: statement 2 → statement 1 → statement 4 → statement 3?

Impossible, because the processor considers data dependencies between instructions when reordering. If instruction 2 must use the result of instruction 1, the processor guarantees that instruction 1 is executed before instruction 2.

While reordering does not affect the results of program execution within a single thread, what about multithreading? Let's see an example:

// thread 1:

context = loadContext();   // statement 1
inited = true;             // statement 2

// thread 2:
while(!inited){
   sleep();
}
doSomethingwithconfig(context);

In the above code, since statement 1 and statement 2 have no data dependency, they may be reordered. If reordering occurs and statement 2 is executed first in thread 1, thread 2 will think the initialization work has been completed, break out of the while loop, and execute the doSomethingwithconfig(context) method; but at this time context has not yet been initialized, which will cause a program error.

As can be seen from the above, instruction reordering will not affect the execution of a single thread, but it will affect the correctness of concurrent execution of threads.

That is, for concurrent programs to execute correctly, atomicity, visibility, and ordering must be guaranteed. As long as one is not guaranteed, it is possible to cause the program to behave incorrectly.

3. Ordering in Java

In the Java memory model, the compiler and processor are allowed to reorder instructions; reordering does not affect the result of single-threaded execution, but it can affect the correctness of multi-threaded concurrent execution.

In Java, the volatile keyword can be used to ensure a certain degree of "ordering". In addition, ordering can be ensured through synchronized and Lock. Obviously, synchronized and Lock guarantee that only one thread executes the synchronized code at any moment, which is equivalent to making threads execute that code sequentially, and this naturally ensures ordering.

In addition, the Java memory model has some innate "ordering" that is guaranteed without any extra means; this is usually called the happens-before principle. If the ordering of two operations cannot be derived from the happens-before principle, their ordering is not guaranteed, and the virtual machine is free to reorder them.

Let's introduce the happens-before principle in detail:

① Program order rule: within a thread, according to the code order, an operation written earlier happens-before an operation written later.

② Locking rule: an unlock operation happens-before a subsequent lock operation on the same lock.

③ volatile variable rule: a write to a volatile variable happens-before a subsequent read of that variable.

④ Transitivity rule: if operation A happens-before operation B, and operation B happens-before operation C, then operation A happens-before operation C.

⑤ Thread start rule: the start() method of a Thread object happens-before every action of that thread.

⑥ Thread interruption rule: the call to a thread's interrupt() method happens-before the point where the interrupted thread's code detects the interruption.

⑦ Thread termination rule: all operations in a thread happen-before the detection of that thread's termination; we can detect that a thread has terminated via the return of Thread.join() or the return value of Thread.isAlive().

⑧ Object finalization rule: the completion of an object's initialization happens-before the start of its finalize() method.

Among these 8 rules, the first 4 rules are more important, and the last 4 rules are obvious.

Let's explain the first 4 rules:

For the program order rule, the execution of a piece of code appears ordered within a single thread. Note that although this rule says that "an operation written earlier happens-before an operation written later", this refers to the order in which the program appears to execute, because the virtual machine may still reorder the instructions of the program code. Although it reorders, the final execution result is consistent with the result of sequential execution, and only instructions with no data dependency are reordered. So within a single thread, program execution appears to happen in order; this should be understood carefully. In fact, this rule only guarantees the correctness of results within a single thread; it says nothing about the correctness of execution across multiple threads.

The second rule is also relatively easy to understand: whether in a single thread or across multiple threads, if the same lock is currently held, it must be released before another lock operation on it can proceed.

The third rule is the most important one here. Its intuitive meaning is that if one thread first writes a volatile variable and another thread then reads it, the write operation is guaranteed to happen-before the read operation.

The fourth rule actually reflects the transitivity of the happens-before principle.
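
A small sketch (hypothetical class and field names) of how rules ①, ③ and ④ combine in practice: the write to the plain field data happens-before the write to the volatile flag (rule ①), which happens-before the read of the flag in another thread (rule ③), which happens-before the read of data (rule ①); by transitivity (rule ④) the reader is guaranteed to see data = 42.

public class HappensBeforeDemo {
    private int data = 0;                    // plain, non-volatile field
    private volatile boolean ready = false;

    // Writer thread
    public void writer() {
        data = 42;        // (1) ordinary write
        ready = true;     // (2) volatile write: publishes everything before it
    }

    // Reader thread
    public void reader() {
        if (ready) {                       // (3) volatile read
            System.out.println(data);      // (4) guaranteed to print 42, not 0
        }
    }
}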

5. In-Depth Understanding of the volatile Keyword

1. volatile guarantees visibility

Once a shared variable (a member variable of a class, or a static member variable of a class) is declared volatile, it carries two layers of semantics:

1) Ensures the visibility of different threads operating on this variable, that is, a thread modifies the value of a variable, and the new value is immediately visible to other threads.

2) Instruction reordering is prohibited.

Let's first look at a piece of code. Suppose thread 1 executes first and thread 2 executes afterwards:

// thread 1
boolean stop = false;
while(!stop){
    doSomething();
}

// thread 2
stop = true;

This is a typical piece of code that many people may use to stop a thread. But will this code actually behave correctly, i.e. will the thread always be stopped? Not necessarily. Perhaps most of the time it will stop the thread, but it is also possible that the thread is never stopped (the probability is very small, but once it happens the loop never exits).

Let's explain why this code might fail to stop the thread. As explained earlier, each thread has its own working memory while running, so when thread 1 runs, it copies the value of the stop variable into its own working memory.

Then, if thread 2 changes the value of the stop variable but has not yet written it back to main memory before turning to do other things, thread 1 never learns of thread 2's change to stop and keeps looping.

But after modifying it with volatile, it becomes different:

First: using the volatile keyword forces the modified value to be written to main memory immediately;

Second: if the volatile keyword is used, when thread 2 modifies the variable, the cache line holding stop in thread 1's working memory is invalidated (reflected at the hardware level, the corresponding cache line in the CPU's L1 or L2 cache is invalidated);

Third: since the cache line holding stop in thread 1's working memory is invalid, when thread 1 reads the value of stop again, it goes to main memory for it.

So when thread 2 modifies the value of stop (this is of course two operations: modifying the value in thread 2's working memory and then writing the modified value back to main memory), the cache line holding stop in thread 1's working memory is invalidated. When thread 1 then reads, it finds its cache line invalid, waits for the main memory address corresponding to that cache line to be updated, and then reads the latest value from the corresponding main memory location.

Then thread 1 reads the latest correct value.
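
Putting it together, here is a minimal runnable sketch of the stop-flag pattern (class and method names are hypothetical); declaring stop as volatile is what guarantees that the worker thread sees the update:

public class StopFlagDemo {
    // volatile guarantees the worker thread sees the update made by main
    private static volatile boolean stop = false;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!stop) {
                // doSomething();
            }
            System.out.println("worker stopped");
        });
        worker.start();

        Thread.sleep(100);   // let the worker run for a moment
        stop = true;         // without volatile, the worker might loop forever
        worker.join();
    }
}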

2. volatile does not guarantee atomicity

Let's see an example:

public class Test {
    public volatile int inc = 0;

    public void increase() {
        inc++;
    }

    public static void main(String[] args) {
        final Test test = new Test();
        for(int i=0;i<10;i++){
            new Thread(){
                public void run() {
                    for(int j=0;j<1000;j++)
                        test.increase();
                }
            }.start();
        }

        while(Thread.activeCount()>1)  // wait until the preceding threads have finished
            Thread.yield();
        System.out.println(test.inc);
    }
}

Think about what this program outputs. Some readers may think it is 10000. But in fact, running it shows that the result differs from run to run, and it is typically a number less than 10000.

Some readers may be puzzled: the program simply increments the variable inc, and since volatile guarantees visibility, the incremented value produced in each thread should be visible to the other threads; with 10 threads each performing 1000 increments, the final value of inc should be 1000 * 10 = 10000.

The misunderstanding here is that while volatile does guarantee visibility, the program above fails on atomicity. Visibility only guarantees that each read returns the latest value; volatile cannot make a compound operation on the variable atomic.

As mentioned earlier, auto-increment is not atomic: it includes reading the variable's original value, adding 1, and writing the result back to working memory. This means the three sub-operations may be interleaved, which can lead to the following situation:

Suppose the value of the variable inc is 10 at a certain moment.

Thread 1 performs an increment on the variable: it first reads the original value of inc, and is then blocked;

Then thread 2 performs an increment on the variable, and thread 2 also reads the original value of inc. Since thread 1 only read inc and did not modify it, the cache line holding inc in thread 2's working memory is not invalidated, and main memory is not refreshed, so thread 2 goes to main memory, finds that inc is 10, adds 1 to get 11, writes 11 to its working memory, and finally writes it to main memory.

Then thread 1 resumes and adds 1. Since it has already read inc, note that the value of inc in thread 1's working memory is still 10, so after adding 1 the value is 11; thread 1 then writes 11 to its working memory and finally to main memory.

Then after the two threads perform an auto-increment operation, inc only increases by 1.

This is the root cause: the increment operation is not atomic, and volatile does not guarantee that arbitrary operations on a variable are atomic.

Solution: you can lock with synchronized or Lock to guarantee the atomicity of the operation, or use AtomicInteger.

Since Java 1.5, the java.util.concurrent.atomic package provides atomic operation classes that wrap increment (add 1), decrement (subtract 1), addition (add a number), and subtraction (subtract a number) on primitive types and guarantee that these operations are atomic. The Atomic classes use CAS (Compare And Swap) to implement atomic operations; CAS is in turn implemented with the CMPXCHG instruction provided by the processor, and executing the CMPXCHG instruction is itself atomic.
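
For example, the counter above can be fixed by replacing the volatile int with an AtomicInteger. This is a sketch of one possible fix; the driver code mirrors the earlier example:

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicTest {
    // AtomicInteger makes the whole read-add-write a single atomic operation
    public AtomicInteger inc = new AtomicInteger(0);

    public void increase() {
        inc.incrementAndGet();   // atomic ++ implemented with CAS
    }

    public static void main(String[] args) {
        final AtomicTest test = new AtomicTest();
        for (int i = 0; i < 10; i++) {
            new Thread(() -> {
                for (int j = 0; j < 1000; j++)
                    test.increase();
            }).start();
        }

        while (Thread.activeCount() > 1)   // wait until the worker threads finish
            Thread.yield();
        System.out.println(test.inc.get());   // prints 10000
    }
}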

3. volatile guarantees ordering

As mentioned earlier, the volatile keyword can prohibit instruction reordering, so volatile can guarantee ordering to a certain extent.

The volatile keyword prohibiting instruction reordering has two meanings:

1) When a read or write of a volatile variable is executed, all operations before it must already have been carried out and their results must be visible to later operations, while operations after it must not yet have been carried out;

2) During instruction optimization, statements before the read or write of a volatile variable cannot be moved after it, and statements after it cannot be moved before it.

The above may sound convoluted. Here is a simple example:

// x and y are non-volatile variables
// flag is a volatile variable

x = 2;        // statement 1
y = 0;        // statement 2
flag = true;  // statement 3
x = 4;        // statement 4
y = -1;       // statement 5

Since flag is a volatile variable, during instruction reordering statement 3 will not be moved before statement 1 or statement 2, nor will it be moved after statement 4 or statement 5. Note, however, that the relative order of statements 1 and 2, and of statements 4 and 5, is not guaranteed.

Moreover, the volatile keyword guarantees that when statement 3 executes, statements 1 and 2 have already executed, and their results are visible to statements 3, 4, and 5.

So let's go back to the example we gave earlier:

// thread 1:
context = loadContext();   // statement 1
inited = true;             // statement 2

// thread 2:
while(!inited){
  sleep();
}
doSomethingwithconfig(context);

In the earlier example it was mentioned that statement 2 may be executed before statement 1, so context may still be uninitialized when thread 2 starts using it, resulting in a program error.

If the inited variable is declared with the volatile keyword, this problem does not occur, because when statement 2 executes, context is guaranteed to have been initialized.
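
A minimal sketch of how the two fields might be declared (loadContext, doSomethingwithconfig, and the Context type are hypothetical placeholders taken from the example above); only inited needs to be volatile for this publication to be safe:

public class ConfigLoader {
    private Context context;                  // written before the volatile flag
    private volatile boolean inited = false;  // volatile: forbids reordering with the line above

    // thread 1
    public void init() {
        context = loadContext();   // statement 1
        inited = true;             // statement 2: cannot be moved before statement 1
    }

    // thread 2
    public void waitAndUse() throws InterruptedException {
        while (!inited) {
            Thread.sleep(10);
        }
        doSomethingwithconfig(context);   // context is guaranteed to be initialized here
    }

    private Context loadContext() { return new Context(); }       // hypothetical
    private void doSomethingwithconfig(Context c) { /* ... */ }   // hypothetical
    static class Context { }                                      // hypothetical
}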

6. The Implementation Principle of volatile

1. Visibility

To improve processing speed, the processor does not communicate with memory directly; it first copies data from system memory into its internal cache and operates on it there, and only writes it back to memory at some later point.

When a write is performed on a variable declared volatile, the JVM sends a Lock-prefixed instruction to the processor, which writes the data of the cache line containing the variable back to system memory. This step ensures that when a thread modifies a declared-volatile variable, main memory is updated immediately.

However, the caches of other processors may still hold the old value at this point. So in a multiprocessor environment, to keep each processor's cache consistent, each processor checks whether its own cache is stale by sniffing the data propagated on the bus. When a processor finds that the memory address corresponding to one of its cache lines has been modified, it sets that cache line to the invalid state; when the processor later wants to operate on that data, it is forced to re-read it from system memory into its cache. This step ensures that the value of a declared-volatile variable obtained by other threads is always the latest value from main memory.

2. Ordering

The Lock-prefixed instruction is in effect equivalent to a memory barrier (also called a memory fence), which ensures that, during instruction reordering, instructions after it are not moved before the barrier and instructions before it are not moved after the barrier; that is, by the time the instruction at the memory barrier executes, all operations before it have completed.

7. Application Scenarios of volatile

The synchronized keyword prevents multiple threads from executing a piece of code at the same time, which can significantly affect execution efficiency; in some cases the volatile keyword performs better than synchronized. Note, however, that volatile cannot replace synchronized, because volatile does not guarantee the atomicity of operations. In general, using volatile requires the following two conditions to hold:

1) Writes to the variable do not depend on its current value
2) The variable is not included in an invariant together with other variables

Here are a few scenarios where volatile is used in Java.

①. Status flag

volatile boolean flag = false;

// thread 1
while(!flag){
    doSomething();
}

// thread 2
public void setFlag() {
    flag = true;
}

The thread is terminated according to this status flag.

②. Double-checked locking in the singleton pattern

class Singleton{
    private volatile static Singleton instance = null;

    private Singleton() {

    }

    public static Singleton getInstance() {
        if(instance==null) {
            synchronized (Singleton.class) {
                if(instance==null)
                    instance = new Singleton();
            }
        }
        return instance;
    }
}

Why use volatile to modify instance?

The main reason is that the statement instance = new Singleton() is not an atomic operation. In fact, in the JVM, this statement does roughly three things:

1. Allocate memory for instance
2. Call the Singleton constructor to initialize member variables
3. Point the instance reference to the allocated memory (instance is non-null after this step)

But the JVM's just-in-time compiler may reorder instructions. That is, the order of steps 2 and 3 above is not guaranteed: the final execution order may be 1-2-3 or 1-3-2. If it is the latter, and thread 2 preempts the CPU after step 3 has executed but before step 2 has executed, then instance is already non-null (but not yet initialized), so thread 2 returns instance directly and uses it, and the program then fails.
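
The three steps can be sketched as pseudocode (the helper names below are illustrative, not real JVM operations):

// Pseudocode for: instance = new Singleton();
// memory = allocate();        // step 1: allocate memory for the object
// ctor(memory);               // step 2: run the constructor, initialize fields
// instance = memory;          // step 3: publish the reference; instance != null from here
//
// Without volatile, steps 2 and 3 may be reordered, so another thread doing
// `if (instance == null)` can see a non-null but uninitialized object.
// Declaring instance volatile forbids this reordering.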
