JVM Learning (1)—A simple understanding of JVM and JMM memory models

1. Overview of JVM memory model

(1) Why does the JVM memory model appear?

The JVM memory model is a specification that describes the way and rules for the Java virtual machine to store data and code in the program into computer memory when executing a Java program. The JVM memory model defines the memory structure used by the Java virtual machine and the relationship between memory areas, allowing Java programs to run more efficiently.

The Java virtual machine is a stack-based virtual machine, unlike the register-based architecture of a physical machine. The JVM therefore needs a memory model to govern how the data and code involved in a Java program are stored and how memory is managed during execution. The JVM memory model enables Java programs to run in a more standardized, efficient, and flexible way.

JVM memory model diagram, as shown below:

[Figure: JVM memory model diagram]

  1. Program Counter (Program Counter Register): each thread has an independent program counter, used to store the address of the bytecode instruction the thread is currently executing.
  2. Java Virtual Machine Stack (JVM Stack): each thread has an independent JVM stack, which stores stack frames holding local variables, operand stacks, method return information, and so on.
  3. Native Method Stack: similar to the JVM stack, but used for executing native methods.
  4. Heap: used to store Java object instances; shared by all threads.
  5. Method Area: used to store class structure information, static variables, the constant pool, and other data; shared by all threads.
  6. Runtime Constant Pool: part of the method area, used to store literals and symbolic references generated at compile time.
  7. Direct Memory: memory not managed by the JVM's garbage-collected heap but accessed through JVM APIs; typically used to improve the performance of I/O operations (see the sketch below).
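As a small illustration of the last area, direct memory can be allocated through the NIO API outside the garbage-collected heap (a minimal sketch; the capacity is arbitrary):

import java.nio.ByteBuffer;

public class DirectMemoryDemo {
    public static void main(String[] args) {
        // allocates 1 KB of native memory outside the Java heap
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);
        direct.putInt(42);   // write into native memory
        direct.flip();
        System.out.println(direct.getInt()); // reads back 42
    }
}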

(2) Why is JVM memory divided in this way? The main points are as follows:

  1. Separate program data and JVM internal data structures : The Java virtual machine needs to store a lot of internal structure and state information, as well as the data and code of the Java program. In order to make these different types of data structures independent of each other, the Java virtual machine stores them in different memory areas.

  2. Memory management needs : the JVM needs to manage and optimize memory. Dividing memory into different areas helps the JVM control allocation, reclamation, and compaction more precisely, improving the program's efficiency and stability. For example, Java programs create and destroy large numbers of objects; to avoid overly frequent garbage collection, the JVM splits the heap into a young generation and an old generation. The young generation stores newly created objects, while the old generation stores long-lived objects. In addition, the JVM provides areas such as the method area and the virtual machine stack to store class information and the stack frames of executing threads. By dividing memory into areas, the JVM can control allocation and release more precisely, avoiding problems such as memory leaks and fragmentation. This design also makes GC more efficient: different areas use different GC strategies, each optimized for that area's characteristics, improving GC throughput and response time.

  3. Flexible memory management : In different application scenarios, the JVM needs to allocate memory spaces of different sizes and life cycles to Java programs. Dividing different memory areas allows the JVM to manage memory more flexibly to meet different needs.

  4. Memory safety : The JVM's memory model has a powerful security mechanism that can protect the data and code of Java programs from being destroyed by malicious programs. By dividing different memory areas, the JVM can better achieve memory safety.

To sum up, the JVM divides memory into different areas mainly to store a Java program's data and code separately, manage memory better, improve the program's efficiency and stability, meet different memory requirements, and ensure the memory safety of the program.
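For example, generational and stack sizing are exposed through standard HotSpot flags (a sketch; the sizes here are arbitrary):

-Xms512m          # initial heap size
-Xmx512m          # maximum heap size
-Xmn128m          # young (new) generation size
-XX:NewRatio=3    # alternative: old-to-young generation ratio
-Xss1m            # per-thread JVM stack size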

(3) Common GC garbage collectors and their collection algorithms, as shown in the following list:

  1. Serial collector : uses a copying algorithm in the young generation; its old-generation counterpart, Serial Old, uses a mark-compact algorithm.

  2. Parallel collector : Parallel Scavenge uses a copying algorithm in the young generation, and Parallel Old uses a mark-compact algorithm in the old generation.

  3. CMS collector : an old-generation collector that uses a mark-sweep algorithm; its main phases are initial mark, concurrent mark, remark, and concurrent sweep.

  4. G1 collector : region-based; it copies live objects between regions (evacuation), which behaves like mark-compact as a whole and avoids the fragmentation of mark-sweep.

It should be noted that these are just some common garbage collectors and corresponding recycling algorithms. In fact, there are many other garbage collectors and recycling algorithms, and different garbage collectors may also use different combinations for recycling.
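These collectors are selected with standard HotSpot flags, for example:

-XX:+UseSerialGC           # Serial + Serial Old
-XX:+UseParallelGC         # Parallel Scavenge + Parallel Old
-XX:+UseConcMarkSweepGC    # ParNew + CMS (removed in JDK 14)
-XX:+UseG1GC               # G1 (the default since JDK 9)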

2. Overview of JMM memory model

JMM (Java Memory Model) is an abstract concept that does not physically exist. It is a set of conventions or specifications that define how each variable in a program may be read and written, and when one thread's writes to shared variables become visible to another thread. Its key technical points revolve around the three properties of multi-threading: atomicity, visibility, and ordering, each of which is discussed below.

Since the entity that actually runs a program in the JVM is a thread, the JVM creates a working memory (sometimes called stack space) for each thread when it is created, used to store thread-private data. The Java memory model stipulates that all variables are stored in main memory, a shared area accessible to all threads. However, a thread's operations on variables (reads, assignments, etc.) must be performed in its working memory: the thread first copies the variable from main memory into its own working memory, operates on the copy, and then writes the variable back to main memory. A thread cannot operate directly on variables in main memory; its working memory holds copies of them. Different threads cannot access each other's working memory, so communication (value passing) between threads must go through main memory. The access process is sketched below:

[Figure: interaction between thread working memory and main memory]

Program execution is driven by threads; the thread is the carrier of program execution. So what is the JMM specification actually used for?

  1. Implement the abstract relationship between threads and main memory through JMM
  2. Shield the differences in memory access between various hardware platforms and operating systems to achieve consistent access to memory by Java programs on various platforms.

Assume no JMM control existed. Since threads cannot access each other's copies and the shared data actually lives in main memory, operating on variables in the way described above would inevitably cause data consistency problems. An example follows:

[Figure: two threads copying count from main memory into their working memories]

There is a variable count in main memory with an initial value of 0. If thread A wants to increment it by 1, it must first read the count variable from main memory and copy it into its private working memory, then add 1 to the copy. After thread A updates the count value in its private memory, it will synchronize it back to main memory at some point, but that point is not fixed. Suppose thread A has not yet flushed count back to main memory when thread B copies the old count into its own private memory. At this moment count is still the initial value 0; thread B also adds 1 and eventually flushes it back to main memory. The expected value of count is 2, but since there is no visibility guarantee, the final value is 1.

This is a dirty read of thread data (a lost update); a runnable sketch follows. JMM specifications (constraints) are needed to solve this visibility problem. The key technical points revolve around the three properties of multi-threading: atomicity, visibility, and ordering, discussed in the next section.
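A minimal sketch reproducing this lost update (class and names are hypothetical; across runs the printed value may be 1 instead of the expected 2):

public class LostUpdateDemo {

    private static int count = 0; // shared variable in main memory

    public static void main(String[] args) throws InterruptedException {
        Runnable addOne = () -> {
            int copy = count;  // copy count into working memory
            copy = copy + 1;   // modify the private copy
            count = copy;      // flush the copy back to main memory
        };
        Thread a = new Thread(addOne);
        Thread b = new Thread(addOne);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("count = " + count); // expected 2, may print 1
    }
}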

3. Three major features of JMM

1. Visibility

Visibility refers to whether other threads can immediately know the change when a thread modifies the value of a shared variable. JMM stipulates that all variables are stored in main memory.

Generally, a thread is not allowed to modify a value in main memory directly. The value must first be copied from main memory into the thread's local (working) memory; the thread modifies that copy, and after the modification it is written back to main memory under JMM control, after which other threads are notified of the change. Threads cannot directly access variables in each other's working memory; variable values are passed between threads through main memory.

For example, the following example:

public class VisibilityDemo {

    private static boolean flag = true;

    public static void main(String[] args) throws InterruptedException {

        Thread thread1 = new Thread(() -> {
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            flag = false;
            System.out.println("Thread1 set flag to false");
        });

        Thread thread2 = new Thread(() -> {
            while (flag) {
                // busy-wait here until the flag variable is modified
            }
            System.out.println("Thread2 detected flag change");
        });

        thread1.start();
        thread2.start();
    }
}

In the above code, two threads read and write the shared variable flag. Both are started from the main thread. thread1 sets flag to false after sleeping for 500 ms, while thread2 spins in a loop waiting for the flag variable to be modified; once it observes flag becoming false, it prints Thread2 detected flag change.

However, since there is no synchronization mechanism, thread2 may never detect thread1's modification of the shared variable: the write may stay in thread1's local cache without being flushed to main memory in time, so thread2 never sees it. The program can therefore behave in one of two ways:

1. The program never exits and stays in the waiting loop, because thread2 cannot detect thread1's modification.
2. Thread2 detected flag change is printed, but this does not happen on every run, because thread1's modification of flag may remain in its local cache and not be refreshed to main memory in time.

To solve this problem, we need a synchronization mechanism to guarantee visibility, such as the synchronized keyword or the volatile keyword, as shown below.
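A one-line sketch of the volatile fix: declare flag volatile so every write is flushed to main memory and every read bypasses the stale local copy:

    private static volatile boolean flag = true;

With this change, thread2 is guaranteed to observe thread1's write and the program always terminates.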

2. Orderliness

For single-threaded code, we are used to assuming that it executes from top to bottom, in order. However, to improve performance, the compiler and processor may reorder instructions. Instruction reordering guarantees consistent serial semantics (the single-threaded result is unchanged), but it has no obligation to keep the semantics consistent across multiple threads, so dirty reads can easily occur. In other words, when two unrelated lines of code execute, the first one is not necessarily executed first; the execution order may be optimized. Reordering can happen at the following stages, as shown below:

[Figure: source code → compiler reordering → instruction-level parallel reordering → memory system reordering → final instruction sequence]

1. Compiler optimization reordering : the compiler may rearrange the execution order of statements without changing the semantics of the program in a single-threaded environment. The goal is to reduce the number of register reads and stores and to reuse data already held in registers.

For example, consider the three statements below. Assume that A and B happen to be assigned the same slot in the stack, so B overwrites A's original value; when C later uses A, the value of A has to be read again, degrading performance:

Step 1: A = the result of some computation
Step 2: B = the result of some computation
Step 3: C = a computation that uses the value of A

After optimization and reordering, it becomes:

Step 1: A = the result of some computation
Step 2: C = a computation that uses the value of A
Step 3: B = the result of some computation

This way, C can reuse the value of A still held in the register.

2. Instruction-level parallel reordering : The processor executes multiple instructions in parallel. If there is no data dependency, the processor can change the execution order of instructions corresponding to statements.

For example, the following two instructions have no data dependence at all and can be executed in parallel to improve execution efficiency:

int a = 5;
int b = 6;

3. Memory system reordering : The processor uses cache and read-write buffers, making data loading and storage operations appear to be out of order.

[Figure: memory system reordering through caches and read/write buffers]

For a single thread there is nothing to worry about: the final result is guaranteed to match the result of executing the code in order.

When reordering, the processor must take the data dependencies between instructions into account. So what is a data dependency?

For example, the following code:

    public static synchronized void sop() {
        int x = 15;  // statement 1
        int y = 20;  // statement 2
        x = x + 100; // statement 3
        y = x * x;   // statement 4
    }

These statements can execute in the order 1-2-3-4, 2-1-3-4, or 1-3-2-4, but statement 4 cannot be moved before statement 3, because statement 4 depends on statement 3 having executed first; putting the cart before the horse would produce a wrong result. A data dependency like the one between statements 3 and 4 must never be violated: where a data dependency exists, instruction reordering is prohibited.

Data dependency definition: If two operations access the same shared variable, and one of the two operations is a write operation, then there is a data dependency between the two operations, and reordering is not allowed.

Data dependencies fall into three categories. Assume two int variables a and b:

  • Read then write : read a variable, then write to that variable
a = b; // reads b
b = 1; // then writes b
  • Write after write : write a variable, then write to it again
a = 10;
a = 100;
  • Write then read : write a variable, then read it
a = 10;
b = a;

Note that there is no read-after-read category, because a data dependency must involve a write operation. Instructions with data dependencies on each other will not be reordered.

2.1. What is as-if-serial semantics?

Regardless of whether or how reordering happens, the execution result of the program must not change in a single-threaded environment. Compilers, the JVM, and processors must all implement this as-if-serial semantics. as-if-serial guarantees that the result is unchanged in a single-threaded environment; happens-before, more importantly, guarantees that the result is unchanged in a multi-threaded environment.

In a multi-threaded environment, threads execute in interleaved fashion. Because of compiler optimization and reordering, whether the variables used by two threads remain consistent is uncertain, and the result becomes unpredictable. How can this be solved? By following the happens-before rules.

2.2. happens-before rule

In the JMM, the happens-before rules determine the ordering between two operations. If the result of one operation needs to affect another operation, a happens-before relationship must hold between them to guarantee their order.

Specifically, if operation A happens-before operation B, then the result of A is visible to B and A is ordered before B. happens-before relationships can be established in many ways, such as synchronization, volatile variables, thread start and termination, thread join, and so on.

In Java, happens-before rules include the following aspects:

  1. Program order rule : within a thread, following program code order, an earlier operation happens-before any subsequent operation.

  2. Locking rule : an unlock of a lock happens-before every subsequent lock of that same lock, so everything done before releasing the lock is visible to the thread that acquires it next.

  3. Volatile variable rule : a write to a volatile variable happens-before every subsequent read of that variable.

  4. Transitivity rule : if A happens-before B and B happens-before C, then A happens-before C.

  5. Thread start rule : a call to a Thread object's start() method happens-before every action in the started thread.

  6. Thread interruption rule : a call to a thread's interrupt() method happens-before the point where the interrupted thread detects the interruption (e.g., via isInterrupted() or a thrown InterruptedException).

  7. Thread termination rule : every action in a thread happens-before another thread detects that this thread has terminated (e.g., Thread.join() returning, or Thread.isAlive() returning false).

  8. Object finalization rule : the completion of an object's initialization (the end of its constructor) happens-before the start of its finalize() method.

These eight rules describe the happens-before relationships between operations in the JMM and are what guarantee the correctness of Java concurrent programs.
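A classic sketch of rules 1, 3, and 4 working together (a hypothetical class; data is an ordinary field, ready is volatile):

public class HappensBeforeDemo {

    private int data = 0;
    private volatile boolean ready = false;

    public void writer() {            // executed by thread A
        data = 42;                    // (1) program order rule: happens-before (2)
        ready = true;                 // (2) volatile write
    }

    public void reader() {            // executed by thread B
        if (ready) {                  // (3) volatile read: (2) happens-before (3)
            System.out.println(data); // transitivity: (1) happens-before (3), so this prints 42
        }
    }
}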

Here is another example to illustrate applying the eight rules:

private int value = 0;

public void setValue(int value) {
    this.value = value;
}

public int getValue() {
    return this.value;
}

Suppose there are two threads, A and B. Thread A calls the setValue() method and thread B then calls the getValue() method. What return value does thread B receive? Answer: it is uncertain.

Let's analyze this briefly against the eight happens-before rules above (rules 5, 6, 7, and 8 can be ignored, since those operations are not involved):

  1. The two methods are called by different threads, not within the same thread, so the program order rule is not satisfied.
  2. Neither method uses a lock, so the locking rule is not satisfied.
  3. The variable is not declared with the volatile keyword, so the volatile variable rule is not satisfied.
  4. With none of the above holding, the transitivity rule cannot apply either.

Therefore no happens-before relationship between thread A's write and thread B's read can be derived from the happens-before principles, so this code is not thread-safe. How can it be fixed? See the sketch after this list.

  1. Add the synchronized keyword to both the getValue() and setValue() methods.
  2. Declare the value variable with the volatile keyword.
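A minimal sketch of the second fix, assuming the same field and methods as above:

private volatile int value = 0;

public void setValue(int value) {
    this.value = value; // volatile write: happens-before every subsequent volatile read
}

public int getValue() {
    return this.value;  // volatile read: guaranteed to see the latest write
}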

A note on ordering: if everything in the JMM depended on the volatile and synchronized keywords, many programs would become very verbose. Generally we do not sprinkle these two keywords everywhere when writing code; they are mostly used in concurrent programming. We can get away with this precisely because of the constraints and guarantees that the happens-before principle imposes in the Java language.

To summarize happens-before, there are two general principles:

1. If one operation happens-before another, then the result of the first operation is visible to the second, and the first is ordered before the second (a constraint for the programmer).

2. The existence of a happens-before relationship between two operations does not mean they must execute in exactly that order. If a reordering yields better performance and the execution result is consistent with the result required by the happens-before relationship, the reordering is not illegal; for example, 1 + 2 = 2 + 1, the final result is the same (a constraint on compiler and processor reordering).

2.3. Understanding volatile semantics

Regarding the visibility and ordering problems caused by instruction reordering, the JMM defines the eight happens-before principles above to guarantee visibility and ordering between two operations in a multi-threaded environment. Beyond these principles, developers are also given the volatile and synchronized keywords to address atomicity, visibility, and ordering. Another very important role of volatile is to forbid instruction reordering.

Its semantics can be summarized in two sentences:

1. When a volatile variable is written, JMM immediately flushes the thread's local copy of the variable back to main memory.
2. When a volatile variable is read, JMM invalidates the thread's local copy and reads the shared variable directly from main memory.


2.3.1. Implementation of volatile memory semantics:

1. Bytecode level

volatile modifies variables; at the bytecode level, such a field carries the ACC_VOLATILE flag.
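This flag can be observed with javap (a sketch; Demo and its count field are hypothetical):

// given: public class Demo { private volatile int count; }
$ javap -v Demo
...
  private volatile int count;
    descriptor: I
    flags: (0x0042) ACC_PRIVATE, ACC_VOLATILE
...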

2. JMM level

This level determines where memory barrier instructions are inserted and which ones to insert. See the volatile variable rule table below:

2.1. volatile variable rules:

First operation | Second operation: ordinary read/write | Second operation: volatile read | Second operation: volatile write
--- | --- | --- | ---
Ordinary read/write | can be reordered | can be reordered | cannot be reordered
volatile read | cannot be reordered | cannot be reordered | cannot be reordered
volatile write | can be reordered | cannot be reordered | cannot be reordered
  1. When the first operation is a volatile read, no reordering is allowed whatever the second operation is. This ensures that operations after a volatile read are not reordered to before it.

  2. When the second operation is a volatile write, no reordering is allowed whatever the first operation is. This ensures that operations before a volatile write are not reordered to after it.

  3. When the first operation is a volatile write and the second is a volatile read, reordering is not allowed.

2.2. Barrier instruction insertion strategy:

The effects above occur because the volatile keyword automatically inserts the four memory barrier instructions (loadload, storestore, loadstore, storeload) around volatile reads and writes. The specific insertion strategy is as follows:

  • Volatile writes : insert a storestore barrier before each volatile write operation, and a storeload barrier after it.

  • Volatile reads : insert a loadload barrier and then a loadstore barrier after each volatile read operation, as sketched below.
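A sketch of the resulting instruction sequences (pseudocode; v stands for any volatile field):

// volatile write of v:
//   StoreStore barrier   <- earlier ordinary writes cannot sink below this point
//   v = x;               // the volatile store itself
//   StoreLoad barrier    <- later reads cannot float above this point

// volatile read of v:
//   x = v;               // the volatile load itself
//   LoadLoad barrier     <- later reads cannot float above the load
//   LoadStore barrier    <- later writes cannot float above the load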

Why insert it like this? Let’s explain it below:

Let's first see why a storestore barrier is inserted before each volatile write and a storeload barrier after it, as shown below:

[Figure: barriers inserted around a volatile write]

Next, let's see why a loadload barrier and a loadstore barrier are inserted after every volatile read, as shown below:

[Figure: barriers inserted around a volatile read]

But why is reordering between volatile operations and some ordinary reads and writes not prohibited?

(1) Volatile write: targets the variable modified by volatile; the write flushes that variable's value from working memory back to main memory.
(2) Ordinary read: what is read is by definition not the volatile-modified variable.

Therefore an ordinary read does not affect the value of the volatile variable or its corresponding memory, and cannot break volatile memory semantics.

In one simple sentence: a volatile write is flushed straight back to main memory, and a volatile read fetches straight from main memory. So how does the volatile keyword guarantee visibility and ordering at the bottom layer? Through memory barrier instructions, which the following sections examine in detail.

3. Processor level

When the CPU executes the machine instructions, a lock-prefixed instruction is used to implement the volatile functionality.

(1) The lock prefix first locks the bus/cache line, the subsequent instruction executes, and finally the lock is released, flushing the cached data back to main memory.
(2) While the lock holds the bus/cache line, read and write requests from other CPUs are blocked until the lock is released. Writes performed under the lock invalidate the corresponding data in other CPUs' caches, so when those CPUs subsequently read the data they must load the latest value from main memory.

Supplement: a cache line is the smallest storage unit in the cache, usually a 64-byte (or larger) block of data, also called a cache block. When the processor needs to access a memory address, it first checks whether a corresponding cache line exists. If it does, the processor reads the data directly from the cache line; this is a cache hit. If it does not, the processor must first read the data block from memory into the cache; this is a cache miss. Because a cache line is relatively large, one read fetches multiple values, improving read efficiency. And since cache access is much faster than memory access, the cache effectively reduces the processor's memory accesses and improves program performance.

2.4. Memory barrier instructions

Let's start with an analogy. During a holiday, if nobody controls the crowd at West Lake, it looks chaotic and some people may even fall into the water, as shown below:

[Figure: an uncontrolled crowd]

Because there is no control, order cannot be guaranteed, so rules are needed to forbid the disorder, such as the human-wall crowd barriers used on Nanjing Road in Shanghai, as shown below:

[Figure: a human wall keeping the crowd in order]

This keeps the people on either side from running into each other. The human wall is equivalent to the barrier we are discussing, except that ours sits in memory, so it is called a memory barrier. The volatile keyword uses memory barriers to guarantee visibility and ordering.

A memory barrier (Memory Barrier) is a group of processor instructions used to enforce ordering restrictions on memory operations. Memory barriers come in several kinds, including read barriers, write barriers, and full barriers.

In Java, memory barriers are widely used to implement memory visibility and happens-before rules. The Java memory model stipulates that inserting appropriate memory barrier instructions before or after performing an operation can ensure program correctness and ensure that each thread's operations on shared variables can be performed in a certain order.

For example, for a write operation to a volatile variable, the Java memory model requires that a write barrier (store barrier) be inserted after the write operation to ensure that the results of the operation are visible to other threads. For reading operations of volatile variables, a read barrier (load barrier) needs to be inserted before the reading operation to ensure that the reading operation sees the latest value.

In summary, memory barriers are a very important hardware and compiler technology that play a key role in implementing multi-threaded programming and shared memory.

2.5. Memory barrier classification

Memory barriers can be divided into several kinds, including read barriers, write barriers, and full barriers:

  1. Read barrier (Load Barrier) : Insert a read barrier before a read operation to ensure that all write operations before the read operation are visible to the read operation, that is, the read operation sees the latest write operation results. In other words, this read instruction will directly invalidate the data in its working memory/CPU cache and re-read the latest data from the main memory.

  2. Write Barrier (Store Barrier) : Inserting a write barrier after a write operation ensures that the write operation is visible to the read operations of other threads, that is, other threads see the latest write operation results. In other words, this write instruction will force the data in the cache area to be flushed back to the main memory.

  3. Full barrier (fullFence) : The full barrier is a combination of a read barrier and a write barrier. It not only ensures that the write operation is visible to the read operations of other threads, but also ensures that the read operation sees the latest write operation results.

Note that memory barriers must be used with caution; overusing them will hurt program performance.
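Since Java 9, the same three fences are exposed without Unsafe through java.lang.invoke.VarHandle (a minimal sketch):

import java.lang.invoke.VarHandle;

public class FenceDemo {

    static int a, b;

    public static void main(String[] args) {
        a = 1;
        VarHandle.storeFence(); // write barrier: the store to a is visible before later stores
        b = 2;

        int x = b;
        VarHandle.loadFence();  // read barrier: later loads cannot float above this point
        int y = a;

        VarHandle.fullFence();  // full barrier: orders all preceding loads/stores against all following ones
        System.out.println(x + " " + y);
    }
}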

Let's see what memory barriers look like in the JDK and HotSpot source:

  1. The source of Unsafe.java: the Unsafe class in Java has three native methods, all of which call into JVM code for execution.

public final class Unsafe {
    @HotSpotIntrinsicCandidate
    public native void loadFence();
    @HotSpotIntrinsicCandidate
    public native void storeFence();
    @HotSpotIntrinsicCandidate
    public native void fullFence();
}
  2. The source of Unsafe.cpp, as shown below: in the HotSpot source, the bottom layer calls the OrderAccess::acquire(), OrderAccess::release(), and OrderAccess::fence() methods respectively.

[Figure: Unsafe.cpp calling OrderAccess::acquire(), OrderAccess::release(), OrderAccess::fence()]

  3. Continuing into the OrderAccess class methods, as shown below:

[Figure: OrderAccess barrier methods]

The four above are what we often call the four memory barrier instructions: loadload, storestore, loadstore, storeload.

  4. Continuing into the Linux x64 implementation file orderAccess_linux_x64.inline.hpp, as shown below:

[Figure: orderAccess_linux_x64.inline.hpp barrier implementations]

2.6. The four memory barrier instructions

What do the four barrier instructions seen above (loadload, storestore, loadstore, storeload) mean? They are in fact the before/after permutations of the read barrier and the write barrier, combined into four barrier instructions.

Barrier type | Instruction example | Description
--- | --- | ---
loadload | load1; loadload; load2 | Forbids reordering: load1 must read its data before load2 and all subsequent reads. It acts as a read barrier for load2: load2's cached copy in working memory/the CPU cache is invalidated, and the latest value is re-read from main memory.
storestore | store1; storestore; store2 | Forbids reordering: store1 must be flushed back to main memory before store2 and all subsequent writes. It acts as a write barrier for store1: the data written by store1 is forced back to main memory.
loadstore | load1; loadstore; store2 | Forbids reordering: load1 must finish reading before store2 and all subsequent writes. Its placement is the reverse of the read/write barriers, so it has neither barrier effect; it only forbids reordering.
storeload | store1; storeload; load2 | Forbids reordering: store1's write must be flushed back to main memory before load2 and all subsequent reads execute. It sits after a write (a write barrier for store1) and before a read (a read barrier for load2), so it has both the read-barrier and write-barrier effects.

A question about loadload: why is only the later load2 protected and not the earlier load1? Following program order, some read always executes first to fetch the data; at that point load1's cache holds nothing, so the value can only come from main memory. Only a later read needs to invalidate its cached copy and re-read the latest data from main memory.

From the table above we can see that the storeload barrier instruction is relatively heavyweight: loadload amounts to a read barrier, storestore amounts to a write barrier, and loadstore only forbids reordering (its placement is the opposite of the read/write barriers). Only storeload has the read-barrier, write-barrier, and no-reordering effects all at once; it interacts heavily with memory, has large interaction latency, and consumes the most resources.

Extension: these barrier instructions are not the processor's actual execution instructions but cross-platform instructions defined by JMM. Because different hardware implements memory barriers differently, JMM abstracts these barrier instructions to shield the differences between underlying hardware platforms; at run time the JVM generates the corresponding machine code for each platform. On a given platform some JMM barrier instructions may be optimized away, so only part of them are actually emitted. For example, on x86 machines only storeload corresponds to a real instruction (HotSpot typically emits a lock-prefixed instruction for it); the others are not needed and are replaced by nop, i.e., no operation.

2.7. The volatile read/write process: the 8 atomic operations

JMM defines 8 kinds of atomic operations to ensure that operations on shared variables are atomic in a multi-threaded environment; that is, while one thread is performing such an operation, other threads cannot perform it at the same time, which guarantees data consistency.

These 8 atomic operations include:

lock : Acts on variables in main memory, marking a variable as exclusive to a thread.
unlock : Acts on a variable in main memory, releasing a variable that is in a locked state. After being released, it can be locked by other threads.
read : Acts on variables in main memory, transferring the value of a variable from main memory to the working memory of the thread for use in subsequent load operations.
load : Acts on variables in the thread's working memory, placing the variable value obtained from the main memory by the read operation into a copy of the variable in the working memory.
use : Acts on variables in the thread's working memory, passing the value of a variable in the working memory to the execution engine.
assign (assignment): Acts on a variable in the thread's working memory, assigning a value received from the execution engine to a variable in the working memory.
store (storage): Acts on variables in the thread's working memory, passing the value of a variable in the working memory to the main memory for use in subsequent write operations.
write : Acts on variables in main memory, writing the value of the variable obtained from the working memory by the store operation into the variable in main memory.
Through the combination of these atomic operations, it can be ensured that the access and modification of shared variables are atomic, thereby avoiding the problem of data inconsistency.

[Figure: the eight atomic operations between main memory and working memory]

Notice: thread competition can still occur in the gap between use and assign (inside the CPU execution engine). For example, the i++ operation actually consists of three instructions at the bottom layer; executing them between use and assign cannot be made atomic by these operations alone, so locking is also required to guarantee atomicity and keep the data correct, as the trace below shows.
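A sketch of how the eight operations can interleave to lose an update (a hypothetical trace; count starts at 0 and two threads each execute count++):

Thread A: read(count)=0 -> load -> use(0)         // execution engine computes 0 + 1
Thread B: read(count)=0 -> load -> use(0)         // execution engine also computes 0 + 1
Thread A: assign(1) -> store -> write(count=1)
Thread B: assign(1) -> store -> write(count=1)    // overwrites A's result: final value is 1, not 2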

The eight atomic operations defined by JMM guarantee atomicity only for a single read or write of a single variable; each individual operation is atomic when executed.

In actual development, compound operations (such as i++) involve a read, a computation, and a write, or modify multiple variables. In such cases the atomic operations provided by JMM cannot guarantee atomicity, and synchronization mechanisms such as synchronized or Lock must be used to ensure thread safety. Therefore, the eight atomic operations defined by JMM cannot by themselves fully guarantee atomicity.

For example, the following reference example:

public class VolatileNotAtomicExample {

    private volatile int count = 0;

    public void increment() {
        count++;
    }

    public int getCount() {
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        final int THREADS_COUNT = 1000;
        final int INCREMENT_COUNT = 1000;

        VolatileNotAtomicExample example = new VolatileNotAtomicExample();

        Thread[] threads = new Thread[THREADS_COUNT];

        for (int i = 0; i < THREADS_COUNT; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < INCREMENT_COUNT; j++) {
                    example.increment();
                }
            });
            threads[i].start();
        }

        for (int i = 0; i < THREADS_COUNT; i++) {
            threads[i].join();
        }

        System.out.println("Count: " + example.getCount());
    }
}

In the above code, 1000 threads each perform 1000 increments on the same volatile variable concurrently. In theory the final value of count should be 1000 * 1000 = 1000000, but the actual result is usually smaller, because volatile only guarantees visibility, not atomicity.

The bytecode underlying count++ (the increment of the instance field count) is actually a read-add-write sequence:

aload_0
dup
getfield      // read count onto the operand stack
iconst_1
iadd          // add 1
putfield      // write the result back to count

Executing this read-add-write sequence between use and assign is not atomic: another thread can interleave between the getfield and the putfield, losing an update.

2.8. Using volatile in DCL

DCL (Double-checked locking) is an implementation of the singleton pattern that achieves both thread safety and performance by checking twice around the lock. Using volatile is what makes the DCL implementation thread-safe.

The following is a sample code that uses volatile to implement DCL:

public class Singleton {

    private static volatile Singleton instance;

    private Singleton() {
    }

    public static Singleton getInstance() {
        if (instance == null) {
            synchronized (Singleton.class) {
                if (instance == null) {
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }
}

In the above example, the instance variable is declared volatile, which ensures that instance is visible to all threads. Without volatile, one thread might cache instance in its local memory instead of reading it from main memory, so another thread might not see the latest value. At the same time, double-checked locking avoids synchronizing the entire getInstance() method on every call, improving performance.

If the volatile modifier is not added, then when the instructions inside instance = new Singleton() are reordered, a thread passing the instance == null check can observe the following:

Thread A begins creating the Singleton object: memory is allocated and the reference is assigned to instance before the constructor has finished running.
Thread B calls getInstance(), finds instance != null, and returns instance directly, even though the object has not actually been initialized yet.

Adding the volatile modifier forbids this instruction reordering and guarantees the visibility of instance to all threads: no thread can observe the instance reference before instantiation completes. This is what makes the DCL implementation thread-safe. A sketch of the decomposition follows.
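A sketch of why this happens (pseudocode in comments; allocate and ctorInstance are illustrative names, not real APIs):

// instance = new Singleton(); decomposes roughly into three steps:
//   1. memory = allocate();      // allocate space for the object
//   2. ctorInstance(memory);     // run the constructor (initialize fields)
//   3. instance = memory;        // point the reference at the memory
//
// Without volatile, steps 2 and 3 may be reordered to 1 -> 3 -> 2,
// so another thread can observe a non-null but uninitialized instance.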

3. Atomicity

In Java, atomicity means that an operation cannot be interrupted or divided, that is, either the operation has been completely executed, or it has not been executed yet. Simply put, atomicity means that an operation is either fully executed successfully or completely failed, and there is no such thing as half execution.

In multi-threaded programming, atomicity is a very important concept, because multiple threads may access and modify the same data at the same time. If atomicity is not guaranteed, data inconsistency or other abnormalities may occur.

In Java, a single read or write of most primitive-type variables (and references) is atomic and will not be interrupted by another thread; the exceptions are long and double, whose reads and writes are only guaranteed atomic when the variable is volatile. Compound operations, such as a combination of several operations (i++, check-then-act, etc.), require synchronization mechanisms such as synchronized, Lock, or the Atomic classes to guarantee atomicity and avoid thread-safety issues.

Assume that two threads add 1 to a shared counter at the same time. The code is as follows:

public class Counter {

    private int count = 0;

    public void increment() {
        count++;
    }

    public int getCount() {
        return count;
    }
}

public class MyThread implements Runnable {

    private Counter counter;

    public MyThread(Counter counter) {
        this.counter = counter;
    }

    @Override
    public void run() {
        for (int i = 0; i < 10000; i++) {
            counter.increment();
        }
    }
}

public class Main {

    public static void main(String[] args) throws InterruptedException {
        Counter counter = new Counter();
        Thread thread1 = new Thread(new MyThread(counter));
        Thread thread2 = new Thread(new MyThread(counter));
        thread1.start();
        thread2.start();
        thread1.join();
        thread2.join();
        System.out.println("Count: " + counter.getCount());
    }
}

In the above code, there is a Counter class that represents a counter. It has an increment() method for adding 1 to the counter, and a getCount() method for getting the current value of the counter.

Next, define a MyThread class to represent a thread, which holds a Counter object and repeats the operation of adding 1 to the counter 10,000 times.

Finally, start two threads in the Main class and wait for them to complete execution, and finally output the value of the counter.

However, since two threads modify the counter at the same time, thread safety problems will occur without any synchronization mechanism, resulting in the counter value not necessarily being correct.

Therefore, we can use the atomic class AtomicInteger provided by Java to ensure the atomicity of the counter:

import java.util.concurrent.atomic.AtomicInteger;

public class Counter {

    private AtomicInteger count = new AtomicInteger(0);

    public void increment() {
        count.getAndIncrement(); // atomic read-modify-write, implemented with CAS underneath
    }

    public int getCount() {
        return count.get();
    }
}

This ensures that the counter increment operation is atomic, thus avoiding thread safety issues.


Origin: blog.csdn.net/qq_35971258/article/details/129147511