Java concurrency foundation (1): synchronized lock synchronization

Hello, I am Watching the Mountain.

synchronized is Java's built-in lock: a single keyword that synchronizes access to a shared resource. It can be used in three ways, and the lock object differs in each case:

  1. Instance method: the lock object is the current instance (this)
  2. Static method: the lock object is the Class object of the class
  3. Synchronized block: the lock object is the object named in the parentheses
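The three forms can be sketched as follows (the class and method names are illustrative, not from any particular library):

```java
public class SyncForms {
    private int count = 0;

    // 1. Instance method: the lock object is `this`
    public synchronized void incrInstance() {
        count++;
    }

    // 2. Static method: the lock object is SyncForms.class,
    //    shared by all instances of the class
    public static synchronized void touchStatic() {
    }

    // 3. Synchronized block: the lock object is whatever is in the parentheses
    private final Object lock = new Object();
    public void incrBlock() {
        synchronized (lock) {
            count++;
        }
    }

    public int getCount() {
        return count;
    }
}
```

Note that incrInstance() and incrBlock() lock different objects (this versus lock), so they do not exclude each other.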

How synchronized is implemented

synchronized implements its lock mechanism by entering and exiting a Monitor object. A synchronized block compiles to a pair of monitorenter/monitorexit instructions: monitorenter is inserted at the start of the block, and monitorexit at both the normal exit and the exception-handler exit. The JVM guarantees that every monitorenter is matched by monitorexit instructions. Every object has a Monitor associated with it, and the object is locked if and only if its Monitor is held.

When monitorenter executes, the thread first tries to acquire the object's lock. If the object is unlocked, or the current thread already holds the lock, the lock counter is incremented by 1; correspondingly, each monitorexit decrements the counter by 1, and when the counter reaches 0 the lock is released. If acquisition fails at monitorenter, the current thread blocks until the object's lock is released.
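A minimal illustration of the instruction pair (the bytecode in the comment is a rough sketch of typical javap output, not compiled from this exact class):

```java
public class MonitorDemo {
    private final Object lock = new Object();

    public String guarded() {
        synchronized (lock) {
            return "inside monitor";
        }
        // Compiling this class and running `javap -c MonitorDemo` shows the
        // block bracketed by the monitor instructions, roughly:
        //   monitorenter   // acquire the Monitor of `lock`
        //   ...            // body of the synchronized block
        //   monitorexit    // normal exit path
        //   monitorexit    // exception-handler exit path
    }
}
```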

Before JDK 6, the Monitor was implemented on top of the operating system's mutual-exclusion primitive (typically a Mutex Lock). Blocking a thread requires switching between user mode and kernel mode, so every synchronization operation was an indiscriminately heavyweight lock.

Later JDKs optimized synchronized. To avoid the user/kernel mode switch when a thread blocks, the JVM spins for a while before asking the operating system to block the thread, and three kinds of monitors were implemented: biased locking, lightweight locking, and heavyweight locking. Since JDK 6, the performance of synchronized has improved greatly; it is now comparable to ReentrantLock, although ReentrantLock remains more flexible to use.

Adaptive Spinning

What hurts the performance of synchronized most is its blocking implementation. Suspending and resuming a thread must be done by the operating system, which requires switching from user mode to kernel mode, and these state transitions consume a lot of CPU time.

In most applications, shared data stays locked only for a very short time, and suspending and resuming threads for such a short interval is not worth the cost. Moreover, most processors today have multiple cores: if the later thread simply waits a moment without giving up the CPU, the earlier thread may release the lock, and the later thread can then acquire it immediately and carry on. This is spinning: the thread executes a busy loop, checking on each iteration whether the lock has been released; if it has, the thread acquires it directly, and if not, it loops again.

Spin locks were introduced in JDK 1.4.2 (enabled with the -XX:+UseSpinning parameter) and turned on by default in JDK 6. Spinning cannot replace blocking: while spin-waiting avoids the overhead of a thread switch, it occupies CPU time, so it works well when locks are held briefly and wastes cycles otherwise. JDK 6 therefore introduced adaptive spinning: if spin-waiting on a given lock has just succeeded, and the thread holding the lock is running, the JVM assumes the current spin is also likely to succeed and allows it to last longer. Conversely, if spinning rarely succeeds for a given lock, the JVM may skip spinning for it entirely to avoid wasting CPU resources.
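The idea of spinning can be sketched as a hand-rolled spin lock (illustrative only; the JVM's internal spinning is not written in Java, and Thread.onSpinWait() assumes JDK 9+):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal spin lock: a failed acquire busy-loops on CAS instead of
// asking the operating system to block the thread.
public class SpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // busy loop: each iteration checks whether the lock was released
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait(); // hint to the CPU that we are spinning (JDK 9+)
        }
    }

    public void unlock() {
        locked.set(false);
    }

    public boolean isLocked() {
        return locked.get();
    }
}
```

This is exactly the trade-off described above: while the loop runs, the thread keeps its CPU, so spinning only pays off when the lock is released quickly.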

Lock upgrade

Java object header

The lock used by synchronized lives in the Java object header. The data stored in the Mark Word of the object header changes with the lock flag bits, as follows:

(Figure: Mark Word layout in the Java object header)

Biased Locking

In most cases a lock is not only uncontended but is acquired repeatedly by the same thread. Biased locking was introduced so that this thread can acquire the lock at lower cost.

When a thread first enters the synchronized block and acquires the lock, the biased thread ID is stored in the lock record in the object header and in the stack frame. Afterwards, that thread needs no CAS operation to lock or unlock when entering or leaving the block; it simply tests whether the Mark Word of the object header still holds a bias toward the current thread. Biased locking exists to eliminate the unnecessary lightweight-lock path when there is no multi-thread contention: acquiring and releasing a lightweight lock relies on several CAS atomic instructions, whereas a biased lock needs a single CAS only when installing the thread ID. (Because a biased lock must be revoked as soon as contention appears, the cost of revocation has to stay below the cost of the CAS instructions saved for biasing to pay off.)

Biased lock acquisition

  1. When the lock object is acquired by a thread for the first time, the flag bits in the object header are set to 01 and the bias bit to 1, indicating bias mode.
  2. Test whether the thread ID in the Mark Word points to the current thread; if so, execute the synchronized block, otherwise go to step 3.
  3. Use a CAS operation to record the acquiring thread's ID in the object's Mark Word. If it succeeds, execute the synchronized block; if it fails, another thread already holds the biased lock, and revocation begins (step 4).
  4. When a global safepoint is reached (no bytecode is executing), the thread holding the biased lock is suspended and its state is examined. If that thread has terminated, the object header is reset to the unlocked state (flag bits 01) and may be re-biased to the new thread. If the thread is still alive, the biased lock is revoked and upgraded to a lightweight lock (flag bits 00); the thread that originally held the biased lock now holds the lightweight lock and continues executing its synchronized code, while competing threads spin-wait to acquire it.

Biased lock release

Biased locks are released lazily: a biased lock is revoked only when contention appears. The release process is step 4 above, so it is not repeated here.

Disabling biased locking

Biased locking does not suit every workload. Revocation (revoke) is a relatively heavy operation, so biased locking only shows a clear benefit when most synchronized blocks are genuinely uncontended. In practice it has always been controversial; some even argue that heavy use of the concurrency libraries is a sign that you do not need biased locking at all.

So if you know the locks in your application are usually contended, you can disable biased locking with the JVM parameter -XX:-UseBiasedLocking, and the program will start in the lightweight-lock state by default.

Lightweight Locking

Lightweight locks are not meant to replace heavyweight locks. Their purpose is to reduce the performance cost of the operating-system mutex used by traditional heavyweight locks when there is no multi-thread contention.

Lightweight lock acquisition

  1. If the synchronization object is unlocked (the lock flag bits are 01 and the bias bit is 0), the virtual machine first creates a space called a lock record (Lock Record) in the current thread's stack frame to store a copy of the lock object's current Mark Word, officially called the Displaced Mark Word. The state of the thread stack and object header at this point is shown in the figure below:
    (Figure: Lock Record before the CAS)
  2. Copy the Mark Word in the object header to the Lock Record.
  3. After the copy succeeds, the virtual machine uses a CAS operation to try to update the object's Mark Word to a pointer to the Lock Record, and points the owner pointer in the Lock Record at the object's Mark Word.
  4. If the CAS succeeds, the current thread holds the object's lock, the lock flag bits in the Mark Word are set to 00 (lightweight-lock state), and the synchronized block executes. The state of the thread stack and object header is then as shown below:
    (Figure: Lock Record after the CAS)
  5. If the update fails, check whether the object header's Mark Word points to the current thread's stack frame. If it does, the current thread already holds the lock and the synchronized block executes directly.
  6. Otherwise, multiple threads are competing for the lock. If there is currently only one waiting thread, it tries to acquire the lock by spinning. Once the spin exceeds a certain number of attempts, or a third thread joins the competition, the lightweight lock inflates into a heavyweight lock: the lock flag bits become 10, the Mark Word stores a pointer to the heavyweight lock (the mutex), and all threads except the lock owner block instead of burning CPU; threads that arrive later also enter the blocked state.

Lightweight lock release

A lightweight lock is released when the current thread finishes executing the synchronized block.

  1. Use a CAS operation to try to replace the object's current Mark Word with the Displaced Mark Word copied into the thread's Lock Record.
  2. If it succeeds, the whole synchronization process is complete.
  3. If it fails, another thread has contended for the lock and it has inflated into a heavyweight lock; while releasing the lock, the suspended threads are woken up.

Heavyweight Locking

Lightweight locks suit scenarios where threads execute synchronized blocks almost alternately. If the same lock object is contended at the same time (a first thread holds the lock while a second spins past its limit), the lightweight lock inflates into a heavyweight lock: the lock flag bits in the Mark Word are updated to 10, and the Mark Word points to a mutex (the heavyweight lock).

A heavyweight lock is realized through the monitor inside the object, and the monitor in turn relies on the underlying operating system's Mutex Lock. Switching between threads then requires the operating system to move from user mode to kernel mode, which is very costly, and the transition takes a relatively long time. This is why the heavyweight synchronized lock was inefficient before JDK 6.
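Whatever lock level the JVM ends up using, contended synchronized still guarantees mutual exclusion. A small demonstration (class and method names are illustrative):

```java
public class ContendedCounter {
    private int count = 0;

    // synchronized guarantees mutual exclusion even under contention;
    // under sustained contention the JVM inflates this lock to a
    // heavyweight monitor and blocks the losing threads.
    public synchronized void increment() {
        count++;
    }

    public synchronized int getCount() {
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        ContendedCounter c = new ContendedCounter();
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                c.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // without synchronized, lost updates would make this less than 200000
        System.out.println(c.getCount()); // prints 200000
    }
}
```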

The following figure shows how the Mark Word of the object header changes among the biased, lightweight, and heavyweight lock states:

(Figure: conversion between biased, lightweight, and heavyweight locks)

A fairly complete lock-upgrade flow chart circulates online:

(Figure: lock upgrade process)

Lock Elimination

Lock elimination means that if the just-in-time compiler detects, while the program runs, that some synchronized code cannot possibly contend on shared data, it removes the lock. In other words, the JIT compiler deletes locking operations that are provably unnecessary.

The basis of lock elimination is escape analysis. Simply put, escape analysis determines the dynamic scope of an object. There are three cases:

  • No escape: the object is visible only within the current method on the current thread
  • Method escape: an object defined in a method is referenced outside that method
  • Thread escape: an object defined in a method is referenced by other threads

The just-in-time compiler optimizes differently depending on the case:

  • Stack allocation (Stack Allocations; not supported by HotSpot): create the object directly on the stack.
  • Scalar replacement (Scalar Replacement): break the object apart and create only the member variables the method actually uses, provided the object does not escape the method.
  • Synchronization elimination (Synchronization Elimination): that is, lock elimination, provided the object does not escape the thread.

For lock elimination, escape analysis lets the compiler simply delete the synchronization on lock objects that cannot escape the thread.

Look at an example through the code:

public void elimination1() {
    // lock is visible only to this method and this thread,
    // so the synchronization can be eliminated
    final Object lock = new Object();
    synchronized (lock) {
        System.out.println("lock never escapes this thread, so the lock is eliminated.");
    }
}

public String elimination2() {
    // sb never escapes this method: only the String built from it is
    // returned, so StringBuffer's internal locks can be eliminated
    final StringBuffer sb = new StringBuffer();
    sb.append("Hello, ").append("World!");
    return sb.toString();
}

public StringBuffer notElimination() {
    // sb itself is returned, so it escapes the method and may be touched
    // by other threads; the locks cannot be eliminated here
    final StringBuffer sb = new StringBuffer();
    sb.append("Hello, ").append("World!");
    return sb;
}

In elimination1(), the scope of the lock object is confined to the method and never escapes the thread, and the same holds for sb in elimination2(), so the synchronization in both methods is eliminated. In notElimination(), however, sb is the method's return value and may be modified by callers or by other threads, so looking at this method alone the lock cannot be eliminated; it depends on how the method is called.

Lock Coarsening

In principle, when writing code we should keep synchronized blocks as small as possible and the number of synchronized operations as low as possible, so that when contention occurs, waiting threads acquire the lock quickly. Sometimes, though, a series of consecutive operations repeatedly locks and unlocks the same object, perhaps even inside a loop body; then, even without any thread contention, the frequent mutual-exclusion operations cause needless performance loss. If the virtual machine detects such a string of fragmented operations all locking the same object, it extends (coarsens) the scope of the lock to cover the whole sequence.

For example, in the elimination2() method above, StringBuffer's append is a synchronized method that is called repeatedly, so the lock may be coarsened. The final result would be roughly similar to (only similar; this is not literally what happens):

public String elimination2() {
    final StringBuffer sb = new StringBuffer();
    // the per-append locks are merged into one coarsened lock
    synchronized (sb) {
        sb.append("Hello, ").append("World!");
        return sb.toString();
    }
}

or

public synchronized String elimination3() {
    final StringBuffer sb = new StringBuffer();
    sb.append("Hello, ").append("World!");
    return sb.toString();
}

Summary

  1. Two things make synchronization expensive:
    1. Locking and unlocking require extra operations
    2. Switching between user mode and kernel mode is costly
  2. synchronized gained many optimizations in JDK 6: tiered locks (biased, lightweight, heavyweight), lock elimination, lock coarsening, and so on.
  3. synchronized reuses the Mark Word flag bits of the object header to implement the different lock levels.

References

  • "Understanding the Java Virtual Machine: In Depth"
  • "The Art of Java Concurrent Programming"

Hello, I am Watching the Mountain (public account: Watching the Mountain Hut), a ten-year programmer and contributor to the Apache Storm, WxJava, and Cynomys open-source projects. I swim in the world of code and savor life in drama.

Personal homepage: https://www.howardliu.cn
Personal blog post: Java concurrency foundation (1): synchronized lock synchronization
CSDN homepage: http://blog.csdn.net/liuxinghao
CSDN blog post: Java concurrency foundation (1): synchronized Lock synchronization

