The underlying implementation of synchronized that you don't know

There are many articles on the Internet about the underlying implementation of synchronized. In many of them, however, the author either never actually read the source code and merely summarized or copied other articles (inevitably propagating errors), or glossed over many points in a single stroke without explaining why things are implemented that way. Readers like me were left unsatisfied.

This series of articles will conduct a comprehensive analysis of HotSpot's synchronized lock implementation, including the principles and source code of locking, unlocking, and lock upgrading for biased locks, lightweight locks, and heavyweight locks. I hope it gives some help to readers who are on the road to researching synchronized.

It took me about two weeks to work through the code (a bit embarrassing to have spent so long, mainly because I was unfamiliar with C++, the JVM's underlying mechanisms, JVM debugging, and assembly). I read essentially all of the code involved in synchronization, and also added logging to the JVM to verify my own conjectures. Overall I now have a fairly comprehensive and clear understanding of synchronized, but my level is limited and some details and omissions are unavoidable; corrections are welcome.

This article will give a general introduction to the synchronized mechanism, including the object header used to carry the lock state, several types of locks, the locking and unlocking processes of various types of locks, and when lock upgrades will occur. It should be noted that this article aims to introduce the background and concepts. When describing some processes, only the main case is mentioned. The implementation details and different branches of runtime are analyzed in detail in the following articles.

1. Introduction to synchronized

Java provides two basic constructs for synchronization: synchronized methods and synchronized blocks. Let's look at a demo:

    public class SyncTest {
        public void syncBlock(){
            synchronized (this){
                System.out.println("hello block");
            }
        }

        public synchronized void syncMethod(){
            System.out.println("hello method");
        }
    }

When SyncTest.java is compiled into a class file, the bytecode generated for the synchronized block and for the synchronized method differs slightly. We can use the javap -v command to view the JVM bytecode of the class file. Part of the output is as follows:

 
    {
      public void syncBlock();
        descriptor: ()V
        flags: ACC_PUBLIC
        Code:
          stack=2, locals=3, args_size=1
             0: aload_0
             1: dup
             2: astore_1
             3: monitorenter                      // monitorenter: enter the synchronized block
             4: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
             7: ldc           #3                  // String hello block
             9: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
            12: aload_1
            13: monitorexit                       // monitorexit: normal exit from the synchronized block
            14: goto          22
            17: astore_2
            18: aload_1
            19: monitorexit                       // monitorexit: exit from the synchronized block on an exception
            20: aload_2
            21: athrow
            22: return
          Exception table:
             from    to  target type
                 4    14    17   any
                17    20    17   any

      public synchronized void syncMethod();
        descriptor: ()V
        flags: ACC_PUBLIC, ACC_SYNCHRONIZED      // the ACC_SYNCHRONIZED flag is added
        Code:
          stack=2, locals=1, args_size=1
             0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
             3: ldc           #5                  // String hello method
             5: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
             8: return
    }

As the comments above show, for a synchronized block javac generates matching monitorenter and monitorexit instructions at compile time, corresponding to entry into and exit from the block. There are two monitorexit instructions because javac wraps the block in an implicit try-finally to guarantee the lock is released even when an exception is thrown: the second monitorexit releases the lock on the exception path. For a synchronized method, javac instead sets the ACC_SYNCHRONIZED flag on the method. When the JVM invokes a method and finds it marked ACC_SYNCHRONIZED, it first tries to acquire the lock.
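The implicit try-finally that javac inserts can be mirrored in source form. The sketch below is my own illustration (not compiler output), using ReentrantLock to make the monitorenter/monitorexit pairing explicit:

```java
import java.util.concurrent.locks.ReentrantLock;

public class MonitorPairing {
    private static final ReentrantLock LOCK = new ReentrantLock();

    // Roughly what "synchronized (this) { ... }" compiles to at the bytecode
    // level: acquire, body, and a finally that releases on every exit path.
    static String syncBlockEquivalent() {
        LOCK.lock();                 // ~ monitorenter
        try {
            return "hello block";    // body of the synchronized block
        } finally {
            LOCK.unlock();           // ~ monitorexit (normal and exception paths)
        }
    }

    public static void main(String[] args) {
        System.out.println(syncBlockEquivalent());
    }
}
```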

Inside the JVM, the implementations of these two synchronized semantics are roughly the same; one of them will be selected for detailed analysis later.

2. Several forms of locks

The traditional lock (the heavyweight lock discussed below) relies on the operating system's synchronization primitives. On Linux it is implemented with the pthread mutex, whose underlying implementation in turn relies on futex; there are several good articles on futex worth reading. These synchronization calls involve switches between user mode and kernel mode and process context switches, which are expensive. For code marked synchronized that sees no multithreaded contention at runtime, or where two threads run nearly alternately, the traditional lock mechanism is clearly inefficient.

Prior to JDK 1.6, synchronized had only a traditional lock mechanism, which left developers with the impression that the synchronized keyword had poor performance compared to other synchronization mechanisms.

In JDK 1.6, two new lock mechanisms were introduced: biased locks and lightweight locks. They were introduced to solve the performance overhead caused by the use of traditional locking mechanisms in scenarios where there is no multithreaded competition or basically no competition.

Before looking at the implementation of these lock mechanisms, let's first understand the object header, which is the basis for implementing multiple lock mechanisms.

1. Object header

Because any Java object can be used as a lock, there must be a mapping from an object to its lock information (such as which thread currently holds the lock and which threads are waiting). An intuitive approach is a global map storing this mapping, but it has problems: the map itself must be made thread safe, so unrelated synchronized operations would interfere with each other and performance would suffer; in addition, when many objects are used for synchronization, the map could consume a lot of memory.

So a better approach is to store the lock information in the object header itself. The header already holds data such as the hash code and GC metadata, so if the lock information can share the header with that data, so much the better.

In the JVM, an object in memory has an object header in addition to its own field data. For an ordinary object, the header contains two parts: the mark word and the type pointer. Arrays additionally record the array length.

The type pointer points to the class object to which the object belongs. The mark word stores the object's hash code, GC generational age, lock state, and other information. The mark word is 32 bits long on a 32-bit system and 64 bits on a 64-bit system. To pack more data into this limited space, its storage format is not fixed; the format of each state on a 32-bit system is as follows:

You can see that the lock information lives in the object's mark word. When the object is in the biasable state, the mark word stores the biased thread id; in the lightweight locked state, it stores a pointer to the Lock Record in the thread's stack; in the inflated (heavyweight locked) state, it stores a pointer to the monitor object in the heap.
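As a reading aid, the low-order bits of the mark word encode the state. The sketch below is my own illustration of the standard HotSpot encoding (two lock bits plus one biased bit), not JVM source code:

```java
public class MarkWordState {
    // Standard HotSpot encoding of the mark word's low bits:
    //   lock bits 01, biased bit 0 -> unlocked
    //   lock bits 01, biased bit 1 -> biasable / biased
    //   lock bits 00               -> lightweight locked (points to Lock Record)
    //   lock bits 10               -> inflated / heavyweight (points to monitor)
    //   lock bits 11               -> marked for GC
    static String decode(int markWord) {
        int lockBits = markWord & 0b11;
        switch (lockBits) {
            case 0b01: return (markWord & 0b100) != 0 ? "biased" : "unlocked";
            case 0b00: return "lightweight";
            case 0b10: return "heavyweight";
            default:   return "gc-marked";
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(0b001)); // unlocked
        System.out.println(decode(0b101)); // biased
    }
}
```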

2. Heavyweight lock

Heavyweight locks are locks in the traditional sense: they use the operating system's underlying synchronization mechanisms to implement thread synchronization in Java.

In the heavyweight lock state, the mark word of the object is a pointer to a monitor object in the heap.

A monitor object includes several key fields: cxq (called ContentionList in the figure below), EntryList, WaitSet, and owner.

Among them, cxq, EntryList, and WaitSet are linked lists of ObjectWaiter nodes, and owner points to the thread holding the lock.

When a thread tries to acquire the lock and the lock is already held, the thread is wrapped in an ObjectWaiter object, inserted at the tail of the cxq queue, and then suspended. When the thread holding the lock releases it, it moves all the elements of cxq into the EntryList and wakes up the thread at the head of the EntryList.

If a thread calls the Object#wait method in the synchronization block, the ObjectWaiter corresponding to the thread will be removed from the EntryList and added to the WaitSet, and then the lock will be released. When the waiting thread is notified, the corresponding ObjectWaiter will be moved from WaitSet to EntryList.
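The queue movements just described can be sketched with plain lists. This is a simplification for intuition only (thread parking, spinning, and the real wakeup policies are omitted), and the method names here are mine, not HotSpot's:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class MonitorQueues {
    final Deque<String> cxq = new ArrayDeque<>();       // newly contending threads
    final Deque<String> entryList = new ArrayDeque<>(); // candidates to be woken
    final Deque<String> waitSet = new ArrayDeque<>();   // threads parked in wait()

    void contend(String t) { cxq.addLast(t); }          // blocked acquire: enqueue at cxq tail
    void await(String t)   { waitSet.addLast(t); }      // Object.wait(): move to WaitSet
    void notifyOne() {                                  // Object.notify(): WaitSet -> EntryList
        if (!waitSet.isEmpty()) entryList.addLast(waitSet.removeFirst());
    }
    String release() {                                  // unlock: drain cxq, wake EntryList head
        while (!cxq.isEmpty()) entryList.addLast(cxq.removeFirst());
        return entryList.pollFirst();
    }
}
```

For example, if t1 then t2 block on the lock, the next two releases wake t1 and then t2; a thread that called wait() is only woken after a notify moves it back to the EntryList.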

The above is only a brief description of the heavyweight lock process, which involves many details. For example: where does the ObjectMonitor object come from? When releasing the lock, should the elements of cxq be moved to the head or the tail of the EntryList? On notify, should the ObjectWaiter be moved to the head or the tail of the EntryList?

The specific details will be analyzed in the article on heavyweight locks.

3. Lightweight lock

The JVM developers found that in many cases, while a Java program is running, there is no contention for the code in the synchronized block: different threads simply execute it alternately. In such cases a heavyweight lock is unnecessary, so the JVM introduced the concept of the lightweight lock.

Before a thread executes the synchronized block, the JVM first creates a Lock Record in the current thread's stack frame. It contains a slot for a copy of the object header's mark word (officially called the Displaced Mark Word) and a pointer to the lock object. The right part of the figure below shows a Lock Record.

Locking process:

1. Create a Lock Record in the thread stack and point its obj field (the Object reference in the figure above) at the lock object.

2. Copy the object header's mark word into the Lock Record's Displaced Mark Word, then use a CAS instruction to replace the object header's mark word with the address of the Lock Record. If the object was in the unlocked state, the CAS succeeds and the thread has obtained the lightweight lock. If it fails, go to step 3.

3. If the CAS failed because the current thread already holds the lock, this is a lock reentry: set the Displaced Mark Word of the new Lock Record to null, which effectively acts as a reentry counter, and finish.

4. Reaching this step means contention has occurred, and the lock needs to be inflated into a heavyweight lock.

Unlocking process:

1. Traverse the thread stack and find all Lock Records whose obj field is equal to the current lock object.

2. If a Lock Record's Displaced Mark Word is null, this is a reentry: set its obj field to null and continue.

3. If a Lock Record's Displaced Mark Word is not null, use a CAS instruction to restore the object header's mark word to the Displaced Mark Word. On success, continue; on failure, the lock inflates into a heavyweight lock.
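The CAS steps above can be sketched with an AtomicReference standing in for the object's mark word. This is my own single-object simplification (real stack walking, ownership checking, and inflation are omitted; all names here are mine):

```java
import java.util.concurrent.atomic.AtomicReference;

public class LightweightLockSketch {
    static final Object UNLOCKED = "unlocked";

    static class LockRecord { Object displacedMarkWord; }

    final AtomicReference<Object> markWord = new AtomicReference<>(UNLOCKED);

    // Locking: copy the mark word into the record, then CAS the mark word
    // to point at the Lock Record. Returns false when inflation is needed.
    boolean lock(LockRecord lr) {
        Object mw = markWord.get();
        if (mw instanceof LockRecord) {      // already lightweight locked;
            lr.displacedMarkWord = null;     // reentry: null Displaced Mark Word
            return true;                     // (real code verifies it is OUR record)
        }
        lr.displacedMarkWord = mw;
        return markWord.compareAndSet(UNLOCKED, lr);
    }

    // Unlocking: a null Displaced Mark Word means reentry; otherwise CAS the
    // original mark word back. A failed CAS means the lock was inflated.
    boolean unlock(LockRecord lr) {
        if (lr.displacedMarkWord == null) return true;
        return markWord.compareAndSet(lr, lr.displacedMarkWord);
    }
}
```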

4. Biased lock

Java is a language with built-in multithreading support. To guarantee that code runs correctly under concurrency, which is what we call thread safety, many libraries and utility classes add synchronization such as synchronized. But when an application actually runs, it may well be that only a single thread ever calls those synchronized methods. For example, the following demo:

 
    import java.util.ArrayList;
    import java.util.List;

    public class SyncDemo1 {

        public static void main(String[] args) {
            SyncDemo1 syncDemo1 = new SyncDemo1();
            for (int i = 0; i < 100; i++) {
                syncDemo1.addString("test:" + i);
            }
        }

        private List<String> list = new ArrayList<>();

        public synchronized void addString(String s) {
            list.add(s);
        }

    }

In this demo, the addString method is marked synchronized so that operations on the list are thread safe, but in actual use only one thread ever calls it. With a lightweight lock, every call to addString performs one CAS each for locking and unlocking; with a heavyweight lock, there may also be one or more CAS operations for locking (the counts "one" and "several" here apply only to this demo, not to all scenarios).

In JDK 1.6, to improve performance when an object is used as a lock by only one thread over a long period, the biased lock was introduced. Only the first acquisition of the lock performs a CAS operation; after that, when the biased thread acquires the lock again it executes only a few simple instructions rather than the relatively expensive CAS. Let's look at how the biased lock achieves this.

Object creation

When biased locking is enabled in the JVM (on by default since 1.6) and a newly created object's class has not had biased mode turned off (when a class's biased mode is turned off is discussed later; by default the biased mode of all classes starts on), the new object's mark word is in the biasable state. At this point the thread id in the mark word (see the biased-state mark word format above) is 0, meaning it is not yet biased toward any thread; this is also called anonymously biased.

Locking process

Case 1: When a thread locks the object for the first time and finds it in the anonymously biased state, it uses a CAS instruction to change the thread id in the mark word from 0 to its own thread id. If that succeeds, the thread has obtained the biased lock and continues executing the synchronized block. Otherwise the biased lock is revoked and upgraded to a lightweight lock.

Case 2: When the biased thread enters the synchronized block again and finds the lock object already biased toward itself, then after some additional checks (detailed in a later article) it pushes a Lock Record with a null Displaced Mark Word onto its own stack and continues executing the synchronized block. Since only the thread-private stack is touched, no CAS instruction is needed. As you can see, in biased mode the biased thread re-acquiring the lock performs only a few simple operations, so the overhead of the synchronized keyword is basically negligible in this case.

Case 3: When another thread enters the synchronized block and finds the lock already biased, it enters the biased-lock revocation logic. Broadly, at a safepoint the JVM checks whether the biased thread is still alive. If it is alive and still inside the synchronized block, the lock is upgraded to a lightweight lock: the original biased thread continues to own the lock, and the current thread enters the lock-upgrade logic. If the biased thread is no longer alive or is not inside the synchronized block, the object header's mark word is reset to unlocked and the lock is then upgraded to a lightweight lock.
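The three cases can be condensed into one decision function. The sketch below is my own simplification in which the mark word is reduced to a biased thread id (0 = anonymously biased) and revocation/safepoint handling is reduced to a return value:

```java
import java.util.concurrent.atomic.AtomicLong;

public class BiasedLockSketch {
    static final long ANONYMOUS = 0;            // not yet biased to any thread
    final AtomicLong biasOwner = new AtomicLong(ANONYMOUS);

    // What happens when threadId tries to acquire the biased lock.
    String acquire(long threadId) {
        long owner = biasOwner.get();
        if (owner == threadId) {
            return "fast path";                 // Case 2: already biased to us, no CAS
        }
        if (owner == ANONYMOUS
                && biasOwner.compareAndSet(ANONYMOUS, threadId)) {
            return "biased via CAS";            // Case 1: first acquisition
        }
        return "revoke and upgrade";            // Case 3: biased to another thread
    }
}
```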

As you can see, the biased lock is upgraded as soon as the lock is already biased and any other thread tries to acquire it: the biased lock becomes a lightweight lock. Of course, this statement is not absolute, because there is also the batch re-biasing mechanism.

Unlocking process

When another thread tries to acquire the lock, it determines whether the biased thread is still executing code in the synchronized block by traversing the biased thread's Lock Records. Therefore, unlocking a biased lock is very simple: just set the obj field of the most recent Lock Record on the stack to null. Note that the unlock step of the biased lock does not modify the thread id in the object header.

The following figure shows the conversion process of the lock state: 

In addition, biased locking is not enabled immediately by default; there is usually a delay of a few seconds after the program starts. The delay can be turned off with -XX:BiasedLockingStartupDelay=0.

Batch re-biasing and revocation

From the locking and unlocking process above, when only one thread repeatedly enters the synchronized block the overhead of the biased lock is basically negligible. But when another thread tries to acquire the lock, the JVM must wait until a safepoint to revoke the biased lock into the unlocked state or upgrade it to a lightweight/heavyweight lock. The term safepoint is often mentioned in GC contexts; roughly, it denotes a state in which all threads are suspended (you can read other articles for the precise meaning). In short, revoking a biased lock has a cost. If the runtime scenario itself involves multithreaded contention, biased locks not only fail to improve performance but actually degrade it. Therefore, a batch re-biasing/revocation mechanism was added to the JVM.

 

There are two such situations (see Section 4 of the official paper):

https://www.oracle.com/technetwork/java/biasedlocking-oopsla2006-wp-149958.pdf

1. One thread creates many objects and performs the initial synchronized operations on them, and then another thread uses those objects as locks for subsequent operations. This causes a large number of biased-lock revocations.

2. In scenarios with obvious multithreaded contention, such as producer/consumer queues, using biased locks is inappropriate.

The bulk re-bias mechanism addresses the first scenario; bulk revocation addresses the second.

The approach works at class granularity: the JVM maintains a biased-lock revocation counter for each class. Every time an object of the class has its bias revoked, the counter is incremented. When it reaches the re-bias threshold (default 20), the JVM decides that biasing for this class is problematic and performs a batch re-bias. Each class object has an epoch field, and the mark word of every object in the biased state also carries an epoch, initialized to the class's epoch at object creation time. On each batch re-bias, the class epoch is incremented, and the stacks of all threads in the JVM are traversed to find all currently locked biased locks of the class, whose epoch fields are updated to the new value. On the next lock acquisition, if an object's epoch does not equal its class's epoch, then even though the object is biased toward another thread, no revocation is performed; instead the thread id in its mark word is simply replaced with the current thread's id via a CAS operation.
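The epoch comparison can be sketched as follows. This is my own simplification: in HotSpot the epoch lives in both the mark word and the class metadata, and a bulk re-bias also rewrites the epochs of currently locked objects, which is omitted here:

```java
public class EpochSketch {
    static int classEpoch = 0;      // per-class epoch, bumped on each bulk re-bias

    int objectEpoch = classEpoch;   // copied into the mark word at object creation
    long biasedThreadId = 42;       // pretend the object is biased to thread 42

    // A stale epoch means the bias is no longer valid: the lock can be
    // re-biased directly via CAS instead of going through a costly revocation.
    boolean canRebiasWithoutRevocation() {
        return objectEpoch != classEpoch;
    }

    static void bulkRebias() { classEpoch++; }  // invalidates all stale biases at once
}
```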

If the class's counter keeps growing after the re-bias threshold has been reached and hits the batch revocation threshold (default 40), the JVM concludes that the class's usage pattern genuinely involves multithreaded contention and marks the class as non-biasable; subsequent locking on objects of the class goes directly to the lightweight-lock logic.
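The two thresholds correspond to the JVM flags BiasedLockingBulkRebiasThreshold (default 20) and BiasedLockingBulkRevokeThreshold (default 40). The heuristic can be sketched as below; note this is my own simplification, and HotSpot's time-based decay of the counter is omitted:

```java
public class RevocationHeuristic {
    static final int BULK_REBIAS_THRESHOLD = 20;  // -XX:BiasedLockingBulkRebiasThreshold
    static final int BULK_REVOKE_THRESHOLD = 40;  // -XX:BiasedLockingBulkRevokeThreshold

    int revocationCount = 0;  // per-class counter of individual bias revocations

    // Called each time an object of this class has its bias revoked.
    String onRevocation() {
        revocationCount++;
        if (revocationCount == BULK_REVOKE_THRESHOLD) {
            return "bulk revoke: disable biasing for this class";
        }
        if (revocationCount == BULK_REBIAS_THRESHOLD) {
            return "bulk rebias: bump class epoch";
        }
        return "single revocation";
    }
}
```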

3. Summary

Java's synchronized has three forms: biased locks, lightweight locks, and heavyweight locks, corresponding to the three situations in which the lock is held by only one thread, different threads hold the lock alternately, and multiple threads contend for the lock. When the conditions no longer hold, the lock is upgraded in the order biased lock -> lightweight lock -> heavyweight lock. The JVM can also downgrade locks, but the conditions are very strict and beyond the scope of our discussion. This article is mainly a basic introduction to Java's synchronized; more detailed analysis will follow.


 

 


Origin blog.csdn.net/bjmsb/article/details/108682356