Concurrent programming (2): Java object memory layout and an introduction to synchronized biased locks, lightweight locks, and heavyweight locks

1. Java object memory layout

1. Object memory layout

At the JVM level, a Java object is laid out in memory as shown below (the right half shows the contiguous address space of an array):
Insert image description here

There are three parts in total:

1. Object header: stores the object's metadata, such as the hash code, GC generational age, lock status flags, and the lock held by a thread.
2. Instance data: stores the actual data of the object, that is, the fields of various types defined by the programmer.
3. Padding: so that the JVM can access objects efficiently, extra space is appended after the instance data so that the object size is a multiple of the alignment unit used by the virtual machine's memory management (usually 8 bytes).

The exact sizes of the object header and the instance data depend on the JVM implementation, the type of the object, the JVM runtime parameters, and so on, so they are generally not fixed values. Note that array objects are laid out differently from ordinary objects: they additionally store the array length.

1.1. Object header

(1) Mark word

The object header consists of two parts:

Insert image description here
Insert image description here

1. Object mark (mark word): stores the object's metadata, such as the hash code, GC generational age, lock status flags, and the lock held by a thread.
2. Class meta information (klass pointer): stores a pointer to the start address of the object's class metadata (klass).

On a 64-bit JVM, the mark word occupies 8 bytes and the type pointer occupies 8 bytes, 16 bytes in total. In other words, the header of a newly created object alone occupies 16 bytes (though not always, because the type pointer may be compressed).

Insert image description here

(2) Klass pointer (type pointer)

As the figure below shows, the type pointer points into the method area. For example, given a Customer class, when a Customer instance is created with new, the type pointer of that instance points to the Customer class meta information in the method area.

Insert image description here
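
At the Java level, the closest visible counterpart of the type pointer is getClass(). The sketch below assumes a hypothetical empty Customer class, as in the example above, and only illustrates that every instance resolves to the same class metadata; it does not expose the raw klass pointer itself.

```java
final class KlassPointerSketch {

    // Hypothetical class, mirroring the Customer example in the text.
    static class Customer {
    }

    // Returns true when both objects resolve to the same Class object,
    // i.e. their headers refer to the same class metadata.
    static boolean sharesClassMetadata(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        Customer first = new Customer();
        Customer second = new Customer();
        System.out.println(sharesClassMetadata(first, second));
        System.out.println(sharesClassMetadata(first, new Object()));
    }
}
```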

1.2. Instance data

Stores the field data of the class, including the fields inherited from the parent class; for an array instance, the length of the array must also be stored. This part of the memory is aligned to 4 bytes.

An example is as follows:

public class MarkwordDemo {

    public static void main(String[] args) {
        new Apple();
    }
}

class Apple {
}

Creating an Apple instance with no fields already occupies 16 bytes in memory (ignoring type pointer compression). What if the Apple class has fields? For example:

public class MarkwordDemo {

    public static void main(String[] args) {
        new Apple();
    }
}

class Apple {
    int size = 100;
    char a = 'a';
}

An int occupies 4 bytes and a char occupies 2 bytes, so a new Apple instance needs 16 + 6 = 22 bytes. It ends up occupying 24 bytes, however: to simplify memory management, the JVM pads every object to an alignment boundary, generally a multiple of 8, so the size becomes 24 bytes.

1.3. Padding

The virtual machine requires the start address of every object to be a multiple of 8 bytes. Padding does not have to be present; it exists only for byte alignment and rounds the object size up to a multiple of 8 bytes.
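
The padding arithmetic can be sketched as follows. This is a simplified model, assuming 8-byte alignment and ignoring how the JVM orders or aligns individual fields; the class and method names are made up for illustration.

```java
final class ObjectSizeSketch {

    // Assumed object alignment in bytes (the usual HotSpot default).
    static final int ALIGNMENT = 8;

    // Rounds a size up to the next multiple of ALIGNMENT.
    static long alignUp(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Header plus field bytes, rounded up to the alignment boundary.
    static long paddedSize(long headerBytes, long fieldBytes) {
        return alignUp(headerBytes + fieldBytes);
    }

    public static void main(String[] args) {
        long appleFields = Integer.BYTES + Character.BYTES; // int (4) + char (2)
        System.out.println("empty object, 16-byte header: " + paddedSize(16, 0));
        System.out.println("Apple, 16-byte header:        " + paddedSize(16, appleFields));
        System.out.println("Apple, 12-byte header:        " + paddedSize(12, appleFields));
    }
}
```

The 12-byte header in the last line corresponds to a compressed type pointer (8-byte mark word plus 4-byte klass pointer).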

2. How the synchronized lock works under the hood

The markOop.hpp source file contains the following comment, as shown below:

Insert image description here

After simplifying the above comments, we get the 64-bit virtual machine object header diagram, as follows:

Insert image description here

Now that we know the basic internal structure of an object, let's look at how a synchronized lock changes the object header.

1. Viewing the object memory layout in Java

You can use the Java utility library JOL to view how new Object() is laid out in memory, as shown below:

1. First add the dependency

Version 0.9 of the dependency is recommended. Other versions may produce different output, so be careful.

		<dependency>
			<groupId>org.openjdk.jol</groupId>
			<artifactId>jol-core</artifactId>
			<version>0.9</version>
		</dependency>

2. Demo code

import org.openjdk.jol.info.ClassLayout;

class MyObject {
}

public class ObjectMarkWordDemo {

    public static void main(String[] args) {
        // Print the memory layout of a MyObject instance
        System.out.println(ClassLayout.parseInstance(new MyObject()).toPrintable());
    }
}

Create a MyObject instance with new and inspect its memory layout with the ClassLayout utility class. The output is as follows:

Insert image description here

Insert image description here

Add two fields of different types to the MyObject class as follows:

class MyObject {
    int i = 25;
    boolean flag = false;
}

public class ObjectMarkWordDemo {

    public static void main(String[] args) {
        System.out.println(ClassLayout.parseInstance(new MyObject()).toPrintable());
    }
}

The memory layout printed is then as shown below:

Insert image description here

As you can see above, the type pointer should normally occupy 8 bytes, but here it occupies only 4. We can use the following command to see which flags the JVM was started with:

java -XX:+PrintCommandLineFlags -version

Insert image description here

The + sign in the output above shows that the JVM enables type pointer compression by default, which saves memory. Now change this setting as shown below:

-XX:-UseCompressedClassPointers

Insert image description here

With compression turned off, rerun the test. The output is as follows:

Insert image description here
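
The same flag can also be read from inside a running program. The following is a minimal sketch that assumes a HotSpot-based JVM exposing com.sun.management.HotSpotDiagnosticMXBean; the helper name flagValue is made up for illustration, and on JVMs without this bean or flag it simply reports "unavailable".

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class VmFlagSketch {

    // Returns the current value of a HotSpot flag, or "unavailable"
    // if this JVM does not expose the bean or does not know the flag.
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedClassPointers = "
                + flagValue("UseCompressedClassPointers"));
    }
}
```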

You now know how to view the Java memory layout. Next, let's look more closely at synchronized lock optimization and lock upgrading.

2. Research on synchronized locks

Let's look at the memory structure of the mark word in the object header, as shown below:

Insert image description here

Background of synchronized lock optimization:

Using locks gives thread safety but degrades performance. Running without locks improves performance but sacrifices thread safety. So how can we strike a balance?

For this reason, synchronized has used lock upgrading since JDK 1.6 to improve performance while keeping programs safe.

Insert image description here

Before JDK 1.6, synchronized used only the operating system's heavyweight lock. Every lock operation required a switch from user mode to kernel mode, which involved copying a lot of data, so performance was very low.

Insert image description here

Java threads are mapped to native operating system threads. Blocking or waking a thread requires the operating system to intervene, which means switching between user mode and kernel mode. This switch consumes significant system resources, because user mode and kernel mode each have their own memory space, registers, and so on. Switching from user mode to kernel mode requires passing many variables and parameters to the kernel, and the kernel must also save the register values and variables of the user-mode context so that it can switch back to user mode and continue once the kernel call completes.

In early versions of Java, synchronized was a heavyweight lock and was inefficient, because the monitor relies on the mutex lock of the underlying operating system. Suspending and resuming a thread must be handed to the kernel, so blocking or waking a Java thread requires the operating system to switch the CPU state. This state switch costs CPU time; if the synchronized block does very little work, the switch costs more than the work itself.

For example, if we add the synchronized keyword to a code block, the code is as follows:

class MyObject {
    int a = 25;
    char b = 'b';
}

public class ObjectMarkWordDemo {

    public static void main(String[] args) {
        MyObject myObject = new MyObject();

        new Thread(() -> {
            synchronized (myObject) {
                System.out.println(">>>>>>");
            }
        }).start();
    }
}

Adding the synchronized keyword at the Java level implicitly attaches an invisible lock, the Monitor lock, at the JVM level, as shown below:

Insert image description here

So how is the Monitor related to Java objects and threads?

  1. If a Java object is locked by a thread, the lock word in the object's mark word points to the start address of the Monitor.
  2. The Monitor's Owner field stores the ID of the thread that owns the associated object's lock.
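
The ownership relation can be observed from Java code with the standard Thread.holdsLock method. The sketch below only shows the visible effect (which thread owns the monitor of an object); the class and method names are made up, and it does not inspect the Monitor structure itself.

```java
import java.util.Arrays;

final class MonitorOwnerSketch {

    // Records whether the current thread owns the monitor of `lock`
    // before, inside, and after a synchronized block.
    static boolean[] observe(Object lock) {
        boolean before = Thread.holdsLock(lock);
        boolean inside;
        synchronized (lock) {
            inside = Thread.holdsLock(lock);
        }
        boolean after = Thread.holdsLock(lock);
        return new boolean[] {before, inside, after};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observe(new Object())));
    }
}
```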

3. Lock optimization process

(1) No lock

Consider the following code, which uses no locking:

public class ObjectMarkWordDemo {

    public static void main(String[] args) throws InterruptedException {
        Object abc = new Object();
        System.out.println(ClassLayout.parseInstance(abc).toPrintable());
    }
}

Without a lock, the mark word of an object in memory looks as shown below:

Insert image description here

The information printed out through Java is as follows:

Insert image description here

In the output shown above, the blue box contains 001, which indicates the lock-free state. In this state, the 31 bits in the red box hold the hashCode (one extra bit is zero padding and can be ignored). However, no hashCode is displayed yet, because it is computed lazily: it is only written into the mark word after the hashCode() method has been called. For example, the following code:

public class ObjectMarkWordDemo {

    public static void main(String[] args) {
        MyObject myObject = new MyObject();
        System.out.println("Decimal: myObject.hashCode() = " + myObject.hashCode());
        System.out.println("Binary: " + Integer.toBinaryString(myObject.hashCode()));
        System.out.println("Hexadecimal: " + Integer.toHexString(myObject.hashCode()));
        System.out.println(ClassLayout.parseInstance(myObject).toPrintable());
    }
}

The output is as follows:

Decimal: myObject.hashCode() = 1435804085
Binary: 1010101100101001010000110110101
Hexadecimal: 5594a1b5

Insert image description here

To make this easier to observe, print out every bit of the hashCode. The 8 bytes of the mark word are printed lowest byte first, so copy them from right to left to form one long bit string: the first 25 bits are unused, the next 31 bits are the hashCode (blue frame), and the 3 bits in the red frame are lock-related, where 001 indicates the lock-free state. First byte copied: 1010101 (the leading 0 is padding and is not copied; it is one of the 25 unused bits). Second: 10010100. Third: 10100001. Fourth: 10110101. The concatenated string is exactly the binary representation printed above (1010101100101001010000110110101). These 31 bits are the stored hashCode.

As shown above, the lock-free state is represented by 001.
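
The byte-by-byte reading described above can be reproduced without JOL. The sketch below is a model rather than a dump of a real header: it assumes the 64-bit lock-free layout from the markOop.hpp comment (25 unused bits, 31 hash bits, 1 unused bit, 4 age bits, 1 biased bit, 2 lock bits) and little-endian byte order, and it uses the hashCode value from the output above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class MarkWordHashSketch {

    // Bits below the hash: lock (2) + biased (1) + age (4) + unused (1).
    static final int HASH_SHIFT = 8;
    static final long HASH_MASK = 0x7FFF_FFFFL; // 31 bits

    // Builds a lock-free mark word (low bits 001, age 0) holding the hash.
    static long lockFreeMarkWord(int hash) {
        return ((hash & HASH_MASK) << HASH_SHIFT) | 0b001L;
    }

    // Extracts the 31-bit hash from a mark word.
    static int hashOf(long markWord) {
        return (int) ((markWord >>> HASH_SHIFT) & HASH_MASK);
    }

    // Renders the mark word byte by byte in little-endian order.
    static String littleEndianHex(long markWord) {
        byte[] bytes = ByteBuffer.allocate(Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(markWord)
                .array();
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long markWord = lockFreeMarkWord(0x5594a1b5);
        System.out.println(littleEndianHex(markWord));
        System.out.println(Integer.toBinaryString(hashOf(markWord)));
    }
}
```

Reading the printed bytes from right to left gives back the hexadecimal hashCode, which is the same procedure as copying the binary groups by hand.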

(2) Biased lock

When a synchronized block is accessed repeatedly by the same thread, and only that thread is involved, the thread acquires the lock automatically on subsequent accesses, as shown in the following figure (red box):

Insert image description here

Once a thread acquires the biased lock, a pointer to that thread is stored in the first 54 bits of the object's mark word (which hold the hashCode in the lock-free state), and the biased-lock bit is set to 1.

Why are 2 bits needed for the lock flag?
A synchronized lock starts in the lock-free state. When the first thread competes for the lock, it changes the lock flag in the object header to biased and records its thread ID in the object header, indicating that this thread has obtained the biased lock. When a thread later competes for the lock and finds that the thread ID recorded in the object header matches its own, it obtains the lock directly. Otherwise, the biased lock must be revoked and the lock switches to the lightweight state. When multiple threads contend for the lock, it enters the heavyweight state. To implement this upgrade process, the object header reserves two lock flag bits, which support the transition from the lock-free state to the biased state, then to the lightweight state, and finally to the heavyweight state.
The two flag bits take the values 01 (lock-free or biased, told apart by the separate biased-lock bit), 00 (lightweight lock), 10 (heavyweight lock), and 11 (GC mark).
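
The flag bits can be decoded with a few bit masks. This is a minimal sketch assuming the low three bits of the mark word follow the markOop.hpp layout (one biased-lock bit followed by two lock bits); the enum and method names are made up for illustration.

```java
final class LockStateSketch {

    enum State { LOCK_FREE, BIASED, LIGHTWEIGHT, HEAVYWEIGHT, GC_MARKED }

    // Decodes the lock state from the lowest byte of the mark word,
    // i.e. the first byte JOL prints on a little-endian machine.
    static State decode(int lowByte) {
        int lockBits = lowByte & 0b11;
        boolean biasedBit = (lowByte & 0b100) != 0;
        switch (lockBits) {
            case 0b00:
                return State.LIGHTWEIGHT;
            case 0b10:
                return State.HEAVYWEIGHT;
            case 0b11:
                return State.GC_MARKED;
            default:
                // 01: lock-free or biased, depending on the biased-lock bit
                return biasedBit ? State.BIASED : State.LOCK_FREE;
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(0x01)); // ...001
        System.out.println(decode(0x05)); // ...101
        System.out.println(decode(0xe8)); // 11101000, as in the lightweight lock output later on
    }
}
```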

When a thread executes a synchronized block, the JVM uses a CAS operation to record the thread pointer in the mark word and sets the biased-lock bit to indicate that the current thread has obtained the lock. The lock object becomes a biased lock (the lock flag in the object header is modified through CAS). After executing the synchronized block, the thread does not actively release the biased lock.

Having acquired the lock, the thread can execute the synchronized block. When the thread reaches the synchronized block again, it checks whether the thread currently holding the lock is itself. If the recorded thread ID is its own, it still holds the lock of this object and can continue executing the synchronized block. Because the biased lock was never actively released, there is no need to lock again here (and no need to call the operating system's mutex again). If only one thread uses the lock from beginning to end, the biased lock clearly adds almost no overhead and performance is extremely high.
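
The idea of recording the owner with a single CAS and skipping it on re-entry can be modeled with an AtomicLong. This is a toy sketch, not HotSpot's implementation: the bit positions, the numeric thread IDs, and the names are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

final class BiasedLockSketch {

    // Biased-lock bit set, lock bits 01, no owner recorded yet.
    static final long ANONYMOUS_BIAS = 0b101L;
    // Assumed position of the owner field above the flag, age and epoch bits.
    static final int THREAD_SHIFT = 10;

    private final AtomicLong markWord = new AtomicLong(ANONYMOUS_BIAS);

    // Returns true if the thread may enter under the bias. A false result
    // is where a real JVM would start revoking the bias.
    boolean tryEnter(long threadId) {
        long biasedToCaller = (threadId << THREAD_SHIFT) | ANONYMOUS_BIAS;
        if (markWord.get() == biasedToCaller) {
            return true; // already biased toward the caller: no CAS needed
        }
        // First acquisition: one CAS records the owner in the mark word.
        return markWord.compareAndSet(ANONYMOUS_BIAS, biasedToCaller);
    }

    public static void main(String[] args) {
        BiasedLockSketch lock = new BiasedLockSketch();
        System.out.println("thread 1, first entry:  " + lock.tryEnter(1));
        System.out.println("thread 1, second entry: " + lock.tryEnter(1));
        System.out.println("thread 2, first entry:  " + lock.tryEnter(2));
    }
}
```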

Check whether biased locking is enabled

java -XX:+PrintFlagsInitial | grep BiasedLock

Output:

intx BiasedLockingBulkRebiasThreshold          = 20                                  {product}
intx BiasedLockingBulkRevokeThreshold          = 40                                  {product}
intx BiasedLockingDecayTime                    = 25000                               {product}
// Biased locking takes effect after a default delay of 4 s; when testing, set this value to 0 to see the effect
intx BiasedLockingStartupDelay                 = 4000                                {product}
bool TraceBiasedLocking                        = false                               {product}
// The JVM enables biased locking by default
bool UseBiasedLocking                          = true                                {product}

Enable biased locking

-XX:+UseBiasedLocking -XX:BiasedLockingStartupDelay=0

The result UseBiasedLocking = true shows that the JVM enables biased locking by default. However, biased locking does not take effect as soon as the program starts; it only becomes active after a delay of 4 seconds.

Why is there a 4 s delay before biased locking takes effect?
During startup, the JVM itself loads classes and runs a lot of internal synchronized code that is contended by several threads. If biased locking were active from the start, these locks would be biased and then immediately revoked, and revocation is expensive. The JVM therefore waits for a period after startup (4 seconds by default) before enabling biased locking. Objects created during this period start in the ordinary lock-free state and use lightweight or heavyweight locks; objects created afterwards start in a biasable state, and the first thread to lock one of them records its thread ID in the object header.

This delay avoids paying the cost of biased lock revocation during startup, when it would outweigh the savings from biased locking.

To see the effect in the demo, set the biased locking delay to 0 s, as shown in the figure:

Insert image description here
The VM option is as follows:

-XX:BiasedLockingStartupDelay=0

The demo code is as follows:

public class ObjectMarkWordDemo {

    public static void main(String[] args) {
        Object abc = new Object();
        new Thread(() -> {
            synchronized (abc) {
                // Note: do not put any other code here
                System.out.println(ClassLayout.parseInstance(abc).toPrintable());
            }
        }).start();
    }
}

Note: do not put any other code before the print statement above.

The output is as follows:

Insert image description here

The internal layout has changed to the biased lock state, 101. However, this only holds while there is no lock contention; once contention is detected, the biased lock is revoked and upgraded to a lightweight lock.

(3) Biased lock revocation

Insert image description here

Question: Does biased lock revocation cause severe performance degradation?

Biased lock revocation is the process of taking an object out of the biased state, because of contention or for other reasons, and returning it to the lock-free state.

In the biased state, if another thread tries to acquire the lock, the bias must be revoked first. Revocation waits for a global safepoint, at which the JVM checks whether the thread that owns the bias is still alive and still inside the synchronized block. If it is not, the bias flag is cleared and the thread ID is reset, so the object returns to the lock-free state or can be rebiased; if it is, the lock is upgraded to a lightweight lock. Computing the identity hashCode of a biased object also forces revocation, because the mark word cannot hold the thread ID and the hashCode at the same time.

The process of biased lock revocation is relatively expensive, so it should be avoided as much as possible, especially in high-concurrency scenarios.

Optimization: under intense contention, biased locking can be turned off so that locks go directly to the lightweight state.

An example of biased lock revocation is as follows:

public class ObjectMarkWordDemo {

    public static void main(String[] args) throws InterruptedException {
        Object abc = new Object();

        synchronized (abc) {
            System.out.println("Biased lock: " + ClassLayout.parseInstance(abc).toPrintable());
        }
        System.out.println("Biased lock: " + ClassLayout.parseInstance(abc).toPrintable());
        new Thread(() -> {
            synchronized (abc) {
                System.out.println(">>>>>> Lock contention occurred, triggering biased lock revocation...");
            }
        }).start();
        System.out.println("Biased lock revocation: " + ClassLayout.parseInstance(abc).toPrintable());
    }
}

Insert image description here

(4) Lock Record

Question: What is Lock Record?

Insert image description here

While running, thread A creates a space in its stack frame called the Lock Record to store lock records. When the virtual machine detects that the object is in the lock-free state, it creates this space in the thread's stack frame to store the lock-related information from the Mark Word.

The data in the Lock Record is copied from the Mark Word, because that is where the lock-related information lives. The official name for this copy is the Displaced Mark Word. Finally, a CAS spin operation writes a pointer to this Lock Record into the Mark Word. If the write succeeds, thread A has acquired the lock; if it fails, the lock is held by another thread.
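
The copy-then-CAS sequence can be modeled in plain Java. The sketch below is a toy model under stated assumptions: an AtomicReference stands in for the Mark Word, a small LockRecord class stands in for the record in the stack frame, and none of the names correspond to real JVM structures.

```java
import java.util.concurrent.atomic.AtomicReference;

final class LockRecordSketch {

    // Hypothetical stand-in for a Lock Record in a thread's stack frame.
    static final class LockRecord {
        final Object displacedMarkWord; // copy of the original mark word

        LockRecord(Object displacedMarkWord) {
            this.displacedMarkWord = displacedMarkWord;
        }
    }

    // Holds either the original mark word (a Long) or a LockRecord "pointer".
    private final AtomicReference<Object> markWord =
            new AtomicReference<Object>(Long.valueOf(0b001L));

    // Copies the mark word into a new record, then tries to CAS the record
    // into the mark word. Returns null if the object is already locked.
    LockRecord tryLock() {
        Object current = markWord.get();
        if (current instanceof LockRecord) {
            return null; // another record is already installed
        }
        LockRecord record = new LockRecord(current);
        return markWord.compareAndSet(current, record) ? record : null;
    }

    // Restores the displaced mark word; succeeds only for the installed record.
    boolean unlock(LockRecord record) {
        return markWord.compareAndSet(record, record.displacedMarkWord);
    }

    public static void main(String[] args) {
        LockRecordSketch object = new LockRecordSketch();
        LockRecord recordA = object.tryLock();
        System.out.println("A locked: " + (recordA != null));
        System.out.println("B locked: " + (object.tryLock() != null));
        System.out.println("A unlocked: " + object.unlock(recordA));
    }
}
```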

(5) Lightweight lock

Lightweight locks improve efficiency when threads execute the synchronized block almost alternately.

The main purpose is to use CAS to avoid the performance cost of heavyweight locks, which rely on operating system mutexes, when there is no real multi-thread contention. Put simply: spin first, then block. As for the upgrade timing, the lock becomes a lightweight lock when biased locking is turned off or when multiple threads compete for a biased lock.

Suppose thread A already holds the lock and thread B then tries to take the lock of the object. Because thread A obtained it first, the lock is currently biased; thread B finds that the thread ID in the Mark Word is not its own, so it enters a CAS spin hoping to obtain the lock. There are two possible outcomes for thread B:

① If the lock is acquired successfully, the thread ID in the Mark Word is replaced with thread B's own ID and the lock is rebiased toward thread B. The lock stays in the biased state: thread A is finished and thread B takes over.

Insert image description here

② If the acquisition fails, the biased lock is upgraded to a lightweight lock. The lightweight lock is held by the thread that originally held the biased lock, which continues executing its synchronized block, while the competing thread B spins waiting for the lightweight lock.

Insert image description here

If a lightweight lock spins too many times, CPU resources are wasted. Before JDK 6, a thread gave up spinning and the lock was upgraded to a heavyweight lock once it had spun 10 times (the default) or the number of spinning threads exceeded half the number of CPU cores.

Command for changing the spin count:

-XX:PreBlockSpin=10
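
The "spin first, then block" policy can be sketched with standard library classes. This is an illustration of the idea only, not how the JVM implements synchronized: a Semaphore stands in for the lock, tryAcquire() plays the role of a CAS attempt, and the class name and the spin limit of 10 are assumptions chosen to match the text.

```java
import java.util.concurrent.Semaphore;

final class SpinThenBlockSketch {

    private final Semaphore permit = new Semaphore(1);
    private final int maxSpins;

    SpinThenBlockSketch(int maxSpins) {
        this.maxSpins = maxSpins;
    }

    // Returns true if the lock was taken while spinning,
    // false if the thread had to fall back to blocking.
    boolean lock() {
        for (int i = 0; i < maxSpins; i++) {
            if (permit.tryAcquire()) {
                return true; // cheap path: no blocking
            }
        }
        permit.acquireUninterruptibly(); // expensive path: park the thread
        return false;
    }

    void unlock() {
        permit.release();
    }

    // Increments a shared counter from several threads under the lock.
    static int countWithThreads(int threads, int perThread) {
        SpinThenBlockSketch lock = new SpinThenBlockSketch(10);
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 1000));
    }
}
```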

An example of a lightweight lock is as follows (the biased locking delay can be restored to its default):

-XX:BiasedLockingStartupDelay=4000

package com.xxl.job.admin.mytest;

import org.openjdk.jol.info.ClassLayout;

import java.util.concurrent.TimeUnit;

public class ObjectMarkWordDemo {

    public void test() {
        Object obj = new Object();
        synchronized (obj) {
            System.out.println("111");
        }
        synchronized (obj) {
            System.out.println("111");
        }
        synchronized (obj) {
            System.out.println("111");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Print JVM-related information
        // System.out.println(VM.current().details());
        // Print the object alignment (whether every object size is a multiple of 8)
        // System.out.println(VM.current().objectAlignment());
        MyObject myObject = new MyObject();
        System.out.println(Integer.toHexString(myObject.hashCode()));
        new Thread(() -> {
            // Lock on the myObject header (by default this goes straight to a lightweight lock; here I want to force the biased state)
            // Biased locking is enabled by default, so we only need to set its startup delay to 0 to see the effect: -XX:BiasedLockingStartupDelay=0
            synchronized (myObject) {
                // This thread holds the lock, and the biased thread ID is also set
                System.out.println(ClassLayout.parseInstance(myObject).toPrintable());
            }
        }).start();

        TimeUnit.MICROSECONDS.sleep(500);

        // Once the lock has been released, this prints the lock-free state 001
        System.out.println(ClassLayout.parseInstance(myObject).toPrintable());
    }
}

class MyObject {
}

Output:

76fb509a
# WARNING: Unable to attach Serviceability Agent. You can try again with escalated privileges. Two options: a) use -Djol.tryWithSudo=true to try with sudo; b) echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
com.xxl.job.admin.mytest.MyObject object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           e8 29 c3 0a (11101000 00101001 11000011 00001010) (180562408)
      4     4        (object header)                           03 00 00 00 (00000011 00000000 00000000 00000000) (3)
      8     4        (object header)                           44 c1 00 f8 (01000100 11000001 00000000 11111000) (-134168252)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

# WARNING: Unable to attach Serviceability Agent. You can try again with escalated privileges. Two options: a) use -Djol.tryWithSudo=true to try with sudo; b) echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
com.xxl.job.admin.mytest.MyObject object internals:
 OFFSET  SIZE   TYPE DESCRIPTION                               VALUE
      0     4        (object header)                           e8 29 c3 0a (11101000 00101001 11000011 00001010) (180562408)
      4     4        (object header)                           03 00 00 00 (00000011 00000000 00000000 00000000) (3)
      8     4        (object header)                           44 c1 00 f8 (01000100 11000001 00000000 11111000) (-134168252)
     12     4        (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

(6) Heavyweight lock

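When threads keep spinning on a lightweight lock, consider whether a heavyweight lock would perform better, since blocked threads stop consuming CPU.

public class ObjectMarkWordDemo {

    public static void main(String[] args) throws InterruptedException {
        Object abc = new Object();
        Thread[] threads = new Thread[2];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                synchronized (abc) {
                    System.out.println("Heavyweight lock: " + ClassLayout.parseInstance(abc).toPrintable());
                }
            });
            // Start without joining first, so that both threads compete for the lock
            threads[i].start();
        }
        // Wait for both threads only after they have all been started
        for (Thread thread : threads) {
            thread.join();
        }
    }
}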

Insert image description here

(7) Lock coarsening

Similar to the following example:

public class ObjectMarkWordDemo {

    Object abc = new Object();

    public void show() {
        synchronized (abc) {
            // complex operation
        }
        synchronized (abc) {
            // complex operation
        }
        synchronized (abc) {
            // complex operation
        }
    }
}

The same lock is used each time and the gaps between the blocks are very short. The JVM optimizes this by merging the blocks into one larger lock; this is called lock coarsening and improves program performance.
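
Written out by hand, the effect of the optimization looks roughly like the sketch below. Both methods are ordinary Java with made-up names; whether and when the JIT compiler actually merges the blocks is up to the JVM, so this only illustrates the before and after shapes.

```java
final class LockCoarseningSketch {

    private final Object lock = new Object();
    private int counter;

    // Three adjacent blocks on the same lock: three lock/unlock pairs.
    int separateBlocks() {
        synchronized (lock) {
            counter++;
        }
        synchronized (lock) {
            counter++;
        }
        synchronized (lock) {
            counter++;
            return counter;
        }
    }

    // The coarsened equivalent: one lock/unlock pair around all the work.
    int coarsenedBlock() {
        synchronized (lock) {
            counter++;
            counter++;
            counter++;
            return counter;
        }
    }

    public static void main(String[] args) {
        System.out.println(new LockCoarseningSketch().separateBlocks());
        System.out.println(new LockCoarseningSketch().coarsenedBlock());
    }
}
```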

(8) Lock elimination

Similar to the following example:

public class ObjectMarkWordDemo {

    public void show() {
        Object abc = new Object();
        synchronized (abc) {
            // complex operation
        }
    }
}

The show() method locks on every call, but the lock object is local to the method and no other thread can ever see it, so the lock is meaningless. The JVM therefore optimizes it away to improve program performance.
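
A common case where this applies is a StringBuffer that never leaves the method: its append method is synchronized, yet no other thread can reach the buffer. The sketch below uses made-up names and only shows code that is a candidate for lock elimination; whether the locks are actually removed depends on the JIT compiler and its escape analysis.

```java
final class LockEliminationSketch {

    // The StringBuffer is local and never escapes, so the locking done
    // by each synchronized append() call can never be contended.
    static String join(String a, String b, String c) {
        StringBuffer buffer = new StringBuffer();
        buffer.append(a);
        buffer.append(b);
        buffer.append(c);
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(join("a", "b", "c"));
    }
}
```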

3. Commonly used commands

Set JVM heap size

-Xms10m -Xmx10m

Query which flags the JVM was started with

java -XX:+PrintCommandLineFlags -version

Turn off compression of class pointers in object headers

-XX:-UseCompressedClassPointers

Check whether biased locking is enabled

java -XX:+PrintFlagsInitial | grep BiasedLock

Output:

intx BiasedLockingBulkRebiasThreshold          = 20                                  {product}
intx BiasedLockingBulkRevokeThreshold          = 40                                  {product}
intx BiasedLockingDecayTime                    = 25000                               {product}
// Biased locking takes effect after a default delay of 4 s; when testing, set this value to 0 to see the effect
intx BiasedLockingStartupDelay                 = 4000                                {product}
bool TraceBiasedLocking                        = false                               {product}
// The JVM enables biased locking by default
bool UseBiasedLocking                          = true                                {product}

Enable biased locking

-XX:+UseBiasedLocking -XX:BiasedLockingStartupDelay=0


Origin blog.csdn.net/qq_35971258/article/details/129286257