1. Java object memory layout
1. Object memory layout
At the JVM level, an object is laid out in memory as shown below (the right half of the figure shows the contiguous address space of an array):
There are three parts in total:
1. Object header: stores the object's metadata, such as the hash code, GC generational age, lock state flags, and the lock held by a thread.
2. Instance data: stores the actual contents of the object, that is, the fields of various types defined by the programmer.
3. Padding: so that the JVM can access object data faster, extra space is appended after the instance data so that the object size is a multiple of the alignment unit used by the virtual machine's memory management (usually 8 bytes).
The size of the specific object header and the size of the instance data are related to the specific implementation of the Java virtual machine, the type of the object, the virtual machine runtime parameters, etc., and are generally not fixed values. It should be noted that the memory layout of array objects is different from that of ordinary objects. Array objects will additionally store array length information.
1.1. Object header
(1) Mark word
The object header consists of two parts:
1. Mark word: stores the object's metadata, such as the hash code, GC generational age, lock state flags, and the lock held by a thread.
2. Class metadata pointer (klass pointer): stores a pointer to the start address of the class metadata (klass) of the object's class.
On a 64-bit JVM, the mark word occupies 8 bytes and the type pointer occupies 8 bytes, 16 bytes in total. In other words, a newly created object has a header of 16 bytes (though not necessarily, because the type pointer may be compressed).
(2) Klass pointer (type pointer)
As the figure below shows, the type pointer points into the method area. For example, given a Customer class, when you create a Customer instance with new, the instance's type pointer points to the Customer class metadata in the method area.
1.2. Instance data
Stores the field data of the class, including fields inherited from parent classes; for an array instance, the array length must also be stored. This part of memory is aligned to 4 bytes.
An example is as follows:
public class MarkwordDemo {
    public static void main(String[] args) {
        new Apple();
    }
}

class Apple {
}
Creating an Apple instance with no fields already occupies 16 bytes in memory (ignoring type pointer compression). What if the Apple class has fields? For example:
public class MarkwordDemo {
    public static void main(String[] args) {
        new Apple();
    }
}

class Apple {
    int size = 100;
    char a = 'a';
}
An int occupies 4 bytes and a char occupies 2 bytes (a Java char is a 16-bit UTF-16 code unit), so the raw size of a new Apple instance is 16 + 4 + 2 = 22 bytes. It actually occupies 24 bytes, because the JVM aligns and pads objects to simplify memory management, generally to a multiple of 8.
1.3. Padding
The virtual machine requires the start address of every object to be a multiple of 8 bytes. Padding does not have to be present; it exists only for byte alignment and rounds the object size up to a multiple of 8 bytes.
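The rounding rule can be written as a few lines of arithmetic. The sketch below is only an illustration: the class name ObjectSizeSketch and the helper alignUp are made up here, and the 16-byte header assumes an uncompressed klass pointer, as in the Apple example.

```java
final class ObjectSizeSketch {
    // Rounds size up to the next multiple of alignment (alignment must be a power of two).
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & -alignment;
    }

    public static void main(String[] args) {
        long header = 16;                               // assumption: 8-byte mark word + 8-byte klass pointer
        long fields = Integer.BYTES + Character.BYTES;  // int (4 bytes) + char (2 bytes)
        long raw = header + fields;
        // prints "raw = 22, aligned = 24"
        System.out.println("raw = " + raw + ", aligned = " + alignUp(raw, 8));
    }
}
```

An object whose raw size is already a multiple of 8 (for example the bare 16-byte header) needs no padding at all.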
2. Research on the bottom layer of synchronization lock
The HotSpot source file markOop.hpp contains the following comment, as shown below:
Simplifying that comment gives the following diagram of the object header on a 64-bit virtual machine:
Now that we know the basic internal structure of an object, let's look at how a synchronized lock changes the object header.
1. Viewing object memory layout in Java
You can use the Java utility library JOL to view the in-memory layout of an object created with new Object(), as shown below:
1. First, add the dependency
Version 0.9 of the dependency is recommended. Other versions may print different output, so take care.
<dependency>
<groupId>org.openjdk.jol</groupId>
<artifactId>jol-core</artifactId>
<version>0.9</version>
</dependency>
2. Demo code
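import org.openjdk.jol.info.ClassLayout;

class MyObject {
}

public class ObjectMarkWordDemo {
    public static void main(String[] args) {
        // Print the header, fields, and padding of a fresh MyObject instance
        System.out.println(ClassLayout.parseInstance(new MyObject()).toPrintable());
    }
}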
Create a MyObject directly with new and view its memory layout through the ClassLayout utility class. The output is as follows:
Add two fields of different types to the MyObject class as follows:
import org.openjdk.jol.info.ClassLayout;

class MyObject {
    int i = 25;
    boolean flag = false;
}

public class ObjectMarkWordDemo {
    public static void main(String[] args) {
        System.out.println(ClassLayout.parseInstance(new MyObject()).toPrintable());
    }
}
The memory layout printed is then as shown below:
As you can see above, the type pointer would normally occupy 8 bytes, but here it occupies 4 bytes. The following command shows which flags the JVM was started with:
java -XX:+PrintCommandLineFlags -version
As the + sign in the output shows, the JVM enables type pointer compression by default, which saves memory. Now change this setting as shown below:
-XX:-UseCompressedClassPointers
With compression turned off, rerun the test. The output is as follows:
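The flag value can also be read from inside a running program. The sketch below is one possible way to do that, assuming a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean; the class name VmFlagSketch and the helper readFlag are made up for this illustration. A flag that does not exist on the running JVM yields an empty result instead of an exception.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.Optional;

final class VmFlagSketch {
    // Returns the current value of a HotSpot flag, or empty if the flag or the bean is unavailable.
    static Optional<String> readFlag(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return Optional.empty();
            }
            return Optional.of(bean.getVMOption(name).getValue());
        } catch (IllegalArgumentException | UnsupportedOperationException e) {
            // Thrown when the flag is unknown or the bean is not supported on this JVM
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedClassPointers = "
                + readFlag("UseCompressedClassPointers").orElse("(not available)"));
    }
}
```

Running it once normally and once with -XX:-UseCompressedClassPointers should show the value change, on JVMs that still accept that flag.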
You already know how to view Java memory layout above. Now let’s learn more about synchronized lock optimization and lock upgrade.
2. Research on synchronized locks
Let's look at the memory structure of the mark word in the object header, as shown below:
Background of synchronized lock optimization:
Using locks provides safety but degrades performance. Going lock-free improves program performance but sacrifices thread safety. So how can the two be balanced?
For this reason, synchronized has used lock upgrading since JDK 1.6 to improve program performance while keeping the program safe.
Before JDK 1.6, synchronized relied on the operating system's heavyweight lock. Every lock operation required a switch between user mode and kernel mode. The switch involved a lot of data copying, so performance was very low.
Java threads are mapped to native operating system threads. Blocking or waking a thread requires the operating system to step in, which means switching between user mode and kernel mode. This switch consumes a lot of system resources, because user mode and kernel mode each have their own memory space, registers, and so on. Switching from user mode to kernel mode requires passing many variables and parameters to the kernel, and the kernel must save register values and variables from user mode during the switch so that, once the kernel-mode call completes, execution can switch back to user mode and continue.
In early versions of Java, synchronized is a heavyweight lock and is inefficient, because the monitor (Monitor) relies on the mutex lock (Mutex Lock) of the underlying operating system. Suspending and resuming threads has to go through kernel mode: blocking or waking a Java thread is done by the operating system and requires switching the CPU state, which costs CPU time. If the code in the synchronized block is very simple, the cost of this switch outweighs the work itself.
For example, if we add the synchronized keyword to a code block, the code is as follows:
class MyObject {
    int a = 25;
    char b = 'b';
}

public class ObjectMarkWordDemo {
    public static void main(String[] args) {
        MyObject myObject = new MyObject();
        new Thread(() -> {
            synchronized (myObject) {
                System.out.println(">>>>>>");
            }
        }).start();
    }
}
Adding the synchronized keyword at the Java level adds an invisible lock underneath by default, the Monitor lock, as shown below:
So how is the Monitor related to Java objects and threads?
- If a Java object is locked by a thread, the lock word in the object's mark word field points to the start address of the Monitor. The Monitor's Owner field stores the ID of the thread that owns the associated object lock.
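The ownership relationship can be observed from plain Java with Thread.holdsLock, which reports whether the calling thread currently owns the monitor of an object. A minimal sketch (the class name MonitorOwnerSketch and the method observe are illustrative only):

```java
final class MonitorOwnerSketch {
    // Returns {before, inside, after}: whether the current thread owns the monitor at each point.
    static boolean[] observe() {
        Object lock = new Object();
        boolean before = Thread.holdsLock(lock);
        boolean inside;
        synchronized (lock) {
            inside = Thread.holdsLock(lock);   // the current thread is the monitor owner here
        }
        boolean after = Thread.holdsLock(lock);
        return new boolean[] {before, inside, after};
    }

    public static void main(String[] args) {
        boolean[] r = observe();
        // prints "before=false inside=true after=false"
        System.out.println("before=" + r[0] + " inside=" + r[1] + " after=" + r[2]);
    }
}
```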
3. Lock optimization process
(1) No lock
Look at the following code without locking, as shown below:
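import org.openjdk.jol.info.ClassLayout;

public class ObjectMarkWordDemo {
    public static void main(String[] args) throws InterruptedException {
        Object abc = new Object();
        // No synchronized block is involved, so the mark word shows the lock-free state
        System.out.println(ClassLayout.parseInstance(abc).toPrintable());
    }
}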
Without a lock, the mark word of an object in memory looks like this:
The information printed by the Java program is as follows:
In the output above, the 001 in the blue box indicates the lock-free state. In the lock-free state, the 31 bits in the red box hold the hashCode (one further bit is 0 padding and can be ignored). However, no hashCode is displayed, because it is computed lazily: hashCode() has to be called to trigger it. For example:
import org.openjdk.jol.info.ClassLayout;

public class ObjectMarkWordDemo {
    public static void main(String[] args) {
        MyObject myObject = new MyObject();
        // Calling hashCode() triggers the lazy computation and stores the hash in the mark word
        System.out.println("Decimal: myObject.hashCode() = " + myObject.hashCode());
        System.out.println("Binary: " + Integer.toBinaryString(myObject.hashCode()));
        System.out.println("Hexadecimal: " + Integer.toHexString(myObject.hashCode()));
        System.out.println(ClassLayout.parseInstance(myObject).toPrintable());
    }
}
The output is as follows:
Decimal: myObject.hashCode() = 1435804085
Binary: 1010101100101001010000110110101
Hexadecimal: 5594a1b5
To make this easier to see, write out every bit of the hashCode. Copy the bytes from right to left (the 8 bytes form one long bit string: the first 25 bits are unused, the next 31 bits are the hashCode (blue frame), and the 3 bits in the red frame are lock related, with 001 indicating the lock-free state). First copy: 1010101 (the leading 0 is padding; do not copy it, since it is one of the 25 unused bits). Second copy: 10010100. Third copy: 10100001. Fourth copy: 10110101. The concatenated string is exactly the binary value printed above (1010101100101001010000110110101). These 31 bits are the stored hashCode.
The above also shows that the lock-free state is represented by 001.
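The manual right-to-left copying can also be done with shifts and masks. The sketch below assumes the 64-bit lock-free layout described in markOop.hpp (25 unused bits, 31 hash bits, 1 unused bit, 4 age bits, 1 biased bit, 2 lock bits). The class name MarkWordHashSketch and the eight input bytes are constructed for this illustration from the hashCode 5594a1b5 above; they are not copied from real JOL output.

```java
final class MarkWordHashSketch {
    // Rebuilds the 64-bit mark word from the 8 header bytes in printed order
    // (lowest address first, which is little-endian on x86-64).
    static long fromPrintedBytes(int... bytes) {
        long mark = 0;
        for (int i = bytes.length - 1; i >= 0; i--) {
            mark = (mark << 8) | (bytes[i] & 0xFF);
        }
        return mark;
    }

    // Bits 8..38 hold the identity hash in the lock-free state.
    static int hash(long mark) {
        return (int) ((mark >>> 8) & 0x7FFFFFFFL);
    }

    // Bits 3..6 hold the GC generational age.
    static int age(long mark) {
        return (int) ((mark >>> 3) & 0xF);
    }

    // Lowest 3 bits: biased lock bit followed by the 2 lock bits.
    static int lockBits(long mark) {
        return (int) (mark & 0b111);
    }

    public static void main(String[] args) {
        long mark = fromPrintedBytes(0x01, 0xb5, 0xa1, 0x94, 0x55, 0x00, 0x00, 0x00);
        System.out.println("hash = " + Integer.toHexString(hash(mark)));  // prints "hash = 5594a1b5"
        System.out.println("lock bits = " + lockBits(mark));              // prints "lock bits = 1", i.e. 001
    }
}
```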
(2) Biased lock
When a piece of synchronized code is accessed repeatedly by the same thread, then because only one thread is involved, that thread acquires the lock automatically on subsequent accesses, as shown in the following figure (red box):
Once a thread acquires the biased lock, a pointer to the current thread is saved in the first 54 bits of the object's mark word (where the lock-free state keeps the hashCode), and the biased lock bit is set to 1.
Why do we need 2 bits to represent the lock flag?
Specifically, a synchronized lock starts in the lock-free state. When the first thread competes for the lock, it changes the lock flag in the object header to biased and records its thread ID in the object header, indicating that the thread has obtained the biased lock. When a thread competes for the lock later and finds that the thread ID recorded in the object header matches its own, it obtains the lock directly. Otherwise, the biased lock must be revoked and the lock switches to the lightweight state. When multiple threads contend for the lock, it enters the heavyweight state. To implement this upgrade process, Java reserves two bits in the object header as the lock flag, supporting the transitions from the lock-free state to the biased state, then to the lightweight state, and finally to the heavyweight state.
There are four states here, so 2 bits are used for the lock flag. The lock-free and biased states share the flag value 01 and are told apart by the separate biased lock bit (001 versus 101).
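The mapping from the low bits of the mark word to a lock state can be sketched as a small decoder. The bit patterns follow the constants in markOop.hpp (00 lightweight locked, 01 unlocked or biased, 10 heavyweight monitor, 11 marked by the GC); the class and enum names are invented for this illustration.

```java
final class LockStateSketch {
    enum State { UNLOCKED, BIASED, LIGHTWEIGHT, HEAVYWEIGHT, GC_MARKED }

    // Classifies a 64-bit mark word by its 2 lock bits and, for 01, the biased lock bit.
    static State of(long mark) {
        switch ((int) (mark & 0b11)) {
            case 0b00:
                return State.LIGHTWEIGHT;
            case 0b10:
                return State.HEAVYWEIGHT;
            case 0b11:
                return State.GC_MARKED;
            default:
                // Lock bits are 01: the third bit decides between biased (101) and lock-free (001)
                return (mark & 0b100) != 0 ? State.BIASED : State.UNLOCKED;
        }
    }

    public static void main(String[] args) {
        System.out.println(of(0b001L));  // prints "UNLOCKED"
        System.out.println(of(0b101L));  // prints "BIASED"
        System.out.println(of(0xe8L));   // 0xe8 = 11101000, lock bits 00: prints "LIGHTWEIGHT"
    }
}
```

Only the lowest byte of the header matters for this classification, which is why reading the first printed byte of a JOL dump is usually enough to tell the state.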
When a thread executes a synchronized block, the JVM uses a CAS operation to record the thread's ID in the mark word and sets the biased lock bit, indicating that the current thread holds the lock. The lock object is now a biased lock (the lock flag in the object header is modified through CAS). After leaving the synchronized block, the thread does not actively release the biased lock.
Having acquired the lock, the thread can execute the synchronized block. When a thread reaches the synchronized block again, it checks whether the thread recorded as holding the lock is itself. If the recorded thread ID is its own, the lock on this object is still held and the synchronized block can be executed right away. Because the biased lock was never actively released, there is no need to lock again (and no need to call the operating system's Mutex lock). If only one thread ever uses the lock, a biased lock has almost no extra overhead and performance is extremely high.
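The fast path described here (one CAS for the first thread, a plain comparison afterwards) can be imitated in ordinary Java. This is a toy model under stated assumptions, not the HotSpot implementation: an AtomicLong stands in for the thread ID field of the mark word, and the names BiasSketch, tryEnterBiased, and enterFromNewThread are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

final class BiasSketch {
    private static final long NO_OWNER = 0L;   // real thread IDs are positive, so 0 means "not biased yet"
    private final AtomicLong biasedThreadId = new AtomicLong(NO_OWNER);

    // Returns true if the calling thread may enter on the biased fast path.
    boolean tryEnterBiased() {
        long me = Thread.currentThread().getId();
        if (biasedThreadId.get() == me) {
            return true;                                     // already biased toward us: no CAS needed
        }
        return biasedThreadId.compareAndSet(NO_OWNER, me);   // first thread installs its ID with one CAS
    }

    // Helper for the demo: tries to enter from a freshly started thread.
    static boolean enterFromNewThread(BiasSketch lock) {
        boolean[] result = new boolean[1];
        Thread t = new Thread(() -> result[0] = lock.tryEnterBiased());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        BiasSketch lock = new BiasSketch();
        System.out.println("first entry:  " + lock.tryEnterBiased());      // true, via CAS
        System.out.println("second entry: " + lock.tryEnterBiased());      // true, via plain comparison
        System.out.println("other thread: " + enterFromNewThread(lock));   // false, bias belongs to main
    }
}
```

In the real JVM a failed check does not simply return false; it leads to revocation and an upgrade to a lightweight lock.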
① Check whether biased locking is on
java -XX:+PrintFlagsInitial | grep BiasedLock
Output:
intx BiasedLockingBulkRebiasThreshold = 20 {product}
intx BiasedLockingBulkRevokeThreshold = 40 {product}
intx BiasedLockingDecayTime = 25000 {product}
// Once enabled, biased locking has a default delay of 4 s. Keep this in mind when testing; you can set the value to 0 to see the effect more easily.
intx BiasedLockingStartupDelay = 4000 {product}
bool TraceBiasedLocking = false {product}
// The JVM enables biased locking by default
bool UseBiasedLocking = true {product}
② Turn on the biased lock setting
-XX:+UseBiasedLocking -XX:BiasedLockingStartupDelay=0
The result UseBiasedLocking = true shows that the JVM enables biased locking by default. However, it is not enabled immediately when the program starts; it takes effect only after a delay of 4 seconds.
Why is there a 4 s delay before biased locking is enabled?
The delay is measured from JVM startup, not from the creation of an individual object. During startup the JVM itself loads classes and runs internal code that uses many synchronized blocks, and these locks are often contended by several threads. If biased locking were active from the first instant, many of those locks would be biased and then immediately revoked, and revocation is expensive. Objects created before the delay expires are therefore not biasable and go straight to lightweight or heavyweight locking when synchronized; objects created after the delay start out biasable.
Delaying biased locking in this way keeps the cost of bias revocation during startup from outweighing the locking overhead that biased locks are meant to save.
To demonstrate the effect, set the biased lock delay to 0 s, as shown in the figure:
The VM option is as follows:
-XX:BiasedLockingStartupDelay=0
The demo code is as follows:
import org.openjdk.jol.info.ClassLayout;

public class ObjectMarkWordDemo {
    public static void main(String[] args) {
        Object abc = new Object();
        new Thread(() -> {
            synchronized (abc) {
                // Note: do not add any other code before this line
                System.out.println(ClassLayout.parseInstance(abc).toPrintable());
            }
        }).start();
    }
}
Note: do not put any other code before the output statement above.
The output is as follows:
The internal layout changes to the biased lock pattern 101. This holds only while there is no lock contention; once contention is detected, the biased lock is revoked and the lock is upgraded to a lightweight lock.
(3) Biased lock revocation
Question: does biased lock revocation cause severe performance degradation?
Biased lock revocation is the process of removing the bias from an object's lock because of contention or for other reasons, returning the object to the lock-free state.
In the biased state, if another thread tries to acquire the lock, the biased lock must be revoked first. Revocation involves checking whether the object's hashCode has changed. If the hashCode has changed, the biased lock must be revoked; otherwise the lock can be upgraded directly to a lightweight lock. Revocation involves steps such as rebiasing, clearing the biased lock flag, and setting the thread ID to 0.
Biased lock revocation is relatively expensive, so it should be avoided as far as possible, especially in high-concurrency scenarios.
Optimization: under heavy contention, biased locking can be turned off so that locks go directly to lightweight locks.
An example of biased lock revocation is as follows:
import org.openjdk.jol.info.ClassLayout;

public class ObjectMarkWordDemo {
    public static void main(String[] args) throws InterruptedException {
        Object abc = new Object();
        synchronized (abc) {
            System.out.println("Biased lock: " + ClassLayout.parseInstance(abc).toPrintable());
        }
        // The bias is not released when the block exits
        System.out.println("Biased lock: " + ClassLayout.parseInstance(abc).toPrintable());
        new Thread(() -> {
            synchronized (abc) {
                System.out.println(">>>>>> Lock contention occurred, triggering biased lock revocation...");
            }
        }).start();
        System.out.println("Biased lock revoked: " + ClassLayout.parseInstance(abc).toPrintable());
    }
}
(4) Lock Record
Question: what is a Lock Record?
While running, thread A creates a space in its stack frame called the Lock Record, which stores lock records. When the virtual machine detects that the object is in the lock-free state, it creates this space in the thread's stack frame to hold the lock-related information from the Mark Word.
The data in the Lock Record is copied from the Mark Word, because that is where the lock-related information lives. This copy is officially called the Displaced Mark Word. Finally, a CAS spin operation writes a pointer to this Lock Record into the Mark Word. If the write succeeds, thread A has acquired the lock. If it fails, the lock is held by another thread.
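The copy-then-CAS sequence can be modelled with an AtomicLong standing in for the mark word. This is a simplified sketch under stated assumptions: the class LockRecordSketch, its method names, and the addresses 0x7000 and 0x8000 are hypothetical stand-ins for stack addresses, chosen so that their low two bits are 00 like a lightweight-locked mark word.

```java
import java.util.concurrent.atomic.AtomicLong;

final class LockRecordSketch {
    static final long UNLOCKED_MARK = 0b001L;   // lock-free pattern with no hash stored
    static final long FAILED = -1L;

    private final AtomicLong markWord = new AtomicLong(UNLOCKED_MARK);

    // Copies the mark word (the displaced mark word) and tries to swing the mark word
    // to the lock record address. Returns the displaced copy, or FAILED if the lock is taken.
    long tryLock(long lockRecordAddress) {
        long displaced = markWord.get();
        if ((displaced & 0b111L) != 0b001L) {
            return FAILED;                       // not in the lock-free state
        }
        return markWord.compareAndSet(displaced, lockRecordAddress) ? displaced : FAILED;
    }

    // Restores the displaced mark word; succeeds only for the record that holds the lock.
    boolean unlock(long lockRecordAddress, long displaced) {
        return markWord.compareAndSet(lockRecordAddress, displaced);
    }

    long currentMark() {
        return markWord.get();
    }

    public static void main(String[] args) {
        LockRecordSketch object = new LockRecordSketch();
        long recordA = 0x7000L;   // hypothetical lock record address of thread A
        long recordB = 0x8000L;   // hypothetical lock record address of thread B
        long displacedA = object.tryLock(recordA);
        System.out.println("A locked: " + (displacedA != FAILED));               // true
        System.out.println("B locked: " + (object.tryLock(recordB) != FAILED));  // false
        object.unlock(recordA, displacedA);
        System.out.println("restored: " + (object.currentMark() == UNLOCKED_MARK)); // true
    }
}
```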
(5) Lightweight lock
Lightweight locks improve efficiency when threads execute the synchronized block almost alternately.
The main purpose is to use CAS to reduce the performance cost of heavyweight locks, which rely on operating system mutexes, when there is no multi-thread contention. Put simply: spin first, then block. Upgrade timing: when biased locking is turned off, or when multiple threads compete for a biased lock, the lock is upgraded to a lightweight lock.
Suppose thread A already holds the lock and thread B then tries to take the object's lock. Because thread A acquired it first, the lock is currently biased. Thread B finds that the thread ID in the Mark Word is not its own, so it enters a CAS spin hoping to obtain the lock. There are two possible outcomes for thread B:
① If the lock is acquired, the thread ID in the Mark Word is replaced with thread B's ID and the lock is rebiased toward thread B. The lock stays in the biased state: thread A is finished and thread B takes over.
② If acquisition fails, the biased lock is upgraded to a lightweight lock. The lightweight lock is held by the thread that originally held the biased lock, which continues executing its synchronized block, while the competing thread B spins waiting for the lightweight lock.
If a lightweight lock spins too many times, CPU resources are wasted. Before JDK 6, spinning was abandoned and the lock upgraded to a heavyweight lock once the spin count reached the default of 10 or the number of spinning threads exceeded half the number of CPU cores.
Flag for changing the number of spins:
-XX:PreBlockSpin=10
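The "spin first, then block" idea can be illustrated with a small user-level lock. This is an analogy written with java.util.concurrent, not how the JVM implements synchronized: the class SpinThenParkLock, its maxSpins parameter, and the 100 microsecond park interval are all assumptions made for the sketch, and the lock is not reentrant.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

final class SpinThenParkLock {
    private final AtomicBoolean held = new AtomicBoolean(false);
    private final int maxSpins;

    SpinThenParkLock(int maxSpins) {
        this.maxSpins = maxSpins;
    }

    void lock() {
        int spins = 0;
        while (!held.compareAndSet(false, true)) {    // CAS attempt to take the lock
            if (++spins > maxSpins) {
                // Spin budget used up: stop burning CPU and sleep briefly before retrying
                LockSupport.parkNanos(100_000L);
            }
        }
    }

    void unlock() {
        held.set(false);
    }

    // Demo helper: several threads increment a shared counter under the lock.
    static int countWithThreads(int threads, int perThread) {
        SpinThenParkLock lock = new SpinThenParkLock(10);
        int[] counter = new int[1];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 1000));   // prints 4000
    }
}
```

A small spin budget pays off when the lock is held only briefly; with long critical sections the parked path dominates, which mirrors the reason for upgrading to a heavyweight lock.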
An example of a lightweight lock is as follows (the biased lock delay can be restored to its default):
-XX:BiasedLockingStartupDelay=4000
package com.xxl.job.admin.mytest;

import org.openjdk.jol.info.ClassLayout;
import java.util.concurrent.TimeUnit;

public class ObjectMarkWordDemo {
    public void test() {
        Object obj = new Object();
        synchronized (obj) {
            System.out.println("111");
        }
        synchronized (obj) {
            System.out.println("111");
        }
        synchronized (obj) {
            System.out.println("111");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Print JVM details
        // System.out.println(VM.current().details());
        // Print the object alignment (whether every object size is a multiple of 8)
        // System.out.println(VM.current().objectAlignment());
        MyObject myObject = new MyObject();
        System.out.println(Integer.toHexString(myObject.hashCode()));
        new Thread(() -> {
            // Lock on the myObject header (by default this goes straight to a lightweight lock; here I want to force the biased state)
            // Biased locking is enabled by default, so only its startup delay needs to be set to 0 to see the effect: -XX:BiasedLockingStartupDelay=0
            synchronized (myObject) {
                // Locks for this thread and also sets the biased thread ID
                System.out.println(ClassLayout.parseInstance(myObject).toPrintable());
            }
        }).start();
        TimeUnit.MICROSECONDS.sleep(500);
        // Once the lock has been released, this prints the lock-free state 001
        System.out.println(ClassLayout.parseInstance(myObject).toPrintable());
    }
}

class MyObject {
}
Output:
76fb509a
# WARNING: Unable to attach Serviceability Agent. You can try again with escalated privileges. Two options: a) use -Djol.tryWithSudo=true to try with sudo; b) echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
com.xxl.job.admin.mytest.MyObject object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) e8 29 c3 0a (11101000 00101001 11000011 00001010) (180562408)
4 4 (object header) 03 00 00 00 (00000011 00000000 00000000 00000000) (3)
8 4 (object header) 44 c1 00 f8 (01000100 11000001 00000000 11111000) (-134168252)
12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
# WARNING: Unable to attach Serviceability Agent. You can try again with escalated privileges. Two options: a) use -Djol.tryWithSudo=true to try with sudo; b) echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
com.xxl.job.admin.mytest.MyObject object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) e8 29 c3 0a (11101000 00101001 11000011 00001010) (180562408)
4 4 (object header) 03 00 00 00 (00000011 00000000 00000000 00000000) (3)
8 4 (object header) 44 c1 00 f8 (01000100 11000001 00000000 11111000) (-134168252)
12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
(6) Heavyweight lock
When lightweight locks keep spinning, you need to consider whether using heavyweight locks can improve performance.
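import org.openjdk.jol.info.ClassLayout;

public class ObjectMarkWordDemo {
    public static void main(String[] args) throws InterruptedException {
        Object abc = new Object();
        Thread[] threads = new Thread[2];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                synchronized (abc) {
                    System.out.println("Heavyweight lock: " + ClassLayout.parseInstance(abc).toPrintable());
                }
            });
            // Start both threads before joining so that they actually contend for the lock
            threads[i].start();
        }
        // join() must come after start(); on a thread that has not started it returns immediately
        for (Thread thread : threads) {
            thread.join();
        }
    }
}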
(7) Lock coarsening
Similar to the following example:
public class ObjectMarkWordDemo {
    Object abc = new Object();

    public void show() {
        synchronized (abc) {
            // complex operation
        }
        synchronized (abc) {
            // complex operation
        }
        synchronized (abc) {
            // complex operation
        }
    }
}
The same lock is used every time and the gaps between the blocks are very short. The JVM optimizes this by merging the locks into one larger lock, which is called lock coarsening and improves program performance.
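Conceptually, the coarsened form is equivalent to writing a single synchronized block by hand. The sketch below shows both shapes side by side; the class name CoarseningSketch and the counter field are illustrative, and whether the JIT actually merges the blocks at run time depends on the JVM and cannot be observed from this code.

```java
final class CoarseningSketch {
    private final Object lock = new Object();
    private int counter;

    // Three adjacent blocks on the same lock: three acquire/release pairs as written.
    void fineGrained() {
        synchronized (lock) {
            counter++;
        }
        synchronized (lock) {
            counter++;
        }
        synchronized (lock) {
            counter++;
        }
    }

    // What the merged version looks like: one acquire/release pair around all the work.
    void coarsened() {
        synchronized (lock) {
            counter++;
            counter++;
            counter++;
        }
    }

    int counter() {
        synchronized (lock) {
            return counter;
        }
    }

    public static void main(String[] args) {
        CoarseningSketch a = new CoarseningSketch();
        CoarseningSketch b = new CoarseningSketch();
        a.fineGrained();
        b.coarsened();
        System.out.println(a.counter() + " " + b.counter());   // prints "3 3"
    }
}
```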
(8) Lock elimination
Similar to the following example:
public class ObjectMarkWordDemo {
    public void show() {
        // The lock object is created on every call and never leaves this method
        Object abc = new Object();
        synchronized (abc) {
            // complex operation
        }
    }
}
The show() method locks on every call, but the lock object is local to the method and never visible to another thread, so the lock is meaningless. The JVM therefore optimizes it away to improve program performance.
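A familiar everyday case of the same pattern is a method-local StringBuffer, whose append method is synchronized. The sketch below contrasts it with the lock-free StringBuilder form that the optimized code behaves like. The class and method names are made up for this example, and the elimination itself is assumed to rely on the JIT's escape analysis (HotSpot flags -XX:+DoEscapeAnalysis and -XX:+EliminateLocks), so it is not guaranteed on every JVM or in interpreted code.

```java
final class LockEliminationSketch {
    // Every append() locks sb, but sb never escapes this method,
    // so no other thread can ever contend for that lock.
    static String withLocalLock(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a).append(b);
        return sb.toString();
    }

    // The unsynchronized equivalent: same result, no locking at all.
    static String withoutLock(String a, String b) {
        StringBuilder sb = new StringBuilder();
        sb.append(a).append(b);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withLocalLock("lock", "-free"));   // prints "lock-free"
        System.out.println(withoutLock("lock", "-free"));     // prints "lock-free"
    }
}
```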
3. Commonly used commands
Set JVM heap size
-Xms10m -Xmx10m
Query which flags the JVM was started with
java -XX:+PrintCommandLineFlags -version
Turn off compression configuration of class pointers in object headers
-XX:-UseCompressedClassPointers
Check whether biased locking is on
java -XX:+PrintFlagsInitial | grep BiasedLock
Output:
intx BiasedLockingBulkRebiasThreshold = 20 {product}
intx BiasedLockingBulkRevokeThreshold = 40 {product}
intx BiasedLockingDecayTime = 25000 {product}
// Once enabled, biased locking has a default delay of 4 s. Keep this in mind when testing; you can set the value to 0 to see the effect more easily.
intx BiasedLockingStartupDelay = 4000 {product}
bool TraceBiasedLocking = false {product}
// The JVM enables biased locking by default
bool UseBiasedLocking = true {product}
Turn on the biased lock setting
-XX:+UseBiasedLocking -XX:BiasedLockingStartupDelay=0