What is a synchronized heavyweight lock

Today we continue to learn the synchronized upgrade process, and there is only the last step left: lightweight lock -> heavyweight lock.

Through today's content, I hope to help you answer every question about synchronized, with the exception of lock coarsening, lock elimination, and Java 8's optimizations of synchronized.

Acquire heavyweight locks

At the end of "Demystify the upgrade of biased locks from the source code", we saw that if there is contention in synchronizer#slow_enter, it calls the ObjectSynchronizer::inflate method to upgrade (inflate) the lightweight lock.

void ObjectSynchronizer::slow_enter(Handle obj, BasicLock* lock, TRAPS) {
......
ObjectSynchronizer::inflate(THREAD, obj(), inflate_cause_monitor_enter)->enter(THREAD);
}

ObjectSynchronizer::inflate returns the heavyweight lock, an ObjectMonitor, and the ObjectMonitor::enter method is then executed on it.

Tips:

This method was mentioned in "8 questions you must know about threads (middle)";
The topic here is lock upgrading (inflation), but ObjectSynchronizer::inflate itself is not the focus, so its code analysis is left to the heavyweight lock source code analysis.

Lock structure

Before looking at the logic of ObjectMonitor::enter, first look at the structure of ObjectMonitor:

class ObjectMonitor {
private:
// Saves the markOop of the Object associated with this ObjectMonitor
volatile markOop   _header;
// The Object associated with this ObjectMonitor
void*     volatile _object;
protected:
// Owner of the ObjectMonitor
void *  volatile _owner;
// Recursion count
volatile intptr_t  _recursions;
// Queue of waiting threads: moved in from cxq or woken by Object.notify
ObjectWaiter * volatile _EntryList;
private:
// Contention queue
ObjectWaiter * volatile _cxq;
// Maintenance thread of the ObjectMonitor
Thread * volatile _Responsible;
protected:
// Queue of suspended threads (those that called Object.wait)
ObjectWaiter * volatile _WaitSet;
};

The _header field stores the markOop of the Object. Why? After the lock is upgraded, the object's mark word holds the pointer to the ObjectMonitor, so there is no room left in the object for its original markOop. It is saved in _header so that the object can be restored to its pre-lock state on exit.

Tips:

In fact, BasicLock also stores the markOop of the object;
The waiting threads in EntryList were either moved in from cxq or woken by Object.notify but have not yet run.

Implementation of reentrancy

The ObjectMonitor#enter method can be divided into three parts. The first covers the scenarios where the competition succeeds or the lock is re-entered:

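// Get the current thread, Self
Thread * const Self = THREAD;

// Try to grab the lock with CAS; on failure the current _owner is returned
void * cur = Atomic::cmpxchg(Self, &_owner, (void*)NULL);

if (cur == NULL) {
	// The CAS succeeded, return directly
	return;
}

// The CAS failed
// Heavyweight lock reentry
if (cur == Self) {
	// Recursion count + 1
	_recursions++;
	return;
}

// Did the current thread hold the lightweight lock?
// This can be seen as a special kind of reentry
if (Self->is_lock_owned ((address)cur)) {
	// Set the recursion counter to 1
	_recursions = 1;
	_owner = Self;
	return;
}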

Both the reentry and the upgrade scenarios update _recursions. _recursions records how many times the ObjectMonitor has been entered, and the same number of exit operations must be performed before the lock is actually released.
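To make the counting concrete, here is a minimal sketch of the same idea in standalone C++. ToyMonitor, try_enter and exit are hypothetical names of mine, not HotSpot code, and the lightweight-lock upgrade branch is left out. It only shows the CAS on the owner plus the recursion counter.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ObjectMonitor; a thread is identified by a pointer.
struct ToyMonitor {
    std::atomic<void*> owner{nullptr};
    std::intptr_t recursions = 0;  // only ever touched by the current owner

    // Returns true if `self` holds the monitor afterwards,
    // false if it would have to queue and park.
    bool try_enter(void* self) {
        void* expected = nullptr;
        // CAS the owner from null to self: the uncontended case.
        if (owner.compare_exchange_strong(expected, self)) {
            return true;
        }
        // The CAS failed, so `expected` now holds the current owner.
        if (expected == self) {
            ++recursions;  // reentry: just count it
            return true;
        }
        return false;
    }

    // Returns true only when the monitor is really released.
    bool exit(void* self) {
        assert(owner.load() == self);
        if (recursions != 0) {
            --recursions;  // undo one level of reentry
            return false;
        }
        owner.store(nullptr, std::memory_order_release);
        return true;
    }
};
```

Entering twice from the same thread leaves recursions at 1, so it takes two exits before another thread can get in.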

Adaptive spin

All of the above are scenarios where the lock is acquired successfully. What happens when the competition fails? Let's look at adaptive spinning, ObjectMonitor's second-to-last attempt to stay "lightweight":

// Try to compete for the lock by spinning
Self->_Stalled = intptr_t(this);
if (Knob_SpinEarly && TrySpin (Self) > 0) {
	Self->_Stalled = 0;
	return;
}

The ObjectMonitor#TrySpin method implements adaptive spinning. Introduced in Java 1.6, it removed the fixed default spin count and left the number of spins for the JVM to decide.

The JVM decides based on how recent spins on the lock went. If a spin has just succeeded and the thread holding the lock is running, the JVM allows another spin attempt. If spinning on the lock fails frequently, the JVM skips the spin phase altogether.
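The feedback loop described above can be sketched roughly as follows. This is a toy model rather than the TrySpin source: AdaptiveSpinner and the three constants are assumptions of mine, chosen only to show a spin budget that grows after a success and shrinks after a failure.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>

// Hypothetical tuning values, not taken from HotSpot.
constexpr int kSpinLimit = 5000;
constexpr int kBonus = 100;
constexpr int kPenalty = 200;

struct AdaptiveSpinner {
    int budget = 1000;  // how many iterations the next spin may use

    // Spins for at most `budget` iterations, trying to CAS `owner`
    // from null to `self`. A budget of 0 means spinning is skipped.
    bool try_spin(std::atomic<void*>& owner, void* self) {
        assert(self != nullptr);
        for (int i = 0; i < budget; ++i) {
            void* expected = nullptr;
            if (owner.load(std::memory_order_relaxed) == nullptr &&
                owner.compare_exchange_strong(expected, self)) {
                // Spinning paid off: allow a little more next time.
                budget = std::min(budget + kBonus, kSpinLimit);
                return true;
            }
        }
        // Spinning was futile: allow less next time, down to none at all.
        budget = std::max(budget - kPenalty, 0);
        return false;
    }
};
```

In this sketch a budget that reaches 0 never recovers. A real implementation needs some way to start spinning again, which is omitted here.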

Tips:

The source code analysis of adaptive spinning is left to the heavyweight lock source code analysis;
ObjectMonitor#TryLock is very simple, and the key technique is still CAS.

Implementation of mutual exclusion

So far, both CAS and spin are technologies that have appeared in biased locks and lightweight locks. Why does ObjectMonitor have a "heavyweight" reputation?

Finally, we reach the scenario where the competition fails:

// The code that changes the current thread's state is omitted here
for (;;) {
	EnterI(THREAD);
}

In fact, after entering ObjectMonitor#EnterI, the "lightweight" ways of locking are still tried first:

void ObjectMonitor::EnterI(TRAPS) {
	Thread * const Self = THREAD;

	// Try the cheap paths first: one CAS attempt, then a spin
	if (TryLock (Self) > 0) {
		return;
	}

	if (TrySpin (Self) > 0) {
		return;
	}
	......
}

Next comes the genuinely heavyweight part:

// Wrap the current thread (Self) in an ObjectWaiter node
ObjectWaiter node(Self);
Self->_ParkEvent->reset();
node._prev   = (ObjectWaiter *) 0xBAD;
node.TState  = ObjectWaiter::TS_CXQ;

// Insert the node at the head of cxq
ObjectWaiter * nxt;
for (;;) {
	node._next = nxt = _cxq;
	if (Atomic::cmpxchg(&node, &_cxq, nxt) == nxt)
		break;

	// To reduce the number of insertions at the head of cxq,
	// check whether the lock can be acquired directly
	if (TryLock (Self) > 0) {
		return;
	}
}

The logic is clear at a glance: wrap the current thread in an ObjectWaiter object and push it onto the head of the cxq queue. Execution then continues with:
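The head insertion is the classic lock-free stack push. A minimal sketch with std::atomic, using hypothetical Waiter and ContentionQueue types in place of ObjectWaiter and _cxq, might look like this:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for ObjectWaiter.
struct Waiter {
    int thread_id;
    Waiter* next;
    explicit Waiter(int id) : thread_id(id), next(nullptr) {}
};

// Hypothetical stand-in for the _cxq field: a singly linked list
// that only ever grows at the head.
struct ContentionQueue {
    std::atomic<Waiter*> head{nullptr};

    void push(Waiter* node) {
        assert(node != nullptr);
        Waiter* observed = head.load();
        do {
            // Link the node in front of the head we last observed.
            node->next = observed;
            // Publish the node only if the head is still `observed`.
            // On failure `observed` is refreshed and we retry.
        } while (!head.compare_exchange_weak(observed, node));
    }
};
```

Because every arrival lands at the head, the most recent thread is the first one in the list. This matters for the wake-up order discussed later.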

// Set the current thread (Self) as the maintenance thread (_Responsible) of this ObjectMonitor
// SyncFlags defaults to 0 and can be set with -XX:SyncFlags
if ((SyncFlags & 16) == 0 && nxt == NULL && _EntryList == NULL) {
	Atomic::replace_if_null(Self, &_Responsible);
}

// The recheck interval starts at 1 ms
int recheckInterval = 1;

for (;;) {
	// Try to set _Responsible
	if ((SyncFlags & 2) && _Responsible == NULL) {
		Atomic::replace_if_null(Self, &_Responsible);
	}
	// Park the current thread
	if (_Responsible == Self || (SyncFlags & 1)) {
		Self->_ParkEvent->park((jlong) recheckInterval);
		// Simple backoff: multiply the interval by 8, up to the maximum
		recheckInterval *= 8;
		if (recheckInterval > MAX_RECHECK_INTERVAL) {
			recheckInterval = MAX_RECHECK_INTERVAL;
		}
	} else {
		Self->_ParkEvent->park();
	}

	// Try to acquire the lock
	if (TryLock(Self) > 0)
		break;
	if ((Knob_SpinAfterFutile & 1) && TrySpin(Self) > 0)
		break;

	if (_succ == Self)
		_succ = NULL;
}

The logic is not complicated: the current thread parks repeatedly and tries to acquire the lock each time it is woken up. Pay attention to the setting of -XX:SyncFlags:

When SyncFlags == 0, synchronized suspends the thread directly;
When SyncFlags == 1, synchronized suspends the thread for a specified time.

The former stays suspended until another thread wakes it up, while the latter wakes up automatically after the specified time. As the condition in the code shows, the thread recorded in _Responsible also uses the timed park, even when SyncFlags == 0.
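The two kinds of park and the backoff can be sketched with a mutex and a condition variable. ToyParker and next_recheck_interval are hypothetical names, and the 1000 ms cap is an assumed value for MAX_RECHECK_INTERVAL, which the excerpt above does not show.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for ParkEvent.
class ToyParker {
    std::mutex m_;
    std::condition_variable cv_;
    bool permit_ = false;

public:
    // Untimed park: blocks until some other thread calls unpark().
    void park() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return permit_; });
        permit_ = false;
    }

    // Timed park: returns true if woken by unpark(), false on timeout.
    bool park(long millis) {
        assert(millis > 0);
        std::unique_lock<std::mutex> lk(m_);
        bool woken = cv_.wait_for(lk, std::chrono::milliseconds(millis),
                                  [this] { return permit_; });
        permit_ = false;
        return woken;
    }

    void unpark() {
        {
            std::lock_guard<std::mutex> lk(m_);
            permit_ = true;
        }
        cv_.notify_one();
    }
};

// Assumed cap; the excerpt does not give the value of MAX_RECHECK_INTERVAL.
constexpr long kMaxRecheckMillis = 1000;

// The backoff step from the loop: multiply by 8 and clamp to the cap.
long next_recheck_interval(long current) {
    long next = current * 8;
    return next > kMaxRecheckMillis ? kMaxRecheckMillis : next;
}
```

Under that assumed cap the timed park waits 1, 8, 64, 512 and then 1000 ms, so a thread that keeps losing rechecks the lock less and less often.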

Tips: I covered park and ParkEvent in "8 questions you must know about threads (middle)". Underneath, they are implemented with pthread_cond_wait and pthread_cond_timedwait.

Release heavyweight locks

The source code and comments for releasing heavyweight locks are very long, so we omit most of them and look only at the key parts.

Exiting a reentrant lock

We know that reentry simply keeps incrementing _recursions, so exiting from a reentry is very simple:

void ObjectMonitor::exit(bool not_suspended, TRAPS) {
	Thread * const Self = THREAD;

	// On the second acquisition, _recursions == 1
	// In the reentry scenario, only one level of reentry has to be undone
	if (_recursions != 0) {
		_recursions--;
		return;
	}
	.....
}

Each exit simply decrements _recursions.

Release and write

In the JVM's implementation, when the current thread holds the lock and has not re-entered it, the thread first releases the lock, then makes its writes visible in memory, and finally takes on the job of waking up the next thread. Let's first look at the logic of releasing and writing to memory:

// Clear the lock owner
OrderAccess::release_store(&_owner, (void*)NULL);

// StoreLoad barrier
OrderAccess::storeload();

// Exit directly if there are no competing threads (or _succ is already set)
if ((intptr_t(_EntryList)|intptr_t(_cxq)) == 0 || _succ != NULL) {
	TEVENT(Inflated exit - simple egress);
	return;
}

A StoreLoad barrier applies to a sequence such as the following:

store1;
StoreLoad;
load2;

It guarantees that the write of store1 is visible to all processors before load2 executes.
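Why does the exit path need this? Here is a hedged sketch with standard C++ fences. HandoffModel and its two flags are a toy of mine, not the JVM's data structures. The exiting thread stores "free" and then loads the queue, while the arriving thread stores "queued" and then loads the owner. With a full fence between each store and load, at least one side must see the other's write, so a waiter cannot be left parked behind a lock that is already free.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Toy model: owner == 1 means held, queued == 1 means a waiter is enqueued.
struct HandoffModel {
    std::atomic<int> owner{1};
    std::atomic<int> queued{0};

    // Exiting thread. Returns true if it saw the waiter.
    bool exit_sees_waiter() {
        owner.store(0, std::memory_order_release);            // store1
        std::atomic_thread_fence(std::memory_order_seq_cst);  // StoreLoad
        return queued.load(std::memory_order_relaxed) == 1;   // load2
    }

    // Arriving thread. Returns true if it saw the lock as free.
    bool waiter_sees_free() {
        queued.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        return owner.load(std::memory_order_relaxed) == 0;
    }
};

// Runs both sides concurrently. Returns true if neither side noticed the
// other, i.e. the waiter would park with nobody left to wake it.
bool stranded_once() {
    HandoffModel m;
    bool saw_waiter = false;
    bool saw_free = false;
    std::thread exiter([&] { saw_waiter = m.exit_sees_waiter(); });
    std::thread waiter([&] { saw_free = m.waiter_sees_free(); });
    exiter.join();
    waiter.join();
    return !saw_waiter && !saw_free;
}

// With the two full fences in place, stranding must never be observed.
void check_no_stranding(int rounds) {
    for (int i = 0; i < rounds; ++i) {
        assert(!stranded_once());
    }
}
```

Without the fences, each store could still sit in a store buffer while the following load runs, and both sides could miss each other.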

Tips: Memory barriers are explained in detail in the article on volatile.

Wake-up strategy

After releasing the lock and writing to memory, all that remains is to wake up the next thread and "hand over" the right to use the lock. But there are two "waiting queues", cxq and EntryList. Which one should the thread be woken from?

Up to Java 11, different strategies were selected according to QMode:

QMode == 0, the default strategy, put cxq into EntryList;
QMode == 1, flip cxq, and put into EntryList;
QMode == 2, wake up directly from cxq;
QMode == 3, move cxq to the end of EntryList;
QMode == 4, move cxq to the head of EntryList.
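To see how the modes change the order, here is a toy model built on std::deque. It follows the five descriptions above and adds two assumptions of mine: the front of cxq is the most recent arrival (threads are pushed at the head), and in modes 0 and 1 cxq is drained only when EntryList is empty. next_to_wake is a hypothetical helper, not a HotSpot function.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <string>

// Returns the name of the thread that would be woken next, or "" if none.
// front() of each deque is the head of the corresponding queue.
std::string next_to_wake(int qmode,
                         std::deque<std::string>& cxq,
                         std::deque<std::string>& entry) {
    assert(qmode >= 0 && qmode <= 4);
    if (qmode == 2 && !cxq.empty()) {
        return cxq.front();  // wake directly from cxq
    }
    if (qmode == 3 && !cxq.empty()) {
        // Move cxq to the end of EntryList.
        entry.insert(entry.end(), cxq.begin(), cxq.end());
        cxq.clear();
    } else if (qmode == 4 && !cxq.empty()) {
        // Move cxq to the head of EntryList.
        entry.insert(entry.begin(), cxq.begin(), cxq.end());
        cxq.clear();
    }
    if (entry.empty()) {
        // Assumption: cxq becomes the new EntryList only when EntryList
        // is empty. QMode 1 flips it first.
        if (qmode == 1) {
            std::reverse(cxq.begin(), cxq.end());
        }
        entry.swap(cxq);
    }
    return entry.empty() ? std::string() : entry.front();
}
```

Suppose threads T1, T2, T3 arrived in that order, so cxq is T3, T2, T1 from the head, and EntryList is empty. Mode 0 then wakes T3, the latest arrival, while mode 1 wakes T1.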

Different strategies lead to different wake-up sequences. Now you know why synchronized is an unfair lock, right?

The ObjectMonitor#ExitEpilog method is very simple. It calls the unpark method that corresponds to park, so I won't say much about it here.

Tips: Java 12's ObjectMonitor removes QMode, which means there is only one wake-up strategy.

Summary

Let's sum up heavyweight locks. The heavyweight lock of synchronized is ObjectMonitor, and the key technologies it uses are CAS and park. Compared with mutex#Monitor, it is the same in essence, since both wrap park, but ObjectMonitor is a complex implementation with a lot of optimization.

We saw how heavyweight locks achieve reentrancy, and how the wake-up strategy leads to "unfairness". We also often say that synchronized guarantees atomicity, ordering and visibility. How is that achieved?

You can think about this question first. The next article will give a complete summary and wrap up synchronized.

If this article was helpful to you, please give it a like and your support. If there are any mistakes in the article, please point them out. Finally, everyone is welcome to follow Wang Youzhi, a financial man. See you next time!