Investigating: does signaling a condition variable require holding the mutex?

Waiting Process

pthread_mutex_lock(&mutex);
while (condition == FALSE)
    pthread_cond_wait(&cond, &mutex);
pthread_mutex_unlock(&mutex);

Signaling Process

pthread_mutex_lock(&mutex);
condition = TRUE;
pthread_cond_signal(&cond);
pthread_mutex_unlock(&mutex); // unlock after signal, why not before???
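
The two fragments above can be combined into a small compilable sketch. The function names `waiter` and `run_signal_under_lock`, and the use of a `bool` flag for `condition`, are assumptions made for illustration and do not come from the original snippets.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond  = PTHREAD_COND_INITIALIZER;
static bool condition = false;   /* the predicate, protected by mutex */

/* Waiting side: re-check the predicate in a loop to tolerate spurious wakeups. */
static void *waiter(void *arg)
{
    bool *observed = arg;

    pthread_mutex_lock(&mutex);
    while (!condition)
        pthread_cond_wait(&cond, &mutex);   /* releases mutex while blocked */
    *observed = condition;
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Signaling side: set the predicate and signal while still holding the mutex.
 * Returns 1 if the waiter saw the predicate as true, 0 otherwise. */
int run_signal_under_lock(void)
{
    pthread_t tid;
    bool observed = false;

    condition = false;               /* no other thread exists yet */
    int rc = pthread_create(&tid, NULL, waiter, &observed);
    assert(rc == 0);

    pthread_mutex_lock(&mutex);
    condition = true;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mutex);    /* unlock after signal */

    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    return observed ? 1 : 0;
}
```

Calling `run_signal_under_lock()` from `main` exercises both sides. If the signaler happens to run first, the waiter finds `condition` already true and never blocks, which is why the predicate is needed in addition to the condition variable.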

Reference: POSIX Standard - pthread_cond_wait - Features of Mutexes and Condition Variables

Features of Mutexes and Condition Variables

It had been suggested that the mutex acquisition and release be decoupled from condition wait. This was rejected because it is the combined nature of the operation that, in fact, facilitates realtime implementations. Those implementations can atomically move a high-priority thread between the condition variable and the mutex in a manner that is transparent to the caller. This can prevent extra context switches and provide more deterministic acquisition of a mutex when the waiting thread is signaled. Thus, fairness and priority issues can be dealt with directly by the scheduling discipline. Furthermore, the current condition wait operation matches existing practice.

Scheduling Behavior of Mutexes and Condition Variables

Synchronization primitives that attempt to interfere with scheduling policy by specifying an ordering rule are considered undesirable. Threads waiting on mutexes and condition variables are selected to proceed in an order dependent upon the scheduling policy rather than in some fixed order (for example, FIFO or priority). Thus, the scheduling policy determines which thread(s) are awakened and allowed to proceed.

Conclusion: unlock after signal. (But what exactly is the "scheduling policy"??)

Reference: PTHREAD_COND_BROADCAST(3P) and PTHREAD_COND_TIMEDWAIT(3P), POSIX Programmer’s Manual

The pthread_cond_broadcast() or pthread_cond_signal() functions may be called by a thread whether or not it currently owns the mutex that threads calling pthread_cond_wait() or pthread_cond_timedwait() have associated with the condition variable during their waits; however, if predictable scheduling behavior is required, then that mutex shall be locked by the thread calling pthread_cond_broadcast() or pthread_cond_signal().

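Conclusion: holding the mutex is not mandatory, but if predictable scheduling behavior is required, the signal must be sent while the mutex is held, that is, unlock after signal. (What does "predictable scheduling" mean here?? FIFO?? Priority??)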

Reference: Calling pthread_cond_signal without locking mutex - Stack Overflow. In the comments under the first answer, "R…" (apparently the author of a pthreads implementation) says:

Unlocking after ensures that a lower-priority thread won’t be able to steal the event from a higher-priority one, but if you’re not using priorities, unlocking before the signal will actually reduce the number of system calls/context switches and improve overall performance.

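Conclusion:

Unlock after signal: ensures that a lower-priority thread cannot steal the event from a higher-priority one.
Unlock before signal: reduces kernel overhead (system calls, context switches) and improves performance.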
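
As a rough illustration of the unlock-before-signal variant, the sketch below runs one producer against several consumers. The predicate is still modified under the mutex; only the signal is moved after the unlock. A consumer whose item was "stolen" by another consumer simply re-checks the predicate and goes back to waiting. All names (`q_mutex`, `pending`, `run_unlock_before_signal`, and so on) are hypothetical, and the sketch does not involve thread priorities.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_CONSUMERS 3
#define NUM_ITEMS     1000

static pthread_mutex_t q_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  q_cond  = PTHREAD_COND_INITIALIZER;
static int pending;    /* items produced but not yet consumed */
static int finished;   /* set once the producer is done */
static int consumed;   /* total items taken by all consumers */

static void *consumer(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&q_mutex);
        /* Re-check in a loop: a woken consumer may find that another
         * consumer already took the item, and then just waits again. */
        while (pending == 0 && !finished)
            pthread_cond_wait(&q_cond, &q_mutex);
        if (pending == 0) {              /* finished and nothing left */
            pthread_mutex_unlock(&q_mutex);
            return NULL;
        }
        pending--;
        consumed++;
        pthread_mutex_unlock(&q_mutex);
    }
}

/* Returns the number of items consumed; expected to equal NUM_ITEMS. */
int run_unlock_before_signal(void)
{
    pthread_t tids[NUM_CONSUMERS];

    pending = finished = consumed = 0;
    for (int i = 0; i < NUM_CONSUMERS; i++) {
        int rc = pthread_create(&tids[i], NULL, consumer, NULL);
        assert(rc == 0);
    }

    for (int i = 0; i < NUM_ITEMS; i++) {
        pthread_mutex_lock(&q_mutex);
        pending++;                       /* predicate changes under the mutex */
        pthread_mutex_unlock(&q_mutex);  /* unlock first ... */
        pthread_cond_signal(&q_cond);    /* ... then signal */
    }

    pthread_mutex_lock(&q_mutex);
    finished = 1;
    pthread_mutex_unlock(&q_mutex);
    pthread_cond_broadcast(&q_cond);     /* wake all consumers so they can exit */

    for (int i = 0; i < NUM_CONSUMERS; i++)
        pthread_join(tids[i], NULL);
    return consumed;
}
```

Every item is consumed exactly once regardless of which consumer receives which wakeup, because correctness rests on the predicate and not on the signal.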

(Note: "R…" and "David Schwartz" disagree further down in the comments, which makes for an interesting read.)

Reference: std::condition_variable::notify_one() - cppreference

The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock. However, some implementations (in particular many implementations of pthreads) recognize this situation and avoid this “hurry up and wait” scenario by transferring the waiting thread from the condition variable’s queue directly to the queue of the mutex within the notify call, without waking it up.
Notifying while under the lock may nevertheless be necessary when precise scheduling of events is required, e.g. if the waiting thread would exit the program if the condition is satisfied, causing destruction of the notifying thread’s condition_variable. A spurious wakeup after mutex unlock but before notify would result in notify called on a destroyed object.

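Conclusion:

When precise scheduling of events is required (maybe this refers to priorities? the example cppreference gives is about object lifetime), unlock after signal.
Signaling while holding the mutex is a pessimization, although some implementations (many pthreads implementations in particular) move the waiter directly from the condition variable queue to the mutex queue without waking it (wait morphing).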
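
The lifetime hazard described by cppreference can be sketched in C with a one-shot event that the waiter destroys as soon as it wakes. The `event_t` type and the function names are hypothetical. The sketch relies on the POSIX requirement that a mutex may be destroyed immediately after it is unlocked, which is an assumption about the implementation in use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical one-shot event; the waiter owns it and frees it after waking. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             done;
} event_t;

static void *notifier(void *arg)
{
    event_t *ev = arg;

    pthread_mutex_lock(&ev->mutex);
    ev->done = 1;
    /* Signal while holding the lock: the waiter cannot get past its
     * predicate check and destroy the condition variable until the
     * mutex is released below, so cond is still alive here. */
    pthread_cond_signal(&ev->cond);
    pthread_mutex_unlock(&ev->mutex);
    /* ev must not be touched after this point. */
    return NULL;
}

/* Returns 1 once the event has been observed and destroyed. */
int run_one_shot_event(void)
{
    pthread_t tid;
    event_t *ev = malloc(sizeof *ev);
    assert(ev != NULL);

    pthread_mutex_init(&ev->mutex, NULL);
    pthread_cond_init(&ev->cond, NULL);
    ev->done = 0;

    int rc = pthread_create(&tid, NULL, notifier, ev);
    assert(rc == 0);

    pthread_mutex_lock(&ev->mutex);
    while (!ev->done)
        pthread_cond_wait(&ev->cond, &ev->mutex);
    pthread_mutex_unlock(&ev->mutex);

    /* The notifier's signal call returned before it released the mutex,
     * so nothing uses the condition variable any more. */
    pthread_cond_destroy(&ev->cond);
    pthread_mutex_destroy(&ev->mutex);
    free(ev);

    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    return 1;
}
```

If `notifier` unlocked before signaling, the waiter could observe `done == 1` (after a spurious wakeup, or without ever blocking), destroy and free the event, and leave `pthread_cond_signal` operating on freed memory.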

Reference: Operating Systems: Three Easy Pieces - Chapter 30, Condition Variables

TIP: ALWAYS HOLD THE LOCK WHILE SIGNALING

Although it is strictly not necessary in all cases, it is likely simplest and best to hold the lock while signaling when using condition variables. The example above shows a case where you must hold the lock for correctness; however, there are some other cases where it is likely OK not to, but probably is something you should avoid. Thus, for simplicity, hold the lock when calling signal.

Conclusion: unlock after signal.

Reference: "条件变量用例–解锁与signal的顺序问题" (Condition variable use cases: the order of unlock and signal) - CSDN

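Context switches.
Predictability of scheduling behavior.
Wait morphing: when the lock is held, the thread is moved directly from the condition variable queue to the mutex queue, with no context switch.
Unlock before signal: the resource may be stolen by another thread. Because the design already accounts for spurious wakeups (the predicate is re-checked in a loop), a stolen resource does not cause a logic error.
Priority inversion: a high-priority thread is waiting; a low-priority thread unlocks first and is preempted by a medium-priority thread before it can signal, so the high-priority thread ends up blocked behind the medium-priority one.
Signal before unlock: assuming the lock uses the priority ceiling or priority inheritance protocol (see Programming with POSIX Threads, sections 5.5.5.1 and 5.5.5.2), it is guaranteed that once the signaling thread T1 unlocks, the waiting thread T2 is scheduled immediately.
Unlock first: someone may already have destroyed the condition variable by the time the signal is sent.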
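
The priority inheritance protocol mentioned above is selected through a mutex attribute. The following is a minimal sketch using the POSIX call `pthread_mutexattr_setprotocol` with `PTHREAD_PRIO_INHERIT`. The helper names `init_prio_inherit_mutex` and `demo_prio_inherit` are hypothetical, and whether the protocol is supported depends on the platform, so the helpers return the error number instead of assuming success.

```c
#define _POSIX_C_SOURCE 200809L   /* expose PTHREAD_PRIO_INHERIT under strict modes */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Initialize *m with the priority-inheritance protocol.
 * Returns 0 on success or an error number from the pthread calls. */
int init_prio_inherit_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;

    assert(m != NULL);
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Create, lock, unlock and destroy such a mutex.
 * Returns 0 on success or the first error number encountered. */
int demo_prio_inherit(void)
{
    pthread_mutex_t m;

    int rc = init_prio_inherit_mutex(&m);
    if (rc != 0)
        return rc;                /* the platform may not support the protocol */

    rc = pthread_mutex_lock(&m);
    if (rc == 0)
        rc = pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

The protocol only has a visible effect when threads run under a realtime scheduling policy with distinct priorities; under the default policy the mutex behaves like an ordinary one.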

Reference: Where does the wait queue for threads lies in POSIX pthread mutex lock and unlock?

futex: the kernel mechanism that maintains the wait queue behind a mutex.
futex.c - kernel v2.6.39.4

Reference: Synchronization primitives in the Linux kernel. Part 4.

Reference: basic question about concurrency - Google Forum - Dave Butenhof (author of Programming with POSIX Threads)

The view of the author of Programming with POSIX Threads:

The problem (from the realtime side) with condition variables is that if you can signal/broadcast without holding the mutex, and any thread currently running can acquire an unlocked mutex and check a predicate without reference to the condition variable, then you can have an indirect priority inversion.
…
…
The quoted statement simply indicates that if the producer continues to hold the mutex while waking consumer thread A, it will be able to run and block on the mutex in priority order before consumer thread B can acquire the mutex. (With wait morphing, the condition variable operation immediately transfers the highest priority waiter to the mutex with minimal overhead; but on any strict realtime implementation the act of unblocking a high priority thread will immediately preempt a lower priority thread and allow it to block on the mutex immediately.) Now, consumer thread B may not contend for the mutex before the unlock, in which case the “right thread” goes next, or it may contend anywhere during this sequence and be sorted into the proper priority order along with consumer thread A; in either case, the higher priority thread will have the first chance at the queue.
That’s what “predictable scheduling behavior” means in this context. In more pragmatic terms, I’m pretty sure this means that someone on the realtime side of the virtual aisle thought it was a bad idea to recommend signaling “outside” the mutex, and the working group compromised by adding that statement, which was sufficient to ward off (or possibly to remove) a formal objection to the draft standard. This is how a lot of the text in the standard grew.

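Look at the waiting thread's code at the top. If the mutex is not held when signaling, a newly arriving waiter can grab the lock and check condition (which is already true) before the signaled thread gets to run. If the newcomer has a lower priority than the signaled thread, the result is a priority inversion.
Conversely, if the application does not care about thread priorities (for example, because a task queue manages priorities itself), unlocking before signal can improve efficiency.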

(Note: the thread also covers a fair amount of history, which is interesting.)

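Summary:

In realtime scenarios where predictable scheduling behavior matters, you should (effectively must) unlock only after signaling.
If the application does not care about thread priorities or execution order, it can unlock first.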
