9 Introduction to kernel synchronization

Goal: prevent concurrent access to shared resources from causing system instability.

 

9.1 Critical section and race conditions

Measure: ensure that code in the critical section executes atomically; that is, the operation must not be interrupted before it finishes.

Race condition: the situation in which two threads of execution run simultaneously in the same critical section.

Synchronization: avoiding such concurrency and preventing race conditions.

 

Operations in a critical section must happen completely or not at all; they must never be interrupted partway through.

Locks themselves are implemented using atomic operations.

9.2 Locking

Only one thread can hold a given lock at a time, so only one thread can operate on the protected data (for example, a shared queue) at a time.

9.2.1 Causes of concurrent execution

User space: a user program can be preempted and rescheduled by the scheduler at any time, and the newly scheduled process may share data with the one it preempted.

Kernel space: concurrency arises from interrupts, softirqs and tasklets, kernel preemption, sleeping (blocking) in kernel code, and symmetric multiprocessing.

 

Conditions that require special attention:

  1. If an interrupt occurs while kernel code is manipulating a resource, and the interrupt handler accesses the same resource, it is a bug.
  2. If kernel code can be preempted while it is accessing a shared resource, it is also a bug.
  3. If kernel code sleeps inside a critical section, it is also a bug. (Hy: not a bug if the data is protected by a lock that may be held across sleep.)
  4. Two processors must never access the same shared data at the same time.

 

9.2.2 Know what to protect

Almost without exception, any data that may be accessed concurrently needs to be protected.

No need to lock:

  1. The local data of the execution thread.
  2. Data that is only accessed by a specific process. (Because the process only executes on one processor at a time)

 

Most kernel data structures need to be locked!

A rule of thumb for locking: if any other thread of execution can access the data, add some form of lock to the data; if anything else can see it, it must be locked. Remember: lock the data, not the code.
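A minimal userspace sketch of "lock the data, not the code", using pthreads instead of kernel locks; the struct and function names are illustrative:

```c
#include <pthread.h>

/* A shared counter with its lock embedded in the data it protects. */
struct counted_list {
    pthread_mutex_t lock;  /* protects 'count' (and, in real code, the list) */
    int count;
};

void list_add(struct counted_list *l)
{
    pthread_mutex_lock(&l->lock);    /* take the data's own lock */
    l->count++;                      /* ... critical section ... */
    pthread_mutex_unlock(&l->lock);
}
```

Because the lock lives inside the structure, every user of the data naturally knows which lock guards it.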

9.3 Deadlock

Rules to avoid deadlock:

If a function acquires locks in a certain order, then any other function must acquire these locks (or a subset of them) in the same order.

Whenever multiple locks are acquired in a nested fashion, they must always be acquired in the same order.
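The ordering rule can be sketched in userspace C with pthreads (lock names are illustrative):

```c
#include <pthread.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static int shared;

/* Every code path takes lock_a before lock_b.  A thread that took
 * lock_b first while another held lock_a could deadlock. */
void update_both(int v)
{
    pthread_mutex_lock(&lock_a);
    pthread_mutex_lock(&lock_b);
    shared = v;
    pthread_mutex_unlock(&lock_b);  /* release order does not matter */
    pthread_mutex_unlock(&lock_a);
}

int read_both(void)
{
    int v;
    pthread_mutex_lock(&lock_a);    /* same order: a, then b */
    pthread_mutex_lock(&lock_b);
    v = shared;
    pthread_mutex_unlock(&lock_b);
    pthread_mutex_unlock(&lock_a);
    return v;
}
```

Note that only the acquisition order must be consistent; locks may be released in any order.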

 

9.4 Contention and scalability

Lock contention occurs when a lock is already held and other threads are trying to acquire it.

Since the purpose of a lock is to serialize access to a resource, using locks inevitably reduces system performance.

Whether a lock is too coarse or too fine is often a fine line. Under heavy lock contention, overly coarse locks hurt scalability; when contention is light, overly fine locks add overhead and waste. Both degrade system performance. Remember: start with a simple locking scheme, and refine it only when necessary.

10 Kernel synchronization methods

10.1 Atomic operations

Atomic operations are the cornerstone of the other synchronization methods. They guarantee that an instruction executes atomically: its execution cannot be interrupted midway.

Two groups of atomic operation interfaces: one group operates on integers, and the other group operates on individual bits.

Some architectures lack dedicated atomic-operation instructions, but they provide instructions that lock the memory bus for a single operation, ensuring that no other memory-modifying operation can occur at the same time.

10.1.1 Atomic integer operations

On most architectures, reading a single word is itself atomic: a read of a word can never be interleaved with a write to the same word. A read always returns a complete word, observed either entirely before or entirely after the write, never partway through it.

Ordering refers to requirements on the sequence in which reads and writes occur, and is enforced with barrier instructions.
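A userspace analog of the kernel's atomic integer interface, using C11 `<stdatomic.h>` in place of `atomic_t` (function names in the comments are the kernel equivalents; `run_counters` is our own):

```c
#include <stdatomic.h>
#include <pthread.h>

/* Two threads each add 100000 to a shared counter; because every
 * increment is a single atomic read-modify-write, no updates are lost. */
static atomic_int v;               /* like: atomic_t v = ATOMIC_INIT(0); */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++)
        atomic_fetch_add(&v, 1);   /* like atomic_inc(&v) */
    return 0;
}

int run_counters(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, 0, worker, 0);
    pthread_create(&t2, 0, worker, 0);
    pthread_join(t1, 0);
    pthread_join(t2, 0);
    return atomic_load(&v);        /* like atomic_read(&v) */
}
```

With a plain `int` and `v++` instead, the two threads would race and the final count would usually fall short of 200000.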

10.1.2 64-bit atomic operations

For 64-bit atomic integers the kernel provides the atomic64_t type, with a parallel set of atomic64_*() operations.

10.1.3 Atomic bit operations

The kernel also provides non-atomic versions of the bit operations above. The non-atomic names are prefixed with two extra underscores; for example, the non-atomic form of test_bit() is __test_bit().
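Illustrative analogs of the atomic bit operations, built on C11 atomics; the `*_atomic` names are ours, not the kernel's:

```c
#include <stdatomic.h>

/* Atomically set bit 'nr' of *word, like the kernel's set_bit(). */
void set_bit_atomic(int nr, atomic_ulong *word)
{
    atomic_fetch_or(word, 1UL << nr);
}

/* Return the value of bit 'nr', like test_bit(). */
int test_bit_atomic(int nr, atomic_ulong *word)
{
    return (atomic_load(word) >> nr) & 1UL;
}

/* Atomically set bit 'nr' and return its previous value,
 * like test_and_set_bit(): set + read-back in one atomic step. */
int test_and_set_bit_atomic(int nr, atomic_ulong *word)
{
    unsigned long old = atomic_fetch_or(word, 1UL << nr);
    return (old >> nr) & 1UL;
}
```

The test-and-set form is the interesting one: returning the old value in the same atomic step is what lets it implement a lock.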

 

10.2 Spin lock

A spinlock can be held by at most one thread of execution. If a thread tries to acquire a spinlock that is already held, it busy-waits (spins) until the lock becomes available again.

At any given time, a spinlock keeps more than one thread of execution from being inside the critical section simultaneously.

A contended spinlock causes the threads requesting it to spin while waiting for it to become available, which wastes processor time, so a spinlock should not be held for long. Ideally, a spinlock should be held for less time than it takes to complete two context switches.

The original intent of spinlocks: lightweight locking held for a short time.

10.2.1 Spin lock method

```c
DEFINE_SPINLOCK(mr_lock);

spin_lock(&mr_lock);
/* critical section ... */
spin_unlock(&mr_lock);
```

 

Spin locks provide the protection mechanism needed to prevent concurrent access for multiprocessor machines.

When the kernel is compiled for a single processor, spinlocks do not actually lock anything; if kernel preemption is also disabled, spinlocks are compiled out of the kernel entirely.

Spin locks are not recursive!

When using a spinlock in an interrupt handler, local interrupts must be disabled before acquiring the lock (hy: on the ARM architecture the hardware automatically masks interrupts on interrupt entry); otherwise an interrupt handler could interrupt kernel code that holds the lock and then spin forever waiting for it. Note that only interrupts on the current processor need to be disabled: if an interrupt occurs on a different processor, even if its handler spins on the same lock, it does not prevent the lock holder (on the other processor) from eventually releasing the lock.

The kernel provides an interface to disable interrupts while requesting a lock, as follows.

```c
DEFINE_SPINLOCK(mr_lock);
unsigned long flags;

spin_lock_irqsave(&mr_lock, flags);
/* critical section ... */
spin_unlock_irqrestore(&mr_lock, flags);
```

spin_lock_irqsave() saves the current interrupt state, disables local interrupts, and then acquires the specified lock.

```c
DEFINE_SPINLOCK(mr_lock);

spin_lock_irq(&mr_lock);
/* critical section ... */
spin_unlock_irq(&mr_lock);
```

If you know for certain that interrupts are enabled before the lock is taken, you can use spin_lock_irq() and spin_unlock_irq() instead.

 

10.2.2 Other spin lock operations

spin_lock_init() initializes a dynamically allocated spinlock, spin_trylock() attempts to acquire the lock without spinning (returning nonzero on success), and spin_is_locked() reports whether the lock is currently held.

10.2.3 Spin locks and bottom halves

spin_lock_bh() acquires the lock and disables all bottom-half execution; spin_unlock_bh() releases the lock and re-enables bottom halves.

When a bottom half shares data with process context, the shared data must be protected, because a bottom half can preempt process-context code.

| Data shared between | Protection | Why |
| --- | --- | --- |
| Tasklets of the same type | No lock needed | Two tasklets of the same type are never allowed to run simultaneously. |
| Tasklets of different types | Spinlock | Different-type tasklets can run simultaneously on different processors; there is no need to disable bottom halves, because tasklets never preempt one another on the same processor. |
| Softirqs (same or different type) | Spinlock | Softirqs of any type can run simultaneously on different processors, but a softirq never preempts another softirq on the same processor, so bottom halves need not be disabled. |
| Process context and a bottom half | In process context: disable bottom halves, then take the lock | The bottom half can preempt process-context code. |
| Interrupt handler and a bottom half | In the bottom half: disable interrupts, then take the spinlock | The interrupt handler can preempt the bottom half. |
| Work queues | Ordinary locking | Work queues run in process context, so the usual process-context locking rules apply. |

 

10.3 Read-write spin lock

Read-write spinlocks allow any number of readers to hold the lock concurrently, but only one writer, and no readers while a writer holds it.

10.4 Semaphore

Semaphores give better processor utilization than spinlocks, because waiting tasks sleep instead of spinning, but semaphores carry greater overhead than spinlocks.

Code that must synchronize with user space often needs to sleep; in that case a semaphore (or another sleeping lock) is the only option.

A spinlock disables kernel preemption while it is held; a semaphore does not.

10.4.1 Counting semaphores and binary semaphores

A semaphore can allow an arbitrary number of simultaneous lock holders, whereas a spinlock permits at most one task to hold it at a time. In practice, semaphores with a count of one (binary or mutex semaphores) are used almost exclusively. A semaphore whose count is greater than one is called a counting semaphore; it admits multiple threads of execution into the critical region at once and therefore cannot be used to enforce mutual exclusion.

Semaphores support two atomic operations, P() (proberen, to test) and V() (verhogen, to increment), known in the kernel as down() and up().
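A counting semaphore can be sketched with POSIX semaphores on Linux, where sem_wait()/sem_post() play the roles of down()/up(); `counting_demo` is our own illustrative function:

```c
#include <semaphore.h>

/* A counting semaphore initialized to 2 admits two holders at once;
 * initialized to 1 it degenerates to a binary (mutex-like) semaphore. */
int counting_demo(void)
{
    sem_t sem;
    sem_init(&sem, 0, 2);            /* count = 2: two holders allowed */

    int first  = sem_trywait(&sem);  /* 0: down() succeeds */
    int second = sem_trywait(&sem);  /* 0: down() succeeds again */
    int third  = sem_trywait(&sem);  /* -1: count exhausted, would block */

    sem_post(&sem);                  /* up(): release one holder */
    sem_post(&sem);
    sem_destroy(&sem);
    return first + second + third;   /* -1 when behavior is as described */
}
```

sem_trywait() is used instead of sem_wait() so the third attempt fails immediately rather than sleeping, which makes the exhausted count observable.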

10.4.2 Creating and initializing semaphores

10.5 Read-write semaphore

10.6 Mutex

 

10.6.1 Semaphores and Mutexes

Prefer mutexes: choose a mutex over a semaphore unless the mutex's stricter constraints rule it out.

10.6.2 Spin locks and mutexes

Only spinlocks can be used in interrupt context; a mutex can be used only where the task is allowed to sleep.

10.7 Completion variables

If one task in the kernel needs to signal another task that a particular event has occurred, a completion variable is a simple way to synchronize the two. One task waits on the completion variable while another performs some work; when the work is done, the working task uses the completion variable to wake the waiting task.

  1. DECLARE_COMPLETION(mr_comp); // static declaration and initialization
  2. init_completion() // dynamic initialization
  3. wait_for_completion() // called by the task that must wait for the event
  4. complete() // called by the task that produces the event, to signal and wake the waiting task(s)
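A userspace sketch of a completion variable: a condition variable plus a `done` flag, mirroring the wait_for_completion()/complete() pair above (the struct layout is ours, not the kernel's):

```c
#include <pthread.h>

struct completion {
    pthread_mutex_t lock;
    pthread_cond_t  wait;
    int done;
};

void init_completion(struct completion *c)
{
    pthread_mutex_init(&c->lock, 0);
    pthread_cond_init(&c->wait, 0);
    c->done = 0;
}

void wait_for_completion(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)                    /* sleep until the event occurs */
        pthread_cond_wait(&c->wait, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

void complete(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = 1;                        /* record that the event happened */
    pthread_cond_broadcast(&c->wait);   /* wake every waiting task */
    pthread_mutex_unlock(&c->lock);
}
```

The `done` flag makes the signal persistent: a task that calls wait_for_completion() after complete() has already run returns immediately instead of sleeping forever.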

 

10.8 BKL: the big kernel lock

The big kernel lock was a global spinlock. It has since been removed from the kernel.

10.9 Sequential locks (seqlocks)

A seqlock is the ideal choice when the data has many readers and few writers, and the writer must be favored over the readers so that it never starves.

jiffies is a 64-bit variable that records the number of timer ticks accumulated since the system booted; the timer interrupt updates its value. Reading jiffies with get_jiffies_64() uses a seqlock.
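A minimal userspace sketch of the seqlock pattern behind get_jiffies_64(): the writer bumps a sequence count around the update (an odd count means a write is in progress), and readers retry if the count changed under them. Names here are illustrative, not the kernel's:

```c
#include <stdatomic.h>
#include <stdint.h>

static atomic_uint seq;
static uint64_t fake_jiffies_64;   /* stand-in for the real jiffies_64 */

void write_jiffies(uint64_t v)
{
    atomic_fetch_add(&seq, 1);     /* count becomes odd: write begins */
    fake_jiffies_64 = v;
    atomic_fetch_add(&seq, 1);     /* count even again: write done */
}

uint64_t read_jiffies(void)
{
    unsigned int s;
    uint64_t v;
    do {
        s = atomic_load(&seq);     /* snapshot the sequence count */
        v = fake_jiffies_64;
    } while ((s & 1) || atomic_load(&seq) != s);  /* retry if it moved */
    return v;
}
```

Readers never block the writer; they simply redo a cheap read if a write intervened, which is why seqlocks favor writers.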

10.10 Disabling preemption

Kernel preemption can be disabled with preempt_disable() and re-enabled with preempt_enable(); the calls nest via a per-task preemption count.

10.11 Ordering and barriers

For efficiency, the compiler and the processor may reorder reads and writes (x86 reorders far less aggressively than most architectures). Sometimes, however, we want memory loads and stores to be issued in a specified order.

Memory barriers: rmb(), wmb(), mb(), and related functions ensure that loads and stores are not reordered across the barrier.

Compiler barrier: barrier() prevents the compiler from moving loads or stores across it during optimization. (It is much lighter weight than a memory barrier.)
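The barrier idea can be sketched with C11 fences, which map roughly onto the kernel primitives above: wmb() ~ a release fence on the writer side, rmb() ~ an acquire fence on the reader side, mb() ~ a seq_cst fence, and barrier() ~ a compiler-only fence (atomic_signal_fence). This is an analogy, not the kernel implementation:

```c
#include <stdatomic.h>

static int payload;
static atomic_int ready;

void producer(void)
{
    payload = 42;
    atomic_thread_fence(memory_order_release);   /* like wmb(): the payload
                                                    store is ordered before
                                                    the flag store */
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

int consumer(void)
{
    if (atomic_load_explicit(&ready, memory_order_relaxed)) {
        atomic_thread_fence(memory_order_acquire);  /* like rmb(): the flag
                                                       load is ordered before
                                                       the payload load */
        return payload;
    }
    return -1;                                      /* not ready yet */
}
```

Without the two fences, a reorder could let the consumer see `ready == 1` while `payload` still holds its old value.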

 

 


Origin blog.csdn.net/u014426028/article/details/108521966