[In-depth understanding of Linux kernel locks] 1. The origin of kernel locks

I am Dong Ge, a senior embedded software development engineer, engaged in embedded Linux driver development and system development, and worked for a Fortune 500 company!
 
In Linux device drivers, there is one problem we have to solve: when multiple processes access shared resources concurrently, race conditions arise.

 

1. Concurrency and race conditions

Concurrency: multiple execution units are executed simultaneously, in parallel.

Race condition: when concurrently executing units access a shared resource, a race condition can easily result.

Shared resources: hardware resources, and software resources such as global variables and static variables.
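To see why concurrent access to a shared variable is dangerous, the sketch below replays by hand one possible interleaving of two execution units that each increment a shared counter. It is a single-threaded userspace simulation, not kernel code, and the names `shared_counter`, `interleaved_increments`, `serialized_increments`, and `show_lost_update` are made up for this illustration.

```c
#include <assert.h>

/* Shared resource: a global variable visible to both execution units. */
int shared_counter;

/*
 * Replay a bad interleaving of two units that each run "counter++".
 * The increment is really three steps: load, add, store.
 */
int interleaved_increments(void)
{
    int reg_a, reg_b;           /* private "registers" of unit A and unit B */

    shared_counter = 0;
    reg_a = shared_counter;     /* A loads 0 */
    reg_b = shared_counter;     /* B loads 0 before A has stored */
    reg_a = reg_a + 1;          /* A computes 1 */
    reg_b = reg_b + 1;          /* B computes 1 */
    shared_counter = reg_a;     /* A stores 1 */
    shared_counter = reg_b;     /* B stores 1 and overwrites A's update */
    return shared_counter;
}

/* The same two increments when each unit finishes before the other starts. */
int serialized_increments(void)
{
    int reg;

    shared_counter = 0;
    reg = shared_counter; reg = reg + 1; shared_counter = reg;  /* unit A */
    reg = shared_counter; reg = reg + 1; shared_counter = reg;  /* unit B */
    return shared_counter;
}

/* Usage: one update is lost in the interleaved case. */
void show_lost_update(void)
{
    int lost = interleaved_increments();
    int kept = serialized_increments();

    assert(lost == 1);
    assert(kept == 2);
}
```

In a real system the interleaving is chosen by the scheduler or by interrupt timing, which is why such bugs appear only occasionally.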

 

The way to resolve a race condition is to guarantee mutually exclusive access to the shared resource.

Mutually exclusive access: while one execution unit is accessing a shared resource, all other execution units are prevented from accessing it.

Critical section: the region of code that accesses a shared resource is called a critical section. Critical sections must be protected by some kind of mutual exclusion mechanism.

Common mutual exclusion mechanisms include interrupt masking, atomic operations, spin locks, semaphores, and mutexes.
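As a rough illustration of mutual exclusion, the sketch below builds a tiny try-lock from the C11 `atomic_flag` type in userspace. It is not how the kernel implements its locks. The names `demo_lock`, `demo_trylock`, `demo_unlock`, and `demo_mutual_exclusion` are hypothetical, and a real driver would use the kernel's own primitives instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: flag set means "someone is in the critical section". */
typedef struct { atomic_flag busy; } demo_lock;

/* Try to enter the critical section; returns true only for the one winner. */
bool demo_trylock(demo_lock *l)
{
    /* test-and-set is atomic: it sets the flag and returns the previous value */
    return !atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire);
}

/* Leave the critical section so another execution unit can enter. */
void demo_unlock(demo_lock *l)
{
    atomic_flag_clear_explicit(&l->busy, memory_order_release);
}

/* Walk through the lock states the way two competing units would see them. */
int demo_mutual_exclusion(void)
{
    demo_lock lock = { ATOMIC_FLAG_INIT };
    int entered = 0;

    if (demo_trylock(&lock))    /* unit A gets in */
        entered++;
    if (demo_trylock(&lock))    /* unit B is refused while A holds the lock */
        entered++;
    assert(entered == 1);

    demo_unlock(&lock);         /* A leaves the critical section */
    if (demo_trylock(&lock))    /* now B gets in */
        entered++;
    demo_unlock(&lock);

    return entered;             /* two successful entries in total */
}
```

A spin lock is essentially this try-lock called in a loop until it succeeds, which is why spin locks suit only short critical sections.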

 

2. Occasions where race conditions occur


  1. Between multiple CPUs in a symmetric multiprocessing (SMP) system

Multiple CPUs share a common system bus and can access the same peripherals and memory. On a multi-core SMP system (CPU0, CPU1), race conditions can occur:

  • Between a process on CPU0 and a process on CPU1
  • Between a process on CPU0 and an interrupt on CPU1
  • Between an interrupt on CPU0 and an interrupt on CPU1

 

  2. Within a single CPU, between a process and the process that preempts it

Within a single CPU, multiple processes execute concurrently. When a process uses up its time slice, or a higher-priority process becomes runnable, it can be preempted by another process. If both processes access the same shared resource, a race condition can occur.

 

  3. Between interrupts (hard interrupts, softirqs, tasklets, and other bottom halves) and processes

While a process is executing, an interrupt (a hard interrupt, softirq, tasklet, and so on) may interrupt it. If the interrupt handler accesses the same shared resource as the process, a race condition can occur.
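The process-versus-interrupt case can be approximated in userspace with a POSIX signal, which interrupts normal code much as an interrupt handler interrupts a process. This is only an analogy, sketched under the assumption of a POSIX system. Blocking the signal around the critical section plays the role that interrupt masking plays in the kernel. All names (`handler_runs`, `on_sigusr1`, `mask_result`, `run_masked_section`) are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Shared between normal code and the handler (the "interrupt"). */
static volatile sig_atomic_t handler_runs;

static void on_sigusr1(int signo)
{
    (void)signo;
    handler_runs++;
}

struct mask_result {
    int runs_inside;    /* handler runs seen inside the critical section */
    int runs_after;     /* handler runs seen after leaving it */
};

struct mask_result run_masked_section(void)
{
    struct mask_result res;
    struct sigaction sa;
    sigset_t block;
    int rc;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    (void)rc;

    handler_runs = 0;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);

    sigprocmask(SIG_BLOCK, &block, NULL);   /* "mask the interrupt" */
    raise(SIGUSR1);                         /* arrives, but stays pending */
    res.runs_inside = handler_runs;         /* critical section ran undisturbed */
    sigprocmask(SIG_UNBLOCK, &block, NULL); /* unmask: pending signal is delivered */

    res.runs_after = handler_runs;
    return res;
}
```

Under these assumptions the handler does not run inside the masked region (`runs_inside` is 0) and runs once the signal is unblocked (`runs_after` is 1).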

 

3. Compiler reordering and out-of-order execution

In addition to the race conditions caused by concurrent access, you also need to understand the problems caused by certain characteristics of compilers and processors.

3.1 Compiler reordering

Modern high-performance compilers are able to reorder instructions when optimizing object code. To improve the cache hit rate and the efficiency of the CPU's load/store unit as much as possible, the compiler may reorder memory access instructions and remove memory accesses that are logically unnecessary.

Therefore, once compiler optimization is turned on, the generated assembly code does not necessarily follow the logical order of the source code, which is normal.

To deal with compiler reordering, you can insert a compiler barrier, barrier(). The barrier stops the compiler from reordering memory accesses across it, so the statements before the barrier and the statements after it keep their relative order, as far as the compiler is concerned.

Example:

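/* Compiler barrier: an empty asm statement that declares memory as clobbered */
#define barrier() __asm__ __volatile__("" : : : "memory")

int main(int argc, char *argv[])
{
	int a = 0, b, c, d[4096] = { 0 }, e;

	e = d[4095];	/* access placed before the barrier */
	barrier();	/* separates the accesses above from those below */
	b = a;		/* accesses placed after the barrier */
	c = a;
	return 0;
}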
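A slightly more practical use of a compiler barrier is to keep two stores in program order, for instance when a data word must be written before the flag that announces it. The sketch below assumes GCC or Clang (for the inline assembly syntax) and uses hypothetical names (`payload`, `ready`, `publish`, `consume`, `publish_demo`). It runs in a single thread, so it shows where the barrier goes rather than demonstrating any reordering.

```c
#include <assert.h>

#define barrier() __asm__ __volatile__("" : : : "memory")

int payload;    /* data to hand over */
int ready;      /* set to 1 once payload is valid */

void publish(int value)
{
    payload = value;
    barrier();      /* compiler must emit the payload store before the flag store */
    ready = 1;
}

/* Returns 1 and fills *out if data is available, 0 otherwise. */
int consume(int *out)
{
    if (!ready)
        return 0;
    barrier();      /* compiler must not read payload before checking the flag */
    *out = payload;
    return 1;
}

/* Usage: nothing to consume before publish(), the value afterwards. */
int publish_demo(void)
{
    int value = -1;
    int got;

    ready = 0;
    got = consume(&value);
    assert(got == 0);

    publish(42);
    got = consume(&value);
    assert(got == 1);
    (void)got;
    return value;
}
```

Note that barrier() constrains only the compiler. If the two sides run on different CPUs, the processor can still reorder the accesses, which is the subject of the next section.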

 

3.2 Out-of-order execution

Compiler reordering is the behavior of the compiler, while out-of-order execution is the behavior of the processor at run time.

**Advanced CPUs often reorder memory access instructions according to the characteristics of their caches!** As a result, even when several instructions are issued in sequence, an instruction issued later may still complete first.

This kind of out-of-order execution is common both between multiple CPUs and within a single CPU.

3.2.1 Between multiple CPUs

To make sure that the memory accesses performed by one CPU are seen in the intended order by the other CPUs, ARM processors provide memory barrier instructions:

  • DMB (Data Memory Barrier): ensures that all memory accesses before this instruction complete before any memory access after it is performed.
  • DSB (Data Synchronization Barrier): ensures that all memory access instructions before it (memory accesses as well as cache, branch prediction, and TLB maintenance operations) have finished before any later instruction executes.
  • ISB (Instruction Synchronization Barrier): flushes the pipeline, so that all instructions after the ISB are fetched again from cache or memory.
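In portable C11 code the same intent is usually expressed with `atomic_thread_fence`, which compilers targeting ARM typically lower to a DMB-class instruction. The sketch below is a userspace illustration with POSIX threads (build with `-pthread`), not kernel code. The names `message`, `message_ready`, `producer`, and `receive_message` are assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int message;                 /* plain data written by the producer */
static atomic_int message_ready;    /* flag announcing that message is valid */

static void *producer(void *arg)
{
    (void)arg;
    message = 42;
    /* Stores above may not be reordered after the flag store below. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&message_ready, 1, memory_order_relaxed);
    return NULL;
}

/* Starts the producer on another thread and waits for its message. */
int receive_message(void)
{
    pthread_t tid;
    int rc, value;

    message = 0;
    atomic_store(&message_ready, 0);
    rc = pthread_create(&tid, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;

    /* Spin until the flag store becomes visible to this thread. */
    while (atomic_load_explicit(&message_ready, memory_order_relaxed) == 0)
        ;
    /* Loads below may not be reordered before the flag load above. */
    atomic_thread_fence(memory_order_acquire);
    value = message;

    pthread_join(tid, NULL);
    return value;
}
```

Without the two fences, the consumer could in principle see the flag set and still read a stale `message` on a weakly ordered CPU.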

 

3.2.2 Inside a single CPU

Within a single CPU, this comes up most often when accessing peripheral registers. Some peripheral registers are strict about the order of reads and writes, so memory barrier instructions are needed to keep the CPU from reordering those accesses.

To handle this, the CPU provides memory barrier instructions, which the kernel exposes through the interfaces below.

For details, see Documentation/memory-barriers.txt and Documentation/io_ordering.txt.

  • Read and write barrier: mb()
  • Read barrier: rmb()
  • Write barrier: wmb()
  • Register read barrier: __iormb()
  • Register write barrier: __iowmb()

#define writeb_relaxed(v,c)	__raw_writeb(v,c)
#define writew_relaxed(v,c)	__raw_writew((__force u16) cpu_to_le16(v),c)
#define writel_relaxed(v,c)	__raw_writel((__force u32) cpu_to_le32(v),c)

#define readb(c)		({ u8  __v = readb_relaxed(c); __iormb(); __v; })
#define readw(c)		({ u16 __v = readw_relaxed(c); __iormb(); __v; })
#define readl(c)		({ u32 __v = readl_relaxed(c); __iormb(); __v; })

#define writeb(v,c)		({ __iowmb(); writeb_relaxed(v,c); })
#define writew(v,c)		({ __iowmb(); writew_relaxed(v,c); })
#define writel(v,c)		({ __iowmb(); writel_relaxed(v,c); })

The difference between writel and writel_relaxed is whether a barrier is included.
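One practical consequence is that a driver can issue several set-up writes with the relaxed variant and pay for a barrier only on the write where ordering matters, such as the one that starts a transfer. The sketch below models this in userspace with a fake register array and a counter in place of a real barrier. `fake_iowmb`, `fake_writel`, `fake_writel_relaxed`, and the register layout are hypothetical stand-ins, not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

enum { REG_SRC, REG_DST, REG_LEN, REG_START, REG_COUNT };

static volatile uint32_t fake_regs[REG_COUNT];  /* stand-in for an MMIO window */
static int barrier_count;                       /* number of write barriers issued */

/* Stand-in for __iowmb(): count the call, then issue a full barrier. */
static void fake_iowmb(void)
{
    barrier_count++;
    __sync_synchronize();
}

static void fake_writel_relaxed(uint32_t v, int reg)
{
    fake_regs[reg] = v;
}

static void fake_writel(uint32_t v, int reg)
{
    fake_iowmb();
    fake_writel_relaxed(v, reg);
}

/* Every write carries a barrier; returns the number of barriers issued. */
int start_dma_strict(uint32_t src, uint32_t dst, uint32_t len)
{
    barrier_count = 0;
    fake_writel(src, REG_SRC);
    fake_writel(dst, REG_DST);
    fake_writel(len, REG_LEN);
    fake_writel(1, REG_START);
    return barrier_count;
}

/* Set-up writes are relaxed; only the write that starts the transfer is ordered. */
int start_dma_relaxed(uint32_t src, uint32_t dst, uint32_t len)
{
    barrier_count = 0;
    fake_writel_relaxed(src, REG_SRC);
    fake_writel_relaxed(dst, REG_DST);
    fake_writel_relaxed(len, REG_LEN);
    fake_writel(1, REG_START);
    return barrier_count;
}

/* Usage: the relaxed sequence issues one barrier instead of four. */
void compare_sequences(void)
{
    int strict = start_dma_strict(0x1000, 0x2000, 64);
    int relaxed = start_dma_relaxed(0x1000, 0x2000, 64);

    assert(strict == 4);
    assert(relaxed == 1);
}
```

Whether the relaxed variants are safe for a given device depends on its ordering requirements, so treat this as an illustration of the cost difference only.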

 

4. Summary

As shown above, the following problems have to be solved:

  1. Race conditions caused by concurrency
  2. Reordering introduced by high-performance compilers
  3. Out-of-order execution on high-performance CPUs

The memory barrier instructions provided by the CPU (for example, ARM processors) and similar mechanisms exist to solve these problems, which is also the reason kernel locks exist.


Origin blog.csdn.net/dong__ge/article/details/131102649