Three years after graduating from a second-tier university, how did I get a Tencent T8 offer? A collection of Linux kernel interview questions (with answers)

1. What kinds of kernel locks are there in Linux?

Linux's synchronization mechanisms were developed and refined continuously from 2.0 through 2.6: from the initial atomic operations to the later semaphores, and from the big kernel lock to today's spin locks. Their development accompanied Linux's transition from a single processor to symmetric multiprocessing, and from a non-preemptive kernel to a preemptive one. Along the way, Linux's locking mechanisms became both more effective and more complex.

A spin lock can be held by at most one thread of execution. If a thread tries to acquire a spin lock that is contended (already held), it busy-loops (spins) until the lock becomes available again. If the lock is not contended, the requesting thread acquires it immediately and continues. At any given time, a spin lock prevents more than one thread of execution from entering the critical section.
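The busy-wait behavior described above can be illustrated with a minimal user-space sketch. It uses C11 atomics rather than the kernel's own spin lock API, and every `demo_` name is hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space spin lock; the kernel's real implementation is more involved. */
typedef struct {
    atomic_bool held;
} demo_spinlock;

void demo_spin_init(demo_spinlock *l)
{
    atomic_init(&l->held, false);
}

/* Try once: returns true if the lock was free and now belongs to the caller. */
bool demo_spin_trylock(demo_spinlock *l)
{
    return !atomic_exchange_explicit(&l->held, true, memory_order_acquire);
}

/* Busy-loop (spin) until the current holder releases the lock. */
void demo_spin_lock(demo_spinlock *l)
{
    while (!demo_spin_trylock(l))
        ; /* spin: the waiter never sleeps, the CPU stays busy */
}

void demo_spin_unlock(demo_spinlock *l)
{
    assert(atomic_load(&l->held)); /* releasing a lock nobody holds is a bug */
    atomic_store_explicit(&l->held, false, memory_order_release);
}
```

A caller brackets its critical section with `demo_spin_lock()` and `demo_spin_unlock()`. Because waiters burn CPU time instead of sleeping, this only pays off when the lock is held briefly.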

Because a semaphore puts waiters to sleep, it suits situations where the lock is held for a long time. It can only be used in process context, because interrupt context cannot be scheduled. In addition, code must not hold a spin lock while acquiring a semaphore, since waiting for the semaphore may sleep.

Synchronization mechanisms in the Linux kernel include atomic operations, semaphores, read-write semaphores, and spin locks, each with its own API. Others include the big kernel lock, read-write locks, big reader locks, RCU (Read-Copy Update; as the name implies: read, copy, then update), and sequence locks.

2. What is the meaning of user mode and kernel mode in Linux?

Operating systems such as MS-DOS run in a single CPU mode, but Unix-like operating systems use two modes, which makes effective time sharing possible. On a Linux machine the CPU is either in trusted kernel mode or in restricted user mode. Only the kernel itself runs in kernel mode; all user processes run in user mode.

Kernel-mode code has unrestricted access to the processor's entire instruction set and to all memory and I/O space. A user-mode process that needs such privileges must make a request, through a system call, to a device driver or other kernel-mode code. In addition, user-mode code is allowed to take page faults, while kernel-mode code generally is not.

In earlier kernels, only user-mode processes could be context-switched out and preempted by other processes. Kernel-mode code could monopolize the CPU unless one of the following happened:
(1) It voluntarily gave up the CPU;
(2) An interrupt or exception occurred.
The 2.6 kernel introduced kernel preemption, so most kernel-mode code can now be preempted as well.

3. How do you allocate a large block of kernel memory?

In the Linux kernel, the success rate of allocating a large block of memory falls as system uptime grows. The vmalloc family of calls can allocate memory that is physically discontiguous but contiguous in virtual address space, but it is not very efficient, and on 32-bit systems the vmalloc address space is limited. The usual recommendation is therefore to allocate large blocks during system startup, although even then success is only more likely, not guaranteed. If the program really depends on the allocation succeeding, it has to fall back on boot memory. The following sample code allocates boot memory and exports it:

void *x_bootmem = NULL;
EXPORT_SYMBOL(x_bootmem);
unsigned long x_bootmem_size = 0;
EXPORT_SYMBOL(x_bootmem_size);

/* Parse "x-bootmem=<size>" from the kernel command line and reserve that much boot memory. */
static int __init x_bootmem_setup(char *str)
{
	x_bootmem_size = memparse(str, &str);
	x_bootmem = alloc_bootmem(x_bootmem_size);
	printk("Reserved %lu bytes from %p for x\n", x_bootmem_size, x_bootmem);
	return 1;
}
__setup("x-bootmem=", x_bootmem_setup);

As the code shows, the allocation itself is simple, but strengths and weaknesses come together, and boot memory has its own limitations:
The allocation code must be linked into the kernel; it cannot be used in a module.
The allocated memory is not used or counted by the page allocator or the slab allocator. It lies outside the system's visible memory, even if you release it somewhere later.
Typically a user reserves just one large block; any complex memory management on top of it must be implemented by the user. When allocation failure cannot be tolerated, reserving space through boot memory is the only choice.

4. What are the main ways of communication between user processes?

1) Pipe: pipes are used for communication between related processes, that is, between a process and another process that shares a common ancestor with it.
2) Named pipe: a named pipe overcomes the restriction that a pipe has no name, so in addition to everything a pipe can do, it allows communication between unrelated processes. A named pipe has a corresponding file name in the file system and is created with the mkfifo command or the mkfifo() function.
3) Signal: signals are a more complex communication method, used to notify the receiving process that some event has occurred. Besides inter-process communication, a process can also send signals to itself. In addition to the early Unix signal function signal(), Linux supports sigaction(), whose semantics conform to POSIX.1 (this function actually comes from BSD: to implement reliable signals while keeping a uniform external interface, BSD reimplemented signal() on top of sigaction()).
4) Message queue: a message queue is a linked list of messages; there are POSIX message queues and System V message queues. A process with sufficient permissions can add messages to a queue, and a process with read permission can read messages from it. Message queues overcome the limitations that signals carry little information and that pipes carry only unformatted byte streams with a limited buffer size.
5) Semaphore: mainly used for synchronization between processes and between threads of the same process.
6) Socket: a more general inter-process communication mechanism that also works between processes on different machines. It was originally developed by the BSD branch of Unix but is now available on other Unix-like systems: both Linux and System V variants support sockets.
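As an illustration of the first mechanism in the list, here is a minimal sketch of a child process sending a message to its parent through an anonymous pipe. The helper name `pipe_roundtrip` is hypothetical, and error handling is kept deliberately short.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Send msg from a child process to its parent through an anonymous pipe.
 * Returns the number of bytes the parent read, or -1 on error.
 * pipe_roundtrip is a hypothetical helper name for this sketch.
 */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t out_size)
{
    int fds[2]; /* fds[0] is the read end, fds[1] is the write end */

    assert(out_size > 0);
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) { /* child: writer */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(w < 0 ? 1 : 0);
    }

    /* parent: reader; close the unused write end so read() can see EOF */
    close(fds[1]);
    size_t total = 0;
    ssize_t n;
    while (total < out_size - 1 &&
           (n = read(fds[0], out + total, out_size - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

The pipe works here only because the child inherits both descriptors from its parent across fork(); two unrelated processes would need a named pipe or another mechanism from the list.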

5. Which functions allocate kernel memory through the buddy system?

Physical page management is implemented by a zone-based buddy system. Each zone has its own buddy system that manages its memory, and free pages are tracked independently per zone. The corresponding interfaces include alloc_pages(gfp_mask, order), __get_free_pages(gfp_mask, order), and so on.
Supplementary knowledge:
1. Principle description

Page Global Directory
Page Upper Directory
Page Middle Directory
Page Table
The page global directory contains the addresses of several page upper directories, which in turn contain the addresses of several page middle directories; each page middle directory contains the addresses of several page tables, and each page table entry points to a page frame.
Linux uses a 4KB page frame as its standard unit of memory allocation.

5.1. Buddy system algorithm
To limit external fragmentation, the Linux kernel uses the buddy system algorithm. All free page frames are grouped into 11 linked lists of blocks, holding blocks of 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, and 1024 contiguous page frames respectively. The largest request is therefore 1024 contiguous page frames, which is 4MB of contiguous memory. The physical address of the first page frame in each block is an integer multiple of the block size.
When a block is freed, the kernel actively merges two adjacent free blocks of the same size (buddies) into one larger block.
The slab allocator derives from the allocation algorithm of Solaris 2.4. It works on top of the physical page frame allocator, managing caches of objects of specific sizes to make allocation fast and efficient.
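The arithmetic behind the buddy system in 5.1 can be sketched in user space. This is a simplified model, not kernel code: the `demo_` names are hypothetical, and page frame numbers are assumed to start at 0 within a zone.

```c
#include <assert.h>

#define DEMO_MAX_ORDER 11      /* 11 free lists: orders 0..10 */
#define DEMO_PAGE_SIZE 4096UL  /* 4KB page frame */

/* Smallest order whose block (1 << order page frames) holds `pages` frames, or -1 if too large. */
int demo_order_for_pages(unsigned long pages)
{
    assert(pages > 0);
    for (int order = 0; order < DEMO_MAX_ORDER; order++)
        if ((1UL << order) >= pages)
            return order;
    return -1;
}

/* Size in bytes of one block of the given order. */
unsigned long demo_block_bytes(int order)
{
    assert(order >= 0 && order < DEMO_MAX_ORDER);
    return (1UL << order) * DEMO_PAGE_SIZE;
}

/* First page frame number of the buddy block: flip bit `order` of this block's first frame number. */
unsigned long demo_buddy_pfn(unsigned long pfn, int order)
{
    assert(order >= 0 && order < DEMO_MAX_ORDER);
    assert((pfn & ((1UL << order) - 1)) == 0); /* blocks are aligned to their own size */
    return pfn ^ (1UL << order);
}
```

Because every block is aligned to its own size, a block and its buddy differ in exactly one bit of the page frame number. That is what makes it cheap to find the merge partner when a block is freed.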

5.2. Common memory allocation functions
unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order)
The __get_free_pages function is the most primitive allocation method: it obtains raw page frames directly from the buddy system, and its return value is the starting address of the first page frame. In its implementation, __get_free_pages is just a wrapper around alloc_pages. According to the code, alloc_pages allocates a block of 1 << order contiguous page frames.
struct kmem_cache *kmem_cache_create(const char *name, size_t size, ..., void (*ctor)(void *, struct kmem_cache *, unsigned long), ...)
void *kmem_cache_alloc(struct kmem_cache *c, gfp_t flags)
kmem_cache_create/kmem_cache_alloc is an allocation method based on the slab allocator, suited to cases where memory blocks of the same size are allocated and freed repeatedly. First create a cache with kmem_cache_create, then obtain new memory blocks from it with kmem_cache_alloc. The largest block kmem_cache_alloc can allocate at once is determined by the MAX_OBJ_ORDER macro in mm/slab.c. In the default 2.6.18 kernel this macro is defined as 5, so a single allocation can obtain at most (1 << 5) * 4KB, that is, 128KB of contiguous physical memory. The kernel source shows that BUG() is called when the size parameter of kmem_cache_create exceeds 128KB. Testing confirms this: the kernel crashes when kmem_cache_create is used to allocate more than 128KB.
void *kmalloc(size_t size, gfp_t flags)

5.3. vmalloc
The allocation methods described so far all return physically contiguous memory, which keeps average access time low. In some cases, however, requests for a memory region are infrequent and a higher access time is acceptable. vmalloc then allocates a region that is contiguous in linear address space but physically discontiguous, with the advantage that a large block can be allocated in one call. Figure 3-1 shows the address range used by memory allocated with vmalloc. vmalloc places no explicit limit on the size of a single allocation. For performance reasons, vmalloc should be used with caution. In testing, up to 1GB could be allocated in one call.

5.4. dma_alloc_coherent
void *dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t gfp)

5.5. ioremap
ioremap is a more direct way of "allocating" memory: you specify the physical start address and the size, and the physical address range is mapped into the kernel address space. The physical address range used by ioremap is determined in advance; unlike the methods above, ioremap does not allocate new physical memory. It is mostly used in device drivers to let the CPU access the I/O space of external devices directly. The memory ioremap can map is determined by the existing physical memory space, so it was not tested.
If a large amount of contiguous physical memory is needed, none of the allocation functions above can satisfy the request, and the only option is to reserve part of memory during the Linux kernel boot phase.
void *alloc_bootmem(unsigned long size)

5.6 Reserve top memory through kernel boot parameters

3. Comparison of several allocation functions
Function: allocation principle; maximum size; notes
__get_free_pages: operates directly on page frames; 4MB; suitable for allocating larger amounts of contiguous physical memory
kmalloc: implemented on top of kmem_cache_alloc; 128KB; the most common allocation method, usable when less memory than a page frame is needed
dma_alloc_coherent: implemented on top of alloc_pages; 4MB; suitable for DMA operations
alloc_bootmem: reserves a section of memory at kernel boot that is invisible to the kernel; less than the physical memory size; places high demands on memory management

6. What are the key data structures of the Linux virtual file system? (Write at least four)

struct super_block, struct inode, struct file, struct dentry;

7. In which data structure is the operation function of the file or device stored?

struct file_operations

8. What types of files are there in Linux?

Executable files, regular files, directory files, link files, device files, and pipe files.

9. What are the system calls for creating a process?

clone(), fork(), vfork(). System call service routines: sys_clone, sys_fork, sys_vfork.

10. In which situations is schedule() called to switch processes?

1. The system call do_fork();
2. The timer interrupt do_timer();
3. Waking up a process with wake_up_process();
4. Changing a process's scheduling policy with setscheduler();
5. The yielding system call sys_sched_yield();

11. Does the Linux scheduler schedule processes according to their dynamic priority or static priority?

The Linux scheduler schedules processes by their dynamic priority, but the dynamic priority is computed from the static priority, so the two values are related. Because higher-priority processes are always scheduled before lower-priority ones, the concept of dynamic priority was introduced to keep a few high-priority processes from monopolizing the CPU and starving the others.
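The relationship between the two priorities can be sketched as follows. This is a simplified model that assumes the O(1) scheduler of the 2.6 series, where a bonus in the range 0..10 rewards tasks that sleep a lot; how the bonus is measured is left out, and all `demo_` names are hypothetical.

```c
#include <assert.h>

#define DEMO_MAX_RT_PRIO 100 /* priorities 0..99 are real-time */
#define DEMO_MAX_PRIO    140 /* priorities 100..139 are for normal tasks */
#define DEMO_MAX_BONUS   10  /* the bonus ranges over 0..10 */

/* Static priority derived from the nice value (-20..19): yields 100..139. */
int demo_static_prio(int nice)
{
    assert(nice >= -20 && nice <= 19);
    return DEMO_MAX_RT_PRIO + nice + 20;
}

/*
 * Dynamic priority: the static priority shifted by at most 5 in either
 * direction, then clamped to the normal-task range.
 * A lower number means a higher priority.
 */
int demo_dynamic_prio(int static_prio, int bonus)
{
    assert(bonus >= 0 && bonus <= DEMO_MAX_BONUS);
    int prio = static_prio - (bonus - DEMO_MAX_BONUS / 2);
    if (prio < DEMO_MAX_RT_PRIO)
        prio = DEMO_MAX_RT_PRIO;
    if (prio > DEMO_MAX_PRIO - 1)
        prio = DEMO_MAX_PRIO - 1;
    return prio;
}
```

In this model, a task with nice 0 and the maximum bonus gets dynamic priority 115, while a CPU-bound task with bonus 0 drops to 125, so it cannot keep crowding out tasks that mostly sleep.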

12. What is the core data structure of process scheduling?

struct runqueue, struct task_struct, struct sched_struct

13. How to load and unload a module?

insmod loads a module; rmmod unloads it.

14. In which space do modules and applications run?

Modules run in kernel space, and applications run in user space.

15. Is the floating point operation in Linux implemented by the application or by the kernel?

By the application. Floating-point operations in Linux are implemented with math library functions. Library functions can be linked into applications but cannot be linked into the kernel, so these calculations run in the application, and the results are then passed back to the system. If the Linux kernel must perform floating-point operations, math-emu has to be selected when building the kernel so that floating-point operations are emulated in software. This reportedly has two costs: users must rebuild the kernel when installing the driver, and other applications may be affected, because they too will use math-emu for floating-point operations, which greatly reduces efficiency.

16. Can module programs use linkable library functions?

The module program runs in the kernel space and cannot link library functions.

17. What is cached in the TLB?

The TLB is a cache of page table entries. The first time a linear address is translated to a physical address, the mapping between the two is stored in the TLB, which speeds up translation the next time that linear address is accessed.
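The caching behavior can be modeled in software. The sketch below is a toy direct-mapped TLB, not how any particular CPU implements it: the entry count, the `demo_` names, and the stand-in page table walk (which simply maps page number n to frame n + 0x100) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT  12 /* 4KB pages */
#define DEMO_TLB_ENTRIES 16 /* hypothetical TLB size */

struct demo_tlb_entry {
    bool valid;
    uint32_t vpn; /* virtual (linear) page number */
    uint32_t pfn; /* physical page frame number */
};

struct demo_tlb {
    struct demo_tlb_entry e[DEMO_TLB_ENTRIES];
    unsigned hits;
    unsigned misses;
};

/* Stand-in for the slow page table walk: a made-up fixed mapping. */
uint32_t demo_walk_page_table(uint32_t vpn)
{
    return vpn + 0x100;
}

/* Translate a linear address, consulting the TLB before walking the page table. */
uint32_t demo_translate(struct demo_tlb *t, uint32_t vaddr)
{
    assert(t != NULL);
    uint32_t vpn = vaddr >> DEMO_PAGE_SHIFT;
    uint32_t off = vaddr & ((1U << DEMO_PAGE_SHIFT) - 1);
    struct demo_tlb_entry *e = &t->e[vpn % DEMO_TLB_ENTRIES];

    if (e->valid && e->vpn == vpn) {
        t->hits++; /* mapping already cached */
    } else {
        t->misses++; /* first access, or the slot held another page */
        e->valid = true;
        e->vpn = vpn;
        e->pfn = demo_walk_page_table(vpn);
    }
    return (e->pfn << DEMO_PAGE_SHIFT) | off;
}
```

Only the page number is cached; the offset within the page passes through unchanged. A second access to any address in the same page therefore hits the TLB and skips the page table walk.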

18. What kinds of devices are there in Linux?

Character devices and block devices. Network cards are an exception: they do not correspond directly to a device file. Device files are created with the mknod system call.

19. What is the key data structure of the character device driver?

The character device descriptor struct cdev. cdev_alloc() dynamically allocates a cdev descriptor, and cdev_add() registers one. A cdev contains a struct kobject, which is its core data structure.

20. What functions does the device driver include?

open(), read(), write(), llseek(), release();

21. How to uniquely identify a device?

Linux identifies a device uniquely by its device number, which consists of a major number and a minor number. In general, the major number identifies the driver for the device, and the minor number identifies the specific device that the device file refers to. In the kernel, the type dev_t represents a device number. It is normally 32 bits long, with 12 bits for the major number and 20 bits for the minor number. MKDEV(int major, int minor) builds a dev_t value.
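The 12/20 bit split can be made concrete with a small user-space sketch. The `demo_` functions below mimic what the kernel's MKDEV, MAJOR, and MINOR macros do; they are hypothetical stand-ins written for this example, not the kernel definitions themselves.

```c
#include <assert.h>

#define DEMO_MINORBITS 20
#define DEMO_MINORMASK ((1U << DEMO_MINORBITS) - 1)

typedef unsigned int demo_dev_t; /* 32-bit device number */

/* Pack a 12-bit major and a 20-bit minor number into one device number. */
demo_dev_t demo_mkdev(unsigned int major, unsigned int minor)
{
    assert(major < (1U << 12));
    assert(minor <= DEMO_MINORMASK);
    return (major << DEMO_MINORBITS) | minor;
}

/* The upper 12 bits: which driver handles the device. */
unsigned int demo_major(demo_dev_t dev)
{
    return dev >> DEMO_MINORBITS;
}

/* The lower 20 bits: which device instance of that driver. */
unsigned int demo_minor(demo_dev_t dev)
{
    return dev & DEMO_MINORMASK;
}
```

For example, major 8 and minor 1 pack into 0x800001, and both parts can be recovered from that single value.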

22. In what way does Linux implement system calls?

System calls are implemented through a software interrupt. First, the user program sets up the arguments for the system call, one of which is the system call number. It then executes the system call instruction; on x86 the software interrupt is generated by the int instruction. This instruction raises an exception, which makes the processor switch to kernel mode, jump to a new address, and start executing the exception handler there. That exception handler is the system call handler.
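From user space, the "system call number plus arguments" convention can be observed through the C library's generic syscall() wrapper, which places the number and arguments where the kernel expects them and then executes the trap instruction. The sketch below assumes Linux with glibc; `demo_raw_getpid` is a hypothetical name.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* makes glibc declare syscall() */
#endif
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Invoke getpid by its system call number instead of through the usual libc wrapper. */
pid_t demo_raw_getpid(void)
{
    long ret = syscall(SYS_getpid); /* SYS_getpid is the system call number */
    assert(ret > 0);
    return (pid_t)ret;
}
```

The value returned should match what getpid() reports, since both paths end in the same kernel service routine. Which trap instruction is actually executed depends on the architecture and the C library, so the int instruction mentioned above is only one possibility.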

23. What is the function of Linux soft interrupt and work queue?

Softirqs and work queues in Linux are both mechanisms for deferred interrupt processing.

1. "Softirq" is commonly used as a general term for deferrable functions. A softirq cannot sleep or block: it runs in interrupt context, so no process switch can take place. A softirq cannot be interrupted by another softirq on the same CPU, only by a hardware interrupt (top half), but softirqs can run concurrently on multiple CPUs. Softirqs must therefore be written as reentrant functions, and spin locks are needed to protect their data structures.
2. A function in a work queue runs in process context. It can sleep or block, and the kernel can switch between processes to get other work done.

Neither deferrable functions nor work queue functions can access a user process's address space. No particular process is associated with a deferrable function while it runs, and work queue functions are executed by kernel threads, which cannot access user-space addresses.
