A Hundred Battles of C++ (OS, Part 1)

Locks in Linux

Mutual exclusion lock (mutex): ensures that only one thread can access the protected object at any time. When an attempt to acquire the lock fails, the thread goes to sleep and is woken up when the lock is released.

Read-write lock (rwlock): split into a read lock and a write lock. Multiple threads may hold the read lock at the same time, but only one thread can hold the write lock; other threads that fail to acquire it sleep until the write lock is released. Note: the write lock excludes both readers and writers — while a thread holds the write lock, no other thread can acquire the read lock. Writers have priority over readers: once a writer is waiting, subsequent readers must wait, and writers are favored on wakeup. Read-write locks suit workloads where reads are far more frequent than writes.

Spin lock (spinlock): again, only one thread can access the object at a time, but a thread that fails to acquire the lock does not sleep — it spins in place, looping to check whether the holder has released the lock, until it has. This avoids the cost of putting a thread to sleep and waking it up, which greatly improves efficiency when the lock is held only briefly. If the lock is held for a long time, however, spinning wastes CPU.

Talk about the difference between processes and threads in detail. How do you choose between them?

The program segment, the data segment, and the PCB together constitute the process entity.

A process in the ready state has been allocated every resource it needs except the CPU.

The PCB contains the processor state, process scheduling information, process control information, and the process identifier.

The PCB is what allows a process to run intermittently and serves as the identifier of the basic unit of independent operation; it supplies the information needed for process management and process scheduling, as well as for synchronization and communication with other processes.

1. A process has its own stack space and data segment: whenever a new process is started, it must be allocated an independent address space, and numerous tables must be set up to maintain its code, stack, and data segments, so the system overhead is relatively large. Threads are different: each thread has its own stack, but threads share the data segment and the same address space, share most data, and switch faster and more cheaply than processes. The independence of processes makes them relatively safe — because each process has its own address space, one process crashing does not affect other processes in protected mode — while threads are just different execution paths within one process, so when one thread crashes it can bring down the whole process.

2. The difference shows in the communication mechanism. Because processes are independent and do not interfere with each other, inter-process communication is relatively complicated — pipes, signals, message queues, shared memory, sockets, and so on — whereas threads share the data segment, so communication between them is straightforward.

3. All threads belonging to the same process share that process's resources, including file descriptors, while different processes are independent of one another.

4. A thread belongs to exactly one process, and a process can have multiple threads — at least one.

5. A process is the smallest unit of resource allocation, and a thread is the smallest unit of CPU scheduling; a process is the independent unit by which the system allocates resources and schedules work.

Choosing between processes and threads depends on the following points:

1. Prefer threads when they must be created and destroyed frequently, because creating and destroying a process is expensive.

2. Threads switch faster, so prefer threads when heavy computation demands frequent switching, and use threads for time-consuming operations to keep the application responsive.

3. Threads use the CPU more efficiently, so a common split is processes for multi-machine distribution and threads for multi-core distribution.

4. Use threads for parallel operations, such as concurrent threads on the server side of a C/S architecture responding to user requests.

5. Choose processes when stability and safety matter most; choose threads when speed matters most.

Process state transitions (diagram)

Please tell me about the Linux virtual address space

Virtual memory is used to prevent processes running at the same time from trampling on each other's physical memory.

Virtual memory lets each running process see itself as the sole owner of the system's entire address space (4 GB on a 32-bit system). All processes share the same physical memory, and each process maps into physical memory only the part of its virtual address space it currently needs. In fact, when a process is created and loaded, the kernel merely "creates" the layout of the process's virtual memory — concretely, it initializes the memory-related data structures in the process control block. The code (the .text and .data segments) is not copied into physical memory at that point; the kernel only establishes the mapping between virtual memory and the disk file (called memory mapping), and the data is copied in via page fault exceptions once the program actually runs. Likewise, memory allocated dynamically at run time — for example with malloc — is at first only virtual memory: the corresponding page table entries are set up, and a page fault occurs only when the process actually touches the data.

Demand paging, demand segmentation, and demand segmented-paging systems all implement virtual memory, swapping information between main memory and external storage on demand.

Benefits of virtual memory:

1. Expand the address space;

2. Memory protection: Each process runs in its own virtual memory address space and cannot interfere with each other. Virtual memory also provides write protection for specific memory addresses, which can prevent code or data from being maliciously tampered with.

3. Fair memory allocation. After using virtual memory, each process is equivalent to having the same size of virtual memory space.

4. When the process communicates, it can be realized by means of virtual memory sharing.

5. When different processes use the same code, such as code in a library file, only one copy needs to be kept in physical memory, and each process simply maps it into its own virtual memory, which saves memory.

6. Virtual memory suits multiprogramming systems well: fragments of many programs reside in memory at once, and while one program waits for part of itself to be read in, the CPU can be given to another process. Keeping multiple processes in memory improves system concurrency.

7. When a program needs a contiguous stretch of memory, only the virtual address space has to be contiguous, not the physical memory behind it, so physical fragments can still be used.

The cost of virtual memory:

1. The management of virtual memory requires the establishment of many data structures, which take up additional memory

2. The translation from virtual address to physical address increases the execution time of instructions.

3. Page swapping in and out requires disk I/O, which is time-consuming

4. If there is only part of the data in a page, memory will be wasted.

The differences between paging and segmentation:

1. A page is a physical unit of information; paging exists to implement discrete allocation, reducing external fragmentation and raising memory utilization. Paging serves the needs of system management, not the needs of the user. A segment is a logical unit of information, and segmentation exists to better serve the needs of the user.

2. Page size is fixed: the system splits a logical address into a page number and an offset within the page. Segment length is not fixed; it depends on the program the user writes.

3. A paged address space is one-dimensional — a single linear address space — while a segmented address space is two-dimensional: identifying an address requires both a segment name and an offset within the segment.

Please talk about synchronization methods between threads, ideally naming the specific system calls.

Semaphore

A semaphore is a special variable usable for thread synchronization. It takes only natural-number values and supports just two operations:

P(SV): if the semaphore SV is greater than 0, decrement it by one; if SV is 0, suspend the thread.

V(SV): if other threads are suspended waiting for SV, wake one of them up; otherwise increment SV by 1.

Its system call is:

sem_wait(sem_t *sem): atomically decrements the semaphore by 1. If the semaphore's value is 0, sem_wait blocks until the semaphore becomes nonzero.

sem_post(sem_t *sem): atomically increments the semaphore's value by 1. When the semaphore becomes greater than 0, a thread blocked in sem_wait on it is woken up.

mutex

A mutex (mutual exclusion lock) is mainly used for mutual exclusion between threads; it cannot by itself guarantee ordering, but it can be combined with condition variables for synchronization. On entering a critical section, the mutex must be acquired (locked); on leaving, it must be unlocked, which wakes other threads waiting for it. Its main calls are as follows:

pthread_mutex_init: initialize the mutex

pthread_mutex_destroy: destroy the mutex

pthread_mutex_lock: lock a mutex atomically. If the target mutex is already locked, the call blocks until the owner unlocks it.

pthread_mutex_unlock: unlock a mutex atomically.

condition variable

Condition variables (condition locks) are used to synchronize threads on the value of shared data. They provide an inter-thread notification mechanism: when shared data reaches a certain value, one or more threads waiting on that data are woken up — that is, when a shared variable hits the value of interest, call signal or broadcast. Operations on the shared variable must be protected by a mutex. The main calls are as follows:

pthread_cond_init: initialize the condition variable

pthread_cond_destroy: destroy the condition variable

pthread_cond_signal: wake one thread waiting on the target condition variable. Which thread is woken depends on the scheduling policy and priorities.

pthread_cond_wait: wait on the target condition variable. A locked mutex must be passed in to make the operation atomic: the function unlocks the mutex before entering the wait and relocks it after being signaled, so the thread can safely re-examine the shared data (the predicate should be rechecked in a loop, since wakeups can be spurious).

Page faults in the operating system

Memory allocation functions such as malloc() and mmap() only establish a range in the process's virtual address space; they do not allocate the physical memory behind it. When the process accesses virtual memory that has no mapping, the processor automatically raises a page fault exception.

Page fault interrupt: in a demand paging system, the status bit in the page table tells whether the page to be accessed is resident in memory. Whenever the page is not in memory, a page fault is raised; the operating system then locates the missing page in external storage, using the external-storage address recorded in the page table, and brings it into memory.

A page fault is itself a kind of interrupt and, like a general interrupt, goes through four processing steps:

1. Save the CPU context

2. Analyze the cause of the interrupt

3. Transfer to the page fault handler

4. Restore the CPU context and resume execution

However, a page fault is a special interrupt raised by the hardware when the accessed page is absent from memory, so it differs from a general interrupt:

1. The page fault signal is generated and handled during the execution of an instruction

2. A single instruction may trigger multiple page faults

3. A page fault returns to re-execute the instruction that caused it, whereas a general interrupt returns to the next instruction

The difference between vfork() and fork()

1. fork(): the child process copies the parent's data segment and code segment.
    vfork(): the child process shares the data segment with the parent.

2. With fork(), the execution order of parent and child is undetermined.
    vfork() guarantees the child runs first; it shares data with the parent until it calls exec or exit, and only after that may the parent be scheduled to run.

3. Because vfork() guarantees the child runs first and the parent may only be scheduled after the child calls exec or exit, a deadlock results if the child depends on further actions of the parent before calling either of those functions.

The fork() function creates a new process: the new process is the child, and the original is the parent. fork() returns twice — the child gets 0 and the parent gets the child's process ID. Process IDs are positive integers, so the value returned in the parent is always greater than zero.

Linux's fork() is implemented with copy-on-write pages. Copy-on-write is a technique that delays or even avoids copying data: the kernel does not duplicate the whole address space at fork time but lets parent and child share one copy. Data is copied only when one of them needs to write, at which point each process gets its own copy. In other words, resources are duplicated only on write; until then they are shared read-only. The technique defers copying pages of the address space until a write actually occurs.

 

daemon process

A daemon is a special process that runs in the background and is not controlled by a terminal. Most servers on Linux are implemented as daemons, and a daemon's parent process is the init process. The typical steps to create one are:
 

1) Fork a child process and let the parent exit

2) Create a new session in the child (setsid)

3) Change the current directory to the root directory

4) Reset the file-permission mask (umask)

5) Close inherited file descriptors

Tell me, what are blocking and non-blocking IO? What are the benefits of each? Do you know about multiplexing? Have you heard of select? Talk about the difference between it and epoll.

Inter-process communication

Pipe (pipe): a pipe is a half-duplex communication channel — data flows in only one direction — and it can only be used between related processes. "Related" usually means a parent-child relationship.

A pipe can be treated as a special file: ordinary functions such as read and write work on it. But it is not an ordinary file — it belongs to no file system and exists only in memory.

Mutual exclusion: while one process is reading or writing the pipe, other processes must wait. Synchronization: when the writer has filled the pipe it sleeps until the reader drains data, and a reader of an empty pipe likewise sleeps until the writer writes. Each side must confirm the other exists before communicating.

Named pipe (FIFO): a named pipe is also half-duplex, but it allows communication between unrelated processes. A FIFO has a pathname associated with it and exists in the file system as a special device file.

Semaphore: semaphores implement mutual exclusion and synchronization between processes and can also be used with threads. There are POSIX semaphores and System V semaphores; POSIX semaphores are generally used with threads and System V semaphores with processes, and the POSIX semaphore functions are the ones prefixed with sem_.

Message queue (message queue): a message queue is a linked list of messages, kept in the kernel and identified by a message queue identifier. Message queues overcome the drawbacks that signals carry little information, that pipes carry only unformatted byte streams, and that pipe buffers are limited in size. (Messages carry a type/priority and a size.)

1) The message queue is record-oriented, and the messages therein have a specific format and a specific priority.

2) The message queue is independent of the sending and receiving process. The message queue and its contents are not deleted when the process terminates.

3) A message queue supports selective retrieval: messages need not be read strictly first-in-first-out but can also be fetched by message type.

Shared memory (shared memory): shared memory maps a region of memory so it can be accessed by other processes. The region is created by one process, but multiple processes can attach to it. Shared memory is the fastest IPC method — it was designed precisely because the other IPC methods are inefficient — and it is often used together with other mechanisms, such as semaphores, to synchronize and communicate between processes.

Signal (signal): signals are a more elaborate communication method used to notify the receiving process that some event has occurred; kill -l lists all available signals. A signal is an event that UNIX and Linux systems raise in response to certain conditions, and the process receiving it takes some action in response. Signals usually arise from errors, but one process can also send them explicitly to another, as a way of communicating or of modifying behavior. Producing a signal is called raising it; receiving one is called catching it. The sigaction() interface is the more robust way to install handlers.

 

Socket (socket): sockets are also an inter-process communication mechanism; unlike the others, they can be used for communication between processes on different machines.

Ways of communication between threads:

Critical section: serialize multi-threaded access to a shared resource or a piece of code; fast, and suited to controlling data access.

Mutex (Synchronized/Lock): uses a mutual-exclusion object; only the thread holding the mutex object may access the shared resource. Because there is only one mutex object, the shared resource is guaranteed not to be accessed by multiple threads at once.

Semaphore: designed to control a resource with a limited number of users; it lets several threads access the same resource concurrently, but generally caps the maximum number of threads that may do so at once.

Event (signal), Wait/Notify: keeps threads synchronized through notification, and also makes it easy to implement priority-ordered wakeup across threads.


Source: blog.csdn.net/hebtu666/article/details/127204838