Campus recruitment: operating system knowledge summary

Campus recruitment interviews mainly cover processes and threads, inter-process communication, and thread synchronization.

Memory is best understood by combining Linux system programming with the theory.

Linux network programming also comes up.

 

1. How would you design a single-threaded program to handle high concurrency?

In a single-threaded model, I/O multiplexing lets one thread serve many requests at once; on top of that, an event-driven model dispatches asynchronous event callbacks to handle each request.

 

2. Explain the concepts of process and thread: why do processes have threads, how do the two differ, and how is each synchronized?

Reference answer:

Basic concepts:

A process is the encapsulation of a running program. It is the system's basic unit of resource allocation and scheduling, and it is how the operating system achieves concurrency.

A thread is a subtask of a process and the basic unit of CPU scheduling and dispatch; it keeps programs responsive and provides concurrency within a process. The thread is the smallest unit of execution and scheduling that the operating system recognizes. Each thread has its own virtualized processor: its own registers, program counter, and processor state. Threads carry out different tasks while sharing the same address space (the same dynamic memory, mapped files, object code, and so on), open files, signal queues, and other kernel resources.

Differences:

1. A thread belongs to exactly one process, while a process can have multiple threads (at least one). A thread cannot exist outside a process.

2. A process has an independent memory space during execution, while the threads of a process share its memory. (Resources are allocated to the process, and all threads of the process share all of its resources. Threads of the same process share the code segment (code and constants), the data segment (global and static variables), and the heap, but each thread has its own stack, used to hold its local and temporary variables.)

3. The process is the smallest unit of resource allocation; the thread is the smallest unit of CPU scheduling.

4. Overhead: creating or destroying a process forces the system to allocate or reclaim resources such as memory and I/O devices, so the operating system pays a noticeably higher cost than for creating or destroying a thread. Likewise, a process switch saves the entire CPU environment of the current process and installs that of the newly scheduled one, whereas a thread switch only saves and restores a small set of registers and involves no memory-management work. Switching processes is therefore far more expensive than switching threads.

5. Communication: the threads of a process share one address space, which makes synchronization and communication between them comparatively easy. Processes must use IPC, whereas threads can communicate simply by reading and writing the process's data (for example, global variables), relying on synchronization and mutual exclusion to keep that data consistent. On some systems, thread switching, synchronization, and communication need no kernel intervention at all.

6. Processes are simpler to program and debug and more reliable, but expensive to create and destroy; threads are the opposite: low overhead and fast switching, but comparatively hard to program and debug.

7. Processes do not affect one another; if one thread crashes, the whole process usually goes down with it.

8. Processes suit multi-core machines and multi-machine (distributed) deployment; threads suit multi-core machines.

 

Ways of inter-process communication:

Inter-process communication includes pipes, System IPC (message queues, semaphores, signals, shared memory, etc.), and sockets.

1. Pipes:

Pipes come in two kinds, anonymous and named: an anonymous pipe can be used between related processes such as a parent and child, while a named pipe has the same capability and additionally allows communication between unrelated processes.

1.1 Anonymous pipe (PIPE):

1) It is half-duplex (data flows in only one direction) and has a fixed read end and write end.

2) It can only be used between related processes (parent and child, or siblings).

3) It can be viewed as a special kind of file and accessed with the ordinary read and write calls, but it is not a regular file: it belongs to no file system and exists only in memory.

1.2 Named pipe (FIFO):

1) A FIFO can exchange data between unrelated processes.

2) A FIFO has a pathname associated with it and exists in the file system as a special device file.

2. System IPC:

2.1 Message queues

A message queue is a linked list of messages stored in the kernel and identified by a queue identifier (the queue ID). (Message queues overcome the limits of signals, which carry little information, and of pipes, which carry only unformatted byte streams and have bounded buffers.) A process with write permission can add new messages to the queue according to certain rules; a process with read permission can then read messages out of the queue.

Features:

1) A message queue is record-oriented: each message has a specific format and a specific priority.

2) The queue is independent of the sending and receiving processes; when a process terminates, the queue and its contents are not deleted.

3) A message queue supports random querying: messages need not be read in FIFO order but can be read by message type.

2.2 Semaphores

A semaphore differs from the IPC structures introduced so far: it is a counter, used to control access to shared resources by multiple processes. Semaphores implement mutual exclusion and synchronization between processes; they are not used to store inter-process communication data.

Features:

1) Semaphores are for inter-process synchronization; to transfer data between processes they must be combined with shared memory.

2) Semaphores are based on the operating system's PV operations, and semaphore operations are atomic.

3) Each PV operation is not limited to incrementing or decrementing the semaphore by 1; it can add or subtract any positive integer.

4) Semaphore sets are supported.

2.3 Signals

A signal is a more sophisticated communication mechanism, used to notify a receiving process that some event has occurred.

2.4 Shared memory

Shared memory lets multiple processes access the same region of memory, so different processes promptly see updates that another process makes to the shared data. It relies on some form of synchronization, such as mutexes or semaphores.

Features:

1) Shared memory is the fastest form of IPC, because the processes access the memory directly.

2) Because several processes can operate on it at once, synchronization is required.

3) Shared memory is commonly paired with semaphores, which are used to synchronize access to it.

3. Sockets (SOCKET):

A socket is also an inter-process communication mechanism; unlike the others, it can be used for communication between processes on different hosts.

 

Ways of communication between threads:

Critical section: serializes multi-threaded access to a shared resource or section of code; it is fast and suited to controlling data access.

Mutex (Synchronized/Lock): uses a mutual-exclusion mechanism, so only the thread holding the mutex may access the shared resource. Because there is only one mutex object, the shared resource can never be accessed by several threads at once.

Semaphore: designed for resources that allow a limited number of simultaneous users; it lets multiple threads access the same resource at the same time, but usually caps the maximum number of threads that may access the resource at any one moment.

Event (signal), Wait/Notify: keeps threads synchronized through notifications, and also makes it easy to implement priority-based comparison across threads.

 

Please explain concurrency and parallelism

Reference answer:

Concurrency: macroscopically, two programs appear to run at the same time, as with multitasking on a single-core CPU. Microscopically, the instructions of the two programs are interleaved, yours in among mine and mine in among yours, with only one instruction executing in any given cycle. Concurrency does not raise the computer's performance; it only improves utilization.

Parallelism: running simultaneously in the strict physical sense, as on a multi-core CPU where two programs run on two cores without interfering with each other; within a single cycle each program executes its own instruction, i.e. two instructions run at once. Parallelism genuinely improves the computer's throughput, which is why CPUs keep moving toward more cores.

 

Given that we have processes, why do threads exist?

Reference answer:

Why threads were introduced:

Processes allow multiple programs to execute concurrently, raising throughput and resource utilization, but they have drawbacks:

A process can only do one thing at a time.

If a process blocks during execution, the whole process is suspended; even work within the process that does not depend on the awaited resource cannot proceed.

 

Operating systems therefore introduced a unit smaller than the process, the thread, as the basic unit of concurrent execution, reducing the time and space overhead paid for concurrency and increasing the degree of concurrency. Compared with processes, threads have these advantages:

In terms of resources, threads are a very "frugal" way to multitask. On Linux, starting a new process means allocating it a separate address space and building many data tables to maintain its code segment, stack, and data segments; this is an "expensive" way to multitask.

In terms of switching efficiency, the threads running within one process share the same address space, so switching between them takes far less time than switching between processes. By some measurements, a process costs roughly 30 times what a thread costs.

In terms of communication mechanisms, threads communicate conveniently. Separate processes have separate data spaces and can only pass data through inter-process communication, which is both time-consuming and inconvenient. Threads are different: because the threads of the same process share the data space, data produced by one thread can be used directly by another, which is both fast and convenient.

Beyond these advantages, multithreaded programs, as a form of multitasking and concurrency, also offer the following:

1. Better use of multi-CPU systems. The operating system ensures that, when the number of threads does not exceed the number of CPUs, different threads run on different CPUs.

2. Better program structure. A long, complex process can be divided into several threads that run as independent or semi-independent parts; such a program is easier to understand and modify.

 

When writing a multithreaded program on a single-core machine, do you need to consider locking, and why?

Reference answer:

Yes: multithreaded programs on a single-core machine still need thread locks, which are commonly used for thread synchronization and communication. The synchronization problem does not go away on one core. Because the operating system is preemptive, it generally allocates each thread a time slice; when a thread's slice is exhausted, the operating system suspends it and runs another thread. If two threads share some data and no lock is used, their modifications to the shared data can conflict.

 

Describe the ways threads can be synchronized; where possible, name the specific system calls

Reference answer:

Semaphore

A semaphore is a special variable that can be used to synchronize threads. It takes only natural-number values and supports just two operations:

P(SV): if SV is greater than 0, decrement it by one; if SV is 0, suspend the thread.

V(SV): if other threads are suspended waiting on SV, wake one of them; otherwise, increment SV by 1.

The corresponding system calls:

sem_wait(sem_t *sem): atomically decrements the semaphore by 1; if the value is 0, sem_wait blocks until the semaphore becomes non-zero.

sem_post(sem_t *sem): atomically increments the semaphore by 1. When the value becomes greater than 0, a thread blocked in sem_wait on this semaphore is woken.

 

Mutex

Also called a mutual-exclusion lock, it is used mainly for mutual exclusion between threads; on its own it cannot guarantee access order, but combined with a condition variable it can also provide synchronization. A thread must lock the mutex before entering the critical section and unlock it on leaving, waking other threads waiting on the mutex. The main system calls are:

pthread_mutex_init: initialize a mutex

pthread_mutex_destroy: destroy a mutex

pthread_mutex_lock: atomically lock a mutex; if the target mutex is already locked, the call blocks until the mutex's holder unlocks it.

pthread_mutex_unlock: atomically unlock a mutex.

 

Condition variable

A condition variable, also called a condition lock, synchronizes threads around the value of shared data. It provides an inter-thread notification mechanism: when the shared data reaches a certain state, wake one or several threads waiting on it; that is, when a shared variable takes the desired value, call signal/broadcast. Operations on the shared variable must be performed under a lock. The main system calls are:

pthread_cond_init: initialize a condition variable

pthread_cond_destroy: destroy a condition variable

pthread_cond_signal: wake one thread waiting on the target condition variable. Which thread wakes depends on scheduling policy and priority.

pthread_cond_wait: wait on the target condition variable. A mutex is required to guarantee atomicity: the function unlocks the mutex before entering the wait, then relocks it after receiving a signal, ensuring the thread accesses the shared resource correctly.

 

Explain the differences between multithreading and multiprocessing

Reference answer:

The process is the smallest unit of resource allocation, while the thread is the smallest unit of CPU scheduling. The threads of one process share its address space: inter-thread communication is simple but synchronization is complex; threads are cheap and fast to create, destroy, and switch, and occupy little memory, which suits multi-core systems; but threads affect one another, so one thread's abnormal termination can bring down the other threads of the same process, making reliability weaker. Processes, by contrast, each run in their own separate address space and do not affect one another, so programs are more reliable; but creating, destroying, and switching processes is complex, slow, and memory-hungry, and inter-process communication is complex while synchronization is simple. Processes suit multi-core machines and multi-machine (distributed) deployment.

 

Describe the usage scenarios for multiprocessing and multithreading

Reference answer:

The main advantage of the multithreading model is the low cost of switching between threads, so it suits I/O-intensive work, where I/O blocking causes frequent thread switches. The multithreading model also suits single-machine multi-core scenarios.

The multiprocessing model suits CPU-intensive work. It also suits multi-machine distributed scenarios, since it scales out to more machines easily.

 

Describe the conditions under which deadlock occurs and how to resolve it

Reference answer:

Deadlock is the phenomenon in which two or more processes, during execution, wait on each other because of competition for resources. The four necessary conditions for deadlock are:

Mutual exclusion: a process does not allow other processes to access a resource it has been allocated; if another process wants the resource, it can only wait until the holder finishes using it and releases it.

Hold and wait: a process that has obtained some resources also requests additional ones; if those are occupied by another process, the request blocks, but the process does not release the resources it already holds.

No preemption: resources a process has obtained cannot be taken away before it has finished using them; they are released only after use.

Circular wait: when deadlock occurs, there must exist a circular chain of processes each waiting for a resource held by the next.

Deadlock is prevented by breaking one of the four conditions above, mainly as follows:

Allocate all resources at once, breaking the hold-and-wait condition.

Allow resources to be preempted: when a process's request for new resources cannot be satisfied, it releases the resources it already holds, breaking the no-preemption condition.

Ordered resource allocation: the system assigns a number to each class of resource, and every process requests resources in increasing order of number and releases them in the reverse order, breaking the circular-wait condition.

 

How do processes communicate?

Reference answer:

Inter-process communication includes pipes (anonymous and named), System IPC (message queues, semaphores, signals, shared memory, etc.), and sockets; see the detailed breakdown under "Ways of inter-process communication" above.

 

Describe multithreading and the ways threads synchronize

Concepts:

A process is the encapsulation of a running program and the system's basic unit of resource allocation and scheduling; a thread is a subtask of a process and the basic unit of CPU scheduling (see question 2 above for the full definitions).

The ways threads communicate and synchronize are the critical section, the mutex (Synchronized/Lock), the semaphore, and the event (Wait/Notify), as described under "Ways of communication between threads" above.

 

Describe the locking mechanisms, including the difference between a mutex and a read-write lock

Reference answer:

1. Differences between a mutex and a read-write lock:

Mutex: guarantees that at any moment only one thread can access the object. A thread that fails to acquire the lock goes to sleep and is woken when the lock is released.

Read-write lock: rwlock, which distinguishes read locking from write locking. Multiple threads may hold the read lock at the same time, but only one thread at a time may hold the write lock; threads that fail to acquire the write lock sleep until the write lock is released and they are woken. Note that the write lock excludes both readers and other writers: while one thread holds the write lock for writing, no other thread can take the read lock. Writers take priority over readers (once a writer is waiting, subsequent readers must wait and the writer is woken first). Read-write locks suit workloads where reads are far more frequent than writes.

Summary of the differences:

1) A read-write lock distinguishes readers from writers; a mutex does not.

2) A mutex allows only one thread at a time to access the object, whether reading or writing; a read-write lock allows only one writer at a time, but allows multiple readers to read the object simultaneously.

2. The four locking mechanisms in Linux:

Mutex: guarantees that at any moment only one thread can access the object; a thread that fails to acquire the lock sleeps until the lock is released and it is woken.

Read-write lock: rwlock, split into a read lock and a write lock, as described above; multiple readers may hold the read lock at once, a writer excludes everyone, writers are preferred, and the lock suits read-heavy workloads.

Spin lock: spinlock, which likewise allows only one thread to access the object at any time; but a thread that fails to acquire the lock does not sleep, it spins in place until the lock is released. This saves the cost of putting a thread to sleep and waking it again, which greatly improves efficiency when the lock is held only briefly; if the lock is held for long, however, it wastes a great deal of CPU.

RCU: read-copy-update. To modify data, first read it, then make a copy and modify the copy; once finished, swap the new data in for the old. With RCU, readers have almost no synchronization overhead: they need neither locks nor atomic instructions and cause no lock contention, so deadlock is not a concern. Writers pay more: they must copy the data before modifying it, and they still need a locking mechanism to serialize concurrent writers with each other. RCU is very efficient when read operations vastly outnumber writes.

 

Describe the process state-transition diagram, including active ready, static ready, active blocked, and static blocked

1) New: the process is being created.

2) Ready: the process has joined the ready queue and is waiting to be scheduled onto the CPU.

3) Running: the process is executing.

4) Blocked (waiting): the process temporarily cannot run for some reason, such as waiting for I/O or for a device.

5) Terminated: the process has finished running.

 

2. Swapping

When many processes compete for memory, memory becomes tight; and if no process is ready at that moment, the processor sits idle. Since I/O is much slower than the processor, it can happen that every process is blocked waiting for I/O.

Two techniques address these problems:

1) Swapping: move parts of processes out to external storage to free up memory.

2) Virtual memory: each process need only load part of its program and data in order to run.

With swapping, processes that will not run for a while, or data and code that are temporarily unused, are swapped out to external storage to free up enough memory; processes that are ready to run, or the data and code they need, are swapped back into memory.

This introduces the suspended state: a process that has been swapped out to external storage is in the suspended state.

3. Active blocked, static blocked, active ready, static ready

1) Active blocked: the process is in memory but blocked for some reason.

2) Static blocked: the process is in external storage and blocked for some reason.

3) Active ready: the process is in memory and ready; as soon as the CPU schedules it, it can run directly.

4) Static ready: the process is in external storage and ready; once it is brought into memory and the CPU schedules it, it can run.

The following transitions arise:

Active ready → static ready (memory insufficient; swapped out to external storage)

Active blocked → static blocked (memory insufficient; swapped out to external storage)

Running → static ready (time slice used up)

 


Origin blog.csdn.net/wwxy1995/article/details/94674816