Linux: Interprocess Communication

Source: http://www.cnblogs.com/vamei

 

We explained in Linux Signal Basics that a signal can be regarded as a crude form of interprocess communication (IPC): a way to deliver information into a process's otherwise closed memory space. To pass more information between processes, we need other IPC mechanisms. These can be divided into two types:

  • The pipe (PIPE) mechanism. In Linux Text Streaming, we mentioned that a pipe connects the output of one process to the input of another, so that inter-process communication can be handled with the file manipulation API. In the shell, we often use pipes to chain several processes together so that they cooperate to perform complex tasks.
  • Traditional IPC. Here we mainly mean message queues, semaphores, and shared memory. What these have in common is that they let multiple processes share a resource, much as multiple threads share the heap and global data. Because multiple processes run concurrently (each process contains at least one thread, so multiple processes mean multiple threads), synchronization must also be handled when sharing resources (refer to Linux multi-threading and synchronization).

 

Pipes and FIFO files

A primitive form of IPC is for processes to communicate through a file. For example, I write my name and age on a piece of paper (the file). Another person who reads the paper learns my name and age. He can also write his own message on the same paper, and when I read it, I learn his message in turn. However, because reading and writing a hard disk is relatively slow, this method is inefficient. So, can we put this piece of paper into memory to speed up reading and writing?

In Linux Text Streaming, we explained how to use pipes to connect multiple processes in the shell. Many programming languages offer a similar mechanism: Python's subprocess module has Popen and PIPE, and C has the popen library function, which rests on the same kernel mechanism as pipes in the shell. A pipe is a buffer managed by the kernel, which is the equivalent of the note we wanted to put into memory. One end of the pipe is connected to the output of a process, which puts information into the pipe. The other end is connected to the input of a process, which takes that information out. The buffer does not need to be large: it is designed as a ring (circular) data structure, so its space can be reused. When the pipe is empty, a process reading from it waits until the process at the other end puts information in. When the pipe is full, a process trying to write to it waits until the process at the other end takes information out. When both processes have terminated, the pipe disappears automatically.
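
The Popen and PIPE mechanism mentioned above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name run_pipeline is hypothetical, and the two child processes are small inline Python snippets chosen only so that the example does not depend on any particular shell command.

```python
import subprocess
import sys


def run_pipeline(text):
    """Connect a producer's output to a consumer's input through a pipe."""
    # Producer: writes its argument to standard output, which is a pipe.
    producer = subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", text],
        stdout=subprocess.PIPE,
    )
    # Consumer: reads the producer's pipe as its standard input.
    consumer = subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
    )
    # The parent drops its own copy of the read end; only the consumer keeps it.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.decode()


if __name__ == "__main__":
    print(run_pipeline("hello pipe"))
```

The consumer finishes reading only after the producer exits and the write end of the pipe is closed, which is the waiting behavior described above.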


In principle, a pipe is established with the fork mechanism (refer to Linux process foundation and Linux from program to process), which is how two processes come to be connected to the same pipe. At the beginning, both ends of the pipe are connected to a single process, Process 1. When fork copies the process, the two connections are copied to the new process (Process 2) as well. Each process then closes the connection it does not need: Process 1 closes its input connection from the pipe, and Process 2 closes its output connection to the pipe. The two remaining connections form the pipe between the processes.
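
The fork-then-close sequence can be sketched in Python with os.pipe and os.fork. This is a sketch under assumptions: the function name echo_via_child is hypothetical, and a second pipe is added only so that the child can send a reply back, since each pipe carries data in one direction.

```python
import os


def echo_via_child(message: bytes) -> bytes:
    """Send bytes to a forked child over one pipe, read its reply over another."""
    to_child_r, to_child_w = os.pipe()    # parent -> child
    to_parent_r, to_parent_w = os.pipe()  # child -> parent

    pid = os.fork()  # the child inherits copies of all four ends
    if pid == 0:
        try:
            # Child: close the ends it does not need.
            os.close(to_child_w)
            os.close(to_parent_r)
            with os.fdopen(to_child_r, "rb") as src, os.fdopen(to_parent_w, "wb") as dst:
                dst.write(src.read().upper())
        finally:
            os._exit(0)  # never fall through into the parent's code

    # Parent: close the opposite ends.
    os.close(to_child_r)
    os.close(to_parent_w)
    with os.fdopen(to_child_w, "wb") as dst:
        dst.write(message)  # closing the write end lets the child's read finish
    with os.fdopen(to_parent_r, "rb") as src:
        reply = src.read()
    os.waitpid(pid, 0)
    return reply


if __name__ == "__main__":
    print(echo_via_child(b"hello"))
```

Closing the unused ends matters: a reader only sees end-of-file once every copy of the write end has been closed.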

Because of the fork mechanism, a pipe can only be used between a parent process and its child, or between two child processes that share an ancestor (that is, between related processes). To remove this restriction, Linux provides the FIFO, also called a named pipe.

FIFO (First In, First Out) is a special file type that has a path in the file system. When one process opens the file for reading (r) and another opens it for writing (w), the kernel establishes a pipe between the two processes. The FIFO is therefore managed by the kernel, and the data passing through it never touches the hard disk. It is called a FIFO because a pipe is essentially a first-in, first-out queue: the earliest data written is the first to be read (like a conveyor belt, where goods are loaded at one end and picked up at the other), which preserves the order of the information exchanged. A FIFO merely borrows the file system (refer to Linux file management background knowledge) to give the pipe a name. Processes that open the FIFO in write mode write to the file, and processes that open it in read mode read from it. Once the FIFO file is deleted, its name is gone and no new process can connect through it. The advantage of a FIFO is that the pipe is identified by a file path, so a connection can be established between unrelated processes.
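
A FIFO can be tried out with os.mkfifo. The sketch below is illustrative only: the function name fifo_round_trip and the file name demo_fifo are hypothetical, and the writer is a separate Python process that learns about the pipe purely from the path passed on its command line.

```python
import os
import subprocess
import sys
import tempfile

# Code run by the writer process: open the path given as argv[1] and write argv[2].
WRITER_CODE = "import sys\nwith open(sys.argv[1], 'w') as f: f.write(sys.argv[2])"


def fifo_round_trip(text):
    """Pass text from a writer process to this process through a named pipe."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_fifo")  # hypothetical name
        os.mkfifo(path)  # creates the special file; no data is stored on disk
        writer = subprocess.Popen([sys.executable, "-c", WRITER_CODE, path, text])
        # Opening for reading blocks until some process opens the FIFO for writing.
        with open(path) as fifo:
            received = fifo.read()
        writer.wait()
        return received


if __name__ == "__main__":
    print(fifo_round_trip("hello fifo"))
```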

 

Traditional IPC

These traditional IPC mechanisms have a long history, and their design is not perfect (for example, some process has to take responsibility for deleting an IPC object once it has been created). A common feature is that they do not use the file manipulation API. For each kind of IPC you can create multiple connections, which are identified by a key. Within a process, we use the key value to select the connection we want (for example, one of several message queues). The key value can be passed between processes by some other IPC method (such as the pipe, the FIFO, or a file, as mentioned above), or it can be hard-coded into the programs.

When several processes share a key, these traditional IPC mechanisms work much like resource sharing among multiple threads (see Linux Multithreading and Synchronization):

  • A semaphore is similar to a mutex and is used for synchronization. If a mutex is like a toilet that holds only one person, a semaphore is like a toilet that holds N people. In a sense, a semaphore is a counting lock that can be held by up to N processes at once (in translation, the term semaphore is easily confused with signal). When more processes try to acquire the semaphore, they must wait until an earlier process releases it. When N equals 1, a semaphore behaves like a mutex. Many programming languages also use semaphores for multi-thread synchronization. A semaphore remains in the kernel until some process removes it.
  • Shared memory is similar to the way multiple threads share global data and the heap. A process can set aside part of its own memory space and allow other processes to read and write it. When using shared memory, we have to pay attention to synchronization. We can synchronize with a semaphore, or we can create a mutex or another thread synchronization variable inside the shared memory. Since shared memory lets multiple processes operate directly on the same memory area, it is the most efficient IPC method.
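
The combination of shared memory and a semaphore can be sketched with Python's multiprocessing module. Note the assumptions: this uses the primitives that multiprocessing exposes rather than the traditional key-based interface described in this section, the shared block is identified by a name instead of a key, the function names are hypothetical, and the "fork" start method makes the sketch Unix-only.

```python
import multiprocessing as mp
import struct
from multiprocessing import shared_memory


def _worker(shm_name, sem, increments):
    """Attach to the shared block by name and increment the counter in it."""
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        for _ in range(increments):
            with sem:  # with N = 1, only one process updates at a time
                (value,) = struct.unpack_from("q", shm.buf, 0)
                struct.pack_into("q", shm.buf, 0, value + 1)
    finally:
        shm.close()


def count_in_shared_memory(workers=4, increments=1000):
    """Let several processes increment one 8-byte counter in shared memory."""
    ctx = mp.get_context("fork")  # assumption: a Unix system with fork
    sem = ctx.Semaphore(1)        # counting lock with N = 1
    shm = shared_memory.SharedMemory(create=True, size=8)
    try:
        struct.pack_into("q", shm.buf, 0, 0)
        procs = [
            ctx.Process(target=_worker, args=(shm.name, sem, increments))
            for _ in range(workers)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        (total,) = struct.unpack_from("q", shm.buf, 0)
        return total
    finally:
        shm.close()
        shm.unlink()  # the block persists until some process removes it


if __name__ == "__main__":
    print(count_in_shared_memory())
```

Without the semaphore, two processes could read the same old value and one increment would be lost, which is the synchronization problem the text refers to.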

A message queue is similar to a pipe. It also creates a queue, and the message put into the queue first is taken out first. The difference is that a message queue allows multiple processes to put messages in and multiple processes to take messages out. Each message carries an integer identifier (message_type), which you can use to classify messages (in the extreme case, giving every message a different identifier). When a process takes a message from the queue, it can take messages in first-in, first-out order, or it can take only messages that match a certain identifier (if there are several such messages, they too are taken in first-in, first-out order). Another difference between a message queue and a pipe is that a message queue does not use the file API. Finally, a queue does not disappear automatically: it exists in the kernel until some process deletes it.
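
The retrieval rule just described can be modeled in a few lines. This is only an in-process model of the ordering rule, not real IPC: a real program would go through the kernel's message queue interface (msgget, msgsnd and msgrcv in C). The class name TypedQueueModel is hypothetical, and using 0 to mean "any type" is an assumption borrowed from msgrcv.

```python
from collections import deque


class TypedQueueModel:
    """In-process model of how a message queue selects the next message."""

    def __init__(self):
        self._messages = deque()  # oldest message on the left

    def send(self, message_type, body):
        if message_type <= 0:
            raise ValueError("message_type must be a positive integer")
        self._messages.append((message_type, body))

    def receive(self, message_type=0):
        """Return the oldest message, or the oldest one of the given type.

        Returns None when nothing matches (a real queue would block or fail).
        """
        for index, (mtype, body) in enumerate(self._messages):
            if message_type == 0 or mtype == message_type:
                del self._messages[index]
                return mtype, body
        return None


if __name__ == "__main__":
    queue = TypedQueueModel()
    queue.send(1, "first")
    queue.send(2, "second")
    queue.send(1, "third")
    print(queue.receive(2))  # selects by identifier, skipping older messages
    print(queue.receive())   # plain first-in, first-out
```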

 

Multi-process cooperation helps us take full advantage of the multi-core and network era. Multiple processes can effectively relieve computing bottlenecks. Internet communication is really also a matter of inter-process communication, except that the processes are distributed across different computers. Network connections are made through sockets. Sockets are a large topic, so we will not go into depth here. One small note: sockets can also be used for inter-process communication within a single computer.
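
As a brief illustration of that last note, the sketch below connects a parent and a child on the same machine with socket.socketpair. The function name ask_child is hypothetical, and the sketch assumes a Unix system and a message small enough to arrive in a single recv call.

```python
import os
import socket


def ask_child(request: bytes) -> bytes:
    """Exchange one small message with a forked child over a local socket pair."""
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        try:
            # Child: keep only its own end and reply with the reversed bytes.
            parent_sock.close()
            data = child_sock.recv(1024)
            child_sock.sendall(data[::-1])
            child_sock.close()
        finally:
            os._exit(0)

    # Parent: keep only its own end.
    child_sock.close()
    parent_sock.sendall(request)
    reply = parent_sock.recv(1024)
    parent_sock.close()
    os.waitpid(pid, 0)
    return reply


if __name__ == "__main__":
    print(ask_child(b"abc"))
```

Unlike a pipe, each end of the socket pair can both send and receive, so a single connection is enough for the request and the reply.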

 

Summary

PIPE, FIFO

semaphore, message queue, shared memory; key
