Linux C System Programming (09): Process Management, Inter-Process Communication

Inter-process communication (IPC) lets multiple processes exchange data with one another. Linux offers many ways to do this.


1 Overview of interprocess communication

Inter-process communication lets multiple processes access one another, including each other's runtime data and code segments; this is very common in practical applications. The IPC mechanism itself only provides a channel for transferring data.

While a process is running, its address space is invisible to other processes (in the traditional process model). Processes are independent within the system and cannot access each other directly.

Common inter-process communication methods are: pipes, FIFOs (named pipes), signals, semaphores, message queues, shared memory, and sockets.

Pipes come in half-duplex and full-duplex forms:

  1. Half-duplex pipes: the anonymous half-duplex pipe and the FIFO
  2. Full-duplex pipes: the anonymous full-duplex pipe and the named full-duplex pipe

2 Pipe

2.1 The concept of pipes

A common way to communicate is to set up a pipe through which data flows between two processes. The pipe can be unidirectional or bidirectional. Pipes are easy to use but have many limitations. An anonymous pipe has no name in the system and is not visible anywhere in the file system. It is simply a resource of the process and is cleaned up by the system when the process ends. A familiar example of pipe communication is a shell pipeline, in which the output of one command is fed to another, for example filtering the output of ls with grep:

$ ls | grep ipc

By direction of data flow, pipes are divided into full-duplex and half-duplex pipes. In the implementation, a full-duplex pipe differs only slightly in how it is opened, but its operating rules are considerably more complicated than those of a half-duplex pipe.

2.2 Anonymous half-duplex pipeline

An anonymous pipe has no name, and the file descriptors used for the pipe have no path name. In other words, no real file is involved, just two file descriptors associated with an inode held in memory. Its characteristics are as follows:

  1. Data can only move in one direction.
  2. It can only be used between processes that share a common ancestor, that is, between parent and child processes or between sibling processes.
  3. Despite these limits, half-duplex pipes are the most common communication method.

Under Linux, use the pipe function to create an anonymous half-duplex pipe. The function prototype is as follows:

#include <unistd.h>
int pipe(int pipefd[2]);

See the Linux manual page pipe(2) for details. On success, pipefd[0] is the read end and pipefd[1] is the write end. Neither descriptor in the array is associated with any named file, which is where the name anonymous pipe comes from.
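As a minimal sketch of how pipe is typically combined with fork, the helper below sends a string from a child process to its parent. The function name pipe_roundtrip and its buffer handling are illustrative assumptions, not part of any library.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations under strict -std modes */
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child writes msg into a pipe and the parent reads
 * it back into buf (buflen must be at least 1).
 * Returns the number of bytes read, or -1 on error. */
ssize_t pipe_roundtrip(const char *msg, char *buf, size_t buflen)
{
    int pipefd[2];
    if (pipe(pipefd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    if (pid == 0) {                      /* child: the writer */
        close(pipefd[0]);                /* the child never reads */
        if (write(pipefd[1], msg, strlen(msg)) == -1)
            _exit(1);
        close(pipefd[1]);
        _exit(0);
    }

    /* parent: the reader */
    close(pipefd[1]);                    /* otherwise read() never sees EOF */
    size_t total = 0;
    ssize_t n;
    while (total < buflen - 1 &&
           (n = read(pipefd[0], buf + total, buflen - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';

    close(pipefd[0]);
    waitpid(pid, NULL, 0);               /* reap the child */
    return (ssize_t)total;
}
```

Calling pipe_roundtrip("hello", buf, sizeof buf) should leave "hello" in buf. Each side closes the end it does not use, which is what lets the reader detect end of file.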

2.3 Reading and writing operations of anonymous half-duplex pipelines

  • Use the read and write functions to operate on the pipe. If a process writes to a pipe whose read end has been closed, the SIGPIPE signal is generated to indicate that the read end is closed. If the signal is caught or ignored, the write call returns -1 and errno is set to EPIPE. If the writing process neither catches nor ignores SIGPIPE, it is terminated.
  • A pipe can be inherited. Normally the pipe function is used together with fork, but note the order: the parent process must create the pipe before the fork, and may close its ends only after the child process has inherited them. If the pipe is closed before the fork, the child process does not inherit a usable pipe.
  • When the read function returns 0 on a pipe, it means the pipe is empty and every write end has been closed, that is, end of file. If the pipe is empty but a write end is still open, read blocks instead (or, with O_NONBLOCK set, returns -1 with errno set to EAGAIN). The two situations must be handled separately.
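The two closed-end cases above can be observed within a single process. The sketch below uses hypothetical helper names and temporarily ignores SIGPIPE so that the failed write reports EPIPE instead of terminating the process.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations under strict -std modes */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes to a pipe whose read end is already closed, with SIGPIPE ignored.
 * Returns the errno of the failed write (EPIPE is expected),
 * 0 if the write unexpectedly succeeded, or -1 if the pipe could not be made. */
int write_after_reader_closed(void)
{
    int pipefd[2];
    if (pipe(pipefd) == -1)
        return -1;

    void (*old)(int) = signal(SIGPIPE, SIG_IGN);  /* avoid being killed */
    close(pipefd[0]);                             /* no reader is left */

    ssize_t n = write(pipefd[1], "x", 1);
    int err = (n == -1) ? errno : 0;

    close(pipefd[1]);
    if (old != SIG_ERR)
        signal(SIGPIPE, old);                     /* restore the old handling */
    return err;
}

/* Reads from an empty pipe whose write end is already closed.
 * Returns what read() reports (0, meaning end of file, is expected),
 * or -2 if the pipe could not be created. */
ssize_t read_after_writer_closed(void)
{
    int pipefd[2];
    char c;
    if (pipe(pipefd) == -1)
        return -2;

    close(pipefd[1]);                             /* no writer is left */
    ssize_t n = read(pipefd[0], &c, 1);
    close(pipefd[0]);
    return n;
}
```

Note that signal changes the disposition for the whole process, which is why the sketch restores the previous setting before returning.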

2.4 Standard library functions for creating pipelines

In practice, pipe operations follow a standard sequence of steps, so these steps are wrapped in two library functions, popen and pclose. They are specified by POSIX rather than by ANSI / ISO C. The function prototypes under Linux are as follows:

#include <stdio.h>
FILE *popen(const char *command, const char *type);
int pclose(FILE *stream);

See the Linux manual page popen(3) for details.
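A short sketch of reading a command's output through popen follows. The helper name first_line_of is an assumption made for illustration; the command string is passed to the shell exactly as given.

```c
#define _XOPEN_SOURCE 700   /* popen/pclose are POSIX, not ISO C */
#include <stdio.h>

/* Hypothetical helper: runs command through popen() and copies the first
 * line of its standard output into buf (empty string if there is none).
 * Returns the value of pclose(), or -1 if popen() failed. */
int first_line_of(const char *command, char *buf, int buflen)
{
    FILE *fp = popen(command, "r");      /* "r": we read the command's stdout */
    if (fp == NULL)
        return -1;

    if (fgets(buf, buflen, fp) == NULL)
        buf[0] = '\0';

    return pclose(fp);                   /* waits for the command to finish */
}
```

For example, first_line_of("echo hello", buf, sizeof buf) should store "hello\n" in buf. A stream opened with popen must be closed with pclose, not fclose, so that the child process is waited for.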


3 FIFO pipeline

3.1 The concept of FIFO pipeline

A FIFO is also known as a named pipe. It is a file type and is visible in the file system. Communicating through a FIFO resembles passing data between processes through a file, except that a FIFO also behaves like a pipe: data is removed from it as it is read. The mkfifo command in the shell creates a named pipe.

3.2 Creation of FIFO pipeline

Creating a FIFO is similar to creating a file, and a FIFO is likewise accessed by path name. Under Linux, use the mkfifo function to create a FIFO. The function prototype is as follows:

#include <sys/types.h>
#include <sys/stat.h>
int mkfifo(const char *pathname, mode_t mode);

See the Linux manual page mkfifo(3) for details.
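The following sketch creates a FIFO and confirms through stat that the new file really has the FIFO type. The helper name make_fifo and the 0600 mode are illustrative choices.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations under strict -std modes */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates a FIFO at path, readable and writable by the
 * owner only. Returns 1 if the new file is a FIFO, 0 if it is not,
 * and -1 on error (for example when the path already exists). */
int make_fifo(const char *path)
{
    struct stat st;

    if (mkfifo(path, 0600) == -1)
        return -1;
    if (stat(path, &st) == -1)
        return -1;

    return S_ISFIFO(st.st_mode) ? 1 : 0;
}
```

Unlike an anonymous pipe, the FIFO stays in the file system after the process exits, so the caller is expected to remove it with unlink when it is no longer needed.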

3.3 FIFO read and write operations

General I/O functions can be used on FIFO files. Note, however, that when a FIFO is opened with the open function, the O_NONBLOCK bit in the flags argument determines how open returns:

  1. If set: a read-only open returns immediately. A write-only open returns -1 (with errno set to ENXIO) if no process has the FIFO open for reading. In other words, in non-blocking mode a reader must already be present before a writer can open the FIFO.
  2. If not set: open may block. A read-only open blocks until some process opens the FIFO for writing, and a write-only open blocks until some process opens the FIFO for reading.
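The non-blocking rules above can be checked with a small sketch. Both helper names are hypothetical, and the FIFO at fifo_path is assumed to exist already with no other process holding it open.

```c
#define _XOPEN_SOURCE 700   /* request POSIX declarations under strict -std modes */
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Tries a non-blocking write-only open of an existing FIFO.
 * Returns 0 if the open succeeded, otherwise the errno of the failure
 * (ENXIO is expected when no process has the FIFO open for reading). */
int nonblocking_write_open_errno(const char *fifo_path)
{
    int fd = open(fifo_path, O_WRONLY | O_NONBLOCK);
    if (fd == -1)
        return errno;
    close(fd);
    return 0;
}

/* Tries a non-blocking read-only open of an existing FIFO.
 * Returns 1 if the open returned successfully without a writer, 0 otherwise. */
int nonblocking_read_open_ok(const char *fifo_path)
{
    int fd = open(fifo_path, O_RDONLY | O_NONBLOCK);
    if (fd == -1)
        return 0;
    close(fd);
    return 1;
}
```

Without O_NONBLOCK, either of these open calls would block until the opposite end is opened by another process.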

When every process that had the FIFO open for writing has closed it, the reading process sees end-of-file. FIFOs solve the problem of applications generating large numbers of intermediate files. A FIFO can be used from the shell to pass data from one process to another, and the system does not have to clean up leftover intermediate files afterwards; the channel does not need to be released and can be reused by later processes. In addition, a FIFO avoids the scope limitation of anonymous pipes and can be used between unrelated processes.

3.4 Disadvantages of FIFO

A FIFO can support a client-server architecture, provided that the FIFO interface offered by the server is known in advance:

  1. The client sends a request to the server: the client needs to know the server's public FIFO.
  2. The server sends a reply to the client: a dedicated FIFO must be created for each client. Once the number of clients grows large enough, the FIFOs can overload the server and cause it to crash.

4 System V IPC / POSIX IPC communication mechanism introduction

System V IPC comprises three communication mechanisms: message queues, semaphores, and shared memory. It is an old interface, and newer code often uses POSIX IPC instead. POSIX IPC and System V IPC provide the same kinds of objects, but their functions and implementations differ. These IPC mechanisms differ from pipes and FIFOs: pipes and FIFOs are based on the file system, while IPC objects live in the kernel. Use the ipcs command to view the current state of the system's IPC objects.

4.1 The concept of IPC objects

An IPC object is a kernel-level tool for inter-process communication. An existing IPC object is referenced and accessed through its identifier, a non-negative integer that uniquely identifies the object. The object can be any of the three types: message queue, semaphore, or shared memory.

In Linux, identifiers are declared as integers. The wrap-around described below is traditionally tied to a 16-bit sequence number (the seq field of ipc_perm, maximum 65535, that is 2^16 - 1), although the exact limit depends on the kernel version. Note that an identifier is different from a file descriptor. When a file is opened with the open function, the descriptor returned is the lowest-numbered unused entry in the current process's file descriptor table. An IPC identifier, by contrast, keeps increasing as IPC objects are deleted and created; after reaching the maximum value it wraps around to zero and is reused cyclically.

The IPC identifier only solves the problem of accessing an IPC object from inside a process. For several processes to reach the same IPC object, an external name is also needed: the key. Every IPC object is associated with a key, which gives multiple processes a way to rendezvous on one object.

There are several ways to let multiple processes know that such a key exists:

  1. Use a file as the intermediary. The process that creates the IPC object uses the key IPC_PRIVATE to create it and stores the returned identifier in a file. Other processes read this identifier from the file and use it to reference the IPC object.
  2. Define a key agreed upon by all the processes. Each process uses this key to reference the IPC object. If the creating process finds that the key is already bound to an IPC object, it should delete that object and create a new one.
  3. Agreeing on a fixed key does not scale well in multi-process communication. If the key is already bound to an IPC object, the existing object must be deleted before a new one is created, and this may disrupt other processes that are using it. The function ftok can solve this problem to a certain extent.

The function ftok can generate a key value using two parameters. The function prototype is as follows:

#include <sys/types.h>
#include <sys/ipc.h>
key_t ftok(const char *pathname, int proj_id);

See the Linux manual page ftok(3) for details. The function combines part of the st_dev and st_ino members of the stat structure of the file named by pathname with the low-order 8 bits of proj_id to generate a key. Because only part of st_dev and st_ino is used, information is lost, so two different files combined with the same proj_id can still yield the same key.
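To make the combination concrete, the sketch below rebuilds a key by hand using the layout that the ftok(3) manual page describes for glibc: the low 8 bits of proj_id, the low 8 bits of st_dev, and the low 16 bits of st_ino. This layout is an implementation detail assumed here for illustration and is not guaranteed by POSIX; the helper name manual_key is hypothetical.

```c
#define _XOPEN_SOURCE 700   /* request XSI declarations (ftok, key_t) */
#include <sys/ipc.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: builds a key the way glibc's ftok() is documented
 * to build it. Assumption: this bit layout is glibc-specific.
 * Returns (key_t)-1 if the file cannot be examined. */
key_t manual_key(const char *path, int proj_id)
{
    struct stat st;
    if (stat(path, &st) == -1)
        return (key_t)-1;

    unsigned int key = (((unsigned int)proj_id & 0xffu) << 24) |   /* 8 bits  */
                       (((unsigned int)st.st_dev & 0xffu) << 16) | /* 8 bits  */
                       ((unsigned int)st.st_ino & 0xffffu);        /* 16 bits */
    return (key_t)key;
}
```

On a glibc system, manual_key("/", 'A') is expected to equal ftok("/", 'A'). Because only the low 8 bits of proj_id take part, ftok("/", 0x141) and ftok("/", 0x41) produce the same key.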

The system keeps an ipc_perm structure for each IPC object. It records the permissions and owner of the object, and the system consults it to decide whether an IPC operation may access the object. The structure differs between kernel versions. One definition is as follows:

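struct ipc_perm
{
    __kernel_key_t key;      /* key supplied when the object was created */
    __kernel_uid_t uid;      /* effective user ID of the owner */
    __kernel_gid_t gid;      /* effective group ID of the owner */
    __kernel_uid_t cuid;     /* effective user ID of the creator */
    __kernel_gid_t cgid;     /* effective group ID of the creator */
    __kernel_mode_t mode;    /* access permissions */
    unsigned short seq;      /* sequence number */
};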

Only root or the process that created the IPC object may change the values in the ipc_perm structure. IPC objects have the following drawbacks:

  1. The programming interface is overly complicated. Compared with other communication methods, IPC requires noticeably more code.
  2. IPC does not use the ordinary file system, so standard I/O operations cannot be used and new functions had to be added. Because IPC objects are not file descriptors, the I/O multiplexing functions select and poll cannot be used on them.
  3. There is no automatic resource reclamation. An IPC object generally goes away only when a process reads the message, or when its owner or the superuser deletes it. Pipes and FIFOs, by contrast, are released automatically.

4.2 IPC system commands

Use the ipcs command in the shell to display IPC status. The output of ipcs includes:

  • key: the external key of the IPC object
  • shmid: the identifier of the IPC object
  • owner: the user that owns the IPC object
  • perms: the access permissions

Command to delete an IPC object (here a shared memory segment, by its shmid):

$ ipcrm -m <shmid>

Command to view the current IPC status (shared memory segments):

$ ipcs -m

5 Shared memory

Shared memory is the fastest of all inter-process communication methods. It is a resource that exists at the kernel level, and it is described by corresponding files under the /proc directory.

5.1 The concept of shared memory

The shared memory mechanism relies on paging: when the kernel allocates memory to a process, the physical pages behind the process's addresses need not be contiguous, and the same section of physical memory can be mapped into several processes at once. For each shared memory segment, the kernel maintains a structure of type shmid_ds. The definition of shmid_ds is as follows:

struct shmid_ds
{
    struct ipc_perm shm_perm;                /* operation perms */
    int shm_segsz;                               /* size of segment (bytes) */
    __kernel_time_t shm_atime;          /* last attach time */
    __kernel_time_t shm_dtime;           /* last detach time */
    __kernel_time_t shm_ctime;           /* last change time */
    __kernel_ipc_pid_t shm_cpid;           /* pid of creator */
    __kernel_ipc_pid_t shm_lpid;           /* pid of last operator */
    unsigned short shm_nattch;                /* no. of current attaches */
    unsigned short shm_unused;                /* compatibility */
    void *shm_unused2;                          /* ditto - used by DIPC */
    void *shm_unused3;                          /* unused */
}; 

The shmid_ds structure differs slightly between kernel versions, and the size limit on a shared memory segment also varies from system to system. Consult the relevant manuals before relying on either.

5.2 Creation of shared memory

Under Linux, use the shmget function to create or open a shared memory segment. The prototype of shmget is as follows:

#include <sys/ipc.h>
#include <sys/shm.h>
int shmget(key_t key, size_t size, int shmflg);

See the Linux manual page shmget(2) for details.

5.3 Shared memory operation

Because shared memory is a special kind of resource, it cannot be operated on like an ordinary file and needs its own functions. Linux provides shmctl for the various control operations on shared memory. The prototype is as follows:

#include <sys/ipc.h>
#include <sys/shm.h>
int shmctl(int shmid, int cmd, struct shmid_ds *buf);

See the Linux manual page shmctl(2) for details. Notes:

  1. After a fork, the child process inherits the attached shared memory segments. After an exec, all attached shared memory segments are detached from the process. When a process ends, its attached shared memory segments are detached automatically.
  2. The shmat function, void *shmat(int shmid, const void *shmaddr, int shmflg), attaches the shared memory segment identified by shmid. After a successful attach, the segment is mapped into the address space of the calling process and can then be accessed like local memory.
  3. When the operations on the shared memory segment are finished, call the shmdt function to detach it. The function prototype is as follows:

#include <sys/types.h>
#include <sys/shm.h>
int shmdt(const void *shmaddr);

See the Linux manual page shmdt(2) for details.
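The whole life cycle (create, attach, use, detach, remove) can be sketched as follows. The helper name shm_roundtrip and the 4096-byte size are assumptions for illustration, and waitpid stands in for real synchronization, which would normally be done with a semaphore.

```c
#define _XOPEN_SOURCE 700   /* request XSI declarations (System V IPC) */
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SEG_SIZE 4096       /* illustrative segment size */

/* Hypothetical helper: a child process writes msg into a shared segment and
 * the parent copies it into out (outlen must be at least 1).
 * Returns 0 on success, -1 on error. */
int shm_roundtrip(const char *msg, char *out, size_t outlen)
{
    int shmid = shmget(IPC_PRIVATE, SEG_SIZE, IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;

    char *mem = shmat(shmid, NULL, 0);       /* let the kernel pick the address */
    if (mem == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    pid_t pid = fork();                      /* the child inherits the attachment */
    if (pid == 0) {
        strncpy(mem, msg, SEG_SIZE - 1);
        mem[SEG_SIZE - 1] = '\0';
        shmdt(mem);
        _exit(0);
    }

    int ok = -1;
    if (pid > 0 && waitpid(pid, NULL, 0) == pid) {   /* crude synchronization */
        strncpy(out, mem, outlen - 1);
        out[outlen - 1] = '\0';
        ok = 0;
    }

    shmdt(mem);                              /* detach from this process */
    shmctl(shmid, IPC_RMID, NULL);           /* mark the segment for removal */
    return ok;
}
```

After shm_roundtrip("shared", out, sizeof out) returns 0, out should hold "shared". No data is copied through the kernel on the way: both processes access the same physical pages.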

5.4 Notes on using shared memory

  • Compared with the other methods, shared memory makes reading and writing data more transparent. Once a segment has been attached, it is equivalent to a pointer to a block of memory that the current user can access freely. The drawback is that the program must impose its own structure on the data it writes and reads, and extra code is needed to provide synchronization and mutual exclusion among the processes.
  • By convention, the end of a string in the shared memory segment marks the end of a message. As long as every process follows this rule, the integrity of the data is not broken.

6 Semaphore

6.1 The concept of semaphore

A semaphore does not itself transfer data. It is only a marker for an external resource and is used to determine whether that resource is available. The semaphore is responsible for mutual exclusion and synchronization during this process. To request a resource guarded by a semaphore, a process first reads the value of the semaphore to determine whether the resource is available:

  1. If the value is greater than 0, resources are available to request;
  2. If the value is 0, no resources are available, and the process sleeps until resources become available.

When a process finishes using a shared resource controlled by a semaphore, the value of the semaphore is incremented by 1. Increments and decrements of a semaphore are atomic operations, because the main purpose of a semaphore is to provide mutually exclusive or synchronized access to a resource by multiple processes. Creating and initializing a semaphore, however, is not guaranteed to be atomic. The kernel sets up a semid_ds structure for each semaphore set and uses an unnamed structure to represent each individual semaphore. The definition varies with the Linux environment. The semid_ds structure is defined as follows:

struct semid_ds {
    struct ipc_perm    sem_perm;              /* permissions .. see ipc.h */
    __kernel_time_t    sem_otime;             /* last semop time */
    __kernel_time_t    sem_ctime;             /* create/last semctl() time */
    struct sem         *sem_base;             /* ptr to first semaphore in array */
    struct sem_queue   *sem_pending;          /* pending operations to be processed */
    struct sem_queue   **sem_pending_last;    /* last pending operation */
    struct sem_undo    *undo;                 /* undo requests on this array */
    unsigned short     sem_nsems;             /* no. of semaphores in array */
};

The semid_ds data structure represents each newly created semaphore set. When semget() creates a new semaphore set, it returns an identifier that can be used to reference the set's semid_ds data structure.

6.2 Semaphore creation

As with shared memory, the system provides a series of dedicated functions (semget, semctl, and so on) for semaphores. Under Linux, use the semget function to create or obtain a semaphore set ID. The prototype is as follows:

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
int semget(key_t key, int nsems, int semflg);

See the Linux manual page semget(2) for details.

6.3 Semaphore operation

Among the three IPC object types, the functions that operate on semaphore sets are much more complicated than those for the other two types, and semaphores are also used more widely than the other two. Semaphores have their own dedicated control function. Under Linux, use the semctl function; the prototype is as follows:

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
int semctl(int semid, int semnum, int cmd, ...);

See the Linux manual page semctl(2) for details.
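A minimal sketch of the semaphore calls follows. It creates a set holding one semaphore, initializes it with semctl, and changes its value with semop, the System V function that performs the actual decrement and increment (semop is not covered in the text above). All svsem_ helper names are hypothetical, and union semun has to be defined by the program itself, as the semctl(2) manual page explains.

```c
#define _XOPEN_SOURCE 700   /* request XSI declarations (System V IPC) */
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>

/* The caller must define this union; see semctl(2). */
union semun {
    int val;                     /* value for SETVAL */
    struct semid_ds *buf;        /* buffer for IPC_STAT, IPC_SET */
    unsigned short *array;       /* array for GETALL, SETALL */
};

/* Creates a private set with one semaphore set to initial.
 * Returns the semaphore set ID, or -1 on error. */
int svsem_create(int initial)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    union semun arg;
    arg.val = initial;           /* creation and initialization are two steps */
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

/* Adds delta to semaphore 0: -1 requests the resource (and sleeps while the
 * value is 0), +1 releases it. Returns 0 on success, -1 on error. */
int svsem_change(int semid, short delta)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = 0 };
    return semop(semid, &op, 1);
}

/* Returns the current value of semaphore 0, or -1 on error. */
int svsem_value(int semid)
{
    return semctl(semid, 0, GETVAL);
}

/* Removes the semaphore set. Returns 0 on success, -1 on error. */
int svsem_destroy(int semid)
{
    return semctl(semid, 0, IPC_RMID);
}
```

A typical critical section would call svsem_change(semid, -1) before touching the shared resource and svsem_change(semid, 1) afterwards.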


7 Message queue

7.1 Concept of message queue

A message queue is a set of data organized as a linked list and stored in the kernel. Each process references it through a message queue identifier. It is maintained by the kernel and is the most flexible of the three IPC objects for manipulating data: messages can be retrieved from the queue by message type, in any order. Maintaining the linked list does require more memory, and reading and writing are more complicated and slower than with shared memory.

7.2 Creating a message queue

Under Linux, use the msgget function to create or open a queue. The function prototype is as follows:

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
int msgget(key_t key, int msgflg);

See the Linux manual page msgget(2) for details.

7.3 Message queue operations, read and write

Under Linux, use the msgsnd and msgrcv functions to write to and read from a message queue. The function prototypes are as follows:

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
int msgsnd(int msqid, const void *msgp, size_t msgsz, int msgflg); /* write to the message queue */
ssize_t msgrcv(int msqid, void *msgp, size_t msgsz, long msgtyp, int msgflg); /* read from the message queue */

See the Linux manual pages msgsnd(2) and msgrcv(2) for details.
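The sketch below shows retrieval by message type: two messages are queued, and the second one is fetched first by asking for its type. The structure demo_msg and the helper msg_roundtrip are hypothetical names; msgctl with IPC_RMID, which the text does not cover, is used to remove the queue afterwards.

```c
#define _XOPEN_SOURCE 700   /* request XSI declarations (System V IPC) */
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

/* A message is a long type field (which must be > 0) followed by the payload. */
struct demo_msg {
    long mtype;
    char mtext[64];
};

/* Hypothetical helper: queues a type-1 and a type-2 message, then receives
 * the type-2 message and copies its text into out (outlen at least 1).
 * Returns 0 on success, -1 on error. */
int msg_roundtrip(char *out, size_t outlen)
{
    int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (qid == -1)
        return -1;

    struct demo_msg a = { .mtype = 1 };
    struct demo_msg b = { .mtype = 2 };
    struct demo_msg got;
    strcpy(a.mtext, "first");
    strcpy(b.mtext, "second");

    int ok = -1;
    /* msgsz counts only the payload, not the mtype field */
    if (msgsnd(qid, &a, sizeof a.mtext, 0) == 0 &&
        msgsnd(qid, &b, sizeof b.mtext, 0) == 0 &&
        msgrcv(qid, &got, sizeof got.mtext, 2, 0) != -1) {
        strncpy(out, got.mtext, outlen - 1);
        out[outlen - 1] = '\0';
        ok = 0;
    }

    msgctl(qid, IPC_RMID, NULL);   /* queues persist until removed */
    return ok;
}
```

After a successful call, out should hold "second" even though "first" was queued earlier, because msgrcv was asked for type 2. Passing 0 as msgtyp would instead return the oldest message of any type.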

Origin blog.csdn.net/vviccc/article/details/105159556