Operating System 4: Process Communication Types and Communication Implementations

Table of contents

1. Types of process communication

(1) Shared-memory system

(2) Pipe communication system

(3) Message passing system

(4) Client-server system

4.1 - Sockets

4.2 - Remote Procedure Calls and Remote Method Calls

2. Implementation of message passing communication

(1) Direct messaging system

1.1 - Direct communication primitives

1.2 - Format of the message

1.3 - How processes are synchronized

1.4 - Communication Links

(2) Mailbox communication

2.1 - Structure of the mailbox

2.2 - Mailbox communication primitives

2.3 - Types of mailboxes

3. Example of direct messaging system

(1) The data structure in the message buffer queue communication mechanism

(2) Send primitive

(3) Receive primitives


        Process communication refers to the exchange of information between processes. Mutual exclusion and synchronization also require processes to exchange some information, so they count as process communication too, but only as low-level communication. Take the semaphore mechanism as an example; it is considered low-level for two reasons:

  • Low efficiency: a producer can put only one product (message) into the buffer pool at a time, and a consumer can take only one message from the buffer pool at a time. //small amount of data
  • Communication is not transparent to the user: the OS only provides the shared memory. Setting up the shared data structure, transferring the data, and handling mutual exclusion and synchronization must all be implemented by the programmer, which is very inconvenient. //cumbersome process

        When a large amount of data needs to be transferred between processes, the advanced communication tools provided by the OS should be used. Their main features are:

  • Efficient transfer of large amounts of data. Users can directly use advanced communication commands (primitives) to transfer large amounts of data efficiently. //large amount of data
  • Easy to use. The OS hides the details of implementing process communication and provides a set of commands (primitives) for advanced communication that users can call directly. In other words, the communication process is transparent to the user, which greatly reduces the complexity of communication programming. //simple to use

1. Types of process communication

        As operating systems have developed, so have the mechanisms for communication between processes, from early low-level mechanisms to high-level tools that can transmit large amounts of data. Current advanced communication mechanisms fall into four broad categories: shared-memory systems, pipe communication systems, message passing systems, and client-server systems.

(1) Shared-memory system

        In a shared-memory system, the communicating processes share certain data structures or shared storage areas and communicate through them. Accordingly, there are two types: //shared storage

  • Communication based on shared data structures. All communicating processes share certain data structures, such as the bounded buffer in the producer-consumer problem, and exchange information through them. The operating system only provides the shared memory; the programmer is responsible for setting up the common data structure and for synchronizing the processes. This method is only suitable for transferring relatively small amounts of data, its efficiency is low, and it is classed as low-level communication. //small amount of data -> shared data structure
  • Communication based on a shared storage area. To transmit a large amount of data, a shared storage area is set aside in memory, and processes communicate by reading from and writing to it. The form and location of the data, and even access control, are the responsibility of the processes, not the OS. This type is classed as advanced communication. Before communicating, a process applies to the system for a partition in the shared storage area and attaches it to its own address space, after which it can read and write the data there normally. When reading and writing are finished, or the partition is no longer needed, the process returns it to the shared storage area. //large amount of data -> shared storage area
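
        The apply, attach, use, and return cycle can be sketched in C roughly as follows. This is a minimal sketch that assumes a POSIX-like system; it uses an anonymous shared mapping (mmap with MAP_SHARED | MAP_ANONYMOUS) as a stand-in for the shared storage area, and the function name shared_area_demo is made up for illustration.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A child process writes `value` into a shared area and the parent reads it
 * back. Returns the value seen by the parent, or -1 on error.
 * (Hypothetical helper, not taken from the article.) */
int shared_area_demo(int value)
{
    assert(value >= 0);          /* -1 is reserved for errors */

    /* Apply for a shared area that both parent and child can see. */
    int *area = mmap(NULL, sizeof *area, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (area == MAP_FAILED)
        return -1;
    *area = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(area, sizeof *area);
        return -1;
    }
    if (pid == 0) {              /* writing process */
        *area = value;
        _exit(0);
    }

    /* Synchronization is the program's job, not the OS's:
     * here the parent simply waits until the child has exited. */
    if (waitpid(pid, NULL, 0) < 0) {
        munmap(area, sizeof *area);
        return -1;
    }
    int seen = *area;
    munmap(area, sizeof *area);  /* give the area back */
    return seen;
}
```

        Note that the OS only hands out the memory; the waitpid call is the sketch's own (very crude) synchronization.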

(2) Pipe communication system

        A "pipe" is a shared file that connects a reading process and a writing process so that they can communicate; it is also called a pipe file.

        The sending process (the writer), which provides input to the pipe (the shared file), sends a large amount of data into the pipe as a character stream; the receiving process (the reader) receives (reads) data from the pipe. Because the two processes communicate through a pipe, this is called pipe communication. //pipe communication principle

        This method was first introduced in UNIX and has since been adopted by many other operating systems because it can transfer large amounts of data efficiently. To coordinate the two parties, the pipe mechanism must provide the following three capabilities:

  • Mutual exclusion: while one process is reading from or writing to the pipe, the other process must wait.
  • Synchronization: when the writing (input) process has written a certain amount of data (such as 4 KB) into the pipe, it goes to sleep and waits until the reading (output) process has taken the data away and wakes it up. Likewise, when the reading process finds the pipe empty, it sleeps until the writing process has written data into the pipe and wakes it up.
  • Determining whether the other party exists: communication can proceed only once it is confirmed that the other party exists.
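
        On UNIX-like systems these three capabilities come with the pipe() system call. The following is a minimal sketch, assuming POSIX pipe(), fork(), read() and write(); the helper name pipe_roundtrip is invented for this example.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A child (writer) sends msg to the parent (reader) through a pipe.
 * Copies at most cap-1 bytes into out, NUL-terminates it, and returns the
 * number of bytes read, or -1 on error. (Hypothetical helper.) */
long pipe_roundtrip(const char *msg, char *out, size_t cap)
{
    assert(msg != NULL && out != NULL && cap > 0);

    int fd[2];                        /* fd[0] = read end, fd[1] = write end */
    if (pipe(fd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                   /* writing process */
        close(fd[0]);
        size_t left = strlen(msg);
        while (left > 0) {
            ssize_t n = write(fd[1], msg, left);  /* sleeps while the pipe is full */
            if (n <= 0)
                _exit(1);
            msg += n;
            left -= (size_t)n;
        }
        close(fd[1]);                 /* the reader now sees end of file */
        _exit(0);
    }

    /* reading process */
    close(fd[1]);                     /* keep only the read end */
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fd[0], out + total, cap - 1 - total)) > 0)  /* sleeps while empty */
        total += (size_t)n;
    out[total] = '\0';
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return (long)total;
}
```

        The blocking behaviour of read() and write() provides the synchronization, and read() returning 0 once the writer has closed its end is how the reader learns that the other party is gone.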

(3) Message passing system

        In this mechanism, processes do not need any shared storage area or data structure. Instead, they encapsulate the data in formatted messages and use a set of communication commands (primitives) provided by the operating system to pass messages between processes and so complete the data exchange. //use communication primitives

        This method hides the details of the communication implementation and makes the communication process transparent to users, which reduces the complexity and error rate of communication programming, and it has become the most widely used inter-process communication mechanism. For example, in computer networks a message is also called a packet; in microkernel operating systems, communication between the microkernel and the servers invariably uses message passing. Because it also supports multiprocessor systems, distributed systems, and computer networks well, it has become the most important communication tool in these fields.

        Communication based on message passing is classed as advanced communication. Depending on how it is implemented, it can be further divided into two categories:

  • Direct communication: the sending process uses the send primitive provided by the OS to send the message directly to the target process.
  • Indirect communication: the sending and receiving processes send and receive messages through a shared intermediate entity (called a mailbox).

(4) Client-server system

        Although the shared memory, message passing, and other techniques described above can also be used for two-way communication between processes on different computers, the client-server communication mechanism has become the mainstream implementation in the various application fields of networked environments. Its main implementation methods fall into three categories: sockets, remote procedure calls, and remote method calls. //mainstream communication mechanism

4.1 - Sockets

        Sockets originated in the Berkeley version of UNIX (BSD UNIX), which the University of California, Berkeley began developing in the 1970s, and they are the network communication interface of the UNIX operating system. Initially, sockets were designed for communication between multiple applications on the same host (that is, inter-process communication), mainly to solve the problem of multiplexing ports and physical lines when several pairs of processes communicate at the same time. //distinguish ports, multiplex

        A socket is a data structure that identifies a communication endpoint. It includes the address of the communication destination, the port number used, the transport-layer protocol of the network, the network address of the process, and the system calls (or API functions) provided for client or server programs. It is the basic building block of process communication and network communication. //address + port + protocol

        Sockets are designed for the client/server model. There are generally two types:

        File-based: the communicating processes all run on the same machine, and the socket is supported by the local file system. Each socket is associated with a special file, and the two parties communicate by reading and writing that file; the principle is similar to the pipe described above. //like a pipe

        Network-based: this type usually uses asymmetric communication, that is, the sender must provide the receiver's name. The communicating processes run on different hosts in the network and are assigned a pair of sockets, one belonging to the receiving process (the server side) and one belonging to the sending process (the client side).

        Generally, when the sending process issues a connection request, it is given an arbitrary socket: the host allocates a port for it, binds the port to the socket, and does not assign that port to any other process. The receiving process has a globally known socket and a designated port (for example, an FTP server listens on port 21 and a web server on port 80) and waits for client requests on that listening port. Any process can therefore send it connection requests and information requests, which makes it easy to establish connections between processes. Once the receiving process receives a request, it accepts the connection from the sending process and completes it, so that data transmitted between the hosts is delivered accurately to the communicating processes. When the communication ends, the system closes the receiving process's socket and tears down the connection. //how sockets work

        The advantage of sockets is that they are suitable both for process communication within one computer and for process communication between different computers in a network. Because each socket has a unique socket number (also called a socket identifier), every connection in the system holds a unique pair of socket and port, so the communication of different application processes or network connections can be distinguished easily. This ensures the uniqueness of the logical link between the two communicating parties, makes it easy to serve data transfers concurrently, and hides the communication facilities and implementation details behind a unified interface. //advantages of sockets
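
        As a small illustration of sockets used on a single machine, the sketch below connects a client and an echo "server" through a pair of UNIX-domain sockets. It assumes the POSIX socketpair() call, which creates an already connected, unnamed pair and so skips the bind/listen/accept steps described above; the function name socket_echo is made up, and the sketch is only meant for short messages.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The client sends msg to a child "server" that echoes it back.
 * Stores the reply in `reply` (NUL-terminated) and returns its length,
 * or -1 on error. (Hypothetical helper for short messages.) */
long socket_echo(const char *msg, char *reply, size_t cap)
{
    assert(msg != NULL && reply != NULL && cap > 0);
    size_t len = strlen(msg);
    if (len == 0 || len >= cap)
        return -1;

    int sv[2];                          /* two connected endpoints */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                     /* server side: echo until the client stops sending */
        close(sv[0]);
        char buf[256];
        ssize_t n;
        while ((n = read(sv[1], buf, sizeof buf)) > 0)
            if (write(sv[1], buf, (size_t)n) != n)
                _exit(1);
        close(sv[1]);
        _exit(0);
    }

    close(sv[1]);                       /* client side */
    long result = -1;
    if (write(sv[0], msg, len) == (ssize_t)len) {
        shutdown(sv[0], SHUT_WR);       /* tell the server that no more requests follow */
        size_t total = 0;
        ssize_t n;
        while (total < cap - 1 &&
               (n = read(sv[0], reply + total, cap - 1 - total)) > 0)
            total += (size_t)n;
        reply[total] = '\0';
        result = (long)total;
    }
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return result;
}
```

        The same read/write calls would work over a network-based socket; only the way the two endpoints are created and connected differs.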

4.2 - Remote Procedure Calls and Remote Method Calls

        A remote procedure call (RPC) is a communication protocol for systems connected through a network. It allows a process running on one host (the local system) to call a procedure on another host (the remote system), and to the programmer this looks like an ordinary procedure call that needs no extra programming. If the software involved is object-oriented, a remote procedure call may also be called a remote method call. //remote procedure call and remote method call are the same idea

        Two processes are responsible for handling a remote procedure call: the local client process and the remote server process. They are usually called network daemons and are mainly responsible for transmitting messages across the network. Normally both are blocked, waiting for messages.

        To make a remote procedure call look the same as a local one, that is, to make RPC transparent so that the caller does not notice that the procedure executes on another host, RPC introduces the concept of a stub. On the local client, each remote procedure that can be called has a client stub; when the local process calls the remote procedure, it actually calls the stub associated with that procedure. Similarly, on the server where the remote procedure resides, the actual executable procedure has a server stub associated with it. The client stub and the corresponding server stub are generally also blocked, waiting for messages. //the principle of remote procedure calls

        As shown in the figure above, the main steps of remote procedure calls are: //The current communication basis of microservices

  1. The local caller calls the client stub associated with the remote procedure in the normal way, passes the parameters, and transfers control to the client stub.
  2. The client stub runs, builds a message containing the procedure name, the call parameters, and similar information, and transfers control to the local client process.
  3. The local client process sends the message to the remote server process.
  4. On receiving the message, the remote server process starts running, uses the remote procedure name in the message to find the corresponding server stub, and passes the message to that stub.
  5. On receiving the message, the server stub moves from the blocked state to the running state, unpacks the message to extract the parameters, and calls the associated procedure on the server in the normal way.
  6. When the remote procedure finishes, it returns the result to its server stub.
  7. The server stub regains control, packages the result as a message, and transfers control to the remote server process.
  8. The remote server process sends the message back to the client.
  9. On receiving the message, the local client process uses the procedure name to hand the message to the associated client stub and transfers control to that stub.
  10. The client stub extracts the result from the message, returns it to the local caller, and hands back control.

        In this way, the local caller regains control and has the data it needs to continue running. The main function of the steps above is to convert the client's local call into a call to the client stub and then into a local call on the server. The intermediate steps are invisible to both the client and the server, so throughout the whole process the caller is unaware that the procedure executes remotely rather than locally.
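
        The division of labour between the stubs can be sketched in a single C program. Everything here is an assumption made for illustration: the message layout (request_msg, reply_msg), the procedure id PROC_ADD, and above all the transport function, which is an ordinary function call standing in for the network and the two daemon processes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire format: procedure id plus two arguments, and a reply. */
typedef struct { uint32_t proc_id; int32_t a; int32_t b; } request_msg;
typedef struct { int32_t status; int32_t result; } reply_msg;

enum { PROC_ADD = 1 };

/* The actual procedure, which lives on the "server". */
static int32_t add_impl(int32_t a, int32_t b) { return a + b; }

/* Server stub: unpack the parameters, call the procedure in the normal way,
 * and package the result (steps 5 to 7). */
static reply_msg server_stub_add(const request_msg *req)
{
    assert(req->proc_id == PROC_ADD);
    reply_msg rep = { 0, add_impl(req->a, req->b) };
    return rep;
}

/* Server process: find the stub that matches the procedure id (step 4). */
static void server_dispatch(const unsigned char *in, unsigned char *out)
{
    request_msg req;
    memcpy(&req, in, sizeof req);
    reply_msg rep = { -1, 0 };               /* unknown procedure by default */
    if (req.proc_id == PROC_ADD)
        rep = server_stub_add(&req);
    memcpy(out, &rep, sizeof rep);
}

/* Stand-in for the network: steps 3 and 8 would move these bytes between hosts. */
static void transport(const unsigned char *req_bytes, unsigned char *rep_bytes)
{
    server_dispatch(req_bytes, rep_bytes);
}

/* Client stub: to the caller it looks like a normal function
 * (steps 1 and 2, then 9 and 10). */
int rpc_add(int a, int b, int *ok)
{
    request_msg req = { PROC_ADD, a, b };
    unsigned char req_bytes[sizeof(request_msg)];
    unsigned char rep_bytes[sizeof(reply_msg)];
    memcpy(req_bytes, &req, sizeof req);     /* build the message */

    transport(req_bytes, rep_bytes);

    reply_msg rep;
    memcpy(&rep, rep_bytes, sizeof rep);     /* extract the result */
    if (ok != NULL)
        *ok = (rep.status == 0);
    return rep.result;
}
```

        A caller simply writes rpc_add(2, 3, &ok); nothing at the call site reveals that the addition could be carried out on another host.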

2. Implementation of message passing communication

        When processes communicate, the source process can transmit a message to the target process either directly or indirectly, so message passing can be divided into direct and indirect communication. The commonly used direct messaging system and mailbox communication use these two methods respectively.

(1) Direct messaging system

        A direct message passing system uses the direct communication method: the sending process uses the send command (primitive) provided by the OS to send the message directly to the target process.

1.1 - Direct communication primitives

        Symmetric addressing: both the sending process and the receiving process must explicitly provide the identifier of the other party. Generally, the system provides the following two communication commands (primitives):

send(receiver, message);     // send a message to the receiving process
receive(sender, message);    // receive a message sent by sender

        For example, the primitive send(P2, m1) sends the message m1 to the receiving process P2, and the primitive receive(P1, m1) receives the message m1 sent by P1.

        The disadvantage of symmetric addressing is that once a process is renamed, the definitions of all other processes may have to be checked, and every reference to the old name must be found and changed to the new one. Clearly this does not help keep process definitions modular. //changing a process is troublesome

        Asymmetric addressing: in some cases the receiving process may need to communicate with many sending processes and cannot name the sender in advance. For example, a process that provides a printing service can receive "print request" messages from any process. For such applications, the receive primitive no longer names the sending process; it only supplies a parameter that stands for the source process and is filled in as a return value once the communication completes. The sending process still has to name the receiving process.

        The send and receive primitives of this mode can be written as: //the receiver no longer needs to know the sender's name in advance

send(P, message);        // send a message to process P
receive(id, message);    // receive a message from any process; id is set to the id or name of the sending process
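
        The difference between the two addressing modes can be made concrete with a small single-program simulation. This is only a sketch: the "processes" are plain integer ids, each with an in-memory inbox, and the names send_to, receive_from and receive_any are invented here rather than taken from any real OS interface.

```c
#include <assert.h>
#include <string.h>

#define MAX_PROCS 4
#define INBOX_CAP 8
#define TEXT_MAX  32

typedef struct { int sender; char text[TEXT_MAX]; } msg_t;
typedef struct { msg_t slot[INBOX_CAP]; int count; } inbox_t;

static inbox_t inbox[MAX_PROCS];     /* one inbox per simulated process id */

/* send(receiver, message): the sender always names the receiver. */
int send_to(int self, int receiver, const char *text)
{
    assert(receiver >= 0 && receiver < MAX_PROCS && text != NULL);
    inbox_t *q = &inbox[receiver];
    if (q->count == INBOX_CAP)
        return -1;                   /* inbox full */
    msg_t *m = &q->slot[q->count++];
    m->sender = self;
    strncpy(m->text, text, TEXT_MAX - 1);
    m->text[TEXT_MAX - 1] = '\0';
    return 0;
}

/* Remove the message at index i from q, keeping the others in order. */
static void take(inbox_t *q, int i, msg_t *out)
{
    *out = q->slot[i];
    memmove(&q->slot[i], &q->slot[i + 1],
            (size_t)(q->count - i - 1) * sizeof(msg_t));
    q->count--;
}

/* Symmetric: receive(sender, message) accepts only the named sender. */
int receive_from(int self, int sender, msg_t *out)
{
    inbox_t *q = &inbox[self];
    for (int i = 0; i < q->count; i++) {
        if (q->slot[i].sender == sender) {
            take(q, i, out);
            return 0;
        }
    }
    return -1;                       /* nothing from that sender yet */
}

/* Asymmetric: receive(id, message) accepts any sender and reports it in *id. */
int receive_any(int self, int *id, msg_t *out)
{
    inbox_t *q = &inbox[self];
    if (q->count == 0)
        return -1;
    take(q, 0, out);
    *id = out->sender;
    return 0;
}
```

        A print server would use receive_any, since it cannot know in advance which process will send the next request.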

1.2 - Format of the message

        The message transmitted in the message delivery system must have a certain message format.

        In a single-machine environment, the sending and receiving processes are on the same machine and share the same environment, so the message format can be fairly simple: a relatively short fixed-length format reduces the processing and storage overhead of messages. This approach can be used in an office automation system to give users fast, note-style communication.

        Fixed-length messages, however, are inconvenient for users who need to send longer messages. For this purpose a variable-length message format can be used, in which the length of the message sent by a process is variable. The system may pay more processing and storage overhead for variable-length messages, but they are more convenient for users.
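
        The trade-off shows up directly in how the two formats might be declared in C. The sketch below is illustrative only: the struct names, the 16-byte limit, and the helper functions are assumptions, and the variable-length format uses a C99 flexible array member.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FIXED_TEXT 16

/* Fixed-length format: every message costs the same; long text is truncated. */
typedef struct {
    int  sender;
    char text[FIXED_TEXT];
} fixed_msg;

/* Variable-length format: a header that records the size, then that many bytes. */
typedef struct {
    int    sender;
    size_t size;      /* bytes in text[], including the terminating NUL */
    char   text[];    /* C99 flexible array member */
} var_msg;

fixed_msg make_fixed(int sender, const char *text)
{
    assert(text != NULL);
    fixed_msg m;
    m.sender = sender;
    strncpy(m.text, text, FIXED_TEXT - 1);   /* cheap, but may cut the text short */
    m.text[FIXED_TEXT - 1] = '\0';
    return m;
}

/* The caller releases the result with free(). Returns NULL if allocation fails. */
var_msg *make_var(int sender, const char *text)
{
    assert(text != NULL);
    size_t size = strlen(text) + 1;
    var_msg *m = malloc(sizeof *m + size);   /* extra allocation work per message */
    if (m == NULL)
        return NULL;
    m->sender = sender;
    m->size = size;
    memcpy(m->text, text, size);
    return m;
}
```

        The fixed format needs no allocation and no length bookkeeping, which is the lower overhead mentioned above; the variable format keeps the whole text at the cost of that extra work.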

1.3 - How processes are synchronized

        Communication between processes also needs a synchronization mechanism to coordinate it. After a sending process or a receiving process has issued a send or a receive, there are two possibilities: the process either carries on or blocks. //continue or block

        This gives three common situations:

  • Blocking send, blocking receive. This is mainly used for tight synchronization between processes when there is no buffer between sender and receiver.
  • Non-blocking send, blocking receive. This is the most widely used form of process synchronization. The sending process does not block, so it can send one or more messages to several targets as quickly as possible; the receiving process is usually blocked and is woken only when a message arrives.
  • Non-blocking send, non-blocking receive. This is also a fairly common form. Normally both processes are busy with their own work, and each blocks itself and waits only when some event prevents it from continuing.
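
        A non-blocking receive can be illustrated with a POSIX pipe whose read end has the O_NONBLOCK flag set. This is a sketch under that assumption; the pipe stands in for whatever message channel the system provides, and the helper names are made up.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Tries to receive one byte without blocking.
 * Returns 1 if a byte was read, 0 if nothing is available yet,
 * and -1 on a real error or end of file. */
int try_receive(int fd, char *out)
{
    assert(out != NULL);
    ssize_t n = read(fd, out, 1);
    if (n == 1)
        return 1;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;                    /* a blocking receive would sleep here */
    return -1;
}

/* Creates a pipe whose read end is non-blocking. Returns 0 on success. */
int make_nonblocking_pipe(int fd[2])
{
    if (pipe(fd) < 0)
        return -1;
    int flags = fcntl(fd[0], F_GETFL);
    if (flags < 0 || fcntl(fd[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    return 0;
}
```

        Without O_NONBLOCK the same read() call would put the receiver to sleep until a sender writes, which is the blocking receive of the second case.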

1.4 - Communication Links

        In order for communication to take place between the sending process and the receiving process, a communication link must be established between the two. There are two ways to establish a communication link.

  • In the first way, before communicating, the sending process uses an explicit "establish connection" command (primitive) to ask the system to set up a communication link for it, and tears the link down when it is finished. This way is mainly used in computer networks. //network
  • In the second way, the sending process does not explicitly request a link; it simply uses the send command (primitive) provided by the system, and the system establishes a link for it automatically. This way is mainly used in single-machine systems. //single machine

        According to different communication methods, links can be divided into two types:

  • A one-way communication link, which only allows the sending process to send messages to the receiving process, or only the reverse.
  • A two-way communication link, which allows process A to send messages to process B and process B to send messages to process A at the same time.

(2) Mailbox communication

        Mailbox communication is an indirect communication method: the processes communicate through an intermediate entity (such as a shared data structure). This entity is set up in a shared buffer in random access memory and temporarily stores the messages that sending processes send to a target process; the receiving process takes the messages addressed to it out of the entity. The intermediate entity is usually called a mailbox, and each mailbox has a unique identifier. Messages are kept securely in the mailbox, and only authorized target users may read them, at any time they choose. Mailbox communication can therefore support both real-time and non-real-time communication. //message queue

2.1 - Structure of the mailbox

        A mailbox is defined as a data structure. Logically, it can be divided into two parts:

  • Mailbox header: stores descriptive information about the mailbox, such as the mailbox identifier, the owner, the password, and the number of slots in the mailbox.
  • Mailbox body: consists of several slots that can hold messages (or message headers). The number of slots and the size of each slot are fixed when the mailbox is created.

        In terms of message delivery, the simplest case is one-way delivery. Message delivery can also be bidirectional. The figure below shows the communication method of the bidirectional communication link.

2.2 - Mailbox communication primitives

        The system provides several primitives for mailbox communication, which are used for:

  • Creating and removing mailboxes. A process can use the mailbox creation primitive to create a new mailbox. The creator gives the mailbox name and its attribute (public, private, or shared); for a shared mailbox, the names of the sharers must also be given. When a process no longer needs the mailbox, it can remove it with the mailbox removal primitive.
  • Sending and receiving messages. When processes communicate through a mailbox, they must use a shared mailbox and the following communication primitives provided by the system:
Send(mailbox, message);       // send a message to the specified mailbox
Receive(mailbox, message);    // receive a message from the specified mailbox
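
        A mailbox with a header and a body of fixed-size message slots, together with its send and receive primitives, might look like the following in C. This is a single-threaded sketch with invented names (mailbox, mailbox_create, mailbox_send, mailbox_receive); a real implementation would also need the mutual exclusion and blocking that the OS primitives provide, and the password and attribute fields are left out for brevity.

```c
#include <assert.h>
#include <string.h>

#define SLOTS    4       /* number of message slots, fixed at creation */
#define SLOT_MAX 32      /* size of each slot in bytes */

typedef struct {
    /* mailbox header */
    int id;              /* mailbox identifier */
    int owner;           /* owning process */
    int used;            /* slots currently occupied */
    int head, tail;      /* next slot to read / next slot to write */
    /* mailbox body */
    char slot[SLOTS][SLOT_MAX];
} mailbox;

void mailbox_create(mailbox *mb, int id, int owner)
{
    assert(mb != NULL);
    memset(mb, 0, sizeof *mb);
    mb->id = id;
    mb->owner = owner;
}

/* Send(mailbox, message): returns 0, or -1 if the mailbox is full. */
int mailbox_send(mailbox *mb, const char *message)
{
    assert(mb != NULL && message != NULL);
    if (mb->used == SLOTS)
        return -1;
    strncpy(mb->slot[mb->tail], message, SLOT_MAX - 1);
    mb->slot[mb->tail][SLOT_MAX - 1] = '\0';
    mb->tail = (mb->tail + 1) % SLOTS;
    mb->used++;
    return 0;
}

/* Receive(mailbox, message): returns 0, or -1 if the mailbox is empty.
 * `message` must have room for SLOT_MAX bytes. */
int mailbox_receive(mailbox *mb, char *message)
{
    assert(mb != NULL && message != NULL);
    if (mb->used == 0)
        return -1;
    memcpy(message, mb->slot[mb->head], SLOT_MAX);
    mb->head = (mb->head + 1) % SLOTS;
    mb->used--;
    return 0;
}
```

        Because sender and receiver only share the mailbox, neither needs to know the other's identifier, and a message can wait in a slot until the receiver is ready for it.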

2.3 - Types of mailboxes

        A mailbox can be created by the operating system or by a user process, and the creator is the owner of the mailbox. Accordingly, mailboxes can be divided into the following three categories:

  • Private mailbox. A user process can create a new mailbox for itself, as part of the process. Only the owner has the right to read messages from the mailbox; other users can only send messages to it. A private mailbox can be implemented with a one-way communication link. When the owning process ends, the mailbox disappears with it. //created by a user process
  • Public mailbox. Created by the operating system and made available to all authorized processes in the system. An authorized process can both send messages to the mailbox and read the messages addressed to it from the mailbox, so a public mailbox should be implemented with a two-way communication link. A public mailbox normally exists for as long as the system is running. //created by the operating system
  • Shared mailbox. Created by a process that declares it shareable, either at creation time or afterwards, and names the processes that share it. Both the owner and the sharers have the right to take the messages addressed to them out of the mailbox. //created by a user process

        When using mailboxes to communicate, there are the following four relationships between the sending process and the receiving process:

  • One-to-one relationship. The sending process and the receiving process can establish a communication link dedicated to the two, so that the interaction between the two will not be disturbed by other processes.
  • Many-to-one relationship. The process providing a service is allowed to interact with multiple user processes; this is also known as client/server interaction.
  • One-to-many relationship. One sending process is allowed to interact with multiple receiving processes, so that it can send a message to all of the receivers by broadcasting.
  • Many-to-many relationship. It is allowed to establish a public mailbox, so that multiple processes can post messages to the mailbox; they can also take their own messages from the mailbox.

3. Example of direct messaging system

        The message buffer queue communication mechanism was first proposed by Per Brinch Hansen and implemented on the RC 4000 system, and was later widely used for communication between local processes. In this mechanism, the sending process uses the Send primitive to send a message directly to the receiving process, and the receiving process uses the Receive primitive to receive it. Message buffer queue diagram:

(1) The data structure in the message buffer queue communication mechanism

        Message buffer. In the message buffer queue mechanism, the main data structure is the message buffer, which can be described as follows:

typedef struct message_buffer {
    int sender;                         // identifier of the sending process
    int size;                           // length of the message
    char *text;                         // message text
    struct message_buffer *next;        // pointer to the next message buffer
} message_buffer;

        Communication-related data items in the PCB. When the operating system uses the message buffer queue mechanism, besides setting up a message buffer queue for each process, it must also add to the process's PCB the head pointer of the message queue, used to operate on the queue, as well as the mutual exclusion semaphore mutex and the resource semaphore sm, used for mutual exclusion and synchronization.

        The data items that should be added to the PCB can be described as follows:

typedef struct process_control_block {
    ...
    struct message_buffer *mq;       // head pointer of the message queue
    semaphore mutex;                 // mutual exclusion semaphore for the message queue
    semaphore sm;                    // resource semaphore for the message queue
    ...
} PCB;

(2) Send primitive

        Before a sending process calls the send primitive, it should set up a sending area a in its own memory space, as shown in the figure below, and fill in the message text, the sender's process identifier, the message length, and other information. It then calls the send primitive to send the message to the target process. //set up the sending area

        The send primitive first requests a buffer i according to the message length a.size set in sending area a, and then copies the information in a into buffer i. To hang i on the message queue mq of the receiving process, it first obtains the receiving process's internal identifier j and then hangs i on j.mq. Because this queue is a critical resource, wait and signal operations must be performed before and after the insert operation. //request a buffer, then obtain the identifier of the receiving process

        Send primitives can be described as follows:

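void send(receiver, a) {         // 1 - sending area: receiver is the identifier of the receiving process, a is the start address of the sending area

    getbuf(a.size, i);           // 2 - buffer: request a buffer i according to a.size

    copy(i.sender, a.sender);    // 3 - copy: copy the information in sending area a into message buffer i
    i.size = a.size;
    copy(i.text, a.text);
    i.next = 0;

    getid(PCBset, receiver, j);  // 4 - receiver identifier: obtain the internal identifier j of the receiving process

    wait(j.mutex);
    insert(&j.mq, i);            // 5 - insert: put the message buffer on the receiving process's message queue
    signal(j.mutex);

    signal(j.sm);                // 6 - wake the consumer: resource semaphore + 1
}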

(3) Receive primitives

        The receiving process calls the receiving primitive receive(b), picks the first message buffer i from its own message buffer queue mq, and copies the data in it to the specified message receiving area starting with b. The receiving primitives are described as follows:

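void receive(b) {
    j = internal name;          // j is the internal identifier of the receiving process

    wait(j.sm);                 // blocks here if no message is available

    wait(j.mutex);              // enter the critical section that protects j.mq
    remove(j.mq, i);            // take the first message off the message queue
    signal(j.mutex);

    copy(b.sender, i.sender);   // buffer -> receiving area: copy the information in message buffer i into receiving area b
    b.size = i.size;
    copy(b.text, i.text);
    releasebuf(i);              // release the message buffer
}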
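
        For readers who want something that compiles, here is a single-threaded C sketch of the same queue manipulation. It is not the primitives themselves: the semaphores are left out, the counter sm only stands in for the resource semaphore, and an empty queue makes mq_receive return -1 where the real primitive would block. The names pcb, mq_send and mq_receive are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct message_buffer {
    int sender;                        /* identifier of the sending process */
    size_t size;                       /* length of the message */
    struct message_buffer *next;       /* next buffer in the queue */
    char text[];                       /* message text */
} message_buffer;

typedef struct {
    int pid;
    message_buffer *mq_head, *mq_tail; /* message queue */
    int sm;                            /* stand-in for the resource semaphore */
} pcb;

/* send: copy the sending area into a fresh buffer and append it to the
 * receiver's queue. Returns 0, or -1 if no buffer could be allocated. */
int mq_send(pcb *receiver, int sender, const char *text, size_t size)
{
    assert(receiver != NULL && text != NULL);
    message_buffer *i = malloc(sizeof *i + size);   /* getbuf(a.size, i) */
    if (i == NULL)
        return -1;
    i->sender = sender;
    i->size = size;
    memcpy(i->text, text, size);
    i->next = NULL;

    /* wait(j.mutex) would go here */
    if (receiver->mq_tail != NULL)
        receiver->mq_tail->next = i;
    else
        receiver->mq_head = i;
    receiver->mq_tail = i;
    /* signal(j.mutex) would go here */

    receiver->sm++;                                 /* signal(j.sm) */
    return 0;
}

/* receive: copy the first queued message into text (at most cap bytes).
 * Returns the number of bytes copied, or -1 if the queue is empty. */
long mq_receive(pcb *self, int *sender, char *text, size_t cap)
{
    assert(self != NULL && sender != NULL && text != NULL);
    if (self->sm == 0)
        return -1;                                  /* wait(j.sm) would block */
    self->sm--;

    message_buffer *i = self->mq_head;              /* remove(j.mq, i) */
    self->mq_head = i->next;
    if (self->mq_head == NULL)
        self->mq_tail = NULL;

    size_t n = i->size < cap ? i->size : cap;
    memcpy(text, i->text, n);
    *sender = i->sender;
    free(i);                                        /* releasebuf(i) */
    return (long)n;
}
```

        In a multi-process or multi-threaded setting, the commented wait and signal positions are exactly where the mutex from the PCB would be taken and released.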

        That concludes the article. If you love technology, see it through to the end.


Origin blog.csdn.net/swadian2008/article/details/131163277