Relearning the Operating System (22) | Inter-process communication: What are the methods for inter-process communication?

Table of Contents

 

1. What is inter-process communication?

2. Pipes

3. Local shared memory

4. Local messages and queues

5. Remote calls

6. Message queues

7. Question

7.1 What are the methods for inter-process communication?


1. What is inter-process communication?

Inter-Process Communication (IPC) is, at its core, the exchange of data. In the narrow sense, it means processes created by the operating system exchanging data with one another.

Here we discuss not only communication in this narrow sense, but also IPC in its broader meaning: communication between programs.

A program can be a process, a thread, two parts of the same process (a process sending to itself), or a distributed component. In short, this discussion is about exchanging data in the broad sense.

2. Pipes

Pipes provide a very important capability: organizing computation. A process does not need to know that a pipe exists, so pipes are non-intrusive by design. Programmers can focus on the design of the program itself and only need to leave the corresponding pipe interface (standard input and output) in place to take advantage of pipes. For example, executing a MySQL statement in a shell might look like this:

process1 | process2 | process3 | mysql -u... -p | crawler_process

Process 1, Process 2, and Process 3 compute the statements MySQL needs, and the pipe feeds them directly into MySQL for execution. MySQL then passes its results to a crawler process, and the crawler starts working. MySQL was not designed to be used in pipes, and neither was the crawler process; programmers simply discovered that they can be combined this way, which neatly solves their own problem, such as building a tiny crawler out of pipes and storing its results in the database.

We also learned a term: the named pipe. Named pipes do not change how pipes are used; compared with anonymous pipes, they simply offer more ways to program. For example:

  1. process1 > namedpipe

  2. process2 > namedpipe

The two lines above redirect the intermediate results of both processes into namedpipe at the same time, which effectively merges their output so it can be processed later. As another example, if your process needs to query the local MySQL repeatedly, you could use one named pipe to pass queries to MySQL and another named pipe to pass the results back.

This avoids establishing a localhost TCP connection and paying for the three-way handshake. Of course, databases are mostly remote these days; this is just an example.
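To make the named-pipe mechanics concrete, here is a minimal C sketch. The path /tmp/query_pipe and the "SELECT 1;" message are made up for illustration: a child process writes one message into the FIFO and the parent reads it back.

    /* Minimal named-pipe (FIFO) round trip; path and message are illustrative. */
    #include <stdio.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/wait.h>

    int main(void) {
        const char *path = "/tmp/query_pipe";   /* hypothetical pipe name */
        mkfifo(path, 0600);                     /* create the named pipe */

        pid_t pid = fork();
        if (pid == 0) {                         /* child acts as the writer */
            int fd = open(path, O_WRONLY);      /* blocks until a reader opens */
            const char *msg = "SELECT 1;";
            write(fd, msg, strlen(msg));
            close(fd);
            _exit(0);
        }

        char buf[128] = {0};                    /* parent acts as the reader */
        int fd = open(path, O_RDONLY);
        read(fd, buf, sizeof(buf) - 1);
        printf("received: %s\n", buf);
        close(fd);
        wait(NULL);
        unlink(path);                           /* remove the pipe file */
        return 0;
    }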

At its core, the pipe is non-intrusive and flexible: it adds no burden to program design, yet it can organize complex computation flows.

3. Local shared memory

Multiple threads of the same process already share the process's memory, so there is nothing special to arrange for them. For threads in different processes (or, put another way, for programs in different processes), shared memory is an option. Memory sharing is a capability provided by modern operating systems: Unix-like operating systems, including Linux, offer the POSIX shared-memory facility, shmem.

On Linux, shared memory is implemented on top of a virtual file system: a region of memory is set aside and mapped into both processes. It looks like a file, but operations on it actually hit memory.

Shared memory is very fast, but the programs are not easy to write, because this is an intrusive style of development: you have to write a fair amount of code specifically for it. For example, modifying a value in shared memory goes through the API, and if concurrency is involved you must also handle synchronization yourself.
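As a rough sketch of what this API-driven style looks like, the following C program shares one integer between a parent and a child via POSIX shared memory. The region name /demo_shm is made up; a real program would need proper synchronization such as a process-shared semaphore rather than the wait() used here, and older glibc needs -lrt when linking.

    /* Minimal POSIX shared-memory sketch: parent and child share one counter. */
    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <sys/wait.h>

    int main(void) {
        int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, sizeof(int));               /* size the shared region */

        /* Map the region: it behaves like memory but is visible to both processes. */
        int *counter = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
        *counter = 0;

        if (fork() == 0) {                        /* child writes into shared memory */
            *counter = 42;
            _exit(0);
        }
        wait(NULL);                               /* crude stand-in for real synchronization */
        printf("counter = %d\n", *counter);       /* parent sees the child's write */

        munmap(counter, sizeof(int));
        close(fd);
        shm_unlink("/demo_shm");
        return 0;
    }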

Therefore, unless the scenario demands high performance, inter-process communication usually does not use shared memory.

4. Local messages and queues

Since shared memory is hard to use, there are two common approaches to local messaging. One is message queues; modern operating systems provide this kind of capability, and Unix-like systems can use the POSIX-standard mqueue. The other is to use network requests directly, such as the TCP/IP protocol, as well as higher-level communication protocols built on top of it.

Essentially, both are receive/send message models. The process wraps the data it needs to transfer into a message in an agreed format, which makes programs much easier to write. Programmers can handle messages differently by message type, or trigger special logic based on message content. When message volume is large, producer and consumer queues can be set up and processed concurrently.
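A minimal sketch of the POSIX mqueue route (the queue name /demo_mq and the message text are made up; link with -lrt on Linux): the child produces one message and the parent consumes it.

    /* Minimal POSIX message-queue sketch; the queue name "/demo_mq" is illustrative. */
    #include <stdio.h>
    #include <string.h>
    #include <fcntl.h>
    #include <mqueue.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        struct mq_attr attr = { .mq_maxmsg = 10, .mq_msgsize = 128 };
        mqd_t mq = mq_open("/demo_mq", O_CREAT | O_RDWR, 0600, &attr);

        if (fork() == 0) {                        /* child: the producer */
            const char *msg = "job:crawl";
            mq_send(mq, msg, strlen(msg) + 1, 1); /* priority 1 */
            _exit(0);
        }

        char buf[128];                            /* parent: the consumer */
        unsigned int prio;
        mq_receive(mq, buf, sizeof(buf), &prio);  /* blocks until a message arrives */
        printf("got message: %s (priority %u)\n", buf, prio);

        wait(NULL);
        mq_close(mq);
        mq_unlink("/demo_mq");
        return 0;
    }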

5. Remote calls

Remote Procedure Call (RPC) is a technique for wrapping a request to a remote service in what looks like a local procedure call.

With RPC, one machine (the client) calls a method (service) on another machine (the server) by passing parameters and receiving the result.

When a programmer makes an RPC call, the program appears to be calling a local method or performing a local task, but behind it a piece of service code (usually called a stub) converts the local call into a remote network request. Likewise, after the server receives the request, a server-side stub converts the request back into a real method call on the server.

The communication process between the client and the server consists of 10 steps, namely:

  1. The client calls the function (method);

  2. The client stub packs the function call into a request;

  3. The client socket sends the request, and the server socket receives it;

  4. The server stub processes the request and restores it to a function call;

  5. The server-side method is executed;

  6. The result is returned to the server stub;

  7. The server stub packs the result into return data;

  8. The server socket sends the return data, and the client socket receives it;

  9. The client socket passes the data to the client stub;

  10. The client stub converts the returned data into the function's return value.
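To make the stub idea concrete, here is a toy sketch, not any real RPC framework: the "network" is a socketpair between a parent and a child, the wire format is just two 32-bit integers followed by a 32-bit result, and add() stands in for the server-side method.

    /* Toy RPC sketch: a socketpair stands in for the network connection. */
    #include <stdio.h>
    #include <stdint.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <sys/wait.h>

    static int32_t add(int32_t a, int32_t b) { return a + b; }  /* the real server method */

    /* Client stub: looks like a local call, actually a network round trip. */
    static int32_t rpc_add(int fd, int32_t a, int32_t b) {
        int32_t req[2] = { a, b };
        write(fd, req, sizeof(req));             /* steps 2-3: pack and send the request */
        int32_t result;
        read(fd, &result, sizeof(result));       /* steps 8-10: receive and unpack the result */
        return result;
    }

    /* Server stub: restores the request to a function call and sends back the result. */
    static void serve_one(int fd) {
        int32_t req[2];
        read(fd, req, sizeof(req));              /* steps 3-4: receive and decode the request */
        int32_t result = add(req[0], req[1]);    /* step 5: execute the server-side method */
        write(fd, &result, sizeof(result));      /* steps 6-8: pack and send the result */
    }

    int main(void) {
        int fds[2];
        socketpair(AF_UNIX, SOCK_STREAM, 0, fds); /* stand-in for a real TCP connection */

        if (fork() == 0) {                        /* child plays the server */
            close(fds[0]);
            serve_one(fds[1]);
            _exit(0);
        }
        close(fds[1]);
        printf("2 + 3 = %d\n", rpc_add(fds[0], 2, 3));  /* step 1: client "calls the method" */
        wait(NULL);
        return 0;
    }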

The RPC calling process involves many conventions, such as:

  • the format of function parameters
  • the format of the returned result
  • how exceptions are handled

There are also many fine-grained issues, such as handling TCP packet boundaries (the "sticky packet" problem), handling network failures, and choosing an I/O model.

These issues are difficult to handle well, so the usual practice in real projects is to use an existing framework, for example:

  1. Thrift (open-sourced by Facebook)
  2. Dubbo (open-sourced by Alibaba)
  3. gRPC (open-sourced by Google)

These RPC frameworks usually support multiple languages, which requires an interface definition language (IDL) so that interfaces can be defined across languages.

The RPC style is well suited to development in a microservice environment. Of course, supporting high-concurrency, low-latency scenarios usually requires a framework maintained by a dedicated team. RPC does add data-conversion overhead (mainly serialization), but that is not its main drawback.

The real flaw of RPC is that it increases coupling between systems. When one system actively calls another system's methods, the coupling between the two systems grows. As RPC calls accumulate over time, the boundaries between systems gradually rot. This is what you really need to watch out for when using RPC.

6. Message queues

Since RPC increases coupling, what can we do instead? Consider events. Events do not increase coupling: if one system subscribes to another system's events, then no matter who provides that type of event in the future, everything keeps working. The system depends not on another system but on a kind of event. Even if the other system disappears one day, the subscribing system can still run as long as someone provides the event.
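As a small illustration of this decoupling (a made-up, in-process sketch rather than any real message broker's API): subscribers register handlers for a topic, and the publisher only knows the topic name, not who consumes the event.

    /* Sketch of event-based decoupling: the publisher never names its consumers. */
    #include <stdio.h>

    #define MAX_SUBSCRIBERS 8

    typedef void (*event_handler)(const char *payload);

    struct topic {
        const char *name;
        event_handler handlers[MAX_SUBSCRIBERS];
        int count;
    };

    static void subscribe(struct topic *t, event_handler h) {
        if (t->count < MAX_SUBSCRIBERS)
            t->handlers[t->count++] = h;
    }

    static void publish(struct topic *t, const char *payload) {
        for (int i = 0; i < t->count; i++)   /* publisher does not know the consumers */
            t->handlers[i](payload);
    }

    /* Two independent "systems" that depend on the event, not on each other. */
    static void billing_on_order(const char *payload)  { printf("billing saw: %s\n", payload); }
    static void shipping_on_order(const char *payload) { printf("shipping saw: %s\n", payload); }

    int main(void) {
        struct topic order_created = { .name = "order.created" };
        subscribe(&order_created, billing_on_order);
        subscribe(&order_created, shipping_on_order);
        publish(&order_created, "order #42");
        return 0;
    }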

Another scenario for message queues is the transmission of purely large volumes of data.

For example, when transmitting logs there may be nodes for collection, cleaning, filtering, and monitoring along the way, which together form a large distributed computing network.

In general, the message queue is a less coupled and more flexible model, but it demands more from system designers and places certain requirements on the architecture of the system itself.

There are also message queues for specific scenarios: Kafka mainly handles feed/stream data;

RabbitMQ, ActiveMQ, RocketMQ, and others mainly handle communication between distributed applications (application decoupling).

7. Question

7.1  What are the methods for inter-process communication?

You can answer the interviewer from both a single-machine perspective and a distributed perspective.

  • In the single-machine model there are pipes, shared memory, and message queues. Of the three, shared-memory programs are the hardest to write but offer the highest performance. Pipe programs are the easiest to write and have a standard interface. Message-queue programs are also fairly easy to write, for example by implementing a specific program with a publish/subscribe model.

  • In the distributed model there are remote calls, message queues, and network requests. Writing raw network-request code is not easy; it is usually better to use a mature RPC framework. Since RPC frameworks increase coupling between systems, you can also consider message queues and the publish/subscribe event model, which reduce coupling between systems.

  • A database can also be used.

  • Ordinary files can also be used.

  • There are also signals: a process can send signals through the operating system. For example, to send the USR1 signal to a process with pid 9999, you can run:

  • kill -s USR1 9999

Process 9999 can receive this signal by writing a handler into its program. Besides the kill command, the same thing can be done by calling the operating system API.
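A minimal C sketch of both sides: the handler registration that lets a process receive the signal in its own program, and the kill() system call behind the shell command (here the process simply signals itself instead of a separate pid):

    /* Register a SIGUSR1 handler, then send the signal via the kill() system call. */
    #include <stdio.h>
    #include <string.h>
    #include <signal.h>
    #include <unistd.h>

    static volatile sig_atomic_t got_usr1 = 0;

    static void on_usr1(int signo) {
        (void)signo;
        got_usr1 = 1;                         /* only async-signal-safe work here */
    }

    int main(void) {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_usr1;
        sigaction(SIGUSR1, &sa, NULL);        /* "write a program" that receives the signal */

        kill(getpid(), SIGUSR1);              /* the OS API behind "kill -s USR1 <pid>";
                                                 here the process signals itself */

        if (got_usr1)
            printf("pid %d received SIGUSR1\n", (int)getpid());
        return 0;
    }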


Source: blog.csdn.net/MyySophia/article/details/114303043