Inter-process communication in Python with pipes: os.pipe

Foreword

  A process is the basic unit of resource allocation and scheduling in the system. Each process has its own address (memory) space, allocated by the operating system. For security reasons, the memory spaces of different processes are isolated from each other; that is, process A cannot directly access the memory space of process B. In some scenarios, however, different processes need to communicate with each other. Since processes cannot communicate directly, a third party is needed to carry the communication, and that third party is the kernel of the system.
  The kernel provides a variety of mechanisms for inter-process communication, such as pipes, message queues, shared memory, and semaphores. This article mainly introduces the principle of the pipe and its implementation with os.pipe().

  P.S. The code in this article needs to run on a Linux system.

1. Simulate pipe communication

  A pipe is essentially a buffer in the kernel. When a process creates a pipe, the system returns two file descriptors through which data can be written to or read from the pipe. A pipe is a one-way communication channel: one end is used only for reading and the other only for writing. Another process can read data only after it has been written.
  The following code simulates writing to and reading from a pipe:

import os

input_data = 'hello world!'.encode('utf-8')

if __name__ == '__main__':
    r, w = os.pipe()  # create the pipe

    os.write(w, input_data)  # write data into the pipe
    # os.close(w)

    # if the pipe is empty, os.read() blocks until data arrives
    output_data = os.read(r, len(input_data))  # read data from the pipe
    print(output_data.decode('utf-8'))
    # os.close(r)
# hello world!

  os.pipe() creates a pipe and returns a pair of file descriptors (r, w), used for reading and writing respectively. The new file descriptors are non-inheritable.
  os.read(fd, n) reads at most n bytes from the file descriptor fd and returns a bytes object containing the bytes read, or an empty bytes object if the end of the file referred to by fd has been reached.
  os.write(fd, str) writes the bytestring str to the file descriptor fd and returns the number of bytes actually written.
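
  To make the "non-inheritable" remark concrete, here is a minimal sketch that inspects and changes the flag with os.get_inheritable() and os.set_inheritable(). The variable names are only illustrative, and the default shown assumes Python 3.4 or later.

```python
import os

r, w = os.pipe()

# Descriptors returned by os.pipe() are non-inheritable by default.
inheritable_by_default = (os.get_inheritable(r), os.get_inheritable(w))
print(inheritable_by_default)

# Mark the read end as inheritable so that a child program started
# through exec could keep using it.
os.set_inheritable(r, True)
read_end_inheritable = os.get_inheritable(r)
print(read_end_inheritable)

os.close(r)
os.close(w)
```

  This is why the subprocess examples later in the article list the descriptors in pass_fds: that argument asks subprocess.Popen to keep them open in the child.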

  os.read(fd, n) and os.write(fd, str) are intended for low-level I/O and must be applied to a file descriptor returned by os.open() or os.pipe().
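
  The end-of-file behaviour of os.read() on a pipe is easy to miss, so here is a small sketch of it. It assumes everything runs in a single process and that the write end is closed before reading; once no writer remains and the buffer is empty, os.read() returns an empty bytes object instead of blocking.

```python
import os

r, w = os.pipe()
os.write(w, b"hello world!")
os.close(w)  # no writer remains, so the reader can reach end of file

first = os.read(r, 5)     # at most 5 bytes
rest = os.read(r, 1024)   # whatever is still buffered in the pipe
eof = os.read(r, 1024)    # pipe drained and write end closed
os.close(r)

print(first, rest, eof)
```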

  Besides reading and writing the pipe directly as shown above, the os library also provides the fdopen() function, which creates a file object from a file descriptor so that the usual file-object methods can be used.

import os

input_data = 'hello world!'.encode('utf-8')

if __name__ == '__main__':
    r, w = os.pipe()

    writer = os.fdopen(w, 'wb')
    writer.write(input_data)  # write data
    writer.flush()  # the file object is buffered, so flush manually after writing
    # writer.close()

    reader = os.fdopen(r, 'rb')
    output_data = reader.read(len(input_data))  # read data
    print(output_data.decode('utf-8'))
    # reader.close()
# hello world!

  os.fdopen(fd, *args, **kwargs) returns an open file object connected to the file descriptor fd. It behaves like the built-in open() function and accepts the same arguments; the difference is that the first argument of fdopen() must be an integer (a file descriptor is an integer).
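
  Because fdopen() accepts the same arguments as open(), the file objects can also be opened in text mode and used as context managers. The following is a minimal sketch of that variation, not part of the original example; leaving the with block flushes and closes the file object together with its underlying descriptor.

```python
import os

r, w = os.pipe()

# Text mode: the file object handles encoding, and closing it flushes the data.
with os.fdopen(w, "w", encoding="utf-8") as writer:
    writer.write("hello world!\n")

with os.fdopen(r, "r", encoding="utf-8") as reader:
    line = reader.readline()
    remainder = reader.read()  # empty: the write end is already closed

print(repr(line), repr(remainder))
```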

2. Implementing one-way communication between processes

  Pipe communication between processes works roughly as follows:
  (1) the parent process creates a pipe through the pipe system call;
  (2) the parent process creates two child processes through the fork system call;
  (3) the two child processes communicate through the pipe.
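
  The three steps above can be sketched directly with os.pipe() and os.fork(). This is only an illustration under some assumptions: the function names are hypothetical, and the second pipe (report_r, report_w) is not part of the scheme described above; it exists only so that the parent can return what the consumer child received.

```python
import os


def read_until_eof(fd):
    """Read from fd until every write end of the pipe has been closed."""
    chunks = []
    while True:
        chunk = os.read(fd, 1024)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def fork_pipe_demo(message):
    data_r, data_w = os.pipe()      # step (1): the pipe shared by the children
    report_r, report_w = os.pipe()  # demo only: consumer reports back to the parent

    producer_pid = os.fork()        # step (2): first child, the producer
    if producer_pid == 0:
        try:
            os.write(data_w, message)   # step (3): write into the pipe
        finally:
            os._exit(0)                 # exiting closes all inherited descriptors

    consumer_pid = os.fork()        # step (2): second child, the consumer
    if consumer_pid == 0:
        try:
            os.close(data_w)            # otherwise the consumer never sees EOF
            os.close(report_r)
            os.write(report_w, read_until_eof(data_r))  # step (3): read from the pipe
        finally:
            os._exit(0)

    # The parent closes the ends it does not use.
    for fd in (data_r, data_w, report_w):
        os.close(fd)
    received = read_until_eof(report_r)
    os.close(report_r)
    os.waitpid(producer_pid, 0)
    os.waitpid(consumer_pid, 0)
    return received


result = fork_pipe_demo(b"hello world!")
print(result)
```

  Closing unused write ends matters here: a reader only sees end of file once every copy of the write descriptor, in every process, has been closed.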


  Here we use a simplified producer/consumer example to simulate one-way communication between processes. The main program is implemented as follows:

# main.py
import os
import sys
import subprocess


if __name__ == '__main__':
    r, w = os.pipe()

    cmd1 = [sys.executable, "-m", "consumer", str(r)]
    cmd2 = [sys.executable, "-m", "producer", str(w)]

    # run each child program in a new process; pass_fds keeps the descriptor open in the child
    proc1 = subprocess.Popen(cmd1, pass_fds=(r, ))
    proc2 = subprocess.Popen(cmd2, pass_fds=(w, ))

    # the parent does not use the pipe itself, so it closes its own copies
    os.close(r)
    os.close(w)

    print('parent process pid: ', os.getpid())
    print('child process pid(proc1): ', proc1.pid)
    print('child process pid(proc2): ', proc2.pid)

    proc1.wait()   # wait for the child process to terminate
    proc2.wait()

  The producer code is implemented as follows:

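"""Responsible for writing data"""
import os
import sys
import struct

writer = os.fdopen(int(sys.argv[1]), "wb")

input_data = 'hello world!'.encode('utf-8')

# little-endian: the least significant byte is stored at the lowest address,
# so a length of 12 is packed as b'\x0c\x00\x00\x00'
writer.write(struct.pack("<i", len(input_data)))
writer.write(input_data)
writer.flush()  # the file object is buffered, so flush manually after writing

print('input data: ', input_data.decode('utf-8'))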

  The consumer code is implemented as follows:

"""Responsible for reading data"""
import os
import sys
import struct


reader = os.fdopen(int(sys.argv[1]), "rb")

len_data = reader.read(4)  # an int occupies 4 bytes
recv_bytes = struct.unpack("<i", len_data)[0]
output_data = reader.read(recv_bytes)

print('output data: ', output_data.decode('utf-8'))
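
  The producer and consumer frame each message with a 4-byte little-endian length prefix. The same idea can be wrapped in small helpers that work on raw file descriptors. This is a sketch, and send_msg, recv_msg and read_exact are hypothetical names, not functions from the os module. Unlike a buffered file object, a single os.read() may return fewer bytes than requested, so read_exact loops until the full count has arrived.

```python
import os
import struct

HEADER = struct.Struct("<i")  # 4-byte little-endian signed int


def read_exact(fd, n):
    """Read exactly n bytes from fd, looping over short reads."""
    buf = bytearray()
    while len(buf) < n:
        chunk = os.read(fd, n - len(buf))
        if not chunk:
            raise EOFError("pipe closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def send_msg(fd, payload):
    """Write the length prefix followed by the payload."""
    data = HEADER.pack(len(payload)) + payload
    while data:
        written = os.write(fd, data)  # may write fewer bytes than given
        data = data[written:]


def recv_msg(fd):
    """Read one length-prefixed message."""
    (length,) = HEADER.unpack(read_exact(fd, HEADER.size))
    return read_exact(fd, length)


r, w = os.pipe()
send_msg(w, "hello world!".encode("utf-8"))
send_msg(w, b"second message")
first = recv_msg(r)
second = recv_msg(r)
os.close(r)
os.close(w)

print(first, second, HEADER.pack(12))
```

  The length prefix is what lets the reader separate the two messages even though both were already sitting in the pipe when reading started.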

  The result of running main.py is as follows:

[Figure: screenshot of the output of main.py]

3. Implementing two-way communication between processes

  Because a pipe is one-way, two-way communication with os.pipe() requires two pipes, one for each direction.
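
  Before the subprocess-based version, the two-pipe layout can be tried in a single file with os.fork(). The sketch below is an assumption-laden illustration: the helper names and the "echo: " reply are invented for the demo, and the messages use a 4-byte little-endian length prefix.

```python
import os
import struct


def read_exact(fd, n):
    """Read exactly n bytes from fd, looping over short reads."""
    buf = b""
    while len(buf) < n:
        chunk = os.read(fd, n - len(buf))
        if not chunk:
            raise EOFError("pipe closed early")
        buf += chunk
    return buf


def send_msg(fd, payload):
    # small messages only: a short pipe write is not expected here
    os.write(fd, struct.pack("<i", len(payload)) + payload)


def recv_msg(fd):
    (length,) = struct.unpack("<i", read_exact(fd, 4))
    return read_exact(fd, length)


def ping_pong(request):
    parent_read, child_write = os.pipe()   # pipe 1: child -> parent
    child_read, parent_write = os.pipe()   # pipe 2: parent -> child

    pid = os.fork()
    if pid == 0:
        try:
            os.close(parent_read)
            os.close(parent_write)
            send_msg(child_write, b"echo: " + recv_msg(child_read))
        finally:
            os._exit(0)

    # Parent: keep only its own two ends.
    os.close(child_read)
    os.close(child_write)
    send_msg(parent_write, request)
    reply = recv_msg(parent_read)
    os.close(parent_write)
    os.close(parent_read)
    os.waitpid(pid, 0)
    return reply


reply = ping_pong(b"hello server!")
print(reply)
```

  Each side holds exactly one read end and one write end, which mirrors the worker1_*/worker2_* naming used in the listing that follows.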


  The main program is implemented as follows:

# main2.py
import os
import sys
import subprocess

if __name__ == '__main__':
    # connect the subprocesses with a pair of pipes, one per direction
    worker1_read, worker2_write = os.pipe()
    worker2_read, worker1_write = os.pipe()

    cmd1 = [sys.executable, "-m", "client", str(worker1_read), str(worker1_write)]
    cmd2 = [sys.executable, "-m", "server", str(worker2_read), str(worker2_write)]

    # run each child program in a new process
    proc1 = subprocess.Popen(cmd1, pass_fds=(worker1_read, worker1_write))
    proc2 = subprocess.Popen(cmd2, pass_fds=(worker2_read, worker2_write))

    # the parent does not use the pipes itself, so it closes its own copies
    for fd in (worker1_read, worker1_write, worker2_read, worker2_write):
        os.close(fd)

    print('parent process pid: ', os.getpid())
    print('child process pid(proc1): ', proc1.pid)
    print('child process pid(proc2): ', proc2.pid)

    proc1.wait()  # wait for the child process to terminate
    proc2.wait()

  The client code is implemented as follows:

import os
import sys
import struct

reader = os.fdopen(int(sys.argv[1]), "rb")
writer = os.fdopen(int(sys.argv[2]), "wb")

pid = os.getpid()
send_info = '[{}]hello server!'.format(pid).encode('utf-8')

print('pid[{}] send info: {}'.format(pid, send_info.decode()))
writer.write(struct.pack("<i", len(send_info)))  # 4-byte little-endian length prefix
writer.write(send_info)
writer.flush()  # the file object is buffered, so flush manually after writing

len_data = reader.read(4)  # an int occupies 4 bytes
recv_bytes = struct.unpack("<i", len_data)[0]
recv_data = reader.read(recv_bytes)
print('pid[{}] recv info: {}'.format(pid, recv_data.decode()))

  The server code is implemented as follows:

import os
import sys
import struct

reader = os.fdopen(int(sys.argv[1]), "rb")
writer = os.fdopen(int(sys.argv[2]), "wb")

pid = os.getpid()
send_info = '[{}]hello client!'.format(pid).encode('utf-8')

len_data = reader.read(4)  # an int occupies 4 bytes
recv_bytes = struct.unpack("<i", len_data)[0]
recv_data = reader.read(recv_bytes)
print('pid[{}] recv info: {}'.format(pid, recv_data.decode()))

print('pid[{}] send info: {}'.format(pid, send_info.decode()))
writer.write(struct.pack("<i", len(send_info)))  # 4-byte little-endian length prefix
writer.write(send_info)
writer.flush()  # the file object is buffered, so flush manually after writing

  The result of running main2.py is as follows:

[Figure: screenshot of the output of main2.py]

Conclusion

  The difference between os.open() and open():
    os.open() opens the file by calling the operating system through the os library and returns a file descriptor, whereas open() is a Python built-in function that opens the file from the Python program and returns a file object. open() can be understood as a wrapper around the low-level os.open().
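
  A short sketch of the contrast, assuming a temporary directory is available for a scratch file (the file name demo.txt is arbitrary): os.open() yields an integer descriptor that is used with os.write() and os.close(), while open() yields a file object that still exposes its underlying descriptor through fileno().

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")

    # Low level: an integer file descriptor, bytes in and out.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    fd_is_int = isinstance(fd, int)
    os.write(fd, b"hello world!")
    os.close(fd)

    # High level: a file object with buffering and text decoding.
    with open(path, "r", encoding="utf-8") as f:
        has_descriptor = isinstance(f.fileno(), int)
        text = f.read()

print(fd_is_int, has_descriptor, text)
```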

Origin blog.csdn.net/qq_42730750/article/details/127729762