IPC: Pipes

    Pipes are the oldest form of UNIX IPC and are still the most commonly used. They have the following two limitations.
    (1) Historically, pipes were half-duplex (that is, data could flow in only one direction). Some systems now offer full-duplex pipes as well, but for portability, programs should never assume that the system supports them.
    (2) Pipes can be used only between two processes that have a common ancestor. Typically, a pipe is created by a process, the process calls fork, and the pipe is then used between the parent and the child.
    As you will see later, FIFOs do not have the second limitation, and UNIX domain sockets have neither.
    Whenever you type a sequence of commands in a pipeline for the shell to execute, the shell creates a separate process for each command and uses a pipe to connect the standard output of each command to the standard input of the next.
    Pipes are created by calling the pipe function.

#include <unistd.h>
int pipe(int fd[2]); /* Return value: if successful, return 0; otherwise, return -1 */

Two file descriptors are returned through the parameter fd: fd[0] is open for reading and fd[1] is open for writing. Data written to fd[1] becomes the input read from fd[0]. On implementations that support full-duplex pipes, both fd[0] and fd[1] are open for reading and writing. For the descriptor at each end of a pipe, fstat reports a file type of FIFO, which can be tested with the S_ISFIFO macro.
The following diagrams show two ways of depicting a half-duplex pipe: the left diagram shows the two ends of the pipe connected to each other within a single process, and the right diagram emphasizes that the data in the pipe flows through the kernel.
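The file-type check described above can be sketched as follows. This is a minimal illustration rather than part of the original text; the helper name pipe_ends_are_fifos is hypothetical.

```c
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if both ends of a freshly created pipe
 * report the FIFO file type, 0 if they do not, and -1 on error. */
int pipe_ends_are_fifos(void)
{
    int fd[2];
    struct stat rd, wr;
    int result;

    if (pipe(fd) < 0)
        return -1;

    if (fstat(fd[0], &rd) < 0 || fstat(fd[1], &wr) < 0)
        result = -1;
    else
        result = S_ISFIFO(rd.st_mode) && S_ISFIFO(wr.st_mode);

    close(fd[0]);
    close(fd[1]);
    return result;
}
```

A caller would simply test the return value, for example `if (pipe_ends_are_fifos() == 1) ...`.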

A pipe in a single process is of little use. Normally, a process calls pipe and then fork, creating an IPC channel from the parent to the child or from the child to the parent. The image below shows this situation.

    What happens after fork depends on the desired direction of data flow. For a pipe from the parent to the child, the parent closes the read end fd[0] and the child closes the write end fd[1]; for a pipe from the child to the parent, the parent closes fd[1] and the child closes fd[0].
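The parent-to-child case can be sketched as below. This is an illustrative example under simple assumptions (a short message, a child that just copies its input to standard output); the function name send_to_child is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sends msg from the parent to a child through a pipe.
 * The child copies whatever it reads to standard output.
 * Returns the child's exit status, or -1 on error. */
int send_to_child(const char *msg)
{
    int fd[2];
    int status;
    pid_t pid;
    ssize_t written;

    if (pipe(fd) < 0)
        return -1;

    if ((pid = fork()) < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {                     /* child: reader */
        char buf[256];
        ssize_t n;

        close(fd[1]);                   /* the child does not write */
        while ((n = read(fd[0], buf, sizeof(buf))) > 0)
            if (write(STDOUT_FILENO, buf, (size_t)n) != n)
                _exit(1);
        close(fd[0]);
        _exit(n < 0 ? 1 : 0);
    }

    /* parent: writer */
    close(fd[0]);                       /* the parent does not read */
    written = write(fd[1], msg, strlen(msg));
    close(fd[1]);                       /* the child now sees end of file */

    if (waitpid(pid, &status, 0) < 0 || written < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `send_to_child("hello world\n")` should make the child print the message. Closing fd[1] in the parent is what lets the child's read loop terminate.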
    The following two rules apply when one end of the pipe is closed.
    (1) If read is called on a pipe whose write end has been closed, read returns 0 to indicate end of file once all the data has been read.
    (2) If write is called on a pipe whose read end has been closed, the signal SIGPIPE is generated. If the signal is ignored, or is caught and the handler returns, write returns -1 with errno set to EPIPE.
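Both rules can be observed inside a single process. The sketch below is only an illustration; the two function names are hypothetical, and SIGPIPE is temporarily ignored so that write reports EPIPE instead of terminating the process.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Rule 1: with no writers left, read returns 0 once the pipe is drained.
 * Returns the value of the final read (0 is expected), or -1 on error. */
int read_after_writer_closed(void)
{
    int fd[2];
    char buf[8];
    ssize_t n;

    if (pipe(fd) < 0)
        return -1;
    if (write(fd[1], "abc", 3) != 3) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);                           /* no writers remain */

    n = read(fd[0], buf, sizeof(buf));      /* drains the 3 buffered bytes */
    if (n == 3)
        n = read(fd[0], buf, sizeof(buf));  /* nothing left: end of file */
    close(fd[0]);
    return (int)n;
}

/* Rule 2: with no readers left and SIGPIPE ignored, write fails with EPIPE.
 * Returns the errno of the failed write, 0 if the write unexpectedly
 * succeeded, or -1 if the setup failed. */
int write_after_reader_closed(void)
{
    int fd[2];
    int err = 0;
    void (*old)(int);

    if (pipe(fd) < 0)
        return -1;
    if ((old = signal(SIGPIPE, SIG_IGN)) == SIG_ERR) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[0]);                           /* no readers remain */

    if (write(fd[1], "x", 1) < 0)
        err = errno;

    close(fd[1]);
    signal(SIGPIPE, old);                   /* restore the previous disposition */
    return err;
}
```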
    When writing to a pipe (or FIFO), the constant PIPE_BUF specifies the maximum number of bytes that can be written atomically. A write of PIPE_BUF bytes or fewer is not interleaved with writes from other processes to the same pipe (or FIFO). If multiple processes write to the same pipe and a single write exceeds PIPE_BUF bytes, its data may be interleaved with data from the other writers. Use the pathconf or fpathconf function to determine the value of PIPE_BUF.
    The following program copies a file, through a pipe, to a pager program run in the child process (for brevity, the return values of most function calls are not checked).
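As a small illustration of the fpathconf query, the sketch below asks for the limit on a newly created pipe. The helper name query_pipe_buf is hypothetical; _PC_PIPE_BUF is the standard POSIX name for this limit.

```c
#include <unistd.h>

/* Hypothetical helper: returns the PIPE_BUF limit that applies to a newly
 * created pipe, or -1 on error or if the limit is indeterminate. */
long query_pipe_buf(void)
{
    int fd[2];
    long limit;

    if (pipe(fd) < 0)
        return -1;

    limit = fpathconf(fd[0], _PC_PIPE_BUF);

    close(fd[0]);
    close(fd[1]);
    return limit;
}
```

POSIX requires the limit to be at least 512 bytes; the actual value depends on the system.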

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

#define MAXLINE	1024
#define DEF_PAGER	"/bin/more"		// default pager program

int main(int argc, char *argv[]){
	if(argc != 2){
		printf("Usage: %s <filename>\n", argv[0]);
		exit(1);
	}
	int fds[2];
	pipe(fds);
	pid_t pid = fork();
	if(pid > 0){				// parent
		close(fds[0]);
		char buf[MAXLINE];
		FILE *fp = fopen(argv[1], "r");
		while(fgets(buf, MAXLINE, fp) != NULL){
			write(fds[1], buf, strlen(buf));
		}
		if(ferror(fp)){
			printf("fgets error\n");
			exit(1);
		}
		close(fds[1]);		// close write end of pipe for reader
		fclose(fp);
		waitpid(pid, NULL, 0);	// Note: necessary, so the parent outlives the pager
		exit(0);
	}
	// child
	close(fds[1]);
	if(fds[0] != STDIN_FILENO){
		dup2(fds[0], STDIN_FILENO);
		close(fds[0]);			// close only the original; stdin must stay open
	}

	char *argv0, *pager;
	if((pager = getenv("PAGER")) == NULL)
		pager = DEF_PAGER;
	if((argv0 = strrchr(pager, '/')) != NULL)
		argv0++;			// step past rightmost slash
	else
		argv0 = pager;			// no slash in pager
	execl(pager, argv0, (char *)0);
	_exit(127);				// reached only if execl() failed
}

Since it is common to create a pipe connected to another process and then read its output or send data to its input, the standard I/O library provides the popen and pclose functions (similar to fopen and fclose).

#include <stdio.h>
FILE *popen(const char *cmd, const char *type);
                         /* Return value: If successful, return the file pointer; otherwise, return NULL */
int pclose(FILE *fp); /* Return value: If successful, return the termination status of the child process; otherwise, return -1 */

    The function popen creates a pipe, calls fork, and has the child call exec to run cmd; it then returns a standard I/O file pointer to the caller. If type is "r", the file pointer is connected to the standard output of the child process and is readable; if type is "w", the file pointer is connected to the standard input of the child process and is writable.
    The function pclose closes the standard I/O stream, waits for the command to terminate, and returns the shell's termination status. If the shell cannot be executed, the termination status returned by pclose is the same as if the shell executed exit(127).
    cmd is executed by the Bourne shell as "sh -c cmd". This means that the shell will expand any special characters in cmd, for example:
        fp = popen("ls *.c", "r");
    Note, however, that a set-user-ID or set-group-ID program should never call popen. Because cmd runs in a shell environment inherited from the caller, a malicious user could manipulate that environment so that the shell executes unintended commands with the elevated privileges granted by the set-ID file mode.
    Rewriting the above program with popen reduces the amount of code considerably.

#include <stdio.h>
#include <stdlib.h>

#define MAXLINE	1024
#define PAGER "${PAGER:-/bin/more}"	// environment variable, or default

int main(int argc, char *argv[]){
	if(argc != 2){
		printf("Usage: %s <filename>\n", argv[0]);
		exit(1);
	}
	FILE *fpin = fopen(argv[1], "r");
	FILE *fpout = popen(PAGER, "w");
	char buf[MAXLINE];
	while(fgets(buf, MAXLINE, fpin) != NULL){
		fputs(buf, fpout);
	}
	pclose(fpout);
	if(ferror(fpin)){
		printf("fgets error\n");
		exit(1);
	}
	exit(0);
}

The following code shows one possible implementation of the popen and pclose functions.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <sys/wait.h>

#define	MAX_FD	1024			// It depends on the system
static pid_t *childpids = NULL;	 // Pointer to array allocated at run-time

FILE *myPopen(const char *cmd, const char *type){
	if((type[0] != 'r' && type[0] != 'w') || type[1] != 0){	// only allow "r" or "w"
		errno = EINVAL;
		return NULL;
	}
	if(childpids == NULL){		// first time through
		if((childpids = calloc(MAX_FD, sizeof(pid_t))) == NULL)
			return NULL;
	}
	int fds[2];
	if(pipe(fds) < 0)
		return NULL;			// errno set by pipe()
	if(fds[0] >= MAX_FD || fds[1] >= MAX_FD){
		close(fds[0]);
		close(fds[1]);
		errno = EMFILE;			// too many files are open.
		return NULL;
	}
	pid_t	pid;
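	if((pid=fork()) < 0){
		int saved = errno;		// keep fork()'s errno across close()
		close(fds[0]);			// do not leak the pipe descriptors
		close(fds[1]);
		errno = saved;
		return NULL;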
	}else if(pid == 0){			// child
		if(*type == 'r'){
			close(fds[0]);
			if(fds[1] != STDOUT_FILENO){
				dup2(fds[1], STDOUT_FILENO);
				close(fds[1]);
			}
		}else{
			close(fds[1]);
			if(fds[0] != STDIN_FILENO){
				dup2(fds[0], STDIN_FILENO);
				close(fds[0]);
			}
		}
		int i;
		for(i=0; i<MAX_FD; i++)		// close all descriptors in childpids
			if(childpids[i] > 0)
				close(i);
		execl("/bin/sh", "sh", "-c", cmd, (char *)0);
		_exit(127);                     // execl() failed
	}
	// parent continues...
	FILE *fp;
	if(*type == 'r'){
		close(fds[1]);
		if((fp = fdopen(fds[0], type)) == NULL)
			return NULL;
	}else{
		close(fds[0]);
		if((fp = fdopen(fds[1], type)) == NULL)
			return NULL;
	}
	childpids[fileno(fp)] = pid;	// remember child pid for this fd
	return fp;
}

int myPclose(FILE *fp){
	if(childpids == NULL){		// popen has never been called
		errno = EINVAL;
		return -1;
	}
	int fd = fileno(fp);
	if(fd < 0 || fd >= MAX_FD){	// invalid file descriptor
		errno = EINVAL;
		return -1;
	}
	pid_t pid = childpids[fd];
	if(pid == 0){				// fp wasn't opened by popen()
		errno = EINVAL;
		return -1;
	}
	childpids[fd] = 0;
	if(fclose(fp) == EOF)
		return -1;
	int stat;
	while(waitpid(pid, &stat, 0) < 0)
		if(errno != EINTR)	// error other than EINTR from waitpid
			return -1;
	return stat;			// return child's termination status
}

Because a process may call popen more than once, the childpids array records the child process ID associated with each open stream, using the stream's file descriptor as the array index. POSIX.1 requires popen to close, in the child, any streams that are still open from previous calls to popen, so the child walks through childpids and closes every descriptor that is still in use. If the caller of pclose catches SIGCHLD or another signal that can interrupt a blocking call, waitpid may fail with the error EINTR, in which case waitpid is simply called again. If the application has already waited for the child (for example, by calling waitpid itself), the waitpid call in pclose finds that the child no longer exists, and pclose returns -1 with errno set to ECHILD.
