Linux C System Programming (15) Advanced Network Programming

1 Socket programming in-depth    

Socket programming has many advanced techniques. Using them gives finer control over sockets and network communication, and mastering them helps in developing high-quality network applications.

1.1 The important role of the bind function

A salient difference between a server program and a client program is that the client does not need to call the bind and listen functions. The bind function binds a socket to an IP address and port number. If bind is not called, the kernel binds the socket automatically when listen or connect is called, so in theory the call to bind can be omitted. However, the automatic binding performed by listen and by connect differs, as shown below:

  1. listen binding: the listen function has no address structure parameter, so the IP address and port number can only be chosen by the system.
  2. connect binding: the address structure (sockaddr_in) passed as a parameter specifies the server to connect to; the kernel chooses the client's own local address and port.

A client does not care which local address it uses, so letting the kernel choose is fine. A server that calls listen without bind, however, has no say over its own address: the kernel binds it to the wildcard address (INADDR_ANY) and assigns an available port. Because this port is assigned temporarily, the server may use a different port each time it runs, which would require the client program to change its port number every time; this is not realistic.
For this reason, the bind function is important for servers.
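
The difference can be observed with a minimal sketch. The function names port_without_bind and listen_on_port are hypothetical helpers invented for this illustration; the sketch assumes a Linux system where listen on an unbound TCP socket triggers the automatic binding described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Listen WITHOUT bind(): the kernel picks the address and a temporary port.
 * Returns the port that was assigned, or -1 on error. */
int port_without_bind(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int port = -1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    if (listen(fd, 5) == 0 &&
        getsockname(fd, (struct sockaddr *)&addr, &len) == 0)
        port = ntohs(addr.sin_port);
    close(fd);
    return port;
}

/* Listen WITH bind(): the server decides which port clients must use.
 * Returns the listening descriptor, or -1 on error. */
int listen_on_port(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* any local interface */
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling port_without_bind() in separate runs will generally report different port numbers, whereas listen_on_port(8080) (the port number is only an example) gives clients a stable port to connect to.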

1.2 Concurrent servers

The disadvantage of the connection-oriented server described earlier is that it can handle only one client request at a time; if one client occupies the server, the other clients are "starved". Therefore, in practice, a connection-oriented server does not use an iterative (loop) framework. Instead, it creates a separate process for each client request, that is, it is a concurrent server. The execution flow of a concurrent server (pseudo code) is as follows:

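// initialize the address structure
fd=socket();
bind(fd,...);
listen(fd,...);
while(1){
     accept_fd=accept(fd,...);
     if(fork()==0){
          close(fd);          // child: the listening socket is not needed
          // interact with the client and handle its requests
          close(accept_fd);
          exit(0);            // child must not fall back into the accept loop
     }
     close(accept_fd);        // parent: the child now owns the connection
}
close(fd);
// handle failure of the close function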

The concurrent server solves the problem of one client monopolizing an iterative server, but it introduces two new issues:

  1. Creating child processes is resource intensive, so the design must keep this overhead under control.
  2. After a child process ends, its resources must be reclaimed (the parent has to wait for it); otherwise zombie processes accumulate, which can eventually exhaust system resources.
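
The flow and the resource-recovery concern can be sketched in C as follows. This is only an illustration under stated assumptions: the names reap_children, handle_client, listen_loopback and serve_clients are hypothetical, the "request handling" is a simple echo, and the loop stops after max_clients connections so that it can be exercised in a test, whereas a real server would loop forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* SIGCHLD handler: reap every finished child so no zombies accumulate. */
static void reap_children(int signo)
{
    int saved_errno = errno;

    (void)signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;
    errno = saved_errno;
}

/* Child-side work: echo data back until the client closes the connection. */
static void handle_client(int conn_fd)
{
    char buf[512];
    ssize_t n;

    while ((n = read(conn_fd, buf, sizeof(buf))) > 0)
        if (write(conn_fd, buf, (size_t)n) != n)
            break;
}

/* Listen on 127.0.0.1 with a kernel-chosen port, reported through *port. */
int listen_loopback(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Accept max_clients connections on listen_fd, forking one child per client.
 * Returns the number of children forked, or -1 on error. */
int serve_clients(int listen_fd, int max_clients)
{
    struct sigaction sa;
    int served = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = reap_children;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;          /* restart accept() after the handler runs */
    if (sigaction(SIGCHLD, &sa, NULL) < 0)
        return -1;

    while (served < max_clients) {
        int conn_fd = accept(listen_fd, NULL, NULL);
        pid_t pid;

        if (conn_fd < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        pid = fork();
        if (pid < 0) {
            close(conn_fd);
            return -1;
        }
        if (pid == 0) {                /* child: serve this one client, then exit */
            close(listen_fd);
            handle_client(conn_fd);
            close(conn_fd);
            _exit(0);
        }
        close(conn_fd);                /* parent: the child owns the connection */
        served++;
    }
    return served;
}
```

Installing the SIGCHLD handler is one common way to address the second issue above; pre-forking a pool of workers is a frequent answer to the first, but it is outside the scope of this sketch.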

1.3 Using the connect function with UDP

The connect function can also be used with UDP sockets. Although UDP is a connectionless protocol, if the destination address of the datagrams stays fixed during communication, specifying that address on every send is unnecessary. In this case, calling connect to record the destination once makes communication more efficient. The client's execution flow is then the same as that of a connection-oriented client; the difference is that no real connection is established. The process (pseudo code) is as follows:

// initialize the address structure
fd=socket("UDP");
connect(fd,...);
// interact with the server: send messages to it / receive messages from it
close(fd);

When the data communication volume is large, the efficiency of this method will be higher than the traditional connectionless communication method.
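
A minimal sketch of this idea, assuming IPv4 and using hypothetical helper names (udp_connect, udp_connect_demo): after connect, plain send is enough and no destination address has to be passed on each call. The demo sends one datagram to a receiver on the loopback interface.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket whose default destination is ip:port.
 * Returns the descriptor, or -1 on error. */
int udp_connect(const char *ip, unsigned short port)
{
    struct sockaddr_in peer;
    int fd;

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &peer.sin_addr) != 1)
        return -1;
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Loopback demo: send one datagram through a connected UDP socket to a
 * local receiver. Returns the number of bytes received, or -1 on error. */
int udp_connect_demo(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    char buf[64];
    ssize_t n = -1;
    int tx = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);

    if (rx < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                        /* let the kernel pick a port */
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        getsockname(rx, (struct sockaddr *)&addr, &len) == 0 &&
        (tx = udp_connect("127.0.0.1", ntohs(addr.sin_port))) >= 0 &&
        send(tx, "hello", 5, 0) == 5)         /* no address argument needed */
        n = recv(rx, buf, sizeof(buf), 0);
    if (tx >= 0)
        close(tx);
    close(rx);
    return (int)n;
}
```

Without connect, the sender would have to call sendto with the full destination address for every datagram.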


2 Multiplexed I/O

Multiplexed I/O is another way of handling I/O that is often more efficient than traditional blocking I/O. It is a typical technique for making full use of time that would otherwise be spent waiting, and it is commonly used in network applications.

2.1 The concept of multiplexed I/O

Multiplexed I/O mainly prevents a process from blocking indefinitely on a single I/O operation. The idea is to build a table of the devices (usually file descriptors) that need to be read or written and to call a function that monitors the devices in this table; the function returns as soon as some device is ready for reading or writing. The multiplexed I/O model is shown in the figure:
 

Multiplexed I/O requires two kinds of system calls:

  1. One that checks for and reports the file descriptors of the devices that are ready.
  2. One that reads from or writes to those file descriptors.

2.2 Implementing multiplexed I/O with select

Under Linux, the select function is used to implement multiplexed I/O. The function prototype is as follows:

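/* According to POSIX.1-2001 */
#include <sys/select.h>
/* According to earlier standards */
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
// fd_set is essentially a bit vector: each bit represents the state of one file descriptor; 1 means set, 0 means not set.

// Linux provides dedicated macros for operating on this vector:
void FD_CLR(int fd, fd_set *set);          // clear the bit for fd
int  FD_ISSET(int fd, fd_set *set);     // test whether the bit for fd is set
void FD_SET(int fd, fd_set *set);          // set the bit for fd
void FD_ZERO(fd_set *set);               // clear all bits of the vector
int select(int nfds, fd_set *readfds, fd_set *writefds,fd_set *exceptfds, struct timeval *timeout);
Notes on some of the parameters:
If timeout is NULL, select waits indefinitely until a device is ready.
If readfds, writefds, and exceptfds are all NULL (no interest in any of the three conditions), select acts as a timer with microsecond resolution: it simply returns once the time has elapsed.
Return value: the number of ready devices on success; 0 if the timeout expired with no device ready; -1 on error.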

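See the Linux function reference manual for details. Note: by default a process can usually have at most 1024 open file descriptors, and select can only monitor descriptors numbered below FD_SETSIZE (typically also 1024).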
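
A short sketch of both uses of select follows: waiting for one descriptor with a timeout, and acting as a pure timer. The function names wait_readable, sleep_ms and select_pipe_demo are hypothetical, chosen only for this example; the demo uses a pipe as a stand-in for a device.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int n;

    if (fd < 0 || fd >= FD_SETSIZE)          /* fd_set cannot hold this fd */
        return -1;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}

/* Use select() purely as a timer: all three descriptor sets are NULL. */
int sleep_ms(long ms)
{
    struct timeval tv;

    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return select(0, NULL, NULL, NULL, &tv);
}

/* Demo on a pipe: not readable while empty, readable after a write.
 * Returns 1 if both checks behave as expected, 0 otherwise. */
int select_pipe_demo(void)
{
    int p[2], before, after;

    if (pipe(p) < 0)
        return 0;
    before = wait_readable(p[0], 20);
    if (write(p[1], "x", 1) != 1)
        before = -1;
    after = wait_readable(p[0], 20);
    close(p[0]);
    close(p[1]);
    return before == 0 && after == 1;
}
```

Note that select overwrites the descriptor sets with the result, so they have to be rebuilt (or copied) before every call.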

2.3 Multiplexed I/O with signal masking

The differences between the pselect function and the select function:

  1. The last parameter of pselect is a signal mask that is in effect for the duration of the call, so the listed signals are blocked while waiting; SIGKILL and SIGSTOP cannot be blocked. (This prevents a malicious program from making itself unstoppable.)
  2. The time parameter of pselect is a timespec structure, whose finest unit is nanoseconds, whereas the timeval structure used by select is limited to microseconds. The accuracy actually achieved still depends on the kernel and the hardware.

Under Linux, the pselect function implements multiplexed I/O with signal masking. The function prototype is as follows:

int pselect(int nfds, fd_set *readfds, fd_set *writefds,fd_set *exceptfds, const struct timespec *timeout,const sigset_t *sigmask);

See the Linux function reference manual for details.
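
As a rough illustration, the sketch below waits on a descriptor while SIGINT is blocked for the duration of the call. The function names wait_readable_no_sigint and pselect_pipe_demo are hypothetical, and the choice of SIGINT is only an example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Wait up to timeout_ns nanoseconds for fd to become readable.
 * While pselect() is waiting, SIGINT is blocked by the mask passed in.
 * Returns the pselect() result: 1 readable, 0 timeout, -1 error. */
int wait_readable_no_sigint(int fd, long timeout_ns)
{
    fd_set readfds;
    sigset_t mask;
    struct timespec ts;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;
    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);                /* signal to block during the wait */
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    ts.tv_sec = timeout_ns / 1000000000L;
    ts.tv_nsec = timeout_ns % 1000000000L;   /* nanosecond field of timespec */
    return pselect(fd + 1, &readfds, NULL, NULL, &ts, &mask);
}

/* Demo on a pipe: timeout while empty, ready after a write.
 * Returns 1 if both checks behave as expected, 0 otherwise. */
int pselect_pipe_demo(void)
{
    int p[2], before, after;

    if (pipe(p) < 0)
        return 0;
    before = wait_readable_no_sigint(p[0], 20000000L);   /* 20 ms */
    if (write(p[1], "x", 1) != 1)
        before = -1;
    after = wait_readable_no_sigint(p[0], 20000000L);
    close(p[0]);
    close(p[1]);
    return before == 0 && after == 1;
}
```

The mask passed to pselect replaces the process signal mask only while the call is in progress; the previous mask is restored when it returns.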

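2.4 Server-side flow of multiplexed I/O

// initialize the address structure
fd=socket();
bind(fd,...);
listen(fd,...);
// initialize the descriptor set with the FD_* macros
     FD_ZERO(&set);
     FD_SET(fd,&set);
while(1){
     ready=set;                         // select modifies its argument, so pass a copy
     number_fd=select(...,&ready,...);
     if(FD_ISSET(fd,&ready)){
          accept_fd=accept(fd,...);
          FD_SET(accept_fd,&set);       // monitor the new client as well
     }
     // for each ready client descriptor: interact with the client and handle its request;
     // when a client is finished: close(accept_fd); FD_CLR(accept_fd,&set);
}
close(fd);
// handle failure of the close function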
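
The flow above can be sketched as a single-process echo server. This is an illustration under stated assumptions, not a production design: the names listen_loopback and select_echo_server are hypothetical, request handling is a plain echo, and the loop ends after `limit` clients have disconnected so that it can be tested.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Listen on 127.0.0.1 with a kernel-chosen port, reported through *port. */
int listen_loopback(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Echo server driven by select(). Returns 0 once `limit` clients have
 * disconnected, or -1 if select() fails. */
int select_echo_server(int listen_fd, int limit)
{
    fd_set all, ready;
    int maxfd = listen_fd, closed = 0, fd;

    FD_ZERO(&all);
    FD_SET(listen_fd, &all);
    while (closed < limit) {
        ready = all;                          /* select() overwrites its argument */
        if (select(maxfd + 1, &ready, NULL, NULL, NULL) < 0)
            return -1;
        for (fd = 0; fd <= maxfd; fd++) {
            if (!FD_ISSET(fd, &ready))
                continue;
            if (fd == listen_fd) {            /* new connection is waiting */
                int conn = accept(listen_fd, NULL, NULL);
                if (conn < 0)
                    continue;
                if (conn >= FD_SETSIZE) {     /* does not fit into an fd_set */
                    close(conn);
                    continue;
                }
                FD_SET(conn, &all);
                if (conn > maxfd)
                    maxfd = conn;
            } else {                          /* data or EOF from a client */
                char buf[512];
                ssize_t n = read(fd, buf, sizeof(buf));
                if (n <= 0 || write(fd, buf, (size_t)n) != n) {
                    close(fd);
                    FD_CLR(fd, &all);
                    closed++;
                }
            }
        }
    }
    return 0;
}
```

Unlike the fork-based concurrent server, one process serves all clients here, so no child processes have to be created or reaped.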

2.5 Polling I/O

The poll function originated as a way to multiplex I/O on STREAMS devices, which select did not originally cover. The poll function supports all types of file descriptors for I/O multiplexing; unlike select, it uses a separate pollfd structure for each file descriptor to be monitored. The structure describes the events to monitor and the events that actually occurred on the target file descriptor. It is defined as follows:

#include <sys/poll.h>
struct pollfd {
     int fd;         /* file descriptor */
     short events;         /* events to wait for; the operations the caller is interested in */
     short revents;       /* events that actually occurred on the file descriptor */
};

The events and revents members hold a combination of one or more of the following flags, as shown in the table:

Note: the events member cannot request the error flags (POLLERR, POLLHUP, and POLLNVAL are only reported in revents); the revents member is filled in by poll when it returns, so there is no need to set it before the call. The prototype of the poll function under Linux is as follows:

#include <poll.h>
int poll(struct pollfd *fds, nfds_t nfds, int timeout);

See the Linux function reference manual for details.
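
A minimal sketch of poll on a single descriptor, using the standard POLLIN flag to request "data available to read". The function names poll_readable and poll_pipe_demo are hypothetical; the demo again uses a pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error or error event. */
int poll_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int n;

    pfd.fd = fd;
    pfd.events = POLLIN;       /* what we are interested in */
    pfd.revents = 0;           /* filled in by poll() */
    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;
    if (pfd.revents & POLLIN)
        return 1;
    return -1;                 /* only POLLERR, POLLHUP or POLLNVAL was reported */
}

/* Demo on a pipe: timeout while empty, ready after a write.
 * Returns 1 if both checks behave as expected, 0 otherwise. */
int poll_pipe_demo(void)
{
    int p[2], before, after;

    if (pipe(p) < 0)
        return 0;
    before = poll_readable(p[0], 20);
    if (write(p[1], "x", 1) != 1)
        before = -1;
    after = poll_readable(p[0], 20);
    close(p[0]);
    close(p[1]);
    return before == 0 && after == 1;
}
```

Because each descriptor has its own pollfd entry, the array does not need to be rebuilt between calls, and there is no FD_SETSIZE ceiling on descriptor numbers.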


3 Non-network communication socket

Non-network communication sockets are mainly used for communication between processes on the same machine. Since all processes on one machine share the same IP address, no network address is needed to identify the two parties; the sockets are identified locally instead.

3.1 Unnamed UNIX domain socket

Linux uses the socketpair function to create a pair of unnamed, interconnected UNIX domain sockets. The function prototype is as follows:

#include <sys/types.h>         
#include <sys/socket.h>
int socketpair(int domain, int type, int protocol, int sv[2]);

See the Linux function reference manual for details. The socketpair function creates a pair of unnamed UNIX domain sockets. Since they have no name, unrelated processes cannot reach them; only processes that hold the file descriptors of the pair can use them. This is somewhat similar to a pipe: the parent process creates a child process, both hold the file descriptors of both ends, and each closes the end it does not use before communication starts.
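
The parent/child pattern described above might look like the following sketch. The function name socketpair_demo is hypothetical: the parent sends a message, and the child replies with the number of bytes it received.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Parent sends msg to a forked child over a socketpair; the child answers
 * with the number of bytes it read. Returns that count, or -1 on error. */
int socketpair_demo(const char *msg)
{
    int sv[2];
    int count = -1;
    size_t len = strlen(msg);
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                          /* child uses sv[1] only */
        char buf[128];
        int got;

        close(sv[0]);
        got = (int)read(sv[1], buf, sizeof(buf));
        if (write(sv[1], &got, sizeof(got)) != (ssize_t)sizeof(got))
            _exit(1);
        close(sv[1]);
        _exit(0);
    }
    close(sv[1]);                            /* parent uses sv[0] only */
    if (write(sv[0], msg, len) != (ssize_t)len ||
        read(sv[0], &count, sizeof(count)) != (ssize_t)sizeof(count))
        count = -1;
    close(sv[0]);
    waitpid(pid, NULL, 0);                   /* reclaim the child */
    return count;
}
```

Unlike a pipe, each end of a socketpair is bidirectional, which is why a single pair is enough for both the request and the reply here.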

3.2 Named UNIX domain sockets

Unnamed UNIX domain sockets limit the scope of communication and are not flexible; named sockets can solve this problem. Like network communication sockets, UNIX domain sockets also need to bind addresses, but because the domains used by the two are different, their address families are also different. In the Linux system, the UNIX domain socket uses the sockaddr_un structure to store the address, and its structure prototype is as follows:

#include <sys/un.h>
struct sockaddr_un{
    sa_family_t sun_family;          // address family, i.e. AF_UNIX
    char sun_path[108];              // path name of the socket file
};

When an address is bound to a UNIX domain socket, the system creates a file of type S_IFSOCK whose path name is the one given in sun_path. This file cannot be opened with open; processes that need to communicate refer to it by its path, so it acts like a relay station for information. When calling the bind function for address binding, you need to pass the length of the address structure to the kernel. The usual way of writing is:

int sfd;
struct sockaddr_un un;
bind(sfd, (struct sockaddr *)&un, sizeof(struct sockaddr_un));

See the Linux function reference manual for details. Here the third argument is the size of the whole address structure. The size of the sun_path array itself is sizeof(struct sockaddr_un) - sizeof(sa_family_t) on Linux; using sizeof rather than a hard-coded number improves the portability of the code. Note:

  1. A client that wants the server to be able to identify it or reply to it (notably with datagram sockets) should call bind to bind its own socket to a path. With network sockets the kernel performs an implicit binding for the client; a UNIX domain client that does not call bind generally remains unnamed.
  2. Before binding a socket, unlink the socket file at the target path if it already exists; otherwise bind fails.
  3. After binding the socket to a path, if no further sockets need to connect through that path, the path can be unlinked.

It can be seen that processes communicate through the socket file, but a leftover directory entry for that file must not exist when an address is bound, because bind fails if the path already exists. The server's socket file usually exists for a long time, so the server removes any stale entry and creates it again each time it starts. A client's socket file is created when the client binds its address and is deleted after the communication is completed. Each client process must have its own socket file; to ensure the uniqueness of the file in the system, the usual practice is to use the process ID in the file name, so that file names do not conflict.
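
The notes above can be sketched as follows. All names here (fill_addr, unix_listen, unix_connect, pid_socket_path) and the /tmp/demo_<pid>.sock naming scheme are assumptions made for this illustration, not part of any library.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill a sockaddr_un with path; fails if the path does not fit in sun_path. */
static int fill_addr(struct sockaddr_un *un, const char *path)
{
    if (strlen(path) >= sizeof(un->sun_path))
        return -1;
    memset(un, 0, sizeof(*un));
    un->sun_family = AF_UNIX;
    strcpy(un->sun_path, path);
    return 0;
}

/* Build a per-process socket path from the process ID (hypothetical scheme). */
void pid_socket_path(char *buf, size_t size)
{
    snprintf(buf, size, "/tmp/demo_%ld.sock", (long)getpid());
}

/* Create a listening UNIX domain socket bound to path; returns fd or -1. */
int unix_listen(const char *path)
{
    struct sockaddr_un un;
    int fd;

    if (fill_addr(&un, path) < 0)
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    unlink(path);                 /* remove a stale socket file, or bind fails */
    if (bind(fd, (struct sockaddr *)&un, sizeof(un)) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Connect to the UNIX domain socket at path; returns fd or -1. */
int unix_connect(const char *path)
{
    struct sockaddr_un un;
    int fd;

    if (fill_addr(&un, path) < 0)
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A server would call unix_listen with a well-known path and unlink that path when it shuts down; the stream client in this sketch connects without binding a path of its own.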

3.3 Server-side and client-side processes and precautions of UNIX domain sockets

UNIX domain sockets are used in the same way as network communication sockets, and are also divided into server processes and client processes. The execution flow of the server and the execution flow of the client are consistent with the execution flow of the network communication socket.

Origin blog.csdn.net/vviccc/article/details/105175132