Socket communication principle and routines (understand at a glance)

After reading the material below, I understood why Baidu Encyclopedia describes a socket as connecting the application process above it with the network protocol stack below it: a socket is the interface through which an application communicates over network protocols and interacts with the protocol stack.

[Two figures, taken from "Deep Practice Embedded Linux System Migration"]

Retrieved from: https://blog.csdn.net/jiushimanya/article/details/82684525

jiushimanya, 2018-09-13 10:53:41


If you have questions or spot mistakes, leave me a message.
Do the terms TCP/IP, UDP, and socket programming sound familiar? As network technology has spread, we hear them everywhere. So let me ask:

  1. What are TCP/IP and UDP?
  2. Where is Socket?
  3. What is Socket?
  4. Will you use them?

What are TCP/IP and UDP?

TCP/IP (Transmission Control Protocol/Internet Protocol) is an industry-standard protocol suite, originally designed for wide area networks (WANs).
UDP (User Datagram Protocol) is the counterpart of TCP and is itself a member of the TCP/IP protocol family.
The following diagram shows how these protocols relate to one another.
[Figure: the protocols of the TCP/IP suite]
The TCP/IP protocol suite spans the transport layer, the network layer, and the link layer. The relationship between TCP/IP and UDP should now be clear: UDP is one of the protocols inside the suite.
Where is Socket?
The diagram above shows no sign of the socket, so where is it? Again a picture makes it clear at a glance.
[Figure: where the socket layer sits relative to the TCP/IP suite]
So this is where the socket lives.
What is Socket?
A socket is an abstraction layer that sits between the application layer and the TCP/IP protocol suite. It is a set of interfaces. In design-pattern terms, the socket is a facade: it hides the complex TCP/IP protocol family behind the socket interface. From the user's point of view, this small set of interfaces is all there is; the socket takes care of organizing the data so that it conforms to the chosen protocol.
Will you use them?
Our predecessors have done a great deal for us, and communicating across a network is much easier than it used to be, but some work is still left to us. When I first heard about socket programming I assumed it was advanced material, but once you understand how it works, the mystery disappears.
Consider a scene from everyday life. To call a friend, you dial the number; the friend hears the ring and picks up. At that point the two of you have a connection and can talk. When the conversation is over, you hang up to end it. This everyday scene mirrors how sockets work. Perhaps the TCP/IP protocol family was inspired by everyday life, though that is only a guess.
[Figure: the sequence of socket calls made by a TCP server and a TCP client]

Start with the server. The server first initializes a socket (socket), binds it to a port (bind), listens on that port (listen), then calls accept and blocks until a client connects. If a client now initializes a socket and connects to the server (connect), and the connection succeeds, the connection between client and server is established. The client sends a request; the server receives and processes it and sends back a response; the client reads the response. Finally the connection is closed and the interaction ends.
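The sequence just described can be sketched in C with the POSIX socket API. This is a minimal illustration, not the author's program: it assumes a Unix-like system, runs both the server and the client inside one process on 127.0.0.1, lets the kernel pick the port, and has the server simply echo the request back. The function name loopback_round_trip is made up for this sketch.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Runs one request/response exchange over TCP on 127.0.0.1.
 * Copies the server's (echoed) response into reply and returns its
 * length, or -1 if any step fails. */
ssize_t loopback_round_trip(const char *request, char *reply, size_t cap)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    char buf[128];
    size_t req_len = strlen(request);
    ssize_t got, n = -1;
    int lfd, cfd = -1, sfd = -1;

    if (req_len > sizeof(buf)) req_len = sizeof(buf);

    /* Server: socket -> bind -> listen */
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0) return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);              /* 0: let the kernel choose a port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto done;
    if (listen(lfd, 1) < 0) goto done;
    if (getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) goto done;

    /* Client: socket -> connect */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0) goto done;
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto done;

    /* Server: accept the queued connection */
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0) goto done;

    /* Client sends the request; server reads it and sends it back */
    if (write(cfd, request, req_len) != (ssize_t)req_len) goto done;
    got = read(sfd, buf, sizeof(buf));
    if (got <= 0) goto done;
    if (write(sfd, buf, (size_t)got) != got) goto done;

    /* Client reads the response */
    n = read(cfd, reply, cap);
done:
    if (sfd >= 0) close(sfd);
    if (cfd >= 0) close(cfd);
    close(lfd);
    return n;
}
```

Calling loopback_round_trip("ping", reply, sizeof reply) should return 4 and leave "ping" in reply. A real server and client would of course be separate processes, and would loop on read and write rather than assume one call transfers everything.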

============================================

We all appreciate the value of exchanging information, but how do processes on a network communicate? When we open a browser to view a web page, how does the browser process talk to the web server? When you chat on QQ, how does your QQ process talk to the server, or to your friend's QQ process? All of this relies on sockets. So what is a socket? What types of socket are there? And what are the basic socket functions? These are the topics of this article. The main contents are as follows:

1. How to communicate between processes in the network?
2. What is Socket?
3. The basic operations of sockets
3.1. socket() function
3.2. bind() function
3.3. listen() and connect() functions
3.4. accept() function
3.5. read(), write(), and related functions
3.6. close() function
4. The TCP three-way handshake that establishes a connection, as seen in socket calls
5. The TCP four-way handshake that releases a connection, as seen in socket calls
6. An example
1. How to communicate between processes in the network?
There are many ways of local inter-process communication (IPC), but they can be summarized into the following four categories:

Message passing (pipes, FIFOs, message queues)
Synchronization (mutexes, condition variables, read-write locks, file and record locks, semaphores)
Shared memory (anonymous and named)
Remote procedure calls (Solaris doors and Sun RPC)
But these are not the subject of this article! What we want to discuss is how processes communicate across a network. The first problem to solve is how to identify a process uniquely; without that, communication is impossible. Locally, the process ID (PID) identifies a process uniquely, but a PID means nothing to another host. The TCP/IP protocol family solves this problem for us: the network layer's IP address uniquely identifies a host in the network, and the transport layer's "protocol + port" uniquely identifies an application (process) on that host. The triple (IP address, protocol, port) therefore identifies a process on the network, and processes use this identifier to interact with one another.

Applications that use TCP/IP usually communicate through one of two application programming interfaces: UNIX BSD sockets or UNIX System V TLI (now obsolete). Today almost all applications use sockets. This is the network age, and communication between processes across the network is everywhere, which is why I say "everything is a socket".

2. What is Socket?
We now know that processes on a network communicate through sockets, so what is a socket? Sockets originated in Unix, and one of the basic philosophies of Unix/Linux is that "everything is a file": everything can be operated on with the pattern "open -> read/write -> close". My understanding is that a socket is one implementation of this pattern. A socket is a special kind of file, and the socket functions are operations on it (read/write I/O, open, close). These functions are introduced below.

The origin of the word socket
The first known use of the word "socket" is in IETF RFC 33, published on February 12, 1970 and written by Stephen Carr, Steve Crocker, and Vint Cerf. According to the Computer History Museum, Crocker wrote: "The elements of the name space are called sockets. A socket forms one end of a connection, and a connection is fully specified by a pair of sockets." The Computer History Museum adds: "This antedates the BSD socket interface definition by about 12 years."

3. The basic operations of sockets
Since a socket is an implementation of the "open -> read/write -> close" pattern, it provides a function for each of these operations. The following uses TCP as an example to introduce the basic socket functions.

3.1. socket() function
int socket(int domain, int type, int protocol);
The socket function corresponds to opening an ordinary file. Opening an ordinary file returns a file descriptor; socket() creates a socket descriptor, which uniquely identifies a socket. Like a file descriptor, the socket descriptor is passed as a parameter to subsequent operations, such as reads and writes.

Just as you can pass different arguments to fopen to open different files, you can pass different arguments to socket() to create different kinds of socket. The three parameters of the socket function are:

domain: The protocol domain, also called the protocol family. Commonly used protocol families are AF_INET, AF_INET6, AF_LOCAL (also called AF_UNIX, the Unix domain socket), AF_ROUTE, and so on. The protocol family determines the socket's address type, and communication must use an address of that type. For example, AF_INET uses the combination of an IPv4 address (32 bits) and a port number (16 bits), and AF_UNIX uses an absolute path name as the address.
type: The socket type. Commonly used socket types are SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, SOCK_PACKET, SOCK_SEQPACKET, and so on.
protocol: As the name suggests, the protocol to use. Commonly used protocols are IPPROTO_TCP, IPPROTO_UDP, IPPROTO_SCTP, IPPROTO_TIPC, and so on, which correspond to the TCP, UDP, SCTP, and TIPC transport protocols respectively (I will discuss TIPC separately!).
Note: type and protocol cannot be combined arbitrarily. For example, SOCK_STREAM cannot be combined with IPPROTO_UDP. When protocol is 0, the default protocol for the given type is selected automatically.

When we call socket() to create a socket, the returned descriptor exists in a protocol family (address family, AF_XXX) space but has no specific address. To assign it an address, call bind(); otherwise the system assigns a port automatically when connect() or listen() is called.
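The note about type and protocol combinations can be checked with a few lines of C. The sketch below assumes a POSIX system; probe_socket_type is a hypothetical helper that creates a socket, asks the kernel for its type through the standard SO_TYPE socket option, and closes it again.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates a socket with the given arguments, reads its type back with
 * getsockopt(SO_TYPE), and closes it. Returns the type (SOCK_STREAM,
 * SOCK_DGRAM, ...) or -1 if the socket could not be created. */
int probe_socket_type(int domain, int type, int protocol)
{
    int fd = socket(domain, type, protocol);
    int actual = -1;
    socklen_t len = sizeof(actual);

    if (fd < 0) return -1;                 /* e.g. an invalid combination */
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &actual, &len) < 0) actual = -1;
    close(fd);
    return actual;
}
```

With this helper, probe_socket_type(AF_INET, SOCK_STREAM, 0) and probe_socket_type(AF_INET, SOCK_STREAM, IPPROTO_TCP) both describe a TCP socket, while probe_socket_type(AF_INET, SOCK_STREAM, IPPROTO_UDP) is expected to fail because a stream socket cannot use UDP.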

3.2. The bind() function
As mentioned above, bind() assigns a specific address from an address family to the socket. For AF_INET and AF_INET6, for example, it assigns a combination of an IPv4 or IPv6 address and a port number.

int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
The three parameters of the function are:

sockfd: the socket descriptor, which is created by the socket() function and uniquely identifies a socket. The bind() function binds a name to this descriptor.
addr: a const struct sockaddr * pointer to the protocol address to be bound to sockfd. The layout of this address structure depends on the address family chosen when the socket was created. IPv4 corresponds to:

    struct sockaddr_in {
        sa_family_t    sin_family; /* address family: AF_INET */
        in_port_t      sin_port;   /* port in network byte order */
        struct in_addr sin_addr;   /* internet address */
    };

    struct in_addr {
        uint32_t       s_addr;     /* address in network byte order */
    };

IPv6 corresponds to:

    struct sockaddr_in6 {
        sa_family_t     sin6_family;   /* AF_INET6 */
        in_port_t       sin6_port;     /* port number */
        uint32_t        sin6_flowinfo; /* IPv6 flow information */
        struct in6_addr sin6_addr;     /* IPv6 address */
        uint32_t        sin6_scope_id; /* scope ID */
    };

    struct in6_addr {
        unsigned char   s6_addr[16];   /* IPv6 address */
    };

The Unix domain corresponds to:

    #define UNIX_PATH_MAX 108

    struct sockaddr_un {
        sa_family_t sun_family;               /* AF_UNIX */
        char        sun_path[UNIX_PATH_MAX];  /* path name */
    };

addrlen: the length of the address.

A server usually binds to a well-known address (such as IP address + port number) when it starts, so that it can provide its service and clients can connect to it. A client does not need to specify an address; the system automatically assigns a port number and combines it with the host's own IP address. This is why a server usually calls bind() before listen(), while a client usually does not call it and lets the system pick a port at connect() time.
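As an illustration of bind() with struct sockaddr_in, the C sketch below binds a TCP socket to the loopback address and then asks the kernel which port the socket ended up with. It assumes a POSIX system; bind_loopback and local_port are names invented for this example, and passing port 0 (which asks the kernel to choose a free port) is a convenience of the sketch rather than something the article uses.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates a TCP socket and binds it to 127.0.0.1:port.
 * Port 0 lets the kernel choose a free port.
 * Returns the descriptor, or -1 on failure. */
int bind_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0) return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);                    /* network byte order */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* 127.0.0.1 */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns the local port of fd in host byte order, or -1 on error. */
int local_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0) return -1;
    return ntohs(addr.sin_port);
}
```

A typical use is int fd = bind_loopback(0); followed by local_port(fd) to learn the assigned port. Trying to bind a second socket to the same address and port while the first is still open is expected to fail, because the address is already in use.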




Network byte order and host byte order
Host byte order is what we usually call big-endian or little-endian: different CPUs use different byte orders, meaning the order in which the bytes of an integer are stored in memory. This is called the host byte order. The standard definitions of big-endian and little-endian are as follows:

  a) Little-Endian means that the low byte is placed at the low address end of the memory, and the high byte is placed at the high address end of the memory.

  b) Big-Endian means that the high byte is placed at the low address end of the memory, and the low byte is placed at the high address end of the memory.

Network byte order: the 4 bytes of a 32-bit value are transmitted in this order: bits 0-7 first, then bits 8-15, then bits 16-23, and finally bits 24-31, where bit 0 is the most significant bit, as in RFC header diagrams. This transmission order is called big-endian. Because every binary integer in the TCP/IP headers must be in this order when sent over the network, it is also called network byte order. Byte order, as the name implies, is the order in which data longer than one byte is stored in memory; a single byte has no ordering problem.

Therefore, when binding an address to a socket, convert from host byte order to network byte order first; do not assume the host byte order is big-endian like the network byte order. This mistake has caused real damage: in my company's project code it led to many baffling problems. So make no assumptions about the host byte order, and always convert values to network byte order before assigning them to the socket address.
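The conversion is done with the standard htons()/htonl() functions (and ntohs()/ntohl() for the reverse direction). The short C sketch below shows the effect on port 4567, the port used in the example at the end of this article; port_wire_bytes and host_is_little_endian are hypothetical helper names.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Stores in out[0..1] the two bytes of port as they appear in memory
 * after conversion to network byte order, i.e. as they go on the wire. */
void port_wire_bytes(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);      /* host order -> network order */
    memcpy(out, &net, sizeof(net));  /* copy the bytes in memory order */
}

/* Returns 1 on a little-endian host, 0 on a big-endian host. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);       /* look at the lowest-addressed byte */
    return first == 1;
}
```

4567 is 0x11D7, so port_wire_bytes(4567, b) yields b[0] == 0x11 and b[1] == 0xD7 on any host, whatever its own byte order. On a big-endian host htons() changes nothing; on a little-endian host it swaps the two bytes.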

3.3. listen() and connect() functions
A server calls listen() after socket() and bind() to listen on the socket. When a client then calls connect() to request a connection, the server receives the request.

int listen(int sockfd, int backlog);
int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
The first parameter of listen is the descriptor of the socket to listen on, and the second is the maximum number of connections that can be queued for that socket. A socket created by socket() is an active socket by default; listen turns it into a passive socket that waits for connection requests from clients.

The first parameter of connect is the client's socket descriptor, the second is the server's socket address, and the third is the length of that address. The client establishes a connection with the TCP server by calling connect.
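The difference between an active and a passive socket can be observed directly. The following C sketch, which assumes a POSIX system with a loopback interface, binds a socket to a kernel-chosen port and attempts a connection once before and once after listen() is called. The function names are invented for this illustration.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Tries to connect a fresh TCP socket to addr.
 * Returns 0 on success, otherwise the errno value set by the failure. */
int try_connect(const struct sockaddr_in *addr)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int err = 0;

    if (fd < 0) return errno;
    if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) < 0)
        err = errno;
    close(fd);
    return err;
}

/* Binds a socket to 127.0.0.1 on a kernel-chosen port, then connects to
 * it once before and once after listen(). Stores both results and
 * returns 0 if the setup itself worked. */
int connect_before_and_after_listen(int *before, int *after)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lfd = socket(AF_INET, SOCK_STREAM, 0);

    if (lfd < 0) return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
        close(lfd);
        return -1;
    }
    *before = try_connect(&addr);   /* bound, but not yet a passive socket */
    if (listen(lfd, 4) < 0) {
        close(lfd);
        return -1;
    }
    *after = try_connect(&addr);    /* passive: the kernel queues the request */
    close(lfd);
    return 0;
}
```

On Linux the first attempt is refused (ECONNREFUSED) and the second succeeds. Note that the second connect() succeeds even though this sketch never calls accept(): the kernel completes the connection and holds it in the listen queue until the server picks it up.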

3.4. accept() function
After the TCP server has called socket(), bind(), and listen() in turn, it is listening on the specified socket address. After the TCP client has called socket() and connect() in turn, it sends a connection request to the TCP server. When the request arrives, the server calls accept() to accept it, and the connection is established. Network I/O can then begin, that is, reads and writes similar to those on ordinary files.

int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
The first parameter of accept is the server's socket descriptor. The second is a pointer to a struct sockaddr, used to return the client's protocol address. The third is a pointer to the length of that address. If accept succeeds, its return value is a brand-new descriptor generated by the kernel, which represents the TCP connection to the client.

Note: The first parameter of accept is the server's socket descriptor, created when the server first called socket(); it is called the listening socket descriptor. The descriptor that accept returns is the connected socket descriptor. A server usually creates only one listening socket descriptor, which exists for the lifetime of the server. The kernel creates a connected socket descriptor for each client connection the server process accepts, and the server closes that descriptor when it has finished serving the client.
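The distinction between the listening descriptor and the connected descriptor, and the client address that accept() hands back, can be seen in a small C sketch. It assumes a POSIX system and runs client and server in one process on 127.0.0.1; struct accept_demo and its two functions are hypothetical names for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Descriptors and ports collected while accepting one connection. */
struct accept_demo {
    int listen_fd;    /* listening descriptor, created by socket() */
    int conn_fd;      /* connected descriptor, returned by accept() */
    int client_fd;    /* the client's own descriptor */
    int peer_port;    /* client port as reported by accept() */
    int client_port;  /* client port as reported by the client's getsockname() */
};

/* Closes whichever descriptors in d are still open. */
void accept_demo_close(struct accept_demo *d)
{
    if (d->conn_fd >= 0) close(d->conn_fd);
    if (d->client_fd >= 0) close(d->client_fd);
    if (d->listen_fd >= 0) close(d->listen_fd);
    d->conn_fd = d->client_fd = d->listen_fd = -1;
}

/* Listens on 127.0.0.1, connects one client and accepts it.
 * Returns 0 on success; on failure closes everything and returns -1. */
int accept_demo_run(struct accept_demo *d)
{
    struct sockaddr_in srv, peer, self;
    socklen_t len;

    d->listen_fd = d->conn_fd = d->client_fd = -1;
    d->peer_port = d->client_port = -1;
    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = htons(0);                      /* kernel-chosen port */

    d->listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (d->listen_fd < 0) goto fail;
    if (bind(d->listen_fd, (struct sockaddr *)&srv, sizeof(srv)) < 0) goto fail;
    if (listen(d->listen_fd, 1) < 0) goto fail;
    len = sizeof(srv);
    if (getsockname(d->listen_fd, (struct sockaddr *)&srv, &len) < 0) goto fail;

    d->client_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (d->client_fd < 0) goto fail;
    if (connect(d->client_fd, (struct sockaddr *)&srv, sizeof(srv)) < 0) goto fail;

    len = sizeof(peer);     /* in: size of the buffer, out: size of the address */
    d->conn_fd = accept(d->listen_fd, (struct sockaddr *)&peer, &len);
    if (d->conn_fd < 0) goto fail;
    d->peer_port = ntohs(peer.sin_port);

    len = sizeof(self);
    if (getsockname(d->client_fd, (struct sockaddr *)&self, &len) < 0) goto fail;
    d->client_port = ntohs(self.sin_port);
    return 0;
fail:
    accept_demo_close(d);
    return -1;
}
```

After a successful accept_demo_run(&d), conn_fd differs from listen_fd, and peer_port equals client_port: the address filled in by accept() is the address the kernel assigned to the client at connect() time. The listening descriptor remains usable for further accept() calls until accept_demo_close(&d) is called.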

3.5. read(), write(), and related functions
Everything is now in place: the server and the client have established a connection, and the network I/O functions can be called to read and write, which is how processes on different hosts communicate. Network I/O functions come in the following groups:

read()/write()
recv()/send()
readv()/writev()
recvmsg()/sendmsg()
recvfrom()/sendto()
I recommend the recvmsg()/sendmsg() functions. These two are the most general I/O functions; in fact, all the other functions above can be replaced by them. The declarations are as follows:

   #include <unistd.h>

   ssize_t read(int fd, void *buf, size_t count);
   ssize_t write(int fd, const void *buf, size_t count);

   #include <sys/types.h>
   #include <sys/socket.h>

   ssize_t send(int sockfd, const void *buf, size_t len, int flags);
   ssize_t recv(int sockfd, void *buf, size_t len, int flags);

   ssize_t sendto(int sockfd, const void *buf, size_t len, int flags,
                  const struct sockaddr *dest_addr, socklen_t addrlen);
   ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags,
                    struct sockaddr *src_addr, socklen_t *addrlen);

   ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);
   ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags);

The read function reads from fd. On success it returns the number of bytes actually read. A return value of 0 means end of file has been reached, and a value less than 0 means an error occurred. If the error is EINTR, the read was interrupted by a signal; if it is ECONNRESET, there is a problem with the network connection.

The write function writes count bytes from buf to the file descriptor fd. On success it returns the number of bytes written. On failure it returns -1 and sets errno. In a network program, writing to a socket descriptor has two possible outcomes. 1) The return value is greater than 0, which means that part or all of the data has been written. 2) The return value is less than 0, which means an error occurred, and it must be handled according to its type. If the error is EINTR, the write was interrupted by a signal. If it is EPIPE, there is a problem with the network connection (the other side has closed it).

I will not introduce the other pairs of I/O functions one by one; for details, see the man pages or search Baidu or Google. The example below uses send/recv.

3.6. close() function
After the server and the client have established a connection, they perform some reads and writes. When these are finished, the corresponding socket descriptor must be closed, just as fclose is called to close an open file when you are done with it.

#include <unistd.h>
int close(int fd);
By default, closing a TCP socket marks it as closed and returns to the calling process immediately. The descriptor can no longer be used by the calling process; that is, it can no longer be passed as the first parameter of read or write.

Note: close only decrements the reference count of the socket descriptor by 1. Only when the reference count reaches 0 does the TCP client send a connection termination request to the server.
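The reference-count behaviour can be demonstrated by duplicating a descriptor with dup(). The C sketch below makes two simplifying assumptions: it uses a Unix domain socketpair() instead of a TCP connection, so "end of file at the peer" stands in for the FIN, and it relies on the MSG_DONTWAIT flag, which is available on Linux and the BSDs. The function names are invented for this illustration.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Non-blocking probe: returns 1 if the peer of fd has closed (end of
 * file), 0 if the connection is still open, -1 on any other error. */
int peer_has_closed(int fd)
{
    char c;
    ssize_t n = recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT);

    if (n == 0) return 1;                  /* end of file */
    if (n > 0) return 0;                   /* data pending, still open */
    if (errno == EAGAIN || errno == EWOULDBLOCK) return 0;
    return -1;
}

/* Duplicates one end of a socket pair, then closes the two copies one
 * at a time, recording what the other end observes after each close().
 * Returns 0 on success, -1 if the setup failed. */
int close_twice(int *after_first_close, int *after_last_close)
{
    int sv[2], copy;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;
    copy = dup(sv[0]);            /* two descriptors now refer to one socket */
    if (copy < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    close(sv[0]);                 /* reference count 2 -> 1 */
    *after_first_close = peer_has_closed(sv[1]);
    close(copy);                  /* reference count 1 -> 0 */
    *after_last_close = peer_has_closed(sv[1]);
    close(sv[1]);
    return 0;
}
```

After close_twice(&first, &last), first is 0 and last is 1: the peer sees nothing when the first copy is closed and sees end of file only when the last one goes away. The same mechanism is at work when a server forks a child per connection and both parent and child hold the connected descriptor.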

4. Detailed explanation of TCP three-way handshake connection establishment in socket
We know that establishing a TCP connection requires a "three-way handshake", that is, an exchange of three segments. The general process is as follows:

The client sends a SYN J to the server.
The server responds to the client with a SYN K and acknowledges SYN J with ACK J+1.
The client sends the server an acknowledgment, ACK K+1.
That completes the three-way handshake. But in which socket functions does it take place? See the figure below:

Figure 1. The socket calls during the TCP three-way handshake

As the figure shows, when the client calls connect, a connection request is triggered and a SYN J segment is sent to the server; connect then blocks. The server, which is listening, receives SYN J and replies with SYN K and ACK J+1, while accept stays blocked. When the client receives SYN K and ACK J+1 from the server, connect returns and the client acknowledges SYN K. When the server receives ACK K+1, accept returns. At this point the three-way handshake is complete and the connection is established.

Summary: the client's connect returns at the second step of the three-way handshake, while the server's accept returns at the third step.

5. Detailed explanation of TCP's four-way handshake connection release in socket
The section above described how the TCP three-way handshake establishes a connection and which socket functions are involved. Now for the four-way handshake that releases a connection; see the following figure:

Figure 2. The socket calls during the TCP four-way handshake

The process shown in the figure is as follows:

An application process first calls close to actively close the connection, and its TCP sends a FIN M.
When the other end receives FIN M, it performs a passive close and acknowledges the FIN. The FIN is also delivered to the application process as an end of file, because receiving a FIN means the application will receive no more data on that connection.
Some time later, the application process that received the end of file calls close to close its socket. This causes its TCP to send a FIN N as well.
The TCP of the original sender, on receiving this FIN, acknowledges it.
So there is a FIN and an ACK in each direction.
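That each direction is closed separately can be made visible with shutdown(), which the article does not cover: shutdown(fd, SHUT_WR) ends only the sending direction, whereas close() releases the descriptor entirely. The C sketch below is an illustration under two stated assumptions: a POSIX system, and a Unix domain socketpair() standing in for a TCP connection, so the end-of-file indications play the role of the two FINs. The name half_close is invented.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Side A finishes sending first; side B sees end of file, still sends
 * a last message, then closes. Stores what each read() returned.
 * Returns 0 if the sequence ran, -1 if a call failed. */
int half_close(ssize_t *b_read_at_eof, ssize_t *a_received,
               ssize_t *a_read_at_eof)
{
    int sv[2], a, b;
    char buf[16];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;
    a = sv[0];
    b = sv[1];

    /* A: "nothing more to send" (on TCP this sends the first FIN) */
    if (shutdown(a, SHUT_WR) < 0) goto fail;
    /* B: read() returns 0, the end-of-file indication */
    *b_read_at_eof = read(b, buf, sizeof(buf));

    /* The B -> A direction is still open, so B can send a last message */
    if (write(b, "bye", 3) != 3) goto fail;
    close(b);                     /* B closes (on TCP: the second FIN) */
    b = -1;

    /* A still receives B's data, and only then sees end of file too */
    *a_received = read(a, buf, sizeof(buf));
    *a_read_at_eof = read(a, buf, sizeof(buf));
    close(a);
    return 0;
fail:
    close(a);
    if (b >= 0) close(b);
    return -1;
}
```

After a successful call, b_read_at_eof is 0, a_received is 3 and a_read_at_eof is 0, which mirrors the one-FIN-per-direction picture above.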

6. An example
First, a screenshot of the running program:

[Figure: screenshot of the running server and client]

The server-side code is as follows:

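#include "InitSock.h"
#include <cstdio>
#include <cstring>
using namespace std;

CInitSock initSock; // initialize the Winsock library

int main()
{
    // Create the socket. The three arguments are:
    //   AF_INET     - the address format used by the socket (usually AF_INET)
    //   SOCK_STREAM - the socket type; SOCK_DGRAM would select UDP, which is unreliable
    //   IPPROTO_TCP - the protocol, used together with the type; it may be 0 once the
    //                 type is given, because the type's default (TCP or UDP) is chosen
    SOCKET sListen = ::socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sListen == INVALID_SOCKET)
    {
        printf("Failed socket() \n");
        return 0;
    }

    // Fill in the sockaddr_in structure
    sockaddr_in sin;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(4567); // 1024 ~ 49151: registered (user) port numbers
    sin.sin_addr.S_un.S_addr = htonl(INADDR_ANY);

    // Bind the socket to a local address
    if (::bind(sListen, (LPSOCKADDR)&sin, sizeof(sin)) == SOCKET_ERROR)
    {
        printf("Failed bind() \n");
        return 0;
    }

    // Enter listening mode.
    // 2 is the maximum number of pending connections kept in the listen queue.
    if (::listen(sListen, 2) == SOCKET_ERROR)
    {
        printf("Failed listen() \n");
        return 0;
    }

    // Wait for a client connection, retrying until accept() succeeds
    sockaddr_in remoteAddr;
    int nAddrLen = sizeof(remoteAddr);
    SOCKET sClient = INVALID_SOCKET;
    char szText[256];
    while (sClient == INVALID_SOCKET)
    {
        // (SOCKADDR*)&remoteAddr points to a sockaddr_in that receives the peer's address
        nAddrLen = sizeof(remoteAddr);
        sClient = ::accept(sListen, (SOCKADDR*)&remoteAddr, &nAddrLen);
        if (sClient == INVALID_SOCKET)
        {
            printf("Failed accept()\n");
            continue;
        }
        printf("Accepted a connection: %s \r\n", inet_ntoa(remoteAddr.sin_addr));
    }

    while (TRUE)
    {
        // Send a line typed on the console to the client
        if (fgets(szText, (int)sizeof(szText), stdin) == NULL)
            break;
        szText[strcspn(szText, "\r\n")] = '\0'; // drop the trailing newline, as gets() did
        if (::send(sClient, szText, (int)strlen(szText), 0) == SOCKET_ERROR)
            break;

        // Receive data from the client, leaving room for the terminating '\0'
        char buff[256];
        int nRecv = ::recv(sClient, buff, (int)sizeof(buff) - 1, 0);
        if (nRecv <= 0)
            break; // 0: the client closed the connection; < 0: error
        buff[nRecv] = '\0';
        printf(" Received data: %s\n", buff);
    }

    // Close the connection to the client
    ::closesocket(sClient);
    // Close the listening socket
    ::closesocket(sListen);
    return 0;
}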

Client code:

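#include "InitSock.h"
#include <cstdio>
#include <cstring>
using namespace std;

CInitSock initSock; // initialize the Winsock library

int main()
{
    // Create the socket
    SOCKET s = ::socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (s == INVALID_SOCKET)
    {
        printf(" Failed socket() \n");
        return 0;
    }
    // bind() could be called here to bind a local address;
    // otherwise the system assigns one automatically.

    // Fill in the remote address
    sockaddr_in servAddr;
    memset(&servAddr, 0, sizeof(servAddr));
    servAddr.sin_family = AF_INET;
    servAddr.sin_port = htons(4567);
    // Note: use the IP address of the machine running the server program (TCPServer).
    // If your computer is not on a network, 127.0.0.1 works.
    servAddr.sin_addr.S_un.S_addr = inet_addr("127.0.0.1");
    if (::connect(s, (sockaddr*)&servAddr, sizeof(servAddr)) == SOCKET_ERROR)
    {
        printf(" Failed connect() \n");
        ::closesocket(s);
        return 0;
    }

    char buff[256];
    char szText[256];
    while (TRUE)
    {
        // Receive data from the server, leaving room for the terminating '\0'
        int nRecv = ::recv(s, buff, (int)sizeof(buff) - 1, 0);
        if (nRecv <= 0)
            break; // 0: the server closed the connection; < 0: error
        buff[nRecv] = '\0';
        printf("Received data: %s\n", buff);

        // Send a line typed on the console to the server
        if (fgets(szText, (int)sizeof(szText), stdin) == NULL)
            break;
        szText[strcspn(szText, "\r\n")] = '\0'; // drop the trailing newline, as gets() did
        if (::send(s, szText, (int)strlen(szText), 0) == SOCKET_ERROR)
            break;
    }
    // Close the socket
    ::closesocket(s);
    return 0;
}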

The InitSock.h wrapper:
#include <winsock2.h>
#include <stdlib.h>
#pragma comment(lib, "WS2_32") // link against WS2_32.lib

class CInitSock
{
public:
    CInitSock(BYTE minorVer = 2, BYTE majorVer = 2)
    {
        // Initialize WS2_32.dll.
        // MAKEWORD(low, high): WSAStartup expects the major version in the low byte
        // and the minor version in the high byte.
        WSADATA wsaData;
        WORD sockVersion = MAKEWORD(majorVer, minorVer);
        if (::WSAStartup(sockVersion, &wsaData) != 0)
        {
            exit(0);
        }
    }
    ~CInitSock()
    {
        ::WSACleanup();
    }
};

Origin blog.csdn.net/sinat_16643223/article/details/108677406