[Linux] Socket programming (introduction to sockets, byte order, socket addresses, IP address conversion functions, socket functions, TCP communication implementation)

orange color

1. Introduction to sockets

A socket is an abstraction of an endpoint for two-way communication between application processes on different hosts in a network.

A socket is one end of a conversation between processes on the network, and it gives application-layer processes a mechanism for exchanging data over network protocols. In terms of position, a socket connects upward to the application process and downward to the network protocol stack: it is the interface through which an application communicates using network protocols.

It is an API for communication in a network environment. Each socket in use has a process attached to it. During communication, one application writes the information to be sent into a socket on its own host. That socket passes the information through the network interface card (NIC) and the transmission medium to a socket on the other host, where the other party can receive it. A socket is the combination of an IP address and a port, and it provides the mechanism by which application-layer processes transmit data packets.

The word socket originally means an electrical socket. In Linux it is a special file type used for network communication between processes. In essence it is a pseudo file that the kernel builds on top of buffers. It is presented as a file so that we can operate on it through a file descriptor. As with pipes, Linux wraps sockets as files to unify the interface, so that reading and writing a socket works the same way as reading and writing a file. The difference is that pipes are used for inter-process communication on one host, while sockets are mostly used to transfer data between processes across a network.

A socket is full-duplex: data can be read and written at the same time.

IP address (logical address): uniquely identifies a host in the network.
Port number: uniquely identifies a process in a host.
IP+port number: uniquely identifies a process in the network environment.

- Server: passively accepts connections and generally does not initiate them.

- Client: actively initiates a connection to the server.

2. Endianness

Introduction

A modern CPU's accumulator loads at least 4 bytes at a time (on a 32-bit machine), that is, one integer. The order in which those 4 bytes are arranged in memory therefore affects the integer value that gets loaded. This is the byte order problem. Different computer architectures store bytes, words, and so on in different ways, which raises an important question in computer communication: in what order should the communicating parties transmit the units of information they exchange? If they do not agree on a rule, they cannot encode and decode correctly, and communication fails.

Byte order, as the name implies, is the order in which the bytes of a value larger than one byte are stored in memory (for a one-byte value there is no order to speak of).

Byte order is either big-endian or little-endian. In big-endian byte order, the high-order byte of an integer is stored at the low memory address and the low-order byte at the high memory address. In little-endian byte order, the high-order byte is stored at the high memory address and the low-order byte at the low memory address.


For a written number, the digits further to the left are the high-order digits and the digits further to the right are the low-order digits.

The following program detects the byte order of the current host.
If you are not familiar with unions, see the article "C language | Detailed explanation of unions".

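/*
    Byte order: the order in which the bytes of a value are stored in memory.
    Little-endian: the high-order byte is stored at the high address, the low-order byte at the low address.
    Big-endian: the low-order byte is stored at the high address, the high-order byte at the low address.
*/

// Detect the byte order of the current host
#include <stdio.h>

int main() {

    union {
        short value;                // 2 bytes
        char bytes[sizeof(short)];  // char[2]
    } test;

    test.value = 0x0102;
    if((test.bytes[0] == 1) && (test.bytes[1] == 2)) {
        printf("big-endian\n");
    } else if((test.bytes[0] == 2) && (test.bytes[1] == 1)) {
        printf("little-endian\n");
    } else {
        printf("unknown\n");
    }

    return 0;
}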


Byte order conversion function

When formatted data is passed directly between two hosts that use different byte orders, the receiver will interpret it incorrectly. The solution is for the sender to always convert data to big-endian byte order before sending it. The receiver then knows that incoming data is always big-endian and can decide, based on its own byte order, whether to convert it (a little-endian machine converts, a big-endian machine does not).

Network byte order is the data representation specified by TCP/IP. It is independent of the CPU type, the operating system, and so on, which ensures that data is interpreted correctly when it travels between different hosts. Network byte order is big-endian.

BSD sockets provide ready-made conversion functions for the programmer's convenience: htons and htonl convert from host byte order to network byte order, and ntohs and ntohl convert from network byte order to host byte order.

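/*
h - host      host byte order

to            convert to

n - network   network byte order

s - short     unsigned short, used for ports

l - long      unsigned int, used for IP addresses

    For network communication, host byte order must be converted to network byte order (big-endian).
    After receiving the data, the other end converts it from network byte order back to host byte order as needed.

    // Convert a port
    uint16_t htons(uint16_t hostshort);		// host byte order -> network byte order
    uint16_t ntohs(uint16_t netshort);		// network byte order -> host byte order

    // Convert an IP address
    uint32_t htonl(uint32_t hostlong);		// host byte order -> network byte order
    uint32_t ntohl(uint32_t netlong);		// network byte order -> host byte order
*/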

#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>

int main() {

    // htons: convert a port
    unsigned short a = 0x0102;
    printf("a : %x\n", a);
    unsigned short b = htons(a);
    printf("b : %x\n", b);

    printf("=======================\n");

    // htonl: convert an IP address
    // unsigned char is required here: 192 and 168 do not fit in a signed char
    unsigned char buf[4] = {192, 168, 1, 100};
    unsigned int num;
    memcpy(&num, buf, sizeof(num));   // load the 4 bytes as one integer
    printf("num : %u\n", num);

    unsigned int sum = htonl(num);
    unsigned char *p = (unsigned char *)&sum;

    printf("%d %d %d %d\n", *p, *(p+1), *(p+2), *(p+3));

    printf("=======================\n");

    // ntohl
    unsigned char buf1[4] = {1, 1, 168, 192};
    unsigned int num1;
    memcpy(&num1, buf1, sizeof(num1));
    unsigned int sum1 = ntohl(num1);
    unsigned char *p1 = (unsigned char *)&sum1;
    printf("%d %d %d %d\n", *p1, *(p1+1), *(p1+2), *(p1+3));

    // ntohs is used the same way, on 16-bit values

    return 0;
}


Question: Why is the printed num 1677830336?
Answer: // 192 168 1 100
// 11000000 10101000 00000001 01100100
// This machine is little-endian, so 192 is the low-order byte and 100 is the high-order byte, which makes num
// 01100100 00000001 10101000 11000000 = 1677830336

3. Socket address

In the socket programming interface, a socket address is represented by the structure sockaddr, which is defined as follows:

#include <bits/socket.h>

struct sockaddr {                   // deprecated as a place to store addresses
    sa_family_t sa_family;
    char sa_data[14];
};

typedef unsigned short int sa_family_t;

Members:
    The sa_family member is a variable of the address family type (sa_family_t). Address family types usually correspond to protocol family types. Common protocol families and their address families are:

Protocol family    Address family    Description
PF_UNIX            AF_UNIX           UNIX local domain protocol family
PF_INET            AF_INET           TCP/IPv4 protocol family
PF_INET6           AF_INET6          TCP/IPv6 protocol family

The protocol family macros PF_* and the address family macros AF_* are both defined in the header file bits/socket.h. They have the same values and can be used interchangeably. Both are macro definitions that are replaced in the preprocessing stage, so mixing them has no effect on compiling or running.

The sa_data member stores the socket address value. However, the address values of different protocol families have different meanings and lengths. Fourteen bytes are enough for an IPv4 address but not for an IPv6 address, so this structure is no longer used to store addresses. Linux defines the following newer generic socket address structure. It provides enough space for any address value and is also memory aligned (memory alignment speeds up CPU access).

With glibc, this structure is declared in bits/socket.h, which is pulled in by <sys/socket.h>:

#include <bits/socket.h>
struct sockaddr_storage
{
    sa_family_t sa_family;
    unsigned long int __ss_align;   // ignore this; it exists only for memory alignment
    char __ss_padding[ 128 - sizeof(__ss_align) ];
};
typedef unsigned short int sa_family_t;

Dedicated socket addresses

Many network programming functions predate the IPv4 protocol (at the time, the two sides used a custom protocol and simply agreed on its rules), and they all took a struct sockaddr. To stay compatible with that code, struct sockaddr has effectively degenerated into a (void *): its only job is to pass an address into a function. Whether that address is really a sockaddr_in or a sockaddr_in6 is determined by the address family, and the function casts the pointer internally to the address type it needs.

The main structure to remember is struct sockaddr_in.

The UNIX local domain protocol family uses the following dedicated socket address structure:

#include <sys/un.h>
struct sockaddr_un
{
    sa_family_t sun_family;
    char sun_path[108];
};

The TCP/IP protocol suite has two dedicated socket address structures, sockaddr_in and sockaddr_in6, which are used for IPv4 and IPv6 respectively:

#include <netinet/in.h>
struct sockaddr_in
{
    sa_family_t sin_family;         /* __SOCKADDR_COMMON(sin_) */
    in_port_t sin_port;             /* Port number (2 bytes). */
    struct in_addr sin_addr;        /* Internet address (4-byte IP address). */

    /* Pad to size of `struct sockaddr' (the remaining padding). */
    unsigned char sin_zero[sizeof (struct sockaddr) - __SOCKADDR_COMMON_SIZE -
                           sizeof (in_port_t) - sizeof (struct in_addr)];
};


struct in_addr
{
    in_addr_t s_addr;
};


struct sockaddr_in6
{
    sa_family_t sin6_family;
    in_port_t sin6_port;            /* Transport layer port # */
    uint32_t sin6_flowinfo;         /* IPv6 flow information */
    struct in6_addr sin6_addr;      /* IPv6 address */
    uint32_t sin6_scope_id;         /* IPv6 scope-id */
};


typedef unsigned short uint16_t;
typedef unsigned int uint32_t;
typedef uint16_t in_port_t;
typedef uint32_t in_addr_t;
#define __SOCKADDR_COMMON_SIZE (sizeof (unsigned short int))

Variables of every dedicated socket address type (and of sockaddr_storage) must be converted to the generic socket address type sockaddr when they are actually used. A plain cast is enough. This is necessary because the address parameter of every socket programming interface has type sockaddr.

4. IP address conversion function

People are used to writing IP addresses as readable strings, such as dotted-decimal strings for IPv4 addresses and hexadecimal strings for IPv6 addresses. In a program, however, they must first be converted to (binary) integers before they can be used. In the other direction, for logging, an IP address stored as an integer has to be converted back into a readable string.

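p: the IP address as a dotted-decimal string

n: network, an integer in network byte order

#include  <arpa/inet.h>

Convert an IP address from string form to binary integer form
int inet_pton(int af,const char *src,void *dst);

af: address family, AF_INET or AF_INET6

src: the dotted-decimal IP string to convert

dst: where the converted result is stored

Convert an integer in network byte order to a dotted-decimal IP address string
const char *inet_ntop(int af,const void *src,char *dst,socklen_t size);

af: AF_INET or AF_INET6

src: the address of the integer IP to convert

dst: where the resulting IP address string is stored

size: the size of the third parameter (the size of the array)

Return value: the address of the converted string, which is the same as dst (NULL on failure)

dotted decimal --->  network byte order   inet_pton

network byte order --->  dotted decimal   inet_ntop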

Code example:

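/*
    #include <arpa/inet.h>
    // p: the IP address as a dotted-decimal string; n: network, an integer in network byte order
    int inet_pton(int af, const char *src, void *dst);
        af: address family, AF_INET or AF_INET6
        src: the dotted-decimal IP string to convert
        dst: where the converted result is stored

    // Convert an integer in network byte order to a dotted-decimal IP address string
    const char *inet_ntop(int af, const void *src, char *dst, socklen_t size);
        af: address family, AF_INET or AF_INET6
        src: the address of the integer IP to convert
        dst: where the resulting IP address string is stored
        size: the size of the third parameter (the size of the array)
        Return value: the address of the converted string, which is the same as dst

*/

#include <stdio.h>
#include <arpa/inet.h>


int main() {

    // An IP address as a dotted-decimal string
    char buf[] = "192.168.1.4";
    unsigned int num = 0;

    // Convert the dotted-decimal string to an integer in network byte order
    inet_pton(AF_INET, buf, &num);
    unsigned char * p = (unsigned char *)&num;
    printf("%d %d %d %d\n", *p, *(p+1), *(p+2), *(p+3));


    // Convert the integer in network byte order back to a dotted-decimal string
    char ip[16] = ""; // four groups of at most three characters, three dots, and the terminating '\0'
    const char * str =  inet_ntop(AF_INET, &num, ip, sizeof(ip));
    printf("str : %s\n", str);
    printf("ip : %s\n", ip);
    printf("%d\n", ip == str);

    return 0;
}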


5. Socket function

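#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>  // including this header makes the two above optional

int socket(int domain, int type, int protocol);
	- Purpose: create a socket
	- Parameters:
		- domain: protocol family
			AF_INET: IPv4
			AF_INET6: IPv6
			AF_UNIX, AF_LOCAL: local socket communication (inter-process communication)
		- type: the kind of protocol used for communication
			SOCK_STREAM: stream protocol (TCP, etc.)
			SOCK_DGRAM: datagram protocol (UDP, etc.)
		- protocol: the specific protocol, usually 0
			- SOCK_STREAM: stream sockets use TCP by default
			- SOCK_DGRAM: datagram sockets use UDP by default
	- Return value:
		- success: a file descriptor, which operates on the kernel buffers
		- failure: -1

int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
	- Purpose: bind the fd to a local IP + port
	- Parameters:
		- sockfd: the file descriptor returned by socket()
		- addr: the socket address to bind, which holds the IP and port number
		- addrlen: the size in memory of the structure passed as the second parameter
	- Return value: 0 on success, -1 on failure

int listen(int sockfd, int backlog);  // /proc/sys/net/core/somaxconn
	- Purpose: listen for connections on this socket
	- Parameters:
		- sockfd: the file descriptor returned by socket()
		- backlog: the maximum total of not-yet-connected and connected requests waiting in the queue; connections beyond this limit are dropped (since Linux 2.2 the limit applies only to fully established connections waiting to be accepted). The value cannot exceed the number in /proc/sys/net/core/somaxconn

int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
	- Purpose: accept a client connection; blocks by default until a client connects
	- Parameters:
		- sockfd: the listening file descriptor
		- addr: output parameter that records the client's address (IP and port number) after the connection succeeds
		- addrlen: pointer to the size of the memory behind the second parameter; on return it holds the actual size of the address
	- Return value:
		- success: a file descriptor used for communication
		- failure: -1

int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
	- Purpose: connect the client to a server
	- Parameters:
		- sockfd: the file descriptor used for communication
		- addr: the address of the server the client wants to connect to
		- addrlen: the size in memory of the second parameter
	- Return value: 0 on success, -1 on failure

ssize_t write(int fd, const void *buf, size_t count);
ssize_t read(int fd, void *buf, size_t count);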

6. TCP communication implementation (server and client)

Note that this program follows the steps a server goes through to receive information in TCP communication: create the socket, bind, listen, accept, then communicate.

Server

// TCP server

#include <stdio.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

int main() {

    // 1. Create the socket (the listening socket)
    int lfd = socket(AF_INET, SOCK_STREAM, 0);

    if(lfd == -1) {
        perror("socket");
        exit(-1);
    }

    // 2. Bind
    struct sockaddr_in saddr;        // this structure is described in detail in section 3 above
    memset(&saddr, 0, sizeof(saddr));
    saddr.sin_family = AF_INET;
    // inet_pton(AF_INET, "192.168.193.128", &saddr.sin_addr.s_addr);
    saddr.sin_addr.s_addr = htonl(INADDR_ANY);  // 0.0.0.0
    saddr.sin_port = htons(9999);
    int ret = bind(lfd, (struct sockaddr *)&saddr, sizeof(saddr));

    if(ret == -1) {
        perror("bind");
        exit(-1);
    }

    // 3. Listen
    ret = listen(lfd, 8);
    if(ret == -1) {
        perror("listen");
        exit(-1);
    }

    // 4. Accept a client connection
    struct sockaddr_in clientaddr;
    socklen_t len = sizeof(clientaddr);   // accept expects a socklen_t *, not an int *
    int cfd = accept(lfd, (struct sockaddr *)&clientaddr, &len);

    if(cfd == -1) {
        perror("accept");
        exit(-1);
    }

    // Print the client's information
    char clientIP[16];
    inet_ntop(AF_INET, &clientaddr.sin_addr.s_addr, clientIP, sizeof(clientIP));
    unsigned short clientPort = ntohs(clientaddr.sin_port);
    printf("client ip is %s, port is %d\n", clientIP, clientPort);

    // 5. Communicate
    char recvBuf[1024] = {0};
    while(1) {

        // Read the client's data, leaving one byte for the terminating '\0'
        int num = read(cfd, recvBuf, sizeof(recvBuf) - 1);
        if(num == -1) {
            perror("read");
            exit(-1);
        } else if(num > 0) {
            recvBuf[num] = '\0';   // read does not terminate the string
            printf("recv client data : %s\n", recvBuf);
        } else if(num == 0) {
            // The client closed the connection
            printf("client closed...\n");
            break;
        }

        const char * data = "hello,i am server";
        // Send data to the client
        write(cfd, data, strlen(data));
    }

    // Close the file descriptors
    close(cfd);
    close(lfd);

    return 0;
}

Question: Suppose the server's first read consumes everything the client has sent. The client then neither writes more data nor disconnects. What does the server's second call to read return?
Answer: A socket behaves like the read end of a pipe here. When there is no data to read: 1. If the writing end is completely closed, read returns 0 (equivalent to reaching the end of a file). 2. If the writing end is not completely closed, read blocks and waits. So in this case the second read blocks until the client sends data or closes the connection. See my article "[Linux] The read and write characteristics of pipes and the setting of pipes as non-blocking".

Client

// TCP client

#include <stdio.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

int main() {

    // 1. Create the socket
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if(fd == -1) {
        perror("socket");
        exit(-1);
    }

    // 2. Connect to the server; note that this needs the server's IP address and port
    struct sockaddr_in serveraddr;
    memset(&serveraddr, 0, sizeof(serveraddr));
    serveraddr.sin_family = AF_INET;
    inet_pton(AF_INET, "192.168.177.146", &serveraddr.sin_addr.s_addr);
    serveraddr.sin_port = htons(9999);
    int ret = connect(fd, (struct sockaddr *)&serveraddr, sizeof(serveraddr));

    if(ret == -1) {
        perror("connect");
        exit(-1);
    }


    // 3. Communicate
    char recvBuf[1024] = {0};
    while(1) {

        const char * data = "hello,i am client";
        // Send data to the server
        write(fd, data , strlen(data));

        sleep(1);

        // Leave one byte for the terminating '\0'
        int len = read(fd, recvBuf, sizeof(recvBuf) - 1);
        if(len == -1) {
            perror("read");
            exit(-1);
        } else if(len > 0) {
            recvBuf[len] = '\0';   // read does not terminate the string
            printf("recv server data : %s\n", recvBuf);
        } else if(len == 0) {
            // The server closed the connection
            printf("server closed...\n");
            break;
        }

    }

    // Close the connection
    close(fd);

    return 0;
}

The IP address passed to inet_pton in the client must be the IP address of the host running the server. Here the server and the client both run on my own host, whose IP address is 192.168.177.146.

Compile and run the two files separately. In the server's output, the client's IP is printed as 192.168.177.146, and its port, which the system assigned automatically, is 35302.

Note: Start the server first and then the client.

Homework: Change the server into an echo server, that is, one that sends back whatever data the client sends. Change the client to read its data from the keyboard and send it to the server. The final effect is that I type something on the keyboard, the client sends it to the server, and the server sends the same content back.

Server:

// TCP echo server

#include <stdio.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

int main() {

    // 1. Create the socket (the listening socket)
    int lfd = socket(AF_INET, SOCK_STREAM, 0);

    if(lfd == -1) {
        perror("socket");
        exit(-1);
    }

    // 2. Bind
    struct sockaddr_in saddr;
    memset(&saddr, 0, sizeof(saddr));
    saddr.sin_family = AF_INET;
    // inet_pton(AF_INET, "192.168.193.128", &saddr.sin_addr.s_addr);
    saddr.sin_addr.s_addr = htonl(INADDR_ANY);  // 0.0.0.0
    saddr.sin_port = htons(9999);
    int ret = bind(lfd, (struct sockaddr *)&saddr, sizeof(saddr));

    if(ret == -1) {
        perror("bind");
        exit(-1);
    }

    // 3. Listen
    ret = listen(lfd, 8);
    if(ret == -1) {
        perror("listen");
        exit(-1);
    }

    // 4. Accept a client connection
    struct sockaddr_in clientaddr;
    socklen_t len = sizeof(clientaddr);   // accept expects a socklen_t *, not an int *
    int cfd = accept(lfd, (struct sockaddr *)&clientaddr, &len);

    if(cfd == -1) {
        perror("accept");
        exit(-1);
    }

    // Print the client's information
    char clientIP[16];
    inet_ntop(AF_INET, &clientaddr.sin_addr.s_addr, clientIP, sizeof(clientIP));
    unsigned short clientPort = ntohs(clientaddr.sin_port);
    printf("client ip is %s, port is %d\n", clientIP, clientPort);

    // 5. Communicate
    char recvBuf[1024] = {0};
    while(1) {

        memset(recvBuf, 0, sizeof(recvBuf));
        // Read the client's data, leaving one byte for the terminating '\0'
        int num = read(cfd, recvBuf, sizeof(recvBuf) - 1);
        if(num == -1) {
            perror("read");
            exit(-1);
        } else if(num > 0) {
            printf("recv client data : %s\n", recvBuf);
        } else if(num == 0) {
            // The client closed the connection
            printf("client closed...\n");
            break;
        }

        // Echo exactly the bytes that were received back to the client
        write(cfd, recvBuf, num);
    }

    // Close the file descriptors
    close(cfd);
    close(lfd);

    return 0;
}

Client:

// TCP echo client

#include <stdio.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

int main() {

    // 1. Create the socket
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if(fd == -1) {
        perror("socket");
        exit(-1);
    }

    // 2. Connect to the server; note that this needs the server's IP address and port
    struct sockaddr_in serveraddr;
    memset(&serveraddr, 0, sizeof(serveraddr));
    serveraddr.sin_family = AF_INET;
    inet_pton(AF_INET, "192.168.177.146", &serveraddr.sin_addr.s_addr);
    serveraddr.sin_port = htons(9999);
    int ret = connect(fd, (struct sockaddr *)&serveraddr, sizeof(serveraddr));

    if(ret == -1) {
        perror("connect");
        exit(-1);
    }


    // 3. Communicate
    char recvBuf[1024] = {0};
    while(1) {

        char data[1024];
        memset(data, 0, sizeof(data));
        printf("Enter the data to send:\n");
        // Read one word; the width limit keeps scanf from overflowing data
        if(scanf("%1023s", data) != 1) {
            break;   // end of input
        }
        // Send the data to the server
        write(fd, data , strlen(data));

        sleep(1);

        memset(recvBuf, 0, sizeof(recvBuf));
        // Leave one byte for the terminating '\0'
        int len = read(fd, recvBuf, sizeof(recvBuf) - 1);
        if(len == -1) {
            perror("read");
            exit(-1);
        } else if(len > 0) {
            printf("recv server data : %s\n", recvBuf);
        } else if(len == 0) {
            // The server closed the connection
            printf("server closed...\n");
            break;
        }

    }

    // Close the connection
    close(fd);

    return 0;
}


Origin blog.csdn.net/mhyasadj/article/details/131181974