Table of contents
IP address conversion functions
Socket model creation flow chart (TCP communication process/CS model flow chart)
Error-handling wrapper functions
Socket concept
A socket is an abstraction of an endpoint for two-way communication between application processes running on different hosts in a network.
A socket is one end of a conversation between processes on the network, and gives application-layer processes a mechanism for exchanging data over network protocols. In terms of position, a socket connects upward to the application process and downward to the network protocol stack: it is the interface through which an application communicates using, and interacts with, the network protocols.
It is an API for communicating in a network environment, and every socket in use has a process attached to it. During communication, one application writes the information to be sent into a socket on its own host; that socket passes the information through the network interface card (NIC) and the transmission medium to a socket on another host, where the other party can receive it. A socket is identified by the combination of an IP address and a port, and provides the mechanism by which application-layer processes transmit data packets.
The word socket originally means an electrical socket (outlet). In Linux, a socket is a special file type used for network communication between processes. In essence it is a pseudo-file that the kernel builds on top of buffers. Treating it as a file makes it convenient to use, because we can operate on it through a file descriptor. As with pipes, Linux wraps sockets as files in order to unify the interface, so that reading and writing a socket works the same way as reading and writing a file. The difference is that pipes are used for local inter-process communication, while sockets are mostly used to transfer data between processes across a network.
A socket is full-duplex: data can be read and written at the same time.
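The full-duplex property can be seen in a small sketch. To keep it self-contained it uses socketpair(), which returns two already-connected local (AF_UNIX) sockets, rather than a network connection; the function name full_duplex_demo is hypothetical and only serves this illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sends one short message in each direction over a connected socket pair
 * before either end reads anything, then reads both messages back.
 * got_a and got_b must each have room for 5 bytes.
 * Returns 0 on success, -1 on failure. */
int full_duplex_demo(char *got_a, char *got_b)
{
    const char *ping = "ping";
    const char *pong = "pong";
    int sv[2];

    assert(got_a != NULL && got_b != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* Both directions carry data at the same time. */
    ssize_t wa = write(sv[0], ping, strlen(ping) + 1);
    ssize_t wb = write(sv[1], pong, strlen(pong) + 1);

    ssize_t ra = -1, rb = -1;
    if (wa == 5 && wb == 5) {
        ra = read(sv[0], got_a, 5); /* end 0 receives what end 1 sent */
        rb = read(sv[1], got_b, 5); /* end 1 receives what end 0 sent */
    }
    close(sv[0]);
    close(sv[1]);
    return (ra == 5 && rb == 5) ? 0 : -1;
}
```

Neither write has to wait for the other side to read first, because each direction has its own kernel buffer.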
MAC address (physical address)
IP address (logical address): uniquely identifies a host in the network
Port number: uniquely identifies a process in a host
IP + port number: uniquely identifies a process in the network environment
How a socket works: it binds an IP address and a port number
Sockets must appear in pairs in the network.
The TCP/IP protocol was first implemented on BSD UNIX, and the application layer programming interface designed for the TCP/IP protocol is called socket API.
-Server: passively accepts connections and generally does not initiate them
-Client: actively initiates a connection to the server
Byte order
Today a CPU's accumulator can load (at least) 4 bytes at a time on a 32-bit machine, i.e. one integer. (Haha, it just occurs to me that a 32-bit pointer is 4 bytes too: the size of a pointer is tied to the addressable range, and what the accumulator can load limits both.) The order in which those 4 bytes are laid out in memory therefore affects the integer value the accumulator loads; this is the byte order problem. Different computer architectures store the bytes of a word in different ways, which raises an important question in computer communication: in what order should two communicating parties transmit the units of information they exchange? Without an agreed rule, the two parties cannot encode and decode correctly, and communication fails.
(In short: byte order is the order in which the bytes of a value are stored; when everyone follows the same rule, data is transmitted correctly.)
Byte order is divided into big-endian and little-endian. Big-endian means that the high-order byte of an integer is stored at the low address in memory and the low-order byte at the high address. Little-endian means that the high-order byte of an integer is stored at the high address in memory and the low-order byte at the low address.
Big-endian: low address --- high-order byte
Little-endian: low address --- low-order byte
Mnemonic: big-endian puts the big (most significant) end first, at the lowest address; little-endian puts the little (least significant) end first.
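To make the layout concrete, here is a minimal sketch that copies the bytes of an integer out in increasing address order. The helper names bytes_in_memory and host_is_big_endian are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copies the 4 bytes of value into out[], lowest memory address first. */
void bytes_in_memory(uint32_t value, unsigned char out[4])
{
    assert(out != NULL);
    memcpy(out, &value, sizeof value);
}

/* Returns 1 if the host stores the high-order byte at the lowest address
 * (big-endian), 0 otherwise. */
int host_is_big_endian(void)
{
    unsigned char b[4];
    bytes_in_memory(0x12345678u, b);
    return b[0] == 0x12;
}
```

For the value 0x12345678, a little-endian host fills the array with 78 56 34 12, and a big-endian host with 12 34 56 78.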
Network byte order
When formatted data is passed directly between two hosts that use different byte orders, the receiver will inevitably misinterpret it. How do we solve this? Suppose the sender always sends in big-endian byte order (that is, we make one unified rule). The receiver then knows that the data it receives is always big-endian, so if the receiver itself is little-endian, it only has to convert the big-endian data to little-endian.
The TCP/IP protocol stipulates that network data streams use big-endian byte order.
For convenience and portability, we can call the corresponding library functions to do the conversion. Their names are built from these parts:
h - host: host byte order
to - convert to
n - network: network byte order
s - short: unsigned short (16 bits), used for ports
l - long: unsigned int (32 bits), used for IPv4 addresses
#include <arpa/inet.h>
// convert ports (16-bit)
uint16_t htons(uint16_t hostshort); // host byte order -> network byte order
uint16_t ntohs(uint16_t netshort); // network byte order -> host byte order
// convert IPv4 addresses (32-bit)
uint32_t htonl(uint32_t hostlong); // host byte order -> network byte order
uint32_t ntohl(uint32_t netlong); // network byte order -> host byte order
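As a rough illustration of these four functions, the sketch below converts a value to network order and looks at its bytes in memory. The helper names network_bytes and port_round_trip are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Converts a host-order 32-bit value to network byte order and copies the
 * resulting bytes into out[], lowest memory address first. */
void network_bytes(uint32_t host_value, unsigned char out[4])
{
    assert(out != NULL);
    uint32_t net = htonl(host_value);
    memcpy(out, &net, sizeof net);
}

/* Converts a port to network order and back again. */
uint16_t port_round_trip(uint16_t port)
{
    return ntohs(htons(port));
}
```

Whatever the host byte order is, network_bytes(0x12345678, out) leaves 12 34 56 78 in out, because network byte order is big-endian. On a big-endian host the conversion functions simply return their argument unchanged.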
Let’s test the endianness of my machine, hee hee
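#include <stdio.h>
#include <arpa/inet.h>

// First define a union
union {
    int number;
    char c;
} test;

// Why can this union be used for the test?
/*
   Union: all members share one block of memory,
   which is sized by the largest member;
   only one member holds a value at any given time.
   For test: sizeof(test) = 4.
   The char member c always reads the byte at the lowest address.
*/
int main(void)
{
    test.number = 0x12345678;
    if (test.c == 0x12) // the high-order byte is stored at the low address
    {
        printf("This machine is big-endian\n");
    }
    else
    {
        printf("This machine is little-endian\n");
    }
    return 0;
}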
Socket address
A socket address is a structure that encapsulates information such as the port and IP. It is needed by the socket-related APIs that follow.
As mentioned before, a socket connects upward to the application program and downward to the protocol stack.
For a data packet to travel between processes on two different hosts in a network (local sockets aside), it is enough to know the other party's IP (logical address) and port; the data can then be delivered to the other party. [The MAC address can be obtained through the ARP protocol.]
General socket address
The socket network programming interface represents a socket address with the structure sockaddr, which is defined as follows:
#include <bits/socket.h>
struct sockaddr { // no longer filled in directly; now used mainly as a generic pointer type
    sa_family_t sa_family;
    char sa_data[14];
};
typedef unsigned short int sa_family_t;
Members:
The sa_family member is a variable of the address family type (sa_family_t). The address family usually corresponds to the protocol family. Common protocol families and their address families are as follows:
protocol family    address family    description
PF_UNIX            AF_UNIX           UNIX local domain protocol family
PF_INET            AF_INET           TCP/IPv4 protocol family
PF_INET6           AF_INET6          TCP/IPv6 protocol family
The protocol family macros PF_* and the address family macros AF_* are defined in the header file bits/socket.h with identical values, so they can be used interchangeably (both are macros, replaced during preprocessing, so mixing them has no effect on compilation or execution).
In fact, one problem is easy to spot: sa_data has a fixed size of 14 bytes.
The sa_data member stores the socket address value, but address values of different protocol families have different meanings and lengths.
14 bytes is enough for little more than an IPv4 address and port; a 16-byte IPv6 address, for example, does not fit. Therefore Linux defines the following newer general-purpose socket address structure, which is large enough to store any address value and is also memory-aligned [memory alignment speeds up CPU access; my C language column covers alignment in detail].
This structure is defined in: /usr/include/linux/in.h
For ease of understanding, we remove some messy information:
#include <bits/socket.h>
struct sockaddr_storage
{
    sa_family_t ss_family; // POSIX names this member ss_family
    unsigned long int __ss_align;
    char __ss_padding[128 - sizeof(__ss_align)];
};
typedef unsigned short int sa_family_t;
Dedicated socket addresses
Many network programming functions were born before the IPv4 protocol (back then, custom protocols were used, with both parties agreeing on the rules), and they used the struct sockaddr structure. For backward compatibility, sockaddr has now degenerated into a role similar to (void *): it just passes an address to a function. Whether that address is really a sockaddr_in or a sockaddr_in6 is determined by the address family, and the function then casts the pointer to the address type it needs.
The UNIX local domain protocol family uses the following special socket address structure:
#include <sys/un.h>
struct sockaddr_un
{
    sa_family_t sun_family;
    char sun_path[108];
};
The TCP/IP protocol family has two dedicated socket address structures, sockaddr_in and sockaddr_in6, which are used for IPv4 and IPv6 respectively:
#include <netinet/in.h>
struct sockaddr_in
{
    sa_family_t sin_family;  /* __SOCKADDR_COMMON(sin_) */
    in_port_t sin_port;      /* Port number. */
    struct in_addr sin_addr; /* Internet address. */

    /* Pad to size of `struct sockaddr'. */
    unsigned char sin_zero[sizeof (struct sockaddr) - __SOCKADDR_COMMON_SIZE -
                           sizeof (in_port_t) - sizeof (struct in_addr)];
};

struct in_addr
{
    in_addr_t s_addr;
};

struct sockaddr_in6
{
    sa_family_t sin6_family;
    in_port_t sin6_port;       /* Transport layer port # */
    uint32_t sin6_flowinfo;    /* IPv6 flow information */
    struct in6_addr sin6_addr; /* IPv6 address */
    uint32_t sin6_scope_id;    /* IPv6 scope-id */
};

typedef unsigned short uint16_t;
typedef unsigned int uint32_t;
typedef uint16_t in_port_t;
typedef uint32_t in_addr_t;
#define __SOCKADDR_COMMON_SIZE (sizeof (unsigned short int))
All variables of the dedicated socket address types (and sockaddr_storage) must be converted to the generic socket address type sockaddr when they are actually used (a cast is enough), because every socket programming interface takes its address parameter as a pointer to sockaddr.
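The cast and the "decide by address family" step can be sketched as follows. Both helpers, make_ipv4_addr and port_of, are hypothetical names invented for this example; port_of stands in for any function that receives a generic sockaddr pointer the way the socket APIs do.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Fills an IPv4 socket address from a dotted-decimal string and a
 * host-order port. Returns 0 on success, -1 if ip is not a valid address. */
int make_ipv4_addr(const char *ip, uint16_t port, struct sockaddr_in *out)
{
    assert(ip != NULL && out != NULL);
    memset(out, 0, sizeof *out);
    out->sin_family = AF_INET;
    out->sin_port = htons(port);
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}

/* Receives a generic address pointer, inspects the address family, and
 * casts back to the dedicated type. Returns the port in host byte order,
 * or -1 for a family this sketch does not handle. */
int port_of(const struct sockaddr *sa)
{
    switch (sa->sa_family) {
    case AF_INET:
        return ntohs(((const struct sockaddr_in *)sa)->sin_port);
    case AF_INET6:
        return ntohs(((const struct sockaddr_in6 *)sa)->sin6_port);
    default:
        return -1;
    }
}
```

A caller passes the dedicated structure with a cast, for example port_of((struct sockaddr *)&addr), which is the same pattern used with bind(), connect() and accept().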
IP address conversion functions
People are used to representing IP addresses as readable strings, such as dotted-decimal strings for IPv4 addresses and hexadecimal strings for IPv6 addresses, but in a program they must first be converted to integers (binary) before they can be used. Conversely, when logging, we need to convert an IP address held as an integer back into a readable string.
Earlier functions (now discouraged):
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
int inet_aton(const char *cp,struct in_addr *inp);
in_addr_t inet_addr(const char *cp);
char *inet_ntoa(struct in_addr in);
These functions can only handle IPv4 addresses, and inet_ntoa is not reentrant (it returns a pointer to a static buffer).
Now:
p: presentation, the IP address as a string (dotted decimal for IPv4)
n: network, the address as an integer in network byte order
#include <arpa/inet.h>
int inet_pton(int af,const char *src,void *dst);
af: address family: AF_INET or AF_INET6
src: the IP string that needs to be converted
dst: where the converted result is stored
Return value: 1 on success, 0 if src is not a valid address for af, -1 if af is not supported
const char *inet_ntop(int af,const void *src,char *dst,socklen_t size);
af: AF_INET or AF_INET6
src: the address of the integer form of the IP to convert
dst: the buffer where the resulting IP string is stored
size: the size of the third parameter (the size of the array)
Return value: on success, a pointer to the converted string, which is the same as dst; NULL on failure
Dotted decimal ---> network byte order: inet_pton
Network byte order ---> dotted decimal: inet_ntop
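A short round trip through both functions might look like the sketch below; ipv4_round_trip is a hypothetical helper written only for this example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/* Parses a dotted-decimal string into a binary IPv4 address and formats it
 * back into out. Returns 0 on success, -1 if text is not a valid address
 * or out is too small. */
int ipv4_round_trip(const char *text, char *out, size_t out_len)
{
    struct in_addr addr; /* holds the address in network byte order */

    assert(text != NULL && out != NULL);
    if (inet_pton(AF_INET, text, &addr) != 1)
        return -1;
    if (inet_ntop(AF_INET, &addr, out, (socklen_t)out_len) == NULL)
        return -1;
    return 0;
}
```

An output buffer of INET_ADDRSTRLEN bytes is large enough for any IPv4 string. Checking that inet_pton returned exactly 1 matters, because a malformed string yields 0 rather than -1.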
Network socket functions
Socket model creation flow chart (TCP communication process/CS model flow chart)
Header file: #include <arpa/inet.h>
or: #include <sys/types.h> #include <sys/socket.h>
int socket(int domain,int type,int protocol);
Function: create a socket
parameter:
domain: protocol family
AF_INET --> ipv4
AF_INET6 --> ipv6
AF_UNIX / AF_LOCAL --> local socket communication (inter-process communication)
type: the kind of protocol used for communication
SOCK_STREAM --> stream protocol
SOCK_DGRAM --> datagram protocol
protocol: a specific protocol, usually 0
SOCK_STREAM --> stream sockets use TCP by default
SOCK_DGRAM --> datagram sockets use UDP by default
return value:
Success: returns a file descriptor that operates on kernel buffers (a socket is essentially a pseudo-file)
Failed: -1
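The sketch below creates a socket with protocol 0 and asks the kernel what type it ended up with. It uses getsockopt() with SO_TYPE, which is not covered in this article, and the helper name created_socket_type is made up for the example.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates an IPv4 socket of the requested type with protocol 0, reads the
 * socket type back from the kernel, and closes the socket again.
 * Returns the reported type, or -1 on failure. */
int created_socket_type(int type)
{
    assert(type == SOCK_STREAM || type == SOCK_DGRAM);

    int fd = socket(AF_INET, type, 0);
    if (fd == -1)
        return -1;

    int actual = -1;
    socklen_t len = sizeof actual;
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &actual, &len) == -1)
        actual = -1;

    close(fd); /* the descriptor is closed like any other file */
    return actual;
}
```

Because the return value of socket() is an ordinary file descriptor, it has to be released with close() once it is no longer needed.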
int bind(int sockfd,const struct sockaddr *addr,socklen_t addrlen);
Function: binds fd to a local IP + port
parameter:
sockfd: the file descriptor obtained from the socket() function
addr: the socket address to bind, which encapsulates the IP and port number
addrlen: the size in bytes of the structure passed as the second parameter
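As a sketch of the bind step, the helper below binds to port 0 on the loopback address, which asks the kernel to pick a free port, and then reads the chosen port back. getsockname() is not covered in this article, and bind_loopback_any_port is a hypothetical name.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates a TCP socket and binds it to 127.0.0.1 with port 0 so that the
 * kernel chooses a free port. Stores the chosen port (host byte order) in
 * *port_out and returns the socket descriptor, or -1 on failure. */
int bind_loopback_any_port(uint16_t *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    assert(port_out != NULL);
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* The dedicated structure is cast to the generic sockaddr type. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}
```

A real server normally binds a fixed, well-known port instead; port 0 is used here only so the example does not collide with anything already running.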
int listen(int sockfd,int backlog);
Function: Listen for connections on this socket
parameter:
sockfd: the file descriptor obtained by the socket() function
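backlog: the maximum number of connections that may be queued waiting for accept(), for example 5 (traditionally described as the maximum of the sum of not-yet-completed and completed connections; the exact meaning depends on the system)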
int accept(int sockfd,struct sockaddr *addr,socklen_t *addrlen);
Function: accepts a client connection; by default it blocks, waiting for a client to connect
parameter:
sockfd: the listening file descriptor
addr: output parameter that records the client's address information (IP, port) after the connection succeeds
addrlen: value-result parameter; on input, the size of the memory the second parameter points to, on output, the actual size of the address stored
return value:
success: file descriptor for communication
Failed: -1
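The following sketch shows accept() filling in the client address. To stay self-contained it listens on an ephemeral loopback port and connects to itself from the same process; the function name accept_on_loopback is hypothetical, and getsockname() is used only to learn the port the kernel chose.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Listens on a kernel-chosen loopback port, connects to it from the same
 * process, accepts the connection, and writes the client's "ip:port" into
 * peer. Returns 0 on success, -1 on failure. */
int accept_on_loopback(char *peer, size_t peer_len)
{
    struct sockaddr_in srv, cli;
    socklen_t len = sizeof srv;
    char ip[INET_ADDRSTRLEN];
    int lfd = -1, cfd = -1, afd = -1, rc = -1;

    assert(peer != NULL);
    memset(&srv, 0, sizeof srv);
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = htons(0);

    if ((lfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) goto out;
    if (bind(lfd, (struct sockaddr *)&srv, sizeof srv) == -1) goto out;
    if (listen(lfd, 5) == -1) goto out;
    if (getsockname(lfd, (struct sockaddr *)&srv, &len) == -1) goto out;

    /* The connection completes inside the kernel and waits in the queue. */
    if ((cfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) goto out;
    if (connect(cfd, (struct sockaddr *)&srv, sizeof srv) == -1) goto out;

    len = sizeof cli; /* value-result: buffer size in, address size out */
    if ((afd = accept(lfd, (struct sockaddr *)&cli, &len)) == -1) goto out;

    if (inet_ntop(AF_INET, &cli.sin_addr, ip, sizeof ip) == NULL) goto out;
    snprintf(peer, peer_len, "%s:%d", ip, ntohs(cli.sin_port));
    rc = 0;
out:
    if (afd != -1) close(afd);
    if (cfd != -1) close(cfd);
    if (lfd != -1) close(lfd);
    return rc;
}
```

Note that accept() returns a new descriptor for talking to this client, while the listening descriptor stays open for further connections.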
ssize_t write(int fd, const void *buf, size_t count); // write data
ssize_t read(int fd, void *buf, size_t count); // read data
Example programs (the server first, then the client)
#include <arpa/inet.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>

// Server IP
#define SERVER_IP "127.0.0.1"
// Server port
#define SERVER_PORT 8080

int main(void)
{
    int lfd, cfd;
    char str[INET_ADDRSTRLEN];

    // Create the socket
    lfd = socket(AF_INET, SOCK_STREAM, 0); // TCP, IPv4

    // Bind the server IP and the listening port
    struct sockaddr_in serverAddr;
    memset(&serverAddr, 0, sizeof(serverAddr));
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_port = htons(SERVER_PORT);
    serverAddr.sin_addr.s_addr = htonl(INADDR_ANY); // INADDR_ANY: any valid local IP
    bind(lfd, (struct sockaddr *)&serverAddr, sizeof(serverAddr));

    // Listen, with a backlog of 128
    listen(lfd, 128);

    // Wait for a connection
    struct sockaddr_in clientAddr;
    socklen_t clientAddr_len = sizeof(clientAddr);
    cfd = accept(lfd, (struct sockaddr *)&clientAddr, &clientAddr_len);

    // Exchange data
    int n, i = 0;
    char buf[1024] = {0};
    while (1)
    {
        n = read(cfd, buf, sizeof(buf) - 1); // leave room for the terminating '\0'
        if (n == 0) // the client closed the connection
        {
            printf("Client disconnected\n");
            break;
        }
        if (n < 0)
        {
            perror("read error");
            break;
        }
        buf[n] = '\0'; // read() does not terminate the string for us
        // inet_ntop(AF_INET, &clientAddr.sin_addr, str, sizeof(str));
        // ntohs(clientAddr.sin_port);
        printf("Received message %d: %s\n", i++, buf);
        // sleep(2);
        write(cfd, buf, n); // echo the data back
    }
    close(cfd);
    close(lfd);
    return 0;
}
#include <arpa/inet.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>

// Server IP
#define SERVER_IP "127.0.0.1"
// Server port
#define SERVER_PORT 8080

int main(void)
{
    int sockfd;

    // Create the socket: TCP, IPv4
    sockfd = socket(AF_INET, SOCK_STREAM, 0);

    // Connect
    struct sockaddr_in serverAddr;
    memset(&serverAddr, 0, sizeof(serverAddr));
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_port = htons(SERVER_PORT);
    inet_pton(AF_INET, SERVER_IP, &serverAddr.sin_addr);
    connect(sockfd, (struct sockaddr *)&serverAddr, sizeof(serverAddr));

    // Exchange data
    char buf[1024] = {0};
    int n = 0;
    while (1)
    {
        // memset(buf, 0, sizeof(buf));
        if (fgets(buf, sizeof(buf), stdin) == NULL) // end of input
        {
            break;
        }
        // scanf("%s", buf);
        write(sockfd, buf, strlen(buf)); // send only the bytes that were typed
        memset(buf, 0, sizeof(buf));
        n = read(sockfd, buf, sizeof(buf));
        if (n <= 0) // the server closed the connection, or read failed
        {
            break;
        }
        printf("------a-------\n");
        write(STDOUT_FILENO, buf, n);
    }
    close(sockfd);
    return 0;
}
After the client and server have connected, you can use netstat -apn | grep 8080 to check the connection status.
Error-handling wrapper functions
System calls are not guaranteed to succeed every time, so errors must be handled: this keeps the program's logic correct and also makes fault information available quickly.
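To illustrate a system call failing and reporting why, the sketch below binds two sockets to the same loopback port and returns the errno left by the second bind. The function name second_bind_errno is hypothetical, and getsockname() is used only to learn which port the kernel picked for the first socket.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Binds one TCP socket to a kernel-chosen loopback port, then tries to bind
 * a second socket to the same address. Returns the errno of the second
 * bind, 0 if it unexpectedly succeeded, or -1 if the setup itself failed. */
int second_bind_errno(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int result = -1;
    int a = socket(AF_INET, SOCK_STREAM, 0);
    int b = socket(AF_INET, SOCK_STREAM, 0);

    if (a == -1 || b == -1)
        goto out;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0); /* let the kernel choose the port */
    if (bind(a, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        getsockname(a, (struct sockaddr *)&addr, &len) == -1)
        goto out;
    assert(addr.sin_port != 0); /* addr now holds the chosen port */

    /* Save errno immediately: later calls may overwrite it. */
    if (bind(b, (struct sockaddr *)&addr, sizeof addr) == -1)
        result = errno;
    else
        result = 0;
out:
    if (a != -1) close(a);
    if (b != -1) close(b);
    return result;
}
```

With neither socket setting SO_REUSEADDR, the second bind is expected to fail with EADDRINUSE, which perror() prints as "Address already in use". An unchecked bind() like the one in the example server above would silently carry on in this situation.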
To keep error-handling code from hurting the readability of the main program, we wrap the socket-related functions together with their error handling into new functions (modeled on the system library functions) and collect them in a template file, wrap.c.
Save this code if you need it, and you can reuse it directly next time. (If you are interested, you can build it into a dynamic library.)
Let's get the header file done first:
First, what do we need to declare? It is actually very simple: put the framework up first.
Just take the system calls above and give each one an error-handling wrapper.
The function names are taken directly from the system calls (here with the first letter capitalized); it is best to follow a camel-case style.
#ifndef _WRAP_H_
#define _WRAP_H_

#include <sys/types.h>
#include <sys/socket.h>

void perr_exit(const char *s);
int Accept(int fd, struct sockaddr *sa, socklen_t *salenptr);
int Bind(int fd, const struct sockaddr *sa, socklen_t salen);
int Connect(int fd, const struct sockaddr *sa, socklen_t salen);
int Listen(int fd, int backlog);
int Socket(int family, int type, int protocol);
ssize_t Read(int fd, void *ptr, size_t nbytes);
ssize_t Write(int fd, const void *ptr, size_t nbytes);
int Close(int fd);
ssize_t Readn(int fd, void *vptr, size_t n);
ssize_t Writen(int fd, const void *vptr, size_t n);
ssize_t Readline(int fd, void *vptr, size_t maxlen);
// my_read is a static helper private to wrap.c, so it is not declared here

#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <sys/socket.h>

void perr_exit(const char *s)
{
    perror(s);
    exit(1);
}

int Accept(int fd, struct sockaddr *sa, socklen_t *salenptr)
{
    int n;
    // accept blocks and is a slow system call; it may be interrupted by a signal
again:
    if ((n = accept(fd, sa, salenptr)) < 0)
    {
        if ((errno == ECONNABORTED) || (errno == EINTR))
        {
            goto again; // restart
        }
        else
        {
            perr_exit("accept error");
        }
    }
    return n;
}

int Bind(int fd, const struct sockaddr *sa, socklen_t salen)
{
    int n;
    if ((n = bind(fd, sa, salen)) < 0)
    {
        perr_exit("bind error");
    }
    return n;
}

int Connect(int fd, const struct sockaddr *sa, socklen_t salen)
{
    int n;
    if ((n = connect(fd, sa, salen)) < 0)
    {
        perr_exit("connect error");
    }
    return n;
}

int Listen(int fd, int backlog)
{
    int n;
    if ((n = listen(fd, backlog)) < 0)
    {
        perr_exit("listen error");
    }
    return n;
}

int Socket(int family, int type, int protocol)
{
    int n;
    if ((n = socket(family, type, protocol)) < 0)
    {
        perr_exit("socket error");
    }
    return n;
}

ssize_t Read(int fd, void *ptr, size_t nbytes)
{
    ssize_t n;
again:
    if ((n = read(fd, ptr, nbytes)) == -1)
    {
        if (errno == EINTR) // interrupted by a signal
        {
            goto again;
        }
        else
        {
            return -1;
        }
    }
    return n;
}

ssize_t Write(int fd, const void *ptr, size_t nbytes)
{
    ssize_t n;
again:
    if ((n = write(fd, ptr, nbytes)) == -1)
    {
        if (errno == EINTR)
        {
            goto again;
        }
        else
        {
            return -1;
        }
    }
    return n;
}

int Close(int fd)
{
    int n;
    if ((n = close(fd)) == -1)
    {
        perr_exit("close error");
    }
    return n;
}

// Read exactly n bytes unless end of file or an error occurs first
ssize_t Readn(int fd, void *vptr, size_t n)
{
    size_t nleft;
    ssize_t nread;
    char *ptr;

    ptr = vptr;
    nleft = n;
    while (nleft > 0)
    {
        // the result must go into nread: nleft is unsigned and can never be < 0
        if ((nread = read(fd, ptr, nleft)) < 0)
        {
            if (errno == EINTR)
            {
                nread = 0; // interrupted: retry
            }
            else
            {
                return -1;
            }
        }
        else if (nread == 0) // end of file
        {
            break;
        }
        nleft -= nread;
        ptr += nread;
    }
    return n - nleft;
}

// Write exactly n bytes unless an error occurs
ssize_t Writen(int fd, const void *vptr, size_t n)
{
    size_t nleft;
    ssize_t nwritten;
    const char *ptr;

    ptr = vptr;
    nleft = n;
    while (nleft > 0)
    {
        if ((nwritten = write(fd, ptr, nleft)) <= 0)
        {
            if (nwritten < 0 && errno == EINTR)
            {
                nwritten = 0; // interrupted: retry
            }
            else
            {
                return -1;
            }
        }
        nleft -= nwritten;
        ptr += nwritten;
    }
    return n;
}

// Return one byte at a time from an internal buffer, refilling it with read() when empty
static ssize_t my_read(int fd, char *ptr)
{
    static int read_cnt;
    static char *read_ptr;
    static char read_buf[100];

    if (read_cnt <= 0)
    {
again:
        if ((read_cnt = read(fd, read_buf, sizeof(read_buf))) < 0)
        {
            if (errno == EINTR)
            {
                goto again;
            }
            return -1;
        }
        else if (read_cnt == 0)
        {
            return 0;
        }
        read_ptr = read_buf;
    }
    read_cnt--;
    *ptr = *read_ptr++;
    return 1;
}

// Read one line (up to maxlen - 1 bytes) and terminate it with '\0'
ssize_t Readline(int fd, void *vptr, size_t maxlen)
{
    ssize_t n, rc;
    char c, *ptr;

    ptr = vptr;
    for (n = 1; n < maxlen; n++)
    {
        if ((rc = my_read(fd, &c)) == 1)
        {
            *ptr++ = c;
            if (c == '\n')
            {
                break;
            }
        }
        else if (rc == 0) // end of file
        {
            *ptr = 0;
            return n - 1;
        }
        else
        {
            return -1;
        }
    }
    *ptr = 0;
    return n;
}
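The loop in Readn exists because a single read() on a stream socket may return fewer bytes than requested. The sketch below reproduces that idea in a self-contained form and exercises it over a local socket pair; read_exactly and two_chunk_demo are hypothetical names, and socketpair() stands in for a real network connection.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Keeps calling read() until n bytes have arrived, end of file is reached,
 * or a real error occurs. Returns the number of bytes read, or -1. */
ssize_t read_exactly(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    assert(buf != NULL);
    while (left > 0) {
        ssize_t got = read(fd, p, left);
        if (got < 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal: retry */
            return -1;
        }
        if (got == 0)
            break; /* end of file */
        left -= (size_t)got;
        p += got;
    }
    return (ssize_t)(n - left);
}

/* Sends six bytes as two separate writes and collects them with a single
 * read_exactly() call. out must have room for 7 bytes.
 * Returns 0 on success, -1 on failure. */
int two_chunk_demo(char out[7])
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    ssize_t w1 = write(sv[0], "abc", 3);
    ssize_t w2 = write(sv[0], "def", 3);
    close(sv[0]); /* the queued data stays readable on the other end */

    ssize_t got = (w1 == 3 && w2 == 3) ? read_exactly(sv[1], out, 6) : -1;
    close(sv[1]);
    if (got != 6)
        return -1;
    out[6] = '\0';
    return 0;
}
```

A stream socket does not preserve the boundaries between writes, so the reader is the one responsible for collecting as many bytes as its protocol expects.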