TCP/IP Network Programming Chapter 3: Address Family and Data Sequence

In the previous chapter, we introduced various details about socket creation. In this chapter, we will explain the assignment of IP addresses and port numbers to sockets.

The IP address and port number assigned to the socket

Internet Address

If you are studying network programming, you probably already have a basic grounding in computer networks, and concepts such as network addresses should be familiar. I will therefore only review them briefly.

There are two forms of expression for an IP address:

1. IPv4: 4-byte address family

2. IPv6: 16-byte address family

Keep in mind that IPv4 and IPv6 differ not only in address length but also substantially in how the protocols are implemented. IPv6 was introduced mainly to solve the shortage of IP addresses caused by the surge in the number of networked computers. The IPv6 address space is large enough to give every grain of sand on Earth its own IP address.

Let's go back to IPv4. A 4-byte IPv4 address is divided into a network address and a host (that is, a computer) address, and addresses are grouped into classes A, B, C, D, and E.

Class  Address range                 Network bits  Host bits   Networks    Hosts per network
A      1.0.0.0 - 126.255.255.255     8             24          128         16,777,216
B      128.0.0.0 - 191.255.255.255   16            16          16,384      65,536
C      192.0.0.0 - 223.255.255.255   24            8           2,097,152   256
D      224.0.0.0 - 239.255.255.255   unassigned    unassigned  unassigned  unassigned

The table above summarizes the four types.

Here is an example that illustrates what the network address and the host address mean. Suppose data is sent to a company, WWW.SEMI.COM, which has built a local area network connecting all of its computers. The data must first reach the SEMI.COM network. In other words, the routers along the way do not examine the entire 4-byte IP address to locate the target host; they examine only the network address portion and deliver the data to the SEMI.COM network. Once the data arrives, the SEMI.COM network (more precisely, the routers that make up the network) examines the host address (host ID) and forwards the data to the target computer.

Network Address Classes and Host Address Boundaries

The number of bytes occupied by the network address can be determined from the first byte of the address alone, because the classes are distinguished by the following boundaries.
□ First byte of a class A address: 0-127
□ First byte of a class B address: 128-191
□ First byte of a class C address: 192-223

The same boundaries can be expressed in terms of the leading bits of the address.
□ The first bit of a class A address is 0.
□ The first 2 bits of a class B address are 10.
□ The first 3 bits of a class C address are 110.

Because of this, when data is sent and received through a socket, it is first delivered to the correct network, and from there the target host is easy to locate.

Differentiate the port number of the socket

When I first studied computer networks, I found ports hard to understand. I assumed a port was a physical port, like the ones on a router, and that a machine had as many ports as it had connectors. Looking back, that idea was naive. In fact, IP addresses and port numbers go hand in hand. The IP address routes a piece of data across the network to your computer, and once the data arrives, the IP address has done its job. But the arriving data may be intended for any of several applications, so how are they told apart? This is where the port comes in.

A port number distinguishes the sockets on one computer, so the same port number cannot be assigned to more than one socket. A port number consists of 16 bits, so the assignable range is 0-65535. However, 0-1023 are well-known ports (Well-known PORT), generally reserved for specific applications, so you should pick a value outside that range. Also, although a port number cannot be duplicated, TCP sockets and UDP sockets do not share port numbers, so the same number may be used once by each. For example, if a TCP socket uses port 9190, no other TCP socket can use that port number, but a UDP socket can.

In short, the destination of a data transmission contains both the IP address and the port number; only then can the data be delivered to the final destination application (application socket).

Representation of address information

In the previous chapters, the bind function we called took some parameters that we did not yet understand. Since bind assigns an IP address and a port to a socket, those parameters must carry an IP address and a port number. Let's look at them now.

struct sockaddr_in{
    sa_family_t     sin_family;   // address family
    uint16_t        sin_port;     // 16-bit TCP/UDP port number
    struct in_addr  sin_addr;     // 32-bit IP address
    char            sin_zero[8];  // not used
};

Another structure in_addr mentioned in this structure is defined as follows, which is used to store 32-bit IP addresses.

struct in_addr{
    in_addr_t  s_addr;  // 32-bit IPv4 address
};

Data type     Description                                 Declared in
int8_t        signed 8-bit int                            sys/types.h
uint8_t       unsigned 8-bit int (unsigned char)          sys/types.h
int16_t       signed 16-bit int                           sys/types.h
uint16_t      unsigned 16-bit int (unsigned short)        sys/types.h
int32_t       signed 32-bit int                           sys/types.h
uint32_t      unsigned 32-bit int (unsigned long)         sys/types.h
sa_family_t   address family                              sys/socket.h
socklen_t     length                                      sys/socket.h
in_addr_t     IP address, declared as uint32_t            netinet/in.h
in_port_t     port number, declared as uint16_t           netinet/in.h

Seeing such a long list of types, you may wonder why they are needed at all. A major reason is portability. When code written for a 32-bit computer is moved to a 64-bit computer, basic types such as long may occupy a different number of bytes, and such differences must never be allowed to change the meaning of the data. A type such as int32_t, on the other hand, is guaranteed to occupy 4 bytes on every platform.

Member analysis of structure sockaddr_in

member sin_family

The applicable address families are different for each protocol family. For example, IPv4 uses a 4-byte address family, and IPv6 uses a 16-byte address family.

Address family   Meaning
AF_INET          The address family used in the IPv4 network protocol
AF_INET6         The address family used in the IPv6 network protocol
AF_LOCAL         The address family of the UNIX protocol used for local communication

This member determines how the members that follow are interpreted.

member sin_port

This member holds the 16-bit port number. The key point is that it is stored in network byte order.

member sin_addr

This member holds the 32-bit IP address, which is also stored in network byte order. To understand this member properly, look at the structure in_addr as well. Its only member, s_addr, is declared as in_addr_t, which is uint32_t, so it can simply be treated as a 32-bit integer.

member sin_zero

This member has no special meaning. It is padding inserted so that the size of the sockaddr_in structure matches that of the sockaddr structure. It must be filled with 0, otherwise the desired result cannot be obtained. The sockaddr structure is explained below. As the earlier code showed, the address of a sockaddr_in variable is passed to the bind function as follows. The bind function itself is described in detail later; for now, focus on the argument passing and the type conversion.

struct sockaddr_in serv_addr;
if(bind(serv_sock,(struct sockaddr *)&serv_addr, sizeof(serv_addr))==-1)
    error_handling("bind() error");

What matters here is the second argument. The bind function expects it to be the address of a sockaddr structure variable containing the address family, port number, IP address, and so on.

struct sockaddr{
    sa_family_t sa_family;    // address family
    char        sa_data[14];  // address information
};

Compared with sockaddr_in, this structure packs the last three members into the single array sa_data. Filling in address information byte by byte in that array is cumbersome, which is why the sockaddr_in structure was introduced. In the end, however, the sockaddr_in variable must still be cast to type sockaddr before it is passed to the bind function.

Network byte order and address translation

Byte order and network byte order

As explained in computer organization courses, a CPU can store multi-byte data in memory in one of two ways.

Method 1 (big endian): store the high-order byte at the low address.

Method 2 (little endian): store the high-order byte at the high address.

A big-endian system that stores 0x12 followed by 0x34 holds the same value as a little-endian system that stores 0x34 followed by 0x12. In other words, the same value is recognized correctly only if the storage order is taken into account. Suppose a big-endian system sends the value 0x1234 without considering byte order, transmitting 0x12 and then 0x34. If the receiver stores the bytes in little-endian order, it interprets the data as 0x3412 instead of 0x1234. For this reason, a single convention is used when transmitting data over a network. It is called network byte order, and it is simply big endian. Every computer must therefore interpret received data as network byte order, and a little-endian system must convert data to big endian before sending it.


Next, here are the functions that convert byte order.
□ unsigned short htons(unsigned short)
□ unsigned short ntohs(unsigned short)
□ unsigned long htonl(unsigned long)
□ unsigned long ntohl(unsigned long)

You can guess what each function does from its name, as long as you know the following.
□ The h in htons stands for host byte order.
□ The n in htons stands for network byte order.
In addition, s stands for short and l stands for long. The prototypes above assume a 4-byte long; on 64-bit Linux, long occupies 8 bytes, but htonl and ntohl still convert 32-bit values. So htons combines h, to, n, and s, and can be read as "convert short data from host byte order to network byte order". Functions with the s suffix handle 2-byte values and are typically used for port numbers; functions with the l suffix handle 4-byte values and are typically used for IP addresses.

The following example illustrates how these functions are called.

#include<stdio.h>
#include<arpa/inet.h>

int main(int argc,char *argv[]){
    unsigned short host_port=0x1234;
    unsigned short net_port;
    unsigned long host_addr=0x12345678;
    unsigned long net_addr;

    net_port=htons(host_port);
    net_addr=htonl(host_addr);

    printf("Host ordered port: %#x \n",host_port);
    printf("Network ordered port: %#x \n",net_port);
    printf("Host ordered address: %#lx \n", host_addr);
    printf("Network ordered address:%#lx \n", net_addr);
    return 0;
}

Initialization and allocation of network addresses

Convert string information to integer type in network byte order

The member of sockaddr_in that stores the address is a 32-bit integer. To assign an IP address, we therefore need to express it as a 32-bit integer, which is not easy for people who are used to string notation. Try converting the IP address 201.211.214.36 into a 4-byte integer yourself.
We are used to writing IP addresses in dotted decimal notation (Dotted Decimal Notation), not as integers. Fortunately, there is a function that converts an IP address in string form into a 32-bit integer, and it performs the conversion to network byte order at the same time.

#include<arpa/inet.h>
in_addr_t inet_addr(const char*string);// returns a 32-bit big-endian integer on success, INADDR_NONE on failure

If you pass a string in dotted decimal format like "211.214.107.99" to this function, it will convert it to a 32-bit integer and return it. Of course, the integer value satisfies the network byte order. In addition, the return value type in_addr_t of this function is internally declared as a 32-bit integer type. The following example shows the calling process of this function.

#include<stdio.h>
#include<arpa/inet.h>

int main(int argc,char*argv[]){
    char*addr1="1.2.3.4";
    char*addr2="1.2.3.256";
    
    /* A valid address: the conversion succeeds. */
    unsigned long conv_addr=inet_addr(addr1);
    if(conv_addr==INADDR_NONE)
       printf("Error occurred! \n");
    else
       printf("Network ordered integer addr: %#lx \n",conv_addr);

    /* 256 does not fit in one byte, so this conversion fails. */
    conv_addr=inet_addr(addr2);
    if(conv_addr==INADDR_NONE)
       printf("Error occurred! \n");
    else
       printf("Network ordered integer addr: %#lx \n\n", conv_addr);
    return 0;
}

Running the program shows that the inet_addr function not only converts an IP address into a 32-bit integer but also detects invalid IP addresses. The output also confirms that the value is indeed converted to network byte order.

The inet_aton function is functionally the same as the inet_addr function: it also converts an IP address in string form into a 32-bit integer in network byte order. The difference is that it stores the result in an in_addr structure instead of returning it, and it is used more often in practice.

#include <arpa/inet.h>
int inet_aton(const char * string, struct in_addr * addr);
// returns 1 (true) on success, 0 (false) on failure.

Use the following example to understand the inet_aton function call process.

#include <stdio.h>
#include <stdlib.h>
#include <arpa/inet.h>
void error_handling(char *message);

int main(int argc, char *argv[]){
    char *addr="127.232.124.79";
    struct sockaddr_in addr_inet;
    /* The result is written directly into the sin_addr member. */
    if(!inet_aton(addr, &addr_inet.sin_addr))
         error_handling("Conversion error");
    else
         printf("Network ordered integer addr: %#x \n",addr_inet.sin_addr.s_addr);
    return 0;
}

void error_handling(char *message){
    fputs(message, stderr);
    fputc('\n', stderr);
    exit(1);
}

Next is a function that does the opposite: it converts an integer IP address in network byte order into the familiar string form.

#include <arpa/inet.h>
char * inet_ntoa(struct in_addr adr);// returns the address of the converted string on success, -1 on failure.

This function converts the integer IP address passed as its argument into string form and returns it. Be careful when calling it, because the return type is a char pointer. Returning the address of a string means the string is already stored somewhere in memory, yet the function does not ask the caller to allocate that memory; it stores the string in a buffer that it manages internally. You should therefore copy the string elsewhere right after the call, because the next call to inet_ntoa may overwrite the previously stored string. In short, the returned string is valid only until inet_ntoa is called again. If you need to keep it longer, copy it into your own memory. The following example demonstrates this.

#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
int main(int argc, char *argv[]){
    struct sockaddr_in addr1, addr2;
    char *str_ptr;
    char str_arr[20];

    addr1.sin_addr.s_addr=htonl(0x1020304);
    addr2.sin_addr.s_addr=htonl(0x1010101);

    str_ptr=inet_ntoa(addr1.sin_addr);
    strcpy(str_arr, str_ptr);
    printf("Dotted-Decimal notation1: %s \n", str_ptr);
    inet_ntoa(addr2.sin_addr);

    printf("Dotted-Decimal notation2: %s \n", str_ptr);
    printf("Dotted-Decimal notation3: %s \n", str_arr);
    return 0;
}

INADDR_ANY

It would be cumbersome to enter the IP address every time a server-side socket is created. In this case, the address information can be initialized as follows.

struct sockaddr_in addr;
char * serv_port = "9190";
memset(&addr, 0, sizeof(addr));
addr.sin_family = AF_INET;
addr.sin_addr.s_addr = htonl(INADDR_ANY);
addr.sin_port = htons(atoi(serv_port));

The biggest difference from the previous method is that the constant INADDR_ANY is used for the server's IP address. With this approach, the IP address of the computer running the server is assigned automatically, so you do not have to enter it yourself. Moreover, if the computer has multiple IP addresses (a multi-homed computer; routers generally fall into this category), data can be received on any of those addresses as long as the port number matches. This approach is therefore preferred on the server side. A client does not use it unless it also provides some server functionality.

Assign a network address to a socket

Now that the initialization of the sockaddr_in structure has been discussed, let's assign the initialized address information to the socket. The bind function takes care of this.

#include <sys/socket.h>
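int bind(int sockfd, struct sockaddr * myaddr, socklen_t addrlen);// returns 0 on success, -1 on failure.

      sockfd         // file descriptor of the socket to be assigned address information (IP address and port number).
      myaddr         // address of the structure variable holding the address information.
      addrlen        // length of the structure variable passed as the second argument.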

If this function call succeeds, the address information specified by the second parameter is assigned to the socket identified by the first parameter.

Windows-based implementation

Use of function htons, htonl in Windows

#include<stdio.h>
#include<stdlib.h>
#include<winsock2.h>

void ErrorHandling(char* message);

int main(int argc, char *argv[]){
    WSADATA wsaData;
    unsigned short host_port=0x1234;
    unsigned short net_port;
    unsigned long host_addr=0x12345678;
    unsigned long net_addr;

    if(WSAStartup(MAKEWORD(2,2), &wsaData)!=0)
           ErrorHandling("WSAStartup() error!");

    net_port=htons(host_port);
    net_addr=htonl(host_addr);

    printf("Host ordered port: %#x \n",host_port);
    printf("Network ordered port: %#x \n", net_port);
    printf("Host ordered address: %#lx \n", host_addr);
    printf("Network ordered address: %#lx \n", net_addr);

    WSACleanup();
    return 0;
}
 
void ErrorHandling(char* message){
    fputs(message, stderr);
    fputc('\n', stderr);
    exit(1);
}

Use of functions inet_addr and inet_ntoa in Windows

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <winsock2.h>

void ErrorHandling(char* message);
int main(int argc, char *argv[]){
    WSADATA wsaData;
    if(WSAStartup(MAKEWORD(2,2),&wsaData)!=0)
        ErrorHandling("WSAStartup() error!");

    /* inet_addr function call example */
    {
        char *addr="127.212.124.78";
        unsigned long conv_addr=inet_addr(addr);
        if(conv_addr==INADDR_NONE)
            printf("Error occurred! \n");
        else
            printf("Network ordered integer addr: %#lx \n", conv_addr);
    }

    /* inet_ntoa function call example */
    {
        /* A separate block, because the name addr is reused here. */
        struct sockaddr_in addr;
        char *strptr;
        char strArr[20];

        addr.sin_addr.s_addr=htonl(0x1020304);
        strptr=inet_ntoa(addr.sin_addr);
        strcpy(strArr, strptr);
        printf("Dotted-Decimal notation3 %s \n", strArr);
    }

    WSACleanup();
    return 0;
}

/* Same as in the previous example. */
void ErrorHandling(char* message){
    fputs(message, stderr);
    fputc('\n', stderr);
    exit(1);
}

WSAStringToAddress & WSAAddressToString

The two conversion functions below were added in Winsock2. Functionally they match inet_ntoa and inet_addr, with the advantage that they support multiple protocols and work with both IPv4 and IPv6. They also have a drawback: code that uses inet_ntoa and inet_addr moves easily between Linux and Windows, whereas these two functions are tied to a specific platform, which reduces portability.

#include <winsock2.h>
INT WSAStringToAddress(
    LPTSTR AddressString, INT AddressFamily, LPWSAPROTOCOL_INFO lpProtocolInfo,
    LPSOCKADDR lpAddress, LPINT lpAddressLength
);
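    // returns 0 on success, SOCKET_ERROR on failure

    Parameter 1: string containing the IP address and port number
    Parameter 2: address family of the address in the first parameter
    Parameter 3: protocol provider (Provider); NULL by default
    Parameter 4: address of the structure variable that receives the address information
    Parameter 5: address of the variable holding the length of the structure passed as the fourth parameter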

Almost all of the types that appear in the function above are typedef declarations of basic data types.

WSAAddressToString does the opposite of WSAStringToAddress: it converts the address information in a structure into string form.

#include <winsock2.h>
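INT WSAAddressToString(
    LPSOCKADDR lpsaAddress, DWORD dwAddressLength,
    LPWSAPROTOCOL_INFO lpProtocolInfo, LPSTR lpszAddressString,
    LPDWORD lpdwAddressStringLength
);
    // returns 0 on success, SOCKET_ERROR on failure.

    Parameter 1: address of the address structure variable to convert
    Parameter 2: length of the structure in the first parameter
    Parameter 3: protocol provider; NULL by default
    Parameter 4: address of the string buffer that receives the conversion result
    Parameter 5: address of the variable holding the length of the string in the fourth parameter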

Here is an example of these two functions:

#undef UNICODE
#undef _UNICODE
#include <stdio.h>
#include <winsock2.h>

int main(int argc, char *argv[]){
    char *strAddr="203.211.218.102:9190";
    char strAddrBuf[50];
    SOCKADDR_IN servAddr;
    int size;

    WSADATA wsaData;
    WSAStartup(MAKEWORD(2,2), &wsaData);

    size=sizeof(servAddr);
    WSAStringToAddress(strAddr, AF_INET, NULL,(SOCKADDR*)&servAddr, &size);

    size=sizeof(strAddrBuf);
    WSAAddressToString((SOCKADDR*)&servAddr,sizeof(servAddr), NULL, strAddrBuf,&size);

    printf("Second conv result: %s \n", strAddrBuf);
    WSACleanup();
    return 0;
}

Origin blog.csdn.net/Reol99999/article/details/131615932