Detailed explanation of IP datagram format

The IP protocol provides an unreliable, connectionless datagram delivery service. The IP layer implements this service by encapsulating and decapsulating datagrams. An IP datagram consists of two parts: the header area and the data area. The header area carries the control information needed to deliver the higher-layer data correctly, and the data area carries the data that the higher-layer protocol needs to transmit.

The format of an IP datagram is as follows:

Note that in the above figure the most significant bit is on the left and is numbered bit 0; the least significant bit is on the right and is numbered bit 31. When data is transmitted over the network, bits 0~7 are sent first, then bits 8~15, then bits 16~23, and finally bits 24~31. This is big-endian byte order. Since all binary numbers in TCP/IP protocol headers must be transmitted in this order, it is called network byte order. In actual programming, binary numbers stored in any other form must be converted to network byte order with the corresponding functions of the network programming API before the header is transmitted.
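As a minimal sketch of this conversion, assuming Python's standard socket and struct modules (the article itself names no specific API, and the value 0x1234 is purely illustrative):

```python
import socket
import struct
import sys

value = 0x1234  # illustrative 16-bit header field value

# The "!" prefix tells struct to pack in network (big-endian) byte order,
# so the most significant byte is placed first.
wire = struct.pack("!H", value)
print(wire.hex())  # prints 1234

# socket.htons converts a 16-bit integer from host order to network order.
# On a little-endian host the two bytes are swapped; on a big-endian host
# the value is returned unchanged.
converted = socket.htons(value)
expected = 0x3412 if sys.byteorder == "little" else 0x1234

# socket.ntohs performs the reverse conversion on the receiving side.
roundtrip = socket.ntohs(converted)
```

Packing with struct and the "!" prefix is usually the more convenient choice when building a whole header, since it handles every field in one call.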

The functions of each field of the IP datagram are as follows:

1) Version number: Occupies 4 bits and indicates the IP protocol version used by the datagram. The Internet currently mainly uses version 4 of the IP protocol (IPv4) from the TCP/IP protocol suite.

2) Header length: Occupies 4 bits and gives the length of the entire header (including options), counted in 32-bit words. From this field the receiver can calculate where the header ends and where the data begins. For a normal IP datagram (without any options) the value of this field is 5 (i.e. 20 bytes).

3) Type of service (TOS, type of service): It occupies 8 binary bits and is used to specify the processing method of this datagram. The 8 bits of the Type of Service field are divided into 5 subfields:

(1)—Priority (0-7): The larger the number, the higher the priority of the datagram. Routers in the network can use the priority for congestion control. For example, when the network is congested, a router can choose which datagrams to discard according to their priority.

(2)—Short delay bit D (Delay): When this bit is 1, the datagram is requested to be transmitted on a short delay channel, and 0 means normal delay.

(3)—High throughput bit T (Throughput): When this bit is 1, the datagram is requested to be transmitted on a high-throughput channel, and 0 means normal.

(4)—High reliability bit R (Reliability): When this bit is 1, the datagram is requested to be transmitted on a high reliability channel, and 0 means normal.

(5)—Reserved bits.

The TCP/IP implementations currently used in the Internet ignore TOS in most cases, but in actual programming there are dedicated functions for setting this field. Some important Internet application protocols have recommended TOS values:

As can be seen from the above table, applications that interact directly with users generally request short delay; applications that transfer large amounts of data generally request high throughput; and applications whose datagrams carry control information generally request high reliability. If TOS is not supported during the lifetime of the datagram, the TOS field is set to 0x00.

4) Total length: Occupies 16 bits and gives the length of the entire IP datagram (header area + data area) in bytes. The starting position and length of the data in the IP datagram can be calculated from the header length field and the total length field. Since this field is 16 bits wide, an IP datagram can theoretically be up to 65535 bytes long (in practice it is much smaller because of physical network limitations).

5) Time to Live (TTL, time to live): Occupies 8 bits and specifies the maximum time that a datagram may remain in the network. In practice, the field is used as the maximum number of routers that the datagram can pass through. The initial value of TTL is set by the source host (usually 32, 64, 128, or 255; an 8-bit field cannot hold 256), and each router that handles the datagram decrements it by 1. When the field reaches 0, the datagram is discarded and an ICMP message is sent to notify the source host, which prevents a datagram caught in a routing loop from circulating endlessly.

6) Upper-layer protocol identification: Occupies 8 bits. IP can carry various upper-layer protocols. Using this protocol identifier, the destination host can hand the received IP datagram to the upper-layer protocol that processes it, such as TCP or UDP.

Common Internet Protocol Numbers:

7) Checksum: Occupies 16 bits and is used to verify the protocol header, ensuring the correctness and integrity of the IP header area during transmission. The header checksum is calculated over the IP header only; it does not cover the data that follows the header.

Principle: The sender first sets the checksum field to 0, then computes the one's-complement sum of all 16-bit words in the header and stores the one's complement of that sum in the checksum field. The receiver computes the same one's-complement sum over the received header, this time including the checksum that the sender stored. If the header was not corrupted during transmission, the receiver's result is all 1s.

8) Source address: Occupies 32 bits and gives the IP address of the sender.

9) Destination address: Occupies 32 bits and gives the IP address of the destination.
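The fields and the checksum principle described above can be sketched in Python as follows. This is a minimal illustration for an option-free 20-byte header, not a full implementation: the function names, the addresses, and the defaults (protocol 17 for UDP, TTL 64) are assumptions made for the example.

```python
import socket
import struct

HEADER_FORMAT = "!BBHHHBBH4s4s"  # 20-byte header without options

def internet_checksum(data: bytes) -> int:
    """Return the one's complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    # Fold any carry out of the low 16 bits back in (end-around carry).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(src: str, dst: str, payload_len: int,
                 proto: int = 17, ttl: int = 64, ident: int = 0) -> bytes:
    """Build a header with version 4, header length 5 and a valid checksum."""
    ver_ihl = (4 << 4) | 5           # version in the high 4 bits, length in the low 4
    total_len = 5 * 4 + payload_len  # header bytes + data bytes
    header = struct.pack(HEADER_FORMAT, ver_ihl, 0, total_len, ident, 0,
                         ttl, proto, 0,  # checksum field is 0 while summing
                         socket.inet_aton(src), socket.inet_aton(dst))
    checksum = internet_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:]

def parse_header(header: bytes) -> dict:
    """Decode the fixed 20-byte part of an IP header."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack(HEADER_FORMAT, header[:20])
    return {
        "version": ver_ihl >> 4,
        "header_bytes": (ver_ihl & 0x0F) * 4,  # field counts 32-bit words
        "tos": tos,
        "total_length": total_len,
        "identification": ident,
        "ttl": ttl,
        "protocol": proto,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

hdr = build_header("10.0.0.1", "10.0.0.2", payload_len=8)
# The receiver sums the header including the stored checksum. An intact header
# sums to all 1s, so the complemented result returned here is 0.
header_is_valid = internet_checksum(hdr) == 0
```

Because internet_checksum returns the complement of the sum, "the sum is all 1s" on the receiving side shows up as a return value of 0.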

=======================IP Datagram Fragmentation and Reassembly=======================

Maximum Transmission Unit:

When an IP datagram travels across the Internet from source to destination, it may pass through multiple physical networks. Different networks have different link-layer and medium characteristics, so each imposes a limit on the maximum length of a data frame. This limit is the maximum transmission unit (MTU).

When two hosts on the same network communicate, the MTU of that network is fixed and fragmentation is not an issue. Fragmentation generally arises only in internetworks whose networks have different MTU values. Since networks in today's Internet are mainly connected by routers, fragmentation is usually performed by routers.

When communication between two hosts crosses multiple networks with different MTU values, the bottleneck is the smallest MTU on the communication path, which is called the path MTU. Since routing is not necessarily symmetric (the route from A to B may differ from the route from B to A), the path MTU is not necessarily the same in both directions. The following table shows the MTU values of several commonly used networks:

Fragmentation:

The process of dividing a datagram into multiple smaller datagrams for network transmission is called fragmentation. After fragmentation, each fragment is an IP datagram in its own right and may reach the destination host by a different path.

An IP datagram may or may not be fragmented in transit. If fragmented, the fragmented IP datagram has the same structure as the original IP datagram without fragmentation, that is, it is also composed of two parts: the IP header and the IP data area:

In a fragment, the data area is a contiguous part of the original IP datagram's data area, and the header is a copy of the original IP datagram's header. The fragment header differs from the original unfragmented header in two main fields: the flags and the fragment offset:

(1) - Flag: There is a field called "flag" in the IP datagram header, which is represented by a 3-bit binary number:

If the DF (Don't Fragment) flag is set to 1, the datagram must not be fragmented during transmission. For example, the network connectivity test command ping can set this flag (with the -f parameter of the Windows ping command). In that case, when the datagram cannot pass through a network with a smaller MTU, it is discarded and an ICMP destination unreachable error (fragmentation needed but DF set) is generated.

If the MF (More Fragments) flag is set to 1, the fragment is not the last fragment of the original datagram; in the last fragment this bit is set to 0.

(2) - Fragment offset: After the IP datagram is fragmented, the position of each fragment's data area within the original data area is given by the 13-bit fragment offset. In the above figure, the byte offset of fragment 1 is 0, that of fragment 2 is 600, and that of fragment 3 is 1200. In the IP header, however, the offset is stored in units of 8 bytes, so the offset field of fragment 1 is 0, that of fragment 2 is 75, and that of fragment 3 is 150.
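The offset arithmetic can be sketched as below. The function name and return layout are illustrative, and the 1500-byte data area and 620-byte MTU are assumed values chosen only so that the byte offsets come out as 0, 600 and 1200 as in the text; the article does not state the sizes used in its figure.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20):
    """Return (byte_offset, offset_field, data_len, mf) for each fragment.

    Assumes every fragment carries an option-free header of header_len bytes.
    """
    # Every fragment except the last must carry a multiple of 8 data bytes,
    # because the offset field counts in 8-byte units.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    offset = 0
    while offset < payload_len:
        data_len = min(max_data, payload_len - offset)
        more_fragments = 1 if offset + data_len < payload_len else 0
        fragments.append((offset, offset // 8, data_len, more_fragments))
        offset += data_len
    return fragments

# Hypothetical datagram: 1500 data bytes sent over a link with MTU 620.
for frag in fragment(1500, 620):
    print(frag)
# prints:
# (0, 0, 600, 1)
# (600, 75, 600, 1)
# (1200, 150, 300, 0)
```

Only the last fragment has MF set to 0, which is what lets the receiver recognize the end of the datagram.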

Reassembly:

When the fragmented IP datagram arrives at the final target host, the target host assembles each fragment and restores it to the IP datagram sent by the source host. This process is called IP datagram reassembly.

In the IP datagram header, the "Identification" field is a 16-bit number that uniquely identifies each datagram sent by the host. When a datagram is fragmented, each fragment copies the value of the datagram's "Identification" field unchanged, so all fragments of a datagram have the same identification.

The principle for the target host to reassemble the datagram is as follows:

(1)—According to the "Identification" field, it can be determined which IP datagram the received fragment belongs to;

(2)—According to the "More Fragments (MF)" subfield of the "flag" field, it can be determined whether the fragment is the last fragment;

(3)—According to the "offset" field, the position of the fragment in the original datagram can be determined.
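The three rules above can be sketched as follows. This is a simplified illustration only: the tuple layout (identification, offset field, MF, data) and the function name are assumptions for the example, and it ignores details such as overlapping or duplicate fragments and reassembly timeouts.

```python
def reassemble(fragments):
    """Rebuild complete datagram data areas from received fragments.

    fragments: iterable of (ident, offset_field, mf, data) tuples.
    Returns a dict mapping ident to the reassembled data bytes; datagrams
    that are still incomplete are left out.
    """
    # Rule 1: group fragments by the Identification field.
    by_id = {}
    for ident, offset_field, mf, data in fragments:
        # Rule 3: the offset field counts 8-byte units.
        by_id.setdefault(ident, []).append((offset_field * 8, mf, data))

    complete = {}
    for ident, parts in by_id.items():
        parts.sort(key=lambda part: part[0])
        # Rule 2: the fragment with the highest offset must have MF = 0.
        if parts[-1][1] != 0:
            continue
        buffer = b""
        contiguous = True
        for byte_offset, _mf, data in parts:
            if byte_offset != len(buffer):  # a fragment is missing
                contiguous = False
                break
            buffer += data
        if contiguous:
            complete[ident] = buffer
    return complete

# Fragments may arrive out of order; datagram 9 is still missing its tail.
received = [
    (7, 75, 1, b"B" * 600),
    (7, 0, 1, b"A" * 600),
    (7, 150, 0, b"C" * 300),
    (9, 0, 1, b"X" * 8),
]
result = reassemble(received)
```

Here result contains only datagram 7, with its 1500 data bytes back in their original order.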

=========================IP Datagram Options=========================

IP datagram "options" have two main functions:

1) To control how the datagram is transmitted, for example by specifying the route the datagram must take;

2) To test the network, for example by recording which routers a datagram passes through during transmission.

IP "options" are divided into four categories, each category contains several options, and each option has its own number:

An IP datagram "option" consists of three parts: option code, option length, and option data. The option code and option length each occupy one byte, in which the option length is used to determine the length of the entire option part; the option code is further divided into copy, option class and option number:

Copy: A bit that controls how the option is handled when an IP datagram carrying options is fragmented. When this bit is set to 1, the option is copied into all fragments; when it is set to 0, the option is copied into the first fragment only.

The option class and option number together identify which option of which category this is, and therefore what the option does.
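The split of the option code byte can be sketched in Python as below. The bit layout used (1 copy bit, 2 class bits, 5 number bits) follows the standard IPv4 option format, and the function name is illustrative; the sample values 0x89, 0x83 and 0x44 are option codes mentioned in this article.

```python
def decode_option_code(code: int):
    """Split an option code byte into (copy, option_class, option_number)."""
    copy = (code >> 7) & 0x01          # highest bit
    option_class = (code >> 5) & 0x03  # next 2 bits
    option_number = code & 0x1F        # lowest 5 bits
    return copy, option_class, option_number

print(decode_option_code(0x89))  # prints (1, 0, 9)
print(decode_option_code(0x83))  # prints (1, 0, 3)
print(decode_option_code(0x44))  # prints (0, 2, 4)
```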

1) Source routing: The route that an IP datagram takes through the Internet is specified by the source host that sends it, rather than being chosen automatically by the IP layer of each router as is normally the case.

By setting source routing options, you can test the connectivity of a specified route, make datagrams bypass faulty networks, or test the throughput of a specific network. Source routing is divided into two categories: strict source routing and loose source routing.

(1)—Strict source routing: The sender specifies every router that the IP datagram must pass through. There must be no intermediate routers between adjacent routers in the list, and the order of the routers cannot be changed. If a router finds that the next router specified in the source route is not on one of its directly connected networks, it returns a "source route failed" ICMP error message. The strict source routing option format is as follows:

The option code field is 1 00 01001 (0x89): copy bit 1, option class 0, option number 9. The option length is at most 39 bytes, which holds 9 IP addresses. Because the IP header length field has only 4 bits, the entire IP header can contain at most 15 (2^4 - 1) 32-bit words (i.e. 60 bytes). The fixed part of the IP header takes 20 bytes, and the option code, option length and pointer take 3 bytes, so 60-20-3=37 bytes remain for the IP address list, which is enough for only 9 IP addresses (36 bytes).

(2)—Loose source routing: The sender specifies a list of IP addresses that the datagram must pass through, but along the path there may be other routers between any two consecutive addresses in the list. The format is the same as for strict source routing, except that the option code field value is 0x83.

2) Record route: With the record route option set, the IP datagram records the IP address of each router on the path from the source host to the destination host. The format of the record route option is the same as that of strict source routing, but the option code is 0x07 (copy bit 0, option class 0, option number 7), and the initial value of the pointer is 4, pointing to the position where the first IP address will be stored. As each router stores its IP address in the data area of the option, the pointer increases (from 4 to 8, 12, 16, and up to 36), so that it always points to the position where the next IP address will be stored. When 9 IP addresses have been recorded, the pointer is 40, indicating that the data area is full.

3) Record timestamp: Each time the IP datagram passes through a router, the router records its IP address and the current time. The timestamp is in milliseconds and is generally the number of milliseconds since midnight, Universal Time (UT). The format of the timestamp option is as follows:

The option code for the timestamp option is 0x44. The option length gives the total length of the option (usually 36 or 40), and the pointer points to the next available position (values 5, 9, 13, etc.).

The "overflow OF" field indicates the number of timestamps that cannot be recorded due to insufficient space in the timestamp option data area;

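The "Flag FL" field controls the format of the timestamp option. Its values are: 0, record timestamps only; 1, record each router's IP address together with its timestamp; 3, the sender pre-specifies a list of IP addresses, and a router records its timestamp only if its own address matches the next address in the list.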
