Network basics (4): protocol format

Packet encapsulation

  The transport layer and everything below it are provided by the kernel, while the application layer is provided by user processes. The application interprets the meaning of the communicated data; the transport layer and the layers below handle the details of communication, delivering data from one computer to another along some path. When application-layer data is sent onto the network through the protocol stack, each layer's protocol adds its own header to the data. This is called encapsulation, as shown below:
          [Figure: encapsulation of data through the protocol stack]
  Different protocol layers have different names for a data packet: at the transport layer it is called a segment, at the network layer a datagram, and at the link layer a frame. Once the data has been encapsulated into a frame, it is sent onto the transmission medium. When it reaches the destination host, each layer strips off its corresponding header, and finally the application-layer data is handed to the application for processing.
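  The nesting described above can be sketched as a toy model in Python. The header contents here are placeholder bytes, not real protocol headers (an assumption made for illustration); only the order of wrapping and the typical header sizes are meant to be realistic.

```python
# Toy model of encapsulation: each layer wraps the data from the layer above.
# The header bytes are placeholders (assumption); only their sizes are typical.
TRANSPORT_HEADER = b"T" * 8   # e.g. a UDP header -> "segment"
IP_HEADER = b"I" * 20         # IP header without options -> "datagram"
ETH_HEADER = b"E" * 14        # Ethernet header -> "frame"
ETH_TRAILER = b"C" * 4        # CRC at the end of the frame


def encapsulate(app_data: bytes) -> bytes:
    """Wrap application data layer by layer on the sending host."""
    segment = TRANSPORT_HEADER + app_data
    datagram = IP_HEADER + segment
    frame = ETH_HEADER + datagram + ETH_TRAILER
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Strip each layer's header again on the receiving host."""
    datagram = frame[len(ETH_HEADER):-len(ETH_TRAILER)]
    segment = datagram[len(IP_HEADER):]
    return segment[len(TRANSPORT_HEADER):]
```

  Calling decapsulate(encapsulate(data)) returns the original application data, which mirrors how the receiving host hands the payload back to the application.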


Ethernet frame format

  The frame format of Ethernet is as follows:
          [Figure: Ethernet frame format]
  The source and destination addresses are the hardware addresses of the network cards (also called MAC addresses). They are 48 bits long and are burned into the card at the factory. You can view them with the ifconfig command in the shell; the "HWaddr 00:15:F2:14:9E:3F" part of the output is the hardware address. The protocol (type) field takes one of three values, corresponding to IP (0x0800), ARP (0x0806), and RARP (0x8035). The frame ends with a CRC checksum.
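  As a minimal sketch, the 14-byte Ethernet header can be unpacked with Python's struct module. The function and dictionary names are made up for this illustration; the sample bytes in the usage note are the header of the ARP request frame analyzed later in this article.

```python
import struct

# Protocol (type) field values named in the text above.
ETHERTYPES = {0x0800: "IP", 0x0806: "ARP", 0x8035: "RARP"}


def format_mac(raw: bytes) -> str:
    """Render a 6-byte hardware address as colon-separated hex."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_ethernet_header(frame: bytes):
    """Return (destination MAC, source MAC, protocol type) of a frame."""
    # "!" = network byte order; 6s6sH = two 6-byte addresses, one 16-bit type
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return format_mac(dst), format_mac(src), ethertype
```

  For example, parse_ethernet_header(bytes.fromhex("ffffffffffff00055d6158a80806")) yields the broadcast address as destination, 00:05:5d:61:58:a8 as source, and the type 0x0806, which ETHERTYPES maps to ARP.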
  The data portion of an Ethernet frame has a minimum length of 46 bytes and a maximum of 1500 bytes. ARP and RARP packets are shorter than 46 bytes, so padding must be appended to them. The maximum of 1500 is called the Maximum Transmission Unit (MTU) of Ethernet; different network types have different MTUs. If a packet is routed from Ethernet onto a dial-up link and its length exceeds the MTU of the dial-up link, the packet must be fragmented (fragmentation). The output of the ifconfig command also contains "MTU:1500". Note that the MTU is the maximum length of the payload in a frame and does not include the frame header.
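  The two size rules above can be illustrated with a short sketch. The function names are hypothetical, and the MTU of 576 bytes used for the dial-up link in the usage note is an assumed value chosen only for the arithmetic. The fragmentation plan relies on one fact about IPv4: fragment offsets are counted in units of 8 bytes, so every fragment except the last must carry a multiple of 8 payload bytes.

```python
ETH_MIN_PAYLOAD = 46   # minimum data length in an Ethernet frame
ETH_MTU = 1500         # maximum data length (MTU) of Ethernet


def pad_payload(payload: bytes) -> bytes:
    """Pad a short payload with zero bytes up to the 46-byte minimum."""
    if len(payload) > ETH_MTU:
        raise ValueError("payload exceeds the Ethernet MTU")
    return payload.ljust(ETH_MIN_PAYLOAD, b"\x00")


def plan_fragments(total_length: int, mtu: int, header_len: int = 20):
    """Split an IP datagram of total_length bytes for a link with the given MTU.

    Returns a list of (offset in 8-byte units, payload bytes, more-fragments flag).
    """
    payload = total_length - header_len
    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    max_chunk = (mtu - header_len) // 8 * 8
    fragments = []
    sent = 0
    while sent < payload:
        chunk = min(max_chunk, payload - sent)
        more = sent + chunk < payload
        fragments.append((sent // 8, chunk, more))
        sent += chunk
    return fragments
```

  With these definitions, a 28-byte ARP packet is padded to 46 bytes, and plan_fragments(1500, 576) splits a full-size Ethernet datagram into three fragments carrying 552, 552, and 376 payload bytes.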

ARP datagram format

  During network communication, the application on the source host knows the IP address and port number of the destination host but not its hardware address. A packet is first received by the network card and only then processed by the upper-layer protocols; if the destination hardware address of a received packet does not match the local machine's, the packet is discarded immediately. The hardware address of the destination host must therefore be obtained before communication can begin, and this is the job of the ARP protocol. The source host sends an ARP request asking "what is the hardware address of the host whose IP address is 192.168.0.1?" and broadcasts it to the local network segment (a destination hardware address of FF:FF:FF:FF:FF:FF in the Ethernet header means broadcast). The destination host receives the broadcast, sees that the IP address in the request is its own, and sends an ARP reply to the source host with its own hardware address filled in.
  Each host maintains an ARP cache table, which can be viewed with the arp -a command. Entries in the cache have an expiration time (typically 20 minutes). If an entry is not used again within 20 minutes, it becomes invalid, and the next time an ARP request must be sent to obtain the destination host's hardware address. Think about it: why should entries expire instead of remaining valid forever?
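  The expiry behaviour of the cache can be modelled with a small dictionary-based sketch. This is a simplified illustration, not how any particular kernel implements its ARP table; the class name and the injectable clock parameter are assumptions made so the example can be exercised without waiting 20 minutes.

```python
import time


class ArpCache:
    """Toy ARP cache: entries expire if unused for ttl_seconds."""

    def __init__(self, ttl_seconds=20 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can fake time
        self._entries = {}           # ip -> (mac, expiry time)

    def add(self, ip, mac):
        self._entries[ip] = (mac, self._clock() + self._ttl)

    def lookup(self, ip):
        """Return the cached MAC, or None if an ARP request is needed."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[ip]    # stale: force a fresh ARP request
            return None
        # Using the entry restarts its timer.
        self._entries[ip] = (mac, self._clock() + self._ttl)
        return mac
```

  A lookup that returns None corresponds to the case in which the host has to broadcast a new ARP request.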
  The format of the ARP datagram is as follows:
        [Figure: ARP datagram format]
  The source and destination MAC addresses appear once in the Ethernet header and once in the ARP packet. This is redundant when the link layer is Ethernet, but may be necessary when the link layer is some other type of network. The hardware type is the link-layer network type, where 1 means Ethernet. The protocol type is the type of address to be translated, where 0x0800 means an IP address. The two address lengths that follow are 6 and 4 (bytes) for Ethernet addresses and IP addresses respectively. An op field of 1 indicates an ARP request, and 2 indicates an ARP reply.
Let's look at a specific example.

The request frame is as follows (for clarity, each line is prefixed with its byte offset; each line holds 16 bytes):
Ethernet header (14 bytes)
0000: ff ff ff ff ff ff 00 05 5d 61 58 a8 08 06
ARP frame (28 bytes)
0000: 00 01
0010: 08 00 06 04 00 01 00 05 5d 61 58 a8 c0 a8 00 37
0020: 00 00 00 00 00 00 c0 a8 00 02
Padding (18 bytes)
0020: 00 77 31 d2 50 10
0030: fd 78 41 d3 00 00 00 00 00 00 00 00
Ethernet header: the destination is the broadcast address, the source host's MAC address is 00:05:5d:61:58:a8, and the upper-layer protocol type 0x0806 means ARP.
ARP frame: hardware type 0x0001 means Ethernet, protocol type 0x0800 means IP, the hardware address (MAC address) length is 6, the protocol address (IP address) length is 4, and op 0x0001 means a request for the destination host's MAC address. The source host's MAC address is 00:05:5d:61:58:a8, the source host's IP address is c0 a8 00 37 (192.168.0.55), the destination host's MAC address is all zeros because it is still to be filled in, and the destination host's IP address is c0 a8 00 02 (192.168.0.2).
Since Ethernet requires a minimum data length of 46 bytes and the ARP frame is only 28 bytes long, it is followed by 18 bytes of padding. The content of the padding is undefined and depends on the implementation.

The response frame is as follows:
Ethernet header
0000: 00 05 5d 61 58 a8 00 05 5d a1 b8 40 08 06
ARP frame
0000: 00 01
0010: 08 00 06 04 00 02 00 05 5d a1 b8 40 c0 a8 00 02
0020: 00 05 5d 61 58 a8 c0 a8 00 37
Padding
0020: 00 77 31 d2 50 10
0030: fd 78 41 d3 00 00 00 00 00 00 00 00
Ethernet header: the destination host's MAC address is 00:05:5d:61:58:a8, the source host's MAC address is 00:05:5d:a1:b8:40, and the upper-layer protocol type 0x0806 means ARP.
ARP frame: hardware type 0x0001 means Ethernet, protocol type 0x0800 means IP, the hardware address (MAC address) length is 6, the protocol address (IP address) length is 4, and op 0x0002 means a response. The source host's MAC address is 00:05:5d:a1:b8:40, the source host's IP address is c0 a8 00 02 (192.168.0.2), the destination host's MAC address is 00:05:5d:61:58:a8, and the destination host's IP address is c0 a8 00 37 (192.168.0.55).
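The field-by-field reading above can be reproduced with a short parser. This is a sketch that assumes the Ethernet/IPv4 case only (address lengths 6 and 4); the function name and the dictionary keys are chosen for this illustration.

```python
import struct


def parse_arp(packet: bytes) -> dict:
    """Decode the 28-byte ARP packet used for IPv4 over Ethernet."""
    # 2+2+1+1+2 bytes of fixed fields, then MAC(6) IP(4) MAC(6) IP(4)
    (hw_type, proto_type, hw_len, proto_len, op,
     sender_mac, sender_ip, target_mac, target_ip) = struct.unpack(
        "!HHBBH6s4s6s4s", packet[:28])

    def mac(raw):
        return ":".join(f"{b:02x}" for b in raw)

    def ip(raw):
        return ".".join(str(b) for b in raw)

    return {
        "hw_type": hw_type, "proto_type": proto_type,
        "hw_len": hw_len, "proto_len": proto_len,
        "op": "request" if op == 1 else "reply" if op == 2 else op,
        "sender_mac": mac(sender_mac), "sender_ip": ip(sender_ip),
        "target_mac": mac(target_mac), "target_ip": ip(target_ip),
    }
```

Feeding it the 28 ARP bytes of the request frame gives op "request", sender 192.168.0.55, and target 192.168.0.2 with an all-zero target MAC; the response frame gives op "reply" with the roles swapped.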

IP datagram format

      [Figure: IP datagram format]
  The header length and the data length of an IP datagram are both variable; the header length is always an integer multiple of 4 bytes. For IPv4, the 4-bit version field is 4. The 4-bit header length is expressed in units of 4 bytes. Its minimum value is 5, so the minimum header length is 4x5=20 bytes, which is an IP header without any options; the maximum value that 4 bits can represent is 15, so the maximum header length is 60 bytes. In the 8-bit TOS field, 3 bits specify the priority of the datagram (now obsolete), 4 bits select the type of service (minimum delay, maximum throughput, maximum reliability, minimum cost), and the remaining bit is always 0. The total length is the number of bytes in the entire datagram (IP header plus IP-layer payload). The 16-bit identifier is typically incremented by 1 for each datagram sent and is used for fragmentation and reassembly. The 3-bit flags and the 13-bit fragment offset are also used for fragmentation. TTL (Time to Live) works like this: the source host sets a time to live for the packet, such as 64, and every router the packet passes through decrements it by 1. If the value reaches 0, the route is too long and the destination still has not been reached, so the packet is discarded. The unit of the time to live is therefore hops, not seconds. The protocol field indicates whether the upper-layer protocol is TCP, UDP, ICMP, or IGMP. The checksum covers only the IP header; verifying the data is the responsibility of the higher-layer protocol. An IPv4 address is 32 bits long.
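  As a rough sketch, the fixed 20-byte part of the header can be decoded and its checksum verified as follows. The function names and dictionary keys are made up for this illustration. The checksum routine implements the standard Internet checksum (the ones' complement of the ones' complement sum of the header's 16-bit words); a header that arrives intact sums to 0.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """Ones' complement of the ones' complement sum of 16-bit words."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_ipv4_header(data: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header."""
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack(
        "!BBHHHBBH4s4s", data[:20])
    flags = flags_frag >> 13
    return {
        "version": ver_ihl >> 4,
        "header_length": (ver_ihl & 0x0F) * 4,   # field is in 4-byte units
        "tos": tos,
        "total_length": total_length,
        "identifier": ident,
        "dont_fragment": bool(flags & 0b010),
        "more_fragments": bool(flags & 0b001),
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,                    # 17 = UDP, 6 = TCP
        "checksum": checksum,
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
    }
```

  Calling ipv4_checksum on a received header, checksum field included, should return 0; calling it on a header whose checksum field is zeroed returns the value the sender has to fill in.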

UDP datagram format

      [Figure: UDP datagram format]
The following analyzes a frame of TFTP, a UDP-based protocol.
Ethernet header
0000: 00 05 5d 67 d0 b1 00 05 5d 61 58 a8 08 00
IP header
0000: 45 00
0010: 00 53 93 25 00 00 80 11 25 ec c0 a8 00 37 c0 a8
0020: 00 01
UDP header
0020: 05 d4 00 45 00 3f ac 40
TFTP protocol
0020: 00 01 'c' ':' '\' 'q'
0030: 'w' 'e' 'r' 'q' '.' 'q' 'w' 'e' 00 'n' 'e' 't' 'a' 's' 'c' 'i'
0040: 'i' 00 'b' 'l' 'k' 's' 'i' 'z' 'e' 00 '5' '1' '2' 00 't' 'i'
0050: 'm' 'e' 'o' 'u' 't' 00 '1' '0' 00 't' 's' 'i' 'z' 'e' 00 '0'
0060: 00
Ethernet header: the source MAC address is 00:05:5d:61:58:a8, the destination MAC address is 00:05:5d:67:d0:b1, and the upper-layer protocol type 0x0800 means IP.
  IP header: the first byte, 0x45, contains the 4-bit version number and the 4-bit header length. The version number is 4, that is, IPv4, and the header length is 5, which means the IP header has no options field. The type of service is 0, so no special service is requested. The 16-bit total length field (IP header plus IP-layer payload) is 0x0053, that is, 83 bytes; adding the 14 bytes of the Ethernet header gives a total frame length of 97 bytes. The IP identifier is 0x9325. The flags and fragment offset fields are 0x0000: DF=0 allows fragmentation, MF=0 means no more fragments follow, and the fragment offset is 0. TTL is 0x80, that is, 128. The upper-layer protocol 0x11 means UDP. The IP header checksum is 0x25ec, the source host IP is c0 a8 00 37 (192.168.0.55), and the destination host IP is c0 a8 00 01 (192.168.0.1).
UDP header: the source port number 0x05d4 (1492) is the client's port number, and the destination port number 0x0045 (69) is the well-known port number of the TFTP service. The UDP length is 0x003f, that is, 63 bytes, which covers the UDP header and the UDP-layer payload. The checksum over the UDP header and the UDP-layer payload is 0xac40.
TFTP is a text-based protocol in which each field is terminated by a zero byte. The leading 00 01 is the opcode and indicates a request to read a file. The fields that follow are:
c:\qwerq.qwe
netascii
blksize 512
timeout 10
tsize 0
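The UDP header and the TFTP request above can be decoded with a few lines of Python. This is a sketch for this one kind of packet (a read request with options); the function names are made up for the illustration.

```python
import struct


def parse_udp_header(segment: bytes):
    """Return (source port, destination port, length, checksum)."""
    return struct.unpack("!HHHH", segment[:8])


def parse_tftp_request(payload: bytes):
    """Split a TFTP request into opcode, file name, mode, and options."""
    opcode = struct.unpack("!H", payload[:2])[0]
    # Every field ends with a zero byte, so the final split piece is empty.
    fields = [f.decode("ascii") for f in payload[2:].split(b"\x00")[:-1]]
    filename, mode = fields[0], fields[1]
    # Remaining fields come in name/value pairs.
    options = dict(zip(fields[2::2], fields[3::2]))
    return opcode, filename, mode, options
```

Applied to the frame above, parse_udp_header returns ports 1492 and 69 with length 63, and parse_tftp_request returns opcode 1, the file name c:\qwerq.qwe, the mode netascii, and the three options blksize, timeout, and tsize.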
  Most network communication works like TFTP: the two parties are a client and a server. The client actively initiates requests (the example above is a request frame sent by the client), while the server passively waits for, receives, and responds to them. The client's IP address and port number uniquely identify the TFTP client process on its host, and the server's IP address and port number uniquely identify the TFTP server process on its host. Because the client is the party that initiates the request, it must know the server's IP address and the port number of the TFTP server process. For this reason, common network protocols have default server ports: HTTP defaults to TCP port 80, FTP to TCP port 21, and TFTP to UDP port 69 (as in the example above). When using a client program you must specify the server's host name or IP address; if you do not explicitly specify a port number, the default port is used. See the man pages of ftp, tftp, and similar programs to learn how to specify a port number. The file /etc/services lists the well-known service ports and their transport-layer protocols, which are assigned by the Internet Assigned Numbers Authority (IANA). Some services can use either TCP or UDP; for clarity, IANA assigns such a service the same default port number under both TCP and UDP. In other cases, the same port number under TCP and under UDP corresponds to different services.
  Many services have well-known port numbers, but a client program's port number does not need to be well known. Usually the system automatically allocates a free port number each time the client program runs and releases it when the program is done. Such a port number is called ephemeral. Think about why.
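  One way to observe an ephemeral port is to bind a UDP socket to port 0, which asks the system to pick a free port. This is a small sketch with a made-up function name; real clients usually skip the explicit bind and let the system assign the port on first send, and the number returned differs from run to run.

```python
import socket


def get_ephemeral_port() -> int:
    """Ask the OS for a free UDP port on the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("127.0.0.1", 0))      # port 0 = let the system choose
        return sock.getsockname()[1]     # the port the system picked
```

  The port is released again when the socket is closed at the end of the with block.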
  As mentioned earlier, UDP is connectionless and does not guarantee reliable transmission. For example:
  The UDP layer at the sender simply encapsulates the data from the application layer into a segment and hands it to the IP layer, at which point its job is done. If a network failure prevents the segment from reaching the other side, the UDP layer does not report any error to the application layer.
  The UDP layer at the receiver simply delivers the received data to the appropriate application according to the port number, at which point its job is done. If the sender sends several packets that travel over different routes, they may arrive at the receiver out of order, and the UDP layer does not guarantee that they are delivered to the application layer in the order in which they were sent.
  The UDP layer at the receiver usually places received data in a fixed-size buffer until the application fetches and processes it. If the application processes data slowly while the sender sends quickly, packets are lost, and the UDP layer does not report this error.
  Applications that use UDP must therefore consider these potential problems and implement appropriate solutions, such as waiting for responses, retransmitting on timeout, numbering packets, and flow control. In general, applications that use UDP are relatively simple and only send messages that do not require high reliability, rather than large amounts of data. For example, the UDP-based TFTP protocol is generally used only for transferring small files (hence the name Trivial FTP), while the TCP-based FTP protocol is suitable for transferring all kinds of files. Next, we look at how TCP uses a connection-oriented service to solve the reliability problem on the application's behalf.
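  The measures listed above (waiting for a response, retransmitting on timeout, numbering packets) can be sketched as a stop-and-wait scheme. This is a simulation, not real socket code: the network is modelled by two caller-supplied functions, packet_lost and ack_lost, which are hypothetical names for this illustration and decide which transmissions are dropped.

```python
class Receiver:
    """Accepts packets in order and acknowledges every packet it sees."""

    def __init__(self):
        self.expected_seq = 0
        self.received = []

    def on_packet(self, seq, data):
        if seq == self.expected_seq:     # new packet: deliver it once
            self.received.append(data)
            self.expected_seq += 1
        return seq                       # duplicates are re-acknowledged


def send_reliably(messages, receiver, packet_lost, ack_lost, max_retries=5):
    """Send numbered messages one at a time; returns total transmissions."""
    transmissions = 0
    for seq, data in enumerate(messages):
        for _ in range(max_retries + 1):
            transmissions += 1
            if packet_lost(transmissions):
                continue                 # no response: time out and resend
            ack = receiver.on_packet(seq, data)
            if ack_lost(transmissions):
                continue                 # response lost: time out and resend
            if ack == seq:
                break                    # acknowledged, move to next message
        else:
            raise TimeoutError(f"message {seq} was never acknowledged")
    return transmissions
```

  The sequence number is what lets the receiver discard the duplicate that arrives when only the acknowledgement was lost. This is the kind of bookkeeping that an application on top of UDP has to do for itself.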

Origin blog.csdn.net/qq_40329851/article/details/114810447