Chapter 1. Detailed explanation of the TCP/IP protocol family

The TCP/IP protocol family contains many protocols. In this chapter, we introduce several of them: the ICMP, ARP, and DNS protocols. Learning them is very helpful for understanding network communication.

First, TCP/IP protocol family architecture and main protocols.

The TCP/IP protocol family is divided into four layers, from bottom to top: data link layer, network layer, transport layer, and application layer.

Layer               Protocols and programs    Runs in
Application layer   ping, OSPF, DNS           user space
Transport layer     TCP, UDP                  kernel space
Network layer       ICMP, IP                  kernel space
Data link layer     ARP, data-link, RARP      kernel space

1. (1) Data link layer: It implements the network driver for the network interface card and handles the transmission of data on the physical medium (such as Ethernet or Token Ring).

(2) Two commonly used protocols: ARP (Address Resolution Protocol) and RARP (Reverse Address Resolution Protocol). They convert between IP addresses and machine physical addresses (MAC addresses).

(3) The network layer uses an IP address to find a machine, while the data link layer uses a MAC address. The network layer must therefore convert the IP address of the target machine to its physical address before it can use the services provided by the data link layer, which is what the ARP protocol is for.

2. (1) Network layer: The network layer implements routing and forwarding of data packets. A WAN usually uses many routers to connect scattered hosts or LANs. Therefore, two communicating hosts are generally not directly connected, but are connected through multiple intermediate nodes (routers). The role of the network layer is to select these intermediate nodes and so determine the path between the hosts. The network layer hides the details of the network topology from upper-layer protocols, so that from the perspective of the transport layer and the network application, the two sides of the communication appear to be directly connected.

(2) The core of the network layer is the IP protocol. The IP protocol decides how to deliver a packet according to its destination IP address. If the packet cannot be sent directly to the target host, the IP protocol finds a suitable next-hop router for it and hands the packet to that router for forwarding. After this process is repeated many times, the packet eventually reaches the destination host, or is dropped because delivery failed. The IP protocol therefore determines the communication path hop by hop.

(3) Another important protocol at the network layer is ICMP (Internet Control Message Protocol). It is an important complement to the IP protocol and is mainly used to check network connectivity. The packet format used by the ICMP protocol is as follows:

8-bit type | 8-bit code | 16-bit checksum
Message body (its content depends on the type of the message)

The "8-bit type" field distinguishes the packet type. It divides ICMP packets into two categories. The first is error packets, which are mainly used to respond to network errors, such as destination unreachable (type value 3) and redirect (type value 5). The second is query packets, which are used to query network information. For example, the ping program uses ICMP packets to check whether the target is reachable (type value 8). Some ICMP messages also use the 8-bit code field to further subdivide different conditions.

The "16-bit checksum" field holds a checksum computed over the entire message (header and body), which the receiver uses to check whether the message was corrupted during transmission. Note that this is the Internet checksum (the 16-bit one's complement of the one's complement sum of the message), not a CRC. Different ICMP message types have different body content, and we will discuss host redirect messages in detail later.
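The header layout and checksum described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this example, and the identifier and sequence number fields in the echo request body come from RFC 792 rather than from the text above.

```python
import struct

ICMP_ECHO_REQUEST = 8  # type value used by ping, as mentioned above


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Pack type, code, checksum, then the echo-specific body (assumed from RFC 792)."""
    # The checksum field is zero while the checksum is being computed.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload


packet = build_echo_request(identifier=0x1234, sequence=1, payload=b"ping")
# A receiver recomputes the checksum over the whole message; zero means intact.
intact = internet_checksum(packet) == 0
```

Actually sending such a packet requires a raw socket and usually root privileges, which is why the sketch stops at building the bytes.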

3. Transport layer: (1) It provides end-to-end communication for applications on two hosts. Unlike the hop-by-hop communication used by the network layer, the transport layer only cares about the origin and the destination of the communication, not about how the data packets are relayed in between.

(2) There are three main transport layer protocols: the TCP protocol, the UDP protocol, and the SCTP protocol:

  The TCP protocol (Transmission Control Protocol) provides reliable, connection-oriented and stream-based services for the application layer. The TCP protocol uses timeout retransmission, data acknowledgment and other methods to ensure that the data packet is correctly sent to the destination, so the TCP service is reliable. Both parties using TCP communication must first establish a TCP connection and maintain some necessary data structures for the connection in the kernel, such as the state of the connection, read and write buffers, and so on. At the end of the communication, the connection must be closed to release the kernel data. TCP services are stream based. Stream-based data has no length limit, and it flows continuously from one end to the other. The sender can write data into the data stream byte by byte, and the receiver can also read them byte by byte.

  UDP protocol: (User Datagram Protocol) In contrast to TCP, it provides an unreliable, connectionless, datagram-based service for the application layer. When using the UDP protocol, the application usually has to handle data acknowledgment, timeout retransmission, and similar logic by itself. UDP is connectionless: the two communicating parties do not maintain a long-lived connection, so each time the application sends data, it must explicitly specify the address of the receiver. Each UDP datagram has a length, and the receiver must read the entire contents of a datagram in one call, treating that length as the minimum unit, otherwise the data will be truncated.

  SCTP protocol: (Stream Control Transmission Protocol) A protocol designed for transmitting telephone signals over the Internet.
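The difference between stream-based and datagram-based reads described above can be observed on the loopback interface. The following is a minimal sketch, assuming a Linux-like system where a short UDP read silently discards the rest of the datagram (other systems may raise an error instead). The helper names are hypothetical.

```python
import socket


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes from a stream socket."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        data += chunk
    return data


def udp_short_read() -> tuple:
    """Send one 10-byte datagram, then read it with a 4-byte buffer."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(0.5)
        # Connectionless: the destination address is given on every send.
        sender.sendto(b"0123456789", receiver.getsockname())
        first, _ = receiver.recvfrom(4)
        try:
            leftover, _ = receiver.recvfrom(4)  # the rest of the datagram is gone
        except socket.timeout:
            leftover = b""
        return first, leftover


def tcp_split_read() -> tuple:
    """Send 10 bytes over TCP, then read them in two separate pieces."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        listener.settimeout(2.0)
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            server, _ = listener.accept()
            with server:
                server.settimeout(2.0)
                client.sendall(b"0123456789")
                # The stream has no message boundaries: any split works.
                return recv_exact(server, 4), recv_exact(server, 6)
```

With TCP the second read picks up where the first one stopped, while with UDP the unread part of the datagram is lost, which is why the receiver should always supply a buffer large enough for a whole datagram.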

4. Application layer protocols: (1) ping is an application, not a protocol. It uses ICMP messages to check network connectivity and is an essential tool for debugging a network environment.

(2) The telnet protocol is a remote login protocol that lets us complete remote tasks from the local machine.

(3) OSPF (Open Shortest Path First) is a dynamic routing update protocol that routers use to inform each other of their routing information.

(4) DNS (Domain Name Service) converts machine domain names to IP addresses.

Second, encapsulation.

The data encapsulated by the data link layer is called a frame. The frames are different for different transmission media. Ethernet frames are transmitted on the Ethernet, and Token Ring frames are transmitted on the Token Ring network.

An Ethernet frame uses a 6-byte destination physical address and a 6-byte source physical address to represent both sides of the communication.

The maximum transmission unit (MTU) of a frame is the largest amount of upper-layer protocol data the frame can carry, which is usually limited by the network type. The MTU of an Ethernet frame is 1500 bytes. Because of this, IP packets that are too long may need to be fragmented for transmission.

Field:  destination physical address | source physical address | type    | data          | CRC
Size:   6 bytes                      | 6 bytes                 | 2 bytes | 46~1500 bytes | 4 bytes

A frame is the sequence of bytes that is ultimately transmitted over the physical network. At this point, encapsulation is complete.
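The frame layout above can be sketched with Python's struct module. This is an illustration only, not how a real driver works: the function and constant names are hypothetical, the zero padding up to the 46-byte minimum is an assumption based on the size range in the table, and real network cards compute the CRC in hardware.

```python
import struct
import zlib

ETH_MIN_DATA = 46    # minimum size of the data field, from the table above
ETH_MAX_DATA = 1500  # the Ethernet MTU


def build_ethernet_frame(dst_mac: bytes, src_mac: bytes, eth_type: int, data: bytes) -> bytes:
    """Pack destination, source, type, data and a CRC-32 trailer into one frame."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("physical addresses must be 6 bytes long")
    if len(data) > ETH_MAX_DATA:
        raise ValueError("data exceeds the MTU; the upper layer must fragment it")
    data = data.ljust(ETH_MIN_DATA, b"\x00")  # pad short payloads with zeros
    header = dst_mac + src_mac + struct.pack("!H", eth_type)
    # zlib.crc32 uses the same CRC-32 polynomial as Ethernet; the byte order
    # chosen here is illustrative.
    trailer = struct.pack("<I", zlib.crc32(header + data))
    return header + data + trailer


broadcast = b"\xff" * 6
source = bytes.fromhex("001122334455")  # made-up address for the example
frame = build_ethernet_frame(broadcast, source, 0x0800, b"hello")
```

The header and trailer together always take 6 + 6 + 2 + 4 = 18 bytes, so a frame carrying the smallest allowed data field is 64 bytes long and one carrying a full MTU is 1518 bytes long.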

Third, demultiplexing.

  When the frame arrives at the destination host, it is passed up the protocol stack from bottom to top. Each layer's protocol in turn processes the header data it is responsible for in the frame to obtain the information it needs, and finally hands the processed data to the target application. This process is called demultiplexing.

  Demultiplexing relies on the "type" field in the header information. The standard document RFC 1700 defines all the type fields that identify upper-layer protocols and the corresponding value for each upper-layer protocol.

Ethernet frames use a 2-byte "type" field to identify the upper-layer protocol. TCP and UDP packets distinguish upper-layer applications by the 16-bit port number field in their headers. For example, the port number corresponding to the DNS protocol is 53, and the port number corresponding to the HTTP protocol is 80. The port numbers used by all well-known application layer protocols can be found in the file.

  After the frame has been demultiplexed, the original data from before encapsulation is finally delivered to the target service. In this way, from the perspective of the top-level target service, encapsulation and demultiplexing do not seem to have occurred at all.
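Demultiplexing as described above amounts to two table lookups: one on the frame's type field and one on the port number. The following is a minimal sketch, assuming the commonly assigned type values 0x0800 (IP), 0x0806 (ARP) and 0x8035 (RARP). The table and function names are hypothetical, and a real stack also dispatches on fields not covered here.

```python
import struct

# Assumed well-known values; only the two ports come from the text above.
ETHER_TYPES = {0x0800: "IP", 0x0806: "ARP", 0x8035: "RARP"}
WELL_KNOWN_PORTS = {53: "DNS", 80: "HTTP"}


def demux_frame(frame: bytes) -> tuple:
    """Return the upper-layer protocol name and the data carried by the frame."""
    _dst, _src, eth_type = struct.unpack("!6s6sH", frame[:14])
    data = frame[14:-4]  # strip the 14-byte header and the 4-byte CRC
    return ETHER_TYPES.get(eth_type, "unknown"), data


def demux_port(segment: bytes) -> str:
    """Return the service name for a TCP or UDP header's destination port."""
    # Both TCP and UDP headers start with the source and destination ports.
    _src_port, dst_port = struct.unpack("!HH", segment[:4])
    return WELL_KNOWN_PORTS.get(dst_port, "unknown")


# A made-up frame: zeroed addresses, type 0x0806, 46 data bytes, zeroed CRC.
sample_frame = bytes(12) + struct.pack("!H", 0x0806) + bytes(46) + bytes(4)
protocol, data = demux_frame(sample_frame)
service = demux_port(struct.pack("!HH", 40000, 53) + b"rest of the segment")
```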

Fourth, testing the network.

...

Fifth, how the ARP protocol works.

ARP can map any network layer address (here, an IP address) to any physical address (here, a MAC address). It works as follows: the host broadcasts an ARP request on the network it is attached to, and the request contains the network address of the target machine. All other machines on this network receive the request, but only the requested target machine responds, with an ARP reply containing its own MAC address.

(1) The format of the Ethernet ARP request/response message is as follows:

Field                          Size
hardware type                  2 bytes
protocol type                  2 bytes
hardware address length        1 byte
protocol address length        1 byte
operation                      2 bytes
sender Ethernet address        6 bytes
sender IP address              4 bytes
destination Ethernet address   6 bytes
destination IP address         4 bytes

  The hardware type field defines the type of physical address, and a value of 1 indicates a MAC address.

  The protocol type field indicates the type of the protocol address to be mapped, and a value of 0x0800 indicates an IP address.

  The hardware address length field and the protocol address length field, as the names suggest, give the lengths of the two kinds of address in bytes. For a MAC address the length is 6; for an IP address the length is 4.

  The "operation" field indicates one of 4 types of operation: ARP request (value 1), ARP reply (value 2), RARP request (value 3) and RARP reply (value 4).

  The last four fields specify the Ethernet addresses and IP addresses of the two communicating parties. The sender fills in the three fields other than the destination Ethernet address to construct an ARP request and sends it. When the receiver finds that the destination IP address of the request is its own, it fills in its own Ethernet address, swaps the two destination addresses with the two sender addresses to construct an ARP reply, and returns it. As the format above shows, an ARP packet is 28 bytes long. Adding the 18 bytes at the head and tail of the Ethernet frame gives 46 bytes for an Ethernet frame carrying an ARP request/reply packet. Since the data field of an Ethernet frame is at least 46 bytes (see the frame format in the encapsulation section), implementations may pad the 28-byte ARP packet, in which case the frame is 64 bytes long.
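The field layout and the request/reply exchange above can be sketched with Python's struct module. This is a minimal illustration, not a working ARP implementation: the function names and the example addresses are made up, and nothing is sent on the network.

```python
import socket
import struct

# hardware type, protocol type, two length fields, operation, then four addresses
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_REQUEST, ARP_REPLY = 1, 2


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Fill in every field except the destination Ethernet address."""
    return struct.pack(
        ARP_FORMAT,
        1,        # hardware type: MAC address
        0x0800,   # protocol type: IP address
        6, 4,     # hardware and protocol address lengths in bytes
        ARP_REQUEST,
        sender_mac, socket.inet_aton(sender_ip),
        bytes(6),  # destination Ethernet address is not known yet
        socket.inet_aton(target_ip),
    )


def build_arp_reply(request: bytes, own_mac: bytes) -> bytes:
    """Swap sender and destination addresses and fill in our own MAC."""
    hw, proto, hlen, plen, _op, s_mac, s_ip, _t_mac, t_ip = struct.unpack(ARP_FORMAT, request)
    return struct.pack(ARP_FORMAT, hw, proto, hlen, plen, ARP_REPLY,
                       own_mac, t_ip, s_mac, s_ip)


asker = bytes.fromhex("001122334455")     # hypothetical MAC addresses
answerer = bytes.fromhex("66778899aabb")
request = build_arp_request(asker, "192.0.2.10", "192.0.2.20")
reply = build_arp_reply(request, answerer)
frame_length = len(request) + 18          # plus Ethernet header and CRC
```

The packed request is 28 bytes, which matches the sum of the field sizes in the table, and adding the 18 bytes of Ethernet header and trailer gives the 46 bytes mentioned above.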

(2) View and modify the ARP cache.

  Typically, ARP maintains a cache containing IP-to-MAC mappings of frequently accessed machines. This avoids repeated ARP requests and increases the speed of sending packets.

  Under Linux, you can use the arp command to view and modify the ARP cache.
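Besides the arp command, Linux also exposes the cache as the text file /proc/net/arp, which makes the IP-to-MAC mapping easy to inspect from a program. The sketch below parses that format. The file path is an addition not mentioned in the text above, the function names are hypothetical, and the sample entries are made up for illustration.

```python
from pathlib import Path

# Made-up sample in the column layout used by /proc/net/arp.
SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.0.2.1        0x1         0x2         00:11:22:33:44:55     *        eth0
192.0.2.7        0x1         0x2         66:77:88:99:aa:bb     *        eth0
"""


def parse_arp_table(text: str) -> dict:
    """Map each IP address to its MAC address, skipping the header line."""
    mapping = {}
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 4:
            mapping[fields[0]] = fields[3]  # first column: IP, fourth column: MAC
    return mapping


def read_arp_cache(path: str = "/proc/net/arp") -> dict:
    """Read the live cache if the file exists, otherwise return an empty mapping."""
    table = Path(path)
    return parse_arp_table(table.read_text()) if table.exists() else {}


cache = parse_arp_table(SAMPLE)
```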

(3) Use tcpdump to observe the process of ARP communication.

 ...

Sixth, how DNS works.

  We usually access a machine by its domain name rather than directly by its IP address, for example when visiting websites on the Internet. So how is a machine's domain name translated into an IP address? This requires a domain name lookup service.

1. Detailed explanation of DNS query and response packets.

DNS is a distributed domain name query service. Each DNS server stores a large number of mappings between machine names and IP addresses, which are updated dynamically. Many network client programs use the DNS protocol to query a DNS server for the IP address of the target host. The format of DNS query and response packets is as follows:

16-bit identification                         | 16-bit flags
16-bit number of questions                    | 16-bit number of answer resource records
16-bit number of authority resource records   | 16-bit number of additional resource records
Questions (variable length)
Answers (variable number of resource records, variable length)
Authority (variable number of resource records, variable length)
Additional information (variable number of resource records, variable length)

(1) The 16-bit identification field marks a DNS query and its response as a pair, so that a DNS response can be matched to the query it answers.

(2) The 16-bit flags field is used to negotiate the specific communication method and to report the communication status. The details of the 16-bit flags field of the DNS header are as follows:

QR | opcode | AA | TC | RD | RA | zero | rcode

...
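The header layout and the flags field above can be sketched as follows. This is only an illustration under stated assumptions: the bit widths of the flags (1, 4, 1, 1, 1, 1, 3 and 4 bits, from QR to rcode) are taken from RFC 1035 rather than from the text above, the function names are hypothetical, and the query is built but never sent.

```python
import struct


def pack_flags(qr=0, opcode=0, aa=0, tc=0, rd=0, ra=0, rcode=0) -> int:
    """Combine the individual flags into the 16-bit flags field."""
    return ((qr << 15) | (opcode << 11) | (aa << 10) | (tc << 9)
            | (rd << 8) | (ra << 7) | rcode)  # the 3 "zero" bits stay 0


def unpack_flags(flags: int) -> dict:
    """Split the 16-bit flags field back into its parts."""
    return {
        "QR": (flags >> 15) & 0x1, "opcode": (flags >> 11) & 0xF,
        "AA": (flags >> 10) & 0x1, "TC": (flags >> 9) & 0x1,
        "RD": (flags >> 8) & 0x1, "RA": (flags >> 7) & 0x1,
        "zero": (flags >> 4) & 0x7, "rcode": flags & 0xF,
    }


def build_query(name: str, identification: int, qtype: int = 1) -> bytes:
    """Build a query with one question (qtype 1 asks for an IPv4 address)."""
    # identification, flags, then the four 16-bit counters from the table
    header = struct.pack("!HHHHHH", identification, pack_flags(rd=1), 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(bytes([len(label)]) + label.encode("ascii")
                     for label in name.split(".")) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1: Internet


query = build_query("example.com", identification=0x1234)
response_flags = unpack_flags(0x8180)  # a typical value seen in responses
```

A client would keep the identification value it chose and compare it with the first two bytes of each response to find the matching answer.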

Seventh, the relationship between socket and the TCP/IP protocol family.

  The protocols of the data link layer, network layer, and transport layer are implemented in the kernel. Therefore, the operating system needs to provide a set of system calls that let applications access the services these protocols offer. There are two main sets of APIs that implement this group of system calls: socket and XTI.

  The APIs defined by socket provide two kinds of functionality. First, they copy application data from a user buffer into the kernel's TCP/UDP send buffer, which hands the data to the kernel to send, or copy data from the kernel's TCP/UDP receive buffer into a user buffer, which reads the data. Second, applications can use them to modify header information or other data structures of the protocols at each layer in the kernel, giving fine-grained control over the behavior of the underlying communication.

  It is worth mentioning that socket is a general network programming interface: it can access not only the TCP/IP protocol stack but also other network protocol stacks.
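The second kind of functionality, adjusting protocol data kept in the kernel, can be illustrated with socket options. The sketch below reads and changes the TTL that the kernel will place in the IP header of outgoing packets, and reads the size of the kernel send buffer. The function name and the TTL value 32 are arbitrary choices for this example.

```python
import socket


def inspect_socket_options(new_ttl: int = 32) -> tuple:
    """Return the default TTL, the TTL after changing it, and the send buffer size."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # IP-level option: the TTL written into the IP header by the kernel.
        default_ttl = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TTL)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, new_ttl)
        updated_ttl = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TTL)
        # Socket-level option: size of the kernel send buffer for this socket.
        send_buffer = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return default_ttl, updated_ttl, send_buffer


default_ttl, updated_ttl, send_buffer = inspect_socket_options()
```

No connection is needed for this: the options are stored in the kernel data structures that are created together with the socket.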

 
