Protocol in the eyes of programmers: TCP/IP five-layer network model

Hello, everyone~ I am your old friend, Protecting Xiao Zhou ღ. This issue covers the TCP/IP five-layer network model, one of the basic principles of networking. It explains the concept of a protocol, the network models, and how data is passed down and up through the layers. After reading it, you should have a clear picture of how data travels across the network. Are you sure you don't want to take a look~~ Stay tuned for more exciting things: Protecting Xiao Zhou ღ *★,°* :.☆( ̄▽ ̄)/$:*.°★*

1. The concept of a protocol

In order for data to be transmitted on the network (from the source to the destination), all network devices that the data passes through must follow the same rules, such as: how to establish a connection, how to transmit data, how to parse information from each other, and so on. Only by following this convention can computers communicate with each other. Such a rule is called a protocol, and it is ultimately reflected in the format of data packets transmitted on the network.

A protocol has three elements, the basic conditions that both sides of a communication must agree on: syntax, semantics, and timing.

1. Syntax is the structure or format of the data: which symbols are used and how they are combined. Both parties must follow the same syntax rules in order to split a message into its fields correctly.

For example, an application layer protocol can use special delimiters to separate the fields of a message.

This is just one strategy that an application layer protocol may agree on. It can be customized freely, provided that both parties to the communication use the same protocol.
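As a minimal sketch of such a delimiter-based format: the layout below (sender;receiver;timestamp;body), the class name DelimitedMessage, and the sample QQ numbers are all assumptions made for illustration, not QQ's real protocol.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical application-layer format: four fields joined by ';' in a fixed order.
final class DelimitedMessage {
    static final String SEP = ";";

    // Syntax agreed by both sides: sender;receiver;timestamp;body
    static String encode(String sender, String receiver, long timestamp, String body) {
        return String.join(SEP, sender, receiver, Long.toString(timestamp), body);
    }

    // The receiver splits on the same delimiter. The limit of 4 keeps any ';'
    // that appears inside the body from being treated as a field separator.
    static List<String> decode(String wire) {
        return Arrays.asList(wire.split(SEP, 4));
    }

    public static void main(String[] args) {
        String wire = encode("10001", "10002", 1681460000L, "Good morning");
        System.out.println(wire);          // the string that would be sent
        System.out.println(decode(wire));  // the fields recovered by the receiver
    }
}
```

If the two sides disagreed on the delimiter or the field order, decode would still return four strings, but they would mean the wrong things. That is why both parties have to adopt the same protocol.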

2. Semantics is the meaning of each part of the message and the action it calls for. The sender and the receiver must interpret the same fields in the same way so that the receiver can understand the message correctly and respond accordingly.

3. Timing is the order and pacing of events in the communication. Both parties follow the same timing rules so that messages are processed in the correct sequence, which is necessary for reliable and correct communication.

In layman's terms: when we send several messages to the other party, how do we make sure they are still in order when they arrive at the other party's host, rather than showing up out of order?

For example:

I send two messages on QQ: "Good morning brother", then "Are you free today? Come out and play".

The other party's QQ shows: "Are you free today? Come out and play", then "Good morning brother". This is obviously wrong, and it is exactly a timing problem. The blogger has run into this when sending messages on QQ, and only found out from a friend's screenshot.
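One common way to handle this kind of reordering is for the sender to number its messages so that the receiver can put them back in order. The sketch below assumes a hypothetical Message type with a sender-assigned sequence number. It illustrates the idea only and is not how QQ actually works.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class OrderedDelivery {
    // Hypothetical message: a sender-assigned sequence number plus the text.
    static final class Message {
        final int seq;
        final String text;

        Message(int seq, String text) {
            this.seq = seq;
            this.text = text;
        }
    }

    // Sorts the arrived messages by sequence number and returns their texts in sending order.
    static List<String> inSendingOrder(List<Message> arrived) {
        List<Message> sorted = new ArrayList<>(arrived);
        sorted.sort(Comparator.comparingInt((Message m) -> m.seq));
        List<String> texts = new ArrayList<>();
        for (Message m : sorted) {
            texts.add(m.text);
        }
        return texts;
    }

    public static void main(String[] args) {
        // The second message overtook the first one on the network.
        List<Message> arrived = new ArrayList<>();
        arrived.add(new Message(2, "Are you free today? Come out and play"));
        arrived.add(new Message(1, "Good morning brother"));
        System.out.println(inSendingOrder(arrived));
    }
}
```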

To sum up, the three elements of a protocol are indispensable. Only when all of them are satisfied can communication go smoothly and information be exchanged.


1.1 Why does network communication need to use protocols

Network communication needs protocols because a protocol specifies a standardized way of transmitting data, so that different devices can exchange information correctly. Without a protocol, the devices in a network cannot accurately understand the data sent by one another, nor decide how to parse and respond to it, and communication fails.

Specifically, a protocol defines details such as the format, parameters, serialization method (described below), and error detection and retransmission mechanisms required for data transmission.

In short, a protocol provides a standard framework that lets data be transmitted, organized, and managed efficiently across a large number of different devices and complex network environments.


2. Network model

Network communication requires agreed protocols. Data produced by one application has to be received by another, designated application, and the transmission in between is very complicated. Imagine a person in China sending an email to someone abroad: no matter how far apart they are or where the recipient is, as long as a valid receiving address is given, the message will arrive. Next, the blogger will show you how the information gets from here to there.

A complex network environment needs complex rules to constrain it. A single protocol cannot fit every application scenario, and even if it could, it would be extremely complicated. So there are many protocols, each with its own function. We can classify these protocols and arrange the categories into layers, and agree on the calling relationship between the layers: an upper-layer protocol may only call the protocol of the layer directly below it, and the lower layer provides services to the upper layer. Calling protocols across layers is not allowed.

Benefits of protocol layering:

After the protocols are layered, it becomes easy to replace the protocol of one layer. Take the application layer as an example: QQ can send messages, and they can only be received and parsed by the QQ application (only QQ understands them). WeChat, Douyin, Alipay... can all send messages too, and each of these applications uses its own application layer protocol. Yet that makes no difference to how the information is transmitted, right? From a programmer's point of view, the layers we actually work with are the transport layer and the application layer. For now, you can think of it this way: the transport layer makes sure that your information gets from your computer to the computer of the designated recipient (locating that computer is actually the job of the network layer), and hands it to an application according to the port number (discussed below). How the application parses the information is up to each application.

The point of this example is that once the protocols are layered, the coupling between layers is low. Each layer does its own job, like workers on an assembly line. If a worker in the middle of the line slacks off, the workers upstream are not affected and can keep working, while the workers downstream simply have nothing to do until goods arrive. Because the coupling is so low, that one worker can be replaced quickly without touching the rest of the line (this is the benefit mentioned above: the protocol of one layer can be swapped out conveniently). The employees exclaim: that is harsh~~

Look at the protocol layering from the perspective of the program:

Invoking services: the caller does not need to care how an interface is implemented, only what can be done through it.

Providing services: the provider hides the implementation details and exposes an open interface for others to use.


2.1 OSI seven-layer model

OSI (Open Systems Interconnection) is a reference model formulated and published by the International Organization for Standardization (ISO). It divides computer network communication into 7 layers to help different network devices and software interact and cooperate.
The OSI seven-layer network model is a logical definition and specification: the network is logically divided into seven layers.
Its biggest advantage is that it clearly distinguishes the three concepts of service, interface, and protocol. The concepts are clear and the theory is relatively complete. Reliable communication between different systems and different networks is realized through the seven-layer structure.

  1. Application Layer: defines the protocols, interfaces, and data formats that support data exchange for specific applications.

  2. Presentation Layer: presents data in a form the application can understand and use. It covers how data is encoded and decoded, encrypted, compressed, and converted between formats.

  3. Session Layer: establishes, maintains, and ends sessions between computers, and handles errors that occur during a session. It can also support communication between multiple applications.

  4. Transport Layer: manages data transmission between two end hosts and can provide reliable delivery to the target.

  5. Network Layer: address management (recording the IP addresses of the source host and the destination host) and routing and forwarding (selecting a suitable transmission path between the two nodes).

  6. Data Link Layer: transmits data frames between directly connected devices. Typical devices: network cards and switches.
  7. Physical Layer: handles the raw bit stream on the transmission medium. Typical media: network cables and optical fibers.

The list above runs from top to bottom. In the OSI model's own numbering the order is reversed: the physical layer is layer 1 and the application layer is layer 7.

The theory is good, but the OSI seven-layer model mostly lives in textbooks and was never widely implemented as designed. Real networks use only four or five of the seven layers. The model most widely used today is the TCP/IP five-layer network model.


3. TCP/IP five-layer network model

The TCP/IP five-layer model describes a suite of network protocols that covers what computers need in order to communicate over the Internet. It was designed to solve the problem of interconnecting different computers and to ensure that data is delivered reliably. Next, we will go through the function of each layer in an easy-to-understand way. For details, refer to the OSI reference model above.

  1. Application layer: only cares about the transmitted data itself and what we want to do with it. While developing a program, the programmer obtains this data through technical means (in Java, the InputStream and OutputStream of the Socket class read and write transport layer data). How the application layer then processes the data according to the agreed format is no concern of the transport layer.

  2. Transport layer: manages data transmission between two end hosts and is responsible for reliable transmission (when TCP is used). It does not care about the path taken in between; it only cares about the starting point and the end point, and about getting the data to the target.

  3. Network layer: address management (recording the IP addresses of the source host and the destination host) and routing and forwarding (selecting a suitable transmission path between two nodes, in other words path planning).

  4. Data link layer: transmits data frames between directly connected devices. It focuses on the hop between two adjacent nodes, for example between a network card and a switch connected by a network cable or optical fiber.
  5. Physical layer: the infrastructure of network communication, that is, transmission media such as network cables and optical fibers.


As ordinary programmers, what we mainly care about is the application layer, because that is where the purpose of the data is decided. The transport layer provides services to the application layer, so we also need to pay attention to it. Network programming mainly revolves around the application layer and the transport layer.

The four layers below the application layer are encapsulated by the operating system (together with drivers and hardware), which in effect gives the application layer an interface for operating the transport layer directly (in Java, the InputStream and OutputStream of the Socket class read and write transport layer data). Attention!! An upper-layer protocol may only call the layer directly below it, and the lower layer provides services to the upper layer. Cross-layer calls are not allowed, so the application layer can directly operate only the transport layer.
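To make the Socket, InputStream, and OutputStream interface concrete, here is a minimal sketch that sends a few bytes to a one-shot echo server on the loopback interface and reads them back. The class name LoopbackEcho and the method echoOnce are made up for this example, and the server reads exactly as many bytes as the client sends, which is a simplification a real protocol would replace with its own message framing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class LoopbackEcho {
    // Sends text over TCP to a one-shot echo server on this machine and returns the reply.
    static String echoOnce(String text) {
        byte[] request = text.getBytes(StandardCharsets.UTF_8);
        // Port 0 asks the operating system to pick any free port.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    // Server side: read from the InputStream, write the same bytes back.
                    byte[] received = peer.getInputStream().readNBytes(request.length);
                    peer.getOutputStream().write(received);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();

            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                // Client side: the application only touches the two streams;
                // everything below the transport layer is handled by the operating system.
                client.getOutputStream().write(request);
                byte[] reply = client.getInputStream().readNBytes(request.length);
                return new String(reply, StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(echoOnce("Good morning"));
    }
}
```

Notice that the code never mentions IP headers, frames, or signals. The application layer talks only to the transport layer, as described above.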


4. Under the background of protocol layering, how is data transmitted through the network?

The TCP/IP five-layer network model has five layers: application layer, transport layer, network layer, data link layer, and physical layer. The transport layer provides services to the application layer, and learning network programming mainly means learning how the application layer interacts with the transport layer.

4.1 Protocols revisited

For client and server applications, the request (the client sends information to the server) and the response (the server replies) each need an agreed data format, so that the data can be packed and unpacked conveniently (encapsulation and decapsulation, described below).

  1. The client builds the request and the server parses the request using the same data format.
  2. The server builds the response and the client parses the response using the same data format.
  3. The request format and the response format may be the same or different. It is enough that a request is parsed in the format it was built in, and a response is parsed in the format it was built in. The purpose of agreeing on a format is to let the receiver know how to parse the data.
  4. You can use a well-known protocol, or agree on a data format yourself, which is called a custom protocol.

For example: the delimiter-based message that the blogger described under the three elements of a protocol is such an application layer format.


4.2 Encapsulation/decapsulation vs serialization/deserialization

A protocol is ultimately reflected in the format of the data packets transmitted on the network.

Here is a rough outline of how a client and a server communicate:

To use QQ we must first go online, and only then can we chat happily with friends. Going online here really means establishing a connection with Tencent's QQ servers. Tencent of course has many servers, and the servers can exchange information with one another. Being connected to Tencent's servers is the basis of QQ communication.

When we send a message to a friend, the QQ application layer generates an application layer datagram. It must contain your QQ number and the other party's QQ number, as well as the time, the message body, and other information. The data is then handed to the transport layer, which sends it to the server according to standard network protocols. The packet carries the IP address (which identifies the location of a device on the Internet), the port number (which in effect says that the data belongs to QQ), and the QQ numbers.

After the server reads your request, it parses it to see who the message is for and checks whether the friend's device is connected to Tencent's servers. If not, the message waits until the friend's device comes online. If the friend is online, the server sends the message to the IP address of the device that is logged in to that account (assume only one device is logged in at a time; being logged in on a phone and a computer simultaneously follows a different logic). What travels over the network is, of course, a string of binary data.

After the friend's device receives the data, the operating system parses it according to the standard network protocols. When parsing reaches the transport layer, the port number shows that the data belongs to QQ, so it is handed to QQ. QQ then parses the data according to the message format designed by QQ's developers, and finally the message you sent is displayed on your friend's device.


In the process above, the data travels in the form of data packets.

On the way from the application layer down to the physical layer, each layer wraps the data it receives from the layer above, and in the end everything is converted into a sequence of binary digits. The application first turns the objects in the program into bytes according to its own application layer protocol (well-known or custom); this conversion is called serialization. Each lower layer then adds its own protocol header, for example UDP or TCP at the transport layer and IP at the network layer; this wrapping is called encapsulation.

When the receiver gets the data, the conversion runs the other way, from the physical layer up to the application layer, parsing the raw binary digits.

Each layer removes its own header and hands the payload to the layer above, which is called decapsulation (or demultiplexing, since the header decides which upper-layer protocol or application gets the payload). Finally the application turns the bytes back into objects, which is called deserialization.
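Serialization and deserialization can be sketched with the JDK's DataOutputStream and DataInputStream. The ChatMessage class and its three fields are assumptions made for this illustration. A real application would define its own fields and byte layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ChatMessageCodec {
    // Hypothetical in-memory object that the application wants to send.
    static final class ChatMessage {
        final long senderId;
        final long receiverId;
        final String body;

        ChatMessage(long senderId, long receiverId, String body) {
            this.senderId = senderId;
            this.receiverId = receiverId;
            this.body = body;
        }
    }

    // Serialization: object -> bytes, in a field order both sides agree on.
    static byte[] serialize(ChatMessage m) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(m.senderId);   // 8 bytes
            out.writeLong(m.receiverId); // 8 bytes
            out.writeUTF(m.body);        // 2-byte length prefix + encoded text
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Deserialization: bytes -> object, reading the fields in the same order.
    static ChatMessage deserialize(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return new ChatMessage(in.readLong(), in.readLong(), in.readUTF());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] wire = serialize(new ChatMessage(10001L, 10002L, "Good morning"));
        ChatMessage back = deserialize(wire);
        System.out.println(wire.length + " bytes, body = " + back.body);
    }
}
```

The byte array produced here is what the application would hand to the transport layer. The lower layers treat it as an opaque payload.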


4.3 What is a port number

A port number is a number that identifies an application (process) on a host. Multiple network applications can run on a host at the same time, and each one is associated with its own port number so that other computers can reach it over the network.

For example, my QQ on my computer sends you a message. The message travels through the network to your computer, but once it arrives, how does the system know which application (process) should get it? WeChat also belongs to Tencent, yet it never shows messages sent from QQ. The key is that an application binds a port number when it starts (the system can also pick one at random within a certain range), and on one host a given port number cannot be bound by two applications at the same time for the same protocol. So after a message enters the host, it is delivered to the application that owns the destination port, and that application parses it according to its application layer protocol (either a protocol agreed on by the communicating parties, or QQ's own protocol).

A port number occupies only two bytes, so its range is [0, 65535], and that is all the port numbers there are. Port numbers below 1024 are called "well-known port numbers" and are reserved for well-known services, for example: 80 for HTTP servers, 22 for SSH, and 21 for FTP servers. We should avoid using these ports in our own programs.
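The two-byte size and the [0, 65535] range can be checked with a small sketch. The class PortNumbers and its method names are hypothetical helpers, not part of any library.

```java
final class PortNumbers {
    // Packs a port into the two bytes it occupies on the wire (most significant byte first).
    static byte[] toBytes(int port) {
        if (port < 0 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return new byte[] { (byte) (port >>> 8), (byte) port };
    }

    // Rebuilds the port from its two bytes, treating each byte as unsigned.
    static int fromBytes(byte[] twoBytes) {
        return ((twoBytes[0] & 0xFF) << 8) | (twoBytes[1] & 0xFF);
    }

    // Ports below 1024 are the well-known ports (HTTP 80, SSH 22, FTP 21, ...).
    static boolean isWellKnown(int port) {
        return port >= 0 && port < 1024;
    }

    public static void main(String[] args) {
        System.out.println(fromBytes(toBytes(65535))); // largest value two bytes can hold
        System.out.println(isWellKnown(80));
        System.out.println(isWellKnown(8080));
    }
}
```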


4.4 What is an IP address

IP address is short for Internet Protocol address, and it belongs to the network layer. It is a number that identifies a device on a network, similar to a house number. Its role is to let data packets be routed across the network to the correct destination. On the Internet, every device needs an IP address so that other devices can communicate with it. There are two formats: an IPv4 address is a 32-bit binary number and an IPv6 address is a 128-bit binary number. IPv4 is still widely used, but as the number of connected devices keeps growing, IPv6 is gradually being adopted as its successor.

Only by knowing the address can you find the person. For now this is a simplified picture. A fuller picture: the IP address acts like an ID number for your host on the Internet, and the network layer uses it to choose a path along which data can reach you.
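The 32-bit and 128-bit sizes can be observed with java.net.InetAddress. In this sketch the class IpAddressSizes and the sample addresses are assumptions chosen for illustration. When given a literal IP address, InetAddress.getByName only checks the format and does not perform a lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class IpAddressSizes {
    // Returns the length in bits of the binary form of a literal IP address.
    static int bitLength(String literal) {
        try {
            return InetAddress.getByName(literal).getAddress().length * 8;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP literal: " + literal, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(bitLength("192.168.1.10")); // an IPv4 address
        System.out.println(bitLength("2001:db8::1"));  // an IPv6 address
    }
}
```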


4.5 Encapsulation of a piece of information

When the sender sends data, the data is encapsulated from top to bottom (in the TCP/IP network model), from the application layer to the physical layer, with each layer applying its own protocol.

Take sending a message on QQ as an example:

Zhang San uses QQ to send Li Si "Good morning".

1. The application layer (the QQ application) takes the message "Good morning" entered by Zhang San, combines it with the sender's and receiver's user data, and packs it into an application layer data packet.

The application layer datagram can be pictured as several strings joined together and separated by special symbols. The format is a protocol that can be customized.


2. Application layer to transport layer: using the UDP protocol

The application layer chooses which transport layer protocol to use. Here UDP (User Datagram Protocol) serves as the example.

The application layer calls the interface provided by the transport layer to hand over the data. In Java, UDP is operated through the native socket API.

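import java.net.DatagramSocket;
// Create a DatagramSocket object to represent the socket (the constructor may throw SocketException):
DatagramSocket socket = new DatagramSocket();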

Today's post is mainly about concepts. Network programming is follow-up content, please look forward to it~~

The UDP protocol encapsulates the datagram above by adding a UDP header.

The most important thing the transport layer does here is attach the source port and the destination port to the application layer datagram. At this point the port can be regarded as standing for QQ. Adding a header is essentially concatenation: the header bytes are placed in front of the payload.
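The idea that adding a header is concatenation can be sketched by building a UDP datagram by hand. The 8-byte header layout (source port, destination port, length, checksum, each 16 bits) is the standard UDP format. The class name UdpEncapsulation, the port numbers, and leaving the checksum at 0 (which UDP over IPv4 permits, meaning "not computed") are choices made for this illustration. In practice the operating system builds this header.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class UdpEncapsulation {
    static final int HEADER_LENGTH = 8;

    // Prepends an 8-byte UDP header to the application-layer payload.
    static byte[] encapsulate(int sourcePort, int destinationPort, byte[] payload) {
        int totalLength = HEADER_LENGTH + payload.length;
        ByteBuffer datagram = ByteBuffer.allocate(totalLength); // big-endian by default
        datagram.putShort((short) sourcePort);       // bytes 0-1: source port
        datagram.putShort((short) destinationPort);  // bytes 2-3: destination port
        datagram.putShort((short) totalLength);      // bytes 4-5: header + payload length
        datagram.putShort((short) 0);                // bytes 6-7: checksum, left as 0 in this sketch
        datagram.put(payload);                       // the application data follows the header
        return datagram.array();
    }

    public static void main(String[] args) {
        byte[] payload = "Good morning".getBytes(StandardCharsets.UTF_8);
        // 50000 and 8000 are made-up port numbers for the example.
        byte[] datagram = encapsulate(50000, 8000, payload);
        System.out.println(payload.length + " payload bytes -> " + datagram.length + " datagram bytes");
    }
}
```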

For a detailed introduction to the UDP protocol, see another post by the blogger: Understanding from the perspective of a programmer: UDP protocol (Protecting Xiao Zhou ღ's blog on CSDN).


3. Transport layer to network layer

After the network layer obtains the UDP datagram, it encapsulates it a second time by adding an IP header. The most common network layer protocol is IP.

The source IP and the source port number together describe where the data comes from.

The destination IP and the destination port number together describe where the data is going.


4. Network layer to data link layer

The most typical protocol here is Ethernet (which spans the data link layer and the physical layer). It encapsulates the network layer datagram in the same way, by concatenation: an Ethernet frame header is added in front and a frame trailer at the end.


MAC address is short for Media Access Control address, also known as a physical address. It is meant to be unique and can be thought of as the ID card of the network card. A MAC address is 6 bytes (48 bits) long and is usually written as 12 hexadecimal digits.
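The relationship between 6 bytes and 12 hexadecimal digits can be shown with a short formatting sketch. The class MacAddressFormat and the sample address are made up for illustration.

```java
final class MacAddressFormat {
    // Formats 6 raw bytes as the usual colon-separated form, two hex digits per byte.
    static String format(byte[] mac) {
        if (mac.length != 6) {
            throw new IllegalArgumentException("a MAC address has exactly 6 bytes");
        }
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < mac.length; i++) {
            if (i > 0) {
                text.append(':');
            }
            text.append(String.format("%02x", mac[i] & 0xFF));
        }
        return text.toString();
    }

    public static void main(String[] args) {
        byte[] mac = { 0x00, 0x1A, 0x2B, 0x3C, 0x4D, 0x5E }; // sample value, not a real device
        System.out.println(format(mac));
    }
}
```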

MAC addresses exist for much the same reason as IP addresses: to identify devices in a computer network. Each network interface has its own MAC address. IP addresses and MAC addresses were designed separately by different people, and they complement each other: the IP address identifies the final destination across networks, while the MAC address identifies the next device on the local link.

MAC addresses are assigned by network equipment manufacturers, and some systems can also generate them randomly. In a LAN, when a frame is sent from one device to another, the MAC address of the destination device determines which interface the frame should be delivered to.

The Ethernet frame trailer is the last part of the Ethernet data frame. It is 4 bytes long and consists of a single field, the frame check sequence (FCS), which holds a CRC value. CRC stands for cyclic redundancy check: the sender computes it over the frame and the receiver recomputes it to detect whether the frame was damaged in transit. The trailer contains no separate end-of-frame delimiter field.
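How a CRC detects damage can be sketched with java.util.zip.CRC32, which uses the same CRC-32 polynomial as Ethernet. This only illustrates the principle. On a real network card the check is done in hardware, and the class name FrameCheck is an assumption for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class FrameCheck {
    // Computes the 32-bit CRC of the given bytes.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    // The receiver recomputes the CRC and compares it with the value sent along with the data.
    static boolean isIntact(byte[] received, long crcFromSender) {
        return crc32(received) == crcFromSender;
    }

    public static void main(String[] args) {
        byte[] sent = "Good morning".getBytes(StandardCharsets.UTF_8);
        long crc = crc32(sent);

        byte[] damaged = sent.clone();
        damaged[0] ^= 0x01; // flip one bit, as line noise might

        System.out.println(isIntact(sent, crc));    // unchanged data passes the check
        System.out.println(isIntact(damaged, crc)); // the flipped bit is detected
    }
}
```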

The Ethernet data frame tail is an important component to ensure that the Ethernet data frame can be sent and received correctly.


5. Data link layer to physical layer

At this point, the physical layer converts the binary data from the data link layer into signals for transmission: optical signals (optical fiber), electrical signals (cable), electromagnetic waves (wireless), and so on.


In the process above, everything below the application layer is encapsulated automatically by the operating system. Once packaged, the data is sent into the network of a major operator (such as China Mobile or China Unicom). Going online really means connecting to the operator's equipment; it forwards the data we send toward the destination host and delivers the responses back to us. Our devices do not connect to the destination host directly.


4.6 Decapsulation (demultiplexing) of a message

When the receiver receives data, the data is passed from bottom to top (in the TCP/IP network model), from the physical layer to the application layer, and each layer's protocol parses its own part.

What happens on the receiving device is exactly the reverse of the above~~

1. The physical layer device, the network card, receives high and low level signals

The received signal is decoded and restored to a binary sequence.

2. Physical layer to data link layer

When the data link layer receives data from the physical layer, it typically does the following:

  • Check data integrity: the data link layer uses the check value carried in the frame (the CRC in the case of Ethernet) to detect errors introduced during transmission.

  • Framing: the data link layer divides the received bit stream into data frames for easier management and control.

  • Sequence number confirmation: link-layer protocols that provide reliable transmission acknowledge the frames they have received correctly. Ethernet itself does not do this.

  • Address identification and forwarding: the data link layer checks the destination address of the frame, and either accepts it (if it is addressed to this device) or forwards it toward the target device.

  • Flow control: if necessary, the data link layer limits the sending rate to keep the link stable.

In short, the data link layer manages and checks the data delivered by the physical layer.

It then removes the frame header and trailer and submits the payload to the network layer.


3. From data link layer to network layer

When the network layer receives a packet from the data link layer, it performs several tasks:

Decapsulation: the data link layer has already removed the frame header and trailer, so the network layer receives only the IP header and the payload.

Checksum verification: the network layer verifies the checksum of the IP header to make sure the header contains no errors.

Routing: the network layer examines the destination IP address in the IP header and uses it to decide where the packet goes next.

Data transmission is not a direct point-to-point mechanism; there are many transit devices along the way.

If the packet is not addressed to this device, the network layer forwards it to the appropriate transit device, and so on from one forwarding point to the next until the data reaches the destination host.

On the destination host, the IP protocol of the network layer parses the datagram, removes the IP header, and hands the payload to the transport layer.
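The IP header checksum verification mentioned in this step can be sketched as follows. The IPv4 header checksum is the one's-complement sum of the header's 16-bit words, and a header is valid when the sum over all of its words, including the checksum field, comes out as 0xFFFF. The class name and the 20-byte sample header are illustrative. The operating system performs this check for us.

```java
final class Ipv4HeaderChecksum {
    // Adds up all 16-bit words of the header, folding any carry back into the low 16 bits.
    static int onesComplementSum(byte[] header) {
        int sum = 0;
        for (int i = 0; i + 1 < header.length; i += 2) {
            sum += ((header[i] & 0xFF) << 8) | (header[i + 1] & 0xFF);
        }
        while ((sum >>> 16) != 0) {
            sum = (sum & 0xFFFF) + (sum >>> 16);
        }
        return sum;
    }

    // A header whose checksum field is correct sums to 0xFFFF.
    static boolean isValid(byte[] header) {
        return onesComplementSum(header) == 0xFFFF;
    }

    // Sample 20-byte IPv4 header (UDP, 192.168.0.1 -> 192.168.0.199) with checksum 0xB861.
    static byte[] sampleHeader() {
        return new byte[] {
            0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00,
            0x40, 0x11, (byte) 0xB8, 0x61,
            (byte) 0xC0, (byte) 0xA8, 0x00, 0x01,
            (byte) 0xC0, (byte) 0xA8, 0x00, (byte) 0xC7
        };
    }

    public static void main(String[] args) {
        byte[] header = sampleHeader();
        System.out.println(isValid(header));

        header[8] = 0x3F; // change the TTL without updating the checksum
        System.out.println(isValid(header));
    }
}
```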


4. From the network layer to the transport layer

When the transport layer receives data from the network layer, it processes it according to the transport protocol in use. The most common ones are the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). Because this datagram was encapsulated with UDP, it must also be parsed (decapsulated) with UDP.

UDP parses the header and uses the destination port number to identify the specific application (QQ). Every networked application is associated with a port number. The UDP header is then removed, and the payload is taken out and delivered to the application layer.
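Selecting the application by destination port can be sketched like this. The table of bound applications, the class UdpDemultiplexer, and the port 8000 are assumptions for illustration. In reality the operating system keeps this table and does the lookup.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class UdpDemultiplexer {
    static final int HEADER_LENGTH = 8;

    // Hypothetical table of applications that have bound a port on this host.
    private final Map<Integer, String> boundApplications = new HashMap<>();

    void bind(int port, String application) {
        boundApplications.put(port, application);
    }

    // Reads the destination port (bytes 2-3 of the UDP header) and looks up its owner.
    String ownerOf(byte[] datagram) {
        int destinationPort = ByteBuffer.wrap(datagram).getShort(2) & 0xFFFF;
        return boundApplications.getOrDefault(destinationPort, "no application bound");
    }

    // Strips the 8-byte header and returns the payload for the application layer.
    static byte[] payloadOf(byte[] datagram) {
        return Arrays.copyOfRange(datagram, HEADER_LENGTH, datagram.length);
    }

    public static void main(String[] args) {
        UdpDemultiplexer host = new UdpDemultiplexer();
        host.bind(8000, "qq"); // made-up port for the example

        // Source port 50000, destination port 8000, length 10, checksum 0, payload "hi".
        byte[] datagram = { (byte) 0xC3, 0x50, 0x1F, 0x40, 0x00, 0x0A, 0x00, 0x00, 'h', 'i' };
        System.out.println(host.ownerOf(datagram) + " receives " + payloadOf(datagram).length + " bytes");
    }
}
```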


5. Transport layer to application layer

At the transport layer the port number determined which application receives the data (QQ), so now the QQ application parses it.

If the data is a string, there are many ways to extract the fields from it according to the agreed format: the sender, the receiver, the time, and the message body. QQ can then display the parsed data in the user interface. The receiver's QQ number may seem useless at this stage, but it was needed during transmission: the QQ server uses it to find the device on which the receiver is logged in, as mentioned in the earlier example.


5. Summary

When two devices communicate, the sender encapsulates layer by layer and the receiver decapsulates layer by layer.

In a real network environment, data transmission is very complicated. To begin with, locating a host's IP on the whole Internet is like finding a needle in a haystack. Even once it is found, there may be no direct route to it, so the data may have to be forwarded through many nodes.

When an application on a networked computer starts to communicate, the operating system can assign it a port number at random. To send or receive information over the Internet, the host also needs a public (external) IP address as its "unique" identifier, and the data is transmitted layer by layer. Transit devices are used along the way, such as hubs, switches, routers, and repeaters.

Switch: data link layer device

1. The data is decapsulated from the physical layer up to the data link layer. Along the way the switch checks the destination MAC address and whether the data frame is complete.

2. The switch then re-encapsulates the data from the data link layer down to the physical layer and forwards it on.

Router: network layer device

1. After receiving the data, the router decapsulates (parses) it from the physical layer up to the network layer, where the datagram provides the destination IP address. The router then uses the destination IP to decide the next hop; the path is planned hop by hop while the packet travels. A router that performs network address translation (NAT), such as a typical home router, also replaces the source IP with its own IP address and stores the original source IP, so that the old and new source IPs form a mapping. If the router finds that the chosen path cannot reach the destination IP, the packet has to be sent along another route.

2. The router re-encapsulates the datagram from the network layer down to the physical layer. When it passes through the data link layer, the MAC addresses are rewritten for the next hop. When the data reaches the target host, it is decapsulated from the physical layer up to the application layer.

What is described above is Layer 2 forwarding and Layer 3 forwarding, two different ways of forwarding data in network communication.

Layer 2 forwarding is based on physical addresses, which is also called MAC address forwarding. When a computer sends data within a LAN, the data packet is sent to the MAC address of the destination computer. This process does not require the help of a router, since all devices are connected within the same subnet.

Layer 3 forwarding is based on network layer protocols, such as IP addresses, to route data packets. When a computer sends a packet of data, a router routes the packet to the appropriate network based on the destination address. If a data packet needs to cross multiple networks, the router will choose the best path to forward. This process requires the help of routers, which provide connections for different networks.

Layer 2 forwarding is suitable for communication within a LAN, while Layer 3 forwarding is suitable for communication between different networks.            
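Whether two hosts are in the same subnet, and therefore whether a router is needed at all, comes down to comparing the network part of their IP addresses. The sketch below assumes IPv4 addresses and a prefix length such as 24. The class SubnetCheck and the sample addresses are made up for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class SubnetCheck {
    // Converts a literal IPv4 address into its 32-bit integer form.
    static int toInt(String ipv4) {
        try {
            byte[] b = InetAddress.getByName(ipv4).getAddress();
            if (b.length != 4) {
                throw new IllegalArgumentException("not an IPv4 address: " + ipv4);
            }
            return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16) | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid IP literal: " + ipv4, e);
        }
    }

    // Two hosts share a subnet when their addresses agree on the first prefixLength bits.
    static boolean sameSubnet(String a, String b, int prefixLength) {
        int mask = prefixLength == 0 ? 0 : -1 << (32 - prefixLength);
        return (toInt(a) & mask) == (toInt(b) & mask);
    }

    public static void main(String[] args) {
        // Same /24 network: the frame can be delivered by Layer 2 forwarding alone.
        System.out.println(sameSubnet("192.168.1.10", "192.168.1.20", 24));
        // Different networks: the packet must go through a router (Layer 3 forwarding).
        System.out.println(sameSubnet("192.168.1.10", "192.168.2.20", 24));
    }
}
```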


So far, the blogger has finished sharing the TCP/IP five-layer network model in network programming. I hope it is helpful to everyone. If anything is wrong, corrections are welcome.

This issue is included in the blogger's column - JavaEE, which is suitable for programming beginners. Interested friends can subscribe to view other "JavaEE basics".

Next issue preview: UDP protocol

Thank you to everyone who read this article. More exciting content is coming: Protecting Xiao Zhou ღ *★,°*:.☆( ̄▽ ̄)/$:*.°★*

Met you, all the stars are falling on my head...

Note: This blog is mainly about basic conceptual knowledge and draws on many references. If anything is wrong, please get in touch so it can be corrected.


Origin blog.csdn.net/weixin_67603503/article/details/130137198