Summary of common network protocols

Foreword

This blog post summarizes common protocols following the five-layer computer network model. The goal is to gain a deeper understanding of the overall network transmission process and the principles behind it through these specific protocols.

Review of the five-layer model of computer network


  • Application layer: provides network communication services to the user's application processes
    Protocols - DNS protocol, HTTP protocol, HTTPS protocol
  • Transport layer: responsible for data transmission between two hosts, carrying data from the sending end to the receiving end
    Protocols - TCP protocol, UDP protocol
  • Network layer: responsible for address management and route selection, determining a suitable path through a complex network environment
    Protocols - IP protocol, ICMP protocol
  • Data link layer: responsible for the transmission and identification of data frames between devices, encapsulating the datagrams handed down by the network layer into frames and transmitting them between two devices on the same data link
    Protocols - ARP protocol (MTU, the maximum transmission unit, also belongs to this layer, although it is a frame size limit rather than a protocol)
  • Physical layer: responsible for the transmission of optical and electrical signals, realizing the transparent transmission of bit streams between adjacent nodes

You are probably familiar with the five-layer network model, but have you ever thought about why the network is layered in this way?

The most direct answer is that layering reduces the complexity of network design. Communication protocols adopt a layered structure in which each layer works independently while coordinating with its neighbours, which keeps the whole system efficient. Just as design patterns encourage us to decouple the functions of a complex program, a layered design is a sensible way to tackle a complex network.

The essence of network layering is that each layer completes its own task independently, without considering how the other layers are implemented, and because the tasks differ, different devices correspond to each layer. (For example, working from the bottom up: the physical layer only needs to be concerned with how the optical or electrical signals for 0 and 1 are transmitted, and does not care what they mean. One level up, the data link layer only needs to care about how encapsulated data frames are delivered accurately to the destination host with the corresponding MAC address, without caring about the content of the datagram or whether it travels over optical fiber or a LAN cable. The same holds for every layer above.)

Application layer protocols

Application layer protocols are mainly responsible for communication between programs. When data is sent over the network, the application layer first encapsulates it according to the relevant protocol and passes it down to the transport layer. After a series of network hops the data reaches the receiving end, where it is unpacked layer by layer; the application layer performs the final step and hands the data to the program.
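
To make the "wrap on the way down, unwrap on the way up" idea concrete, here is a minimal sketch. The bracketed header strings are placeholders I made up for illustration; real headers are binary structures with many fields.

```python
# Toy model of layered encapsulation. The bracketed header strings are
# hypothetical placeholders, not real protocol header formats.
LAYERS = ["HTTP", "TCP", "IP", "ETH"]  # application -> data link


def encapsulate(payload: bytes) -> bytes:
    """Sender side: each layer prepends its own header on the way down."""
    for name in LAYERS:
        payload = f"[{name}]".encode() + payload
    return payload


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: each layer strips its own header on the way up."""
    for name in reversed(LAYERS):
        header = f"[{name}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {name} header")
        frame = frame[len(header):]
    return frame


frame = encapsulate(b"hello")
print(frame)               # b'[ETH][IP][TCP][HTTP]hello'
print(decapsulate(frame))  # b'hello'
```

Note that each layer only touches its own header and treats everything after it as opaque payload, which is exactly the independence described above.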

DNS protocol:

The DNS protocol is an application layer protocol that runs over both UDP and TCP, with default port 53. It communicates over UDP by default, but if the message is too large it switches to TCP.

The role of the Domain Name System (DNS) is to convert human-readable domain names (such as www.baidu.com) into machine-readable IP addresses (such as 192.0.2.44). In essence it is a lookup of the mapping between domain names and IP addresses, and this mapping is stored on DNS servers.

Domain name resolution process:

Domain name resolution can be roughly divided into two steps: first, the client sends a DNS request message carrying the domain name to be queried to the local DNS server; second, the local DNS server replies to the client with a DNS response message carrying the IP address that corresponds to the queried domain name.

The specific process is as follows:

  1. The client first looks in its local cache. If the record is there, it returns the corresponding IP; if not, it sends the request to the local DNS server
  2. When the local DNS server receives the query, it first checks the zone records it manages, then its own cache; if neither has the answer, it sends the request to a root name server
  3. The root name server handles the root part of the name and returns to the local DNS server the address of the DNS server responsible for the next-level domain
  4. The local DNS server uses the address returned by the root name server to query the next-level DNS server and obtains the address of the DNS server for the level below that
  5. Repeating this step, the local DNS server approaches the target level by level and finally finds the IP address on the DNS server that holds the target domain name
  6. The local DNS server returns the resulting IP to the client, which can then access the corresponding host
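
The steps above can be sketched as a toy simulation. Everything here is an assumption for illustration: the server names (`root`, `tld-server`, `auth-server`) and the record tables are invented, and the sketch only handles the names in those tables. A real resolver exchanges DNS messages over the network instead of reading dictionaries.

```python
# Toy simulation of the resolution steps. The server names and the record
# tables are hypothetical; a real resolver sends DNS messages over UDP/TCP.
ROOT = {"com.": "tld-server"}                    # root knows the .com server
TLD = {"baidu.com.": "auth-server"}              # .com knows the zone's server
AUTH = {"www.baidu.com.": "192.0.2.44"}          # server holding the record
SERVERS = {"root": ROOT, "tld-server": TLD, "auth-server": AUTH}

cache = {}  # the local DNS server's cache


def resolve(name: str) -> str:
    if name in cache:                            # steps 1-2: cache hit
        return cache[name]
    labels = name.rstrip(".").split(".")
    server = "root"
    # Steps 3-5: ask about ever longer suffixes: "com.", "baidu.com.", ...
    for i in range(len(labels) - 1, -1, -1):
        suffix = ".".join(labels[i:]) + "."
        answer = SERVERS[server].get(suffix)
        if answer is None:
            raise LookupError(f"{name} not found")
        server = answer                          # a referral, or the final IP
    cache[name] = server                         # step 6: cache and return
    return server


print(resolve("www.baidu.com."))  # 192.0.2.44
```

A second call with the same name returns straight from `cache` without walking the hierarchy again.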

HTTP protocol

The HTTP protocol is a simple request-response protocol, which usually runs on top of TCP. It specifies what kind of messages the client may send to the server and what kind of responses it may get.
Like other application layer protocols, it serves one specific kind of application, and its functions are carried out by an application program running in user space. HTTP itself is a protocol specification recorded in documents; what actually communicates over HTTP is a program that implements that specification.

HTTP runs over the TCP protocol, which is connection-oriented. A typical HTTP transaction goes through the following steps:

  1. The client establishes a connection with the server;
  2. The client makes a request to the server;
  3. The server accepts the request and returns the corresponding data as a response according to the request;
  4. The client and server close the connection.

HTTP protocol message format
An HTTP exchange consists of a request (Request) from the client to the server and a response (Response) from the server to the client.

The request consists of a request line, request headers, and a request body. The request line contains the request method, path, and version number; the request headers contain multiple key-value pairs; and the request body contains the data being sent.
The response consists of a response line, response headers, and a response body. The response line contains the version number, status code, and status description; the response headers contain multiple key-value pairs; and the response body contains the response data.
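
As a rough illustration of this layout, the sketch below builds a request and parses a response by hand. The host `www.example.com`, the path, and the sample response bytes are made-up values, and the parser is deliberately naive (it assumes well-formed `Name: value` headers).

```python
# Build a minimal HTTP/1.1 request and parse a response by hand.
# The host, path and sample response are hypothetical illustration values.
def build_request(method: str, path: str, host: str, body: bytes = b"") -> bytes:
    lines = [f"{method} {path} HTTP/1.1",       # request line
             f"Host: {host}",                   # request headers
             f"Content-Length: {len(body)}"]
    return "\r\n".join(lines).encode() + b"\r\n\r\n" + body


def parse_response(raw: bytes):
    head, _, body = raw.partition(b"\r\n\r\n")  # blank line ends the headers
    status_line, *header_lines = head.decode().split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return version, int(code), reason, headers, body


request = build_request("GET", "/index.html", "www.example.com")
sample = b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nhello"
print(request)
print(parse_response(sample))
```

In practice you would use a library such as `http.client` rather than parsing by hand; the point here is only to show where the line, headers, and body sit in the byte stream.
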
Common HTTP response status code summary

200 OK: The client request is successful

3XX series

301 Moved Permanently: The requested resource has been permanently moved to a new URL. The response contains a Location header, the browser automatically redirects to the new URL, and subsequent requests should use the new URL.
302 Found: Similar to 301, but the requested resource is only temporarily moved to the new URL, and the client continues to use the original URL for the next request.
307 Temporary Redirect: Temporary redirection, similar to 302, except that the client must repeat the request with the same method (with 302, many clients switch to GET).

4XX series

400 Bad Request: The client request has a syntax error and the server cannot understand it (common when an ajax request sends malformed data to the backend)
401 Unauthorized: The request requires user authentication
403 Forbidden: The server understands the client request but refuses to execute it (typically because the user's permission level is insufficient)
404 Not Found: The server cannot find the resource the client requested
405 Method Not Allowed: The request method is not allowed for the requested resource

5XX series

500 Internal Server Error: The server has an internal error and cannot complete the request.
503 Service Unavailable: Due to overload or system maintenance, the server is temporarily unable to process the client's request. The length of the delay can be included in the server's Retry-After header

Features of the HTTP protocol

  1. Supports the client/server model
  2. Simple and fast: when the client sends a request to the server, only the request method and path need to be transmitted
  3. Flexible: HTTP allows any type of data object to be transferred
  4. Connectionless: each connection handles only one request; once the server has processed the request and the client has received the response, the connection is closed (this describes HTTP1.0; see the persistent connections of HTTP1.1 below)
  5. Stateless: the protocol itself keeps no memory of earlier transactions; if later processing needs information sent earlier, it must be retransmitted

The difference between HTTP1.0 and HTTP1.1 and HTTP2.0:

The difference between HTTP1.0 and HTTP1.1:

  1. Persistent connections: HTTP1.0 only supports short-lived connections between the browser and the server by default, that is, a connection must be re-established for each request. HTTP1.1 supports persistent connections, so the browser can send multiple requests to the server over one connection
  2. Host field: HTTP1.0 assumes each server is bound to a unique IP, so requests carry no host name. HTTP1.1 requires the Host header field in requests, and a request message without a Host header is rejected with an error (400 Bad Request)
  3. Cache: HTTP1.1 adds new caching features on top of 1.0. When the age of a cached object exceeds its expiry time it becomes a stale object. The cache does not need to discard the stale object directly; it can revalidate it with the origin server.
  4. Error reporting: HTTP1.0 defines 16 status codes, and its error and warning messages are not specific enough. HTTP1.1 introduces a Warning header field that describes errors and warnings in more detail, and adds 24 status codes, such as 409 (Conflict), which indicates that the request conflicts with the current state of the resource, and 410 (Gone), which indicates that a resource on the server has been permanently deleted

The difference between HTTP1.X and HTTP2.0

  1. Binary framing: HTTP1.X parses messages as text, and text formats come in many variants, which is inconvenient in many scenarios. HTTP2.0 uses a binary format, which makes parsing simpler and more robust
  2. Multiplexing: one connection is shared by many requests. Each request is given an id, so a single connection can carry multiple requests at the same time, and the receiving end uses the id to match each piece of data to its request
  3. Header compression: in HTTP1.X, the same headers are written out in full for every transmission, which takes up a lot of data. HTTP2.0 therefore keeps a header fields table on both the client and the server; each transmission only needs to send the headers that changed, and the table is updated accordingly
  4. Server push: HTTP2.0 also adds a server push function
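
Here is a small sketch of the binary framing and the per-request id. It packs the 9-byte HTTP/2 frame header as laid out in RFC 7540 section 4.1 and regroups interleaved frames by stream id. The helper names `pack_frame` and `unpack_frames` are my own, and header compression (HPACK) is not modelled at all.

```python
import struct

# Frame header layout from RFC 7540 section 4.1: 24-bit payload length,
# 8-bit type, 8-bit flags, 1 reserved bit + 31-bit stream identifier.
DATA = 0x0          # frame type DATA
END_STREAM = 0x1    # flag: last frame of this stream


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    length = len(payload)
    header = struct.pack(">BHBBI", length >> 16, length & 0xFFFF,
                         frame_type, flags, stream_id & 0x7FFFFFFF)
    return header + payload


def unpack_frames(data: bytes):
    """Yield (type, flags, stream_id, payload) for each frame in data."""
    offset = 0
    while offset < len(data):
        hi, lo, frame_type, flags, stream_id = struct.unpack_from(">BHBBI", data, offset)
        length = (hi << 16) | lo
        start = offset + 9
        yield frame_type, flags, stream_id & 0x7FFFFFFF, data[start:start + length]
        offset = start + length


# Frames of two requests (streams 1 and 3) interleaved on one connection.
wire = (pack_frame(DATA, 0, 1, b"aa") + pack_frame(DATA, END_STREAM, 3, b"bb")
        + pack_frame(DATA, END_STREAM, 1, b"cc"))
streams = {}
for _type, _flags, stream_id, payload in unpack_frames(wire):
    streams[stream_id] = streams.get(stream_id, b"") + payload
print(streams)  # {1: b'aacc', 3: b'bb'}
```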

HTTPS protocol

HTTPS is also an application layer protocol. It can be seen as an upgraded version of HTTP that adds security for the transmitted data. The HTTPS protocol wraps HTTP in an SSL shell: HTTPS runs on SSL, and SSL runs on TCP. Data encryption is done in the SSL layer.

It ensures security through certificate verification and hybrid encryption of the transmitted information:

Hybrid encryption technology: combines symmetric encryption and asymmetric encryption.
The server generates a private key, derives a public key from it, puts the public key in a certificate, and issues the certificate to the client.
The public key and the private key are then used, asymmetrically, to agree on a key shared by both sides.
In subsequent transmissions, the client uses this key to encrypt the information symmetrically before sending it to the server.


For the public key and private key mentioned above, we stipulate that the content encrypted with the public key must be unlocked with the private key. Similarly, the content encrypted with the private key can only be unlocked with the public key

Therefore, HTTPS transmits the data as ciphertext encrypted with the key, and transmits the key itself encrypted with the public key, to ensure data security.
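
The flow can be sketched with toy numbers. This is only an illustration under loud assumptions: the RSA values are the classic textbook example (p=61, q=53), the XOR keystream stands in for a real symmetric cipher, and the session key `1234` is arbitrary. None of this is secure, and real TLS negotiates keys differently.

```python
import hashlib

# Toy hybrid encryption. The RSA numbers are the classic textbook example
# (p=61, q=53) and the XOR stream cipher is a stand-in for a real symmetric
# cipher. Both are hypothetical teaching values and completely insecure.
N, E, D = 3233, 17, 2753            # public key (N, E), private key (N, D)


def xor_cipher(key: int, data: bytes) -> bytes:
    """Symmetric step: XOR with a keystream derived from the shared key."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(f"{key}:{counter}".encode()).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Client: pick a session key and send it encrypted with the public key.
session_key = 1234
encrypted_key = pow(session_key, E, N)
ciphertext = xor_cipher(session_key, b"GET /account HTTP/1.1")

# Server: recover the session key with the private key, then decrypt.
recovered_key = pow(encrypted_key, D, N)
plaintext = xor_cipher(recovered_key, ciphertext)
print(plaintext)
```

The slow asymmetric operation runs once, on the short key; the bulk data only ever goes through the fast symmetric step.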

Can HTTPS use symmetric encryption only?
No. Security cannot be guaranteed, because with symmetric encryption alone the same key encrypts all the transmitted data. If a third party hijacks the transmission and obtains the key, it can use the key to decrypt the following transmissions and recover the original data.

Can HTTPS use asymmetric encryption only? What about using it twice?
Asymmetric encryption alone does not work either. It seems safe for the client to encrypt data with the public key on every transmission and for the server to decrypt it with the private key. However, data the server encrypts with its private key can be decrypted by anyone holding the public key, and the public key may already have been intercepted, because the first time the server sends the public key to the client it is transmitted in plain text.
From another angle, if asymmetric encryption is used twice, that is, with two key pairs, one held by the client and one by the server, security can be achieved in theory. HTTPS does not do this in practice because asymmetric encryption is very time-consuming.

Certificate:

Hybrid encryption alone seems to guarantee the security of the transmission, but there is still a loophole. The problem is that the client cannot tell whether the public key it receives really belongs to the server. A third party that hijacks the transmission can generate its own key pair B and send public key B to the client in place of the server's. The client then encrypts its data with public key B, and the third party can decrypt the original data with its own private key B.

Certificates solve this problem. They are issued by a CA institution. When a website uses HTTPS, it applies to a CA institution for a digital certificate, which stores the public key along with other information. The server can then use the certificate to prove to the client which public key is the correct one, thereby ensuring security.
Certificates also have their own anti-tampering mechanisms to prevent third parties from altering or misusing them.

Transport layer protocols

The main function of the transport layer is to realize "port-to-port" communication, ensuring that once a piece of data reaches the host it is delivered to the correct port.

UDP protocol

UDP lets applications send encapsulated IP packets without establishing a connection. It has its own drawbacks, though: once data is sent, the sender cannot know whether the other party received it, so packets can easily be lost.

Features:

  • Connectionless: you only need to know the destination IP and port number to send data, without establishing a connection
  • Unreliable: there is no mechanism for dealing with packet loss during transmission
  • Datagram-oriented: whatever message the application layer hands to UDP is sent as is, without being split or merged
  • The size of data UDP can transmit at one time is limited, to a maximum of 64 KB
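
The 64 KB limit comes from the header itself, which a short sketch can show. The header has four 16-bit fields (RFC 768), so the length field cannot exceed 65535 bytes. The port numbers below are arbitrary example values, and the checksum is left at 0, which over IPv4 means "not computed".

```python
import struct

# UDP header per RFC 768: source port, destination port, length, checksum,
# each 16 bits. The port numbers below are hypothetical example values.
UDP_HEADER = struct.Struct(">HHHH")


def build_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    length = UDP_HEADER.size + len(payload)      # header + data, in bytes
    if length > 0xFFFF:                          # 16-bit field: the 64 KB cap
        raise ValueError("payload too large for one UDP datagram")
    checksum = 0                                 # 0 = "not computed" over IPv4
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


datagram = build_datagram(50000, 53, b"query")
print(UDP_HEADER.unpack(datagram[:8]))  # (50000, 53, 13, 0)
```

There is no sequence number or acknowledgement field anywhere in those 8 bytes, which is why UDP cannot notice a lost datagram on its own.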

Scope of application of UDP

Because UDP is a connectionless protocol, it consumes few resources and processes data quickly. It is therefore often used for video and audio calls, where a lot of data is sent and the occasional loss of one or two packets has little impact.

TCP

Because UDP transmission is unreliable, as mentioned above, and can lead to transmission errors and packet loss, another transport layer protocol, TCP, was defined to address these problems. TCP is a connection-oriented, reliable, byte-stream-based transport layer protocol.


Features of TCP:

  • Connection-oriented: a connection between the client and the server must be established before data can be transmitted
  • Reliable communication: when TCP transmits data, various internal mechanisms ensure that the data is delivered to the destination port
  • Byte-stream based: TCP transmits data as a stream of bytes, so data can easily be split and combined for sending
  • The overhead of TCP is higher than that of UDP, because more information needs to be stored

I will not go into TCP's internal mechanisms here; readers can refer to my earlier post, "The basics of network principles".

The difference between TCP and UDP:

  • UDP is connectionless, TCP is connection-oriented
  • UDP is unreliable, TCP is reliable
  • UDP is datagram-oriented, TCP is byte-stream-oriented
  • UDP has lower transmission overhead and is faster than TCP
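
One practical consequence of the datagram versus byte-stream difference: over TCP the application has to mark message boundaries itself. The sketch below uses a 4-byte length prefix, which is a common convention I am assuming here, not something TCP defines. It works on plain byte buffers, so no sockets are involved.

```python
import struct

# Length-prefixed framing: one common way for an application to recover
# message boundaries from a TCP byte stream. The 4-byte big-endian length
# prefix is an assumed convention for this sketch, not part of TCP itself.
def frame(message: bytes) -> bytes:
    return struct.pack(">I", len(message)) + message


def extract_messages(buffer: bytearray) -> list:
    """Remove and return every complete message currently in buffer."""
    messages = []
    while len(buffer) >= 4:
        (length,) = struct.unpack_from(">I", buffer, 0)
        if len(buffer) < 4 + length:
            break                                # wait for more bytes
        messages.append(bytes(buffer[4:4 + length]))
        del buffer[:4 + length]
    return messages


# Two messages arrive split at an arbitrary point, as TCP may deliver them.
stream = frame(b"hello") + frame(b"world")
buffer = bytearray(stream[:7])          # first chunk ends mid-message
print(extract_messages(buffer))         # []
buffer += stream[7:]                    # the rest arrives
print(extract_messages(buffer))         # [b'hello', b'world']
```

With UDP this bookkeeping is unnecessary, because each datagram arrives whole or not at all.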

[Figure: an illustration of the difference between TCP and UDP]

Network layer

The network layer is the third layer, sitting between the data link layer and the transport layer. Building on the frame transmission service that the data link layer provides between two adjacent nodes, it further manages data communication across the network, delivering data from the source to the destination through several intermediate nodes, and thereby provides the most basic end-to-end data transmission service to the transport layer.

The purpose of the network layer is to realize the transparent transmission of data between two end systems. Its specific functions include addressing and routing, and connection establishment, maintenance, and termination. Thanks to the services it provides, the transport layer does not need to know about the data transmission and switching technology used in the network.

IP protocol

The IP protocol is the core part of the TCP/IP network model. It provides a hierarchical, hardware-independent addressing method and the services needed to deliver data across complex routed networks.

The IP protocol can connect multiple switching networks and transmit data packets between the source address and the destination address. It also provides fragmentation and reassembly of data to meet different networks' limits on packet size.

Background knowledge:

IP address:
An IP address is the unified address format provided by the IP protocol. It assigns a logical address to each network and each host on the Internet, thereby hiding the differences between physical addresses.

The format of the IP address:
An IP address is a 32-bit address written as 4 parts, such as XXX.XXX.XXX.XXX, and is logically divided into two parts.
Network ID: identifies the network segment (the first three parts in a typical /24 network), ensuring that two interconnected network segments have different identifiers.
Host ID: the remaining part (the last part in a /24 network), which identifies the host and ensures that two hosts in the same network segment have different host IDs.
By setting the host ID and network ID appropriately, every host in an interconnected network is guaranteed a different IP address.
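
Python's standard `ipaddress` module can show the split directly. The address `192.168.1.10` and the /24 prefix length are example values I picked; with a different prefix length the boundary between network ID and host ID moves.

```python
import ipaddress

# Split an IPv4 address into network ID and host ID. The address and the
# /24 prefix length are hypothetical example values.
iface = ipaddress.ip_interface("192.168.1.10/24")

network_id = iface.network.network_address             # first 24 bits
host_id = int(iface.ip) & int(iface.network.hostmask)  # last 8 bits

print(network_id, host_id)  # 192.168.1.0 10
```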

MAC address:
Also known as the physical address, it is used to identify each device in the network. The MAC address is hard-coded into the device at the factory.

The purpose of introducing IP addresses:
In a single LAN segment, computers can communicate with each other using the MAC addresses provided by the data link layer.
In a routed network, computers cannot communicate using MAC addresses alone, mainly because the data does not travel over a single link between the two computers' MAC addresses but over multiple hops. To know where to jump at each hop in order to get closer to the destination host, a logical, hierarchical addressing scheme must be used to organize the network, and that is the IP address.

[Figure: IP protocol datagram format]

How the IP protocol works:

Depending on whether the source and the destination are in the same network segment, there are two cases:

  • Same network segment: if the source host and the destination host are in the same network segment, the destination IP address is resolved to a MAC address by the ARP protocol, and the source host sends the data packet directly to the destination host using that MAC address
  • Different network segments:
    If the source host and the destination host are not in the same network segment, the data packet goes through several steps before it finally reaches the destination host
    1. The IP address of the gateway (usually a router) is resolved to a MAC address by the ARP protocol, and the source host sends the data packet to the gateway using that MAC address
    2. The gateway looks for the target network according to the network ID in the data packet. If it finds it, it sends the data packet to the target network; if not, it repeats the first step and sends the packet to another, higher-level gateway
    3. Once the data packet has reached the correct network segment through the gateways, the target IP is resolved to a MAC address by the ARP protocol, and the data packet is sent to the destination host using that MAC address
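
The first decision the source host makes (deliver directly, or hand to the gateway) can be sketched as follows. The local address, prefix, and gateway are example values I chose, and the function name `next_hop` is my own; a real host consults its routing table, which may hold more than one route.

```python
import ipaddress

# Decide where the first hop goes. The addresses and the gateway are
# hypothetical example values for a host on 192.168.1.0/24.
LOCAL = ipaddress.ip_interface("192.168.1.10/24")
GATEWAY = ipaddress.ip_address("192.168.1.1")


def next_hop(destination: str):
    """Return the IP whose MAC address ARP must resolve for this packet."""
    dest = ipaddress.ip_address(destination)
    if dest in LOCAL.network:       # same segment: deliver directly
        return dest
    return GATEWAY                  # different segment: hand to the gateway


print(next_hop("192.168.1.20"))  # 192.168.1.20
print(next_hop("192.0.2.44"))    # 192.168.1.1
```

In both cases the destination IP written in the packet stays the same; only the MAC address on the frame changes from hop to hop.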

ICMP protocol

The ICMP protocol is also called the control message protocol. It is used to pass control messages between IP hosts and routers, reporting whether the network is clear, whether a host is reachable, whether a router is available, and so on. ICMP itself does not carry user data, but it plays an important role in the delivery of user data.

Function:
On its way from the source host to the destination host, a data packet passes through one or more routers, and it may run into many problems along the way that prevent it from being delivered to the destination host. To find out at which stage the packet ran into a problem, the ICMP protocol is needed: it can trace the packet and return a message to the source host.
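
As a concrete example of an ICMP message, here is a sketch that builds an echo request (the message `ping` sends) with its checksum. It only constructs the bytes; actually sending them needs a raw socket, which usually requires administrator privileges, so that part is left out. The identifier, sequence number, and payload are arbitrary example values.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum used by ICMP (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                          # pad to a whole 16-bit word
    total = sum(struct.unpack(f">{len(data) // 2}H", data))
    while total >> 16:                           # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    # Type 8 = echo request, code 0; checksum is computed with the field zeroed.
    header = struct.pack(">BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack(">BBHHH", 8, 0, checksum, identifier, sequence) + payload


packet = build_echo_request(identifier=1, sequence=1, payload=b"ping")
# A receiver verifies by checksumming the whole packet: the result is 0.
print(internet_checksum(packet))  # 0
```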


Data link layer

The data link layer is the second layer of the TCP/IP network model, sitting between the physical layer and the network layer. It provides services to the network layer on the basis of the services provided by the physical layer. Its most basic service is to reliably deliver data to the network layer of the adjacent node.

ARP protocol

The ARP protocol bridges network-layer addressing and physical addressing by converting IP addresses to MAC addresses during data transmission over the network.

The purpose of introducing the ARP protocol:
Because IP addresses and MAC addresses locate hosts in different ways, the ARP protocol is essential for data transmission. Before a host sends information, it must obtain the MAC address corresponding to the target IP address through the ARP protocol so that the data packet can be sent correctly.
ARP workflow:
For two hosts in the same network segment, the ARP workflow is as follows:

  • Host A broadcasts an ARP request to all hosts in the network segment, and the request contains the IP address of the destination host
  • Host B receives the request, sees from the destination IP address in the request that it is the host A is looking for, and returns a response containing its own MAC address

ARP cache:
Without a cache, a host would have to send an ARP request every time it needed the MAC address of a target host, and then read the MAC address from the response.

To avoid repeating ARP requests, each host has an ARP cache. After the host gets the ARP response, it stores the IP address and physical address of the target host in the local ARP cache and keeps it for a certain period of time.

As long as within this time range, the next time the MAC address is requested, the ARP cache is directly queried without sending an ARP request, thereby saving network resources.
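
A minimal sketch of such a cache is shown below. The class name `ArpCache`, the 60-second lifetime, and the addresses are all assumptions for illustration; real operating systems choose their own timeouts and eviction rules. The clock is injectable so the demo can advance time without sleeping.

```python
import time


class ArpCache:
    """Toy ARP cache: IP -> (MAC, expiry). The 60 s lifetime is an assumed value."""

    def __init__(self, ttl: float = 60.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock          # injectable so the demo can fake time
        self.entries = {}

    def store(self, ip: str, mac: str) -> None:
        self.entries[ip] = (mac, self.clock() + self.ttl)

    def lookup(self, ip: str):
        """Return the cached MAC, or None if absent or expired."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[ip]    # stale: caller must send a new ARP request
            return None
        return mac


now = [0.0]                          # fake clock for a deterministic demo
arp_cache = ArpCache(ttl=60.0, clock=lambda: now[0])
arp_cache.store("192.168.1.20", "aa:bb:cc:dd:ee:ff")
print(arp_cache.lookup("192.168.1.20"))  # aa:bb:cc:dd:ee:ff
now[0] = 61.0
print(arp_cache.lookup("192.168.1.20"))  # None
```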

Physical layer

The physical layer, as the name implies, uses physical means to connect two computers so that they can communicate, and is mainly used to transmit the optical and electrical signals that represent 0 and 1. Because this layer is so hardware-oriented, this article will not go into details.

Overall network transmission process

Now that we have looked at each layer of the network model, let's see what happens when a web page is accessed.

Host A: sends a network datagram for http://www.baidu.com

  1. DNS resolution: convert the domain name to the corresponding IP address (the search starts in the local DNS cache and then moves up level by level; if even the root name servers cannot resolve it, the domain name does not exist on the public network)
  2. Once the IP is found: the destination MAC address must be found from the destination IP
  3. Determine from the destination IP whether the destination host is in the same network segment as host A
  4. If it is in the same network segment: resolve the destination MAC through the ARP protocol and skip to step 9
  5. If it is not in the same network segment: the datagram must go to the gateway. Host A first looks up the gateway IP in its ARP cache table; if no MAC address is found, it broadcasts an ARP query, and the gateway replies with its own MAC
  6. Switch forwarding: the switch looks up the destination MAC in its MAC address table to find the corresponding interface
  7. Router receives: it unpacks the datagram
  8. Devices on the way: do the same as step 7. If the destination IP does not belong to the current device, the device repeats this operation and forwards the datagram to a router closer to the destination IP
  9. Destination host B is found. The server on host B accepts the request, unpacks it layer by layer, parses it, and finally builds the response
  10. In the same way as above, host B sends data to host A
  11. Finally, host A receives the datagram, unpacks it, parses it, and obtains the response


Origin blog.csdn.net/m0_46233999/article/details/119455352