Basic knowledge of network technology

-- Download the packet capture tool
https://www.wireshark.org/download.html

https://www.charlesproxy.com/download/

Network model

Before learning any specific topic, it helps to understand the knowledge system and model it belongs to, and the same is true for networking. There are currently two widely recognized network models: the OSI seven-layer model and the TCP/IP five-layer model. Please see the following figure:

As you can see, the layers of the OSI seven-layer model and the TCP/IP five-layer model correspond to each other, and the layers below the transport layer are completely consistent (in the four-layer form of the TCP/IP model, the network interface layer is the combination of the data link layer and the physical layer). Once the session layer, presentation layer, and application layer of the OSI model are merged into the application layer of the TCP/IP model, the two models are basically the same.
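Since the figure is not reproduced here, the correspondence described above can be written down as a small lookup table. This is only an illustrative sketch; the names OSI_TO_TCPIP and tcpip_layers are made up for this example.

```python
# Each OSI layer (top to bottom) mapped to its TCP/IP five-layer counterpart.
OSI_TO_TCPIP = {
    "application": "application",
    "presentation": "application",
    "session": "application",
    "transport": "transport",
    "network": "network",
    "data link": "data link",
    "physical": "physical",
}


def tcpip_layers():
    """Return the TCP/IP layers from top to bottom, without duplicates."""
    layers = []
    for layer in OSI_TO_TCPIP.values():
        if layer not in layers:
            layers.append(layer)
    return layers


print(tcpip_layers())
```

The three upper OSI layers collapse into one application layer, and the remaining layers map one to one.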

The two models above are both general network models. The TCP/IP model is the more common one in practice, so we mainly use it as our network model here, which is also where the name TCP/IP in this class comes from.

So what protocols does each layer correspond to? Please see the picture below:
[Click to view the original size picture]

You can see where some of the well-known protocols sit: the IP protocol is at the network layer, the TCP protocol is at the transport layer, and the HTTP protocol is at the application layer. The other familiar protocols, such as DNS and FTP, each have their own layer as well.
[Click to view the original size picture]

We can verify this with a Wireshark packet capture. Take any HTTP packet:
[Click to view the original size picture]

From top to bottom are the Frame header, the Ethernet header, the IP protocol header, the TCP protocol header, and the HTTP protocol header. The last line is the data of this request, which is in JSON format.
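The same layering can be peeled apart by hand. The sketch below builds a fake Ethernet/IPv4/TCP frame and then parses it header by header. It is a simplified illustration, not what Wireshark does internally: checksums are left at zero, the MAC addresses and source port are made up, and the helper names build_sample_frame and parse_frame are hypothetical.

```python
import socket
import struct


def build_sample_frame(payload: bytes) -> bytes:
    """Build an Ethernet II + IPv4 + TCP frame around payload (checksums left as 0)."""
    eth = struct.pack("!6s6sH", b"\x02\x00\x00\x00\x00\x02",
                      b"\x02\x00\x00\x00\x00\x01", 0x0800)  # EtherType 0x0800 = IPv4
    ip = struct.pack("!BBHHHBBH4s4s",
                     0x45, 0, 20 + 20 + len(payload),  # version 4, IHL 5, total length
                     0, 0, 64, 6, 0,                   # id, fragment, TTL, protocol 6 = TCP
                     socket.inet_aton("10.2.203.93"),
                     socket.inet_aton("10.108.21.2"))
    tcp = struct.pack("!HHIIHHHH", 50000, 80, 0, 0,
                      (5 << 12) | 0x018,               # data offset 5 words, PSH+ACK flags
                      65535, 0, 0)
    return eth + ip + tcp + payload


def parse_frame(frame: bytes) -> dict:
    """Strip the Ethernet, IPv4 and TCP headers and return the interesting fields."""
    _dst_mac, _src_mac, ether_type = struct.unpack("!6s6sH", frame[:14])
    if ether_type != 0x0800:
        raise ValueError("not an IPv4 frame")
    ip = frame[14:]
    ip_header_len = (ip[0] & 0x0F) * 4        # IHL is counted in 32-bit words
    if ip[9] != 6:
        raise ValueError("not a TCP segment")
    tcp = ip[ip_header_len:]
    src_port, dst_port = struct.unpack("!HH", tcp[:4])
    tcp_header_len = (tcp[12] >> 4) * 4       # data offset is counted in 32-bit words
    return {
        "src_ip": socket.inet_ntoa(ip[12:16]),
        "dst_ip": socket.inet_ntoa(ip[16:20]),
        "src_port": src_port,
        "dst_port": dst_port,
        "payload": tcp[tcp_header_len:],
    }


info = parse_frame(build_sample_frame(b"GET / HTTP/1.1\r\nHost: example\r\n\r\n"))
print(info["src_ip"], info["dst_ip"], info["dst_port"], info["payload"][:14])
```

Each header only needs to know where the next one starts, which is why the layers can be stacked and removed independently.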

Three-way handshake and four-way wave

The so-called three-way handshake and four-way wave are the processes of establishing and disconnecting a TCP connection. The three-way handshake means that three data interactions are required when establishing a connection. Similarly, the four-way wave means that four data interactions are required when disconnecting. The interaction process is shown in the following diagram:
[Click to view the original size picture]

As a simple example, suppose two people, Xiao S and Xiao C, make a phone call. The three-way handshake that establishes the connection goes like this:

    Xiao S: Hello, is that Xiao C?
    Xiao C: Yes, is that Xiao S?
    Xiao S: Yes, let's have a nice chat!

And the four-way wave goes like this:

    Xiao S: Hey, Xiao C, I'm a little tired. How about we stop here for today?
    Xiao C: Okay, get some rest. Let me just say two more things.
    Xiao C: Oh, I'm tired too. Let's stop here for today.
    Xiao S: Okay, let's stop here. 886 (bye-bye)!

Then Xiao S and Xiao C hung up the phone. Notice that during the four-way wave, Xiao S was the first to propose disconnecting, but the conversation did not actually end there. After Xiao C acknowledged the message, he did not disconnect immediately but kept talking. This is because TCP is full-duplex: put simply, one connection has two lines, one from Xiao S to Xiao C and one from Xiao C to Xiao S. What Xiao S proposed and Xiao C confirmed closing was only the line from Xiao S to Xiao C, so Xiao C could continue to send messages to Xiao S. Only when Xiao C also decided the connection should be closed, and Xiao S confirmed it, were all the lines between the two completely closed.

You may ask why TCP is designed this way. It is because TCP is a full-duplex protocol. Full duplex is a communications term meaning that data can be transmitted in both directions at the same time. As mentioned in the example above, a TCP interaction has to maintain two lines, so whether a connection is being established or disconnected, the state of both lines must be kept correct.
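The "two lines" idea can be tried out on the loopback interface. In this minimal sketch the client closes only its sending direction with shutdown(SHUT_WR), and the server can still send afterwards. The function name half_close_demo and the message texts are made up for illustration, and the sketch assumes that local TCP sockets on 127.0.0.1 are allowed.

```python
import socket


def half_close_demo():
    """Show that closing one direction of a TCP connection leaves the other usable."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=5)
    server, _addr = listener.accept()
    server.settimeout(5)
    try:
        client.sendall(b"I am tired, that is it for today")
        client.shutdown(socket.SHUT_WR)      # sends FIN: the client-to-server line closes

        heard_by_server = b""
        while True:
            chunk = server.recv(1024)
            if not chunk:                    # b"" means the peer's FIN has arrived
                break
            heard_by_server += chunk

        server.sendall(b"two more words")    # the server-to-client line is still open
        server.shutdown(socket.SHUT_WR)      # now the second line closes as well

        heard_by_client = b""
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            heard_by_client += chunk
        return heard_by_server, heard_by_client
    finally:
        client.close()
        server.close()
        listener.close()


print(half_close_demo())
```

The client still receives "two more words" after it has stopped sending, just as Xiao C kept talking after Xiao S said he was done.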

The explanation above is still theoretical. To consolidate it, we use Wireshark to capture packets in an actual production environment and take a look:

    The client IP is: 10.2.203.93
    The server IP is: 10.108.21.2

The packet capture when the client and the server establish a connection:
[Click to view the original size picture]

You can see that the client first sends a SYN message, the server receives it and responds with a SYN ACK message, and the client finally returns an ACK message, at which point the connection is established.
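The sequence and acknowledgment numbers in these three segments follow a simple rule: each side acknowledges the other side's initial sequence number plus one. Here is a small sketch of that arithmetic. The Segment class and the initial sequence numbers are invented for illustration (real stacks pick random initial values).

```python
from dataclasses import dataclass
from typing import Optional

SEQ_SPACE = 2 ** 32  # TCP sequence numbers wrap around at 32 bits


@dataclass
class Segment:
    sender: str
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments of a handshake for the given initial sequence numbers."""
    syn = Segment("client", "SYN", seq=client_isn)
    # The SYN consumes one sequence number, so the server acknowledges client_isn + 1.
    syn_ack = Segment("server", "SYN,ACK", seq=server_isn,
                      ack=(syn.seq + 1) % SEQ_SPACE)
    ack = Segment("client", "ACK", seq=(syn.seq + 1) % SEQ_SPACE,
                  ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

In Wireshark the same numbers usually appear as relative values (0 and 1), because it subtracts the initial sequence number by default.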

Now let's look at what happens when the connection is closed:
[Click to view the original size image]

Unlike when the connection was established, the initiator of the disconnection is the server. You can see that the server sends a FIN message and the client then sends an ACK message. At this point, the server no longer transmits data to the client. After the client finishes its own data transmission, it also sends a FIN message to the server, and the connection is officially closed once the server's final ACK message has been received.

That covers the three-way handshake and four-way wave used when a TCP connection is established and disconnected. Here is a TCP state transition diagram as well, which is very helpful for understanding the TCP protocol as a whole:
[Click to view the original size image]
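The closing part of the state diagram can also be written as a small transition table. The sketch below covers only the four-way wave (not the full diagram), and the event labels are informal names chosen for this example.

```python
# (current state, event) -> next state, for the connection teardown only.
TRANSITIONS = {
    # Side that closes first (the server in the capture above).
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    # Side that closes second (the client in the capture above).
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def walk(events, state="ESTABLISHED"):
    """Apply events in order and return the list of states passed through."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


active_close = walk(["send FIN", "recv ACK", "recv FIN, send ACK", "2MSL timeout"])
passive_close = walk(["recv FIN, send ACK", "send FIN", "recv ACK"])
print(" -> ".join(active_close))
print(" -> ".join(passive_close))
```

The side that closes second stays in CLOSE_WAIT for as long as it still has data to send, which is the half-open period described in the phone call example.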

DNS resolution

DNS (Domain Name System) is a distributed database on the Internet that maps domain names and IP addresses to each other. It lets users access the Internet more conveniently, without having to remember the numeric IP strings that machines read directly. The process of obtaining the IP address that corresponds to a host name is called domain name resolution (or host name resolution).

This is the description from Baidu Encyclopedia. Put simply, DNS resolution is a system that converts easy-to-remember domain names into IP addresses. Let's use Wireshark to see exactly how it works.

When we enter www.baidu.com in the browser, a DNS request message is sent to the DNS server. After the server has processed the request, it sends back a DNS response message, which contains the IP address we care about. You can see that we captured two packets: the first is the DNS request message and the second is the DNS response message. Note the filter condition: it is more convenient to filter by UDP port:
[Click to view the original size image]

First look at the DNS request message:
[Click to view the original size image]

You can see that the transport layer protocol used by DNS here is UDP, not TCP, and its port number is 53. Next is the Transaction ID (2 bytes), which serves as a unique ID for the DNS request. A request and its response carry the same ID, so the ID can also be used to find the response message that corresponds to a request message.

The Flags field is also 2 bytes long. Its 16 bits are divided into the following parts, in order:

    Response (1 bit): 0 indicates a DNS request message, and 1 indicates a DNS response message.
    opcode (4 bits): defines the type of query or response (0 is a standard query, 1 is an inverse query, and 2 is a server status request).
    AA (1 bit): authoritative answer flag. This bit is valid in response messages; 1 indicates that the name server is the authoritative server.
    TC (1 bit): truncation flag. 1 means that the response exceeded 512 bytes and has been truncated.
    RD (1 bit): 1 means that the client wants a recursive answer.
    RA (1 bit): can only be set to 1 in a response message, indicating that a recursive response is available.
    zero (3 bits): reserved field, all 0.
    rcode (4 bits): return code, indicating the error status of the response, usually 0 or 3. The meaning of each value is as follows:
        0: No error
        1: Format error
        2: Server failure (the problem is on the domain name server)
        3: Name error (the queried domain name does not exist)
        4: Query type not supported
        5: Refused (prohibited by administrative policy)
        6 to 15: Reserved

The fields immediately below Flags are queries, answers, authr, and addrr, which mean the number of questions, the number of answer resource records, the number of authority resource records, and the number of additional resource records. Each of them is 2 bytes long. In a request, queries is generally 1 and the remaining fields are 0.
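The 12-byte header described above (Transaction ID, Flags, and the four counters) can be decoded with a few shifts and masks. This is a minimal sketch: parse_dns_header is a hypothetical helper name, and the sample header bytes are hand-made rather than taken from the capture.

```python
import struct


def parse_dns_header(message: bytes) -> dict:
    """Decode the fixed 12-byte DNS header into its individual fields."""
    tid, flags, queries, answers, authr, addrr = struct.unpack("!HHHHHH", message[:12])
    return {
        "transaction_id": tid,
        "response": (flags >> 15) & 0x1,  # 0 = request, 1 = response
        "opcode": (flags >> 11) & 0xF,
        "aa": (flags >> 10) & 0x1,
        "tc": (flags >> 9) & 0x1,
        "rd": (flags >> 8) & 0x1,
        "ra": (flags >> 7) & 0x1,
        "zero": (flags >> 4) & 0x7,
        "rcode": flags & 0xF,
        "queries": queries,
        "answers": answers,
        "authr": authr,
        "addrr": addrr,
    }


# A typical request: flags 0x0100 (only RD set), one question.
request = struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
# A typical response: flags 0x8180 (Response, RD and RA set), one question, three answers.
response = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 3, 0, 0)
print(parse_dns_header(request))
print(parse_dns_header(response))
```

Matching the transaction_id of the two headers is how a client pairs a response with its request.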

Next is the body of the message, which includes the domain name to be queried, the query type, and the corresponding query class. The domain name format is somewhat special. The domain name here is www.baidu.com, and the part marked in blue is its representation in the message. You can see that 03 means a length of 3 bytes, followed by three 77s; 0x77 is the ASCII code for "w". So for www.baidu.com, "." is used as the delimiter to split the name into 3 parts, and each part is written as its length followed by the ASCII codes of that part of the domain name. Together these segments form the complete domain name.
[Click to view the original size image]
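The length-prefixed format is easy to reproduce. The sketch below encodes a domain name the way it appears in the question section, including the terminating zero-length byte that marks the end of the name. The function names are made up for this example, and the decoder deliberately ignores the compression pointers that can appear in real responses.

```python
def encode_qname(domain: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    out = bytearray()
    for label in domain.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError("each label must be 1 to 63 bytes long")
        out.append(len(raw))  # length byte, e.g. 0x03 for "www"
        out += raw            # ASCII bytes of the label
    out.append(0)             # zero-length label terminates the name
    return bytes(out)


def decode_qname(data: bytes) -> str:
    """Decode length-prefixed labels back into a dotted name (no compression support)."""
    labels = []
    pos = 0
    while data[pos] != 0:
        length = data[pos]
        labels.append(data[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    return ".".join(labels)


encoded = encode_qname("www.baidu.com")
print(encoded.hex())
print(decode_qname(encoded))
```

The hex output starts with 03 77 77 77, which is the same byte sequence highlighted in the capture.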

The two fields after the domain name are Type and Class, and both are 1. Type 1 is A, which means the request asks for the IP address of a domain name; this is also the most common form of DNS request. Class 1 means that the data being queried is Internet data, which is also the most common case.

After introducing the request message, let's take a look at the response message:
[Click to view the original size image]

The parts of the response message that are the same as in the request message will not be repeated. You can see that the Response value in Flags is 1, which indicates that this is a response message. The Transaction ID is also the same as the ID in the request message, indicating that this is the response to the request message above.

The most important part of the response message body is the Answers field, which includes the IP address we want. But notice that for the domain name www.baidu.com there are actually 3 answers, so which one should be used? Let's look at them one by one.

In the first Answer, the Type is CNAME, which indicates that the domain name queried in the request message is an alias and that the answer carries the name it points to: www.baidu.com points to www.a.shifen.com. In the next two Answers, the Type is A, which means the returned value is an IPv4 address. Other common Types include AAAA (IPv6 address), PTR (IP address to domain name), and NS (name server).

You can see that multiple IP addresses can be returned for the same domain name. The response message above returns two IP addresses, 61.135.169.125 and 61.135.169.121, which are the final result we want. A domain name usually has two or more IP addresses so that, if one of them fails, access can switch to another one, which provides an active/standby disaster recovery effect. If you enter 61.135.169.125 or 61.135.169.121 in the browser, you can also access the page normally:
[Click to view the original size image]

If you use the Chrome browser, you can enter chrome://net-internals/#dns to view the browser's DNS resolution list:
[click to view original size image]

Here you can see the websites you have visited and the corresponding resolution records. There is also a TTL column, which represents how long a domain name resolution result stays valid. Put simply, when we resolve a domain name, the record is cached, and any access within the TTL is served directly from the cache instead of performing DNS resolution again. If the TTL is too long, a change to the domain name's resolution on the server takes a long time to reach clients, so the TTL should be adjusted according to the needs of the actual production environment.
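The caching behavior can be sketched in a few lines. This is not how Chrome implements its cache; DnsCache is a made-up class that only illustrates the TTL rule, and it accepts a replaceable clock so that expiry can be demonstrated without waiting.

```python
import time


class DnsCache:
    """A tiny TTL cache: entries are served until their TTL has elapsed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (addresses, expiry time)

    def put(self, name, addresses, ttl):
        self._entries[name] = (list(addresses), self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None                # never resolved: a real lookup is needed
        addresses, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]    # TTL elapsed: force a fresh lookup
            return None
        return addresses


fake_now = [0.0]                       # controllable clock for the demonstration
cache = DnsCache(clock=lambda: fake_now[0])
cache.put("www.baidu.com", ["61.135.169.125", "61.135.169.121"], ttl=300)
fresh = cache.get("www.baidu.com")     # within the TTL: served from the cache
fake_now[0] = 300.0
expired = cache.get("www.baidu.com")   # TTL elapsed: the cache no longer answers
print(fresh, expired)
```

A long TTL means fewer lookups but slower propagation of changes, which is the trade-off described above.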

That is all for DNS resolution for now. It is recommended that you follow the packet capture process above and try it yourself; you will get a deeper understanding of the whole process!

Application layer protocol

After introducing the TCP protocol and the DNS protocol, we move on to the application layer protocols at the top of the TCP/IP model. The application layer is also the layer that interacts most closely with the user, so it has the most direct impact on what the user perceives. The following introduces several common application layer protocols.

HTTP

HTTP (HyperText Transfer Protocol) is used to transmit WWW data. For details of the HTTP protocol, please refer to RFC 2616. The HTTP protocol adopts a request/response model. The client sends a request to the server; the request contains the request method, URL, and protocol version, followed by a MIME-like message containing request modifiers, client information, and content. The server responds with a status line containing the message's protocol version and a success or error code, followed by server information, entity meta information, and possibly entity content.

The above is the authoritative description of HTTP. HTTP can be said to be the most common and important protocol on the Internet; even the article you are reading now was transmitted over HTTP. So how does it work? This time we use another packet capture tool, Charles, to capture and analyze packets.

When we use the browser to visit http://www.csdn.net/, the browser displays the page after we enter the address and press Enter:
[Click to view the original size image]

This is a perfectly ordinary operation. Now we use Charles to capture the packets and get the following result:
[Click to view the original size image]

The above is one complete interaction, which actually contains two parts. The first is the HTTP request, shown in the upper half of the figure. You can see several key items in the request: GET, HTTP/1.1, Host, User-Agent, Accept, and Cookie. These make up the header of the request and tell the server which data the client wants. After the server receives the request, it answers with the HTTP response message, shown in the lower half of the figure. Because we are visiting the csdn homepage, and Accept lists HTML as one of the acceptable data types, the data returned in the response message contains HTML. Of course, the response message also has other components, as shown in the following figure:
[Click to view the original size image]

This includes the server's response data for this request. The most important item is 200 OK, which is the response status code. The most common code is 200, which means the request succeeded. Other common codes are 404 and 502: the former is a client-side error (the requested resource was not found), and the latter means the server side failed to respond (bad gateway). For example, when we enter http://www.csdn.net/test123, the page shows:
[Click to view the original size image]
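The request and response messages seen in Charles are plain text, so they can be built and taken apart by hand. The sketch below is a simplified illustration that ignores most of the protocol's details; build_request, parse_response_head, and describe_status are hypothetical helper names, and the sample response is hand-written rather than captured.

```python
from http import HTTPStatus


def build_request(method, path, host, headers=None):
    """Build the text of an HTTP/1.1 request: request line, headers, blank line."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_response_head(raw):
    """Split a response into protocol version, status code, reason, headers and body."""
    head, _sep, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _colon, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return parts[0], int(parts[1]), reason, headers, body


def describe_status(code):
    """Name a status code and say which side it blames."""
    if 200 <= code < 300:
        kind = "success"
    elif 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "other"
    return f"{code} {HTTPStatus(code).phrase} ({kind})"


print(build_request("GET", "/", "www.csdn.net", {"Accept": "text/html"}))
print(parse_response_head(b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\nhello"))
for status in (200, 404, 502):
    print(describe_status(status))
```

The first digit of the status code is enough to tell whether the request succeeded, the client made a mistake, or the server failed.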

HTTPS

HTTPS (full name: Hyper Text Transfer Protocol over Secure Socket Layer) is a secure HTTP channel, or simply the secure version of HTTP. That is, an SSL layer is added under HTTP. The security foundation of HTTPS is SSL, so the details of the encryption are handled by SSL. It is a URI scheme (abstract identifier scheme) with a syntax similar to the http: scheme, used for secure HTTP data transfer. An https: URL indicates that HTTP is used, but with a different default port than HTTP and with an encryption/authentication layer between HTTP and TCP.

From the description above we can draw a simple conclusion: HTTPS = HTTP + TLS (TLS is the successor of SSL). In short, HTTPS is a more secure protocol than HTTP, and many websites have begun to support it. For example, Baidu:
[Click to view the original size image]

You can see that the HTTPS protocol is used, and the browser shows a "Secure" prompt. Let's look at a few other examples:
[Click to view the original size image]

The picture above is the login page of ICBC, and you can see that it also uses the HTTPS protocol. If a site still uses the HTTP protocol, the browser does not show the "Secure" prompt:
[Click to view the original size image]

Ha, the CCB homepage has not yet switched to the HTTPS protocol. Does that mean it is not safe?

In fact, when we click the login button:
[Click to view the original size image]

You can see that the login page still uses the HTTPS protocol, which means that our login operation is still secure. This greatly reduces the possibility of account information being stolen.
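The layering HTTPS = HTTP + TLS can be seen directly in code: the HTTP text stays the same and is simply sent through a TLS-wrapped TCP socket. The following is a rough sketch using Python's standard ssl module. The function names are invented for this example, and https_status_line needs real network access, so it is defined here but not called.

```python
import socket
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Create a client-side TLS context that verifies certificates and host names."""
    return ssl.create_default_context()


def https_status_line(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Send a HEAD request over TLS and return the first line of the response."""
    request = (f"HEAD / HTTP/1.1\r\nHost: {host}\r\n"
               "Connection: close\r\n\r\n").encode("ascii")
    context = make_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as tcp:   # TCP layer
        with context.wrap_socket(tcp, server_hostname=host) as tls:        # TLS layer
            tls.sendall(request)                                           # plain HTTP on top
            data = b""
            while b"\r\n" not in data:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                data += chunk
    return data.split(b"\r\n", 1)[0].decode("iso-8859-1")


context = make_tls_context()
print(context.verify_mode, context.check_hostname)
```

With network access, calling https_status_line("www.baidu.com") should return the response status line; the exact value depends on the server. Note the default port 443, compared with port 80 for plain HTTP.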

HTTP2

HTTP/2 (Hypertext Transfer Protocol Version 2, originally named HTTP 2.0), abbreviated as h2 (encrypted connections based on TLS 1.2 or above) or h2c (unencrypted connections) [1], is the second major version of the HTTP protocol and is used on the World Wide Web.

HTTP/2 is the first update of the HTTP protocol since HTTP 1.1 was released in 1999, and is mainly based on the SPDY protocol. It was developed by the Hypertext Transfer Protocol Bis (httpbis) working group of the Internet Engineering Task Force (IETF). [2] The group submitted the HTTP/2 standard proposal to the IESG for discussion in December 2014 [3], and it was approved on February 17, 2015. [4]

The HTTP/2 standard was officially published as RFC 7540 in May 2015. [5] The standardization work of HTTP/2 is supported by browsers such as Chrome, Opera, Firefox[6], Internet Explorer 11, Safari, Amazon Silk and Edge. [7]

Most major browsers already supported the protocol by the end of 2015. [8] Furthermore, according to W3Techs, in May 2017, 13.7% of the top 10 million websites supported HTTP/2.

As you can see, the HTTP2 protocol has earned its place on the Internet. So where is it stronger than HTTP? Compared with HTTP/1.x, HTTP/2 made major changes and optimizations to the underlying transmission, which can be summed up roughly as follows:

    Binary transmission: HTTP2 transmits data in a binary format instead of HTTP's text format. The binary format brings more advantages and possibilities for parsing, optimizing, and extending the protocol.
    Multiplexing: HTTP2 supports concurrent requests on one connection by means of streams. Streams also support priority and flow control.
    Server push: the server can push resources to the client sooner.
    Header compression: header compression makes the whole HTTP2 packet much smaller and faster to transmit.
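To make "binary transmission" and "streams" more concrete: every HTTP/2 frame starts with a fixed 9-byte header defined in RFC 7540, consisting of a 24-bit payload length, an 8-bit type, 8 bits of flags, and one reserved bit followed by a 31-bit stream identifier. The sketch below packs and parses that header; the helper names are made up for illustration.

```python
import struct

FRAME_TYPES = {
    0x0: "DATA", 0x1: "HEADERS", 0x2: "PRIORITY", 0x3: "RST_STREAM",
    0x4: "SETTINGS", 0x5: "PUSH_PROMISE", 0x6: "PING", 0x7: "GOAWAY",
    0x8: "WINDOW_UPDATE", 0x9: "CONTINUATION",
}


def pack_frame_header(length, frame_type, flags, stream_id):
    """Pack the 9-byte HTTP/2 frame header."""
    length_bytes = struct.pack("!I", length)[1:]  # keep the low 24 bits
    return length_bytes + struct.pack("!BBI", frame_type, flags,
                                      stream_id & 0x7FFFFFFF)


def parse_frame_header(header):
    """Parse the 9-byte HTTP/2 frame header into a dict."""
    if len(header) < 9:
        raise ValueError("an HTTP/2 frame header is 9 bytes long")
    frame_type = header[3]
    return {
        "length": int.from_bytes(header[0:3], "big"),
        "type": FRAME_TYPES.get(frame_type, f"UNKNOWN(0x{frame_type:x})"),
        "flags": header[4],
        # The top bit is reserved, so mask it off to get the stream identifier.
        "stream_id": int.from_bytes(header[5:9], "big") & 0x7FFFFFFF,
    }


# A HEADERS frame with the END_HEADERS flag (0x4) on stream 1, 16-byte payload.
raw_header = pack_frame_header(16, 0x1, 0x4, 1)
print(raw_header.hex(), parse_frame_header(raw_header))
```

Because every frame carries its stream identifier, frames belonging to different requests can be interleaved on one connection, which is what makes multiplexing possible.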

QUIC

QUIC is a new transport that reduces latency compared to TCP. On the surface, QUIC is very similar to TCP+TLS+HTTP/2 implemented over UDP. Since TCP is implemented in operating system kernels and middlebox firmware, making major changes to TCP is nearly impossible. QUIC, however, is built on top of UDP, so it has no such restrictions.

Compared with the HTTP, HTTPS, and HTTP2 protocols introduced above, the biggest difference of QUIC is that its transport layer uses the UDP protocol instead of the TCP protocol. It has the following features:

    Reliable transmission similar to TCP
    Encrypted transmission similar to TLS, with support for perfect forward secrecy
    User-space congestion control, with the latest BBR algorithm
    Support for h2-style stream-based multiplexing, without the HOL (head-of-line blocking) problem of TCP
    Forward error correction (FEC)
    Connection migration similar to MPTCP
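The head-of-line blocking point in the list above can be illustrated with a toy model. Suppose four packets carry data for two streams, A and B, and the first packet is lost. With a single ordered byte stream (the TCP case), nothing after the gap can be delivered; with independent per-stream ordering (the QUIC case), only stream A has to wait. This is a deliberately simplified simulation, not real protocol code, and the function names are made up.

```python
def deliverable_single_stream(packets, arrived):
    """One ordered stream: delivery stops at the first missing packet."""
    delivered = []
    for index in range(len(packets)):
        if index not in arrived:
            break
        delivered.append(index)
    return delivered


def deliverable_per_stream(packets, arrived):
    """Independent streams: a missing packet only blocks its own stream."""
    blocked_streams = set()
    delivered = []
    for index, stream in enumerate(packets):
        if stream in blocked_streams:
            continue
        if index not in arrived:
            blocked_streams.add(stream)
            continue
        delivered.append(index)
    return delivered


packets = ["A", "B", "A", "B"]   # which stream each packet (by index) belongs to
arrived = {1, 2, 3}              # packet 0 (stream A) was lost in transit
tcp_like = deliverable_single_stream(packets, arrived)
quic_like = deliverable_per_stream(packets, arrived)
print(tcp_like, quic_like)
```

In the single-stream case the application receives nothing until packet 0 is retransmitted, while in the per-stream case both packets of stream B are delivered immediately.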

In a real environment, how can we tell which accesses use HTTP2 and which use the QUIC protocol?

Here is a Chrome plug-in: HTTP/2 and SPDY indicator. Once the plug-in has been downloaded and installed, an additional ⚡️ icon appears on the right side of the browser address bar:
[Click to view the original size image]

When we visit Baidu, the ⚡️ icon is white. When we visit YouTube:
[Click to view original size image]

You will find that the icon turns blue, and hovering the mouse over it shows that HTTP2 is enabled, which means YouTube is using the HTTP2 protocol. Enter chrome://net-internals/#http2 in the Chrome browser to see which websites use HTTP2 and QUIC:
[Click to view the original size image]

Reprinted from: http://www.iteye.com/news/32765
