A ten-thousand-character guide to understanding TCP, IP, HTTP, and HTTPS in one article

TCP/IP concept

TCP/IP (Transmission Control Protocol/Internet Protocol) refers to a protocol suite that enables information transfer across multiple different networks. The name does not refer only to the TCP and IP protocols; it denotes a suite made up of FTP, SMTP, TCP, UDP, IP, and other protocols. It is the most basic protocol suite of the Internet, and its foundation consists of the network-layer IP protocol and the transport-layer TCP protocol. TCP/IP defines how electronic devices connect to the Internet and how data is transferred between them.

My understanding: devices on the Internet must communicate in the same way. Which party initiates the communication, what language is used, and how the communication ends must all be agreed in advance. Communication between different devices requires a set of rules, and we call such rules a protocol.

TCP/IP hierarchical management diagram


The most important feature of the TCP/IP protocol is layering. From top to bottom, the layers are the application layer, transport layer, network layer, data link layer, and physical layer. Other models divide the stack into 4 or 7 layers instead.

Why use layers? From a design perspective, layering makes the system flexible. When a layer needs to be modified, only that layer has to be replaced, so layers are pluggable and the others stay unchanged. For users, the complex transmission process at the lower layers is hidden.
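To make the layering idea concrete, here is a minimal sketch of encapsulation, in which each lower layer wraps the data with its own header and the receiver strips them in reverse order. The layer names and the text "headers" are assumptions for illustration only; real protocols use binary headers.

```python
# Hypothetical layer names and text "headers", for illustration only.
LAYERS = ["transport", "network", "link"]

def encapsulate(payload: bytes) -> bytes:
    """Wrap application data with one header per lower layer, top to bottom."""
    for layer in LAYERS:
        payload = f"[{layer}]".encode() + payload
    return payload

def decapsulate(frame: bytes) -> bytes:
    """Strip the headers in reverse order, as the receiving stack would."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame

frame = encapsulate(b"GET / HTTP/1.1")
print(frame)               # outermost header is the link layer
print(decapsulate(frame))  # original application data
```

Because each function only touches its own header, swapping one layer's header format would not affect the others, which is the flexibility described above.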

Application layer

The TCP/IP model merges the functions of the session layer and the presentation layer of the OSI reference model into the application layer. Typical protocols at this layer are DNS (domain name resolution) and HTTP.

Transport layer

In the TCP/IP model, the transport layer lets peer entities on the source host and the target host hold a conversation. It defines two protocols with different quality of service: the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP).

Network layer

The network layer is the core of the entire TCP/IP protocol stack. Its job is to deliver packets to the target network or host. To deliver packets as quickly as possible, it may send them along different paths at the same time, so packets can arrive in a different order from the one in which they were sent, and the upper layer must put them back in order. The network layer defines the packet format and the protocol used at this layer, namely IP (Internet Protocol).
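As a rough sketch of what "the upper layer must sort the packets" means, the function below reorders packets by a sequence number before joining their data. The `(sequence_number, data)` tuple layout is an assumption made for this example, not a real packet format.

```python
def reorder(packets):
    """Sort (sequence_number, data) pairs and join the data in sending order."""
    return b"".join(data for _, data in sorted(packets))

# Packets that took different paths and arrived out of order (hypothetical data).
arrived = [(2, b"lo"), (0, b"he"), (1, b"l")]
print(reorder(arrived))
```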

Physical layer

This layer is responsible for transmitting bit streams between nodes, that is, physical transmission. The protocols at this layer depend on both the link and the transmission medium. Put simply, it is the physical means of connecting computers.

Data link layer

This layer controls communication between the network layer and the physical layer. Its main function is to ensure reliable data transmission on the physical line. To do so, it divides the data received from the network layer into frames that the physical layer can transmit. A frame is a structured packet used to move data. It contains not only the original data but also the physical addresses of the sender and receiver, as well as error-checking and control information. The addresses determine where the frame is sent, while the error-checking and control information ensures that the frame arrives without errors. If the receiver detects an error in the transmitted data, it notifies the sender to resend the frame.
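The frame structure described here can be sketched as follows. This is a simplified layout under stated assumptions (6-byte addresses, a CRC-32 appended in network byte order, made-up addresses); it is not the exact format of Ethernet or any other real link protocol.

```python
import struct
import zlib

def build_frame(dst: bytes, src: bytes, payload: bytes) -> bytes:
    """Prepend destination/source addresses and append a CRC-32 of the rest."""
    body = dst + src + payload
    return body + struct.pack("!I", zlib.crc32(body))

def check_frame(frame: bytes) -> bool:
    """Return True when the trailing CRC-32 matches the frame contents."""
    body, received = frame[:-4], struct.unpack("!I", frame[-4:])[0]
    return zlib.crc32(body) == received

# Hypothetical physical addresses.
dst = bytes.fromhex("aabbccddeeff")
src = bytes.fromhex("112233445566")
frame = build_frame(dst, src, b"network-layer data")

# Simulate a transmission error by flipping one bit of the payload.
corrupted = bytearray(frame)
corrupted[12] ^= 0x01

print(check_frame(frame))             # intact frame passes the check
print(check_frame(bytes(corrupted)))  # damaged frame fails and would be resent
```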

Features of UDP and TCP:

  • User Datagram Protocol (UDP): connectionless; best-effort delivery; message-oriented; no congestion control; supports one-to-one, one-to-many, many-to-one, and many-to-many communication; small header overhead (only four fields: source port, destination port, length, checksum). Message-oriented means that UDP sends each message exactly as the application layer hands it over, however long it is, one message at a time. The application must therefore choose a suitable message size.
  • Transmission Control Protocol (TCP): connection-oriented; each TCP connection is point-to-point (one-to-one); provides reliable delivery; provides full-duplex communication; byte-stream-oriented. The application hands data to TCP one block at a time (of varying sizes), but TCP treats the application's data as an unstructured stream of bytes. TCP has a buffer: when a block handed over by the application is too long, TCP can split it into shorter pieces before transmitting it.
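The way TCP splits a long block into shorter pieces can be sketched like this. The function name and the `mss` (maximum segment size) parameter are assumptions for this example; a real TCP stack also handles buffering, windows, and retransmission.

```python
def segment(data: bytes, first_seq: int, mss: int):
    """Split a byte stream into (sequence_number, chunk) pairs of at most mss bytes."""
    return [(first_seq + i, data[i:i + mss]) for i in range(0, len(data), mss)]

# A 250-byte block handed over by the application, sent in pieces of up to 100 bytes.
pieces = segment(b"x" * 250, first_seq=301, mss=100)
for seq, chunk in pieces:
    print(seq, len(chunk))
```

UDP, by contrast, would send the 250 bytes as a single datagram, which is why the application has to pick the message size itself.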

UDP header format:


A user datagram has two parts: the data field and the header field. The header field is very simple: only 8 bytes, made up of four fields of two bytes each. The meaning of each field is as follows:

  1. Source port: the source port number, used when the other party needs to reply. When it is not needed, it can be set to all zeros.
  2. Destination port: the destination port number, which must be used when the message is delivered at the destination.
  3. Length: the length of the UDP user datagram; the minimum is 8 (header only).
  4. Checksum: used to check whether the user datagram was corrupted in transit; the datagram is discarded if there is an error.
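The four header fields above map directly onto four 16-bit integers, which can be sketched with Python's `struct` module. The helper names and port numbers are assumptions for illustration, and the checksum is left at 0 as a placeholder rather than computed.

```python
import struct

# Four 16-bit big-endian fields: source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")

def build_datagram(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Build a UDP datagram; the length field covers header plus data."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

def parse_datagram(datagram: bytes) -> dict:
    """Decode the 8-byte header and return it together with the data field."""
    src, dst, length, checksum = UDP_HEADER.unpack_from(datagram)
    return {
        "src_port": src, "dst_port": dst, "length": length,
        "checksum": checksum, "payload": datagram[UDP_HEADER.size:length],
    }

datagram = build_datagram(50000, 53, b"hi")  # hypothetical ports
print(UDP_HEADER.size)
print(parse_datagram(datagram))
```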

TCP segment header format:


Source port and destination port: two bytes each; they hold the source port number and the destination port number respectively.
Sequence number: 4 bytes; used to number the byte stream. For example, a sequence number of 301 means that the first data byte in the segment is numbered 301. If the segment carries 100 bytes of data, the sequence number of the next segment should be 401.
Acknowledgment number: 4 bytes; the sequence number of the next segment that the receiver expects. For example, B correctly receives a segment from A with sequence number 501 carrying 200 bytes of data. B therefore expects the next segment to have sequence number 701, and the acknowledgment number in the segment B sends to A is 701.
Data offset: 4 bits; the offset of the data part from the start of the segment, which is in effect the length of the header, counted in 4-byte units.
Acknowledgment ACK: when ACK=1, the acknowledgment number field is valid; otherwise it is invalid. TCP requires ACK to be set to 1 in every segment sent after the connection is established.
Synchronize SYN: used to synchronize sequence numbers when a connection is established. SYN=1 with ACK=0 indicates a connection request segment. If the other party agrees to establish the connection, its response has SYN=1 and ACK=1.
Finish FIN: used to release a connection. FIN=1 means that the sender of this segment has finished sending data and requests that the connection be released.
Window: 2 bytes; the window value is what the receiver gives the sender as the basis for setting its sending window. The limit exists because the receiver's buffer space is limited.
Checksum: 2 bytes; the checksum covers both the header and the data. When the checksum is calculated, a 12-byte pseudo header is added in front of the TCP segment.
Socket: the endpoint of a TCP connection is called a socket. A socket is formed by appending the port number to the IP address.
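The fields listed above can be decoded from the fixed 20-byte part of a TCP header. The following is a minimal sketch: the helper names and the sample segment are assumptions for illustration, options are ignored, and the checksum is left as a 0 placeholder instead of being computed over the pseudo header.

```python
import struct

# Fixed 20-byte TCP header: ports, sequence number, acknowledgment number,
# data offset + flags, window, checksum, urgent pointer.
TCP_FIXED = struct.Struct("!HHIIHHHH")
FLAG_FIN, FLAG_SYN, FLAG_ACK = 0x01, 0x02, 0x10

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed part of a TCP header into named fields."""
    (src, dst, seq, ack, offset_flags,
     window, checksum, _urgent) = TCP_FIXED.unpack_from(segment)
    header_len = (offset_flags >> 12) * 4  # data offset counts 4-byte words
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len,
        "SYN": bool(offset_flags & FLAG_SYN),
        "ACK": bool(offset_flags & FLAG_ACK),
        "FIN": bool(offset_flags & FLAG_FIN),
        "window": window, "checksum": checksum,
        "payload": segment[header_len:],
    }

def expected_ack(seq: int, payload_len: int) -> int:
    """Acknowledgment number the receiver sends back: the next byte it expects."""
    return (seq + payload_len) % 2**32

# Hypothetical segment: ports 50000 -> 80, seq=501, ACK flag set, 200 data bytes.
raw = TCP_FIXED.pack(50000, 80, 501, 0, (5 << 12) | FLAG_ACK, 65535, 0, 0) + b"x" * 200
fields = parse_tcp_header(raw)
print(fields["header_len"], fields["seq"], len(fields["payload"]))
print(expected_ack(fields["seq"], len(fields["payload"])))
```

The last line reproduces the acknowledgment example from the field list: sequence number 501 plus 200 data bytes gives 701.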

Grueling interview questions

TCP's three-way handshake and four-way wave:

  1. First handshake: the client sets the SYN flag to 1, randomly generates a value seq=J, and sends the packet to the server. The client enters the SYN_SENT state and waits for the server to confirm.
  2. Second handshake: when the server receives the packet, the flag SYN=1 tells it that the client is requesting a connection. The server sets both the SYN and ACK flags to 1, sets ack=J+1, randomly generates a value seq=K, and sends the packet to the client to confirm the connection request. The server enters the SYN_RCVD state.
  3. Third handshake: when the client receives the confirmation, it checks that ack is J+1 and that ACK is 1. If so, it sets the ACK flag to 1, sets ack=K+1, and sends the packet to the server. The server checks that ack is K+1 and that ACK is 1. If so, the connection is established: the client and server enter the ESTABLISHED state, the three-way handshake is complete, and the two sides can start transmitting data.
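The three steps above can be traced with a small simulation. This is only a sketch of the bookkeeping, with dictionaries standing in for segments and an injectable random source as an assumption of the example; no real sockets or packets are involved.

```python
import random

MOD = 2**32  # sequence numbers are 32-bit and wrap around

def three_way_handshake(rng=random):
    """Simulate the handshake and return final states plus the three segments."""
    client, server = "CLOSED", "LISTEN"

    # 1. Client -> server: SYN=1, seq=J.
    j = rng.randrange(MOD)
    syn = {"SYN": 1, "ACK": 0, "seq": j}
    client = "SYN_SENT"

    # 2. Server -> client: SYN=1, ACK=1, ack=J+1, seq=K.
    k = rng.randrange(MOD)
    syn_ack = {"SYN": 1, "ACK": 1, "seq": k, "ack": (syn["seq"] + 1) % MOD}
    server = "SYN_RCVD"

    # 3. Client checks ack == J+1 and ACK == 1, then replies with ack=K+1.
    if syn_ack["ACK"] != 1 or syn_ack["ack"] != (j + 1) % MOD:
        raise ConnectionError("unexpected SYN+ACK")
    final_ack = {"SYN": 0, "ACK": 1, "seq": (j + 1) % MOD, "ack": (syn_ack["seq"] + 1) % MOD}
    client = "ESTABLISHED"

    # Server checks ack == K+1 and ACK == 1 before opening the connection.
    if final_ack["ACK"] != 1 or final_ack["ack"] != (k + 1) % MOD:
        raise ConnectionError("unexpected ACK")
    server = "ESTABLISHED"

    return client, server, [syn, syn_ack, final_ack]

client_state, server_state, segments = three_way_handshake()
print(client_state, server_state)
for seg in segments:
    print(seg)
```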


Why do we need a three-way handshake? The third handshake prevents a stale connection request from reaching the server and making it open a connection by mistake. If a connection request sent by the client is delayed in the network, the client waits a long time for the server's confirmation. After a retransmission timeout, the client requests a connection again. But the delayed request eventually reaches the server as well. Without the third handshake, the server would open two connections. With the third handshake, the client ignores the server's confirmation of the delayed request and does not perform the third handshake for it, so no second connection is opened.

What if there were only two handshakes? Take a phone call as an example. First handshake: A calls B and says, can you hear me? Second handshake: B receives A's message and says to A: I can hear you, can you hear me? Third handshake: A receives B's message and says: yes, now I will send you a message! Conclusion: after three handshakes, A and B can both be sure of one thing: I can hear you, and you can hear me. Normal communication can then start. With only two handshakes, B could not be sure that A can hear B.

When data transmission is complete, closing the connection requires TCP's four-way wave:

  1. First wave: the client sets seq and ack and sends a FIN (finish) segment to the server. The client enters the FIN_WAIT_1 state, indicating that it has no more data to send to the server.
  2. Second wave: the server receives the client's FIN segment and returns an ACK segment to the client.
  3. Third wave: the server sends a FIN segment to the client, requesting to close the connection, and enters the LAST_ACK state.
  4. Fourth wave: after the client receives the server's FIN segment, it sends an ACK segment to the server and enters the TIME_WAIT state. When the server receives the client's ACK segment, it closes the connection. If the client then waits for 2MSL (MSL is the maximum lifetime of a segment in the network) without receiving anything further, the server has closed normally, and the client can close the connection too.
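The state changes during the four waves can be laid out as a small table-driven trace. The sketch records the state of each side after a segment has been delivered; the FIN_WAIT_2 and CLOSE_WAIT states come from the standard TCP state machine, and the table layout and function name are assumptions for this example.

```python
# (sender, segment, client state after delivery, server state after delivery)
WAVE_STEPS = [
    ("client", "FIN", "FIN_WAIT_1", "CLOSE_WAIT"),
    ("server", "ACK", "FIN_WAIT_2", "CLOSE_WAIT"),
    ("server", "FIN", "TIME_WAIT", "LAST_ACK"),
    ("client", "ACK", "TIME_WAIT", "CLOSED"),
]

def four_way_wave():
    """Replay the four waves and return the final states plus a trace."""
    client = server = "ESTABLISHED"
    trace = []
    for sender, segment, client_after, server_after in WAVE_STEPS:
        client, server = client_after, server_after
        trace.append((sender, segment, client, server))
    # The client stays in TIME_WAIT for 2MSL, then closes as well.
    client = "CLOSED"
    return client, server, trace

final_client, final_server, trace = four_way_wave()
for step in trace:
    print(step)
print(final_client, final_server)
```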

The final complete process diagram


Why wave four times?

After the client sends a FIN connection-release segment, the server receives it and enters the CLOSE_WAIT state. This state gives the server time to send any data it has not yet transmitted. Only after that transmission completes does the server send its own FIN connection-release segment. The server's ACK and FIN are therefore sent separately, which makes four waves in total.

HTTP persistent connection

If there are many connections and each one is opened and closed every time, each goes through a three-way handshake and a four-way wave, which clearly hurts performance. HTTP therefore has a mechanism called keep-alive connections. It keeps the connection open after data has been transmitted, so when the client needs more data, it reuses the idle connection without another handshake.
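A back-of-the-envelope sketch shows how much connection overhead keep-alive saves. It only counts the handshake and wave segments described earlier and ignores everything else (data segments, timeouts, idle limits); the function name is an assumption for this example.

```python
HANDSHAKE_SEGMENTS = 3  # three-way handshake
WAVE_SEGMENTS = 4       # four-way wave

def connection_overhead(requests: int, keep_alive: bool) -> int:
    """Count the segments spent only on opening and closing connections."""
    connections = min(requests, 1) if keep_alive else requests
    return connections * (HANDSHAKE_SEGMENTS + WAVE_SEGMENTS)

print(connection_overhead(10, keep_alive=False))  # one connection per request
print(connection_overhead(10, keep_alive=True))   # one connection reused
```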

HTTP and HTTPS

What is HTTP?

The Hypertext Transfer Protocol (HTTP) is a stateless application-layer protocol based on requests and responses. It usually transmits data over TCP/IP. It is the most widely used network protocol on the Internet, and all WWW documents must comply with this standard. HTTP was originally designed to provide a way to publish and receive HTML pages.

HTTP features:

  1. Stateless: the protocol stores no client state and has no "memory" of earlier transactions. For example, without an additional mechanism, visiting a website would require logging in again and again.
  2. Connectionless: before HTTP/1.1, each request required a TCP three-way handshake and four-way wave to establish a new connection with the server. In addition, because of statelessness, if a client requests the same resource several times in a short period, the server cannot tell whether it has already answered that request, so it must respond again each time, which costs unnecessary time and traffic.
  3. Based on request and response: the basic pattern is that the client initiates a request and the server responds.
  4. Simple, fast, and flexible.
  5. Communication uses plain text, and requests and responses neither verify the communicating party nor protect the integrity of the data.

HTTP message composition:

  1. Request line: including request method, URL, protocol/version
  2. Request header
  3. Request body
  4. Status line
  5. Response header
  6. Response body
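The parts listed above can be seen by splitting a raw message by hand. The sketch below parses a request into request line, headers, and body, and a response status line into its three parts. The function names and the sample message are assumptions for illustration; a production parser would handle many more cases.

```python
def parse_request(raw: bytes) -> dict:
    """Split a raw HTTP request into request line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, url, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "url": url, "version": version,
            "headers": headers, "body": body}

def parse_status_line(line: str) -> tuple:
    """Split a response status line into version, status code, and reason."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason

# Hypothetical request for illustration.
raw_request = (b"POST /login HTTP/1.1\r\n"
               b"Host: example.com\r\n"
               b"Content-Length: 9\r\n"
               b"\r\n"
               b"user=demo")
request = parse_request(raw_request)
print(request["method"], request["url"], request["version"])
print(request["headers"])
print(request["body"])
print(parse_status_line("HTTP/1.1 200 OK"))
```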


Disadvantages of HTTP:

  1. The communication uses plain text (not encrypted), and the content may be eavesdropped.
  2. The identity of the communicating party is not verified, so it is possible to encounter masquerading.
  3. The integrity of the message cannot be proven, so it may have been tampered with.

What is HTTPS?

HTTPS is an HTTP channel designed for security. Put simply, it is the secure version of HTTP: an SSL layer is added beneath HTTP. The security of HTTPS rests on SSL, so the encryption details are handled by SSL. The HTTPS protocol serves two main purposes: it establishes a secure channel to protect data in transit, and it confirms the authenticity of the website.


HTTPS is not a new application-layer protocol. It is HTTP with its communication interface replaced by the SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. Normally, HTTP communicates directly with TCP. With SSL, HTTP communicates with SSL first, and SSL then communicates with TCP. In short, HTTPS is HTTP wrapped in a shell of the SSL protocol.

HTTPS communication method:

  1. The client uses an https URL to access the web server and requests an SSL connection with it.
  2. After the web server receives the client's request, it will send a copy of the website's certificate information (the certificate contains the public key) to the client.
  3. The client's browser and the Web server begin to negotiate the security level of the SSL connection, that is, the level of information encryption.
  4. The client's browser establishes a session key according to the security level agreed by both parties, and then uses the website's public key to encrypt the session key and transmit it to the website.
  5. The web server uses its private key to decrypt the session key.
  6. The web server uses the session key to encrypt the communication with the client.


Why HTTPS is safe

  1. SSL provides encryption, and it does so using hybrid encryption.
  2. SSL also uses certificates, which can be used to identify the communicating party. A certificate is issued by a trusted third-party organization to prove that the server or client really exists. In addition, forging certificates is technically extremely difficult. So as long as the certificate held by the communicating party (server or client) can be verified, that party's identity can be confirmed.


Encryption method

Symmetric encryption: using the same key for encryption and decryption is called shared key encryption (common key cryptosystem), also known as symmetric key encryption.

Symmetric encryption is efficient and fast, but it has a security risk. The key must be transmitted to the other party so that they can decrypt. If an attacker obtains the key while it is in transit, the encryption loses its meaning, so relying entirely on symmetric encryption is also insecure.

Asymmetric encryption: public key encryption uses a pair of asymmetric keys, one called the private key and the other the public key. As the names imply, the private key must not be known to anyone else, while the public key can be published freely and obtained by anyone. Encryption uses the public key and decryption uses the private key: the party sending the ciphertext encrypts with the other party's public key, and the other party decrypts the received message with its own private key.

So is asymmetric encryption necessarily safe? It is not, because a man in the middle can forge a key pair. If someone intercepts the public key while it is being passed to the other party, they cannot do anything harmful with the public key itself, but they can replace it with a forged public key and send that on, so the other party never obtains the real public key. When the other party then encrypts a file with the forged public key and sends it, the interceptor does not need the real private key: because the key used for encryption was their own, they only have to decrypt the intercepted file with their own private key to obtain its contents.

Hybrid encryption, as the name suggests, combines symmetric and asymmetric encryption: asymmetric encryption protects the exchange of the shared key, and symmetric encryption with that key protects the rest of the communication.
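The idea can be sketched with toy primitives: the client encrypts a session key with the server's public key, the server recovers it with its private key, and both sides then use the session key for symmetric encryption. Everything here is an assumption for illustration and is trivially breakable: textbook RSA with tiny primes and a hash-based XOR stream stand in for the real algorithms that SSL/TLS negotiates.

```python
import hashlib
import secrets

# Toy RSA key pair built from tiny primes (61 and 53). Never use in practice.
N = 61 * 53   # modulus, 3233
E = 17        # public exponent
D = 2753      # private exponent: (E * D) % ((61 - 1) * (53 - 1)) == 1

def xor_stream(key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHA-256-derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(f"{key}:{counter}".encode()).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Client: pick a session key and encrypt it with the server's public key (N, E).
session_key = secrets.randbelow(N - 2) + 2
encrypted_key = pow(session_key, E, N)

# Server: recover the session key with its private key (N, D).
recovered_key = pow(encrypted_key, D, N)

# Both sides now share the key and switch to symmetric encryption.
ciphertext = xor_stream(recovered_key, b"hello over https")
plaintext = xor_stream(session_key, ciphertext)
print(recovered_key == session_key)
print(plaintext)
```

Only the short session key goes through the slow public-key step; the bulk data uses the fast symmetric cipher.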


The remaining problem is how to prove that a public key is itself authentic, because the real public key may have been replaced by an attacker while it was being transmitted.

To solve this problem, certificates issued by a certification authority (CA) were introduced. The server sends its CA-issued certificate to the client before public key encryption is used. The client that receives the certificate can use the public key of the digital certificate certification authority to verify the digital signature on that certificate. Once verification passes, the client can be sure of two things:

  • One: the authority that certified the server's public key is a real and valid digital certificate certification authority;
  • Two: the server's public key is trustworthy.

How to hand the certification authority's public key over to the client safely is therefore very important. For this reason, most browser vendors embed the public keys of commonly used certification authorities in advance when they release a version. Because those public keys never have to be transmitted over the network, they cannot be forged in transit, which ensures security.
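Non-browser clients follow the same pattern of relying on a pre-installed set of trusted CA certificates. As a small sketch, Python's standard `ssl` module builds a client context that requires certificate verification and uses the system's trusted CA certificates. No connection is made here; the snippet only inspects the default settings.

```python
import ssl

# Default client-side context: trusts the system's pre-installed CA certificates.
context = ssl.create_default_context()

# The server must present a certificate that chains to a trusted CA...
print(context.verify_mode == ssl.CERT_REQUIRED)
# ...and the certificate must match the host name being contacted.
print(context.check_hostname)
```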

Author: Non-Coban Coban

Source: WeChat Official Account

Origin blog.csdn.net/qq_45401061/article/details/108609644