Detailed explanation of the main points of HTTP and HTTPS

OSI (Open Systems Interconnection) model

OSI reference model: application layer, presentation layer, session layer, transport layer, network layer, data link layer, physical layer

  • Application layer: The application layer provides an interface for applications to access network services, and directly provides various network services for users. Common application layer network service protocols are: HTTP, HTTPS, FTP, POP3, SMTP, etc.
  • Presentation layer: Provides various encoding and conversion functions for application layer data to ensure that the data sent by the application layer of one system can be recognized by the application layer of another system
  • Session layer: responsible for establishing, managing and terminating communication sessions between presentation layer entities.
  • Transport layer: Provides end-to-end reliable or unreliable data transmission services for upper-layer protocols. The transmission unit is the data segment segment; the protocols are: TCP, UDP
  • Network layer: establishes a connection between two nodes through IP addressing (solves network routing and addressing issues). Protocols are: IP, ARP, ICMP, and the transmission unit is packet
  • Data link layer: accepts the bit stream from the physical layer and encapsulates it into frames for delivery to the upper layer; conversely, it breaks frames from the upper layer back into a bit stream and hands it to the physical layer. It also processes the acknowledgment frames sent back by the receiver in order to provide reliable data transmission. (The transmission unit is the frame.) MAC addressing belongs to this layer
  • Physical layer: Use the transmission medium to provide physical connections for the data link layer, realize the transparent transmission of bit streams between adjacent computer nodes, and shield the differences between the specific transmission medium and physical equipment as much as possible. Network cards and hubs are at the physical layer (data transmission unit bit)

Protocol stack

Protocol Stack refers to the sum of the protocols of all layers in the network

router

A router is a device that connects local area networks and wide area networks on the Internet. It automatically selects and sets routes according to channel conditions and sends packets along the best path, in order.
A router has two kinds of ports, WAN and LAN. The WAN port is used for dialing up; it is what lets the router access the Internet. The LAN ports exchange data within the local area network, like a switch; our computers plug into the LAN ports, and only through them can they reach the Internet.
A router also has a built-in DHCP server, which automatically assigns IP addresses to the computers using it

The router operates at the network layer

switch

It is a network device used for forwarding electrical (or optical) signals. Its function can be understood simply as connecting machines together to form a local area network; it provides an exclusive signal path between any two network nodes connected to the switch.

A switch has only LAN ports, no WAN port

Plug the incoming network cable into any port of the switch, plug computer 1 into another port, and plug computer 2 into yet another; both computers must then dial up separately to access the Internet. The switch itself only exchanges data

The switch is at the data link layer (there are also multi-layer switches: data link layer + part of the network layer)

modem ("cat")

It converts the optical-fiber signal into a network-cable signal that can be plugged directly into a computer. (In Chinese, a modem is nicknamed 猫, "cat", hence the heading.)

TCP/IP model

Application layer (the session layer and presentation layer of the OSI model are merged into the application layer)

transport layer

The internetwork layer (corresponding to the network layer of the OSI reference model), often also called the IP layer

Network access layer, or host-to-network layer (corresponding to the physical layer and data link layer of the OSI reference model)

  • The IP layer includes the Internet Control Message Protocol (ICMP: Internet Control Message Protocol) and the Address Resolution Protocol ARP. In fact, they are not part of the IP layer, but work directly with the IP layer. ICMP is used to transmit packet control data such as error information, time, echo, and network information. ARP is between the IP and data link layers, it is a protocol that performs translation between 32-bit IP addresses and 48-bit LAN addresses

  • Ethernet data format:

    Ethernet uses 48 bits (6 bytes) to represent the source address and destination address. These are hardware addresses, such as the MAC address of a network card.

    After the addresses comes a two-byte type field. For example, 0x0800 indicates that the frame carries IP data, and 0x0806 indicates that the frame is an ARP request.

    The data follows the type field. For Ethernet, the data field must be 46 to 1500 bytes; shorter data is padded with null bytes. For example, an ARP packet is 28 bytes, so 18 bytes of padding are added to meet the 46-byte minimum.

    The data field also has a maximum length, which for Ethernet is 1500 bytes. This limit is the MTU (maximum transmission unit). If the data the IP layer wants to transmit is larger than the MTU, the IP layer must fragment it so that each fragment is smaller than the MTU.

    The CRC field is used to check the data in the frame to ensure the correctness of data transmission. It is usually implemented by hardware, such as the CRC check of network data in a network card device.

    The 14-byte Ethernet header causes efficiency problems on some platforms. For example, a platform requiring 4-byte alignment usually has to re-copy the IP data when extracting it, because the 14-byte header leaves the IP payload misaligned.
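The layout above (6-byte destination address, 6-byte source address, 2-byte type, then payload) can be sketched with Python's struct module; the frame bytes below are made up for illustration:

```python
import struct

def parse_ethernet_header(frame: bytes):
    """Parse the 14-byte Ethernet header: 6-byte destination MAC,
    6-byte source MAC, 2-byte EtherType (big-endian)."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    mac = lambda b: ":".join(f"{x:02x}" for x in b)
    return mac(dst), mac(src), ethertype

# A made-up frame: broadcast destination, EtherType 0x0806 (ARP), 46 bytes of padding
frame = bytes.fromhex("ffffffffffff" "001122334455" "0806") + b"\x00" * 46
dst, src, etype = parse_ethernet_header(frame)
print(dst, src, hex(etype))   # ff:ff:ff:ff:ff:ff 00:11:22:33:44:55 0x806
```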

ARP Address Resolution Protocol

ARP (Address Resolution Protocol) sits at the network layer of the TCP/IP protocol stack and is responsible for resolving an IP address into the corresponding MAC (physical) address.

In an Ethernet-based local area network, each network interface has a hardware address, which is a 48-bit value that identifies different Ethernet devices. In the local area network, you must know the hardware address of the network device to send data to the destination host. In the Internet, the destination address of data transmission is the IP address. If the data can be transmitted normally, the mapping record between the IP address and the hardware address must be established.

The ARP protocol maps 32-bit IP addresses to 48-bit hardware addresses

IP (Internet Protocol)

The main purpose of the IP layer is to provide the interconnection of subnets to form a larger network so that data can be transmitted between different subnets.

The main role of the IP layer:

  • Data transfer: transfer data from one host to another
  • Addressing: Find the correct destination host address based on subnetting and IP address
  • Routing: choosing the path for data to travel over the Internet
  • Segmentation of data packets: When the transmitted data is larger than the MTU, the data is sent and received in segments and assembled.

The IPv4 address is 32 bits, written as four groups of decimal numbers, each from 0 to 255, separated by dots.
An IP address is composed of an address type, a network ID, and a host ID.
The type identifies which class the IP address belongs to.
The network ID identifies the network on which the device or host is located.
The host ID identifies the workstation, server, or router on that network.

An address whose host ID is all 0s is the network address of that network.
An address whose host ID is all 1s is the broadcast address of that network.
The all-0s IP address represents the host itself; packets sent to it are received by the local machine.
The all-1s IP address is the limited broadcast address.
The IP address 127.0.0.1 is a special loopback address, generally used for local testing.
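The special addresses above can be checked with Python's stdlib ipaddress module (the 192.168.1.0/24 network is an arbitrary example):

```python
import ipaddress

# The special addresses described above, computed with the stdlib
net = ipaddress.ip_network("192.168.1.0/24")
network_addr = net.network_address      # host ID all 0s: the network address
broadcast = net.broadcast_address       # host ID all 1s: the broadcast address
loopback = ipaddress.ip_address("127.0.0.1")

print(network_addr)          # 192.168.1.0
print(broadcast)             # 192.168.1.255
print(loopback.is_loopback)  # True
```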

The IP protocol is used to connect multiple packet-switched networks. It transmits so-called datagrams between a source address and a destination address, and it also provides fragmentation and reassembly so that data can fit different networks' packet-size requirements.

The network layer IP provides an unreliable service. It just sends packets from the source node to the destination node as fast as possible, but does not provide any reliability guarantees.

Network Control Message Protocol (ICMP)

It is used to transmit message control data such as error information, time, echo, and network information. It is often used to detect whether the network is unreachable, whether the host is reachable, and whether the router is available.

The ICMP protocol can be divided into two categories, one is the query message and the other is the error message.

UDP protocol

UDP (User Datagram Protocol): an unreliable, connectionless protocol, suitable for scenarios where occasional data loss is tolerable, packets do not need ordering, and flow control is not required

The UDP protocol guarantees neither the order in which packets arrive nor that they arrive at all.
Compared with TCP, UDP is much faster, because the protocol is far simpler and puts less load on the system.

Application scenarios: streaming media transmission, domain name server, embedded set-top box system
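A minimal loopback sketch of UDP's connectionless style using Python's stdlib socket module; note that sendto() transmits with no prior handshake (the echo server and message are made up for illustration):

```python
import socket
import threading

def udp_echo_server(sock):
    data, addr = sock.recvfrom(1024)     # no connection: each datagram stands alone
    sock.sendto(data.upper(), addr)

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))            # port 0: let the OS pick a free port
port = server.getsockname()[1]
threading.Thread(target=udp_echo_server, args=(server,), daemon=True).start()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"hello", ("127.0.0.1", port))   # no handshake before sending
reply, _ = client.recvfrom(1024)
print(reply)                              # b'HELLO'
client.close()
server.close()
```

If the datagram or the reply were lost, the client would simply block: UDP itself offers no retransmission.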

TCP protocol

TCP (Transmission Control Protocol): provides a reliable, connection-oriented, flow-controlled transport layer on top of the unreliable IP layer. To provide this reliable service, TCP uses timeout retransmission, sliding windows, end-to-end acknowledgment packets, and other mechanisms to ensure that the receiver receives all the packets the sender sent, in the order they were sent.

Sliding window: the receiver controls the sender's transmission rate by advertising its own window size, preventing the receiver from being overwhelmed when the sender transmits too fast.

TCP features:

  • Connection-oriented service: Before data is transmitted, a connection needs to be established, and then TCP packets are transmitted on the basis of this connection
  • Reliable transmission service: checksums, acknowledgments, and retransmission guarantee the reliability of transmission
  • Buffered transmission: It can delay the transmission of data at the application layer, allowing the data to be transmitted by the application to be accumulated to a certain amount before centralized transmission
  • Full-duplex transmission: data stream exchange in full-duplex mode
  • Flow control transmission: Supports end-to-end flow control between hosts through a sliding window mechanism
  • Byte-stream service: data is transmitted as a stream of bytes with no message boundaries

three-way handshake

(figure: TCP three-way handshake)

  • In the initial state, both client A and server B are in the CLOSED state
  • Server B creates a TCB and is in the LISTEN state, waiting for client A's request
  • Client A creates a TCB, sends a connection request (SYN = 1, seq = x) message, and enters the SYN-SENT state
  • Server B receives the connection request and replies with an acknowledgment (SYN=1, ACK=1, acknowledgment number ack=x+1, initial sequence number seq=y) to client A, then enters the SYN-RCVD state
  • After client A receives the confirmation from server B, it sends confirmation to server B (ACK=1, ack=y+1, seq=x+1), and A enters the ESTABLISHED state
  • After server B receives the confirmation from client A, it enters the ESTABLISHED state
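The handshake steps above are carried out by the operating system inside the socket calls; in the Python stdlib sketch below, connect() sends the SYN and accept() returns once the handshake completes (the echo behavior is made up for illustration):

```python
import socket
import threading

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)                          # LISTEN state, waiting for a SYN
port = server.getsockname()[1]

def serve():
    conn, _ = server.accept()             # returns once the handshake completes
    conn.sendall(conn.recv(1024))         # echo the bytes back
    conn.close()

threading.Thread(target=serve, daemon=True).start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))       # SYN -> SYN/ACK -> ACK
client.sendall(b"ping")
echoed = client.recv(1024)
print(echoed)                             # b'ping' -- bytes arrive in order
client.close()
server.close()
```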
four-way wave

(figure: TCP four-way wave)

  • In the initial state, both client A and server B are in the ESTABLISHED state
  • Client A sends a connection release segment (FIN=1, sequence number seq=u) to the server, stops sending data, actively closes the TCP connection, enters the FIN-WAIT-1 (termination wait 1) state, and waits for server B's acknowledgment
  • After server B receives the connection release segment, it sends an acknowledgment segment (ACK=1, acknowledgment number ack=u+1, sequence number seq=v) and enters the CLOSE-WAIT (close wait) state. TCP is now half-closed: the connection from client A to server B has been released
  • After client A receives server B's acknowledgment, it enters the FIN-WAIT-2 (termination wait 2) state and waits for server B to send its own connection release segment
  • When server B has no more data to send to client A, it sends a connection release segment (FIN=1, ACK=1, sequence number seq=w, acknowledgment number ack=u+1) and enters the LAST-ACK (last acknowledgment) state, waiting for client A's acknowledgment
  • After client A receives the connection release segment from server B, it sends an acknowledgment segment (ACK=1, seq=u+1, ack=w+1) and enters the TIME-WAIT (time wait) state. TCP is not yet released; client A must wait 2MSL (the time set by the timer) before entering the CLOSED state

https://www.cnblogs.com/Andya/p/7272462.html

Why does the TIME_WAIT state need to go through 2MSL (maximum segment lifetime) to return to the CLOSE state?

Answer: Although in principle all four segments have been sent and we could enter the CLOSED state directly, we must assume the network is unreliable: the last ACK may be lost. The TIME_WAIT state exists so that a lost ACK can be retransmitted.
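One practical consequence of TIME_WAIT: a server restarted within 2MSL can fail to bind to its old address. The usual workaround is the SO_REUSEADDR socket option; a minimal stdlib sketch:

```python
import socket

# A socket closed on the server side lingers in TIME-WAIT for up to 2MSL,
# which can make an immediate rebind of the same address fail.
# SO_REUSEADDR asks the kernel to allow rebinding despite lingering TIME-WAIT sockets.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
reuse = sock.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
sock.bind(("127.0.0.1", 0))
print(reuse != 0)    # True: the option is set
sock.close()
```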

Server side is vulnerable to SYN attack

Server-side resources are allocated at the second handshake, while the client's resources are allocated only when the three-way handshake completes, so the server is vulnerable to SYN flooding: an attacker forges nonexistent source addresses and continuously sends SYN packets to the server. The server replies with acknowledgment packets and waits for the clients' confirmations; since the source addresses do not exist, the server keeps retransmitting until it times out. These forged SYN packets occupy the half-connection queue for a long time, so normal SYN requests are discarded because the queue is full, causing network congestion or even system paralysis.

Measures against SYN attacks: shorten the host's waiting time so that half-open connections are released as soon as possible, and discard subsequent requests from an IP that repeats SYNs within a short time.

TCB (Transmission Control Block): stores important information about each connection, such as the TCP connection table, pointers to the send and receive buffers, a pointer to the retransmission queue, and the current send and receive sequence numbers.

A segment with SYN=1 cannot carry data

What is HTTP

HTTP is the abbreviation of HyperText Transfer Protocol.
It is used to transfer hypertext from World Wide Web (WWW) servers to clients.
HTTP is a communication protocol that transfers data on top of TCP/IP. The default HTTP port number is 80.
HTTP is an application layer protocol consisting of requests and responses; it is a standard client-server model.
HTTP allows the transmission of any type of data object; the type being transmitted is marked by Content-Type.
HTTP is a stateless protocol. Stateless means the protocol has no memory of previous transactions.
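A self-contained stdlib sketch of the request/response model and the Content-Type header described above, using a throwaway local server (the handler, port choice, and body are made up for illustration):

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")   # marks the object type
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/")                  # each request is independent: stateless
resp = conn.getresponse()
status = resp.status
ctype = resp.getheader("Content-Type")
body = resp.read()
print(status, ctype, body)                # 200 text/plain b'hello'
conn.close()
server.shutdown()
```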

Risks faced by HTTP transport

Eavesdropping risk: hackers can read the contents of the communication.
Tampering risk: hackers can modify the contents of the communication.
Impersonation risk: hackers can impersonate either party and join the communication.

What is SSL/TLS

SSL is the abbreviation of Secure Sockets Layer.
TLS is the abbreviation of Transport Layer Security.
SSL/TLS sits between the reliable transport layer (TCP) and the various application layer protocols, providing security support for data communication.

What is HTTPS

The HTTPS protocol (HyperText Transfer Protocol over Secure Socket Layer) is still transmitted over TCP; the so-called "HTTP over SSL" simply adds a layer of SSL encapsulation to the original HTTP data, leaving mechanisms such as GET and POST essentially unchanged.
HTTPS is carried on top of the TLS or SSL protocol layer, and the default HTTPS port number is 443.


HTTPS Features

Confidentiality (anti-eavesdropping: all information is encrypted in transit, so attackers cannot eavesdrop), integrity (anti-tampering: a verification mechanism lets both parties detect any tampering immediately), authenticity (anti-impersonation: certificates prevent identities from being impersonated)

  1. Anti-traffic hijacking
    Site-wide HTTPS is the solution that radically eliminates traffic hijacking by carriers and middlemen. It not only prevents small advertisements from being injected into web pages but also protects user privacy.
  2. Improved search ranking
    Using HTTPS helps improve search ranking, site credibility, and brand image.
  3. Eliminating phishing websites
    The green padlock in the HTTPS address bar helps users identify phishing websites, protects the interests of users and businesses, and enhances user trust.

The relationship between HTTP and TCP

The HTTP protocol is usually carried on top of the TCP protocol; HTTP long and short connections are essentially TCP long and short connections. HTTP belongs to the application layer protocol, uses the TCP protocol at the transport layer, and uses the IP protocol at the network layer. The IP protocol mainly solves the problem of network routing and addressing, and the TCP protocol mainly solves how to reliably transmit data packets on the IP layer, so that the receiving end on the network receives all the packets sent by the sending end, and the order is consistent with the sending order. The TCP protocol is reliable and connection-oriented.

Network layer: IP protocol/ARP protocol
Transport layer: TCP/UDP protocol
Application layer: HTTP protocol

Socket is a middleware abstraction layer through which the application layer communicates with the TCP/IP protocol suite; it is a set of interfaces. In design-pattern terms, Socket is a facade: it hides the complex TCP/IP protocol family behind the Socket interface. For the user, a set of simple interfaces is all that is needed; the Socket organizes the data to conform to the specified protocol.

For an application on host A to communicate with an application on host B, a connection must be established through a Socket, and establishing a Socket connection requires the underlying TCP/IP protocols to establish a TCP connection. Establishing a TCP connection requires the underlying IP protocol to address the hosts on the network. The IP protocol at the network layer can find the target host by IP address, but a host may be running multiple applications; the specific application to communicate with is identified by the port number carried in TCP or UDP.


Long and short connections

HTTP uses TCP connections in two ways: the so-called "short connection" and "long connection" ("Keep-Alive" or "persistent connection").
With a long connection, the TCP connection used to transmit HTTP data is not closed; when the client accesses the server again, it continues to use the established connection. Keep-Alive does not hold the connection forever: it has a keep-alive time, which can be configured in server software (such as Apache). Persistent connections require support from both the client and the server.
An HTTP response that uses a long connection carries this header line: Connection: keep-alive
If an HTTP/1.1 request does not want a long connection, add Connection: close to the request headers.
TCP's own keep-alive function is mainly provided for server applications: it tries to detect half-open connections on the server side and decides whether to close them based on the response.

Short connection: Every time the client and the server perform an HTTP operation, a connection is established, and the connection is terminated when the task ends. Both parties can initiate the close operation at will, but generally the client initiates the close operation first.

The advantage of a short connection is that it is simpler to manage, all existing connections are useful connections, and no additional control means are required.

Operation steps for short connection

Establish a connection - transmit data - close the connection ... establish a connection - transmit data - close the connection

The operation steps of a long connection are:

Establish a connection - transmit data ... (keep the connection) ... transmit data - close the connection

Advantages and disadvantages of long and short connections

Long connections save repeated TCP setup and teardown, reducing waste and saving time; for clients that request resources frequently, long connections are more suitable. But there is a problem: the keepalive detection period is long, and it only checks whether the TCP connection is alive, which is a relatively gentle mechanism and not enough against malicious connections. In long-connection scenarios the client generally does not actively close the connection, so as the number of clients grows the server will sooner or later be overwhelmed. The server therefore needs strategies such as closing connections that have had no read or write events for a long time, to keep malicious connections from damaging the service; if conditions allow, it can also limit the maximum number of long connections per client machine, which completely prevents one troublesome client from affecting the backend service.

Short connections are simpler to manage for the server, and all existing connections are useful connections that do not require additional control means. But if client requests are frequent, time and bandwidth will be wasted on TCP setup and shutdown operations.

The generation of long connections and short connections lies in the closing strategy adopted by the client and server. Specific application scenarios adopt specific strategies. There is no perfect choice, only suitable choices.

Data transmission completion identification of long connection

Check whether the transmitted data has reached the size indicated by Content-Length.
Dynamically generated content has no Content-Length; it uses chunked transfer encoding, and a zero-length chunk marks the end of the transfer.
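A stdlib sketch of chunked transfer: the server below sends no Content-Length, just size-prefixed chunks ending with a zero-length chunk, and http.client reassembles the body (the chunk contents are made up for illustration):

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"        # chunked encoding requires HTTP/1.1

    def do_GET(self):
        self.send_response(200)
        self.send_header("Transfer-Encoding", "chunked")  # no Content-Length
        self.end_headers()
        # each chunk: hex size, CRLF, data, CRLF; a 0-size chunk ends the body
        self.wfile.write(b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n")

    def log_message(self, *args):
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/")
resp = conn.getresponse()               # http.client reassembles the chunks
clen = resp.getheader("Content-Length")
body = resp.read()
print(clen, body)                       # None b'hello world'
conn.close()
server.shutdown()
```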

Expiration time for long connections

  keepalive_timeout 20;    -- long-connection timeout (nginx directive)
  keepalive_requests 8192; -- maximum number of requests per connection

When to use long connection, short connection  

Long connections are mostly used for frequent point-to-point communication where the number of connections stays small. Each TCP connection requires a three-way handshake, which takes time; if every operation had to connect first, processing would be much slower. With a long connection, the connection is not dropped after each operation; the next operation simply sends its packets directly, with no new TCP connection needed. Database connections, for example, use long connections: frequent communication over short connections would cause socket errors, and frequent socket creation also wastes resources.

HTTP services such as web sites, on the other hand, generally use short connections, because a long connection consumes server resources, and for connections as frequent and numerous as a web site's (thousands to hundreds of millions of clients) short connections are more economical. If thousands of simultaneous users each occupied a long connection, you can imagine the result. So when concurrency is high but each user does not operate frequently, short connections are the better choice.

How to understand that the HTTP protocol is stateless

The HTTP protocol is stateless, which means that the protocol has no memory capability for transaction processing, and the server does not know what state the client is. That is, there is no connection between opening a web page on a server and the last time you opened a web page on this server.

CA certificate

CA (Certificate Authority) can be popularly understood as a certification authority. Its function is to issue certificates (a server certificate: public key + applicant and issuer information + signature) to strengthen the security of information exchanged between client and server, and to handle certificate issuance and related work. Most domestic Internet companies apply for CA certificates from international CA agencies, and users' information is encrypted when they visit, keeping it secure.
When the client parses the certificate, it first verifies whether the public key is valid (the issuing authority, expiration time, and so on). If an anomaly is found, a warning box pops up indicating a problem with the certificate. If the certificate is fine, the client generates a random value and encrypts it with the certificate's public key. As described above, this locks the random value with a lock: without the key, the locked content cannot be read.

1. The server obtains a certificate from the CA. When the browser requests the server for the first time, the server returns the certificate to the browser.
2. After the browser receives the certificate, it verifies the certificate's owner, validity period, and other information. The browser looks up the trusted certificate authorities built into the operating system and compares them against the issuer CA in the certificate sent by the server, to verify whether the certificate was issued by a legitimate authority. If no match is found, the browser reports an error, indicating that the server's certificate is not trustworthy.
3. If the certificate is valid, or the user accepts an untrusted certificate, the browser generates a random password, encrypts it with the public key in the certificate, and sends it to the server; the server decrypts it with its private key to obtain the random value. From then on, both parties use this random value as the key to encrypt and decrypt the data they transmit.

Encryption Algorithm

Symmetric encryption

The same key is used for both encryption and decryption.
For example: AES, RC4, 3DES, DES, AES-GCM, ChaCha20-Poly1305, etc.

Disadvantages: with a large number of different clients and servers, both parties must maintain a large number of keys, and the maintenance cost is very high.
Because clients and servers have different security levels, keys are easily leaked.

Asymmetric encryption

The key used for encryption and the key used for decryption are different. They are called public key and private key. The public key and algorithm are both public, and the private key is kept secret. The performance of asymmetric encryption algorithm is low, but the security is super strong. Due to its encryption characteristics, the length of data that can be encrypted by asymmetric encryption algorithm is also limited.
For example: RSA, DSA/DSS, ECDSA, DH, ECDHE

Disadvantages: the public key is public (hackers also have it), so if information encrypted with the private key is intercepted, it can be decrypted with the public key to reveal the content.
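The encrypt/decrypt asymmetry, and the weakness just described, can be seen with textbook-sized RSA numbers (a toy sketch only, never usable in practice):

```python
# Toy RSA with tiny textbook primes -- for illustration only
p, q = 61, 53
n = p * q                  # public modulus: 3233
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent: modular inverse of e (Python 3.8+)

m = 65                     # the "message" (must be < n)
c = pow(m, e, n)           # encrypt with the public key (e, n)
decrypted = pow(c, d, n)   # decrypt with the private key (d, n)
print(decrypted)           # 65

# The weakness described above: content "encrypted" with the private key
# can be read by anyone holding the public key
sig = pow(m, d, n)
recovered = pow(sig, e, n)
print(recovered)           # 65
```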

hash algorithm

Converts information of arbitrary length to a shorter fixed-length value, usually much smaller than the length of the information, and the algorithm is irreversible.
For example: MD5, SHA-1, SHA-2, SHA-256, etc.
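The fixed-length, one-way behavior can be seen with Python's stdlib hashlib (the message is chosen arbitrarily):

```python
import hashlib

# Arbitrary-length input -> fixed-length digest; the mapping is one-way
msg = b"hello https"
md5_digest = hashlib.md5(msg).hexdigest()
sha256_digest = hashlib.sha256(msg).hexdigest()
print(len(md5_digest))     # 32 hex chars (128 bits)
print(len(sha256_digest))  # 64 hex chars (256 bits)
# the digest length stays fixed no matter how long the input is
print(len(hashlib.sha256(b"x" * 100_000).hexdigest()))  # 64
```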

digital signature

A signature appends a piece of content (the hash of the message) to the message, which proves that the message has not been modified. The hash value is generally encrypted (that is what constitutes the signature) before being sent with the message, to ensure that the hash itself is not modified.

optimal encryption method

Combine symmetric and asymmetric encryption, taking the best of each and discarding the rest, giving full play to their respective advantages.
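A toy sketch of the hybrid idea: the toy RSA numbers from the asymmetric section wrap a random session key, and a XOR stream stands in for a real symmetric cipher such as AES (everything here is illustrative only):

```python
import os

# Hybrid sketch: asymmetric (toy RSA) protects a random session key;
# symmetric (XOR stream, a stand-in for AES) protects the bulk data.
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

session_key = os.urandom(1)[0]          # tiny random "session key" (< n)
wrapped = pow(session_key, e, n)        # client: encrypt key with server's public key
unwrapped = pow(wrapped, d, n)          # server: recover key with its private key

def xor_cipher(data: bytes, key: int) -> bytes:
    # placeholder for a real symmetric cipher; XOR is its own inverse
    return bytes(b ^ key for b in data)

ciphertext = xor_cipher(b"hello https", unwrapped)
plaintext = xor_cipher(ciphertext, session_key)
print(unwrapped == session_key)   # True
print(plaintext)                  # b'hello https'
```

The slow asymmetric step runs once per session, to move the key; all bulk traffic then uses the fast symmetric cipher.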

MTU related issues
