Computer networking interview questions, understood in one day

Network Hierarchy

Computer network architecture is commonly described with three models: the OSI seven-layer model, the TCP/IP four-layer model, and the five-layer model. Interviews most often ask about the five-layer model.

Five-layer model : application layer, transport layer, network layer, data link layer, physical layer.

  • Application layer: provides services directly to applications. The Internet has many application-layer protocols, such as the Domain Name System (DNS), HTTP, and SMTP.
  • Transport layer: provides data transmission services for communication between processes on two hosts. Its main protocols are the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP).
  • Network layer: selects suitable routes and switching nodes so data is delivered in time. Its main protocol is IP.
  • Data link layer: when transmitting data between two adjacent nodes, assembles the IP datagrams handed down by the network layer into frames and transmits those frames over the link between the adjacent nodes.
  • Physical layer: transmits the bit stream transparently between adjacent nodes, shielding differences in transmission media and physical devices as much as possible.

The OSI seven-layer model is a standard architecture developed by the International Organization for Standardization (ISO) for interconnecting computer and communication systems.

  • Application layer: the interface between network services and end users. Common protocols: HTTP, FTP, SMTP, SNMP, DNS.
  • Presentation layer: handles data representation, security, and compression, ensuring that information sent by the application layer of one system can be read by the application layer of another.
  • Session layer: establishes, manages, and terminates sessions, i.e. the ongoing conversations between processes on the local host and a remote host.
  • Transport layer: defines the port numbers used to transmit data, plus flow control and error checking. Protocols include TCP and UDP.
  • Network layer: performs logical addressing and path selection between different networks. Protocols include ICMP, IGMP, and IP.
  • Data link layer: builds data links between adjacent nodes on top of the bit-stream service provided by the physical layer.
  • Physical layer: establishes, maintains, and tears down physical connections.

TCP/IP four-layer model

  • Application layer: corresponds to the OSI application, presentation, and session layers.
  • Transport layer: corresponds to the OSI transport layer; it provides end-to-end communication for application-layer entities and ensures that packets are delivered in order and intact.
  • Internet layer: corresponds to the OSI network layer and mainly solves host-to-host communication.
  • Network interface layer: corresponds to the OSI data link layer and physical layer.

Three-way handshake

Assume the sender is the client and the receiver is the server. Initially, both the client and the server are in the CLOSED state.

(Figure: three-way handshake illustration)

  1. First handshake: the client sends a connection request to the server. The client generates a random initial sequence number x and sends a segment carrying the flag SYN=1 and sequence number seq=x. Before this handshake the client is in the CLOSED state; after it, the client is in SYN-SENT. The server is in LISTEN at this point.
  2. Second handshake: after receiving the client's segment, the server generates its own random initial sequence number y and replies with a segment carrying SYN=1, ACK=1, seq=y, and acknowledgment number ack=x+1. Before this handshake the server is in LISTEN; after it, the server is in SYN-RCVD and the client is still in SYN-SENT. (SYN=1 means the server agrees to establish the connection; ACK=1 means the acknowledgment number is valid.)
  3. Third handshake: after receiving the server's segment, the client sends a segment carrying ACK=1, sequence number seq=x+1, and acknowledgment number ack=y+1. Before this handshake the client is in SYN-SENT; after it, both the client and the server are in ESTABLISHED and the connection is set up. (A minimal socket-level sketch follows this list.)
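
As a concrete reference point, here is a minimal Java sketch (localhost and port 8080 are chosen arbitrarily). The kernel performs the three-way handshake inside connect()/accept(), so application code never sees the SYN/ACK segments themselves:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class HandshakeDemo {
    public static void main(String[] args) throws IOException {
        // Server side: binding the ServerSocket puts it into the LISTEN state.
        ServerSocket server = new ServerSocket(8080);

        // Client side: connect() triggers the three-way handshake
        // (SYN -> SYN/ACK -> ACK) and returns once the client side is ESTABLISHED.
        Socket client = new Socket();
        client.connect(new InetSocketAddress("127.0.0.1", 8080), 3000);

        // Server side: accept() hands back a socket for the ESTABLISHED connection.
        Socket accepted = server.accept();
        System.out.println("connected: " + accepted.getRemoteSocketAddress());

        accepted.close();   // starts the FIN/ACK exchange (the four-way wave below)
        client.close();
        server.close();
    }
}
```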


Would a two-way handshake be enough?

The third handshake is needed mainly to prevent an already-invalid connection request segment from suddenly reaching the server and causing an error.

  • For example, client A sends a connection request, but because of network congestion it receives no acknowledgment, so A retransmits the request.
  • The retransmitted request succeeds, data is transferred, and the connection is then released.
  • Later, the first connection request A sent (delayed in the network) finally reaches server B after the connection has been released. B mistakes it for a new connection request from A and replies with an acknowledgment segment.
  • Without the three-way handshake, a new connection would be established as soon as B sends that acknowledgment. But A ignores B's acknowledgment and sends no data, so B keeps waiting for data from A, wasting resources.

Four-way wave (connection release)

(Figure: four-way wave illustration)

  1. A's application process sends a connection release segment (FIN=1, seq=u) through its TCP, stops sending data, actively closes the TCP connection, enters the FIN-WAIT-1 state, and waits for B's acknowledgment.
  2. After B receives the release segment, it sends an acknowledgment segment (ACK=1, ack=u+1, seq=v) and enters the CLOSE-WAIT state. The connection is now half-closed: the direction from A to B has been released.
  3. After A receives B's acknowledgment, it enters the FIN-WAIT-2 state and waits for B's own connection release segment.
  4. When B has finished sending its data, it sends a connection release segment (FIN=1, ACK=1, seq=w, ack=u+1) and enters the LAST-ACK state, waiting for A's acknowledgment.
  5. After A receives B's release segment, it sends an acknowledgment segment (ACK=1, seq=u+1, ack=w+1) and enters the TIME-WAIT state. The connection is not yet released; only after the wait timer expires, a period of 2MSL (twice the maximum segment lifetime), does A enter the CLOSED state. B closes the connection as soon as it receives A's acknowledgment; if it does not receive it, B retransmits its connection release segment.

Why wait 2MSL after the fourth wave?

  • To ensure that the last ACK segment sent by A reaches B. This ACK segment may be lost; if B does not receive it, B times out and retransmits its connection release segment, and A can receive that retransmission within the 2MSL window, acknowledge it again, and restart the 2MSL timer, so that both A and B eventually enter the CLOSED state. If A released the connection immediately after sending the ACK instead of waiting in TIME-WAIT, it could not receive B's retransmitted release segment and would not resend the acknowledgment, so B could never enter the CLOSED state normally.
  • To prevent invalid connection request segments from appearing in the next connection. After A sends the last ACK and another 2MSL elapses, all segments produced by this connection have disappeared from the network, so no stale segments from the old connection can show up in a new one.

Why four waves?

When the server receives the client's SYN connection request, it can reply with a single SYN+ACK segment. But when closing the connection, a server that receives the client's connection release segment may not be able to close its socket immediately, so it first replies with an ACK to tell the client that the release segment was received. Only after the server has finished sending all of its remaining data can it send its own connection release segment, after which the two sides actually disconnect. That is why four waves are required.

Talk about what fields are in the header of the TCP message, and what are their functions?

(Figure: TCP segment header format)

  • 16-bit port numbers : the source port identifies where the segment came from on the sending host; the destination port identifies which upper-layer protocol or application it should be delivered to.
  • 32-bit sequence number : numbers each byte of the byte stream in one direction of a TCP connection (from establishment to teardown).
  • 32-bit acknowledgment number : the response to the segment sent by the other side; its value is the sequence number of the received segment plus one, i.e. the next byte expected.
  • 4-bit header length : how many 32-bit words (4 bytes each) the TCP header contains. Four bits can represent at most 15, so the TCP header is at most 60 bytes long.
  • 6-bit flags : URG (the urgent pointer is valid), ACK (the acknowledgment number is valid), PSH (the receiver should deliver the data to the application immediately instead of buffering it), RST (asks the other side to re-establish the connection), SYN (connection establishment flag), FIN (tells the other side the connection is being closed).
  • 16-bit window size : TCP's flow control mechanism. The window here is the advertised receive window: it tells the other side how many bytes the local TCP receive buffer can still hold, so the peer can control its sending rate.
  • 16-bit checksum : filled in by the sender and verified by the receiver to check whether the segment was damaged in transit. It covers not only the TCP header but also the data, and is an important part of TCP's reliable transmission. (It is a 16-bit ones'-complement checksum, not a CRC.)
  • 16-bit urgent pointer : a positive offset added to the sequence number field to give the sequence number of the byte after the last byte of urgent data; strictly speaking it is the urgent data's offset from the current sequence number. The urgent pointer is how a sender delivers urgent data to the receiver.

What are the characteristics of TCP?

  • TCP is a connection-oriented transport layer protocol.
  • Point-to-point , each TCP connection can only have two endpoints.
  • TCP provides reliable delivery .
  • TCP provides full-duplex communication .
  • Oriented to byte streams .

The difference between TCP and UDP?

  1. TCP is connection-oriented ; UDP is connectionless, that is, there is no need to establish a connection before sending data.
  2. TCP provides reliable service ; UDP does not guarantee reliable delivery.
  3. TCP is byte-oriented and treats data as an unstructured stream of bytes; UDP is message-oriented and preserves message boundaries.
  4. TCP has congestion control ; UDP has no congestion control, so network congestion will not reduce the sending rate of the source host (useful for real-time applications, such as real-time video conferencing, etc.).
  5. Each TCP connection can only be point-to-point ; UDP supports one-to-one, one-to-many, many-to-one and many-to-many communication methods.
  6. The TCP header overhead is 20 bytes; the UDP header overhead is small, only 8 bytes.

What are the common application layer protocols corresponding to TCP and UDP?

TCP-based application layer protocols include: HTTP, FTP, SMTP, TELNET, SSH

  • HTTP : HyperText Transfer Protocol, default port 80
  • FTP : File Transfer Protocol, default ports 20 (data) and 21 (control)
  • SMTP : Simple Mail Transfer Protocol, default port 25
  • TELNET : remote terminal (teletype network) protocol, default port 23
  • SSH : Secure Shell protocol, default port 22

UDP-based application layer protocols: DNS, TFTP, SNMP

  • DNS : Domain Name System, default port 53
  • TFTP : Trivial File Transfer Protocol, default port 69
  • SNMP : Simple Network Management Protocol; requests are received on UDP port 161, and only Trap messages use UDP port 162.

TCP sticky packet and unpacking

TCP is stream-oriented: the data is a boundary-less stream of bytes. The TCP layer does not understand the meaning of the upper-layer business data; it splits the stream into segments according to the actual state of the TCP buffers. So, from the application's point of view, one complete business packet may be split by TCP into several segments for transmission , or several small packets may be combined into one larger segment and sent together . This is the so-called TCP sticky packet / packet splitting problem.

Why do sticky packets and packet splitting occur?

  • If the data written is smaller than the TCP send buffer, TCP may send data from several write calls in a single segment, producing sticky packets;
  • If the receiving application does not read data from the receive buffer in time, sticky packets occur;
  • If the data to be sent is larger than the remaining space in the TCP send buffer, the packet is split;
  • If the data to be sent is larger than the MSS (maximum segment size), TCP splits it before transmission, i.e. whenever TCP payload length > MSS.

Solutions:

  • The sender pads every packet to a fixed length.
  • Append a special delimiter to the end of each packet and split on it.
  • Split each message into a header and a body: the header has a fixed size and contains a field declaring the size of the body (a minimal sketch of this approach follows).
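
A minimal sketch of the third approach, length-prefix framing, demonstrated in memory with the JDK's data streams rather than over a real socket (the same read/write helpers would wrap a socket's streams in practice):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class LengthPrefixFraming {
    // Write one message: a 4-byte length header followed by the body.
    static void writeFrame(DataOutputStream out, String msg) throws IOException {
        byte[] body = msg.getBytes(StandardCharsets.UTF_8);
        out.writeInt(body.length);
        out.write(body);
    }

    // Read exactly one message back, however TCP split or merged the bytes.
    static String readFrame(DataInputStream in) throws IOException {
        int len = in.readInt();          // header: body size
        byte[] body = new byte[len];
        in.readFully(body);              // blocks until the whole body has arrived
        return new String(body, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writeFrame(out, "hello");
        writeFrame(out, "world");        // two logical packets in one byte stream

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(readFrame(in)); // hello
        System.out.println(readFrame(in)); // world
    }
}
```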

Talk about how TCP ensures reliability?

  • First, the connection is established with a three-way handshake and torn down with a four-way wave , making connection setup and teardown reliable.
  • Second, TCP's reliability shows in its state tracking : TCP records which data has been sent, which has been acknowledged, and which has not been received, and guarantees that data arrives in order, so transmission is error-free.
  • Third, TCP's reliability shows in its control mechanisms : checksums on segments, ACK responses, timeout retransmission on the sender , retransmission of out-of-order data requested by the receiver, discarding of duplicate data, flow control (sliding window), and congestion control.


Talk about the sliding window mechanism of TCP

TCP uses a sliding window for flow control. Flow control means limiting the sender's rate so that the receiver has time to receive. Both sides of a TCP session maintain a send window and a receive window. The receive window's size depends on limits of the application, the system, and the hardware; the send window depends on the receive window advertised by the peer. The window field in the receiver's acknowledgments controls the sender's window size and therefore its sending rate; if the receiver advertises a window of 0, the sender cannot send data.

The TCP header contains a 16-bit window field giving the window capacity in bytes, at most 65535. With it the receiver tells the sender how much buffer space it has left for incoming data, so the sender can send according to the receiver's processing capacity without overwhelming it. The send window is roughly the same size as the receive window.

Tell me about congestion control in detail?

Congestion control prevents too much data from being injected into the network. TCP uses four methods: slow start, congestion avoidance, fast retransmit, and fast recovery.

(Figure: TCP congestion control)

Slow start

The congestion window cwnd is initialized to one maximum segment size (MSS). Each time an acknowledgment for a new segment is received, cwnd increases by at most one MSS, so with every transmission round cwnd doubles. To keep cwnd from growing so large that it causes congestion, a slow start threshold ssthresh is also maintained.

When cwnd < ssthresh, the slow start algorithm is used.

When cwnd > ssthresh, stop using the slow start algorithm and use the congestion avoidance algorithm instead.

When cwnd = ssthresh, either the slow start algorithm or the congestion avoidance algorithm may be used.

Congestion avoidance

The congestion window cwnd now grows slowly: each time a round-trip time (RTT) passes, the sender's cwnd increases by 1 instead of doubling, so cwnd grows linearly.

Whether in the slow start phase or the congestion avoidance phase, as soon as the sender judges that the network is congested (because no acknowledgment arrives), it sets the slow start threshold ssthresh to half of the current sender window (but not less than 2), resets cwnd to 1, and runs slow start again. The point is to quickly reduce the number of packets the host sends into the network, giving the congested routers enough time to work through the backlog in their queues.

Fast retransmission

Sometimes an individual segment is lost even though the network is not actually congested. If the sender waits for an acknowledgment until the retransmission timer expires, it wrongly concludes that the network is congested, starts slow start, and sets cwnd back to 1, which hurts transmission efficiency.

Fast retransmission avoids this. It requires the receiver to send a duplicate acknowledgment immediately whenever it receives an out-of-order segment, so the sender learns early that a segment has failed to arrive.

As soon as the sender receives three duplicate acknowledgments in a row, it retransmits the segment the other side has not yet received, without waiting for the retransmission timer to expire. Because unacknowledged segments are retransmitted as early as possible, fast retransmission raises overall network throughput by roughly 20%.

Fast recovery

When the sender receives three duplicate acknowledgments in a row, it halves the slow start threshold ssthresh, sets cwnd to that halved value, and then runs the congestion avoidance algorithm so the congestion window grows slowly and linearly.

With fast recovery, slow start is only used when a TCP connection is first established or when a timeout occurs. This form of congestion control noticeably improves TCP's performance. (The toy simulation below traces how cwnd evolves under these rules.)
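
A toy trace of these rules, not a real TCP implementation; the initial ssthresh of 16 MSS and the simulated loss at round 12 are arbitrary choices for illustration:

```java
public class CongestionControlSketch {
    public static void main(String[] args) {
        double cwnd = 1;        // congestion window, in units of MSS
        double ssthresh = 16;   // slow start threshold, in MSS

        for (int round = 1; round <= 20; round++) {
            if (round == 12) {
                // pretend three duplicate ACKs arrived: fast retransmit + fast recovery
                ssthresh = cwnd / 2;
                cwnd = ssthresh;          // fast recovery: skip slow start
            } else if (cwnd < ssthresh) {
                cwnd *= 2;                // slow start: doubles per round (one MSS per ACK)
            } else {
                cwnd += 1;                // congestion avoidance: +1 MSS per RTT
            }
            System.out.printf("round %2d  cwnd=%.1f  ssthresh=%.1f%n", round, cwnd, ssthresh);
        }
    }
}
```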

What are the characteristics of the HTTP protocol?

  1. HTTP allows data of arbitrary type to be transmitted; the type being transferred is indicated by Content-Type.
  2. Stateless : the server treats every request from a client as a new request; there is no connection between one request/response exchange and the next.
  3. Supports the client/server model .

HTTP message format

An HTTP request consists of four parts : request line, request header, blank line, and request body .

  • Request line : contains the request method, the URL of the requested resource, and the HTTP version. GET and POST are the most common methods; others include DELETE, HEAD, OPTIONS, PUT, and TRACE.
  • Request header : fields in the form "name: value"; the server learns about the client from them. Common fields include Cookie, Host, Connection, Accept-Language, Accept-Encoding, and User-Agent.
  • Request body : the data the user submits, such as a username and password.

Example request message :

POST /xxx HTTP/1.1                                              ← request line
Accept: image/gif, image/jpeg                                   ← request headers
Accept-Language: zh-cn
Connection: Keep-Alive
Host: localhost
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
Accept-Encoding: gzip, deflate

username=dabin                                                  ← request body

The HTTP response also consists of four parts, namely: status line, response header, blank line, and response body .

  • Status line : protocol version, status code and status description.
  • Response header : main fields include Connection, Content-Type, Content-Encoding, Content-Length, Set-Cookie, Last-Modified, Cache-Control, and Expires.
  • Response body : the content returned by the server to the client.

Example response message :

HTTP/1.1 200 OK
Server:Apache Tomcat/5.0.12
Date: Mon, 06 Oct 2003 13:23:42 GMT
Content-Length:112

<html>
    <body>response body</body>
</html>
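
For reference, a small Java sketch using the JDK's HttpURLConnection against the placeholder host example.com, printing the same three pieces of a response as the message above (status line parts, headers, and body):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class HttpMessageDemo {
    public static void main(String[] args) throws Exception {
        // example.com is just a placeholder host
        HttpURLConnection conn = (HttpURLConnection) new URL("http://example.com/").openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept-Language", "zh-cn");   // one request header field

        // Status code and reason phrase from the status line, e.g. "200 OK"
        System.out.println(conn.getResponseCode() + " " + conn.getResponseMessage());

        // Response headers
        conn.getHeaderFields().forEach((k, v) -> {
            if (k != null) System.out.println(k + ": " + v);
        });

        // Response body
        try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            in.lines().forEach(System.out::println);
        }
    }
}
```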

What are the HTTP status codes?

The HTTP status code is the response code the server returns to the client. From it we can tell what the server wants to say to the client: for example, 200 means the request succeeded and 500 means a server-side error. HTTP status codes fall into 5 categories:

  1. 1XX: message status code.
  2. 2XX: Success status code.
  3. 3XX: Redirection status code.
  4. 4XX: Client error status code.
  5. 5XX: Server error status code.

These five categories contain many specific status codes.

1XX are informational status codes , including:

  • 100 Continue: the client should continue with its request.
  • 101 Switching Protocols: the server switches protocols at the client's request. It can only switch to a higher-level protocol, e.g. a newer version of HTTP.

2XX are success status codes , including:

  • 200 OK: the request succeeded. Generally used for GET and POST requests.
  • 201 Created: the request succeeded and a new resource was created.
  • 202 Accepted: the request has been accepted but processing is not complete.
  • 203 Non-Authoritative Information: the request succeeded, but the returned meta-information comes from a copy rather than the origin server.
  • 204 No Content: the server processed the request successfully but returns no content, so the browser keeps displaying the current document without updating the page.
  • 205 Reset Content: the server processed the request successfully and the user agent (e.g. the browser) should reset the document view; this return code can be used to clear form fields.
  • 206 Partial Content: the server successfully processed part of a GET request (a range request).

3XX are redirection status codes , including:

  • 300 Multiple Choices: the requested resource has several possible locations; a list of resource characteristics and addresses may be returned for the user agent (e.g. the browser) to choose from.
  • 301 Moved Permanently: the requested resource has been permanently moved to a new URI. The response includes the new URI and the browser is redirected to it automatically; any future requests should use the new URI.
  • 302 Found: similar to 301, but the resource is only moved temporarily; the client should keep using the original URI.
  • 303 See Other: see another address, similar to 301; the target should be retrieved with a GET request.
  • 304 Not Modified: the requested resource has not been modified, so the server returns no resource body. Clients typically cache resources and send a header indicating they only want resources modified after a given date.
  • 305 Use Proxy: the requested resource must be accessed through a proxy.
  • 306 Unused: a deprecated HTTP status code.
  • 307 Temporary Redirect: similar to 302, but the request method must not change when following the redirect.

4XX are client error status codes , including:

  • 400 Bad Request: the client request has a syntax error the server cannot understand.
  • 401 Unauthorized: the request requires user authentication.
  • 402 Payment Required: reserved for future use.
  • 403 Forbidden: the server understands the request but refuses to carry it out.
  • 404 Not Found: the server cannot find the requested resource (web page). Site designers can provide a custom "the resource you requested could not be found" page for this code.
  • 405 Method Not Allowed: the method used in the request is not allowed.
  • 406 Not Acceptable: the server cannot fulfil the request with content matching the characteristics the client asked for.
  • 407 Proxy Authentication Required: like 401, but the client must authenticate with the proxy.
  • 408 Request Time-out: the server waited too long for the client's request and timed out.
  • 409 Conflict: may be returned when the server, while completing a PUT request, runs into a conflict while processing it.
  • 410 Gone: the requested resource no longer exists. Unlike 404, 410 is used when the resource has been permanently removed; the designer can point to the resource's new location with a 301.
  • 411 Length Required: the server refuses to process the request without a Content-Length header.
  • 412 Precondition Failed: a precondition given in the request headers was not met.
  • 413 Request Entity Too Large: the request is rejected because the entity is too large for the server to process. The server may close the connection to stop the client from continuing; if it is only temporarily unable to process the request, it includes a Retry-After header.
  • 414 Request-URI Too Large: the request URI (usually a URL) is too long for the server to process.
  • 415 Unsupported Media Type: the server cannot process the media format attached to the request.
  • 416 Requested range not satisfiable: the range requested by the client is invalid.
  • 417 Expectation Failed: the server cannot meet the expectation given in the Expect request header.

5XX are server error status codes , including:

  • 500 Internal Server Error: the server hit an internal error and could not complete the request.
  • 501 Not Implemented: the server does not support the requested functionality and cannot complete the request.
  • 502 Bad Gateway: the server, acting as a gateway or proxy, received an invalid response from the upstream server while trying to fulfil the request.
  • 503 Service Unavailable: the server is temporarily unable to handle the request because of overload or maintenance; the expected delay can be given in a Retry-After header.
  • 504 Gateway Time-out: the server, acting as a gateway or proxy, did not receive a timely response from the upstream server.
  • 505 HTTP Version not supported: the server does not support the HTTP version used in the request and cannot process it.

To sum it up :

HTTP status codes are divided into five categories: 1XX: message status code; 2XX: success status code; 3XX: redirection status code; 4XX: client error status code; 5XX: server error status code. The common specific status codes are: 200: The request is successful; 301: Permanent redirection; 302: Temporary redirection; 404: This page cannot be found; 405: The requested method type is not supported; 500: An internal server error.

What requests does the HTTP protocol include?

HTTP defines eight methods that represent different operations on the resource identified by the Request-URI:

  • GET: Make a request to a specific resource.
  • POST: Submit data to a specified resource for processing requests (such as submitting a form or uploading a file). Data is included in the request body. POST requests may result in the creation of new resources and/or the modification of existing resources.
  • OPTIONS: Returns the HTTP request methods supported by the server for a specific resource. It is also possible to test the functionality of the server by sending a '*' request to the web server.
  • HEAD: Ask the server for a response consistent with the GET request, but the response body will not be returned. This method makes it possible to obtain the meta information contained in the response header without having to transmit the entire response content.
  • PUT: Upload the latest content to the specified resource location.
  • DELETE: Request the server to delete the resource identified by the Request-URI.
  • TRACE: Echoes the requests received by the server, mainly for testing or diagnosis.
  • CONNECT: Reserved in HTTP/1.1 for proxy servers that can switch the connection into a tunnel.

What is the difference between HTTP status code 301 and 302?

  • 301: (Permanent Transfer) The requested webpage has been permanently moved to a new location. When the server returns this response, it automatically forwards the requester to the new location.
  • 302: (Temporary Transfer) The server is currently responding to requests from web pages in different locations, but the requester should continue to use the original location for future requests. This code is similar to the 301 code in response to GET and HEAD requests and automatically redirects the requester to a different location.

To give a vivid example : a 302 redirect is used when a site or page is moved to a new location only temporarily, say for 24 to 48 hours. It is like owning a house but staying at a relative's place for a couple of days before moving back. A 301 redirect is used when the old site has to be taken down for some reason and will from now on be reached at a new address, permanently. It is like a rented house whose lease has expired: you have found a place elsewhere, and the old one is no longer yours to use.

What is the difference between POST and GET?

  • The most essential difference between GET and POST lies in the specification: GET is defined for retrieving resources, i.e. query operations, while POST is defined for submitting entities, so POST is used for operations that create, modify, or delete data.
  • GET request parameters are passed through the URL, and POST parameters are placed in the request body.
  • A GET request can be revisited (back button) or refreshed without side effects for the user or the program; going back to or refreshing a POST re-submits the data.
  • GET produces one TCP packet while POST produces two: for a GET the browser sends the request line and headers together, while for a POST the browser first sends the headers, waits for the server's 100 Continue response, and then sends the body. (This behaviour depends on the browser implementation, not on the HTTP standard itself.)
  • GET requests are generally cached, such as common CSS, JS, HTML requests, etc.; and POST requests are not cached by default.
  • GET request parameters will be completely retained in the browser history, while parameters in POST will not be retained.

Difference between URI and URL

  • URI, Uniform Resource Identifier: its main job is to uniquely identify a resource.
  • URL, Uniform Resource Locator: its main job is to provide the path for locating a resource. A classic analogy: a URI is like an ID card number, which uniquely identifies a person, whereas a URL is more like a home address, through which the person can actually be found.

How to understand that the HTTP protocol is stateless

When a browser sends a request to the server for the first time, the server responds; when the same browser sends a second request, the server still responds, but it does not know this is the same browser as before. In short, the server does not remember who you are, which is why HTTP is called a stateless protocol.

HTTP long connection and short connection?

HTTP short connection: the browser and the server establish a connection for every HTTP operation and close it when the exchange finishes. HTTP/1.0 uses short connections by default .

HTTP long connection: a reused TCP connection . Several HTTP requests can share the same TCP connection, saving the cost of repeatedly setting up and tearing down TCP connections.

Since HTTP/1.1, persistent connections are the default . To use them, both the client and the server must set Connection: keep-alive in the HTTP headers.

How does HTTP implement long connections?

HTTP long and short connections are essentially TCP long and short connections . A TCP connection is a bidirectional channel that can be kept open for some time, so it is TCP that really has long and short connections.

A TCP long connection lets multiple HTTP requests reuse one TCP connection, reducing overhead: for example, after requesting the HTML, the follow-up JS/CSS requests can reuse the same connection instead of opening new ones.

How to set up a long connection?

By setting the Connection field in the request and response headers to keep-alive. HTTP/1.0 supports it but it is off by default; since HTTP/1.1, connections are persistent by default.

When will the HTTP persistent connection time out?

HTTP servers usually run as an httpd daemon in which a keep-alive timeout can be set: when a TCP connection stays idle longer than this, it is closed. A timeout can also be set in the HTTP headers.

TCP keep-alive itself has three parameters, configurable under net.ipv4 in the system kernel: after a connection has been idle for tcp_keepalive_time , a probe packet is sent; if no ACK is received from the other side, another probe is sent every tcp_keepalive_intvl; after tcp_keepalive_probes unanswered probes, the connection is dropped.
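
In Java, whether a given TCP socket participates in that OS-level keep-alive is toggled per socket. A minimal sketch (example.com:80 is just a placeholder endpoint; the idle time, interval, and probe count still come from the kernel settings above):

```java
import java.net.InetSocketAddress;
import java.net.Socket;

public class KeepAliveDemo {
    public static void main(String[] args) throws Exception {
        Socket socket = new Socket();
        socket.connect(new InetSocketAddress("example.com", 80), 3000);
        // Ask the OS to send TCP keep-alive probes on this connection when it is idle.
        socket.setKeepAlive(true);
        System.out.println("keep-alive enabled: " + socket.getKeepAlive());
        socket.close();
    }
}
```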

What is the difference between HTTP1.1 and HTTP2.0?

Features supported by HTTP2.0 compared to HTTP1.1:

  • New binary format : HTTP1.1 transmits data based on text format; HTTP2.0 transmits data in binary format, which makes parsing more efficient.

  • Multiplexing : within one connection, multiple requests or responses can be sent at the same time and transmitted in parallel without blocking each other , avoiding the "head-of-line blocking" problem of HTTP1.1.

  • Header compression : HTTP1.1 headers carry a lot of information and are re-sent in full with every request. HTTP2.0 separates headers from data into header frames and data frames and compresses header frames with a dedicated algorithm, effectively shrinking them. In addition, both client and server keep a record of previously sent header key-value pairs and do not resend identical data: if request a sends all header fields, request b only needs to send the differences , cutting redundant data and overhead.

  • Server-side push : HTTP2.0 allows the server to push resources to the client without the client sending a request to the server to obtain.

What is the difference between HTTPS and HTTP?

  1. HTTP is the hypertext transfer protocol and transmits information in plain text ; HTTPS is HTTP over SSL/TLS, an encrypted transport.
  2. HTTP and HTTPS use different ports, HTTP port is 80, HTTPS is 443.
  3. The HTTPS protocol needs to apply for a certificate from a CA institution , which generally requires a certain fee.
  4. HTTP runs directly on TCP; HTTPS runs on SSL/TLS, which in turn runs on TCP.

What is a digital certificate?

A server can apply for a certificate from a certificate authority (CA) to defend against man-in-the-middle attacks (the certificate cannot be tampered with undetected). A certificate has three parts: the certificate content, the certificate signature algorithm, and the signature . The signature is used to verify identity.

The server sends the certificate to the browser, and the browser extracts the public key from it; the certificate proves that the public key belongs to this website.

The process of making a digital signature :

  1. The CA uses a certificate signature algorithm to perform hash operations on the certificate content .
  2. Encrypt the hashed value with the CA's private key to obtain a digital signature.

Browser verification process :

  1. Obtain the certificate and read out the certificate content, the signature algorithm, and the digital signature.
  2. Decrypt the digital signature with the CA's public key (the browser trusts the CA and already stores its public key).
  3. Hash the certificate content with the signature algorithm named in the certificate.
  4. Compare the decrypted signature with the freshly computed hash; if they match, the certificate is trusted. (The sketch below walks through the same sign/verify steps with the JDK's Signature API.)
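
The same sign-then-verify flow can be sketched with the JDK's java.security API. Here the CA key pair and the certificate content are stand-ins generated on the spot, not a real certificate chain:

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SignatureSketch {
    public static void main(String[] args) throws Exception {
        // Stand-in for the CA's key pair; a real CA keeps its private key secret.
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair caKeys = gen.generateKeyPair();

        byte[] certificateContent =
                "subject=www.example.com; publicKey=...".getBytes(StandardCharsets.UTF_8);

        // CA side: hash the certificate content and encrypt the hash with the CA private key.
        // SHA256withRSA performs both steps internally.
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(caKeys.getPrivate());
        signer.update(certificateContent);
        byte[] digitalSignature = signer.sign();

        // Browser side: hash the received content and compare it against the signature
        // using the CA public key it already trusts.
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(caKeys.getPublic());
        verifier.update(certificateContent);
        System.out.println("certificate trusted: " + verifier.verify(digitalSignature));
    }
}
```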

Principle of HTTPS

First comes the TCP three-way handshake, then the client starts the HTTPS (TLS) handshake. The client sends a Client Hello, the server answers with a Server Hello and its certificate, the two sides then perform a key exchange, and finally the exchanged key is used to encrypt and decrypt the application data.

  1. Negotiate an encryption algorithm . In the Client Hello the client tells the server its TLS version, the encryption algorithms it supports, the domain name it wants to visit, and a random number (nonce) generated for the server. The domain name is sent up front so the server can return the certificate for that domain.

  2. The server replies with a Server Hello, telling the client which encryption algorithm it has chosen .

  3. The server then sends two certificates to the client; the second is the certificate of the CA that issued the first.

  4. The client verifies the certificates with the CA's publicly released RSA public key.

  5. After verification passes, the browser and the server derive a shared symmetric key through a key exchange algorithm .

  6. Data transfer begins, with both sides encrypting and decrypting using that symmetric key.

DNS resolution process?

  1. The browser first checks its own DNS cache.
  2. If there is no hit, it checks the operating system's DNS cache and the hosts file.
  3. If there is still no hit, the operating system sends the domain name to the local DNS server , which checks its own cache and returns the result if found; otherwise it queries the root DNS server, the top-level-domain DNS server, and the authoritative DNS server in turn, and finally returns the IP address to the local DNS server.
  4. The local DNS server returns the IP address to the operating system and caches it.
  5. The operating system returns the IP address to the browser and caches it as well.
  6. The browser now has the IP address for the domain name. (A one-call JDK lookup that goes through this chain is sketched below.)
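
From application code, this whole chain is usually hidden behind a single resolver call. A minimal JDK sketch (example.com is a placeholder domain):

```java
import java.net.InetAddress;

public class DnsLookupDemo {
    public static void main(String[] args) throws Exception {
        // Resolution goes through the JVM/OS resolver: local caches and the hosts file first,
        // then the configured local DNS server, as described in the steps above.
        for (InetAddress addr : InetAddress.getAllByName("example.com")) {
            System.out.println(addr.getHostAddress());
        }
    }
}
```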

Enter the URL in the browser to return to the page process?

  1. Resolve the domain name and find the host IP.
  2. The browser uses IP to directly communicate with the website host, three-way handshake , and establishes a TCP connection. The browser will use a random port to initiate a TCP connection to port 80 of the web program on the server.
  3. After establishing the TCP connection, the browser initiates an HTTP request to the host.
  4. Parameters are passed from client to server.
  5. After the server gets the client parameters, it performs corresponding business processing, and then encapsulates the result into an HTTP package and returns it to the client.
  6. The interaction between the server and the client is completed, and the TCP connection is disconnected (4 waves).
  7. The browser parses the response content, renders it , and presents it to the user.

(Figure: entering a URL and getting the page back, part 1)

The process of DNS domain name resolution

Hosts on a network are located by IP address, so the first step of visiting a URL is obtaining the server's IP address, which requires DNS (Domain Name System) resolution. DNS resolution maps the domain name in the URL to the corresponding IP address.

The general process of DNS domain name resolution is as follows:

  1. First check the DNS cache in the browser ; if the browser has a matching record, use it and resolution is done;
  2. If the browser has no cache entry, query the operating system's cache ; a hit returns the IP address directly and completes the resolution;
  3. If the OS has no DNS cache either, check the local hosts file (on Windows usually "C:\Windows\System32\drivers\etc\hosts"); if it has a record, use it directly;
  4. If the hosts file has no matching record, ask the local DNS server , usually provided by the local network provider (China Mobile, China Unicom, etc.). It is normally assigned automatically via DHCP but can also be configured manually; well-known public DNS servers include Google's 8.8.8.8 and the domestic 114.114.114.114.
  5. If the local DNS server has no record, the query goes to the root DNS servers . To handle the world's resolution requests efficiently, the root servers do not resolve domain names themselves; they delegate the request to the servers below them.


What are cookies and sessions?

Since the HTTP protocol is a stateless protocol, some mechanism is needed to identify the specific user identity and track the user's entire session. Commonly used session tracking technologies are cookies and sessions.

A cookie is a piece of information the server sends to the client; the client stores it as small text data and attaches it to every subsequent request to the server. More concretely: when a user visits a cookie-enabled website through a browser, the user submits personal information (such as a username) to the server; the server returns the corresponding hypertext response and, along with it, that personal information, placed not in the HTTP response body but in the HTTP response headers. When the browser receives the response, it stores the cookie. From then on, every request the client sends to the server carries the cookie in the HTTP request headers; by parsing it, the server recovers the client-specific information and can generate content tailored to that client. The "remember me" option on login pages is implemented with cookies.


Cookie workflow :

  1. The servlet creates a cookie, saves a small amount of data, and sends it to the browser.
  2. The browser obtains the cookie data sent by the server and automatically saves it to the browser.
  3. The next time you visit, the browser will automatically send the cookie data to the server.

How sessions work : when the browser requests a resource from the web server, the server first checks whether the request already carries a session identifier, called the session id. If it does, a session was already created for this client; the server looks up that session by its id and uses it. If the request carries no session id, the server creates a new session for the client, generates a unique session id bound to it, stores it in a cookie, and returns it to the client in this response. From then on, the browser carries the session id with every request, and the server finds the corresponding session by that id, so data can be shared across requests. Note that a session does not die when the browser closes; it is removed when it times out.
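
A toy model of that session-id mechanism, with a plain map standing in for the server's session store (JSESSIONID is the cookie name Java servlet containers use; everything else here is simplified for illustration):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class SessionSketch {
    // Server-side store: session id -> session data
    static final Map<String, Map<String, Object>> SESSIONS = new HashMap<>();

    // Handle one request; sessionIdFromCookie is null when the request carries no JSESSIONID cookie.
    static String handleRequest(String sessionIdFromCookie) {
        Map<String, Object> session =
                sessionIdFromCookie == null ? null : SESSIONS.get(sessionIdFromCookie);
        if (session == null) {
            // No usable session id: create a new session and return its id,
            // which would go back to the client as Set-Cookie: JSESSIONID=<newId>.
            String newId = UUID.randomUUID().toString();
            SESSIONS.put(newId, new HashMap<>());
            return newId;
        }
        return sessionIdFromCookie; // existing session found, reuse it
    }

    public static void main(String[] args) {
        String id = handleRequest(null);       // first visit: server issues a session id
        String sameId = handleRequest(id);     // later visits: browser sends the cookie back
        System.out.println(id.equals(sameId)); // true - the server found the same session
    }
}
```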

The difference between cookie and session?

  • Scope : cookies are stored on the client; sessions are stored on the server.
  • Lifetime : a cookie can be set to last a long time (for example the common "stay logged in" feature); a session usually expires quickly, when the client is closed or the session times out.
  • Privacy : cookies live on the client and are easier to steal; sessions live on the server and are safer than cookies.
  • Size : a single cookie cannot hold more than 4 KB of data; a session has no hard limit, but for the sake of server performance you should not store too much in it, and a session cleanup mechanism is needed.

What are symmetric and asymmetric encryption?

Symmetric encryption : both parties use the same key to encrypt and decrypt. It is fast, but if the key leaks, the ciphertext can be decrypted by anyone who has it. Common symmetric algorithms are AES and DES.

Asymmetric encryption : two keys are generated, a public key and a private key . The public key is available to anyone; the private key is kept secret. The public key encrypts and the private key decrypts, or the private key encrypts (signs) and the public key decrypts (verifies). It is more secure , but the computation is far heavier than symmetric encryption , so encryption and decryption are slow. Common asymmetric algorithms are RSA and DSA.
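
A minimal JDK sketch of both kinds, using AES for the symmetric case and RSA for the asymmetric case (provider-default modes are used for brevity; fine for a demo, but real code would pick an authenticated mode such as AES/GCM):

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;

public class CryptoDemo {
    public static void main(String[] args) throws Exception {
        byte[] plain = "hello".getBytes(StandardCharsets.UTF_8);

        // Symmetric: one shared AES key both encrypts and decrypts.
        SecretKey aesKey = KeyGenerator.getInstance("AES").generateKey();
        Cipher aes = Cipher.getInstance("AES");
        aes.init(Cipher.ENCRYPT_MODE, aesKey);
        byte[] aesCipher = aes.doFinal(plain);
        aes.init(Cipher.DECRYPT_MODE, aesKey);
        System.out.println(new String(aes.doFinal(aesCipher), StandardCharsets.UTF_8));

        // Asymmetric: encrypt with the RSA public key, decrypt with the private key.
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair pair = gen.generateKeyPair();
        Cipher rsa = Cipher.getInstance("RSA");
        rsa.init(Cipher.ENCRYPT_MODE, pair.getPublic());
        byte[] rsaCipher = rsa.doFinal(plain);
        rsa.init(Cipher.DECRYPT_MODE, pair.getPrivate());
        System.out.println(new String(rsa.doFinal(rsaCipher), StandardCharsets.UTF_8));
    }
}
```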

Talk about the difference between WebSocket and socket

A socket is a standard programming interface: it wraps TCP/IP in a high-level abstraction and hides the network details so developers can do network programming more easily. A socket endpoint is effectively IP address + port + protocol .

WebSocket is a persistent protocol introduced alongside HTML5 to address the fact that HTTP does not support persistent two-way connections .

In short, a socket is a standard interface for network programming , while WebSocket is an application-layer communication protocol.

The working process of the ARP protocol?

ARP resolves the mapping between IP addresses and MAC addresses for hosts and routers on the same LAN.

  • Every host keeps an ARP table in its ARP cache recording the correspondence between IP addresses and MAC addresses.
  • When a source host wants to send a packet to a destination host, it first checks its ARP table for the MAC address matching the destination IP. If it is there, the packet is sent straight to that MAC address; if not, the host broadcasts an ARP request on the local network segment asking for the MAC address of the destination host. The ARP request contains the source host's IP address and hardware (MAC) address and the destination host's IP address.
  • Every host on the segment receives the ARP request and compares the destination IP in it with its own IP address. Hosts whose address does not match ignore the request. The host whose address matches first adds the sender's MAC and IP addresses to its own ARP table (overwriting any existing entry for that IP) and then sends an ARP reply to the source host, telling it the MAC address it was looking for.
  • When the source host receives the ARP reply, it adds the destination host's IP and MAC addresses to its ARP table and uses them to start transmitting data.
  • If the source host never receives an ARP reply, the ARP lookup has failed.

Functions of the ICMP protocol

ICMP is the Internet Control Message Protocol.

  • ICMP is a connectionless protocol used to carry error-reporting and control information.
  • It is an important protocol, including for network security. It belongs to the network layer and is mainly used to pass control information between hosts and routers: error reports, limited control and status information , and so on.
  • When an IP datagram cannot reach its destination, or an IP router cannot forward datagrams at the current rate, and so on, an ICMP message is sent automatically.

For example, the ping command we use every day is based on ICMP.

What are DoS, DDoS, DRDoS attacks?

  • DoS (Denial of Service): any attack that causes a denial of service is a DoS attack. The most common DoS attacks are network bandwidth attacks and connectivity attacks .
  • DDoS (Distributed Denial of Service): multiple attackers in different locations attack one or several targets at the same time, or one attacker controls many machines in different locations and uses them to attack the victim simultaneously. Common DDoS attacks include SYN Flood, Ping of Death, ACK Flood, and UDP Flood.
  • DRDoS (Distributed Reflection Denial of Service): the attacker sends large numbers of packets whose source IP address is forged to be the victim's; the hosts that receive them then send their replies in bulk to that forged source address, flooding the victim and producing a denial of service.

What is a CSRF attack and how to avoid it

CSRF, cross-site request forgery (Cross-Site Request Forgery), is an attack that coerces a user's browser into performing unintended operations on a web application the user is currently logged in to.

How to solve CSRF attack?

  • Check the Referer field.
  • Add verification token.

What is an XSS attack?

XSS, Cross-Site Scripting: a malicious attacker injects malicious HTML/script code into a web page; when users browse the page, the embedded code executes in their browser, letting the attacker harm the user. XSS attacks are generally divided into three types: stored, reflected, and DOM-based XSS.

How to solve the XSS attack problem?

  • Filter the input: strip or whitelist tags so only legal values are allowed.
  • HTML-escape the output (sketched below).
  • For link jumps such as <a href="xxx">, validate the content and forbid illegal links, e.g. those whose target is script content.
  • Limit input length.
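
A minimal sketch of the HTML-escaping idea mentioned above (a hand-rolled helper for illustration; real projects normally use an existing escaping utility):

```java
public class HtmlEscape {
    // Replace the characters that let input break out of the HTML text context.
    static String escape(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (char c : input.toCharArray()) {
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#x27;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String payload = "<script>alert('xss')</script>";
        // The escaped form is rendered as text instead of executing as a script.
        System.out.println(escape(payload));
    }
}
```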

Anti-leech

Hotlinking means a service provider does not host the content itself but uses technical means (think of a crawler) to fetch resources from other websites and display them on its own site. Common forms include image hotlinking, audio hotlinking, and video hotlinking.

Hotlinking consumes a lot of the victim site's bandwidth while the real click-through it brings may be tiny, which seriously harms the hotlinked site's interests.

Sites that get hotlinked naturally defend against it , either by changing file names frequently or by checking the Referer: a normal user reaches an image by clicking a link on the site itself, so if a request's Referer is another website, the request is treated as coming from a crawler.

What is Referer?

The Referer here is a field in the HTTP headers, also called the HTTP source address (HTTP Referer). It indicates which page the current request was linked from, given as a URL. In other words, the HTTP Referer header tells the server where visitors come from, and it is also commonly used to defend against forged cross-site requests.

Hotlinked sites implement targeted anti-hotlinking , and a naive Referer check can be bypassed by setting the Referer in the request headers , which is what a crawler scraping someone else's website would do.
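
For illustration, a request that sets its own Referer header with the JDK's HttpURLConnection (example.com and the paths are placeholders); this is exactly the header a naive anti-hotlink check inspects:

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class RefererDemo {
    public static void main(String[] args) throws Exception {
        // example.com stands in for a hotlink-protected resource
        HttpURLConnection conn =
                (HttpURLConnection) new URL("http://example.com/logo.png").openConnection();
        // Pretend the request was clicked from the site's own page,
        // which is what a referer-based anti-hotlinking check looks for.
        conn.setRequestProperty("Referer", "http://example.com/index.html");
        System.out.println(conn.getResponseCode());
    }
}
```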

What is an empty referer and when will it appear?

First, we define an empty Referer as the content of the Referer header is empty, or an HTTP request does not contain the Referer header at all.

So when does the HTTP request not contain the Referer field? According to the definition of Referer, its function is to indicate where a request is linked from, so when a request is not triggered by a link, then naturally there is no need to specify the link source of the request.

For example, if you directly enter the URL address of a resource in the address bar of the browser, the request will not include the Referer field, because this is a "generated out of thin air" HTTP request, not linked from a place.

Talk about the principle of ping

Ping (Packet Internet Groper) is a program for testing network connectivity. It is a command that works at the application layer of the TCP/IP architecture: it sends ICMP (Internet Control Message Protocol) echo request messages to a specific destination host to test whether the destination is reachable and to learn its status.

Generally speaking, ping is used to detect whether a host is reachable over the network. It works on top of ICMP . Suppose machine A pings machine B ; the process is as follows:

  1. The ping program asks the system to build an ICMP echo request packet in a fixed format.
  2. The ICMP layer hands the packet, together with machine B's IP address, down to the IP layer.
  3. The IP layer builds an IP datagram with the local IP address as the source address, machine B's IP address as the destination address, and some other control information.
  4. The MAC address of machine B is obtained first.
  5. The data link layer builds a frame whose destination address is the MAC address passed down from the IP layer and whose source address is the local machine's MAC address.
  6. When machine B receives the frame, it checks whether the destination MAC address matches its own; if so, it processes the packet and replies.
  7. The round-trip time is computed from the timestamp in the ICMP echo reply returned by the destination host.
  8. The final output includes: the destination IP address; the number of packets sent, received, and lost; and the minimum, maximum, and average round-trip times. (A simple reachability check using the JDK is sketched below.)
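
Sending raw ICMP needs elevated privileges, but the JDK exposes a best-effort reachability check that uses an ICMP echo request when it can and falls back to a TCP probe otherwise. A small sketch (example.com is a placeholder host):

```java
import java.net.InetAddress;

public class PingDemo {
    public static void main(String[] args) throws Exception {
        InetAddress target = InetAddress.getByName("example.com");
        long start = System.currentTimeMillis();
        // isReachable() uses an ICMP echo request if the JVM has the privilege to send one,
        // otherwise it tries a TCP connection to port 7 (Echo) on the destination host.
        boolean reachable = target.isReachable(3000);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(target.getHostAddress()
                + " reachable=" + reachable + " time=" + elapsed + "ms");
    }
}
```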


Origin blog.csdn.net/Tyson0314/article/details/130048528