(Technical direction) Computer network questions I was often asked in Java interviews.

The difference between GET and POST

GET method: note that the query string (name/value pairs) is sent in the URL of a GET request: /test/demo_form.asp?name1=value1&name2=value2

  • GET requests can be cached

  • GET requests remain in the browser history

  • GET request can be bookmarked

  • GET requests should not be used when processing sensitive data

  • GET requests have a length limit (in practice browsers and servers limit the URL length)

  • GET requests should only be used to retrieve data.

POST method: note that the query string (name/value pairs) is sent in the HTTP message body of a POST request:

    POST /test/demo_form.asp HTTP/1.1
    Host: w3schools.com

    name1=value1&name2=value2

  • POST requests will not be cached

  • POST requests will not remain in the browser history

  • POST requests cannot be bookmarked

  • POST requests have no restriction on data length
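
As a rough illustration of the two lists above, the following sketch builds the text of a GET request and a POST request from the same name/value pairs. The class and method names are made up for this example, and it assumes Java 10 or later for URLEncoder.encode(String, Charset); it only assembles strings and does not send anything.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: shows where the name/value pairs end up in each request.
class GetVsPostSketch {
    // Encodes the pairs as application/x-www-form-urlencoded text.
    static String encodeForm(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joiner.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    // GET: the pairs travel in the request line, after the '?'.
    static String buildGet(String host, String path, Map<String, String> params) {
        return "GET " + path + "?" + encodeForm(params) + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n\r\n";
    }

    // POST: the pairs travel in the message body, after the blank line.
    static String buildPost(String host, String path, Map<String, String> params) {
        String body = encodeForm(params);
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "Content-Length: " + length + "\r\n\r\n"
                + body;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("name1", "value1");
        params.put("name2", "value2");
        System.out.println(buildGet("w3schools.com", "/test/demo_form.asp", params));
        System.out.println(buildPost("w3schools.com", "/test/demo_form.asp", params));
    }
}
```

Because the GET parameters are part of the URL, they show up in history, bookmarks, and logs, which is why the list above advises against GET for sensitive data.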

Protocols used by DNS

DNS uses both TCP and UDP.

First, understand the limits on message length for TCP and UDP:

  • A traditional DNS message over UDP is at most 512 bytes, while TCP allows messages longer than 512 bytes. When a DNS message exceeds 512 bytes, it is truncated and the TC (truncated) flag in the header is set; the client then resends the query over TCP.

Zone transfers use TCP, for two main reasons:

  • The secondary name server queries the primary name server at regular intervals (usually every 3 hours) to find out whether the data has changed. If it has, a zone transfer is performed to synchronize the data. Zone transfers use TCP rather than UDP because much more data is transferred than in a single request and response.
  • TCP is a reliable, connection-oriented protocol, which ensures the accuracy of the data.

Domain name resolution uses UDP:

The client queries the DNS server for a domain name. The returned content generally does not exceed 512 bytes, so it can be carried over UDP. No TCP three-way handshake is needed, so the load on the DNS server is lower and the response is faster. In theory the client can also ask to use TCP when querying the DNS server, but in practice many DNS servers are configured to accept only UDP queries.
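
The fallback from UDP to TCP hinges on a single bit. In the 12-byte DNS header, the TC flag is bit 0x02 of the third byte (index 2). The sketch below only inspects that bit in a byte array; the class and method names are invented for illustration, and no real DNS query is performed.

```java
// Hypothetical helper that inspects the TC (truncated) flag of a DNS message.
class DnsTruncationSketch {
    // Classic size limit for a DNS message carried over UDP.
    static final int CLASSIC_UDP_LIMIT = 512;

    // Returns true when the TC bit is set in the header flags.
    static boolean isTruncated(byte[] message) {
        if (message.length < 12) {
            throw new IllegalArgumentException("a DNS header is 12 bytes long");
        }
        return (message[2] & 0x02) != 0;
    }

    // Decides which transport a client would use for its next attempt.
    static String nextTransport(byte[] udpResponse) {
        return isTruncated(udpResponse) ? "TCP" : "UDP";
    }

    public static void main(String[] args) {
        byte[] header = new byte[12];
        header[2] = (byte) 0x82; // QR=1 (response) and TC=1 (truncated)
        System.out.println("retry over: " + nextTransport(header));
    }
}
```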

Idempotent

An idempotent operation is one where executing it any number of times has the same effect as executing it once. Idempotent functions, or idempotent methods, can be called repeatedly with the same parameters and yield the same result. Repeating such a call does not change the system state any further than the first call did, so there is no need to worry about repeated execution. For example, getUsername() and setTrue() are both idempotent functions.
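
A minimal sketch of the idea, using the getUsername() and setTrue() names from the text plus a made-up increment() method as a counterexample. The class itself is hypothetical.

```java
// Hypothetical class contrasting idempotent and non-idempotent methods.
class IdempotencySketch {
    private final String username;
    private boolean flag;
    private int counter;

    IdempotencySketch(String username) {
        this.username = username;
    }

    // Idempotent: reads state and never changes it.
    String getUsername() {
        return username;
    }

    // Idempotent: the first call changes state, repeated calls change nothing more.
    void setTrue() {
        flag = true;
    }

    // NOT idempotent: every call changes the state again.
    void increment() {
        counter++;
    }

    boolean isFlag() {
        return flag;
    }

    int getCounter() {
        return counter;
    }

    public static void main(String[] args) {
        IdempotencySketch s = new IdempotencySketch("alice");
        s.setTrue();
        s.setTrue();
        s.increment();
        s.increment();
        System.out.println(s.getUsername() + " flag=" + s.isFlag() + " counter=" + s.getCounter());
    }
}
```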

The difference between cookies and session

A cookie is a technique that lets a website's server store a small amount of data on the client's hard disk or in its memory, and read that data back from the client.

  • Cookie: a very small text file that a web server places on your hard drive when you browse a website. It can record your user ID, password, pages visited, time spent, and other information.
  • Session: when a user requests a web page from an application and does not yet have a session, the web server automatically creates a Session object. When the session expires or is abandoned, the server terminates it.
  • Mechanisms: the cookie mechanism keeps state on the client side, while the session mechanism keeps state on the server side. Because the server-side scheme still needs to store an identifier on the client, the session mechanism may rely on the cookie mechanism to store that identifier.
  • A session is the means by which the server tracks a user. Each session has a unique identifier, the session ID. When the server creates a session, the response sent to the client contains a Set-Cookie header holding a key-value pair whose key is sid and whose value is the session ID. The client saves this cookie in the browser, and every subsequent request includes the session ID. HTTP tracks user state through this cooperation between session and cookie: the session lives on the server and the cookie on the client.
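
The cooperation between the two can be sketched with an in-memory map on the server side and plain header strings. Everything here (class name, method names, the cookie attributes) is an assumption for illustration; real servlet containers manage sessions for you.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory session store keyed by the session ID.
class SessionSketch {
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Server side: create a session and return its unique ID.
    String createSession() {
        String sid = UUID.randomUUID().toString();
        sessions.put(sid, new ConcurrentHashMap<>());
        return sid;
    }

    // Response header that asks the browser to store the session ID.
    static String setCookieHeader(String sid) {
        return "Set-Cookie: sid=" + sid + "; Path=/; HttpOnly";
    }

    // Extracts the sid from a Cookie request header value such as "theme=dark; sid=abc".
    static String sidFromCookieHeader(String cookieHeaderValue) {
        for (String part : cookieHeaderValue.split(";")) {
            String[] kv = part.trim().split("=", 2);
            if (kv.length == 2 && kv[0].equals("sid")) {
                return kv[1];
            }
        }
        return null;
    }

    // Server side: find the session state for an incoming request, or null.
    Map<String, Object> lookup(String sid) {
        return sid == null ? null : sessions.get(sid);
    }

    public static void main(String[] args) {
        SessionSketch store = new SessionSketch();
        String sid = store.createSession();
        System.out.println(setCookieHeader(sid));
        store.lookup(sid).put("user", "alice");
        String sentBack = sidFromCookieHeader("theme=dark; sid=" + sid);
        System.out.println("user = " + store.lookup(sentBack).get("user"));
    }
}
```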

Causes of TCP sticky packets and packet splitting

The data written by the application is larger than the socket send buffer.

TCP segments the data according to the MSS. MSS stands for Maximum Segment Size: the maximum length of the data field in a TCP segment. The data field plus the TCP header make up the whole TCP segment, so MSS is not the maximum length of the TCP segment, but rather: MSS = TCP segment length - TCP header length.

The payload of an Ethernet frame is larger than the MTU, so IP fragmentation occurs. MTU (Maximum Transmission Unit) is the largest packet size that a given layer of a communication protocol can pass. If the IP layer has a packet to send whose length exceeds the link-layer MTU, it splits the packet into several fragments so that none exceeds the MTU. Note that IP fragmentation can occur on the original sending host or on an intermediate router.
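
A small worked example of the arithmetic, assuming a 1500-byte Ethernet MTU and 20-byte IPv4 and TCP headers without options (these numbers are typical values chosen for illustration, not taken from the text above).

```java
// Hypothetical helper for the MSS arithmetic described above.
class MssSketch {
    // MSS = MTU - IP header - TCP header.
    static int mss(int mtu, int ipHeaderBytes, int tcpHeaderBytes) {
        return mtu - ipHeaderBytes - tcpHeaderBytes;
    }

    // Number of TCP segments needed to carry the application data (rounded up).
    static int segmentsNeeded(int applicationBytes, int mss) {
        return (applicationBytes + mss - 1) / mss;
    }

    public static void main(String[] args) {
        int mss = mss(1500, 20, 20);
        System.out.println("MSS = " + mss);
        System.out.println("segments for 4000 bytes = " + segmentsNeeded(4000, mss));
    }
}
```

One write of 4000 bytes therefore leaves the host as several segments, which is one reason the receiver cannot assume that one read corresponds to one application message.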

Solutions to TCP sticky packets and packet splitting

Use fixed-length messages, for example 100 bytes each.

Append a special delimiter such as a carriage return or space to the end of each packet; the FTP protocol is a typical example.

Divide the message into a header and a body, with the header carrying the length of the message.

Use a more complex protocol, such as RTMP.
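
The header-plus-length approach can be sketched as follows. This is a minimal decoder under assumed conventions (a 4-byte big-endian length prefix followed by a UTF-8 body); the class name and framing format are illustrative, and it does not validate negative or oversized lengths.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical length-prefixed framing: [4-byte length][body bytes].
class LengthPrefixedFraming {
    // Bytes received so far that do not yet form a complete message.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    // Sender side: prepend the body length to the body.
    static byte[] encode(String message) {
        byte[] body = message.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + body.length).putInt(body.length).put(body).array();
    }

    // Receiver side: accept an arbitrary chunk and return every complete message in it.
    List<String> feed(byte[] chunk) {
        pending.write(chunk, 0, chunk.length);
        ByteBuffer buf = ByteBuffer.wrap(pending.toByteArray());
        List<String> messages = new ArrayList<>();
        while (buf.remaining() >= 4) {
            buf.mark();
            int length = buf.getInt();
            if (buf.remaining() < length) {
                buf.reset(); // body incomplete: wait for more bytes
                break;
            }
            byte[] body = new byte[length];
            buf.get(body);
            messages.add(new String(body, StandardCharsets.UTF_8));
        }
        // Keep only the unconsumed tail for the next call.
        pending.reset();
        pending.write(buf.array(), buf.position(), buf.remaining());
        return messages;
    }

    public static void main(String[] args) {
        LengthPrefixedFraming decoder = new LengthPrefixedFraming();
        byte[] first = encode("hello");
        System.out.println(decoder.feed(first));
    }
}
```

The decoder gives the same messages whether two frames arrive glued together in one chunk or one frame arrives split across two chunks.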

Three-way handshake

The first handshake: when establishing a connection, the client sends a SYN packet (seq=j) to the server and enters the SYN_SENT state, waiting for the server to confirm.

The second handshake: the server receives the SYN packet and must acknowledge the client's SYN (ack=j+1) while sending its own SYN (seq=k); this is the SYN+ACK packet. The server then enters the SYN_RECV state.

The third handshake: the client receives the SYN+ACK packet from the server and sends an acknowledgment packet ACK (ack=k+1) to the server. Once the client has sent this packet and the server has received it, both sides are in the ESTABLISHED state and the three-way handshake is complete.

After the three-way handshake is completed, the client and server begin to transmit data
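
The sequence and acknowledgment numbers can be traced with a toy model. This is not a TCP implementation; the class, the Segment type, and the method names are invented, and the values j and k are passed in by the caller.

```java
// Toy model of the three-way handshake: only flags, numbers, and states.
class HandshakeSketch {
    enum State { CLOSED, LISTEN, SYN_SENT, SYN_RECV, ESTABLISHED }

    static final class Segment {
        final boolean syn;
        final boolean ack;
        final long seq;
        final long ackNumber;

        Segment(boolean syn, boolean ack, long seq, long ackNumber) {
            this.syn = syn;
            this.ack = ack;
            this.seq = seq;
            this.ackNumber = ackNumber;
        }
    }

    State client = State.CLOSED;
    State server = State.LISTEN;

    // First handshake: SYN with seq=j.
    Segment clientSendsSyn(long j) {
        client = State.SYN_SENT;
        return new Segment(true, false, j, 0);
    }

    // Second handshake: SYN+ACK with seq=k and ack=j+1.
    Segment serverSendsSynAck(Segment syn, long k) {
        server = State.SYN_RECV;
        return new Segment(true, true, k, syn.seq + 1);
    }

    // Third handshake: ACK with ack=k+1.
    Segment clientSendsAck(Segment synAck) {
        client = State.ESTABLISHED;
        return new Segment(false, true, synAck.ackNumber, synAck.seq + 1);
    }

    // The server is established once it sees the ACK for its own SYN.
    void serverReceivesAck(Segment ack, long k) {
        if (ack.ack && ack.ackNumber == k + 1) {
            server = State.ESTABLISHED;
        }
    }

    public static void main(String[] args) {
        HandshakeSketch h = new HandshakeSketch();
        Segment syn = h.clientSendsSyn(100);
        Segment synAck = h.serverSendsSynAck(syn, 300);
        Segment ack = h.clientSendsAck(synAck);
        h.serverReceivesAck(ack, 300);
        System.out.println("client=" + h.client + " server=" + h.server);
    }
}
```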

Four-way wave (closing the connection)

The client sends a FIN first and enters the FIN_WAIT1 state.

The server receives the FIN, sends an ACK, and enters the CLOSE_WAIT state; the client receives this ACK and enters the FIN_WAIT2 state.

The server sends a FIN and enters the LAST_ACK state.

The client receives the FIN, sends an ACK, and enters the TIME_WAIT state; the server receives the ACK and enters the CLOSED state.

TIME_WAIT is the state that the side actively closing the connection (here, the client) enters after sending the last ACK, and it lasts quite a long time. The client stays in TIME_WAIT for twice the MSL (about 60 s in total on Linux) and then switches to the CLOSED state.
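
The four steps can be followed as state transitions on both sides. The sketch below is a toy model with invented class and method names; it checks only that the steps happen in the order described.

```java
// Toy model of the four-way close: tracks the state of each side.
class FourWaveSketch {
    enum State { ESTABLISHED, FIN_WAIT1, FIN_WAIT2, TIME_WAIT, CLOSE_WAIT, LAST_ACK, CLOSED }

    State client = State.ESTABLISHED;
    State server = State.ESTABLISHED;

    private static void expect(State actual, State wanted) {
        if (actual != wanted) {
            throw new IllegalStateException("expected " + wanted + " but was " + actual);
        }
    }

    // Step 1: the client (active closer) sends FIN.
    void clientSendsFin() {
        expect(client, State.ESTABLISHED);
        client = State.FIN_WAIT1;
    }

    // Step 2: the server ACKs the FIN; the client receives that ACK.
    void serverAcksFin() {
        expect(client, State.FIN_WAIT1);
        server = State.CLOSE_WAIT;
        client = State.FIN_WAIT2;
    }

    // Step 3: the server application calls close(), so the server sends its FIN.
    void serverSendsFin() {
        expect(server, State.CLOSE_WAIT);
        server = State.LAST_ACK;
    }

    // Step 4: the client ACKs the server's FIN; the server receives that ACK.
    void clientAcksFin() {
        expect(server, State.LAST_ACK);
        client = State.TIME_WAIT;
        server = State.CLOSED;
    }

    // After 2MSL the client finally leaves TIME_WAIT.
    void twoMslElapsed() {
        expect(client, State.TIME_WAIT);
        client = State.CLOSED;
    }

    public static void main(String[] args) {
        FourWaveSketch c = new FourWaveSketch();
        c.clientSendsFin();
        c.serverAcksFin();
        c.serverSendsFin();
        c.clientAcksFin();
        System.out.println("client=" + c.client + " server=" + c.server);
        c.twoMslElapsed();
        System.out.println("client=" + c.client);
    }
}
```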

TIME_WAIT

TIME_WAIT is entered by the side that actively closes the connection, which then waits for 2MSL (4 minutes with the 2-minute MSL suggested by RFC 793; Linux uses a fixed 60 seconds). Its main purpose is to cover the case where the last ACK is lost. Because TIME_WAIT lasts a long time, the server side should avoid actively closing connections where possible.

CLOSE_WAIT

CLOSE_WAIT is entered by the side that passively closes the connection. According to the TCP state machine, when the server receives the client's FIN, the TCP implementation sends an ACK and the connection enters CLOSE_WAIT. If the server never calls close(), the connection cannot move from CLOSE_WAIT to LAST_ACK, and many connections in CLOSE_WAIT accumulate in the system. This can happen when the program is busy with reads and writes and does not close connections that have already received a FIN. On such a socket, recv/read returns 0.
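
In Java the same signal shows up as InputStream.read() returning -1 rather than 0. A handler that always closes the connection once it sees end of stream avoids piling up CLOSE_WAIT sockets. The sketch below uses plain InputStream and Closeable parameters so it can run without a network; with a real java.net.Socket you would pass socket.getInputStream() and the socket itself. The class and method names are invented.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical handler: read until the peer's FIN (end of stream), then close.
class CloseWaitSketch {
    // Returns the number of bytes read before end of stream.
    static long drainAndClose(InputStream in, Closeable connection) {
        long total = 0;
        byte[] buffer = new byte[1024];
        try {
            try {
                int n;
                // read() returns -1 once the peer has closed its side.
                while ((n = in.read(buffer)) != -1) {
                    total += n;
                }
            } finally {
                // Closing here is what moves the socket out of CLOSE_WAIT.
                connection.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
        long n = drainAndClose(in, () -> System.out.println("connection closed"));
        System.out.println("bytes read: " + n);
    }
}
```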

Why is TIME_WAIT status needed?

Assuming that the final ACK is lost, the server will resend the FIN. The client must keep its TCP state so that it can retransmit the final ACK; otherwise it would reply with an RST, and the server would conclude that an error had occurred. A TCP implementation must terminate the connection reliably in both directions (full-duplex close), so the client must enter the TIME_WAIT state because it may have to retransmit the final ACK.

Why does the TIME_WAIT state need to maintain 2MSL for such a long time?

Suppose the TIME_WAIT state were not held long enough (for example, for less than 2MSL). After the first connection terminates normally, a second connection with the same 5-tuple could appear, and delayed duplicate packets from the first connection could arrive and disturb it. TCP must prevent duplicate segments of a connection from appearing after that connection has terminated, so TIME_WAIT is held for 2MSL, long enough for the segments in each direction of the connection to be either acknowledged or discarded. The second connection can then be established without confusion.

Too many sockets in TIME_WAIT and CLOSE_WAIT states

When a server behaves abnormally, 80 to 90% of the cases fall into one of the following two situations:

1. The server maintains a large number of TIME_WAIT states

2. The server maintains a large number of CLOSE_WAIT states. Simply put, the excessive number of CLOSE_WAITs is caused by improper handling of passively closing connections.

A complete HTTP request process

Domain name resolution --> TCP three-way handshake --> HTTP request sent over the established TCP connection --> the server responds to the HTTP request and the browser receives the HTML --> the browser parses the HTML and requests the resources it references (such as JS, CSS, and images) --> the browser renders the page for the user

Talk about long connections

1. Long connection based on http protocol

Both the HTTP1.0 and HTTP1.1 protocols have support for long connections. Among them, HTTP1.0 needs to add "Connection: keep-alive" header to the request to be able to support it, while HTTP1.1 supports it by default.

The interaction process between http1.0 request and server:

a) The client sends a request with the header "Connection: keep-alive".

b) When the server receives the request, it concludes from the HTTP/1.0 version and the "Connection: keep-alive" header that this is a long connection. It adds "Connection: keep-alive" to the response headers and does not close the established TCP connection.

c) When the client receives a response containing "Connection: keep-alive", it treats the connection as a long connection and does not close it. It reuses this connection for the next request, going back to step a).

2. Send heartbeat packets: send a data packet every few seconds to keep the connection alive.
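
The HTTP rule from item 1 (opt-in for HTTP/1.0, default for HTTP/1.1) fits in a few lines. This is a simplified sketch with invented names: it treats the Connection header as a single token, whereas a real header may carry a comma-separated list.

```java
import java.util.Locale;

// Hypothetical check for whether a connection should be kept open.
class KeepAliveSketch {
    // httpVersion is e.g. "HTTP/1.0"; connectionHeader may be null when absent.
    static boolean isPersistent(String httpVersion, String connectionHeader) {
        String value = connectionHeader == null
                ? ""
                : connectionHeader.trim().toLowerCase(Locale.ROOT);
        if ("HTTP/1.1".equals(httpVersion)) {
            // Persistent by default unless one side asks to close.
            return !value.equals("close");
        }
        if ("HTTP/1.0".equals(httpVersion)) {
            // Persistent only when explicitly requested.
            return value.equals("keep-alive");
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isPersistent("HTTP/1.0", "keep-alive"));
        System.out.println(isPersistent("HTTP/1.0", null));
        System.out.println(isPersistent("HTTP/1.1", null));
    }
}
```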

How does TCP guarantee reliable transmission?

Three-way handshake.

Segmentation into reasonable lengths. Application data is divided into the blocks that TCP considers most suitable for sending (numbered by byte and split sensibly).

Retransmission on timeout. When TCP sends a segment, it starts a timer and resends the segment if no acknowledgment arrives in time.

Acknowledgment. The receiver sends an acknowledgment for the data it receives.

Checksum. If a received segment turns out to be corrupted, TCP discards it and sends no acknowledgment.

Reordering. Out-of-order data is put back in order before it is handed to the application layer.

Duplicate removal. Duplicate data is discarded.

Flow control. Each side of a TCP connection has a fixed-size buffer. The receiving end only allows the other end to send as much data as its buffer can accept, which prevents a faster host from overflowing the buffer of a slower host.

Congestion control. When the network is congested, reduce data transmission.
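
Reordering and duplicate removal can be sketched with a sorted map keyed by sequence number. This toy receiver is only an illustration with invented names; it assumes ASCII payloads (one byte per character) and segments that never partially overlap.

```java
import java.util.TreeMap;

// Toy receive buffer: delivers bytes in order, drops duplicates, returns a cumulative ACK.
class ReassemblySketch {
    private final TreeMap<Long, String> buffered = new TreeMap<>();
    private final StringBuilder delivered = new StringBuilder();
    private long nextExpected;

    ReassemblySketch(long initialSeq) {
        this.nextExpected = initialSeq;
    }

    // Accepts one segment and returns the next sequence number the receiver expects.
    long receive(long seq, String payload) {
        if (seq >= nextExpected) {
            // Out-of-order or in-order data is buffered; a repeated seq is ignored.
            buffered.putIfAbsent(seq, payload);
        }
        // Segments below nextExpected were already delivered and are simply dropped.
        while (!buffered.isEmpty() && buffered.firstKey() == nextExpected) {
            String next = buffered.pollFirstEntry().getValue();
            delivered.append(next);
            nextExpected += next.length();
        }
        return nextExpected;
    }

    // Data handed to the application so far, in order.
    String delivered() {
        return delivered.toString();
    }

    public static void main(String[] args) {
        ReassemblySketch r = new ReassemblySketch(0);
        System.out.println("ack=" + r.receive(5, "world"));
        System.out.println("ack=" + r.receive(0, "hello"));
        System.out.println("ack=" + r.receive(0, "hello"));
        System.out.println(r.delivered());
    }
}
```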

A detailed introduction to HTTP

HTTP is the abbreviation of Hyper Text Transfer Protocol, a transfer protocol used to transfer hypertext from a World Wide Web (WWW) server to a local browser.

Features

  • Simple and fast: When a client requests a service from the server, it only needs to transmit the request method and path. Commonly used request methods are GET, HEAD, and POST. Each method provides a different type of contact between the client and the server. Because the HTTP protocol is simple, the program size of the HTTP server is small, and the communication speed is very fast.

  • Flexible: HTTP allows the transmission of any type of data object. The type being transmitted is marked by Content-Type.

  • Connectionless: each connection handles only one request. After the server has processed the client's request and the client has received the response, the connection is closed. This saves transmission time. (This is the behavior without keep-alive; with a long connection the same TCP connection is reused.)

  • Stateless: The HTTP protocol is a stateless protocol. Statelessness means that the protocol has no memory capacity for transaction processing. The lack of status means that if the previous information is needed for subsequent processing, it must be retransmitted, which may result in an increase in the amount of data transmitted per connection. On the other hand, when the server does not need previous information, its response is faster.

  • Supports both B/S (browser/server) and C/S (client/server) modes.

Request message

  • The request line is used to indicate the type of request, the resource to be accessed, and the HTTP version used.

  • The request headers follow the request line (the first line) and carry additional information for the server. For example, Host indicates the destination of the request, and User-Agent identifies the client; both server-side and client-side scripts can read it, and it is an important basis for browser-type detection. This information is defined by the browser and sent automatically with each request.

  • Blank line, the blank line after the request header is required

  • The request data, also called the body, can carry any additional data.

Response message

  • The status line consists of three parts: HTTP protocol version number, status code, and status message.

  • Message header, used to describe some additional information to be used by the client

  • Blank line, the blank line after the message header is required

  • The response body is the text message that the server returns to the client.
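
The four parts of a response message (status line, headers, blank line, body) can be pulled apart with simple string operations. This is a sketch for illustration only, with an invented class name; it assumes CRLF line endings and a text body, and it is not a full HTTP parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser that splits a raw HTTP response into its four parts.
class HttpResponseSketch {
    final String version;
    final int statusCode;
    final String statusMessage;
    final Map<String, String> headers;
    final String body;

    private HttpResponseSketch(String version, int statusCode, String statusMessage,
                               Map<String, String> headers, String body) {
        this.version = version;
        this.statusCode = statusCode;
        this.statusMessage = statusMessage;
        this.headers = headers;
        this.body = body;
    }

    static HttpResponseSketch parse(String raw) {
        // The blank line separates the headers from the body.
        int blankLine = raw.indexOf("\r\n\r\n");
        if (blankLine < 0) {
            throw new IllegalArgumentException("missing blank line after headers");
        }
        String[] headLines = raw.substring(0, blankLine).split("\r\n");
        // Status line: version, status code, status message.
        String[] status = headLines[0].split(" ", 3);
        if (status.length < 2) {
            throw new IllegalArgumentException("malformed status line");
        }
        Map<String, String> headers = new LinkedHashMap<>();
        for (int i = 1; i < headLines.length; i++) {
            int colon = headLines[i].indexOf(':');
            if (colon > 0) {
                headers.put(headLines[i].substring(0, colon).trim(),
                        headLines[i].substring(colon + 1).trim());
            }
        }
        String message = status.length == 3 ? status[2] : "";
        return new HttpResponseSketch(status[0], Integer.parseInt(status[1]), message,
                headers, raw.substring(blankLine + 4));
    }

    public static void main(String[] args) {
        HttpResponseSketch r = parse(
                "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>");
        System.out.println(r.version + " | " + r.statusCode + " | " + r.statusMessage);
        System.out.println(r.headers);
        System.out.println(r.body);
    }
}
```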

Status code

  • 200 OK //Client request is successful

  • 301 Moved Permanently //Permanent redirect, e.g. when a site moves to a new domain name

  • 302 Found //Temporary redirect, e.g. a user who is not logged in visits the user center and is redirected to the login page

  • 400 Bad Request //The client request has a syntax error and cannot be understood by the server

  • 401 Unauthorized //The request is unauthorized. This status code must be used with the WWW-Authenticate header field

  • 403 Forbidden //The server received the request, but refused to provide service

  • 404 Not Found //The requested resource does not exist, eg: the wrong URL is entered

  • 500 Internal Server Error //An unexpected error occurred on the server

  • 503 Service Unavailable //The server is currently unable to process the client's request; it may recover after a period of time

HTTP methods

  • get: the client asks the server for the resource at the URL.

  • post: submits data to the server, typically adding new data under the resource at the URL.

  • head: requests only the response headers for the resource at the URL, without the body.

  • patch: requests a partial modification of the resource at the URL.

  • put: requests that the data of the resource at the URL be modified (replaced).

  • delete: requests deletion of the resource at the URL.

The difference between URI and URL

A URI is a uniform resource identifier, used to uniquely identify a resource. Every resource available on the web, such as HTML documents, images, video clips, and programs, is identified by a URI.

A URI generally consists of three parts:

  • The naming mechanism for accessing the resource
  • The host name where the resource is stored
  • The name of the resource itself, represented by the path, with emphasis on the resource.

A URL is a uniform resource locator, a specific kind of URI: a URL identifies a resource and also specifies how to locate it. A URL is a string that describes an information resource on the Internet. It is mainly used in WWW client and server programs, notably the famous Mosaic browser. A URL describes various information resources in a uniform format, including files, server addresses, and directories.

A URL generally consists of three parts:

  • The protocol (or service method)
  • The host name or IP address of the host where the resource is stored (sometimes including the port number)
  • The specific address of the resource on the host, such as the directory and file name.
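
The parts listed above can be inspected with java.net.URI. In this sketch the example addresses are placeholders, the class name is invented, and the "has a host" test is only a rough heuristic for telling a locator from a plain identifier.

```java
import java.net.URI;

// Hypothetical helper that prints the parts of a URI and flags URIs that are not locators.
class UriVsUrlSketch {
    static String describe(String text) {
        URI uri = URI.create(text);
        // Opaque URIs such as "urn:isbn:..." name a resource without saying where it is.
        if (uri.isOpaque() || uri.getHost() == null) {
            return "URI only: scheme=" + uri.getScheme();
        }
        return "URL: scheme=" + uri.getScheme()
                + ", host=" + uri.getHost()
                + ", port=" + uri.getPort()
                + ", path=" + uri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("http://www.example.com:8080/docs/index.html"));
        System.out.println(describe("urn:isbn:0451450523"));
    }
}
```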

The difference between HTTPS and HTTP

  • HTTPS requires a certificate issued by a CA. Free certificates exist, but many certificates require a fee.
  • HTTP is the Hypertext Transfer Protocol and transmits information in plain text; HTTPS is a secure transfer protocol encrypted with SSL/TLS.
  • HTTP and HTTPS use completely different connection methods and different default ports: 80 for HTTP and 443 for HTTPS.
  • An HTTP connection is simple and stateless; HTTPS is built from SSL/TLS plus HTTP and provides encrypted transmission and identity authentication, which makes it more secure than HTTP.

How does https ensure the security of data transmission

HTTPS adds SSL/TLS between the TCP layer and the HTTP layer to protect the upper layer. It mainly uses symmetric encryption, asymmetric encryption, certificates, and other techniques to encrypt data between the client and the server, and thereby ensures the security of the whole communication.

SSL/TLS protocol function:

Authenticate users and servers to ensure that data is sent to the correct client and server;

Encrypt data to prevent data from being stolen in the middle;

Maintain the integrity of the data and ensure that the data is not changed during transmission.

Origin blog.csdn.net/w1103576/article/details/109027433