Some knowledge points about the HTTP protocol

 

1. Introduction to HTTP

  HTTP (HyperText Transfer Protocol) is the protocol that the World Wide Web uses to transfer files. Its original purpose was to provide a way to publish and receive HTML pages.

  HTTP is a communication protocol that transmits data (HTML files, image files, query results, etc.) over TCP/IP. It belongs to the application layer of the seven-layer OSI model. The default port for HTTP is 80, and the default port for HTTPS is 443.

2. Main Features

  1. The HTTP protocol is stateless

  That is to say, each HTTP request is independent, and the protocol itself keeps no link between any two requests. In practice, however, applications usually need to associate requests with one another, so the Cookie and Session mechanisms were introduced.

  2. Connectionless

  Connectionless means that each connection handles only one request. After the server has processed the client's request and the client has received the response, the connection is closed, so no connection is held open while idle. (Most servers today support Keep-Alive, which keeps a connection open across several requests and removes this limitation.)

  3. Based on the TCP protocol

  The HTTP protocol specifies the format of the data exchanged between client and server and how the two interact. It is not responsible for the details of data transmission, which are handled by TCP underneath. The version in common use today uses persistent connections by default, that is, multiple HTTP requests can share a single TCP connection.
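  The persistent-connection behaviour described above can be observed with a small local experiment. The following is only a sketch using Python's standard library: the handler name EchoPortHandler and the function ports_seen_over_one_connection are made up for illustration, and the server simply echoes back the client's TCP port so that reuse of the same connection becomes visible.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 makes the server keep the connection open by default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Reply with the TCP port the client is connected from.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def ports_seen_over_one_connection(n=3):
    """Send n GET requests through one HTTPConnection and collect the ports."""
    server = HTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    ports = []
    try:
        for _ in range(n):
            conn.request("GET", "/")
            ports.append(conn.getresponse().read().decode("ascii"))
    finally:
        conn.close()  # closing the socket lets the single-threaded server finish
        server.shutdown()
        server.server_close()
    return ports


if __name__ == "__main__":
    print(ports_seen_over_one_connection())
```

  If every request reports the same client port, all of them travelled over a single TCP connection.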

3. Main components of a request URL

  HTTP uses Uniform Resource Identifiers (URIs) to identify the resource being requested and the server to connect to. Take the following URL as an example:

  http://www.abc.com:8080/def/index.jsp?id=58&page=1#name

  

  1. Protocol part: The protocol part of the URL is "http:", which means that the web page uses the HTTP protocol. Various protocols can be used on the Internet, such as HTTP and FTP. The "//" after "http:" is the delimiter.

  2. Domain name part: The domain name part of the URL is "www.abc.com". An IP address can also be used in place of the domain name.

  3. Port part: The port follows the domain name, with ":" as the separator between them. The port in this example is "8080". The port is not a required part of a URL; if it is omitted, the default port is used.

  4. Virtual directory part: From the first "/" after the domain name to the last "/" is the virtual directory part. The virtual directory is also not a required part of a URL. The virtual directory in this example is "/def/".

  5. File name part: From the last "/" after the domain name up to "?" is the file name part. If there is no "?", it runs from the last "/" up to "#"; if there is neither "?" nor "#", it runs from the last "/" to the end of the URL. The file name in this example is "index.jsp". The file name is also not a required part of a URL; if it is omitted, the default file name is used.

  6. Anchor part: From "#" to the end of the URL is the anchor part. The anchor in this example is "name". The anchor is also not a required part of a URL.

  7. Parameter part: The part from "?" to "#" is the parameter part, also known as the search part or the query part. The parameter part in this example is "id=58&page=1". It can contain multiple parameters, separated by "&".

          (Original: http://blog.csdn.net/ergouge/article/details/8185219  )
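  As a rough illustration of the seven parts listed above, the example URL can be taken apart with Python's standard urllib.parse module. The function name split_url and the dictionary keys are chosen here for illustration only; they simply mirror the names used in the list.

```python
from urllib.parse import parse_qs, urlsplit


def split_url(url):
    """Break a URL into the parts named in the list above."""
    parts = urlsplit(url)
    # Everything up to and including the last "/" is the virtual directory;
    # whatever follows it is the file name.
    directory, slash, file_name = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        "port": parts.port,  # None when the URL omits the port
        "virtual_directory": directory + slash,
        "file_name": file_name,
        "parameters": parse_qs(parts.query),
        "anchor": parts.fragment,
    }


if __name__ == "__main__":
    example = "http://www.abc.com:8080/def/index.jsp?id=58&page=1#name"
    for name, value in split_url(example).items():
        print(f"{name}: {value}")
```

  Note that parse_qs returns a list for each parameter, because the same name may appear more than once in a query string.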

4. The TCP three-way handshake and four waves behind HTTP

 1. Three-way handshake

  HTTP is built on TCP (Transmission Control Protocol).

  TCP is a host-to-host transport-layer protocol that provides a reliable, connection-oriented service. It establishes a connection with a three-way handshake:

    The flag bits in the TCP header come in six kinds: SYN (synchronize, establish a connection), ACK (acknowledgement), PSH (push), FIN (finish), RST (reset), and URG (urgent). Each segment also carries a sequence number (seq) and an acknowledgement number (ack number).


  (1) First handshake: host A sends host B a packet with the flag SYN=1 and a randomly generated seq number=1234567. From SYN=1, host B knows that A is requesting to establish a connection;

  (2) Second handshake: after receiving the request, host B confirms it by sending back a packet with ack number=(host A's seq+1), SYN=1, ACK=1, and a randomly generated seq=7654321;

  (3) Third handshake: host A checks that the ack number is correct, that is, equal to the seq number it sent in the first handshake plus 1, and that the flag ACK is 1. If both are correct, host A sends a packet with ack number=(host B's seq+1) and ACK=1. Once host B receives it and confirms the ack number and ACK=1, the connection is established.

   After completing the three-way handshake, host A and host B begin to transmit data.
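  The arithmetic of the three steps can be traced with a toy model. This is only a sketch of the bookkeeping, not real networking: the Segment class and the function three_way_handshake are invented for illustration, and a real TCP stack exchanges these values inside the operating system.

```python
import random
from dataclasses import dataclass

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    syn: int
    ack: int
    seq: int
    ack_num: int = 0  # only meaningful when ack == 1


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three segments exchanged while opening a connection."""
    if client_isn is None:
        client_isn = random.randrange(SEQ_SPACE)
    if server_isn is None:
        server_isn = random.randrange(SEQ_SPACE)

    # (1) Host A -> host B: SYN=1 with A's initial sequence number.
    first = Segment(syn=1, ack=0, seq=client_isn)

    # (2) Host B -> host A: SYN=1, ACK=1, acknowledging A's seq + 1.
    second = Segment(syn=1, ack=1, seq=server_isn,
                     ack_num=(first.seq + 1) % SEQ_SPACE)

    # Host A verifies the acknowledgement before answering.
    if second.ack != 1 or second.ack_num != (client_isn + 1) % SEQ_SPACE:
        raise ValueError("unexpected acknowledgement from host B")

    # (3) Host A -> host B: ACK=1, acknowledging B's seq + 1.
    third = Segment(syn=0, ack=1, seq=second.ack_num,
                    ack_num=(second.seq + 1) % SEQ_SPACE)
    return [first, second, third]


if __name__ == "__main__":
    for segment in three_way_handshake(1234567, 7654321):
        print(segment)
```

  With the sequence numbers from the text (1234567 and 7654321), the acknowledgement numbers in steps (2) and (3) come out as 1234568 and 7654322.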

 2. Four waves

  Since TCP connections are full-duplex, each direction must be closed separately. When one side has finished sending its data, it sends a FIN to close the connection in that direction. Receiving a FIN only means that no more data will flow in that direction; the side that received the FIN can still send data. The side that closes first performs an active close, while the other side performs a passive close.


  Tearing down a TCP connection requires four packets to be sent, hence the name "four waves". Either the client or the server can initiate the teardown. In socket programming, either side can call close() to start it.

  (1) Client A sends a FIN to close the data transfer from client A to server B.

  (2) Server B receives this FIN and sends back an ACK whose acknowledgement number is the received sequence number plus 1. Like a SYN, a FIN occupies one sequence number.

  (3) Server B closes its side of the connection and sends a FIN to client A.

  (4) Client A sends back an ACK and sets the acknowledgement number to the received sequence number plus 1.

    (Original: https://www.cnblogs.com/Jessy/p/3535612.html )
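  The point that each direction closes separately can be seen from socket code. The sketch below, using only Python's standard socket module on the loopback interface, is one possible way to show it: the names serve_once and half_close_demo are invented here, and the FIN and ACK packets themselves are produced by the operating system when shutdown() and close() are called.

```python
import socket
import threading


def serve_once(listener):
    """Accept one client, read until its FIN arrives, then reply and close."""
    conn, _ = listener.accept()
    with conn:
        received = b""
        while True:
            data = conn.recv(1024)
            if not data:  # an empty read means the client has sent its FIN
                break
            received += data
        # The server-to-client direction is still open, so this still works.
        conn.sendall(b"server got " + received)
    # Leaving the with block closes conn, which sends the server's own FIN.


def half_close_demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()

    client = socket.create_connection(listener.getsockname())
    client.sendall(b"hello")
    client.shutdown(socket.SHUT_WR)  # active close of the client-to-server direction
    reply = b""
    while True:
        data = client.recv(1024)
        if not data:  # the server has closed its direction as well
            break
        reply += data
    client.close()
    worker.join()
    listener.close()
    return reply


if __name__ == "__main__":
    print(half_close_demo())
```

  The client can still receive the reply after it has stopped sending, which is exactly the half-closed state between steps (2) and (3).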

5. HTTP request methods

  1. GET: get a resource

  The GET method requests the resource identified by a URI. The server processes the request and returns the content of that resource in the response.

  2. POST: transfer an entity body

  The POST method is used to transfer an entity body. POST and GET differ mainly in purpose; the differences are covered in more detail in the questions at the end of this article. Although GET can also carry data, it is generally not used that way, because the purpose of GET is to obtain and the purpose of POST is to transmit.

  3. PUT: transfer a file

  The PUT method is used to transfer files. As with an FTP upload, the content of the file is carried in the entity body of the request message and is then saved at the server location specified by the URL.

  4. HEAD: get the message headers

  The HEAD method is similar to GET, except that the server returns only the headers and no body. It is used to check whether a URI is valid and when the resource was last updated.

  5. DELETE: delete a file

  The DELETE method is used to delete files and is the opposite of PUT. It asks the server to delete the resource specified by the URL.

  6. OPTIONS: ask for supported methods

  Not every server supports every method; some servers disable methods such as DELETE and PUT for security reasons. OPTIONS is the method used to ask the server which methods it supports.

  7. TRACE: trace the path

  The TRACE method asks the web server to loop the received request back to the client. This method is not commonly used.

  8. CONNECT: ask a proxy to set up a tunnel

  The CONNECT method asks a proxy server to establish a tunnel so that TCP traffic can pass through it. It is mainly used to carry content encrypted with the SSL/TLS protocol.
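  A few of these methods can be tried against a throwaway local server. The following sketch uses Python's standard http.server and http.client modules; DemoHandler, BODY and try_methods are hypothetical names, and the handler deliberately implements only GET, HEAD and OPTIONS so that an unsupported method (DELETE here) is rejected.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"<html>hello</html>"


class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def do_HEAD(self):
        self._send_headers()  # same headers as GET, but no body

    def do_OPTIONS(self):
        self.send_response(204)
        self.send_header("Allow", "GET, HEAD, OPTIONS")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def try_methods(methods=("GET", "HEAD", "OPTIONS", "DELETE")):
    """Return {method: (status, Allow header, body)} from a local server."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    results = {}
    try:
        for method in methods:
            conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
            try:
                conn.request(method, "/index.html")
                response = conn.getresponse()
                results[method] = (response.status,
                                   response.getheader("Allow"),
                                   response.read())
            finally:
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return results


if __name__ == "__main__":
    for method, (status, allow, body) in try_methods().items():
        print(method, status, allow, len(body))
```

  In this sketch HEAD returns the same status as GET with an empty body, OPTIONS reports the supported methods in the Allow header, and DELETE is answered with an error status because the handler does not implement it.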


6. Status codes

  The status code tells the client the result of the server's processing of the request. From the status code, the user can tell whether the server processed the request successfully, failed, or redirected it, which makes errors easier to locate. A status code consists of a 3-digit number followed by a reason phrase. The first of the 3 digits specifies the category of the response. There are 5 categories in total: 1xx (informational), 2xx (success), 3xx (redirection), 4xx (client error) and 5xx (server error).

      

  Common status codes:

  (1) 200: OK. The request was successful.

  (2) 302: Found. Indicates a temporary redirect. The requested resource has been assigned a new URL, but unlike 301 the move is temporary rather than permanent. The URL may change again, so a saved bookmark should not be updated.

  (3) 400: Bad Request. Indicates that there is a syntax error in the request message. The request needs to be corrected and sent again.

  (4) 403: Forbidden. Indicates that access to the requested resource was refused, for example because the client has no access rights on the server or its IP has been banned.

  (5) 404: Not Found. Indicates that the requested resource cannot be found on the server. It can also be used when the server rejects the request and does not want to give a reason.

  (6) 500: Internal Server Error. Indicates that an error occurred on the server while executing the request, most likely a bug or a temporary failure in the server program.

  (7) 503: Service Unavailable. Indicates that the server is temporarily overloaded or down for maintenance and cannot process requests now. If the time needed to recover is known in advance, it is best to return it to the client in the Retry-After header field.

  (8) 504: Gateway Timeout. Indicates that a proxy server timed out while waiting for a response from the application server. Unlike 408 Request Timeout, the cause of a 504 lies on the server side rather than with the client.
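  The rule that the first digit gives the category can be written down directly. The sketch below relies on Python's standard http.HTTPStatus enum for the reason phrases; the CATEGORIES table and the describe function are illustrative names, not part of any library.

```python
from http import HTTPStatus

# The first digit of the status code selects one of five categories.
CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code):
    """Return 'code reason-phrase (category)' for a status code."""
    category = CATEGORIES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:  # a code the standard library does not know
        phrase = "Unknown"
    return f"{code} {phrase} ({category})"


if __name__ == "__main__":
    for code in (200, 302, 400, 403, 404, 500, 503, 504):
        print(describe(code))
```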

7. Frequently Asked Questions and Answers

  1. The difference between GET and POST

    A. Going by the literal meaning and the HTTP specification, GET is used to obtain resource information and POST is used to update resource information.

    B. The data submitted by a GET request is appended to the URL after a "?", with parameters joined by "&", for example: /index.html?name=wang&login=1. The data submitted by a POST request is carried in the request body.

    C. The length of the data submitted by GET is limited, because URL length is limited; the specific limit depends on the browser. POST data is not subject to this URL limit.

    D. The data submitted by GET is less secure, because the parameters are exposed in the URL.
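    The difference in where the data travels can be shown without sending anything over the network. The following sketch builds one GET and one POST request object with Python's standard urllib; build_requests is a name invented for this example, and the host www.abc.com is reused from the URL example earlier in the article.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_requests(base_url, params):
    """Build a GET and a POST request carrying the same parameters."""
    encoded = urlencode(params)
    # GET: the parameters become part of the URL, after "?".
    get_request = Request(base_url + "?" + encoded)
    # POST: the URL stays clean and the parameters go into the body.
    post_request = Request(base_url, data=encoded.encode("ascii"))
    return get_request, post_request


if __name__ == "__main__":
    get_request, post_request = build_requests(
        "http://www.abc.com/index.html", {"name": "wang", "login": 1})
    print(get_request.get_method(), get_request.full_url, get_request.data)
    print(post_request.get_method(), post_request.full_url, post_request.data)
```

    Nothing is transmitted here; the Request objects only show where each method places the same parameters.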

  2. Difference and connection between Cookie and Session

    Cookie and Session are both used to save the interaction state between the client and the server. Their implementation mechanisms differ, and each has its own advantages and disadvantages.

    The biggest difference is that cookies are stored on the client side and sessions are stored on the server side.

    With cookies, when the client sends a request, the server returns some information to the client as key-value pairs. The browser stores them and sends these cookie values along with later requests. Cookies can also be used for some simple caching.

    The disadvantages of cookies are that their size and number are limited; that they live on the client side, where they may be disabled, deleted, or tampered with, which is unsafe; and that large cookies must be carried on every request, which hurts transmission efficiency.

    Session is implemented on top of cookies. The difference is that the session data itself stays on the server and is not transmitted with each request. Instead, a unique ID representing the client (for example JSESSIONID in Java web applications) is written into the client's cookie, and only this ID is transmitted each time.

    The advantages of Session are that little data is transmitted and that it is relatively safe.

    Session also has shortcomings. Without special handling, sessions are prone to invalidation, expiry, or loss, and too many sessions can exhaust server memory. Implementing a stable, available, and secure distributed session framework is also complicated. In practice, the strengths and weaknesses of Cookie and Session have to be weighed together when designing a solution for a given problem.
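    The division of labour between the two mechanisms can be sketched in a few lines. This is a deliberately simplified in-memory model using Python's standard http.cookies and secrets modules: SESSIONS, create_session and load_session are hypothetical names, and a real framework would add expiry, signing and shared storage.

```python
import secrets
from http.cookies import SimpleCookie

# Server-side store: the session data never leaves the server.
SESSIONS = {}


def create_session(user_data):
    """Store the data on the server and return a Set-Cookie header value."""
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = dict(user_data)
    cookie = SimpleCookie()
    cookie["JSESSIONID"] = session_id
    cookie["JSESSIONID"]["httponly"] = True
    return cookie["JSESSIONID"].OutputString()


def load_session(cookie_header):
    """Look up the session for the Cookie header sent back by the client."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("JSESSIONID")
    if morsel is None:
        return None
    return SESSIONS.get(morsel.value)  # unknown or forged IDs give None


if __name__ == "__main__":
    set_cookie = create_session({"user": "wang"})
    print("Set-Cookie:", set_cookie)
    # The browser would send back only the name=value pair.
    print(load_session(set_cookie.split(";")[0]))
```

    Only the random ID crosses the network; the user data stays in the server-side dictionary.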

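  3. Why a three-way handshake, and not two or four?

    Not two, in order to ensure reliable transmission.

      First handshake: the CLIENT tells the SERVER, "I am about to start transmitting data."

      Second handshake: the SERVER tells the CLIENT, "I know you are about to transmit data, and I am ready."

      Third handshake: the CLIENT tells the SERVER, "I know that you know."

    Not four, in order to improve transmission efficiency.

      In short, no matter how many handshakes take place, one side can never be sure that the other side knows. So for the sake of efficiency, three handshakes are considered enough to start transmitting data. After the three-way handshake, the CLIENT and the SERVER enter the ESTABLISHED state and begin data transmission.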

  4. What are the advantages and disadvantages of HTTP and HTTPS?

    HTTP has the following shortcomings:

    (1). Communication is in plain text and is not encrypted, so the content may be eavesdropped on, that is, packets can be captured and analyzed.

    (2). The identity of the communicating party is not verified, so it may be impersonated.

    (3). The integrity of the message cannot be verified, so it may have been tampered with.

    HTTPS is HTTP plus encryption (usually over an SSL/TLS secure channel) + authentication + integrity protection.
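    On the client side, the authentication part shows up as certificate and host name checks. The sketch below uses Python's standard ssl module; make_client_context and fetch_tls_details are names chosen for this example, and the second function needs real network access, so it is defined here but not called.

```python
import socket
import ssl


def make_client_context():
    """Create a TLS context with the standard library's default settings."""
    # The default context verifies the server certificate and host name.
    return ssl.create_default_context()


def fetch_tls_details(host, port=443):
    """Connect over TLS and return the protocol version and certificate subject.

    Requires network access; shown only to illustrate the flow.
    """
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as raw_socket:
        # The handshake happens here: the certificate is validated (authentication)
        # and keys are agreed for encryption and integrity protection.
        with context.wrap_socket(raw_socket, server_hostname=host) as tls_socket:
            return tls_socket.version(), tls_socket.getpeercert().get("subject")


if __name__ == "__main__":
    context = make_client_context()
    print("certificate required:", context.verify_mode == ssl.CERT_REQUIRED)
    print("host name checked:", context.check_hostname)
```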

   5. What does an HTTP request contain?

    The request message that a client sends to the server consists of four parts: the request line, the request headers, a blank line, and the request data.

   Under normal circumstances, the server returns an HTTP response message after receiving and processing the client's request. The HTTP response also consists of four parts: the status line, the message headers, a blank line, and the response body.
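   The four-part layout of both messages can be made concrete with plain byte strings. The following is a minimal sketch with two invented helpers, build_request and parse_response; it ignores details such as repeated headers and chunked bodies, and the path and host are reused from the URL example earlier in the article.

```python
def build_request(method, path, headers, body=b""):
    """Assemble request line, headers, blank line and request data."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"  # the blank line ends the headers
    return head.encode("ascii") + body


def parse_response(raw):
    """Split a response into status line fields, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return version, int(status), reason, headers, body


if __name__ == "__main__":
    data = b"id=58&page=1"
    request = build_request(
        "POST", "/def/index.jsp",
        {"Host": "www.abc.com", "Content-Length": str(len(data))}, data)
    print(request.decode("ascii"))
    print(parse_response(
        b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"))
```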

 

 

    
