HTTP study notes 1

Illustrated HTTP study notes
1.2 The birth of HTTP
At CERN (the European Organization for Nuclear Research), Dr. Tim Berners-Lee proposed a vision that would let researchers in distant locations share knowledge. The basic idea was to use hypertext (HyperText), formed by the cross-references between multiple documents, to link those documents into a WWW (World Wide Web) in which they can refer to one another.

In November 1990, CERN successfully developed the world's first web server and web browser. Two years later, in September 1992, the homepage of Japan's first website went live.

In January 1993, Mosaic, developed by NCSA (National Center for Supercomputing Applications) and regarded as the ancestor of modern browsers, was released. It displayed images inline within HTML pages, and its strength in handling images made it popular all over the world.

In December 1994, Netscape Communications released Netscape Navigator 1.0, and in 1995 Microsoft released Internet Explorer 1.0 and 2.0.
In 2004, the Mozilla Foundation released the Firefox browser. It took five years for Internet Explorer to move from version 6 to version 7, after which versions 8, 9, and 10 followed in quick succession. Browsers such as Chrome, Opera, and Safari have also each taken a share of the market.

1.3 Network Basics: TCP/IP
An important aspect of the TCP/IP protocol suite is layering. The suite is divided into four layers: the application layer, the transport layer, the network layer, and the data link layer.

Layering TCP/IP has clear advantages. If the Internet were governed by a single protocol, any design change anywhere would require replacing the whole thing. With layering, only the layer that changed needs to be replaced. Once the interfaces between layers are fixed, the internal design of each layer can change freely.
Layering also makes the design simpler. An application at the application layer only has to deal with the task assigned to it. It does not need to know where on earth the other party is, what route the data takes, or whether delivery can be guaranteed.

The link layer (also called the data link layer or network interface layer) handles the hardware side of connecting to the network. It includes the operating system's device control, hardware device drivers, the NIC (Network Interface Card, that is, the network adapter), and physically visible parts such as optical fiber (as well as connectors and all other transmission media). Everything hardware-related falls within the scope of the link layer.

The network layer handles the data packets that flow across the network. A packet is the smallest unit of data transmitted over a network. This layer determines the path (the so-called transmission route) used to reach the other party's computer and delivers the packet to it. When communication with the other computer passes through multiple computers or network devices, the role of the network layer is to select one transmission route from the many available.

The transport layer provides the application layer above it with data transmission between two computers connected over a network. The transport layer has two protocols with different characteristics: TCP (Transmission Control Protocol) and UDP (User Datagram Protocol).

The application layer determines the communication activities involved in providing application services to users.
The TCP/IP protocol suite comes with a range of common application services built in. FTP (File Transfer Protocol) and DNS (Domain Name System) are two of them.
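
The layering described above can be illustrated with a small sketch. It only shows the idea that each layer wraps the data from the layer above with its own header on the way down, and removes it again on the way up. The header strings and function names are made-up labels for illustration, not real protocol headers.

```python
# Illustrative only: the header strings below are placeholders,
# not real TCP/IP headers.
LAYERS = ["application", "transport", "network", "link"]  # top to bottom


def encapsulate(payload: str) -> str:
    """Sending side: each layer below the application adds its header."""
    # The application layer produces the payload itself, so only the
    # three layers beneath it wrap the data.
    for layer in LAYERS[1:]:
        payload = f"[{layer}-header]{payload}"
    return payload


def decapsulate(frame: str) -> str:
    """Receiving side: strip the headers from the link layer upwards."""
    for layer in reversed(LAYERS[1:]):
        prefix = f"[{layer}-header]"
        if not frame.startswith(prefix):
            raise ValueError(f"expected {prefix!r} at the start of the frame")
        frame = frame[len(prefix):]
    return frame
```

Because each layer touches only its own header, one layer's header format could change without the other layers noticing, which is the advantage the notes describe.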

The role of the IP protocol is to deliver data packets to the other party. Several conditions must be met to make sure a packet really reaches its destination. Two of the most important are the IP address and the MAC address (Media Access Control Address).

The IP address is the address assigned to a node, while the MAC address is the fixed address belonging to the network card. An IP address can be mapped to a MAC address. The IP address may change, but the MAC address basically does not.

IP communication relies on MAC addresses. On a network, the two communicating parties are rarely in the same local area network (LAN). Usually the data has to be relayed through multiple computers and network devices before it reaches the other party. At each relay step, the MAC address of the next relay device is used to find the next hop. This is where ARP (Address Resolution Protocol) comes in. ARP is a protocol for resolving addresses: given the IP address of the communicating party, it finds the corresponding MAC address.
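
As a rough illustration of the IP-to-MAC mapping that ARP provides, the sketch below models an ARP cache as a plain dictionary. The addresses and the function name are hypothetical sample values, and a real network stack would broadcast an ARP request on a cache miss instead of returning None.

```python
# Hypothetical ARP cache: IP address -> MAC address (sample values only).
arp_cache = {
    "192.168.0.1": "00:11:22:33:44:55",   # e.g. the next relay device
    "192.168.0.20": "66:77:88:99:aa:bb",
}


def resolve_mac(ip, cache):
    """Return the MAC address recorded for ip.

    None means the address is unknown; a real stack would then send an
    ARP request on the local network and wait for the owner to reply.
    """
    return cache.get(ip)
```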

While a packet is in transit toward its destination, the computers, routers, and other network devices along the way know only a rough outline of the route. This mechanism is called routing, and it resembles the way a courier company delivers parcels. The sender only has to hand the parcel to a collection point to learn whether the company will accept and deliver it. The company's distribution center checks the delivery address and decides which regional distribution center the parcel should go to next. That regional center then judges whether it can deliver the parcel to the recipient's home.
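
The courier analogy can be sketched as a next-hop lookup: each device only decides where to send the packet next. The routing table below is hypothetical, and picking the most specific matching network (longest prefix) is an assumption added here for the example; the notes themselves do not describe the selection rule.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop address).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "203.0.113.1"),    # default route
    (ipaddress.ip_network("10.0.0.0/8"), "10.255.0.1"),
    (ipaddress.ip_network("10.1.2.0/24"), "10.1.2.254"),
]


def next_hop(destination: str) -> str:
    """Pick the next relay device for a destination IP address."""
    addr = ipaddress.ip_address(destination)
    # Keep every route whose network contains the destination...
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # ...and prefer the most specific one (the longest prefix).
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop
```

The device does not know the full path. It only knows which "distribution center" should receive the packet next, and that device repeats the same decision.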

The TCP protocol ensures reliability
To deliver data to the destination accurately, TCP uses a three-way handshake. TCP does not simply send a packet and forget about it: it always confirms with the other party whether the packet arrived successfully.

The handshake uses the TCP flags SYN (synchronize) and ACK (acknowledgment). The sender first sends a packet with the SYN flag to the other party. The receiver replies with a packet carrying the SYN/ACK flags to confirm receipt. Finally, the sender sends back a packet with the ACK flag, which completes the handshake.
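
The exchange of flags can be modelled as a tiny state machine. This is a simplified sketch, not a TCP implementation: sequence numbers, timeouts, and retransmission are left out, and the class and function names are invented for the example. The state names follow the usual TCP terminology.

```python
class Endpoint:
    """One side of a TCP connection, reduced to its handshake states."""

    def __init__(self, name, listening=False):
        self.name = name
        self.state = "LISTEN" if listening else "CLOSED"

    def start(self):
        """Active open: the sender begins by sending SYN."""
        self.state = "SYN_SENT"
        return {"SYN"}

    def receive(self, flags):
        """React to an incoming segment; return the reply flags or None."""
        if self.state == "LISTEN" and flags == {"SYN"}:
            self.state = "SYN_RECEIVED"
            return {"SYN", "ACK"}
        if self.state == "SYN_SENT" and flags == {"SYN", "ACK"}:
            self.state = "ESTABLISHED"
            return {"ACK"}
        if self.state == "SYN_RECEIVED" and flags == {"ACK"}:
            self.state = "ESTABLISHED"
            return None
        raise ValueError(f"{self.name}: unexpected {flags} in state {self.state}")


def run_handshake():
    """Play the three steps and return the flags seen plus the final states."""
    client = Endpoint("client")
    server = Endpoint("server", listening=True)
    syn = client.start()               # step 1: SYN
    syn_ack = server.receive(syn)      # step 2: SYN/ACK
    ack = client.receive(syn_ack)      # step 3: ACK
    server.receive(ack)
    return [syn, syn_ack, ack], client.state, server.state
```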

The DNS service is responsible for domain name resolution
The DNS protocol provides the service of looking up an IP address from a domain name, or, in reverse, looking up a domain name from an IP address. For example, to reach www.baidu.com the client asks a nearby domain name server to resolve the name into an IP address. HTTP is responsible for generating the HTTP request message for the target server and, on the receiving side, for parsing the request to work out which resource on which host is wanted. TCP provides the byte stream service: it splits the message into segments for sending and reassembles the segments as they arrive. IP is responsible for finding the other party's address and forwarding the packets along the relay route.
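
A short sketch of the first two steps: splitting a URL into host, port, and path, then asking the system resolver (which normally uses DNS) for the host's addresses. The function names are chosen for this example, and the default ports 80 and 443 are the usual HTTP and HTTPS defaults rather than something stated in the notes.

```python
import socket
from urllib.parse import urlsplit


def split_url(url):
    """Return (hostname, port, path) for an http or https URL."""
    parts = urlsplit(url)
    # Fall back to the conventional port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port, parts.path or "/"


def resolve(hostname, port):
    """Return the sorted IP addresses the system resolver finds for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry ends with a socket address whose first item is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("www.baidu.com", 80)` needs network access and returns whatever addresses the resolver reports at that moment, so no output is shown here.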

A persistent connection allows multiple requests and responses to be exchanged over a single TCP connection once it has been established. Its advantage is that it avoids the overhead of repeatedly setting up and tearing down TCP connections, which also lightens the load on the server. Because that overhead is saved, HTTP requests and responses complete sooner, and web pages display faster as a result. In HTTP/1.1, all connections are persistent by default.
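
The following sketch shows several requests reusing one TCP connection, using only the Python standard library. It starts a throwaway local server for the demonstration; the handler, the function name, and the response text are made up for this example. The client's local port is recorded for each request as a simple way to see that the same connection is reused.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep connections open

    def do_GET(self):
        body = f"you asked for {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # A Content-Length is needed so the client knows where the body ends
        # without the server closing the connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_over_one_connection(paths):
    """GET every path over a single connection; return (status, body, port)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    results = []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            body = response.read().decode()
            # The local port stays the same while the socket is reused.
            results.append((response.status, body, conn.sock.getsockname()[1]))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results
```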

Persistent connections also make it possible to send multiple requests in a pipelined manner. Previously, after sending a request the client had to wait for and receive the response before sending the next one. With pipelining, the next request can be sent straight away without waiting for the response.
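
A minimal sketch of pipelining over a raw socket, again against a throwaway local server from the standard library. All requests are written before any response is read. The handler and function names are invented for the example, and the last request carries `Connection: close` purely so the demo knows when to stop reading.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPath(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = self.path.encode()          # respond with the requested path
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def pipelined_get(paths):
    """Send all GET requests at once, then read every response as raw text."""
    if not paths:
        return ""
    server = HTTPServer(("127.0.0.1", 0), EchoPath)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    requests = []
    for index, path in enumerate(paths):
        is_last = index == len(paths) - 1
        requests.append(
            f"GET {path} HTTP/1.1\r\nHost: {host}:{port}\r\n"
            + ("Connection: close\r\n" if is_last else "")
            + "\r\n"
        )
    data = b""
    try:
        with socket.create_connection((host, port)) as sock:
            # Every request goes out before a single response is read.
            sock.sendall("".join(requests).encode())
            while True:
                chunk = sock.recv(4096)
                if not chunk:              # server closed after the last reply
                    break
                data += chunk
    finally:
        server.shutdown()
        server.server_close()
    return data.decode()
```

The responses come back in the same order as the requests, one after another on the same connection.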

Suppose a web page that requires login authentication could not manage state itself (that is, could not record the logged-in state). Then every time the user moved to a new page they would have to log in again, or extra parameters would have to be added to every request message to carry the login state.

Cookie technology manages the client's state by writing cookie information into request and response messages.
A header field called Set-Cookie in the response message sent by the server tells the client to save the cookie. The next time the client sends a request to that server, it automatically adds the cookie value to the request message.
When the server receives the cookie sent by the client, it checks which client made the request, compares the cookie against its own records, and recovers the earlier state information.
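
The round trip described above can be sketched with the standard library's `http.cookies` module. The cookie name `sid`, the in-memory `sessions` dictionary, and the three function names are assumptions made for this example; a real server would also set attributes such as an expiry time.

```python
import uuid
from http.cookies import SimpleCookie

sessions = {}  # hypothetical server-side records: session id -> state


def login(username):
    """Server: record the state and return the Set-Cookie header value."""
    session_id = uuid.uuid4().hex
    sessions[session_id] = {"user": username}
    cookie = SimpleCookie()
    cookie["sid"] = session_id
    cookie["sid"]["path"] = "/"
    return cookie["sid"].OutputString()


def client_cookie_header(set_cookie_value):
    """Client: keep name=value pairs and build the next Cookie header."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


def current_user(cookie_header):
    """Server: compare the received cookie with its records."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("sid")
    session = sessions.get(morsel.value) if morsel else None
    return session["user"] if session else None
```

The cookie itself carries only an identifier. The state (here, the user name) stays in the server's records, which matches the comparison step in the notes.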

3.2 Structure of request message and response message
Request message structure (general summary followed by the request headers, as shown in the browser's developer tools):
Request URL: https://www.mydrivers.com/zhuanti/tianti/cpu/index.html
Request method: GET
Status code: 304 Not Modified
Remote address: 101.28.132.13:443
Referrer Policy: strict-origin-when-cross-origin
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Accept-Encoding:gzip,deflate,br
Accept-Language:zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6
Cache-Control:max-age=0
Connection:keep-alive
Cookie:Hm_lvt_5c6ea7c88034ab979d4a14f9d840e0d0=1690280205,1690532327,1690771670,1690859915; Hm_lpvt_5c6ea7c88034ab979d4a14f9d840e0d0=1690859915; Hm_lvt_fa993fdd33f32c39cbb6e7d66096c422=1690280205,1690532327,1690771670,1690859915; Hm_lpvt_fa993fdd33f32c39cbb6e7d66096c422=1690859915
Host:www.mydrivers.com
If-Modified-Since:Sun, 09 Jul 2023 14:59:48 GMT
If-None-Match:"c4d451176b2d91:0"
Sec-Ch-Ua:"Not/A)Brand";v="99", "Microsoft Edge";v="115", "Chromium";v="115"
Sec-Ch-Ua-Mobile:?0
Sec-Ch-Ua-Platform:"Windows"
Sec-Fetch-Dest:document
Sec-Fetch-Mode:navigate
Sec-Fetch-Site:none
Sec-Fetch-User:?1
Upgrade-Insecure-Requests:1
User-Agent:Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36 Edg/115.0.1901.188

Response message structure:
Connection:keep-alive
Content-Type:text/html
Date:Tue, 01 Aug 2023 03:18:38 GMT
Etag:"c4d451176b2d91:0"
Last-Modified:Sun, 09 Jul 2023 14:59:48 GMT
Vary:Accept-Encoding
X-Cache:HIT from BC13_lt-hebei-handan-6-cache-1(baishan)
X-Ser:BC13_lt-hebei-handan-6-cache-1
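
The captured exchange above is a conditional request: the browser sends the ETag it cached in If-None-Match, and the server answers 304 Not Modified because its current ETag is the same. The sketch below shows that decision in a simplified form. The function names are made up for the example, and the comparison is a plain string match that ignores weak validators and the If-Modified-Since fallback.

```python
def parse_headers(lines):
    """Turn 'Name:value' lines into a dict keyed by lower-cased name."""
    headers = {}
    for line in lines:
        # Split at the first colon only; values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def choose_status(request_headers, current_etag):
    """Return 304 if the client's cached copy is still current, else 200."""
    sent = request_headers.get("if-none-match", "")
    candidates = [tag.strip() for tag in sent.split(",") if tag.strip()]
    return 304 if current_etag in candidates else 200
```

With the headers from the capture, `If-None-Match` and the response's `Etag` are both `"c4d451176b2d91:0"`, so the sketch arrives at the same 304 status that the browser reported.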

In HTTP communication, besides the client and the server, there are applications that relay communication data, such as proxies, gateways, and tunnels. They work together with the server. These applications and servers can forward a request to the next server on the communication line, receive that server's response, and forward it back to the client.
Proxy
A proxy is an application with a forwarding function. It acts as a "middleman" between the server and the client: it receives the request sent by the client and forwards it to the server, and it receives the response returned by the server and forwards it to the client.
Gateway
A gateway is a server that relays communication data for other servers. When it receives a request from a client, it handles the request as if it were the origin server that owns the resource. Sometimes the client may not even realize that it is talking to a gateway.
Tunnel
A tunnel is an application that relays between a client and a server that are far apart and keeps the communication connection between the two open.

Reasons for using a proxy server include using caching (explained later) to reduce network bandwidth usage, enforcing access control for specific websites within an organization, and collecting access logs.
Proxies can be used in several ways, which are classified along two lines: whether the proxy caches, and whether it modifies messages.
Caching Proxy
When forwarding a response, a caching proxy (Caching Proxy) saves a copy (cache) of the resource on the proxy server. When the proxy later receives a request for the same resource, it can return the cached copy as the response instead of fetching the resource from the origin server.
Transparent Proxy
A proxy that forwards requests and responses without modifying the messages in any way is called a transparent proxy (Transparent Proxy). Conversely, a proxy that modifies message content is called a non-transparent proxy.
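
The caching proxy behaviour described above can be sketched in a few lines. The class name and the `fetch_from_origin` callable are hypothetical, and the sketch leaves out everything a real cache needs to decide freshness (expiry times, validation with the origin server, and so on).

```python
class CachingProxy:
    """Forward requests to the origin, remembering each response by URL."""

    def __init__(self, fetch_from_origin):
        # fetch_from_origin is any callable that takes a URL and returns
        # the response body; it stands in for the real origin server.
        self._fetch = fetch_from_origin
        self._cache = {}

    def get(self, url):
        if url not in self._cache:
            # First request for this resource: go to the origin server
            # and keep a copy on the proxy.
            self._cache[url] = self._fetch(url)
        # Later requests are answered from the saved copy.
        return self._cache[url]
```

For example, wrapping a function that counts its calls shows that a second request for the same URL never reaches the origin.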

Using a gateway can improve the security of communication, because the line between the client and the gateway can be encrypted. A gateway can also connect to systems that do not speak HTTP. For example, it can connect to a database and query data with SQL statements, or, when a shopping site settles a credit card payment, it can link to the credit card settlement system.

A tunnel can establish a communication line to another server on request and then communicate over it using encryption such as SSL. The purpose of a tunnel is to ensure that the client can communicate securely with the server. The tunnel itself does not parse HTTP requests. In other words, it forwards each request unchanged to the server behind it. The tunnel ends when the communicating parties disconnect.

General header fields
Header field name     Description
Cache-Control         Controls caching behavior
Connection            Hop-by-hop headers, connection management
Date                  Date and time the message was created
Pragma                Message directives
Trailer               List of header fields at the end of the message
Transfer-Encoding     Transfer encoding applied to the message body
Upgrade               Upgrade to another protocol
Via                   Information about proxy servers
Warning               Error notification

Response header fields
Header field name     Description
Accept-Ranges         Whether byte range requests are accepted
Age                   Estimated time elapsed since the resource was created
ETag                  Resource matching information
Location              Redirects the client to the specified URI
Proxy-Authenticate    Authentication information the proxy server requires from the client
Retry-After           When the client should issue the request again
Server                HTTP server installation information
Vary                  Cache management information for proxy servers
WWW-Authenticate      Authentication information the server requires from the client

Entity header fields
Header field name     Description
Allow                 HTTP methods supported by the resource
Content-Encoding      Encoding applied to the entity body
Content-Language      Natural language of the entity body
Content-Length        Size of the entity body (unit: bytes)
Content-Location      Alternate URI for the corresponding resource
Content-MD5           Message digest of the entity body
Content-Range         Position range of the entity body
Content-Type          Media type of the entity body
Expires               Date and time the entity body expires
Last-Modified         Date and time the resource was last modified

Origin blog.csdn.net/AnalogElectronic/article/details/132333517