1. What is HTTP?
HTTP is a protocol at the application layer
Application layer: the client program and the server program both run at the application layer of the network stack, so the two programs need to use the same application-layer protocol (HTTP is one of the application-layer protocols)
Protocol: HTTP defines its own text format
HTTP was born in 1991, and it has developed into the most mainstream application layer protocol.
At present, we mainly use HTTP/1.1 and HTTP/2
Accessing a resource (web page, picture, JS, video...) in the browser means that data in the HTTP packet format is transmitted from a process on host A to a process on host B
Hosts A and B can be the same machine or different machines
Hypertext Transfer Protocol: can transfer text and other formats of data resources, such as pictures, music, video, etc.
About text and binary data:
Text is actually just binary data under a particular character encoding, so the two can be converted into each other.
1. In Java, String has getBytes("encoding") to convert a string to binary data
2. new String(byte array, "encoding") converts binary data back to a string
HTTP is a text format, but a message can still contain some binary data inside
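The two conversions above can be sketched roughly as follows. The class name TextBytesDemo and the sample string are made up for illustration, and the sketch passes a Charset constant instead of an encoding name, which is just one way to avoid the checked exception.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original article.
public class TextBytesDemo {
    // Text -> binary: encode the string as UTF-8 bytes.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Binary -> text: decode the bytes with the same charset used to encode them.
    static String decode(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u534f\u8bae" is two Chinese characters written as Unicode escapes.
        String text = "HTTP \u534f\u8bae";
        byte[] data = encode(text);
        System.out.println(Arrays.toString(data));
        System.out.println(decode(data));
        // Decoding with a different charset than the one used for encoding
        // garbles the non-ASCII part of the text.
        System.out.println(new String(data, StandardCharsets.ISO_8859_1));
    }
}
```

Both sides of a connection have to agree on the encoding, otherwise the last line of main shows what the receiver ends up with.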
2. Understand the process of interaction between client and server
Client: browser process
Server: web server process
Example: enter www.sogou.com in the browser
3. HTTP protocol format
HTTP is a text-based protocol; its traffic can be captured with Chrome developer tools or Fiddler to analyze the details of HTTP requests/responses
Network packet capture: during network communication, capture the transmitted request and response data packets
- Developer tools, Network panel: does not show the content in the native format of the http protocol, but can capture https packets directly
- fiddler: you can see data packets in http native format
To capture https packets, you need to configure: tools->options->
about fiddler:
http protocol
https protocol
4. URL
1. Know the URL
A path that identifies a resource in the network
Format: protocol name://server address:server port number/hierarchical resource path?query string
For example, the url used by mysql jdbc:
jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=utf-8&useSSL=false
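To see how the format splits into parts, here is a small sketch using java.net.URI. The class name UrlPartsDemo, the describe helper, and the www.example.com URL are assumptions for illustration. Stripping the "jdbc:" prefix before parsing is only a trick for this demo and is not how the MySQL driver itself handles the URL.

```java
import java.net.URI;

// Hypothetical demo class, not part of the original article.
public class UrlPartsDemo {
    // Split a URL into the parts named in the format above.
    static String describe(String url) {
        URI uri = URI.create(url);
        return "scheme=" + uri.getScheme()
                + ", host=" + uri.getHost()
                + ", port=" + uri.getPort()
                + ", path=" + uri.getPath()
                + ", query=" + uri.getQuery();
    }

    public static void main(String[] args) {
        // Made-up web URL used only as an example.
        System.out.println(describe("http://www.example.com:8080/student/info?userId=1001"));

        // A JDBC URL has an extra "jdbc:" prefix; remove it for this demo so
        // that the remainder has the usual protocol://host:port/path?query shape.
        String jdbcUrl = "jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=utf-8&useSSL=false";
        System.out.println(describe(jdbcUrl.substring("jdbc:".length())));
    }
}
```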
When accessing web resources:
- In the browser, you do not have to enter the protocol name
- Server address: you can use an ip or a domain name
1. An ip is not easy to remember, and a domain name is more convenient
2. There is another advantage: if you purchase an Alibaba Cloud server (which provides the right to use a host on the public network) and later switch to a Tencent Cloud server, users still use the same domain name (just bind the domain name to the new ip)
ping is a way to test whether a host is reachable
It also shows which ip a domain name actually resolves to when you visit it
ping also accepts an ip address directly
- Port number: if no port number is entered in the browser, http uses port 80 by default, and https uses port 443 by default
- Hierarchical resource path: identifies the path of a specific resource on a server. If no resource path is entered, / is accessed (also known as the root path of a web application)
- Query string: queryString
Its purpose is to request the same resource with data under different conditions.
For example: an educational administration system serves the web page of a student, and passing in different userIds returns different student information.
The content of the queryString is a set of key-value pairs (key=value). The number of pairs and the keys and values themselves are agreed on entirely by the programmers, and multiple key-value pairs are separated by & (key1=value1&key2=value2)
When the front end (js code) sends a request, it carries the agreed keys, and when the back end (server) receives it, it parses the agreed keys
- Omittable parts of URLs
1. Protocol name: can be omitted. If omitted, it defaults to http://
2. ip address/domain name: can be omitted inside HTML (such as the src or href attribute of img, link, script, or a tags); when omitted, the ip/domain name is the same as that of the server the current HTML came from
3. Port number: can be omitted. For the http protocol the port number is automatically set to 80; for the https protocol it is automatically set to 443
4. Hierarchical file path: can be omitted. When omitted, it is equivalent to /, and some servers automatically serve /index.html when the / path is requested
5. Query string: can be omitted
6. Fragment tag: can be omitted
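The defaults for the omitted parts, and the key=value&key=value structure of the query string, can be sketched like this. The class UrlDefaultsDemo and its helper methods are hypothetical, the example URLs are made up, and the port logic only knows about http and https.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class, not part of the original article.
public class UrlDefaultsDemo {
    // Explicit port if present, otherwise 443 for https and 80 for anything else
    // (this sketch only distinguishes http and https).
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    // An omitted path is treated as the root path "/".
    static String effectivePath(URI uri) {
        String path = uri.getPath();
        return (path == null || path.isEmpty()) ? "/" : path;
    }

    // Split "key1=value1&key2=value2" into a map. Values are left as they
    // appear in the URL (no percent-decoding is done here).
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            params.put(key, value);
        }
        return params;
    }

    public static void main(String[] args) {
        URI bare = URI.create("https://www.example.com");
        System.out.println(effectivePort(bare) + " " + effectivePath(bare));

        URI full = URI.create("http://www.example.com/student/info?userId=1001&lang=en");
        System.out.println(effectivePort(full) + " " + effectivePath(full));
        System.out.println(parseQuery(full.getRawQuery()));
    }
}
```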
2. About URL encode
The browser will automatically encode the url for us in these cases:
- Entering the url in the address bar
- Importing external resource urls in html and css (such as js, css, pictures, videos, etc.)
If the url contains special characters, Chinese, spaces, etc., they are escaped first, put into the http data packet, and then the http request is sent
url encode (url encoding): convert Chinese, spaces, etc. into bytes written as % followed by hexadecimal digits
url decode (url decoding): convert the hexadecimal data in the url back to the original Chinese, spaces, etc.
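A rough sketch of encoding and decoding with the JDK's URLEncoder and URLDecoder (the Charset overloads need Java 10 or later). The class name UrlEncodeDemo and the sample text are made up. Note that URLEncoder follows the form-encoding rules, so a space becomes + instead of %20; a browser's address bar may escape some characters differently.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original article.
public class UrlEncodeDemo {
    // Escape characters that are not allowed to appear as-is in a query value.
    static String encode(String raw) {
        return URLEncoder.encode(raw, StandardCharsets.UTF_8);
    }

    // Turn the %XX sequences (and '+') back into the original characters.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is two Chinese characters written as Unicode escapes.
        String raw = "\u4e2d\u6587 a&b";
        String encoded = encode(raw);
        System.out.println(encoded); // %E4%B8%AD%E6%96%87+a%26b
        System.out.println(decode(encoded).equals(raw)); // true
    }
}
```

Each Chinese character becomes three %XX groups here because UTF-8 uses three bytes for it.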
When a local html file path is entered in the browser, it is actually a url for local file access
So referencing an external file with a Chinese name in html may cause problems
For example, in js, for img src="a url containing Chinese", the value read back is the url-encoded content, so the result may not be as expected
Therefore, pay attention to the following:
front-end code: when importing a url, Chinese, spaces, and special characters can be used directly, but when getting a url you need to be careful, as it may need decoding
js: the img.src attribute actually holds the encoded url
backend: a url obtained there also needs to be decoded
If you need to get content from the url, and that content contains Chinese or special characters, you need to consider decoding
Regarding the length of the url:
For browsers and web servers, you can configure the maximum length of the url. If it is not configured, the default length will be used.
5. HTTP protocol format
Request:
- First line (request line): ① request method ② url ③ HTTP version number; in a response, the first line is the response line/status line: ① HTTP version number ② status code ③ status code description
- Headers (request headers/response headers): multiple key-value pairs (attributes describing the http message), each written as key: value (by convention with a space after the colon), with the pairs separated by newlines
Key: the http standard defines a set of header keys, but programmers can also agree on their own header keys. The programmer decides which header keys need to be used.
- Empty line: marks the end of the headers; everything read before the empty line is parsed as headers
- Body (request body/response body): the data carried by the request or response
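The four parts can be seen by splitting a raw request by hand. This is only a sketch: the class HttpRequestParseDemo, the Request holder, and the sample login request are all made up, and real servers do far more validation (for example, a malformed first line would make this code throw).

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class, not part of the original article.
public class HttpRequestParseDemo {
    // Simple holder for the parsed parts of a request.
    static final class Request {
        String method;
        String url;
        String version;
        Map<String, String> headers = new LinkedHashMap<>();
        String body = "";
    }

    static Request parse(String raw) {
        Request req = new Request();
        // The empty line (CRLF CRLF) separates the headers from the body.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        req.body = split >= 0 ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        // First line: request method, url, HTTP version, separated by spaces.
        String[] first = lines[0].split(" ", 3);
        req.method = first[0];
        req.url = first[1];
        req.version = first[2];

        // Remaining lines: one "key: value" header per line.
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon < 0) {
                continue; // skip lines that are not key: value
            }
            req.headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
        }
        return req;
    }

    public static void main(String[] args) {
        // Made-up form submission used only as an example.
        String body = "username=alice&password=secret";
        String raw = "POST /login HTTP/1.1\r\n"
                + "Host: www.example.com\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "Content-Length: " + body.getBytes(StandardCharsets.UTF_8).length + "\r\n"
                + "\r\n"
                + body;
        Request req = parse(raw);
        System.out.println(req.method + " " + req.url + " " + req.version);
        System.out.println(req.headers);
        System.out.println(req.body);
    }
}
```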
For example, the login page of gitee:
The body can be in any format, so how does the server parse it after receiving it? The Content-Type header tells the receiver which format the body uses.
Generally speaking, common request body formats are the form format (transmitting data to the server) and file formats such as pictures and videos (uploading these files to the server).
Common response body formats are text/html, text/css, and text/javascript (returning web pages, css style files, and js files), as well as pictures, videos, and other files (which the client can use directly: rendering pictures, playing videos, downloading files).
Another common format is application/json (transmitting some data to the other party), which is used in both requests and responses (it is similar to the format of a js object, but the keys need to be enclosed in double quotes).
Request: generally, after the user enters some content, the data is submitted to the server.
Response: generally, the server returns some data, and the client-side js code reads the response data and then fills it into the html.