Preface: this article gives an overview of HTTP, one of the application-layer protocols.
Author: Mr. Maple Leaf (fy)
Mr. Maple Leaf's literary corner: sentence sharing
As the saying goes, there is no turning back when you open a bow, there are only three results: arrow breaking, arrow falling, and arrow hitting the target.
——Jiang Xiaoying "Su Dongpo: The Most True Love in the World"
1. Introduction to HTTP protocol
HTTP (HyperText Transfer Protocol) is a request-response protocol that works at the application layer.
Although an application-layer protocol can be custom-designed, experienced engineers have already defined ready-made protocols that we can use directly, and HTTP is one of them.
2. Understanding URLs
What we usually call a "web address" is actually a URL.
URL (Uniform Resource Locator) is the formal name of a web address.
A URL roughly consists of the following parts (assembled, they look like http://usr:pass@www.example.jp:80/dir/index.htm?uid=1#ch1):
(1) Protocol scheme name
http:// gives the protocol name, that is, the protocol to use when making the request. The protocols we most often see on the Internet are http and https. This article explains the http protocol; https is the secure variant of the protocol and will be discussed in the next chapter.
(2) Login information
usr:pass is the login authentication information: the user name and password of the user logging in. Most URLs today omit this field.
(3) Server address
www.example.jp is the server address, also known as the domain name. A domain name is not itself an IP address: it must be resolved into an IP address, which identifies a unique host, and this resolution is done by a domain name resolution (DNS) server.
On Linux, the ping command shows the IP address that a domain name resolves to.
(4) Server port number
80 is the server port number. The default port number of the http protocol is 80, and the default port number of the https protocol is 443.
The port number is usually omitted from the URL because the mapping between a service and its port number is fixed (it is already written into the client code), so there is no need to spell out the port when using the http protocol.
(5) Hierarchical file path
/dir/index.htm is the path of the resource to be accessed.
- The first / is the web root directory, not the Linux root directory. The web root can be any directory on the Linux machine
- The purpose of accessing a server is to obtain some resource on it. The domain name and port have already located the server process; what remains is to say where the resource is
http is a protocol for fetching resources from a remote server to the local machine.
Everything we see on the Internet is a resource: text, audio, pictures, web pages, and so on. These resources (files) must be stored on some server. HTTP can transfer all of these kinds of file resources, which is why it is called the hypertext transfer protocol rather than the text transfer protocol: the word "hyper" reflects the range of resource types that can be transferred.
(6) Query string
uid=1 is a parameter supplied with the request. Multiple parameters are separated by the & symbol.
(7) Fragment identifier
ch1 is the fragment identifier, which points to a part within the resource.
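To make the seven parts concrete, here is a minimal sketch that splits a URL string into them. The struct UrlParts and the function splitUrl are names made up for this illustration, and the sketch assumes a well-formed URL (no validation, no IPv6 literals).

```cpp
#include <cstddef>
#include <string>

// Hypothetical container for the URL parts described above.
struct UrlParts
{
    std::string scheme, userinfo, host, port, path, query, fragment;
};

// Split scheme://[userinfo@]host[:port][/path][?query][#fragment].
// Simplified sketch: assumes a well-formed URL.
UrlParts splitUrl(const std::string &url)
{
    UrlParts p;
    std::string rest = url;

    // (1) scheme: everything before "://"
    std::size_t pos = rest.find("://");
    if (pos != std::string::npos)
    {
        p.scheme = rest.substr(0, pos);
        rest = rest.substr(pos + 3);
    }
    // (7) fragment: everything after '#'
    pos = rest.find('#');
    if (pos != std::string::npos)
    {
        p.fragment = rest.substr(pos + 1);
        rest = rest.substr(0, pos);
    }
    // (6) query string: everything after '?'
    pos = rest.find('?');
    if (pos != std::string::npos)
    {
        p.query = rest.substr(pos + 1);
        rest = rest.substr(0, pos);
    }
    // (5) path: from the first '/' onward
    pos = rest.find('/');
    if (pos != std::string::npos)
    {
        p.path = rest.substr(pos);
        rest = rest.substr(0, pos);
    }
    // (2) login information: before '@'
    pos = rest.find('@');
    if (pos != std::string::npos)
    {
        p.userinfo = rest.substr(0, pos);
        rest = rest.substr(pos + 1);
    }
    // (3) host and (4) port, separated by ':'
    pos = rest.find(':');
    if (pos != std::string::npos)
    {
        p.port = rest.substr(pos + 1);
        rest = rest.substr(0, pos);
    }
    p.host = rest;
    return p;
}
```

The fragment and query are cut off first so that a / or : appearing inside them is not mistaken for the path or the port separator.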
3. urlencode and urldecode
Characters such as / and ? already have special meanings in a URL, so they cannot appear arbitrarily.
If a parameter needs to contain one of these special characters, the character must be escaped first.
The escaping rule is as follows:
Convert each byte of the character to hexadecimal, taking 4 bits at a time from right to left (if fewer than 4 bits remain, use them as they are). Every 2 hex digits form one unit, which is prefixed with %, giving the form %XY.
For example, when we search for C++ in the browser, everything after wd= in the URL is our search parameter (wd is the parameter name). The plus sign + is a special character in URLs, and its value in hexadecimal is 0x2B, so each + is encoded as %2B.
Note: Chinese characters and special characters must be converted. This process is called URL encode.
When the server receives our request, it decodes the %XY sequences back into the original characters. This process is called URL decode.
When writing the server in C++, we have to do this work ourselves (source code is available on the Internet and can be used directly).
To verify the decoding process, search for an online URL decoding tool and try it.
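The %XY rule can be sketched in a few lines of C++. The function names urlEncode, urlDecode and hexValue are chosen for this example and are not taken from the article's repository. The sketch escapes every byte except letters, digits and - _ . ~, and it does not handle the form convention of writing a space as +.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Percent-encode every byte except letters, digits and - _ . ~
std::string urlEncode(const std::string &in)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in)
    {
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~')
        {
            out += static_cast<char>(c);
        }
        else
        {
            out += '%';
            out += hex[c >> 4];   // high 4 bits
            out += hex[c & 0x0F]; // low 4 bits
        }
    }
    return out;
}

// Value of one hex digit, or -1 if c is not a hex digit.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Turn each valid %XY back into one byte; anything else is copied as-is.
std::string urlDecode(const std::string &in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        if (in[i] == '%' && i + 2 < in.size())
        {
            int hi = hexValue(in[i + 1]);
            int lo = hexValue(in[i + 2]);
            if (hi >= 0 && lo >= 0)
            {
                out += static_cast<char>(hi * 16 + lo);
                i += 2; // skip the two hex digits just consumed
                continue;
            }
        }
        out += in[i];
    }
    return out;
}
```

With this sketch, urlEncode("C++") gives C%2B%2B, matching the search example above, and a multi-byte UTF-8 character becomes one %XY group per byte.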
4. HTTP request and response format
HTTP is an application-layer service based on requests and responses. The client sends a request to the server. After the server receives this request, it analyzes it to find out which resource is wanted and then builds a response, which completes one HTTP request. This way of working is called request & response, or the cs or bs model: c means client, s means server, and b means browser. The browser is the client of the http protocol, which means we do not need to write a client ourselves to use the http protocol.
4.1 HTTP request protocol format
The HTTP request protocol format is roughly as follows:
An HTTP request consists of four parts:
- Request line: [request method]+[url]+[http version]+[\r\n]
- Request header: the attributes of the request, listed line by line in the form name: value, each line ending with [\r\n]
- Blank line: a blank line (\r\n) marks the end of the request header
- Request body: the request body may be empty. If a request body exists, the request header contains a Content-Length attribute giving the length of the request body
Note: http uses the special symbol \r\n to delimit its content.
The first three parts are supplied by the HTTP protocol itself; the last part, the request body, can be omitted (empty string). Once the request is packaged, it is handed to the next layer down, the transport layer, for further processing.
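As a small illustration of the request line format, the sketch below pulls the method, URL and version out of the first line of a raw request. RequestLine and parseRequestLine are names invented for this example; a real server would also validate the method and version.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical holder for the three fields of the request line.
struct RequestLine
{
    std::string method, url, version;
};

// Parse "[method] [url] [version]\r\n" from the start of a raw request.
// Returns false if the first line is missing or does not have exactly three fields.
bool parseRequestLine(const std::string &request, RequestLine &out)
{
    std::size_t end = request.find("\r\n");
    if (end == std::string::npos)
        return false; // the first line has not been fully received

    std::istringstream iss(request.substr(0, end));
    if (!(iss >> out.method >> out.url >> out.version))
        return false; // fewer than three fields

    std::string extra;
    return !(iss >> extra); // more than three fields is also an error
}
```

For the request printed by the test server later in the article, this would yield GET, / and HTTP/1.1.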
4.2 HTTP response protocol format
The format of the HTTP response protocol is roughly as follows:
An HTTP response consists of four parts:
- Status line: [http version]+[status code]+[status code description]+[\r\n]
- Response header: the attributes of the response, listed line by line in the form name: value, each line ending with [\r\n]
- Blank line: a blank line (\r\n) marks the end of the response header
- Response body: the response body may be empty. If a response body exists, the response header contains a Content-Length attribute giving the length of the response body
Note: http uses the special symbol \r\n to delimit its content.
The first three parts are supplied by the HTTP protocol itself; the last part, the response body, can be omitted (empty string). Once the response is packaged, it is handed to the next layer down, the transport layer, for further processing.
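The four parts of a response can be assembled with plain string concatenation. The function buildResponse below is a hypothetical helper written for this illustration; it always appends Content-Length itself so that the header matches the body.

```cpp
#include <string>
#include <utility>
#include <vector>

// Assemble status line + headers + blank line + body.
// Content-Length is added automatically from body.size().
std::string buildResponse(int code, const std::string &reason,
                          const std::vector<std::pair<std::string, std::string>> &headers,
                          const std::string &body)
{
    // Status line: [http version] [status code] [description]\r\n
    std::string out = "HTTP/1.1 " + std::to_string(code) + " " + reason + "\r\n";
    // Response headers: name: value\r\n
    for (const auto &h : headers)
        out += h.first + ": " + h.second + "\r\n";
    out += "Content-Length: " + std::to_string(body.size()) + "\r\n";
    // Blank line ends the header
    out += "\r\n";
    // Response body (may be empty)
    out += body;
    return out;
}
```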
4.3 Questions
How do we make sure an http request or response is read completely at the application layer?
- First, requests and responses can be read line by line (every line ends with \r\n)
- Use a while loop to read one complete line at a time (split on \r\n) until all request or response headers have been read; reading a blank line means the header is complete
- Next comes the body. How do we make sure the whole body is read? The body contains no special delimiters
- We already have the complete header, and the header contains a Content-Length field giving the length of the request or response body
- Parse Content-Length to get the body length, then read exactly that many bytes, which guarantees the body is complete
This ensures that an http request or response is read completely at the application layer.
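The reading steps above can be sketched as a check on a receive buffer: find the blank line, read Content-Length from the header, and see whether enough body bytes have arrived. The function name completeMessageLength is made up for this example, and the sketch assumes the body length is always given by Content-Length (it ignores chunked transfer encoding).

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Return the length of the first complete HTTP message in buf
// (header + blank line + body), or 0 if more bytes are still needed.
std::size_t completeMessageLength(const std::string &buf)
{
    const std::string sep = "\r\n\r\n"; // end of last header line + blank line
    std::size_t headerEnd = buf.find(sep);
    if (headerEnd == std::string::npos)
        return 0; // header not complete yet

    // Header names are case-insensitive, so search a lower-cased copy.
    std::string header = buf.substr(0, headerEnd);
    for (char &c : header)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));

    std::size_t bodyLen = 0;
    const std::string key = "\r\ncontent-length:";
    std::size_t pos = header.find(key);
    if (pos != std::string::npos)
        bodyLen = std::strtoul(header.c_str() + pos + key.size(), nullptr, 10);

    std::size_t total = headerEnd + sep.size() + bodyLen;
    return buf.size() >= total ? total : 0;
}
```

A server loop would keep appending received bytes to the buffer until this returns a non-zero length, then cut that many bytes off the front as one message.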
How are http requests and responses serialized and deserialized?
- http implements serialization and deserialization itself, using the special characters \r\n
- The first line + request/response headers: reading line by line on the special characters recovers the whole string
- The body does not need to be serialized or deserialized by http; if needed, define a format yourself
The above is a high-level view of the http protocol. Next, we write some code to understand the http protocol better.
5. HTTP test code
5.1 HTTP requests
Let's write a simple TCP server. What this server needs to do is to print the HTTP request sent by the browser.
httpServer.hpp
#pragma once
#include <iostream>
#include <string>
#include <functional>
#include <cstdint>
#include <cstdlib>
#include <strings.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include "protocol.hpp"

static const int gbacklog = 5;
using func_t = std::function<bool(const httpRequest &req, httpResponse &resp)>;

// Error codes
enum
{
    UAGE_ERR = 1,
    SOCKET_ERR,
    BIND_ERR,
    LISTEN_ERR
};

// Business logic: read one request, let func build the response, send it back
void handlerHttp(int sockfd, func_t func)
{
    char buffer[4096];
    httpRequest req;
    httpResponse resp;
    // recv returns ssize_t; storing it in size_t would turn -1 (error) into a huge positive value
    ssize_t n = recv(sockfd, buffer, sizeof(buffer) - 1, 0);
    if (n > 0)
    {
        buffer[n] = 0;
        req.inbuffer = buffer;
        func(req, resp);
        send(sockfd, resp.outbuffer.c_str(), resp.outbuffer.size(), 0);
    }
}

// Arguments handed to each worker thread
class ThreadDate
{
public:
    ThreadDate(int sockfd, func_t func) : _sockfd(sockfd), _func(func)
    {
    }

public:
    int _sockfd;
    func_t _func;
};

class httpServer
{
public:
    httpServer(const uint16_t &port) : _listensock(-1), _port(port)
    {
    }

    // Initialize the server
    void initServer()
    {
        // 1. Create the socket
        _listensock = socket(AF_INET, SOCK_STREAM, 0);
        if (_listensock == -1)
        {
            std::cout << "create socket error" << std::endl;
            exit(SOCKET_ERR);
        }
        std::cout << "create socket success: " << _listensock << std::endl;

        // 2. Bind the port
        // 2.1 Fill in the sockaddr_in structure
        struct sockaddr_in local;
        bzero(&local, sizeof(local));       // zero the whole sockaddr_in structure
        local.sin_family = AF_INET;         // IPv4 network communication
        local.sin_port = htons(_port);      // host byte order to network byte order
        local.sin_addr.s_addr = INADDR_ANY; // INADDR_ANY is 0x00000000

        // 2.2 Bind (the cast to struct sockaddr * is required)
        int n = bind(_listensock, (struct sockaddr *)&local, sizeof(local));
        if (n == -1)
        {
            std::cout << "bind socket error" << std::endl;
            exit(BIND_ERR);
        }
        std::cout << "bind socket success" << std::endl;

        // 3. Put the _listensock socket into the listening state
        if (listen(_listensock, gbacklog) == -1)
        {
            std::cout << "listen socket error" << std::endl;
            exit(LISTEN_ERR);
        }
        std::cout << "listen socket success" << std::endl;
    }

    // Start the server
    void start(func_t func)
    {
        for (;;)
        {
            // 4. Accept a new connection from the _listensock socket
            struct sockaddr_in peer;
            socklen_t len = sizeof(peer);
            // This sockfd is the one that actually serves the client's request
            int sockfd = accept(_listensock, (struct sockaddr *)&peer, &len);
            if (sockfd < 0) // accept failed, but the server keeps running
            {
                std::cout << "accept error, next!" << std::endl;
                continue;
            }
            std::cout << "accept a new connection success, sockfd: " << sockfd << std::endl;

            // 5. Serve sockfd, that is, serve the client
            // Multithreaded version
            pthread_t tid;
            ThreadDate *td = new ThreadDate(sockfd, func);
            pthread_create(&tid, nullptr, threadRoutine, td);
        }
    }

    static void *threadRoutine(void *args)
    {
        pthread_detach(pthread_self()); // detach the thread
        ThreadDate *td = static_cast<ThreadDate *>(args);
        handlerHttp(td->_sockfd, td->_func); // business logic
        close(td->_sockfd);                  // must be closed; the new thread closes it
        delete td;
        return nullptr;
    }

    ~httpServer()
    {
    }

private:
    int _listensock; // listening socket: not used for data, only for incoming connections
    uint16_t _port;  // port number
};
httpServer.cc
#include "httpServer.hpp"
#include <cstdlib>
#include <memory>

// Usage
// ./httpServer port
static void Uage(std::string proc)
{
    std::cout << "\nUsage:\n\t" << proc << " local_port\n\n";
}

// Print the raw HTTP request sent by the browser
bool get(const httpRequest &req, httpResponse &resp)
{
    std::cout << "----------------------http start----------------------" << std::endl;
    std::cout << req.inbuffer;
    std::cout << "----------------------http end ----------------------" << std::endl;
    return true; // the original had no return statement, which is undefined behavior for a bool function
}

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        Uage(argv[0]);
        exit(UAGE_ERR);
    }
    uint16_t port = atoi(argv[1]); // string to int
    std::unique_ptr<httpServer> tsvr(new httpServer(port));
    tsvr->initServer(); // initialize the server
    tsvr->start(get);   // start the server
    return 0;
}
protocol.hpp
#pragma once
#include <iostream>
#include <string>
#include <vector>

class httpRequest
{
public:
    std::string inbuffer;
};

class httpResponse
{
public:
    std::string outbuffer;
};
Run the server program and then visit it with a browser. The server receives the HTTP request sent by the browser and prints it.
Since the code does nothing else yet, only the following information is displayed.
(Although we visit only once, the server may receive several HTTP requests; this is browser behavior.)
GET / HTTP/1.1
Host: 119.3.185.15:8080
Connection: keep-alive
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.67
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6
Explanation:
- Because the browser uses the HTTP protocol by default when it initiates a request, we can type the server's public IP address and port number directly into the browser's address bar without specifying the protocol
- The first line is the request line: GET / HTTP/1.1. GET is the request method, which is the browser's default. The URL is /: because we did not request anything specific, the browser visits / (the web root directory) by default. HTTP/1.1 is the HTTP version
- The remaining lines are all request headers: request attributes shown line by line in the form name: value
- A blank line is printed as well. There is no request body, so it defaults to an empty string and nothing is printed for it
Client host version information:
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.67
User-Agent shows the version information of the client host that initiated the request.
For example, when we search for software to download, the site offers by default the download that matches our operating system. How does it know that we want the desktop version?
The reason is that the request we send already carries our operating system's version information.
The remaining headers tell the server what the client currently supports, such as encodings and text types.
Some further points
How is the HTTP header separated from the payload?
- In HTTP, the request line or status line together with the request/response headers is the HTTP header information, and the request/response body is the HTTP payload
- Reading a blank line means the header is complete. The blank line is the key to separating the HTTP header from the payload
- In other words, http uses a special symbol to separate the header from the payload
Why does HTTP exchange version information?
- The request line of an HTTP request and the status line of an HTTP response both contain the http version. The request is sent by the client, so it states the client's http version; the response is sent by the server, so it states the server's http version
- When client and server communicate they exchange their http versions, mainly for compatibility. The server and client may use different http versions, so to let clients of different versions receive the matching service, the two sides need to negotiate the version
- For example, an application at version 1.0 is upgraded to 2.0 today (with new features the old version lacks). Some users upgrade and some do not, so versions now differ. An old client cannot use the new version's service and must be served the old way. To make this possible, the two sides exchange version information so that clients of different versions receive the matching service
- Therefore, to ensure good compatibility, both sides need to exchange their version information
5.2 HTTP response
With a little more code, we can observe the HTTP response.
bool get(const httpRequest &req, httpResponse &resp)
{
    std::cout << "----------------------http request start----------------------" << std::endl;
    std::cout << req.inbuffer;
    std::cout << "+++++++++++++++++++++++++++++" << std::endl;
    std::cout << "request method: " << req.method << std::endl;
    std::cout << "request url: " << req.url << std::endl;
    std::cout << "request httpversion: " << req.httpversion << std::endl;
    std::cout << "request path: " << req.path << std::endl;
    std::cout << "request file suffix: " << req.suffix << std::endl;
    std::cout << "request body size: " << req.size << " bytes" << std::endl;
    std::cout << "----------------------http request end ----------------------" << std::endl;

    std::cout << "----------------------http response start ----------------------" << std::endl;
    std::string respline = "HTTP/1.1 200 OK\r\n";            // response status line
    std::string respheader = Util::suffixToDesc(req.suffix); // response header built from the file suffix
    std::string respblank = "\r\n";                          // response blank line
    std::string respbody;                                    // response body
    respbody.resize(req.size);
    // &respbody[0] is a writable buffer; writing through a cast of c_str() is not allowed
    if (!Util::readFile(req.path, &respbody[0], req.size)) // resource does not exist: serve the 404 page
    {
        respline = "HTTP/1.1 404 Not Found\r\n"; // report the failure in the status line as well
        struct stat st;                          // requires <sys/stat.h>
        stat(html_404.c_str(), &st);
        respbody.resize(st.st_size);
        Util::readFile(html_404, &respbody[0], st.st_size); // assumed to succeed
    }

    resp.outbuffer = respline;
    respheader += "Content-Length: ";
    respheader += std::to_string(respbody.size());
    respheader += respblank;
    resp.outbuffer += respheader;
    resp.outbuffer += respblank;
    std::cout << resp.outbuffer; // print only the status line and headers
    resp.outbuffer += respbody;
    std::cout << "----------------------http response end ----------------------" << std::endl;
    return true;
}
The full code is too long to post here; see the gitee link.
Running result: when the browser visits our server, the server sends the index.html file back to the browser (index.html is by default the home page of the site being visited) and prints part of the response information.
Note: this is only an example. When constructing the HTTP response, only two attributes are added to the response header; a real HTTP response header contains many more attributes.
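The response code above calls Util::suffixToDesc, whose body is not shown in this article. As a rough idea of what such a helper might do, here is a hypothetical stand-in named contentTypeHeader that maps a file suffix to a Content-Type header line. The suffix list and the fallback type are assumptions of this sketch, not the author's implementation.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a suffix-to-Content-Type helper.
// Returns a full header line, including the trailing \r\n.
std::string contentTypeHeader(const std::string &suffix)
{
    static const std::unordered_map<std::string, std::string> types = {
        {".html", "text/html"},
        {".htm", "text/html"},
        {".css", "text/css"},
        {".js", "text/javascript"},
        {".png", "image/png"},
        {".jpg", "image/jpeg"},
    };
    auto it = types.find(suffix);
    // Unknown suffixes fall back to a generic binary type.
    const std::string type = (it != types.end()) ? it->second : "application/octet-stream";
    return "Content-Type: " + type + "\r\n";
}
```

Together with Content-Length, this gives the two response header attributes mentioned in the note above.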
6. HTTP methods
The common HTTP methods (used in requests) are as follows:
The most commonly used are the GET method and the POST method.
When front end and back end exchange data, the front end essentially submits data through a form, and the browser automatically converts the form content into a GET or POST request.
For example, in the front-end form submission page, action="/a/test.py" means the form is submitted to the file at the specified path, and method="GET" means the http access method is GET.
Start the server, visit it with the browser, and submit some content, such as Zhang San and 123123.
Because the requested page /a/test.py does not exist, the 404 page (set up by ourselves) is displayed.
Look at the request information printed by the server:
When the GET method submits parameters, the parameters are appended to the URL.
The part before the ? in /a/test.py? is the resource we want to request, and the part after it, xxxname=%E5%BC%A0%E4%B8%89&yyypwd=123123, is the information submitted by the form. The submitted content is also visible in the browser's address bar.
Now let's try the POST method. Modify the HTML, visit the page with the browser, and submit the form.
Look at the request information printed by the server:
When the POST method submits form information, the submitted parameters are placed in the body of the http request.
In the browser's address bar we do not see the content we submitted, but we can still see the resource we are accessing.
Summary: the difference between the GET and POST request methods
- The GET method passes parameters through the URL, for example: http://ip:port/xxx/yyy?name=value&name2=value2...
- The POST method submits parameters through the http request body
- Parameters submitted by POST travel in the request body, which users generally do not see, so privacy is better
- Parameters submitted by GET travel in the URL, where anyone can see them
- Because GET passes parameters through the URL, the parameters cannot be very large, while POST passes parameters through the body, which can be very large
Note: privacy != security. Plain HTTP is not secure; the traffic can be captured directly by others.
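Whether the parameters arrive in the URL (GET) or in the body (POST form submission), they have the same name=value&name2=value2 shape, so one parser can serve both. The sketch below uses made-up names parseParams and queryOf; it leaves values percent-encoded, so a real server would still URL-decode each value afterwards.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Split "name=value&name2=value2" into a map. Values stay percent-encoded.
std::map<std::string, std::string> parseParams(const std::string &s)
{
    std::map<std::string, std::string> params;
    std::size_t start = 0;
    while (start < s.size())
    {
        std::size_t amp = s.find('&', start);
        if (amp == std::string::npos)
            amp = s.size();
        std::string pair = s.substr(start, amp - start);
        if (!pair.empty())
        {
            std::size_t eq = pair.find('=');
            if (eq == std::string::npos)
                params[pair] = ""; // a name with no value
            else
                params[pair.substr(0, eq)] = pair.substr(eq + 1);
        }
        start = amp + 1;
    }
    return params;
}

// For GET: the parameters are whatever follows '?' in the URL.
std::string queryOf(const std::string &url)
{
    std::size_t q = url.find('?');
    return q == std::string::npos ? "" : url.substr(q + 1);
}
```

For a GET request, call parseParams(queryOf(url)); for a POST form submission, call parseParams on the request body directly.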
7. HTTP status codes
HTTP status codes are as follows:
Note: 1xx means the status codes that start with 1. A status code has three digits, for example 404.
The most common status codes are 200 (OK), 404 (Not Found), 403 (Forbidden), 302 (Redirect), and 504 (Gateway Timeout).
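Since the first digit decides the category of a status code, the category can be computed with an integer division. The function statusClass below is a small illustrative helper, not part of the article's server code; the category names follow the usual 1xx to 5xx grouping.

```cpp
#include <string>

// Map a three-digit status code to its category by its first digit.
std::string statusClass(int code)
{
    switch (code / 100)
    {
    case 1:
        return "Informational";
    case 2:
        return "Success";
    case 3:
        return "Redirection";
    case 4:
        return "Client Error";
    case 5:
        return "Server Error";
    default:
        return "Unknown";
    }
}
```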
Let's talk about redirection (the 3xx status codes)
Redirection means steering a network request to another location. Here the server acts as a guide.
The redirect itself is carried out by the client; the server only tells the client where to go.
Redirection can be temporary or permanent. Status code 301 indicates a permanent redirect, while status codes 302 and 307 indicate temporary redirects.
Moved Permanently (permanent redirect)
- Permanent means the resource originally accessed has been permanently moved, and the client should use the new URI from now on
Temporary Redirect
- Temporary means the resource should for now be accessed through the URI in Location, but the old resource still exists and the next visit may not need to be redirected
- A 302 redirect can lead to URL hijacking: for example, search results still show URL A, but the page content shown is the content of URL B
For more on redirection, see the linked article: Redirection
Here is a demonstration of temporary redirection
- The Location field is an attribute in the HTTP header that gives the target address to redirect to
Change the status code in the HTTP response to 307 and follow it with the matching status description. In addition, add a Location field to the HTTP response header, followed by the page to redirect to; here it is set to my CSDN home page.
Now when the browser visits our server, it immediately jumps to the CSDN home page.
The server prints the response information.
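A redirect response of the kind described here needs only a status line, a Location header and the blank line. The helper buildRedirect below is a hypothetical name for this sketch, and the target address is whatever the caller passes in.

```cpp
#include <string>

// Build a minimal 307 response that sends the client to `location`.
std::string buildRedirect(const std::string &location)
{
    std::string out = "HTTP/1.1 307 Temporary Redirect\r\n"; // status line
    out += "Location: " + location + "\r\n";                 // where the client should go next
    out += "Content-Length: 0\r\n";                          // no body
    out += "\r\n";                                           // blank line ends the header
    return out;
}
```

Assigning the returned string to resp.outbuffer in the handler would make the browser jump to the given address.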
8. Common HTTP headers
Common HTTP headers are as follows:
- Content-Type: data type (text/html, etc.)
- Content-Length: the length of the body
- Host: the client tells the server which host and port the requested resource is on
- User-Agent: declares the user's operating system and browser version information
- Referer: which page the current page was reached from
- Location: used with 3xx status codes to tell the client where to go next
- Cookie: stores a small amount of information on the client side, usually to implement sessions
Host
The Host field gives the IP and port of the service the client wants to access. For example, when the browser visits our server, the Host field in the HTTP request sent by the browser is filled with our IP and port.
User-Agent
As mentioned earlier, User-Agent gives the operating system and browser version of the client.
Referer
Referer records which page you came from. Recording the previous page makes it easy to go back, and it also tells us how the current page relates to the previous one.
Keep-Alive (long connection)
Keep-Alive, also known as a long connection, is a technique in the HTTP protocol for keeping a persistent connection between client and server, reducing the latency and resource cost of each request.
In early HTTP (HTTP/1.0 by default), each time the client sends a request the server returns the response and closes the connection. Such a connection is called a short connection. With a long connection, once a connection is established between client and server, multiple requests can be sent and multiple responses received over it.
The advantages of a long connection include:
- Less connection setup and teardown overhead: with short connections every request must establish and tear down a connection, while a long connection reuses the established one
- Lower latency: with short connections every request must re-establish the connection, while a long connection avoids this delay and improves response speed
- Lower resource consumption: with short connections every request must re-establish the connection, while a long connection reduces the load on server resources
Note: a long connection is not permanent. Either the server or the client can actively close it.
In the HTTP request or response header, the value Keep-Alive in the Connection field means that long connections are supported.
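On the server side, the decision whether to keep a connection open can be sketched as a look at the Connection header. The function wantsKeepAlive is a name invented for this example. The sketch assumes that an explicit close or keep-alive value wins, and that without the header HTTP/1.1 defaults to a long connection while older versions default to a short one.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Decide whether to keep the connection open after responding.
// rawHeader is the request line plus headers; version is e.g. "HTTP/1.1".
bool wantsKeepAlive(const std::string &rawHeader, const std::string &version)
{
    // Header names and these values are case-insensitive: work on a lower-cased copy.
    std::string h;
    for (unsigned char c : rawHeader)
        h += static_cast<char>(std::tolower(c));

    const std::string key = "\r\nconnection:";
    std::size_t pos = h.find(key);
    if (pos != std::string::npos)
    {
        std::size_t begin = pos + key.size();
        std::size_t end = h.find("\r\n", begin);
        std::string value = h.substr(begin, end == std::string::npos ? std::string::npos : end - begin);
        if (value.find("close") != std::string::npos)
            return false;
        if (value.find("keep-alive") != std::string::npos)
            return true;
    }
    // No explicit request: HTTP/1.1 connections are persistent by default.
    return version == "HTTP/1.1";
}
```

A server using this would loop on the same sockfd while the function returns true, instead of closing the socket after one response as the test server in this article does.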
Next, let's look at Cookie and Session in detail.
9. Cookie and Session
HTTP is actually a stateless protocol: each HTTP request/response is unrelated to any other. But that is not what you experience when using a browser.
For example, after logging in to a website such as bilibili once, the login state persists for a long time. Close the bilibili site and reopen it, and the account is still logged in with no need to log in again. The same is true after closing the browser.
This is achieved with cookie and session, and is called session persistence.
Note: strictly speaking, session persistence is not a native feature of http. It was added later, once real-world use showed it was needed.
The http protocol is stateless, but users need state. While browsing, a user keeps opening new pages. If every page jump left the new page unable to tell which user it was, the user would have to log in again each time, which is clearly unacceptable. So that a user who has logged in once can browse the whole site under their own identity, session persistence is required.
Session persistence (the old way)
- When a user visits a website, the site prompts the user to log in. After login, the client browser saves the user's account and password. From then on, whenever the user visits the same site, the browser automatically sends the saved information for authentication
- The technique of having the browser save the account and password is called a cookie
cookie
A cookie is a small text file stored in the user's browser that holds the user's authentication information, personal settings, and so on. When a user visits a website, the server stores some information in a cookie, and the browser sends this cookie back to the server in later requests.
A cookie can be saved in two ways: in a cookie file or in cookie memory
- Close the browser, reopen it, and visit a site you had logged in to. If you must re-enter the account and password, the cookie saved at login was memory-level
- Close the browser or restart the computer, reopen the browser, and visit a site you had logged in to. If you do not need to re-enter the account and password, the cookie saved at login was file-level
Cookies can be managed in the browser. Delete them all, and every website requires you to log in again.
Within a single website, after logging in, we can also view that site's cookies as a test. Delete the site's cookie, and the user is no longer logged in and must log in again.
Problems with using cookies
Under normal circumstances there is no problem. But if unsafe behavior by the user gets the machine infected with a virus, worm, Trojan horse, etc., the user's cookie can be leaked.
- Worms: aim to attack the user's host directly (mainly CPU, memory, etc.), exhausting system resources
- Trojan horses: named after the Trojan horse of legend, which hid enemy soldiers who came out at night. A Trojan does not aim to damage the computer; it hides inside a seemingly normal program and works with the attacker from the inside. Its goals are to steal user information and remotely control the computer, not to maliciously attack the user's host
Once someone with malicious intent obtains the cookie, they can access the server from their own browser and the server will mistake them for the real user (which is very harmful).
Solution: session
session
A session is a server-side storage technique for holding user session information.
- When the user first visits the website, the server creates a unique Session ID for the user, stores it in a Cookie, and sends it to the browser. The browser automatically sends this Session ID to the server in later requests, and the server finds the corresponding session information by the Session ID
- The session is stored on the server. Each user has one session file, and its session ID (a string) is unique on the server
- The client browser no longer needs to store the user's account and password; it only stores the session ID, that is, the session ID is placed in the cookie
SessionID
When we log in to a website for the first time and enter the account and password, the server generates a corresponding SessionID after authentication succeeds.
In its response, the server returns the generated SessionID value to the browser. After receiving the response, the browser automatically extracts the SessionID value and saves it in its cookie file. In later visits to the server, the SessionID is carried automatically in the HTTP request.
- With this scheme, leakage of user information is greatly reduced, but problems remain
- If an attacker steals the user's session ID (stored in the cookie), the attacker can access the server as that user and the server will mistake the illegitimate user for the real one. This cannot be fully prevented
- What the server can do is apply some policy, for example based on IP, to invalidate the session ID. Then only someone who knows the password can log in again and obtain a new session ID, which mitigates session ID theft to some extent (it is not a cure)
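The server side of this scheme can be pictured as a table from session ID to user. The class SessionStore below is a hypothetical in-memory sketch: the article describes per-user session files, whereas this example keeps everything in a map, generates 32 hex characters as the ID, and is not thread-safe.

```cpp
#include <random>
#include <string>
#include <unordered_map>

// Hypothetical in-memory session table: session ID -> user name.
class SessionStore
{
public:
    // Call after the account and password have been verified; returns the new session ID.
    std::string create(const std::string &username)
    {
        std::string id = newId();
        while (_sessions.count(id)) // the ID must be unique on the server
            id = newId();
        _sessions[id] = username;
        return id;
    }

    // Look up the session ID carried in a request's cookie.
    bool lookup(const std::string &id, std::string &username) const
    {
        auto it = _sessions.find(id);
        if (it == _sessions.end())
            return false;
        username = it->second;
        return true;
    }

    // Invalidate a session, for example when theft is suspected.
    void remove(const std::string &id)
    {
        _sessions.erase(id);
    }

private:
    // 32 random hex characters.
    static std::string newId()
    {
        static const char hex[] = "0123456789abcdef";
        std::random_device rd;
        std::uniform_int_distribution<int> dist(0, 15);
        std::string id;
        for (int i = 0; i < 32; ++i)
            id += hex[dist(rd)];
        return id;
    }

    std::unordered_map<std::string, std::string> _sessions;
};
```

Calling remove is the invalidation step described above: after it, the stolen ID no longer maps to any user, and only a fresh login can create a new one.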
Security is relative
- Although this does not truly solve the security problem, the approach is relatively safe. There is no absolute security on the Internet; all security is relative. Even encrypted information sent over the network may be cracked by others
- There is a rule in the security field: if the cost of cracking a piece of information is far greater than the benefit gained from cracking it (so doing it is a net loss), the information can be considered safe
Below we verify that the client carries the cookie information
- When the browser visits our server, if the server's HTTP response contains a Set-Cookie field, the browser carries this cookie information the next time it visits the server
Make a small change to the code above (the full code is too long to paste; see the link: Code).
Add a Set-Cookie field to the server's response header and check whether the browser brings this cookie when it sends its second HTTP request.
After running the server, visit it with a browser. The cookie value is the one we set, 1234567asdf, and this cookie is now written into the browser.
The client's second request already carries the cookie information.
From then on, every http request automatically carries all cookies that have been set, helping the server authenticate the user. This is how http session persistence works.
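Both directions of this exchange are plain header lines. The sketch below builds a Set-Cookie line for the response and pulls one cookie back out of the value of a Cookie request header. The function names and the cookie name sessionid are assumptions for illustration; cookie attributes such as expiry or path are left out.

```cpp
#include <cstddef>
#include <string>

// Response side: build "Set-Cookie: name=value\r\n".
std::string setCookieHeader(const std::string &name, const std::string &value)
{
    return "Set-Cookie: " + name + "=" + value + "\r\n";
}

// Request side: find one cookie in a Cookie header value such as
// "a=1; sessionid=1234567asdf". Returns false if it is not present.
bool findCookie(const std::string &cookieHeaderValue, const std::string &name, std::string &value)
{
    std::size_t start = 0;
    while (start < cookieHeaderValue.size())
    {
        std::size_t semi = cookieHeaderValue.find(';', start);
        if (semi == std::string::npos)
            semi = cookieHeaderValue.size();
        std::string pair = cookieHeaderValue.substr(start, semi - start);

        // Skip the space that follows each ';'
        std::size_t first = pair.find_first_not_of(' ');
        if (first != std::string::npos)
        {
            pair = pair.substr(first);
            std::size_t eq = pair.find('=');
            if (eq != std::string::npos && pair.compare(0, eq, name) == 0)
            {
                value = pair.substr(eq + 1);
                return true;
            }
        }
        start = semi + 1;
    }
    return false;
}
```

In the test server, the string returned by setCookieHeader would be appended to respheader before the blank line, and the value found by findCookie on later requests is what the server would use to look up the session.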
Tool recommendation:
postman: HTTP debugging tool, simulate browser behavior
fiddler: packet capture tool, HTTP tool
--------------------- END --------- -------------
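Author: Mr. Maple Leaf
Updated: 2023.7.11
Disclaimer: my knowledge is limited, so omissions in this article are unavoidable. If anything is wrong or inaccurate, readers are welcome to point it out.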