What is a session?

The word "session" has similar but not identical meanings depending on the level at which you look at it. From the perspective of a web application user, a session is everything from opening the browser and visiting an e-commerce site, through logging in and completing a purchase, until the browser is closed. In web application development, the data structure I create to store a user's login information after he logs in is also called a session. So pay attention to the context when talking about sessions. This article is about a mechanism built on top of the HTTP protocol that enhances the capabilities of web applications. It does not refer to any one dynamic page technology, and the capability in question is maintaining state, which can also be called maintaining a session.

Why do you need a session?

Sessions are generally discussed in the context of web applications. Web applications are based on the HTTP protocol, and HTTP is a stateless protocol. That is, if the user jumps from page A to page B, a new HTTP request is sent, and when the server returns the response it has no way of knowing what the user did before requesting page B.

Reasons for HTTP's statelessness:

  • HTTP was originally designed to provide a way to publish and receive HTML pages. At that time there was no dynamic page technology, only static HTML pages, so there was no need for the protocol to maintain state;
  • After receiving a response, the user often spends some time reading the page. If the connection between the client and the server were kept open, it would be idle most of the time, which is an unnecessary waste of resources. HTTP was therefore originally designed to use short connections by default: the client and the server close the TCP connection after completing one request and response. The server cannot predict the client's next action and does not even know whether the user will visit again, so there was no need for the HTTP protocol to maintain the user's access state;
  • Pushing some of the complexity into technologies built on top of HTTP keeps HTTP relatively simple at the protocol level, and this simplicity also makes HTTP more extensible. In fact, session technology is essentially an extension of the HTTP protocol.

All in all, HTTP's statelessness is dictated by its historical mission.

As the web developed, HTTP was faced with a difficult problem: how can a stateless protocol associate two consecutive requests? In other words, how can a stateless protocol satisfy the need for state?

By this point the need for state was unavoidable, and the statelessness of the protocol was already a done deal. Some solution was therefore needed to resolve this contradiction and maintain state across HTTP requests, and so cookies and sessions appeared.

Readers may have some questions about this part, so I will address two points first:

1. Statelessness and persistent connections

Some people may ask: the widely used HTTP/1.1 uses persistent (long) connections by default, so is it still stateless?

The connection method and the presence or absence of state are two completely unrelated things. State is, in a sense, data, while the connection method only determines how data is transmitted, not what the data is. Persistent connections are a reasonable performance optimization adopted as computer performance and network conditions improved. Web servers generally limit the number of persistent connections to avoid excessive resource consumption.

2. Statelessness and sessions

Sessions are stateful, while the HTTP protocol is stateless. Are the two contradictory?

Sessions and the HTTP protocol belong to different levels. HTTP belongs to the application layer, the highest layer of the OSI seven-layer model. A session is not part of HTTP: it is implemented by a specific dynamic page technology, but it is built on top of HTTP.

 

Cookie and Session

The ways of working around the statelessness of the HTTP protocol include cookies and sessions. Both can record state: cookies save the state data on the client side, and sessions save it on the server side.

Let's first look at how cookies work, which requires some basic knowledge of HTTP.

Cookies were first described in RFC 2109, which was obsoleted by RFC 2965, which was in turn obsoleted by RFC 6265. RFC 2109 asks each client to support at least 300 cookies in total, at least 20 cookies per domain name, and at least 4K per cookie. Actual limits vary by browser, and most browsers now allow more cookies per domain (Firefox, for example, allows 50). When using cookies, the most important thing is to control their size: do not store useless information, and do not store too much information.

No matter what server-side technology is used, as long as the HTTP response sent back contains a header of the following form, it is considered that the server requires a cookie to be set:

Set-Cookie: name=value; expires=date; path=path; domain=domain
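
As a rough illustration of the header above, here is a minimal Python sketch that builds a Set-Cookie value with the standard library's http.cookies module. The helper name build_set_cookie and the example values (theme, /shop, example.com) are assumptions made for this illustration, not part of any particular framework.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, path="/", domain=None, max_age=None):
    """Return the value part of a Set-Cookie header (hypothetical helper)."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = path
    if domain is not None:
        cookie[name]["domain"] = domain
    if max_age is not None:
        # Lifetime in seconds; leaving it out yields a cookie that
        # lasts only until the browser is closed.
        cookie[name]["max-age"] = max_age
    return cookie[name].OutputString()


header_value = build_set_cookie(
    "theme", "dark", path="/shop", domain="example.com", max_age=3600
)
print("Set-Cookie: " + header_value)
```

The server-side technology only has to emit this one header line; storing the cookie and sending it back is entirely the browser's job.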

Browsers that support cookies respond by creating the cookie and saving it, either in a file or only in memory. On every subsequent request, the browser checks which of its stored cookies have not yet expired (according to the expires attribute) and which match the requested URL (according to the domain and path attributes). Any cookies that qualify are added to the request header in the following form and sent back to the server:

Cookie: name="zj"; Path="/linkage"
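
On the server side, reading this header is mostly a matter of splitting it into name/value pairs. The sketch below does that with Python's http.cookies module; parse_cookie_header is a hypothetical helper name, and the second cookie (theme=dark) is an invented example value. Note that current browsers send only name=value pairs in the Cookie header, so that is what the sketch parses.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(header_value):
    """Turn the value of a Cookie request header into a plain dict."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    # Each entry is a Morsel; .value is the decoded (unquoted) cookie value.
    return {name: morsel.value for name, morsel in cookie.items()}


cookies = parse_cookie_header('name="zj"; theme=dark')
print(cookies)
```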

The dynamic script on the server parses the header and handles it accordingly. Of course, it can also choose to ignore it.

Note that users can disable cookies in their browsers for security reasons.

Now let's look at how sessions work:

The basic principle is that the server maintains a piece of session data for each session, and the client and the server use a globally unique identifier to look up that data. When a user accesses a web application, the server program decides when to create a session. Creating a session can be summarized in three steps:

  1. Generate a globally unique identifier (sessionid);
  2. Allocate storage for the session data. The data structure is usually created in memory, but in that case all session data is lost if the system loses power, which would have serious consequences for something like an e-commerce website. The data can instead be written to a file or stored in a database. This increases I/O overhead, but it gives the session a degree of persistence and makes it easier to share sessions;
  3. Send the session's globally unique identifier to the client.
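
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory dictionary SESSION_STORE, the cookie name sessionid, and the helper names create_session and load_session are all made up for this example, and a real framework would do considerably more.

```python
import secrets

# Hypothetical in-memory store: session id -> session data.
# Everything here is lost when the process exits.
SESSION_STORE = {}
SESSION_COOKIE_NAME = "sessionid"  # assumed name; frameworks choose their own


def create_session(initial_data=None):
    # Step 1: generate an unpredictable, practically unique identifier.
    session_id = secrets.token_hex(16)
    # Step 2: allocate storage for this session's data (in memory here).
    SESSION_STORE[session_id] = dict(initial_data or {})
    # Step 3: build the header value that hands the identifier to the client.
    set_cookie = f"{SESSION_COOKIE_NAME}={session_id}; Path=/; HttpOnly"
    return session_id, set_cookie


def load_session(session_id):
    """Look up session data from the identifier sent back by the client."""
    return SESSION_STORE.get(session_id)


session_id, set_cookie = create_session({"user": "alice"})
print("Set-Cookie: " + set_cookie)
print(load_session(session_id))
```

Swapping the dictionary for a file or database table changes only step 2; the identifier and the way it reaches the client stay the same.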

The crux of the problem is how the server sends this unique session identifier to the client. As far as the HTTP protocol is concerned, data can be placed in the request line, the header fields, or the body. Based on this, two methods are commonly used: cookies and URL rewriting.

1. Cookie

The server only needs to set the Set-Cookie header to transmit the session identifier to the client, and every subsequent request from the client will carry this identifier. Since a cookie can have an expiration time, the cookie carrying the session identifier is generally given an expiration time of 0, which means it is valid only for the lifetime of the browser process. Each browser handles this 0 in its own way, but the differences are small (they mostly show up when a new browser window is opened);

2. URL Rewriting

URL rewriting, as the name suggests, means rewriting URLs. Before returning the page the user requested, the server adds the session identifier to every URL in the page, either as a GET parameter or in another part of the URL such as the path info. After the user receives the response, whichever link he clicks or form he submits, the request carries the session identifier, and the session is maintained. Readers may find this cumbersome, and it is, but URL rewriting becomes the preferred option when cookies are disabled on the client side.
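
To make URL rewriting more concrete, here is a minimal Python sketch that appends a session identifier as a GET parameter. The function names add_session_id and rewrite_links, and the regular expression used to find links, are assumptions for this illustration; the regex is a simplification that only handles double-quoted href attributes and does not HTML-escape the result, so it is not a substitute for proper HTML handling.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_session_id(url, session_id, param="jsessionid"):
    """Return url with the session identifier added as a query parameter."""
    parts = urlsplit(url)
    # Drop any stale identifier, keep the other parameters in order.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Simplified: matches only href="..." attributes.
HREF_PATTERN = re.compile(r'href="([^"]*)"')


def rewrite_links(html, session_id):
    """Rewrite every matched link in a page so it carries the identifier."""
    return HREF_PATTERN.sub(
        lambda match: 'href="%s"' % add_session_id(match.group(1), session_id),
        html,
    )


page = '<a href="/cart?item=3">Cart</a> <a href="/index.jsp">Home</a>'
rewritten = rewrite_links(page, "abc123")
print(rewritten)
```

Because the identifier now appears in every URL, it also ends up in bookmarks, logs, and shared links, which is one more reason this approach is usually a fallback rather than the default.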

The following two figures are screenshots from the Firebug plugin for Firefox. In Figure 1, when I visit index.jsp for the first time, the response headers include a Set-Cookie header, but the request headers contain no Cookie header. In Figure 2, when I refresh the page, the response no longer has a Set-Cookie header, but the request headers now include a Cookie header. Note the name of the cookie: jsessionid, which as the name implies is the session identifier. The value of jsessionid is the same in both figures, so I will not explain further. Readers may also have seen URLs on some websites with jsessionid=xxx appended at the end; that is a session maintained through URL rewriting.

Figure 1: First request to index.jsp

Figure 2: Request index.jsp again

 

Because cookies and sessions are implemented differently, each has its own advantages, disadvantages, and application scenarios:

1. Application scenarios

The typical application scenario for cookies is the Remember Me service: the user's account information is stored on the client side in a cookie. When the user requests a matching URL again, the account information is sent to the server, and the corresponding program automatically completes the login. Other client-side information can also be saved this way, such as page layout and search history.

The typical application scenario for sessions is that after a user logs in to a website, his login information is put into the session, and each subsequent request looks up that login information to make sure the user is legitimate. Shopping carts are another classic scenario;

2. Security

Cookies store information on the client side. If the information is not encrypted, it will undoubtedly expose some private data, so security is very poor. Sensitive information is generally encrypted before being stored in a cookie, but the cookie itself can still easily be stolen.

Sessions store the information only on the server. If it is stored in a file or database it may also be stolen, but this is much less likely than with cookies. The most prominent session security issue is session hijacking, which will be explained in more detail below. Generally speaking, sessions are more secure than cookies;

3. Performance

Cookies are stored on the client side and consume client-side I/O and memory, while sessions are stored on the server side and consume server-side resources. The load that sessions put on the server is relatively concentrated, whereas cookies spread the resource consumption across clients. In this respect, cookies are better than sessions;

4. Timeliness

Cookies can be kept on the client side for a long time by setting a validity period, while a session generally has only a relatively short validity period (it ends when the user actively destroys it, or when it times out after the user closes the browser);

5. Others

Cookies are not as convenient to work with in development as sessions. Moreover, the number and size of cookies are limited on the client side, while the size of a session is limited only by hardware, so a session can undoubtedly store far more data.
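
The short validity period mentioned under Timeliness is typically enforced on the server with an idle timeout. The sketch below shows one way this could look in Python. The class name SessionStore, the 30-minute default, and the injectable clock parameter are all assumptions for this illustration rather than any framework's actual behavior.

```python
import time

SESSION_TTL_SECONDS = 30 * 60  # assumed idle timeout of 30 minutes


class SessionStore:
    """In-memory sessions that expire after a period of inactivity."""

    def __init__(self, ttl=SESSION_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so the timeout can be tested
        self._sessions = {}  # session id -> (last access time, data)

    def put(self, session_id, data):
        self._sessions[session_id] = (self._clock(), data)

    def get(self, session_id):
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        last_access, data = entry
        if self._clock() - last_access > self._ttl:
            # Idle for too long: treat the session as timed out.
            del self._sessions[session_id]
            return None
        # Each successful access restarts the idle timer.
        self._sessions[session_id] = (self._clock(), data)
        return data

    def destroy(self, session_id):
        """Explicit destruction, for example when the user logs out."""
        self._sessions.pop(session_id, None)
```

With this design the server never learns that the browser was closed; the session simply stops being accessed and is discarded once the timeout has passed.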
