Understand the nature of Session

Most web applications cannot do without sessions. This article draws on PHP and the HTTP protocol to analyze how to build a secure session management mechanism.

Let's first briefly review HTTP to understand the stateless nature of the protocol, then cover the basics of cookies. Finally, I will explain step by step how to use some simple and effective methods to improve the security and stability of your PHP application.

I think most junior PHP programmers assume that PHP's default session mechanism is secure, but the opposite is true: the PHP team only provides a convenient session solution for programmers to use. Security is the responsibility of the application development team. Because there are many possible approaches, there is no best one, only better ones. Attacks keep changing, and defenders must keep changing their tactics too. For that reason, I personally think the PHP team's approach is quite wise.

Stateless

HTTP is a stateless protocol: it does not require the browser to identify itself in each request, and the protocol itself does not tie one request to the next as a user moves between pages. When a user visits a site, the user's browser sends an HTTP request to the server, and the server returns an HTTP response to the browser. It is a very simple concept: the client makes a request and the server responds. That is the entire communication process under HTTP.

Because web applications communicate over HTTP, and HTTP is stateless, maintaining application state is harder, which is a big challenge for developers. Cookies were created as an extension of HTTP, mainly to make up for its stateless nature by providing a way to maintain state between the client and the server. However, some users disable cookies in their browsers for security reasons. In that case, state information can only be passed to the server through parameters in the URL, which is much less secure. Intuitively, the client should identify itself so that it can maintain state with the server, but for security reasons we must all understand one thing: information from the client cannot be fully trusted.

Even so, there are fairly elegant solutions to the problem of maintaining state in web applications. There is no perfect solution, though, and even the best one does not fit every situation. This article introduces some techniques that can be used to maintain application state more reliably and to resist attacks against sessions, such as session hijacking. Along the way you will learn how cookies work, what PHP sessions do, and how sessions get hijacked.

HTTP overview

How can we maintain the state of the web application and choose the most suitable solution? Before answering this question, we must first understand the underlying protocol of the web – Hypertext Transfer Protocol (HTTP).

When a user visits http://example.org, the browser establishes a TCP/IP connection with the server and sends an HTTP request to port 80 of the example.org server. The request looks like this:

    GET / HTTP/1.1
    Host: example.org

The first line is called the request line. Its second element (a forward slash in this example) is the path of the requested resource. The slash represents the document root; the server maps it to a specific directory in its file system.

Apache users typically set this path with the DocumentRoot directive. If the requested URL is http://example.org/path/to/script.php, the requested path is /path/to/script.php. If the document root is defined as /usr/local/apache/htdocs, the full path of the requested resource is /usr/local/apache/htdocs/path/to/script.php.

The second line is an HTTP header. The header in this example is Host, which identifies the domain name the browser wants to fetch the resource from. An HTTP request can carry many other headers, such as User-Agent. In PHP, request headers are available through $_SERVER; for example, $_SERVER['HTTP_USER_AGENT'] holds the User-Agent header.
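As a rough illustration of that naming rule, here is a minimal sketch of a header lookup. The function requestHeader() is a made-up helper, not part of PHP, and a plain array stands in for $_SERVER so the example can run outside a web server.

```php
<?php
// Hypothetical helper: look up a request header in a $_SERVER-style array.
// PHP exposes most request headers as HTTP_<NAME>, upper-cased, with dashes
// replaced by underscores (User-Agent becomes HTTP_USER_AGENT).
function requestHeader(array $server, string $name): ?string
{
    $key = 'HTTP_' . strtoupper(str_replace('-', '_', $name));
    return isset($server[$key]) ? (string) $server[$key] : null;
}

// Stand-in for $_SERVER (assumed values for the example).
$server = [
    'HTTP_HOST' => 'example.org',
    'HTTP_USER_AGENT' => 'Mozilla/5.0',
];

$host = requestHeader($server, 'Host');
$agent = requestHeader($server, 'User-Agent');
$language = requestHeader($server, 'Accept-Language'); // not sent, so null
```

Returning null for a missing header matters later on: optional headers such as User-Agent may simply be absent, so code should not assume they exist.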

Unfortunately, nothing in this example request uniquely identifies the client. Some developers use the IP address the request comes from to identify the client, but this approach has many problems. Some users connect through a proxy: if user A reaches www.example.com through proxy B, the server sees proxy B's IP address rather than A's. If the user disconnects and later reconnects through a different proxy, that address changes, so one user corresponds to several IP addresses, and a server that identifies users by IP address will treat these requests as coming from different users when they actually come from the same one. The opposite also happens: when many users on the same LAN reach the Internet through one router and all visit www.example.com, they share a single external IP address, so the server will treat their requests as coming from the same user.

The first step in maintaining application state is knowing how to uniquely identify each client. Because only the information carried in the HTTP request can be used to identify the client, the request must contain some kind of information that can be used to uniquely identify the client. Cookies are designed to solve this problem.

Cookies

Cookies are much easier to understand if you think of them as an extension of the HTTP protocol, which is essentially what they are. Two HTTP headers are responsible for setting and sending cookies: Set-Cookie and Cookie. When the server's HTTP response contains a Set-Cookie header, it instructs the client to create a cookie and to send that cookie back automatically in subsequent HTTP requests until the cookie expires. If the cookie's lifetime is the browser session, the browser keeps it in memory and clears it when the browser is closed. Otherwise the cookie is stored on the client's disk and survives closing the browser; the next time the browser visits the corresponding website, the cookie is sent to the server again. Setting and sending a cookie takes the following four steps:

The client sends an HTTP request to the server

The server sends an HTTP response to the client, which contains the Set-Cookie header

The client sends an HTTP request to the server, which contains the Cookie header

The server sends an HTTP response to the client

This communication process can also be described by the following diagram:

The Cookie header in the client's second request gives the server information it can use to identify the client. At this point the server can also tell whether the client has cookies enabled. A user could disable cookies midway through interacting with the application, but this is so unlikely that it can be ignored, which practice has borne out.
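To make steps 2 and 3 more concrete, here is a small sketch of the two headers involved. PHP normally parses the Cookie header for you and exposes the result as $_COOKIE; parseCookieHeader() below is a hypothetical helper written only for illustration, and the theme cookie is an invented example value.

```php
<?php
// Hypothetical helper: split a Cookie request header into name => value pairs.
function parseCookieHeader(string $header): array
{
    $cookies = [];
    foreach (explode(';', $header) as $pair) {
        $parts = explode('=', trim($pair), 2);
        if (count($parts) === 2 && $parts[0] !== '') {
            $cookies[$parts[0]] = urldecode($parts[1]);
        }
    }
    return $cookies;
}

// Step 2: a header the server could send to create a session cookie.
$setCookie = 'Set-Cookie: PHPSESSID=12345; Path=/; HttpOnly';

// Step 3: the header value the browser sends back on the next request.
$cookies = parseCookieHeader('PHPSESSID=12345; theme=dark');
$sessionId = $cookies['PHPSESSID'] ?? null;
```

If the second request arrives without the cookie, $sessionId is null, which is how the server can tell that the client did not accept or return the cookie.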

GET and POST Data

In addition to cookies, the client can also include the data sent to the server in the requested url, such as the requested parameters or the requested path. Let's look at an example:

    GET /index.php?foo=bar HTTP/1.1
    Host: example.org

The above is a regular HTTP GET request, sent to the index.php script on the web server for the example.org domain. In index.php, the value of the foo parameter in the URL, which is 'bar', can be read from $_GET['foo']. Most PHP developers call such data GET data; a few call it query data or URL variables. Note, however, that GET data is not limited to HTTP GET requests. An HTTP POST request can also carry GET data, as long as the data is included in the request URL. In other words, GET data does not depend on the request method.

Another way for the client to pass data to the server is to include it in the body of the HTTP request. This method requires the request type to be POST; see the following example:

    POST /index.php HTTP/1.1
    Host: example.org
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 7

    foo=bar

In this case, the script index.php can read the value bar from $_POST['foo']. Developers call this POST data; it is how a browser submits a form whose method is post.

Both forms of data can be included in a request:

    POST /index.php?myget=foo HTTP/1.1
    Host: example.org
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 10

    mypost=bar
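The following sketch shows roughly how such a request breaks down into GET data and POST data. PHP does this work for you and fills $_GET and $_POST; splitRequestData() is a hypothetical function that only handles the simple urlencoded case shown here.

```php
<?php
// Sketch: split a raw HTTP request into its "GET data" and "POST data".
function splitRequestData(string $rawRequest): array
{
    // Headers and body are separated by one empty line.
    [$head, $body] = array_pad(explode("\r\n\r\n", $rawRequest, 2), 2, '');
    $requestLine = explode("\r\n", $head)[0];
    [$method, $target] = explode(' ', $requestLine);

    // GET data comes from the URL, whatever the request method is.
    $get = [];
    $query = parse_url($target, PHP_URL_QUERY);
    if (is_string($query)) {
        parse_str($query, $get);
    }

    // POST data comes from the request body.
    $post = [];
    if ($method === 'POST') {
        parse_str($body, $post);
    }

    return ['get' => $get, 'post' => $post];
}

$raw = "POST /index.php?myget=foo HTTP/1.1\r\n"
     . "Host: example.org\r\n"
     . "Content-Type: application/x-www-form-urlencoded\r\n"
     . "Content-Length: 10\r\n"
     . "\r\n"
     . "mypost=bar";

$data = splitRequestData($raw);
```

Here $data['get'] holds myget and $data['post'] holds mypost, which mirrors what index.php would see in $_GET and $_POST. Content-Length is the byte length of the body, which is 10 for mypost=bar.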

These two ways of passing data are more reliable than cookies, because cookies may be disabled, whereas GET and POST data are always available. For example, we can include the PHPSESSID in the URL of the HTTP request:

    GET /index.php?PHPSESSID=12345 HTTP/1.1
    Host: example.org

Passing the session ID this way achieves the same effect as passing it in the Cookie header, but the developer has to append the session ID to every URL or add it to every form as a hidden field. Cookies need no such work: once the server has instructed the client to create a cookie, the client automatically sends the unexpired cookie in subsequent requests. PHP can also append the session ID to URLs and forms automatically when session.use_trans_sid is enabled, but this option is not recommended because it makes the session ID easy to leak. For example, users may bookmark or share a URL, which exposes the session ID. If that session ID has not expired, this is a security problem unless the server has some way other than the session ID to verify the user's legitimacy.
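If you want to be sure the session ID never travels in a URL, one option is to set the relevant session options explicitly rather than rely on the server's defaults. The sketch below is only a suggestion: the three setting names are real PHP options, but cookieOnlySessionSettings() is a hypothetical helper and the chosen values are my assumption about what a cookie-only setup looks like.

```php
<?php
// Sketch: session settings that keep the session ID out of URLs.
function cookieOnlySessionSettings(): array
{
    return [
        'session.use_trans_sid'    => '0', // never rewrite URLs or forms
        'session.use_only_cookies' => '1', // ignore session IDs found in URLs
        'session.use_cookies'      => '1', // carry the ID in a cookie
    ];
}

// Apply the settings. This has to run before session_start().
foreach (cookieOnlySessionSettings() as $name => $value) {
    ini_set($name, $value);
}
```

The same values can be placed in php.ini instead; setting them in code simply makes the application's expectations visible.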

Passing the session ID via POST is much safer than via GET, but it is more cumbersome, because turning every request in your application into a POST request is clearly not practical.

Session management

So far I have only discussed how to maintain application state, and only briefly how to relate one request to the next. Next, I will explain the technique used more often in practice: session management. Session management is not just about maintaining state between requests; it also maintains the data kept for each specific user during the session. We usually call this session data, because it is associated with the session between a specific user and the server. If you use PHP's built-in session management, session data is generally saved in a server-side directory such as /tmp and is made available automatically through the superglobal array $_SESSION. One of the simplest uses of sessions is to pass session data from one page to another (note: what is actually passed is the session ID). Sample code 1, start.php, demonstrates this:

    <?php
    // Resume the current session, or start a new one
    session_start();
    // Store a value in the session data
    $_SESSION['foo'] = 'bar';
    ?>
    <a href="continue.php">continue.php</a>

If the user clicks the link in start.php to access continue.php, then in continue.php, the value 'bar' defined in start.php can be obtained through $_SESSION['foo']. See sample code 2 below:

Sample Code 2 – continue.php

    <?php
    // Resume the session identified by the session ID the client sent
    session_start();
    echo $_SESSION['foo']; /* bar */
    ?>

Simple, isn't it? But I want to point out that if you really write code like this, you do not yet thoroughly understand how PHP implements sessions under the hood. If you don't know how much PHP does for you automatically, such code becomes hard to debug when something goes wrong. In fact, such code is not secure at all.
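To give a feel for how much is automated, here is a deliberately simplified sketch of the bookkeeping behind session_start(). It is not PHP's real implementation: the store is an in-memory array instead of files, and startSession() is a hypothetical name. Rejecting unknown IDs mirrors session.use_strict_mode=1; with the default setting PHP would adopt an ID supplied by the client.

```php
<?php
// Simplified illustration of what session_start() automates.
function startSession(array $cookies, array &$store): array
{
    $id = $cookies['PHPSESSID'] ?? null;

    // No ID, or an ID the server does not know: create a fresh session.
    if ($id === null || !isset($store[$id])) {
        $id = bin2hex(random_bytes(16));
        $store[$id] = [];
    }

    // The caller would send "Set-Cookie: PHPSESSID=$id" and use 'data'
    // the way a script uses $_SESSION.
    return ['id' => $id, 'data' => $store[$id]];
}

$store = [];

// Request 1 (start.php): no cookie yet, so a session is created.
$first = startSession([], $store);
$store[$first['id']]['foo'] = 'bar';

// Request 2 (continue.php): the browser returns the cookie.
$second = startSession(['PHPSESSID' => $first['id']], $store);
$foo = $second['data']['foo'];
```

Notice that the only thing linking the two requests is the ID in the cookie. Everything that follows about session security comes down to who is able to present that ID.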

Session security issues

Many developers believe that PHP's built-in session management provides a degree of security and can fend off common session attacks. This is a misunderstanding: the PHP team only implemented a convenient and efficient mechanism. Specific security measures must be implemented by the application's development team. As mentioned at the beginning, there is no best solution, only the one that suits you best.

Now, let's look at a fairly common attack on sessions:

A user visits http://www.example.org and logs in.

The example.org server instructs the client to set the cookie PHPSESSID=12345.

The attacker then visits http://www.example.org/ and sends the same cookie, PHPSESSID=12345, in the request.

Because the example.org server uses PHPSESSID to identify the user, it mistakes the attacker for the legitimate user.

For a description of the entire process, please see the example diagram below:

Of course, this attack requires the attacker to fix, hijack or guess a legitimate user's PHPSESSID by some means. That may seem difficult, but it is not impossible.
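The attack can be reduced to a few lines. In this illustration the server identifies users purely by session ID; the session store, the user name and currentUser() are all made up for the example.

```php
<?php
// Illustration only: a server that trusts the session ID and nothing else.
$sessions = ['12345' => ['user' => 'alice', 'logged_in' => true]];

function currentUser(array $cookies, array $sessions): ?string
{
    $id = $cookies['PHPSESSID'] ?? '';
    return $sessions[$id]['user'] ?? null;
}

// The legitimate user sends the cookie the server gave her.
$victim = currentUser(['PHPSESSID' => '12345'], $sessions);

// The attacker sends the same cookie value from another machine.
$attacker = currentUser(['PHPSESSID' => '12345'], $sessions);

// Someone who does not know a valid ID gets nothing.
$stranger = currentUser(['PHPSESSID' => '99999'], $sessions);
```

From the server's point of view the first two requests are indistinguishable, which is why the remaining sections look for additional evidence to check.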

Enhanced security

Many techniques can be used to strengthen session security. The main idea is to keep verification as simple as possible for legitimate users while making it as complicated as possible for attackers. This is a difficult balance to strike, and it depends on the specific design of your application.

The simplest HTTP/1.1 request consists of a request line and a Host header:

    GET / HTTP/1.1
    Host: example.org

If the client passes the session identifier as PHPSESSID, it can be sent in the Cookie header:

    GET / HTTP/1.1
    Host: example.org
    Cookie: PHPSESSID=12345

Similarly, the client can also pass the session identifier in the requested URL:

    GET /?PHPSESSID=12345 HTTP/1.1
    Host: example.org

Of course, the session identifier can also be included in the POST data, but this has an impact on the user experience, so this method is rarely used.

Because information at the TCP/IP level cannot be fully trusted, it is not appropriate for web developers to rely on it to enhance security. However, an attacker must still present a legitimate user's unique identifier in order to enter the system as that user. The only effective way to protect the system therefore seems to be to hide the session identifier as well as possible or make it hard to guess, preferably both.

PHP automatically generates a random session ID that is practically impossible to guess, so this aspect is reasonably safe. Preventing an attacker from obtaining a legitimate session ID is much harder, however, and is largely outside the developer's control.

In fact, many situations can lead to session ID disclosure. If the session ID is passed through GET data, this sensitive identifier can easily be exposed, because users may cache or bookmark the link containing the session ID, or send it in an email. Cookies are a relatively safe mechanism, but users can disable them in the client. Some versions of IE have also had serious security holes; the best known one could leak cookies to malicious sites.

Therefore, as a developer, you can be sure that the session ID cannot be guessed, but it may still be obtained by an attacker using some methods. So, some extra security measures must be taken to prevent such situations from happening in your application.

In fact, a standard HTTP request includes some optional headers in addition to the mandatory headers such as Host. For an example, look at the following request:

    GET / HTTP/1.1
    Host: example.org
    Cookie: PHPSESSID=12345
    User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1
    Accept: text/html;q=0.9, */*;q=0.1
    Accept-Charset: ISO-8859-1, utf-8;q=0.66, *;q=0.66
    Accept-Language: en

The request above contains four additional headers: User-Agent, Accept, Accept-Charset and Accept-Language. Because these headers are optional, it is unwise for your application to depend entirely on them. However, if a user's browser does send these headers, it will almost always send the same headers in the next request from the same user and browser, with very few exceptions. Assuming the example above is a request from a user who has an established session with the server, consider the following request:

    GET / HTTP/1.1
    Host: example.org
    Cookie: PHPSESSID=12345
    User-Agent: Mozilla/5.0

Because the same session id is included in the Cookie header of the request, the same php session will be accessed. However, the User-Agent header in the request is different from the information in the previous request. Can the system assume that the two requests are sent by the same user?

In this case, if the browser's headers have changed but it is unclear whether the request comes from an attacker, a good measure is to prompt the user to enter their password. The impact on user experience is small, and it effectively prevents attacks.

Of course, you can add code to the system that checks the User-Agent header, similar to sample code 3:

Sample Code 3

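    <?php
    session_start();
    // Compare the stored hash with the hash of the current User-Agent
    $agent = $_SERVER['HTTP_USER_AGENT'] ?? '';
    if (!isset($_SESSION['HTTP_USER_AGENT'])
        || md5($agent) !== $_SESSION['HTTP_USER_AGENT']) {
        /* prompt for the password */
        exit;
    }
    ?>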

Of course, you must first hash the User-Agent value with MD5 and store it in the session when the session is initialized on the first request, as in sample code 4:

Sample Code 4

    <?php
    session_start();
    // Store a hash of the User-Agent when the session is first set up
    $_SESSION['HTTP_USER_AGENT'] = md5($_SERVER['HTTP_USER_AGENT'] ?? '');
    ?>

Hashing the User-Agent with MD5 is not strictly necessary, but it means the $_SERVER['HTTP_USER_AGENT'] data does not need to be filtered, because the hash is always a fixed-format string. Otherwise the data must be filtered before use, because no data from the client can be trusted.
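The two samples above can be folded into a pair of small functions. The sketch below is one possible arrangement, not the article's code: the function names are hypothetical, plain arrays stand in for $_SESSION and $_SERVER so it runs anywhere, and I have substituted SHA-256 and hash_equals() for md5() and a plain comparison.

```php
<?php
// Sketch: store and verify a User-Agent fingerprint.
function fingerprint(array $server): string
{
    // Hashing gives a fixed-length hex string whatever the client sent.
    return hash('sha256', $server['HTTP_USER_AGENT'] ?? '');
}

function rememberAgent(array &$session, array $server): void
{
    $session['HTTP_USER_AGENT'] = fingerprint($server);
}

function agentMatches(array $session, array $server): bool
{
    return isset($session['HTTP_USER_AGENT'])
        && hash_equals($session['HTTP_USER_AGENT'], fingerprint($server));
}

// First request: remember the browser.
$session = [];
rememberAgent($session, ['HTTP_USER_AGENT' => 'Mozilla/5.0 (Macintosh) Firefox/2.0.0.1']);

// Later requests: same browser, then a different one.
$same = agentMatches($session, ['HTTP_USER_AGENT' => 'Mozilla/5.0 (Macintosh) Firefox/2.0.0.1']);
$changed = agentMatches($session, ['HTTP_USER_AGENT' => 'Mozilla/5.0']);
```

When agentMatches() returns false, the application would ask for the password again rather than silently continue the session.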

Once you check the User-Agent header, an attacker must complete two steps to hijack a session:

Obtain a valid session ID

Send an identical User-Agent header in the forged request

You may object that an attacker capable of obtaining a valid session ID will have little trouble forging the same User-Agent. True, but this at least makes the attack more troublesome and increases the security of the session mechanism to some extent.

You have probably thought of this already: since checking the User-Agent header improves security, we might as well combine several pieces of header information to generate a hashed token, and have the client carry this token in subsequent requests. It then becomes practically impossible for an attacker to guess how the token was generated. This is like paying by card in a supermarket: you must have the card (like the session ID) and you must also enter the PIN (like the token). Only when both match can the payment go through. Look at the following code:

    <?php
    session_start();
    // Combine a fixed string, the User-Agent and the session ID into a token
    $token = 'SHIFLETT' . ($_SERVER['HTTP_USER_AGENT'] ?? '');
    $_SESSION['token'] = md5($token . session_id());
    ?>

Note: The Accept header should not be used to generate tokens, because some browsers change this header when the user refreshes the page.

After adding this hard-to-guess token to your authentication mechanism, security improves considerably. If the token is passed in the same way as the session ID, an attacker must complete three steps to hijack the user's session:

Obtain a valid session ID

Send the same User-Agent header in the request, since it is used to generate the token

Send the victim's token in the request

There is a problem here. If the session ID and the token are both passed through GET data, an attacker who can obtain the session ID can also obtain the token. A safer approach is to pass them through two different channels, for example the session ID in a cookie and the token in GET data. Then an attacker who obtains the session ID by some means is unlikely to obtain the token at the same time, so the session remains relatively safe.
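Here is a minimal sketch of that split, with the session ID assumed to travel in the cookie and only the token placed in the URL. The function names are hypothetical, arrays stand in for $_SESSION and $_GET, and the token recipe simply follows the sample above.

```php
<?php
// Sketch: session ID in the cookie, token in the URL.
function makeToken(string $userAgent, string $sessionId): string
{
    $salt = 'SHIFLETT'; // fixed string from the sample; pick your own secret
    return md5($salt . $userAgent . $sessionId);
}

// Append the token to a link so the next request carries it as GET data.
function urlWithToken(string $path, string $token): string
{
    return $path . '?' . http_build_query(['token' => $token]);
}

// Check the token from the URL against the one stored in the session.
function tokenIsValid(array $session, array $get): bool
{
    return isset($session['token'], $get['token'])
        && is_string($get['token'])
        && hash_equals($session['token'], $get['token']);
}

$session = ['token' => makeToken('Mozilla/5.0', '12345')];
$link = urlWithToken('/continue.php', $session['token']);

$ok = tokenIsValid($session, ['token' => $session['token']]);
$missing = tokenIsValid($session, []);
$forged = tokenIsValid($session, ['token' => 'guess']);
```

A request that presents the right cookie but no token, or a wrong one, fails the check, so stealing the cookie alone is no longer enough.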

Many other techniques can strengthen the security of your session mechanism. I hope that, with a general understanding of how sessions work internally, you can design an authentication mechanism suited to your application and thereby greatly improve its security. After all, you are among the developers most familiar with the system you are building, and you can implement additional, application-specific security measures as the situation requires.

Summary

The above is only a general description of how sessions work, together with a brief description of some security measures. Keep in mind that these methods enhance security; they do not completely protect your system. I hope readers will investigate the topic further on their own. I believe that in the process you will arrive at a solution that works well in practice.

Origin blog.csdn.net/2301_77162959/article/details/130971405