File Operation Security - Principles of File Parsing

This section, part of my column "Web Security Principles and Interpretation of Multiple Defense Methods", explains file parsing in detail.

The file parsing discussed in this article is file parsing in the web context. Visiting a website, or a specific page of it, is really a request to the site's back end for a particular file. Figure 1 shows a visit to the Baidu homepage:
[Figure 1: visiting the Baidu homepage]

In Figure 1 no file name is given. In web development the home page usually does not need one, because the back end routes the request to a default file such as index.html. In other words, the back end resolves each front-end request to a file, and some files also accept parameters passed from the front end and respond accordingly. This is the most basic file parsing process.
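The default routing described above can be sketched as follows. This is only an illustrative model: the `DEFAULT_INDEX` constant and the `resolve_path` function are hypothetical names, and a real server's lookup rules are configurable and more involved.

```python
import posixpath

# Assumed default document name; real servers let the administrator configure this.
DEFAULT_INDEX = "index.html"


def resolve_path(url_path):
    """Map a request path to the file that would be served by default."""
    if url_path.endswith("/"):
        # A directory-style request falls back to the default document.
        return posixpath.join(url_path, DEFAULT_INDEX)
    return url_path


print(resolve_path("/"))            # the home page request
print(resolve_path("/about.html"))  # an explicit file is left unchanged
```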

How file parsing vulnerabilities arise

A file parsing vulnerability is a design or implementation flaw in how the web back end parses files. Put simply, a file that is not a script ends up being interpreted as one. In modern web software, file parsing is handled by the web server, so these vulnerabilities cluster in web servers such as Apache, Nginx, and Microsoft IIS. Each of these popular servers has had several file parsing vulnerabilities over the years, for example:

Apache
Apache once parsed 1.php.xxx as PHP. The main reason is that Apache resolves suffixes from right to left: an unrecognized suffix is skipped and the next one to its left is tried.
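The right-to-left fallback described above can be modelled in a few lines. This is a simplified sketch of the behaviour, not Apache's code; the `KNOWN_HANDLERS` table and the `resolve_handler` function are made up for illustration.

```python
# Hypothetical stand-in for the server's suffix-to-handler mappings.
KNOWN_HANDLERS = {"php": "php-script", "html": "static", "jpg": "static"}


def resolve_handler(filename):
    """Walk the dot-separated suffixes from right to left and return the
    handler of the first recognized one, or None if none is recognized."""
    parts = filename.split(".")
    for suffix in reversed(parts[1:]):  # parts[0] is the base name
        handler = KNOWN_HANDLERS.get(suffix.lower())
        if handler is not None:
            return handler
    return None


print(resolve_handler("1.php.xxx"))  # "xxx" is unknown, so "php" decides
print(resolve_handler("1.jpg"))
```

In this model an upload filter that only looks at the last suffix sees "xxx", while the handler lookup settles on "php".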

Nginx
Nginx once parsed the file 1.jpg%00.php as PHP. At first glance this looks harmless, since the name ends in .php. But %00 decodes to the byte with value 0, which terminates the file name when the file is looked up, so the file that actually gets parsed is 1.jpg.
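The mismatch between the two views of the name can be illustrated as follows. The sketch assumes a suffix check that looks at the whole decoded byte string and a file lookup that, like C string handling, stops at the first 0 byte; both function names are hypothetical.

```python
from urllib.parse import unquote_to_bytes


def suffix_seen_by_check(raw_name):
    """Suffix as seen by a check that inspects the whole byte string."""
    return raw_name.rsplit(b".", 1)[-1]


def name_seen_by_c_code(raw_name):
    """Name as seen by C-style code that stops at the first 0 byte."""
    return raw_name.split(b"\x00", 1)[0]


raw = unquote_to_bytes("1.jpg%00.php")  # %00 decodes to a single 0 byte
print(suffix_seen_by_check(raw))  # the suffix check sees a PHP file
print(name_seen_by_c_code(raw))   # the file lookup sees 1.jpg
```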

IIS
On IIS, files inside a folder whose name ends in a script suffix were parsed as scripts (the well-known IIS 6 case involves folders named like x.asp). IIS also parsed 1.asp;.xxx as ASP, mainly because it ignored everything after the semicolon.
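The semicolon case can be sketched the same way. This is a toy model of the behaviour described above, not IIS code, and both function names are hypothetical.

```python
def naive_suffix(filename):
    """Suffix an upload filter might see: the text after the last dot."""
    return filename.rsplit(".", 1)[-1] if "." in filename else ""


def suffix_ignoring_semicolon(filename):
    """Suffix seen by a parser that drops everything from the first ';' on."""
    return naive_suffix(filename.split(";", 1)[0])


name = "1.asp;.xxx"
print(naive_suffix(name))               # what the filter checks
print(suffix_ignoring_semicolon(name))  # what the parser acts on
```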

Most of the vulnerabilities above come down to design mistakes by programmers, such as an incorrect way of extracting the file suffix. Some of them were assigned CVE numbers and others were not. Vulnerabilities without a CVE are easily overlooked, which leaves the affected assets at risk.

Examples of file parsing vulnerabilities

File parsing is a low-level function handled mainly by the web server. Most of these vulnerabilities therefore live in the web server, and historically there have not been many of them. But once one appears, its impact is very wide. A parsing vulnerability usually has little direct effect on its own, so it is typically combined with file upload, directory traversal, file inclusion, or similar vulnerabilities to cause more serious harm. Two vulnerabilities with CVE numbers are introduced below.

CVE-2013-4547

This is an early file parsing bypass in the Nginx server that affects PHP services. When Nginx parses the file path in an HTTP request, a path containing a byte with value 0 can bypass its checks, so that a chosen file is parsed as PHP. Figure 2 shows the NVD entry:
[Figure 2: NVD entry for CVE-2013-4547]

Because this is a vulnerability in Nginx itself, which is basic service software, it can affect any website that hosts PHP on an affected Nginx version. Its consequence is that an arbitrary file can be parsed as PHP. There are many ways to plant a file with a non-PHP suffix on a server, such as SQL injection or file upload, and combining one of them with this vulnerability yields a webshell. Judging by its impact, how low in the stack Nginx sits, and the possible consequences, this is a very serious vulnerability. The analysis below is based on the public POC and the patch released by Nginx. The exploit (exp) is shown in Figure 3:
[Figure 3: exploit for CVE-2013-4547]

The POC in Figure 3 shows that exploitation takes two steps. The first is to upload a file such as 1.gif. Its suffix is not a sensitive one, so the upload is relatively easy; the file could also be written through SQL injection. The second is to construct a special path. The key is that the path contains a byte with value 0, which causes 1.gif to be parsed as a PHP file. As long as 1.gif contains a one-line webshell, the attacker obtains a webshell.

The patch for the vulnerability is shown in Figure 4:
[Figure 4: patch for CVE-2013-4547]

From the patch in Figure 4 alone it may be hard to see the root cause; it has to be read together with the surrounding code. First, the patched function is ngx_http_parse_request_line. As the name suggests, it parses the HTTP request line, which is shown in Figure 5:
[Figure 5: the HTTP request line]

The HTTP request line has three parts separated by spaces, so Nginx also uses spaces to tell the HTTP method, the URI, and the HTTP version apart. Because the text can be complex, ngx_http_parse_request_line uses the following enumeration to track the state of each parsing step (see ngx_http_parse.c):

enum {
    sw_start = 0,
    sw_method,
    sw_spaces_before_uri,
    sw_schema,
    sw_schema_slash,
    sw_schema_slash_slash,
    sw_host_start,
    sw_host,
    sw_host_end,
    sw_host_ip_literal,
    sw_port,
    sw_host_http_09,
    sw_after_slash_in_uri,
    sw_check_uri,
    sw_check_uri_http_09,
    sw_uri,
    sw_http_09,
    sw_http_H,
    sw_http_HT,
    sw_http_HTT,
    sw_http_HTTP,
    sw_first_major_digit,
    sw_major_digit,
    sw_first_minor_digit,
    sw_minor_digit,
    sw_spaces_after_digit,
    sw_almost_done
} state;

Here sw_check_uri is the state in which special characters in the URI are checked, and sw_check_uri_http_09 is the state, entered after a space, in which the HTTP version information is checked. Normally a space inside a URI must be encoded as %20 so that it cannot be confused with the spaces that separate the parts of the request line. In the POC above the space is not encoded; the raw byte 0x20 is sent instead. On reaching that space the parser therefore enters the HTTP version check state, sw_check_uri_http_09, as shown in Figure 6:

[Figure 6: the sw_check_uri_http_09 branch]
In Figure 6, ch is 0 at this point, so the default branch is taken and the parser returns to sw_check_uri on the next iteration, as shown in Figure 7:

[Figure 7: the sw_check_uri branch]

In Figure 7, ch is now a dot, so the parser goes on to extract the file suffix, which means the file will be treated as a PHP file. Before the fix, then, the bypass works by following an unencoded space with a byte of value 0. The fix in Figure 4 steps the pointer back by one, so that when the code in Figure 7 runs, ch is still the 0 byte and parsing fails with an error. Nginx had in fact considered truncation by a 0 byte: without the space, a 0 byte makes URI parsing fail. What it had not considered is a 0 byte that follows the space used as the request-line separator.

A natural question is whether, since the crux is the ambiguity of the space, a space combined with some other special character would work as well. Such a request can reach the PHP handling path, but it is of less use afterwards. Because a 0 byte marks the end of a string in C, only the characters before it are used when the file is actually accessed, and that is what allows a file uploaded with a non-PHP suffix to be the one that is executed.
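The interaction between the two states can be reproduced with a toy model. The sketch below is not nginx's code: it models only the space, 0-byte, and "any other character" branches of sw_check_uri and sw_check_uri_http_09, and the `parse_uri` function and its `patched` flag are hypothetical names used to contrast the behaviour before and after the fix.

```python
CHECK_URI = "sw_check_uri"
CHECK_URI_HTTP_09 = "sw_check_uri_http_09"


def parse_uri(uri, patched):
    """Return "accepted" or "rejected" for the raw bytes of a request URI."""
    state = CHECK_URI
    i = 0
    while i < len(uri):
        ch = uri[i]
        if state == CHECK_URI:
            if ch == 0:
                return "rejected"          # a 0 byte is invalid in this state
            if ch == 0x20:
                state = CHECK_URI_HTTP_09  # a space may end the URI
        else:
            if ch != 0x20:
                # Default branch: the space belonged to the URI after all.
                state = CHECK_URI
                if patched:
                    i -= 1                 # re-examine ch in sw_check_uri
        i += 1
    return "accepted"


poc = b"/1.gif \x00.php"  # unencoded space followed by a 0 byte
print(parse_uri(poc, patched=False))  # the 0 byte slips through
print(parse_uri(poc, patched=True))   # the 0 byte is checked and refused
```

Without the preceding space, the 0 byte is refused in both variants of the model, which matches the observation that only the combination of space and 0 byte had been overlooked.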

This also shows that the number of state combinations around sw_check_uri_http_09 is very large, too large to cover with hand-designed test cases. This remains an open problem in testing: it is hard to exhaust all the states of a complex state machine.

CVE-2017-15715

This is a file parsing vulnerability in versions 2.4.0 to 2.4.29 of the Apache httpd server. Normally, when httpd handles PHP, it only parses files with the .php suffix as PHP. Figure 8 shows the NVD entry:
[Figure 8: NVD entry for CVE-2017-15715]

The POC for the vulnerability is shown in Figure 9:
[Figure 9: POC for CVE-2017-15715]
Figure 9 shows that 1.php%0a, whose suffix is not .php, ends up being parsed as a PHP file, so the parsing is at fault. Because of this flaw, uploading a file named 1.php%0a can bypass blacklist restrictions at the upload stage (a blacklist usually forbids the .php suffix). In other words, the blacklist is bypassed at upload time and the parsing flaw is exploited at parse time. Figure 10 shows the patch:
[Figure 10: patch for CVE-2017-15715]
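The upload-stage half of the problem, a blacklist that compares the suffix exactly, can be sketched as follows. The `BLACKLIST` set and the `upload_allowed` function are hypothetical and stand in for whatever filter the application uses.

```python
import os

# Hypothetical blacklist of forbidden suffixes.
BLACKLIST = {".php", ".phtml"}


def upload_allowed(filename):
    """Allow the upload unless the exact suffix is on the blacklist."""
    ext = os.path.splitext(filename)[1].lower()
    return ext not in BLACKLIST


print(upload_allowed("1.php"))    # blocked: the suffix is exactly ".php"
print(upload_allowed("1.php\n"))  # allowed: the suffix is ".php\n", not ".php"
```

The trailing newline (the decoded form of %0a) makes the suffix differ from every blacklist entry, so the name passes the filter even though the server later treats it as PHP.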

As Figure 10 shows, the main change is the addition of the AP_REG_DOLLAR_ENDONLY flag. Its comment states the purpose clearly: the dollar sign should match only at the very end of the string, not before a trailing \n newline. options holds the flags that control regular expression matching, and after the fix its default value no longer lets the dollar sign match before a newline. Both the POC and the patch revolve around the newline character because the set of files that httpd parses as PHP is defined in the configuration file. The configuration commonly used with this version is as follows:

<FilesMatch \.php$>
    SetHandler application/x-httpd-php
</FilesMatch>

The intent of \.php$ is that only files ending in .php are parsed as PHP. In regular expression matching, however, the dollar sign behaves as shown in Figure 11:
[Figure 11: meaning of the dollar sign in regular expressions]
Before the fix, httpd did not restrict the dollar sign to the very end of the string, so it could also match just before a trailing newline, which is why 1.php%0a was parsed successfully. The fix in Figure 10 changes the default value of options so that the dollar sign can no longer match before a newline.
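The dollar-sign behaviour is easy to observe in any regex engine that follows the same convention. The sketch below uses Python's re module as a stand-in, on the assumption that its handling of `$` matches the pre-fix httpd behaviour in this respect; the two helper names are hypothetical, and `\Z` plays the role of the "end only" matching that the fix enforces.

```python
import re


def matches_with_dollar(name):
    """True if the name matches \\.php$ with the default meaning of $."""
    return re.search(r"\.php$", name) is not None


def matches_end_only(name):
    """True if .php sits at the very end of the name (no trailing newline)."""
    return re.search(r"\.php\Z", name) is not None


name = "1.php\n"  # the decoded form of 1.php%0a
print(matches_with_dollar(name))  # True: $ also matches before a final newline
print(matches_end_only(name))     # False: \Z only matches at the true end
```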

Potential hazards of file parsing

The previous sections explained file parsing vulnerabilities through principles and examples. What are their consequences? A file parsing vulnerability by itself only causes a file without a given suffix to be parsed as that file type, so its direct impact seems small. As the POCs above show, however, combining file upload with file parsing yields a webshell directly. The same applies to the vulnerabilities without CVE numbers mentioned at the beginning of the article: they too can be combined with file upload to obtain a webshell.

This is an original article by the CSDN blogger "the youth in the village" and may not be reproduced without permission.


Origin blog.csdn.net/javajiawei/article/details/127473415