Parsing namespaced XML with dom4j's XPath support

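While parsing a third-party XML file I ran into a problem: XPath queries against XML that declares a namespace matched nothing, and no error was reported either. The cause turned out to be the namespace itself. I found a fix and am recording it here. Two points need attention: 1. how the XPath expression is written; 2. using XPath with dom4j requires one additional supporting library (jaxen).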
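To see why the lookup fails silently, here is a minimal sketch that reproduces the behavior. It deliberately uses the JDK's built-in `javax.xml.xpath` API instead of dom4j so that it runs without extra dependencies; the class name `NamespaceXPathDemo`, the prefix `plan`, and the trimmed-down XML string are assumptions made for illustration. In XPath 1.0 an unprefixed name only matches elements in no namespace, so the plain path finds nothing, while the prefixed path bound to the namespace URI succeeds.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NamespaceXPathDemo {
    static final String NS = "http://maven.apache.org/POM/4.0.0";

    // Trimmed-down, hypothetical version of the sample document used in this post.
    static final String XML =
            "<module xmlns=\"" + NS + "\"><sourceRoots>"
            + "<root url=\"file://$MODULE_DIR$/src/main/resources\"/>"
            + "</sourceRoots></module>";

    // Evaluates an XPath expression against XML, with the prefix "plan" bound to NS.
    // Returns an empty string when nothing matches (no exception is thrown).
    static String evaluate(String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "plan".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String uri) {
                return null; // not needed for XPath evaluation
            }
            @Override
            public Iterator<String> getPrefixes(String uri) {
                return null; // not needed for XPath evaluation
            }
        });
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // Unprefixed names look for elements in no namespace: nothing matches.
        System.out.println("[" + evaluate("/module/sourceRoots/root/@url") + "]");
        // Prefixed names are resolved to the namespace URI: the attribute is found.
        System.out.println("[" + evaluate("/plan:module/plan:sourceRoots/plan:root/@url") + "]");
    }
}
```

The prefix used in the expression does not have to appear in the XML; only the URI it is bound to matters. The attribute `@url` takes no prefix because unprefixed attributes are not placed in the default namespace.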

1. Sample XML to parse

<?xml version="1.0" encoding="UTF-8"?>
<module xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <component>
      <configuration>
        <descriptors>
          <deploymentDescriptor name="web.xml" 
            url="file://$MODULE_DIR$/src/main/webapp/WEB-INF/web.xml" />
        </descriptors>
        <webroots>
          <root url="file://$MODULE_DIR$/src/main/webapp" relative="/" />
        </webroots>
        <sourceRoots>
          <root url="file://$MODULE_DIR$/src/main/resources" />
        </sourceRoots>
      </configuration>
  </component>
</module>

2. Parsing code (pay attention to how the XPath expression is written)

import java.io.File;
import java.util.HashMap;
import java.util.Map;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.XPath;
import org.dom4j.io.SAXReader;
import org.junit.Test;

public class Dom4jXml {
    @Test
    public void analyzeTest() throws Exception {
        SAXReader reader = new SAXReader();
        Document document = reader.read(new File(
                "C:\\Users\\Administrator\\Desktop\\tuisong.xml"));
        System.out.println("Read complete:\n" + document.asXML());

        // Bind a prefix of our own choosing ("plan") to the root element's namespace URI.
        Element rootElm = document.getRootElement();
        Map<String, String> map = new HashMap<>();
        map.put("plan", rootElm.getNamespaceURI());

        // Build the XPath: the ordinary path, with each element name prefixed by the map key.
        // Attributes such as @url are not in the default namespace, so they take no prefix.
        XPath xPath = document.createXPath(
                "//plan:module//plan:component//plan:configuration//plan:sourceRoots//@url");
        xPath.setNamespaceURIs(map);
        Node node = xPath.selectSingleNode(document);

        // selectSingleNode returns null when nothing matches, so guard before dereferencing.
        System.out.println("Extracted url attribute: "
                + (node == null ? "(no match)" : node.getStringValue()));
    }
}
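If many XPath queries run against the same namespaced document, calling `setNamespaceURIs` on every `XPath` object gets repetitive. The sketch below shows one possible alternative, assuming dom4j and jaxen are on the classpath: register the prefix map once on a `DocumentFactory`, hand that factory to the `SAXReader`, and then use prefixed expressions directly with `selectSingleNode`. The class name `FactoryNamespaceDemo`, the method `readSourceRootUrl`, and the inline XML are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import org.dom4j.Document;
import org.dom4j.DocumentFactory;
import org.dom4j.Node;
import org.dom4j.io.SAXReader;

public class FactoryNamespaceDemo {
    // Parses the given XML and returns the url attribute under sourceRoots/root,
    // or null when the path matches nothing.
    static String readSourceRootUrl(String xml) throws Exception {
        Map<String, String> namespaces = new HashMap<>();
        namespaces.put("plan", "http://maven.apache.org/POM/4.0.0");

        // A dedicated factory avoids changing the shared default DocumentFactory.
        DocumentFactory factory = new DocumentFactory();
        factory.setXPathNamespaceURIs(namespaces);

        SAXReader reader = new SAXReader(factory);
        Document document = reader.read(new StringReader(xml));

        // XPath objects created through this factory already know the "plan" prefix.
        Node node = document.selectSingleNode("//plan:sourceRoots/plan:root/@url");
        return node == null ? null : node.getStringValue();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<module xmlns=\"http://maven.apache.org/POM/4.0.0\">"
                + "<sourceRoots><root url=\"file://$MODULE_DIR$/src/main/resources\"/></sourceRoots>"
                + "</module>";
        System.out.println(readSourceRootUrl(xml));
    }
}
```

The trade-off is that the prefix map is fixed when the reader is created, so this suits documents whose namespace URI is known in advance rather than read from the root element at runtime.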

3. Output

Read complete:
<?xml version="1.0" encoding="UTF-8"?>
<module xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <component>
      <configuration>
        <descriptors>
          <deploymentDescriptor name="web.xml" url="file://$MODULE_DIR$/src/main/webapp/WEB-INF/web.xml"/>
        </descriptors>
        <webroots>
          <root url="file://$MODULE_DIR$/src/main/webapp" relative="/"/>
        </webroots>
        <sourceRoots>
          <root url="file://$MODULE_DIR$/src/main/resources"/>
        </sourceRoots>
      </configuration>
  </component>
</module>
Extracted url attribute: file://$MODULE_DIR$/src/main/resources

4. XPath in dom4j relies on the jaxen library. Here are the dependency coordinates; add them to your pom file:

<!-- https://mvnrepository.com/artifact/jaxen/jaxen -->
<dependency>
    <groupId>jaxen</groupId>
    <artifactId>jaxen</artifactId>
    <version>1.1.6</version>
</dependency>

5. Done. I also came across a trick someone else posted, pasted here as-is (I have not verified it):

String xmlStr = "<?xml version='1.0' encoding='UTF-8' ?><ROOT xx='xx' xmlns='http://www.dazhao.com' ><HEAD>...</ROOT>";
 
xmlStr = xmlStr.replaceFirst("<ROOT.*><HEAD>", "<ROOT><HEAD>");// use a regex to strip the root tag's attributes, including the namespace declaration
 
Document d = DocumentHelper.parseText(xmlStr);
 
String xpath_model  = "/ROOT/HEAD/dazhao";
 
Node flag = d.selectSingleNode(xpath_model );
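Since the trick above is unverified, here is a small self-contained sketch of the same idea that can be run and checked. It is not the original poster's code: it uses the JDK's `javax.xml.xpath` API rather than dom4j so that no extra dependency is needed, it removes only the default `xmlns` declaration instead of every attribute on the root tag, and the `<dazhao>ok</dazhao>` content is made up for illustration because the original string elides the body.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class StripNamespaceDemo {
    // Removes default-namespace declarations (xmlns="..." or xmlns='...').
    // Prefixed declarations such as xmlns:xsi are left untouched.
    // Caveat: a plain regex would also alter matching text inside comments or CDATA.
    static String stripDefaultNamespace(String xml) {
        return xml.replaceAll("\\s+xmlns\\s*=\\s*(\"[^\"]*\"|'[^']*')", "");
    }

    // Evaluates an XPath expression with no namespace bindings at all.
    static String evaluate(String xml, String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document; the element content is invented for this sketch.
        String xml = "<ROOT xx='xx' xmlns='http://www.dazhao.com'>"
                + "<HEAD><dazhao>ok</dazhao></HEAD></ROOT>";

        // With the namespace still declared, the unprefixed path matches nothing.
        System.out.println("[" + evaluate(xml, "/ROOT/HEAD/dazhao") + "]");

        // After stripping the declaration, the same unprefixed path matches.
        String stripped = stripDefaultNamespace(xml);
        System.out.println("[" + evaluate(stripped, "/ROOT/HEAD/dazhao") + "]");
    }
}
```

Rewriting the XML text is convenient for a one-off lookup, but it discards information the document carries, so binding a prefix to the namespace URI as in section 2 is generally the safer route.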
