dom4j and XML document manipulation

dom4j Overview

1. dom4j is an open-source XML parsing library produced by dom4j.org. It runs on the Java platform, uses the Java Collections Framework, and fully supports DOM, SAX and JAXP.

   The most distinctive feature of dom4j is its extensive use of interfaces. The main interfaces are defined in the org.dom4j package:

Interface               Description
Attribute               Defines an XML attribute.
Branch                  Defines the common behavior of nodes that can contain child nodes, such as XML elements (Element) and documents (Document).
CDATA                   Defines an XML CDATA section.
CharacterData           A marker interface for character-based nodes such as CDATA, Comment and Text.
Comment                 Defines the behavior of an XML comment.
Document                Defines an XML document.
DocumentType            Defines an XML DOCTYPE declaration.
Element                 Defines an XML element.
ElementHandler          Defines a handler for Element objects.
ElementPath             Used by ElementHandler to obtain information about the path currently being processed.
Entity                  Defines an XML entity.
Node                    Defines the polymorphic behavior of all XML nodes in a dom4j tree.
NodeFilter              Defines the behavior of a filter or predicate that acts on dom4j nodes.
ProcessingInstruction   Defines an XML processing instruction.
Text                    Defines an XML text node.
Visitor                 Used to implement the Visitor pattern.
XPath                   Represents an XPath expression obtained by parsing a string.

2. The inheritance relationships between these interfaces are as follows:

  interface java.lang.Cloneable

      interface org.dom4j.Node

             interface org.dom4j.Attribute

             interface org.dom4j.Branch

                    interface org.dom4j.Document

                    interface org.dom4j.Element

             interface org.dom4j.CharacterData

                    interface org.dom4j.CDATA

                    interface org.dom4j.Comment

                    interface org.dom4j.Text

             interface org.dom4j.DocumentType

             interface org.dom4j.Entity

             interface org.dom4j.ProcessingInstruction

XML documents

1. What is XML?

  Extensible Markup Language (XML) is a subset of the Standard Generalized Markup Language (SGML). It is a markup language used to give electronic documents a structure.

  Extensible means that you can define your own tags. Note that tags must appear in pairs, for example <school></school>.

  Uses: storing data and transmitting data.

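2. XML structure

  <?xml version="1.0" encoding="UTF-8"?>   The header (XML declaration); when present it must be the first thing in the document, and including it is recommended
  xml: declares that this is an XML document
  version: the XML version
  encoding: the character encoding

  The document content follows the header.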

3. XML writing rules
  1. Unlike HTML, XML is case sensitive: <School> and <school> are different tags.
  2. Tag names must not use reserved names; names beginning with xml (in any letter case, for example XML) are reserved.
  3. Tags must be properly nested.
  4. Tag names cannot begin with a digit.
  5. There must be exactly one root tag.
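
  The rules above can be checked with any XML parser. The following is a minimal sketch that uses the JDK's built-in DOM parser (not dom4j) to report whether a string is well-formed; the class name WellFormedDemo, the helper isWellFormed and the sample strings are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of dom4j.
public class WellFormedDemo {

    /** Returns true if the text parses as well-formed XML, false if the parser rejects it. */
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser found a well-formedness error
        } catch (Exception e) {
            throw new IllegalStateException(e); // parser configuration or I/O problem
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<school><name>A</name></school>")); // true
        System.out.println(isWellFormed("<School></school>"));               // false: case mismatch
        System.out.println(isWellFormed("<a><b></a></b>"));                  // false: bad nesting
        System.out.println(isWellFormed("<1st/>"));                          // false: starts with a digit
        System.out.println(isWellFormed("<a/><b/>"));                        // false: two root tags
    }
}
```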

4. Reading XML documents

  Step 1: Get the Document object

import java.io.File;
import java.net.URL;
import org.dom4j.Document;
import org.dom4j.io.SAXReader;

public static Document load(String filename) {
    Document document = null;
    try {
        SAXReader saxReader = new SAXReader();
        document = saxReader.read(new File(filename)); // read the XML file to obtain the Document object
    } catch (Exception ex) {
        ex.printStackTrace();
    }
    return document;
}

public static Document load(URL url) {
    Document document = null;
    try {
        SAXReader saxReader = new SAXReader();
        document = saxReader.read(url); // read the XML from the URL to obtain the Document object
    } catch (Exception ex) {
        ex.printStackTrace();
    }
    return document;
}

  Step 2: Get the root element

Element root = document.getRootElement();

  Step 3: Iterate over the child elements of the root element

 for(Iterator it=root.elementIterator();it.hasNext();){      
      Element element = (Element) it.next();      
      // do something      
 }   

  Step 4: Access the content of a node

String text = element.getText();
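
  Put together, the four steps might look like the sketch below. It assumes dom4j is on the classpath and, to stay self-contained, parses a string with DocumentHelper.parseText instead of reading a file; the class name ReadStepsDemo, the helper childTexts and the sample XML are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

// Hypothetical example class; requires the dom4j jar.
public class ReadStepsDemo {

    /** Returns "name=text" for every child element of the root element. */
    public static List<String> childTexts(String xml) {
        try {
            Document document = DocumentHelper.parseText(xml); // step 1 (from a string, not a file)
            Element root = document.getRootElement();           // step 2
            List<String> result = new ArrayList<>();
            for (Iterator<?> it = root.elementIterator(); it.hasNext();) { // step 3
                Element element = (Element) it.next();
                result.add(element.getName() + "=" + element.getText());   // step 4
            }
            return result;
        } catch (DocumentException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<members><member>Tom</member><member>Ann</member></members>";
        System.out.println(childTexts(xml)); // [member=Tom, member=Ann]
    }
}
```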

5. Common methods

  5.1 Document-related operations

    1. Read an XML file to obtain the Document object.

      SAXReader reader = new SAXReader();
      Document document = reader.read(new File("input.xml"));

    2. Parse XML text to obtain the Document object.

      String text = "<members></members>";
      Document document = DocumentHelper.parseText(text);

    3. Create a Document object programmatically.

      Document document = DocumentHelper.createDocument();
      Element root = document.addElement("members"); // create the root element

  5.2 Node-related operations

    1. Get the root element of the document.

      Element rootElm = document.getRootElement();

    2. Get a single child element of an element.

      Element memberElm = root.element("member"); // "member" is the element name

    3. Get the text of an element.

      String text = memberElm.getText();

      String nameText = root.elementText("name"); // text of the child element "name" under the root element

    4. Get all child elements with a given name and iterate over them.

      List nodes = rootElm.elements("member");
      for (Iterator it = nodes.iterator(); it.hasNext();) {
          Element elm = (Element) it.next();
          // do something
      }

    5. Iterate over all child elements of an element.

      for (Iterator it = root.elementIterator(); it.hasNext();) {
          Element element = (Element) it.next();
          // do something
      }

    6. Add a child element to an element.

      Element ageElm = newMemberElm.addElement("age");

    7. Set the text of an element.

      ageElm.setText("29");

    8. Remove an element.

      parentElm.remove(childElm); // childElm is the element to remove, parentElm is its parent

    9. Add a CDATA section.

      Element contentElm = infoElm.addElement("content");
      contentElm.addCDATA(diary.getContent());

  5.3 Attribute-related operations

    1. Get a specified attribute of an element.

      Element root = document.getRootElement();
      Attribute attribute = root.attribute("size"); // "size" is the attribute name

    2. Get the text of an attribute.

      String text = attribute.getText();

      // value of the firstname attribute of the child element "name" under the root element
      String text2 = root.element("name").attributeValue("firstname");

    3. Iterate over all attributes of an element.

      Element root = document.getRootElement();
      for (Iterator it = root.attributeIterator(); it.hasNext();) {
          Attribute attribute = (Attribute) it.next();
          String text = attribute.getText();
          System.out.println(text);
      }

    4. Add an attribute with a value to an element.

      newMemberElm.addAttribute("name", "sitinspring");

    5. Set the text of an attribute.

      Attribute attribute = root.attribute("name");
      attribute.setText("sitinspring");

    6. Remove an attribute.

      Attribute attribute = root.attribute("size"); // "size" is the attribute name
      root.remove(attribute);

  5.4 Writing a document to an XML file

    1. If the document contains only English (ASCII) text, no encoding needs to be set; write it directly.

      XMLWriter writer = new XMLWriter(new FileWriter("output.xml"));
      writer.write(document);
      writer.close();

    2. If the document contains Chinese, set the encoding and then write.

      OutputFormat format = OutputFormat.createPrettyPrint();
      format.setEncoding("GBK"); // specify the XML encoding
      // Pass an OutputStream so that XMLWriter encodes the bytes as GBK;
      // a FileWriter would use the platform default encoding instead.
      XMLWriter writer = new XMLWriter(new FileOutputStream("output.xml"), format);
      writer.write(document);
      writer.close();

  5.5 Converting between strings and XML

    1. Convert a string to XML.

      String text = "<members> <member>sitinspring</member> </members>";
      Document document = DocumentHelper.parseText(text);

    2. Convert a document or a node to an XML string.

      SAXReader reader = new SAXReader();
      Document document = reader.read(new File("input.xml"));
      Element root = document.getRootElement();

      String docXmlText = document.asXML();
      String rootXmlText = root.asXML();

      Element memberElm = root.element("member");
      String memberXmlText = memberElm.asXML();

XPath

1. Querying with XPath requires adding jaxen-xx-xx.jar to the classpath

2. Common usage

    List list=document.selectNodes("/books/book/@show");
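
    The list returned by selectNodes holds dom4j nodes, here Attribute objects because the expression ends in @show. A minimal sketch of reading their values, assuming both dom4j and jaxen are on the classpath; the class name SelectNodesDemo, the helper showValues and the sample XML are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;

// Hypothetical example class; requires the dom4j and jaxen jars.
public class SelectNodesDemo {

    /** Returns the values of all show attributes found under /books/book. */
    public static List<String> showValues(String xml) {
        try {
            Document document = DocumentHelper.parseText(xml);
            List<String> values = new ArrayList<>();
            // Each selected node is an Attribute because the path ends in @show.
            for (Object node : document.selectNodes("/books/book/@show")) {
                values.add(((Attribute) node).getValue());
            }
            return values;
        } catch (DocumentException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books><book show=\"yes\"/><book show=\"no\"/></books>";
        System.out.println(showValues(xml)); // [yes, no]
    }
}
```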

3. Syntax

  1. Selecting nodes

  XPath uses path expressions to select nodes in an XML document. A node is selected by following a path, that is, a sequence of steps.

  Common path expressions:

Expression   Description
nodename     Selects the child elements of the current node that are named nodename
/            Selects from the root node
//           Selects matching nodes among the descendants of the current node, regardless of their location (at the start of an expression, anywhere in the document)
.            Selects the current node
..           Selects the parent of the current node
@            Selects attributes

  Example:

Path expression    Result
bookstore          Selects the bookstore child elements of the current node
/bookstore         Selects the root element bookstore
bookstore/book     Selects all book elements that are children of bookstore
//book             Selects all book elements, regardless of where they are in the document
bookstore//book    Selects all book elements that are descendants of bookstore, regardless of where they are under bookstore
//@lang            Selects all attributes named lang
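
  To see what these path expressions select, they can be evaluated against a small document. The sketch below uses the JDK's built-in javax.xml.xpath API rather than dom4j and jaxen, only so that it runs without extra jars; the same XPath 1.0 expressions should behave the same way when passed to selectNodes. The class name PathExpressionDemo, the helper count and the sample XML are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class using the JDK XPath API, not dom4j.
public class PathExpressionDemo {
    static final String XML =
        "<bookstore>"
      + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
      + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
      + "</bookstore>";

    /** Counts the nodes the expression selects; the current node is the document itself. */
    public static int count(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "bookstore", "/bookstore", "bookstore/book", "//book", "bookstore//book", "//@lang"
        };
        for (String expression : expressions) {
            System.out.println(expression + " -> " + count(expression) + " node(s)");
        }
    }
}
```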

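  2. Predicates

Path expression                       Result
/bookstore/book[1]                    Selects the first book element that is a child of bookstore
/bookstore/book[last()]               Selects the last book element that is a child of bookstore
/bookstore/book[last()-1]             Selects the second-to-last book element that is a child of bookstore
/bookstore/book[position()<3]         Selects the first two book elements that are children of bookstore
//title[@lang]                        Selects all title elements that have an attribute named lang
//title[@lang='eng']                  Selects all title elements whose lang attribute has the value eng
/bookstore/book[price>35.00]          Selects all book elements of bookstore whose price child element has a value greater than 35.00
/bookstore/book[price>35.00]/title    Selects the title elements of all book elements of bookstore whose price child element has a value greater than 35.00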
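
  A small sketch of the predicates in action, again using the JDK's javax.xml.xpath API instead of dom4j and jaxen so that it runs without extra jars. The class name PredicateDemo, the helper titles and the three sample books are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class using the JDK XPath API, not dom4j.
public class PredicateDemo {
    static final String XML =
        "<bookstore>"
      + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
      + "<book><title lang=\"fr\">Le Petit Prince</title><price>39.95</price></book>"
      + "<book><title>Learning XML</title><price>45.00</price></book>"
      + "</bookstore>";

    /** Returns the text of every node selected by the expression. */
    public static List<String> titles(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titles("/bookstore/book[1]/title"));            // [Harry Potter]
        System.out.println(titles("/bookstore/book[last()]/title"));       // [Learning XML]
        System.out.println(titles("/bookstore/book[last()-1]/title"));     // [Le Petit Prince]
        System.out.println(titles("//title[@lang='eng']"));                // [Harry Potter]
        System.out.println(titles("/bookstore/book[price>35.00]/title"));
        System.out.println(titles("/bookstore/book[position()<3]/title"));
    }
}
```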

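  3. Selecting unknown nodes

    XPath wildcards can be used to select unknown XML nodes.

Wildcard   Description
*          Matches any element node
@*         Matches any attribute node
node()     Matches any node of any kind

    Example:

Path expression   Result
/bookstore/*      Selects all child elements of the bookstore element
//*               Selects all elements in the document
//title[@*]       Selects all title elements that have at least one attribute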
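
    The wildcards can be tried out in the same way. The sketch below uses the JDK's javax.xml.xpath API instead of dom4j and jaxen so that it runs without extra jars; the class name WildcardDemo, the helper names and the sample XML are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class using the JDK XPath API, not dom4j.
public class WildcardDemo {
    static final String XML =
        "<bookstore><book category=\"web\">"
      + "<title lang=\"eng\">Learning XML</title><price>39.95</price>"
      + "</book></bookstore>";

    /** Returns the name of every node selected by the expression. */
    public static List<String> names(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeName());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(names("/bookstore/*")); // [book]
        System.out.println(names("//title[@*]"));  // [title]
        System.out.println(names("//*"));          // every element in the document
        System.out.println(names("//@*"));         // every attribute in the document
    }
}
```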

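  4. Selecting several paths

    By using the "|" operator in a path expression, you can select several paths.

    Example:

Path expression                  Result
//book/title | //book/price      Selects all title and price elements of all book elements
//title | //price                Selects all title and price elements in the document
/bookstore/book/title|//price    Selects all title elements of the book elements of the bookstore element, and all price elements in the document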

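  5. XPath axes

    An axis defines a node-set relative to the current node.

Axis name            Result
ancestor             Selects all ancestors (parent, grandparent, etc.) of the current node
ancestor-or-self     Selects all ancestors (parent, grandparent, etc.) of the current node and the current node itself
attribute            Selects all attributes of the current node
child                Selects all children of the current node
descendant           Selects all descendants (children, grandchildren, etc.) of the current node
descendant-or-self   Selects all descendants (children, grandchildren, etc.) of the current node and the current node itself
following            Selects everything in the document after the closing tag of the current node
namespace            Selects all namespace nodes of the current node
parent               Selects the parent of the current node
preceding            Selects all nodes in the document that come before the start tag of the current node, excluding its ancestors
preceding-sibling    Selects all siblings before the current node
self                 Selects the current node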

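  6. Step syntax: axisname::nodetest[predicate]

    Example:

Example                  Result
child::book              Selects all book nodes that are children of the current node
attribute::lang          Selects the lang attribute of the current node
child::*                 Selects all child elements of the current node
attribute::*             Selects all attributes of the current node
child::text()            Selects all text node children of the current node
child::node()            Selects all children of the current node
descendant::book         Selects all book descendants of the current node
ancestor::book           Selects all book ancestors of the current node
ancestor-or-self::book   Selects all book ancestors of the current node, and the current node as well if it is a book node
child::*/child::price    Selects all price grandchildren of the current node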
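
    The axes and the step syntax can be combined into full paths and evaluated against a small document. The sketch below uses the JDK's javax.xml.xpath API instead of dom4j and jaxen so that it runs without extra jars; the class name AxisDemo, the helper names and the sample XML are made up for illustration. Expected results are shown only where a single node is selected, since the order in which an implementation returns several nodes is not something this sketch relies on.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class using the JDK XPath API, not dom4j.
public class AxisDemo {
    static final String XML =
        "<bookstore>"
      + "<book><title lang=\"eng\">A</title><price>1</price></book>"
      + "<book><title>B</title><price>2</price></book>"
      + "</bookstore>";

    /** Returns the name of every node selected by the expression. */
    public static List<String> names(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeName());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String firstTitle = "/bookstore/book[1]/title";
        System.out.println(names(firstTitle + "/parent::*"));     // [book]
        System.out.println(names(firstTitle + "/attribute::*"));  // [lang]
        System.out.println(names("/bookstore/book[1]/price/preceding-sibling::*")); // [title]
        System.out.println(names(firstTitle + "/ancestor::*"));   // bookstore and book
        System.out.println(names(firstTitle + "/following::*"));  // elements after the first title
        System.out.println(names("/bookstore/child::*/child::price")); // price grandchildren
    }
}
```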

 

  7. XPath operators

Operator   Description                Example                      Result
|          Union of two node-sets     //book | //cd                A node-set containing all book and cd elements
+          Addition                   6 + 4                        10
-          Subtraction                6 - 4                        2
*          Multiplication             6 * 4                        24
div        Division                   8 div 4                      2
=          Equal                      price=9.80                   true if price is 9.80; false if price is 9.90
!=         Not equal                  price!=9.80                  true if price is 9.90; false if price is 9.80
<          Less than                  price<9.80                   true if price is 9.00; false if price is 9.90
<=         Less than or equal to      price<=9.80                  true if price is 9.00; false if price is 9.90
>          Greater than               price>9.80                   true if price is 9.90; false if price is 9.80
>=         Greater than or equal to   price>=9.80                  true if price is 9.90; false if price is 9.70
or         Logical or                 price=9.80 or price=9.70     true if price is 9.80; false if price is 9.50
and        Logical and                price>9.00 and price<9.90    true if price is 9.80; false if price is 8.50
mod        Remainder of a division    5 mod 2                      1
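
  The operators can be evaluated directly as numbers or booleans. The sketch below uses the JDK's javax.xml.xpath API instead of dom4j and jaxen so that it runs without extra jars; the class name OperatorDemo, the helpers number and bool, and the sample XML (one book priced 9.80 and one cd) are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical example class using the JDK XPath API, not dom4j.
public class OperatorDemo {
    static final String XML = "<shop><book><price>9.80</price></book><cd/></shop>";

    private static Object eval(String expression, QName type) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(
                    expression, new InputSource(new StringReader(XML)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Evaluates the expression as an XPath number. */
    public static double number(String expression) {
        return (Double) eval(expression, XPathConstants.NUMBER);
    }

    /** Evaluates the expression as an XPath boolean. */
    public static boolean bool(String expression) {
        return (Boolean) eval(expression, XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) {
        System.out.println(number("6 + 4"));    // 10.0
        System.out.println(number("8 div 4"));  // 2.0
        System.out.println(number("5 mod 2"));  // 1.0
        System.out.println(number("count(//book | //cd)"));               // 2.0
        System.out.println(bool("//price = 9.80"));                       // true
        System.out.println(bool("//price != 9.80"));                      // false
        System.out.println(bool("//price > 9.00 and //price < 9.90"));    // true
        System.out.println(bool("//price = 9.70 or //price = 9.80"));     // true
    }
}
```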


Origin www.cnblogs.com/xfdhh/p/11408849.html