[Java] Java Core 75: XML Parsing Dom4j (Part 1)


1 XML parsing

1.1 Parsing overview

Once data is stored in XML, a program needs to read it back. The basic Java IO we have already learned is enough to do this, but the work is tedious, and different needs come up during development (read-only access versus read-write access).

To address these needs, different parsing methods have been developed, each implemented by its own parsers, which makes XML easier for developers to work with.

1.2 Parsing method and parser

There are three common parsing methods in development:

DOM: the parser loads the entire XML document into memory and parses it into a Document object.

a) Advantage: the structural relationships between elements are preserved, so elements can be added, deleted, modified, and queried.

b) Disadvantage: if the XML document is very large, memory may run out.

SAX: a faster, more memory-efficient method. It scans the document sequentially from start to finish and parses as it goes. Parsing is event-driven: each time the parser encounters a piece of the document (such as a start tag, text, or an end tag), it fires the corresponding event.

a) Advantages: fast, and able to handle large files.

b) Disadvantages: read-only; content that has already been scanned is released and cannot be revisited, and the parsing code is cumbersome to write.

PULL: the XML parsing method built into Android, similar to SAX. (For awareness only.)

A parser is a concrete implementation of one of these parsing methods. Some parsers are cumbersome to use directly, so easier-to-use parsing toolkits have been built on top of them for developers' convenience.
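To make the event-driven style of SAX more concrete, here is a minimal sketch. It uses the SAX parser that ships with the JDK (javax.xml.parsers), not dom4j, and the class name SaxEventsSketch and the method collectEvents are made up for illustration. It only records the events in the order the parser reports them.

```java
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: uses the JDK's built-in SAX parser, not dom4j.
class SaxEventsSketch {

    // Parses the XML text and returns the events in the order they were fired.
    static List<String> collectEvents(String xml) {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // Fired when a start tag is encountered.
                String id = attrs.getValue("id");
                events.add("start " + qName + (id == null ? "" : " id=" + id));
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Fired for text content; a parser may split long text across several calls.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text " + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // Fired when an end tag is encountered.
                events.add("end " + qName);
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<books><book id=\"0001\"><name>Demo</name></book></books>";
        for (String event : collectEvents(xml)) {
            System.out.println(event);
        }
    }
}
```

Note that the handler never sees the document as a whole. It only reacts to one event at a time, which is why SAX can handle large files but cannot go back and modify what it has already read.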


Common parsers

[Figure: common parsers]

2 Basic use of Dom4j

2.1 DOM parsing principle and structural model

Parsing principle

The entire XML document is loaded into memory and turned into a DOM tree. The result is a Document object, through which the DOM tree can be read and modified. Take the following books.xml document as an example.

<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book id="0001"> 
        <name>JavaWeb开发教程</name>
        <author>张孝祥</author>
        <sale>100.00元</sale>
    </book>
    <book id="0002">
        <name>三国演义</name>
        <author>罗贯中</author>
        <sale>100.00元</sale>
    </book>
</books>

Structural model

The core concept in DOM is the node. Elements, attributes, and text in an XML document are all nodes in the DOM, and all of them are reachable from the Document object.

Conclusion: use the Document object to access every node in the DOM tree.
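As a rough illustration of the node model, the sketch below walks a small DOM tree and labels each node as an element, attribute, or text node. It deliberately uses the W3C DOM API bundled with the JDK (org.w3c.dom) rather than dom4j, and the class name DomNodesSketch and method describeNodes are hypothetical names chosen for this example.

```java
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: uses the JDK's built-in W3C DOM parser, not dom4j.
class DomNodesSketch {

    // Loads the whole document into memory, then lists every node found under the root.
    static List<String> describeNodes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> out = new ArrayList<>();
            walk(doc.getDocumentElement(), out);
            return out;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Depth-first walk over the tree starting at the given node.
    private static void walk(Node node, List<String> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.add("ELEMENT " + node.getNodeName());
            // In the W3C DOM, attributes are nodes too, but they hang off the element
            // through getAttributes() rather than appearing among its children.
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.add("ATTRIBUTE " + attr.getNodeName() + "=" + attr.getNodeValue());
            }
        } else if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) { // skip whitespace-only text between tags
                out.add("TEXT " + text);
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, out);
        }
    }

    public static void main(String[] args) {
        String xml = "<books><book id=\"0001\"><name>Demo</name></book></books>";
        for (String line : describeNodes(xml)) {
            System.out.println(line);
        }
    }
}
```

Because the whole tree stays in memory, the walk can visit any node in any order, which is the property that makes adding, deleting, and modifying elements possible with DOM.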

Adding the dom4j jar

Download the zip package from the official website: http://www.dom4j.org/

We usually create a lib folder in the project and put the libraries we depend on there.

How to import the library:

In IDEA, right-click the project -> Open Module Settings -> Dependencies -> + -> JARs or directories…, select dom4j-1.6.1.jar, and click "OK".
Alternatively, right-click the jar directly and choose Add as Library.

2.2 Commonly used methods

dom4j uses the core class SAXReader to load an XML document and obtain a Document. From the Document object you get the root element of the document, and from there you can work with the rest of the tree.

SAXReader object

Method	Description
SAXReader sr = new SAXReader();	Constructor; creates the reader
Document read(String url)	Loads and parses the XML document

Document object

Method	Description
Element getRootElement()	Gets the root element

Element object

Method	Description
List<Element> elements(String ele)	Gets all child elements with the specified name. The name may be omitted, in which case all child elements are returned
Element element(String ele)	Gets the first child element with the specified name
String getName()	Gets the name of the current element
String attributeValue(String attrName)	Gets the value of the attribute with the specified name
String elementText(String ele)	Gets the text of the child element with the specified name
String getText()	Gets the text content of the current element

Summary

Steps to parse XML:

Create a SAXReader object and call its read method on the XML file to get a Document object
Get the root element from the Document object
Starting from the root element, work down layer by layer, using the Element API to parse its child elements

2.3 Method Demonstration

Copy "books.xml" from the common XML files in the course materials into the project. Its content is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book id="0001">
        <name>JavaWeb开发教程</name>
        <author>张孝祥</author>
        <sale>100.00元</sale>
    </book>
    <book id="0002">
        <name>三国演义</name>
        <author>罗贯中</author>
        <sale>100.00元</sale>
    </book>
</books>

Note: For ease of parsing, no constraints are added to this xml

Requirement one:

Parse this file to get the id value of each book, as well as its name, author, and price.

Step analysis:

Create a SAXReader object and call its read method to load the XML file and obtain the document object
Get the root element from the document object
Starting from the root element, parse the child elements layer by layer

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

import java.util.List;

public class Demo01 {

    public static void main(String[] args) throws DocumentException {
        // 1. Create a SAXReader and call read to load the XML file and obtain the document object.
        //    The path is relative to the working directory; adjust it to where books.xml lives.
        SAXReader sr = new SAXReader();
        Document doc = sr.read("day15/xml/books.xml");

        // 2. Get the root element (<books>) from the document object
        Element rootElement = doc.getRootElement();

        // 3. Parse the child elements layer by layer, starting from the root.
        //    Get all <book> child elements
        List<Element> bookElements = rootElement.elements("book");

        for (Element bookElement : bookElements) {
            // Read the id attribute
            String id = bookElement.attributeValue("id");
            System.out.println("id = " + id);

            // Read the text of each child element
            String name = bookElement.elementText("name");
            String author = bookElement.elementText("author");
            String sale = bookElement.elementText("sale");
            System.out.println("name = " + name);
            System.out.println("author = " + author);
            System.out.println("sale = " + sale);

            System.out.println("----------------------");
        }
    }
}

Requirement two:

Parse the data in the XML file into Java objects: each book element becomes an object of type Book, and the Book objects are then stored in a collection.

Step analysis:

First create a Book class corresponding to the book element
Create an ArrayList to store the parsed Book objects
Create a SAXReader object and call its read method to load the XML file and get the document object
Get the root element from the document object, then parse it layer by layer

Code:

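import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

import java.util.ArrayList;
import java.util.List;

public class Demo02 {

    public static void main(String[] args) throws DocumentException {
        // Collection that stores the parsed Book objects
        List<Book> books = new ArrayList<>();

        // 1. Create a SAXReader and call read to load the XML file and obtain the document object.
        //    The path is relative to the working directory; adjust it to where books.xml lives.
        SAXReader sr = new SAXReader();
        Document doc = sr.read("day15/xml/books.xml");

        // 2. Get the root element (<books>) from the document object
        Element rootElement = doc.getRootElement();

        // 3. Parse the child elements layer by layer, starting from the root.
        //    Get all <book> child elements
        List<Element> bookElements = rootElement.elements("book");

        for (Element bookElement : bookElements) {
            // Read the id attribute
            String id = bookElement.attributeValue("id");
            System.out.println("id = " + id);

            // Read the text of each child element
            String name = bookElement.elementText("name");
            String author = bookElement.elementText("author");
            String sale = bookElement.elementText("sale");

            // Wrap the parsed strings in a Book object and add it to the collection
            Book book = new Book(id, name, author, sale);
            books.add(book);
        }

        // Iterate over the collection and print each Book
        for (Book book : books) {
            System.out.println("book = " + book);
        }
    }
}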

class Book {

    private String id;
    private String name;
    private String author;
    private String sale;

    public Book() {
    }

    public Book(String id, String name, String author, String sale) {
        this.id = id;
        this.name = name;
        this.author = author;
        this.sale = sale;
    }

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getAuthor() {
        return author;
    }

    public void setAuthor(String author) {
        this.author = author;
    }

    public String getSale() {
        return sale;
    }

    public void setSale(String sale) {
        this.sale = sale;
    }

    @Override
    public String toString() {
        return "Book{" +
                "id='" + id + '\'' +
                ", name='" + name + '\'' +
                ", author='" + author + '\'' +
                ", sale='" + sale + '\'' +
                '}';
    }
}
