[Introduction to XML] One article to take you from never having heard of XML to using it proficiently

Foreword

Today we continue with the XML part of [Java Web]. Compared with the other parts, XML is fairly simple, but we will use it often when we write larger projects later, so it is worth learning properly.

Let's officially start learning XML.

XML overview

Basic concept:

XML stands for Extensible Markup Language. "Extensible" means that the tags are defined by the user.

XML usage:

XML is mainly used to store data, to serve as configuration files, and to transmit data over a network.

XML Features:

(1) XML is a markup language, very similar to HTML.
(2) XML is designed to transmit data, not to display it.
(3) XML has no predefined tags; you define your own tags when you use it.
(4) XML is very flexible: because no tags are fixed, every tag can be customized.

Differences between XML and HTML:

(1) XML tags are custom, and HTML tags are predefined.
(2) XML syntax is strict, HTML syntax is loose.
(3) XML and HTML are designed for different purposes:

  1. XML is designed to transmit and store data; its focus is the content of the data.

  2. HTML is designed to display data; its focus is the appearance of the data.

XML syntax

Like other languages, XML has its own syntax rules:

(1) An XML document uses the file extension .xml.
(2) If the document has an XML declaration, it must be the very first thing in the file.
(3) An XML document has exactly one root tag.
(4) Attribute values must be enclosed in quotation marks (single or double), and an attribute name may appear only once on an element.
(5) Tags must be closed properly: either a self-closing tag or a matching pair of opening and closing tags.
(6) Tag names are case sensitive.

Let's write an XML file:

<?xml version="1.0" encoding="utf-8"?>
<!-- Human resources management system -->
<hr>
    <employee no="7706">
        <name>张三</name>
        <age>31</age>
        <salary>5000</salary>
        <department>
            <dname>技术部</dname>
            <address>xx大厦-B104</address>
        </department>
    </employee>
    <employee no="7707">
        <name>李四</name>
        <age>29</age>
        <salary>4000</salary>
        <department>
            <dname>会计部</dname>
            <address>xx大厦-B106</address>
        </department>
    </employee>
</hr>

A file like this is a well-formed XML file.
The XML declaration must come before everything else in the file, even comments. Its encoding value is optional (a parser assumes UTF-8 or UTF-16 when it is missing), but writing it explicitly is good practice. There is one and only one root node, <hr></hr>.

Since the tags in XML are all customized by us, we know at a glance what this file expresses and describes. So we also need to define meaningful tag names when customizing tags.
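To make the syntax rules concrete, here is a minimal sketch that asks a parser whether a piece of text is well-formed. It uses the JDK's built-in DocumentBuilder rather than dom4j so that it runs without extra libraries; the class name WellFormedCheck and the sample strings are made up for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class WellFormedCheck {
    // Returns true when the text parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejects anything that breaks the syntax rules.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<hr><employee no=\"7706\"/></hr>"));     // true
        System.out.println(isWellFormed("<hr><employee></hr>"));                  // false: employee is never closed
        System.out.println(isWellFormed("<hr></hr><hr></hr>"));                   // false: two root tags
        System.out.println(isWellFormed("<hr><Name>x</name></hr>"));              // false: tag names are case sensitive
        System.out.println(isWellFormed("<!--c--><?xml version=\"1.0\"?><hr/>")); // false: declaration is not first
    }
}
```

The JDK parser also prints a "[Fatal Error]" line to standard error for each rejected input; that is only its default error reporting and does not affect the returned value.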

A CDATA section can also be used: the data inside it is treated as literal text and is not parsed as markup. The format is: <![CDATA[data]]>
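As a small illustration of what "literal text" means for CDATA, the sketch below reads the text of an element with the JDK's built-in DOM parser (not dom4j). The class name CdataDemo and the note element are invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class CdataDemo {
    // Parses the XML text and returns the text content of its root element.
    static String textOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Inside CDATA, characters such as < and & need no escaping.
        System.out.println(textOf("<note><![CDATA[if (a < b && b < c) { <ok/> }]]></note>"));
        // prints: if (a < b && b < c) { <ok/> }

        // Outside CDATA, the same characters must be written as entities.
        System.out.println(textOf("<note>a &lt; b</note>"));
        // prints: a < b
    }
}
```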

XML Constraints

A valid XML document has the following characteristics:

(1) It must first be well-formed.
(2) It conforms to the constraints defined in a DTD or an XSD (XML Schema).

Well-formed means what our XML file above already is: the syntax is correct, the tags are properly closed, and so on. So what is a constraint?

Suppose an element holds a person's age, and someone enters a negative number by mistake. That is clearly not acceptable.

But XML by itself cannot catch this error, because its content is free-form. This is where constraints come in: they restrict the data, tags, and structure that a document may contain.

A document written to constrain how an XML document may be written is called an XML constraint.

XML constraints come in two kinds, DTD and XSD, which we cover next.

DTD constraints

A complete DTD declaration mainly consists of three basic parts: element declaration, attribute declaration, and entity declaration.

Element declaration:

The basic syntax is:

<!ELEMENT element-name content-model>

Use !ELEMENT to declare an element, followed by the element name (that is, the tag name) and then the element's content model.

An element's content model defines the content the element is allowed to have. An element may contain child elements, a block of text, or a mix of child elements and text, and it may also be empty.

In XML, elements can have child elements, and a DTD can define which child elements an element may contain. To do so, write the child element names inside the parentheses () after the parent element's name.

Earlier, the department tag contained the dname and address tags, so we can write:

<!ELEMENT department (dname,address)>

The hr tag above contains multiple employee tags. We can add a * after the () to mean zero or more employee tags.

<!ELEMENT hr (employee)*>

Other occurrence indicators are also available: ? means zero or one time, + means one or more times, and * means zero or more times.

The employee tag contains the name, age, salary, and department tags, which we can write like this:

<!ELEMENT employee (name,age,salary,department)>

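Separating the child elements with , means they must appear in exactly this order; otherwise validation reports an error. Separating them with | means a choice instead: exactly one of the listed child elements may appear. To allow the children in any order, the choice can be repeated, for example (name|age|salary|department)*, although this no longer limits how many times each one appears. A plain choice looks like this: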

<!ELEMENT employee (name|age|salary|department)>

If the content of an element is plain text, define it with (#PCDATA), including the parentheses:

<!ELEMENT name (#PCDATA)>

If the element is an empty element (written as a self-closing tag), define it with EMPTY:

<!ELEMENT br EMPTY>

ANY means that any content can be defined in the element:

<!ELEMENT test ANY>

Attribute declaration:

Use the ATTLIST keyword to declare the attributes of an element:

<!ATTLIST employee no CDATA "">

The example above declares a no attribute of type CDATA (character data) on the employee element, with an empty string as its default value.

So how do we bind the DTD file to the XML file after we write the DTD file?

The format is as follows:

<!DOCTYPE root-element SYSTEM "path-to-dtd-file">

We just need to put this line right after the XML declaration in the XML file.

Next, we complete the writing of a constraint file for the previous XML file:

<?xml version="1.0" encoding="UTF-8" ?>
<!ELEMENT hr (employee)*>
<!ELEMENT employee (name,age,salary,department)>
<!ATTLIST employee no CDATA "">
<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
<!ELEMENT salary (#PCDATA)>
<!ELEMENT department (dname,address)>
<!ELEMENT dname (#PCDATA)>
<!ELEMENT address (#PCDATA)>
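To see a DTD constraint actually being enforced, here is a rough sketch of DTD validation in Java. It uses the JDK's built-in validating parser instead of dom4j, and, to stay self-contained, it embeds the department rules as an internal DTD subset inside the DOCTYPE instead of pointing to an external .dtd file with SYSTEM. The class name DtdValidationDemo and the sample data are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class DtdValidationDemo {
    // Internal DTD subset that mirrors the department rules from the constraint file.
    static final String DOCTYPE =
          "<!DOCTYPE department [\n"
        + "  <!ELEMENT department (dname,address)>\n"
        + "  <!ELEMENT dname (#PCDATA)>\n"
        + "  <!ELEMENT address (#PCDATA)>\n"
        + "]>\n";

    // Validates the element text against the DTD and returns the error messages (empty when valid).
    static List<String> validate(String body) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { errors.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + body)));
        } catch (SAXException e) {
            errors.add(e.getMessage()); // not even well-formed
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // Children in the declared order: no errors are reported.
        System.out.println(validate(
            "<department><dname>IT</dname><address>B104</address></department>"));
        // Children in the wrong order: the (dname,address) sequence is violated.
        System.out.println(validate(
            "<department><address>B104</address><dname>IT</dname></department>"));
    }
}
```

Validation problems are delivered to the ErrorHandler rather than thrown, which is why the sketch collects them in a list.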

XSD constraints

XSD stands for XML Schema Definition. The role of an XML Schema is to define the legal building blocks of an XML document.

The role of XSD:

(1) Define the elements that can appear in the document
(2) Define the attributes that can appear in the document
(3) Define which element is a child element
(4) Define the order of the child elements
(5) Define the number of child elements
(6) Define whether an element is empty or can contain text
(7) Define the data types of elements and attributes
(8) Define default and fixed values for elements and attributes

One of the most important capabilities of XML Schema is its support for data types.

The <schema> element is the root element of every XML Schema.

Syntax to define simple elements:

<element name="tag-name" type="data-type"/>

Common data types are:

string: string type
decimal: decimal number type
integer: integer type
boolean: boolean type
date: date type
time: time type

The syntax for defining attributes is:

<attribute name="attribute-name" type="attribute-data-type"/>

The second important feature about XML Schemas is that they are written in XML.

Next we use XSD to constrain the XML file.

<?xml version="1.0" encoding="UTF-8" ?>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
    <element name="hr">
        <!-- complexType marks a complex node; it is required whenever the element contains child nodes -->
        <complexType>
            <sequence>
                <element name="employee" minOccurs="1" maxOccurs="99">
                    <complexType>
                        <sequence>
                            <element name="name" type="string"></element>
                            <element name="age" >
                                <simpleType>
                                    <restriction base="integer">
                                    <!-- age must be between 18 and 60, inclusive -->
                                        <minInclusive value="18"></minInclusive>
                                        <maxInclusive value="60"></maxInclusive>
                                    </restriction>
                                </simpleType>
                            </element>
                            <element name="salary" type="integer"></element>
                            <element name="department">
                                <complexType>
                                    <sequence>
                                        <element name="dname" type="string"></element>
                                        <element name="address" type="string"></element>
                                    </sequence>
                                </complexType>
                            </element>
                        </sequence>
                        <attribute name="no" type="string" use="required"></attribute>
                    </complexType>
                </element>
            </sequence>
        </complexType>
    </element>
</schema>

To reference an XSD constraint in XML:

<hr xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="path-to-xsd-file">

These attributes go on the opening tag of the root element, which comes right after the XML declaration.
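The age restriction is a good place to watch an XSD at work. The following is only a sketch under a few assumptions: it uses the JDK's javax.xml.validation API instead of dom4j, it passes the schema to the validator directly in code instead of through xsi:noNamespaceSchemaLocation, and it cuts the schema down to a single age element. The class name XsdValidationDemo is invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class XsdValidationDemo {
    // A reduced schema: one age element restricted to integers from 18 to 60.
    static final String XSD =
          "<schema xmlns=\"http://www.w3.org/2001/XMLSchema\">"
        + "<element name=\"age\">"
        + "<simpleType><restriction base=\"integer\">"
        + "<minInclusive value=\"18\"/><maxInclusive value=\"60\"/>"
        + "</restriction></simpleType>"
        + "</element>"
        + "</schema>";

    // Returns true when the XML text satisfies the schema above.
    static boolean isValid(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("the schema itself is broken", e);
        }
        try {
            // Without a custom error handler, validate() throws on the first violation.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<age>31</age>"));  // true
        System.out.println(isValid("<age>-5</age>"));  // false: below minInclusive
        System.out.println(isValid("<age>abc</age>")); // false: not an integer
    }
}
```

This is exactly the negative-age mistake from the constraints section: the document is well-formed, yet the schema rejects it.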

XML parsing

XML parsing means reading and writing XML files from a program. We use dom4j to parse XML. Here is a short introduction to dom4j:

dom4j is a Java XML API, similar to JDOM, for reading and writing XML files. It is open source software available on SourceForge, and it is known for its performance, rich functionality, and ease of use. More and more Java software uses dom4j to read and write XML, for example Hibernate, and even Sun's own JAXM uses dom4j.

Let's go straight to the code.

Read the previous XML file:

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

import java.util.List;

public class HrReader {
    public void readXml() {
        String file = "d:/Java/Java Web/XML/src/hr.xml";
        // SAXReader is the core class for reading XML; it parses the file into an in-memory tree
        SAXReader reader = new SAXReader();
        try {
            Document document = reader.read(file);
            // Get the root node of the XML document, i.e. the hr tag
            Element root = document.getRootElement();
            // elements() returns the list of child tags with the given name
            List<Element> employees = root.elements("employee");
            for(Element employee:employees){
                // element() returns the single child node with the given name
                Element name = employee.element("name");
                // getText() returns the text value of the tag
                String empName = name.getText();
                System.out.println(empName);
                System.out.println(employee.elementText("age"));
                System.out.println(employee.elementText("salary"));
                Element department = employee.element("department");
                System.out.println(department.element("dname").getText());
                System.out.println(department.element("address").getText());
                Attribute att = employee.attribute("no");
                System.out.println(att.getText());
            }
        }catch(DocumentException e){
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws DocumentException {
        HrReader reader = new HrReader();
        reader.readXml();
    }
}

Write operation to the previous XML file:

Writing works on the same in-memory tree. Read the file into a Document with SAXReader, add or change nodes with methods such as addElement(), addAttribute() and setText(), and then write the Document back to the file with dom4j's XMLWriter.
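As a rough illustration of a write operation, the sketch below adds a new employee node to an hr document and serializes the result. It is not dom4j code: it uses the JDK's built-in DOM and Transformer classes so that it runs without extra libraries, and it returns the result as a string instead of writing it back to hr.xml. The class name HrWriterSketch, the method addEmployee, and the sample values are all made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class HrWriterSketch {
    // Appends <tag>text</tag> as a child of the given parent element.
    private static void appendChild(Document doc, Element parent, String tag, String text) {
        Element child = doc.createElement(tag);
        child.setTextContent(text);
        parent.appendChild(child);
    }

    // Parses the XML text, adds one employee under the root, and returns the new XML text.
    static String addEmployee(String xml, String no, String name, String age, String salary) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Element employee = doc.createElement("employee");
            employee.setAttribute("no", no);
            appendChild(doc, employee, "name", name);
            appendChild(doc, employee, "age", age);
            appendChild(doc, employee, "salary", salary);
            doc.getDocumentElement().appendChild(employee);

            // Serialize the modified tree back to text.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException | TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addEmployee("<hr/>", "7708", "Wang", "25", "4500"));
    }
}
```

To persist the change, the StreamResult could wrap a file instead of a StringWriter.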

XPath expressions

XPath is a language for finding information in XML documents. XPath can be used to traverse elements and attributes in an XML document.

Node (Node):
In XPath, there are seven types of nodes: elements, attributes, text, namespaces, processing instructions, comments, and document (root) nodes. XML documents are treated as node trees. The root of the tree is called the document node or root node.

XPath Path Expressions:
XPath uses path expressions to select nodes or sets of nodes in an XML document. These path expressions are very similar to the expressions we see in regular computer file systems.

Select Nodes:
XPath uses path expressions to select nodes in an XML document. Nodes are selected by following a path or steps.

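Common path expressions:
nodename   selects the child nodes with that name
/          selects from the root node
//         selects matching nodes anywhere in the document, regardless of position
.          selects the current node
..         selects the parent of the current node
@          selects attributes

Corresponding examples (using the hr.xml file above):
/hr              selects the root element hr
/hr/employee     selects all employee elements that are children of hr
//employee       selects all employee elements, wherever they are in the document
//@no            selects all attributes named no

Path expressions with predicates, examples:
//employee[1]             selects the first employee element of its parent
//employee[last()]        selects the last employee element of its parent
//employee[position()<3]  selects the first two employee elements of their parent
//employee[@no=7706]      selects the employee elements whose no attribute equals 7706
//employee[salary<4000]   selects the employee elements whose salary is less than 4000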

Here are some exercises on XPath path expressions

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.io.SAXReader;

import java.util.List;

public class XPathTestor {
    public void xpath(String xpathExp){
        String file = "d:/Java/Java Web/XML/src/hr.xml";
        SAXReader reader = new SAXReader();
        try {
            Document document = reader.read(file);
            List<Node> nodes = document.selectNodes(xpathExp);
            for(Node node : nodes){
                Element emp = (Element) node;
                System.out.println(emp.attributeValue("no"));
                System.out.println(emp.elementText("name"));
                System.out.println(emp.elementText("age"));
                System.out.println(emp.elementText("salary"));
                System.out.println("================================");
            }
        } catch(DocumentException e){
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        XPathTestor testor = new XPathTestor();
       //testor.xpath("/hr/employee");
       //testor.xpath("//employee");
       //testor.xpath("//employee[salary<4000]");
       //testor.xpath("//employee[name='张三']");
       //testor.xpath("//employee[@no=7706]");
       //testor.xpath("//employee[1]");
       //testor.xpath("//employee[last()]");
       //testor.xpath("//employee[position()<3]");
       testor.xpath("//employee[1] | //employee[3]");
    }
}
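For readers without dom4j on the classpath, the same kind of exercise can be sketched with the JDK's built-in javax.xml.xpath API. The class name XPathSketch, the select helper, and the shortened sample document with ASCII names are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // A shortened stand-in for hr.xml with two employees.
    static final String HR =
          "<hr>"
        + "<employee no=\"7706\"><name>Zhang</name><age>31</age><salary>5000</salary></employee>"
        + "<employee no=\"7707\"><name>Li</name><age>29</age><salary>4000</salary></employee>"
        + "</hr>";

    // Evaluates the expression against HR and returns the text value of each selected node.
    static List<String> select(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(HR)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("/hr/employee/name"));             // [Zhang, Li]
        System.out.println(select("//employee[salary<5000]/@no"));   // [7707]
        System.out.println(select("//employee[@no=7706]/name"));     // [Zhang]
        System.out.println(select("//employee[last()]/name"));       // [Li]
        // There is no third employee, so the union only contains the first one.
        System.out.println(select("//employee[1]/name | //employee[3]/name")); // [Zhang]
    }
}
```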

Origin blog.csdn.net/apple_51673523/article/details/122699029