XML
XML stands for eXtensible Markup Language. It is designed to transmit and store data.
1. Features
- XML is platform-independent and is a standalone markup language
- XML is self-descriptive
2. Uses
- Network data transmission
- Data storage
- Configuration files
3. XML tree structure
The XML document forms a tree structure that starts from the "root" and then expands to the "branches and leaves".
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>Qs</to>
<from>Sky</from>
<heading>Reminder</heading>
<body>Don't forget to date this weekend!</body>
</note>
The first line is the XML declaration. It specifies the XML version (1.0) and the character encoding (UTF-8, a Unicode encoding that can represent text in virtually any language).
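To make the tree structure concrete, here is a minimal sketch that parses a document with the DOM parser bundled in the JDK and prints one element per line, indented by depth. The class name `XmlTreeOutline` and the method `outline` are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XmlTreeOutline {
    // Returns one line per element, indented two spaces per tree level.
    public static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), 0, out); // start at the root element
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Depth-first walk: the root comes first, then its branches and leaves.
    private static void walk(Node node, int depth, StringBuilder out) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip text nodes, comments, and so on
        }
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.getNodeName()).append('\n');
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, out);
        }
    }

    public static void main(String[] args) {
        String xml = "<note><to>Qs</to><from>Sky</from>"
                + "<heading>Reminder</heading><body>Don't forget!</body></note>";
        System.out.print(outline(xml));
    }
}
```

For the note document, the root `note` is printed first and its four children appear one level deeper.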
4. Syntax rules
- XML tags are case sensitive
- An XML document must have exactly one root element
- Every XML element must have a closing tag (or be written as a self-closing tag)
- XML attribute values must be quoted
- Entity references

  Entity reference   Character   Description
  &lt;               <           less than
  &gt;               >           greater than
  &amp;              &           ampersand
  &apos;             '           apostrophe
  &quot;             "           quotation mark

  In XML, only the characters "<" and "&" are strictly illegal in element content. The greater-than sign is legal, but replacing it with an entity reference is good practice.
- Comments in XML
  <!-- This is a comment -->
- Whitespace is preserved in XML, whereas HTML collapses runs of consecutive whitespace characters into one
- XML stores newlines as LF
  - Windows applications usually store a line break as a pair of characters: carriage return (CR) and line feed (LF).
  - Unix and Mac OS X store a line break as LF.
  - Older Mac systems stored a line break as CR.
  - An XML parser normalizes all of these to a single LF.
- XML naming rules
  - Names can contain letters, digits, and other characters
  - Names cannot start with a digit or a punctuation character
  - Names cannot start with the letters xml (or XML, Xml, etc.)
  - Names cannot contain spaces
  - The same tag name may be used more than once
- XML elements are extensible. One advantage of XML is that a document can be extended with new elements without breaking applications that already read it
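The rules above can be checked mechanically: a parser rejects any document that breaks them. The following is a minimal sketch using the DOM parser bundled in the JDK; the class name `WellFormedCheck` and the method `isWellFormed` are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {
    // Returns true if the parser accepts the text as well-formed XML.
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a syntax rule was violated
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<note><to>Qs</to></note>")); // valid
        System.out.println(isWellFormed("<Note></note>"));      // case mismatch
        System.out.println(isWellFormed("<note><to>Qs</note>")); // missing closing tag
        System.out.println(isWellFormed("<note id=1/>"));        // unquoted attribute
        System.out.println(isWellFormed("<a/><b/>"));            // two root elements
        System.out.println(isWellFormed("<a>1 < 2</a>"));        // raw "<" in content
        System.out.println(isWellFormed("<a>1 &lt; 2</a>"));     // escaped with &lt;
    }
}
```

Only the first and last documents in `main` follow the rules; each of the others breaks exactly one of them.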
5. XML parsing methods in Java
5.1 SAX parsing
- SAX parsing is event-driven.
- A SAX parser reads the XML file sequentially from start to end and fires an event whenever it encounters the start of a tag, the end of a tag, text content, or attributes
- We write handler code to respond to these events
Disadvantages:
- Parsing is one-way. The parser cannot jump to a given level of the document or access different parts of the same document at the same time (it streams through the input, so by the time a later part is being parsed, the earlier parts have already been released and can no longer be operated on).
- The parser does not report an element's nesting level when an event fires, so the application must track parent/child relationships itself
- Parsing is read-only: the content of the XML document cannot be modified.
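As a minimal sketch of the event-driven model, the handler below records each event fired by the SAX parser bundled in the JDK and tracks the nesting depth by hand, since SAX does not report it. The class name `SaxEventLog` and the method `events` are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventLog {
    // Parses the XML text and returns a log of the events, in firing order.
    public static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private int depth = 0; // maintained by the application, not by SAX

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                log.add("start " + qName + " depth=" + depth);
                depth++;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver one text run in several chunks;
                // short inputs like the one in main usually arrive in one.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text " + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth--;
                log.add("end " + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String event : events("<note><to>Qs</to></note>")) {
            System.out.println(event);
        }
    }
}
```

Once `endElement` has fired for a tag, the parser keeps nothing about it; anything the program needs later has to be saved inside the handler, as the `log` list does here.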
5.2 DOM parsing
DOM is the official W3C standard for representing XML documents in a platform- and language-independent way. Parsing with DOM usually requires loading the entire document and building a document tree in memory. Programmers can then read, modify, and delete data by manipulating the tree.
Advantages:
- The document is held in memory, so both its data and its structure can be changed
- Access is two-way: the tree can be traversed in either direction at any time.
Disadvantages: the whole document is loaded into memory, which consumes a lot of resources for large files
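A minimal sketch of both advantages, using the DOM implementation bundled in the JDK: one method modifies a node and serializes the tree back to text, the other walks upward from a child to its parent. The class name `DomEditDemo` and its methods are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomEditDemo {
    // Loads the whole document into an in-memory tree.
    private static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Replaces the text of the first element named tagName and
    // returns the modified document as a string.
    public static String setText(String xml, String tagName, String newText) {
        try {
            Document doc = parse(xml);
            Node target = doc.getElementsByTagName(tagName).item(0);
            if (target == null) {
                throw new IllegalArgumentException("no element named " + tagName);
            }
            target.setTextContent(newText); // in-memory modification
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Two-way access: navigates from the first tagName element up to its parent.
    public static String parentOf(String xml, String tagName) {
        try {
            Node child = parse(xml).getElementsByTagName(tagName).item(0);
            if (child == null) {
                throw new IllegalArgumentException("no element named " + tagName);
            }
            return child.getParentNode().getNodeName();
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<note><to>Qs</to><from>Sky</from></note>";
        System.out.println(setText(xml, "to", "Al"));
        System.out.println(parentOf(xml, "from"));
    }
}
```

Neither operation is possible with SAX alone, which is the trade-off for keeping the whole tree in memory.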
5.3 JDOM parsing
- JDOM aims to be a Java-specific document model that simplifies working with XML and is faster than using DOM. As the first Java-specific model, JDOM has been vigorously promoted
- The JDOM documentation states that its goal is to "solve 80% (or more) of Java/XML problems with 20% (or less) of the effort" (the 20% figure is an assumption based on the learning curve)
Disadvantages: limited flexibility.
5.4 DOM4J parsing
DOM4J started as an intelligent fork of JDOM. It incorporates many features beyond basic XML document representation, including integrated XPath support, XML Schema support, and event-based processing for large or streaming documents. It also offers several options for building the document representation. DOM4J is a Java XML API known for good performance, rich functionality, and ease of use. It is also open source software.
5.5 Parsing XML with DOM4J
Steps:
- Import the jar file dom4j.jar
- Create an input stream pointing to the XML file
  FileInputStream fis = new FileInputStream("path to the XML file");
- Create an XML reader object
  SAXReader sr = new SAXReader();
- Use the reader to read the input stream of the XML document and obtain the document object
  Document doc = sr.read(fis);
- Get the root element of the XML document from the document object
  Element root = doc.getRootElement();
5.5.1 Document object
Represents the entire XML document loaded into memory
Common methods:
- Get the root element of the XML document from the document object
  Element root = doc.getRootElement();
- Add a root node
  Element root = doc.addElement("rootNodeName");
5.5.2 Element object
Represents a single node in the XML document
Common methods:
- Get the node name
  String getName();
- Get the node content
  String getText();
- Set the node content
  void setText(String text);
- Get the first child node whose name matches the given name
  Element element(String childName);
- Get all child nodes
  List elements();
- Get the value of one of the node's attributes
  String attributeValue(String attributeName);
- Get the content of a child node
  String elementText(String childName);
- Add a child node
  Element addElement(String childName);
- Add an attribute
  Element addAttribute(String attributeName, String attributeValue);
5.5.3 Parsing XML with DOM4J and XPath
- A path expression quickly locates one element or a group of elements by path
  - /: search from the root node
  - //: search the descendants of the node where the search starts
  - .: select the current node
  - ..: select the parent node
  - @: select an attribute
    Attribute predicates:
    [@attributeName='value']
    [@attributeName>'value']
    [@attributeName<'value']
    [@attributeName!='value']
- Usage
  // Lookups go through two methods of the Node interface
  // (Node is the parent interface of Document and Element).
  // Method 1: find the single node that matches the path expression
  Element e = (Element) doc.selectSingleNode("path expression");
  // Method 2: find all nodes that match the path expression
  List<Node> es = doc.selectNodes("path expression");
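The path expressions themselves are not specific to DOM4J. As a minimal sketch, the example below evaluates a few of them with the javax.xml.xpath API bundled in the JDK, which is a swapped-in substitute for DOM4J's selectNodes so that the code runs without extra jars. The class name `XPathDemo`, the method `select`, and the sample `BOOKS` document are hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathDemo {
    // Sample data for the illustration only.
    static final String BOOKS =
            "<books>"
            + "<book id=\"1\"><name>A</name></book>"
            + "<book id=\"2\"><name>B</name></book>"
            + "</books>";

    // Evaluates the path expression and returns the text of every matching node.
    public static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad expression or document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select(BOOKS, "/books/book/name"));      // from the root
        System.out.println(select(BOOKS, "//name"));                // any descendant
        System.out.println(select(BOOKS, "//book[@id='2']/name"));  // attribute equals
        System.out.println(select(BOOKS, "//book[@id!='2']/name")); // attribute differs
        System.out.println(select(BOOKS, "//book/@id"));            // attribute values
    }
}
```

The same expression strings can be passed to DOM4J's selectSingleNode and selectNodes.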
5.6 Generating XML in Java
Steps:
- Create an empty document object with DocumentHelper
  Document doc = DocumentHelper.createDocument();
- Add a root node to the document object
  Element root = doc.addElement("rootNodeName");
- Add child nodes through the root node object root
  Element e = root.addElement("elementName");
- Create a file output stream for storing the XML file
  FileOutputStream fos = new FileOutputStream("location to store the file");
- Wrap the file output stream in an XML document output stream
  XMLWriter xw = new XMLWriter(fos);
- Write out the document
  xw.write(doc);
- Release resources
  xw.close();
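Complete example:
import java.io.FileOutputStream;
import java.io.IOException;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.XMLWriter;

public class GenerateBooks {
    public static void main(String[] args) throws IOException {
        // 1. Create an empty document object with the document helper
        Document doc = DocumentHelper.createDocument();
        // 2. Add the root node to the document object
        Element books = doc.addElement("books");
        // 3. Add child nodes to the root node
        for (int i = 0; i < 1000; i++) {
            // Add 1000 book nodes to the root node
            Element book = books.addElement("book");
            // Add an id attribute to the book node
            book.addAttribute("id", String.valueOf(i + 1));
            // Add name and info nodes to the book node
            Element name = book.addElement("name");
            Element info = book.addElement("info");
            name.setText("Apple" + i);
            info.setText("Hahaha" + i);
        }
        // 4. Create the file output stream
        FileOutputStream fos = new FileOutputStream("c:\\books.xml");
        // 5. Wrap the file output stream in an XML document output stream
        XMLWriter xw = new XMLWriter(fos);
        // 6. Write out the XML document
        xw.write(doc);
        // 7. Release resources
        xw.close();
        System.out.println("Code execution finished");
    }
}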
5.7 Using XStream
XStream quickly converts Java objects into XML strings
Steps:
- Create an XStream object
  XStream x = new XStream();
- Change the node name generated for a class (the default node name is the fully qualified class name, i.e. package name.class name)
  x.alias("nodeName", ClassName.class);
- Pass in an object to generate the XML string
  String xml = x.toXML(object);
Example:
Person p = new Person(1001, "Zhang San", "unknown");
XStream x = new XStream();
x.alias("haha", Person.class);
String xml = x.toXML(p);
System.out.println(xml);