XML JAXP parsing and schema constraints

1. Introduction to XML parsing

* XML is a markup document.
* JavaScript uses the DOM to parse markup documents:
- Following the hierarchical structure of the HTML, a tree structure is allocated in memory, and the HTML tags, attributes and text are encapsulated as objects.
- The objects are the document object, element objects, attribute objects, text objects and the Node object.
* XML parsing techniques: DOM and SAX.
*** The difference between DOM parsing and SAX parsing:
** DOM parsing
* Allocates a tree structure in memory according to the hierarchical structure of the XML, and encapsulates the XML tags, attributes and text as objects.
* Disadvantage: if the file is too large, it can cause memory overflow.
* Advantage: add, delete and modify operations are very convenient to implement.
** SAX parsing
* Event-driven, parsing while reading: it goes from top to bottom, line by line, and when it parses an object it returns that object's name.
* Disadvantage: add, delete and modify operations cannot be implemented.
* Advantage: a large file does not cause memory overflow, and query operations are convenient.
* To parse XML, you first need a parser.
** Different companies and organizations provide parsers for the DOM and SAX approaches, exposed through APIs:
*** Sun provides JAXP for DOM and SAX parsing.
*** The dom4j organization provides dom4j for DOM and SAX parsing (this is the one used in actual development).
*** The jdom organization provides jdom for DOM and SAX parsing.

2. JAXP

** JAXP is part of Java SE.

** The JAXP parser is in the JDK's javax.xml.parsers package.
** Four classes: the classes used for DOM and SAX parsing.
*** DOM:
DocumentBuilder: the parser class
- This class is abstract and cannot be created with new. An instance is obtained from the DocumentBuilderFactory.newDocumentBuilder() method.
- Its parse("xml path") method parses the XML and returns the entire document as a Document.
- The returned Document is an interface whose parent interface is Node. If you can't find the method you want on Document, look for it on Node.
- Methods available on the Document interface (some are inherited from Node):
getElementsByTagName(String tagname)
-- gets the tags with that name and returns them as a NodeList collection
createElement(String tagName)
-- creates a tag
createTextNode(String data)
-- creates text
appendChild(Node newChild)
-- adds a node (for example text) below a tag
removeChild(Node oldChild)
-- deletes a node
getParentNode()
-- gets the parent node

NodeList interface

- getLength() gets the length of the collection
- item(int index) gets the node at the given index
for (int i = 0; i < list.getLength(); i++) {
    list.item(i); // returns a Node
}

Node interface

- getTextContent() gets the content inside the tag

DocumentBuilderFactory: the parser factory

- This class is also abstract and cannot be created with new.
- newInstance() gets an instance of DocumentBuilderFactory.

3. Use JAXP to implement a query

*** Query the values of all name elements in the XML
** Steps
/*
 * 1. Create a parser factory: DocumentBuilderFactory.newInstance();
 * 2. Create a parser from the parser factory: builderFactory.newDocumentBuilder();
 * 3. Parse the XML and return a Document: Document document = builder.parse("src/Person.xml");
 * 4. Get all name elements: document.getElementsByTagName("name");
 * 5. The result is a collection; traverse it to get each name element
 *    - traverse with getLength() and item()
 *    - get the value inside an element with getTextContent()
 */

DocumentBuilderFactory dbf= DocumentBuilderFactory.newInstance();
DocumentBuilder db=dbf.newDocumentBuilder();
Document d=db.parse("src/Person.xml");
NodeList nl=d.getElementsByTagName("name");
for(int i=0;i<nl.getLength();i++){
    Node n=nl.item(i);
    System.out.println(n.getTextContent());
}

4. Use JAXP to add a node

*** Add <sex>male</sex> under the first student (at the end)
** Steps
/*
 * 1. Create a parser factory
 * 2. Create a parser from the parser factory
 * 3. Parse the XML and return a Document
 * 4. Get all student elements and use the item method with an index to get the first one
 * 5. Create the sex tag with createElement
 * 6. Create the text with createTextNode
 * 7. Add the text under sex with appendChild
 * 8. Add sex under the first student with appendChild
 * 9. Write back the XML
 */

DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document document = builder.parse("src/Person.xml");
// Get all student elements and take only the first one
NodeList nodeList = document.getElementsByTagName("student");
Node firstStudent = nodeList.item(0);
// Create <sex>, create its text, and put the text under <sex>
Node sex = document.createElement("sex");
Node text = document.createTextNode("male");
sex.appendChild(text);
// Append <sex> at the end of the first student
firstStudent.appendChild(sex);
// Write the modified tree back to the file
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.transform(new DOMSource(document), new StreamResult("src/Person.xml"));

Writing the XML back requires the Transformer abstract class, which is obtained through the newTransformer() method of the TransformerFactory class.

        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();

Method for the write-back:

        transform(Source xmlSource, Result outputTarget)

                    Parameters: Source xmlSource: the XML input to be transformed (here a DOMSource, whose constructor takes a Node)

                            Result outputTarget: the Result that receives the transformed xmlSource (here a StreamResult, whose constructor takes the path to write to)
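
While experimenting, it can be handy to send the write-back to a string instead of a file, so the result can be inspected directly. The following is a minimal sketch under that assumption; the class name WriteBackDemo, its helper methods and the sample XML are made up for illustration and are not part of the original notes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: serializes a DOM tree with Transformer.transform.
class WriteBackDemo {
    // Parses XML held in a string (a stand-in for src/Person.xml).
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Source = the DOM tree, Result = a StringWriter instead of a file path.
    static String serialize(Document document) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Leave out the <?xml ...?> header so only the elements are returned.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document document = parse("<person><name>zhangsan</name></person>");
        System.out.println(serialize(document));
    }
}
```

Swapping the StringWriter for a file path gives the same write-back as in the add-node code above.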

5. Use JAXP to modify a node

*** Change the content of sex under the first p1 to nan
** Steps
/*
 * 1. Create a parser factory
 * 2. Create a parser from the parser factory
 * 3. Parse the XML and return a Document
 * 4. Get the sex element with the item method
 * 5. Modify the value in sex with the setTextContent method
 * 6. Write back the XML
 */
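
The notes give no code for this section, so here is a minimal sketch of steps 1 to 5. It parses from an in-memory string rather than src/Person.xml, and the class name ModifyDemo, the helper methods and the sample XML are assumptions made for illustration. The write-back (step 6) works the same way as in the add-node section and is left out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: changes the text of the first <sex> element.
class ModifyDemo {
    // Steps 1-3: factory -> parser -> Document (parsed from a string here).
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Steps 4-5: take the first <sex> element and replace its text.
    static void setFirstSex(Document document, String value) {
        Node sex = document.getElementsByTagName("sex").item(0);
        if (sex == null) {
            throw new IllegalArgumentException("no <sex> element found");
        }
        sex.setTextContent(value);
    }

    // Joins the text of every <sex> element so the change can be checked.
    static String sexValues(Document document) {
        NodeList list = document.getElementsByTagName("sex");
        StringBuilder values = new StringBuilder();
        for (int i = 0; i < list.getLength(); i++) {
            if (i > 0) {
                values.append(',');
            }
            values.append(list.item(i).getTextContent());
        }
        return values.toString();
    }

    public static void main(String[] args) throws Exception {
        Document document = parse(
                "<person><p1><name>zhangsan</name><sex>nv</sex></p1>"
                + "<p1><name>lisi</name><sex>nv</sex></p1></person>");
        setFirstSex(document, "nan");
        System.out.println(sexValues(document));
    }
}
```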

6. Use JAXP to delete a node

*** Delete the <sex>nan</sex> node
** Steps
/*
 * 1. Create a parser factory
 * 2. Create a parser from the parser factory
 * 3. Parse the XML and return a Document
 * 4. Get the sex element
 * 5. Use the getParentNode method to get the parent node of sex
 * 6. Delete sex by calling the removeChild method on the parent node
 * 7. Write back the XML
 */
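
As a rough illustration of steps 4 to 6, the sketch below removes the first sex element through its parent. The point to notice is that a node is not deleted directly: its parent removes it. The class name DeleteDemo and the sample XML are assumptions; parsing from a string replaces src/Person.xml, and the write-back is the same as in the add-node section.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: deletes the first <sex> element.
class DeleteDemo {
    // Steps 1-3: factory -> parser -> Document (parsed from a string here).
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Steps 4-6: find <sex>, get its parent, and let the parent remove it.
    static void removeFirstSex(Document document) {
        Node sex = document.getElementsByTagName("sex").item(0);
        if (sex == null) {
            return; // nothing to delete
        }
        Node parent = sex.getParentNode();
        parent.removeChild(sex);
    }

    public static void main(String[] args) throws Exception {
        Document document = parse(
                "<person><p1><name>zhangsan</name><sex>nan</sex></p1></person>");
        removeFirstSex(document);
        System.out.println(document.getElementsByTagName("sex").getLength());
    }
}
```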

7. Use JAXP to traverse nodes

** Print the names of all elements in the XML
** Steps
/*
 * 1. Create a parser factory
 * 2. Create a parser from the parser factory
 * 3. Parse the XML and return a Document
 *
 * ==== Implemented with recursion ====
 * 4. Get the root node
 * 5. Get the child nodes of the root node
 * 6. Get the child nodes of those child nodes, and so on
 */

public static void main(String[] args) throws Exception {
	DocumentBuilderFactory builderFactory = DocumentBuilderFactory
			.newInstance();
	DocumentBuilder builder = builderFactory.newDocumentBuilder();
	Document document = builder.parse("src/Person.xml");
	list(document);
}

private static void list(Node node) {
	//Print only when it is determined to be the element type
	if (node.getNodeType() == Node.ELEMENT_NODE) {
		System.out.println(node.getNodeName());
	}
	//Get the set of child nodes of this node
	NodeList list = node.getChildNodes();
	for (int i = 0; i < list.getLength(); i++) {
		// get each child node
		Node node1 = list.item(i);
		//recursive call
		list(node1);
	}
}

8. How SAX parsing works

* DOM mode, for comparison: allocates a tree structure in memory according to the hierarchical structure of the XML
** and encapsulates the tags, attributes and text in the XML as objects
* SAX mode: event-driven, parsing while reading
* In the javax.xml.parsers package
** SAXParser
An instance of this class is obtained from the SAXParserFactory.newSAXParser() method
- parse(File f, DefaultHandler dh)
* two parameters
** first parameter: the XML path
** second parameter: the event handler
** SAXParserFactory
An instance is obtained with the newInstance() method
* SAX execution process

* When a start tag is parsed, the startElement method is executed automatically

                        startElement(String uri, String localName, String qName, Attributes attributes)

* When text is parsed, the characters method is executed automatically

                        characters(char[] ch, int start, int length)

* When an end tag is parsed, the endElement method is executed automatically

                        endElement(String uri, String localName, String qName)
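
To make the execution process visible, the following sketch records each callback in the order the parser fires it. The class name SaxTraceDemo, the event labels and the sample XML are made up for illustration. Whitespace-only text is skipped here, and note that a parser is allowed to deliver one run of text in several characters calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: logs the SAX callbacks in the order they fire.
class SaxTraceDemo {
    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(trace("<person><name>zhangsan</name></person>"));
    }
}
```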

9. Use JAXP's SAX mode to parse XML

* SAX mode cannot implement add, delete or modify operations, only query operations
** Print the entire document
*** Execute the parse method; the first parameter is the XML path and the second parameter is the event handler
*** Create a class that inherits from the event handler class
*** Override the three methods in it
* Get the values of all the name elements
** Define a member variable flag = false
** In the start method, check whether the element is a name element; if it is, set flag to true
** If flag is true, print the content in the characters method
** When execution reaches the end method, set flag to false
* Get the value of the first name element
** Define a member variable idx = 1
** In the end method, increase idx by one (idx++)
** To print the value of the first name element, check in the characters method:
-- if flag == true and idx == 1, print the content

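import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxDemo {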
	public static void main(String[] args) throws Exception {
		SAXParserFactory parserFactory = SAXParserFactory.newInstance();
		SAXParser saxParser = parserFactory.newSAXParser();
		MyDefault dh = new MyDefault();
		saxParser.parse("src/Person.xml", dh);
	}
}

// Create a class that inherits the event handler class
// Override the three methods inside
class MyDefault extends DefaultHandler {
	// define a member variable boo= false
	boolean boo = false;

	@Override
	public void startElement(String uri, String localName, String qName,
			Attributes attributes) throws SAXException {
		// Determine whether the start method is a name element, if it is a name element, set the boo value to true
		if (qName.equals("name")) {
			boo = true;
			System.out.print("<" + qName + ">");
		}
	}

	@Override
	public void endElement(String uri, String localName, String qName)
			throws SAXException {
		// When the execution reaches the end method, set the boo value to false
		if (qName.equals("name")) {
			boo = false;
			System.out.println("</" + qName + ">");
		}
	}

	@Override
	public void characters(char[] ch, int start, int length)
			throws SAXException {
		// If the boo value is true, print the content in the characters method
		if (boo) {
			System.out.print(new String(ch, start, length));
		}
	}
}
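
The class above covers "get the values of all name elements". For "get the value of the first name element", the idx steps listed earlier might look like the sketch below. The class name FirstNameDemo and the sample XML are assumptions, and it parses from a string and returns the text rather than printing inside the handler.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: returns the text of the first <name> element.
class FirstNameDemo {
    static String firstName(String xml) throws Exception {
        final StringBuilder result = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            boolean flag = false; // true while inside a <name> element
            int idx = 1;          // number of the <name> element being read

            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes attributes) {
                if ("name".equals(qName)) {
                    flag = true;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Keep text only for the first <name> element.
                if (flag && idx == 1) {
                    result.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    flag = false;
                    idx++; // the next <name> will be number idx
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return result.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstName(
                "<person><p1><name>zhangsan</name></p1><p1><name>lisi</name></p1></person>"));
    }
}
```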


10. Schema constraints

* DTD syntax: <!ELEMENT element-name constraint>
** A schema itself conforms to XML syntax and is written with XML statements
** One XML document can have multiple schemas, and multiple schemas are distinguished by namespace (similar to Java package names)
** DTD has the PCDATA type, while a schema supports many more data types
*** For example, if age can only be an integer, an integer type can be declared directly in the schema
*** Schema syntax is more complicated, and schema cannot currently replace DTD

11. Schema quick start

* Create a schema file with the suffix .xsd
** The root node is <schema>
** Attributes in the schema file
** xmlns="http://www.w3.org/2001/XMLSchema"
- indicates that the current XML file is a constraint file
** targetNamespace="http://www.example.org/1"
- a document that uses the schema imports the constraint file through this address
** elementFormDefault="qualified"
Steps
(1) See how many elements are in the XML, and write one <element> for each
(2) Distinguish simple elements from complex elements
* A complex element is written as
<complexType>
<sequence>
sub-elements
</sequence>
</complexType>
(3) Simple elements are written inside the complex element
<element name="person">
<complexType>
<sequence>
<element name="name" type="string"></element>
<element name="age" type="int"></element>
</sequence>
</complexType>
</element>

(4) Introduce the constraint file in the constrained file
<person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://www.example.org/1"
xsi:schemaLocation="http://www.example.org/1 1.xsd">

** xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
-- indicates that this XML is a constrained file
** xmlns="http://www.example.org/1"
-- is the targetNamespace in the constraint document
** xsi:schemaLocation="http://www.example.org/1 1.xsd"
-- the targetNamespace, a space, then the address path of the constraint document

* <sequence>: the elements must appear in the declared order
* <all>: each element can appear only once
* <choice>: only one of the elements can appear
* maxOccurs="unbounded": the element can occur any number of times
* <any></any>: any element
* Attributes can be constrained
** written inside the complex element
*** placed just before </complexType>

<attribute name="id1" type="int" use="required"></attribute>
- name: attribute name
- type: attribute type int string
- use: whether the attribute must appear; required means it must
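
To check that a document really obeys such a constraint file, one option is the javax.xml.validation package that ships with the JDK. The notes do not cover it, so treat the following as a sketch under assumptions: the class name SchemaCheckDemo is made up, and the schema is the person example from above with the id1 attribute added, held in a string instead of a 1.xsd file.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical demo class: validates XML against an in-memory schema.
class SchemaCheckDemo {
    // The person schema from the steps above, plus the required id1 attribute.
    static final String XSD =
            "<schema xmlns='http://www.w3.org/2001/XMLSchema'"
            + " targetNamespace='http://www.example.org/1'"
            + " elementFormDefault='qualified'>"
            + "<element name='person'><complexType><sequence>"
            + "<element name='name' type='string'/>"
            + "<element name='age' type='int'/>"
            + "</sequence>"
            + "<attribute name='id1' type='int' use='required'/>"
            + "</complexType></element></schema>";

    // Returns true when the XML satisfies the schema, false otherwise.
    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With no custom error handler, a violation is reported as an exception.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String good = "<person xmlns='http://www.example.org/1' id1='1'>"
                + "<name>zhangsan</name><age>20</age></person>";
        String bad = "<person xmlns='http://www.example.org/1' id1='1'>"
                + "<name>zhangsan</name><age>abc</age></person>";
        System.out.println(isValid(good));
        System.out.println(isValid(bad));
    }
}
```

The second document fails because age is declared as int, which is the kind of data-type check that DTD's PCDATA cannot express.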

* Complex schema constraints
** Multiple schema files can be introduced, and each of them can be given an alias (a namespace prefix)

<?xml version="1.0" encoding="UTF-8"?>
<!-- The data file references multiple schemas -->
<company xmlns = "http://www.example.org/company"
	xmlns:dept="http://www.example.org/department"
	xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://www.example.org/company company.xsd http://www.example.org/department department.xsd"
>
	<employee age="30">
		<!-- Department name -->
		<dept:name>Human Resources</dept:name>
		<!-- Employee name -->
		<name>Wang Xiaoxiao</name>
	</employee>
</company>

** company.xsd
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.example.org/company"
elementFormDefault="qualified">
	<element name="company">
		<complexType>
			<sequence>
				<element name="employee">
					<complexType>
						<sequence>
							<!-- Refers to any element -->
							<any></any>
							<!-- employee name-->
							<element name="name"></element>
						</sequence>
						<!-- Add attribute to employee element -->
						<attribute name="age" type="int"></attribute>
					</complexType>
				</element>
			</sequence>
		</complexType>
	</element>
</schema>

** department.xsd
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
 targetNamespace="http://www.example.org/department"
 elementFormDefault="qualified">
 <!-- Department Name-->
 <element name="name" type="string"></element>
</schema>

