XML
1. Concept
    XML stands for Extensible Markup Language.
    * Extensible: tags are user-defined, for example <user> or <student>.
    * Uses
        * Storing data
            1. Configuration files
            2. Data transmitted over a network
    * Differences between XML and HTML
        1. XML tags are user-defined; HTML tags are predefined.
        2. XML syntax is strict; HTML syntax is lenient.
        3. XML is for storing data; HTML is for displaying data.
2. Syntax
    * Basic syntax:
        1. An XML document uses the file extension .xml.
        2. The document declaration, for example <?xml version='1.0'?>, must come first in the file with nothing before it. XML 1.0 allows omitting the declaration, but when it is present it has to be first.
        3. An XML document has exactly one root tag.
        4. Attribute values must be enclosed in quotes (single or double).
        5. Tag names are case sensitive.
    * Quick start:
        <?xml version='1.0' ?>
        <users>
            <user id='1'>
                <name>zhangsan</name>
                <age>23</age>
                <gender>male</gender>
            </user>
            <user id='2'>
                <name>lisi</name>
                <age>24</age>
                <gender>female</gender>
            </user>
        </users>
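The basic syntax rules can be checked mechanically by handing a document to a parser. The sketch below is one way to do that with the JAXP DOM parser that ships with the JDK; the class name WellFormedCheck and the sample strings are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original notes.
class WellFormedCheck {

    /** Returns true when the JDK parser accepts the text as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors; it also replaces the
            // built-in handler that would print them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            return false; // a syntax rule was broken
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // One root tag, quoted attribute values, matching case: accepted.
        System.out.println(isWellFormed(
                "<?xml version='1.0' ?><users><user id='1'><name>zhangsan</name></user></users>"));
        // Two root tags: rejected.
        System.out.println(isWellFormed("<users></users><users></users>"));
        // Unquoted attribute value: rejected.
        System.out.println(isWellFormed("<users><user id=1></user></users>"));
        // <User> closed by </user>: rejected because names are case sensitive.
        System.out.println(isWellFormed("<users><User></user></users>"));
    }
}
```

Running main prints true once and then false three times, one line per rule that was broken.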
    * Components:
        1. Document declaration
            1. Format: <?xml attribute-list ?>
            2. Attributes:
                * version: the version number; this attribute is required.
                * encoding: tells the parser which character encoding the document uses. When it is omitted, parsers assume UTF-8 (or UTF-16 when the file starts with a byte order mark).
                * standalone: whether the document is self-contained.
                    * Values:
                        * yes: does not depend on other files
                        * no: depends on other files
        2. Processing instructions (for reference only): for example, attaching a CSS stylesheet
            * <?xml-stylesheet type="text/css" href="a.css" ?>
        3. Tags: tag names are user-defined
            * Rules:
                * A name can contain letters, digits, and some other characters.
                * A name cannot start with a digit or a punctuation character.
                * A name cannot start with the letters xml (or XML, Xml, and so on).
                * A name cannot contain spaces.
        4. Attributes:
            * The value of an id attribute should be unique within the document.
        5. Text:
            * CDATA section: the data inside it is treated as literal text and shown exactly as written.
                * Format: <![CDATA[ data ]]>
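To see how a parser exposes the declaration attributes and a CDATA section, here is a minimal sketch using the JDK's DOM API. The class name DeclarationAndCdata and the sample document are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class, not part of the original notes.
class DeclarationAndCdata {

    // The CDATA section lets < and & appear without escaping.
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>"
          + "<code><![CDATA[if (a < b && b > c) { }]]></code>";

    /** Parses the text into a DOM Document, wrapping checked exceptions. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println(doc.getXmlVersion());    // 1.0
        System.out.println(doc.getXmlEncoding());   // UTF-8
        System.out.println(doc.getXmlStandalone()); // true
        // The CDATA content comes back as plain text: if (a < b && b > c) { }
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Without the CDATA wrapper the same text would have to be written with &lt; and &amp; entities.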
    * Constraints: rules that specify how an XML document must be written
        * As a user of a framework (a programmer) you should:
            1. Be able to reference a constraint document from an XML document
            2. Be able to read simple constraint documents
        * Categories:
            1. DTD: a simple constraint technology
            2. Schema: a more complex constraint technology
        * DTD:
            * Referencing a DTD from an XML document
                * Internal DTD: the constraint rules are defined inside the XML document
                * External DTD: the constraint rules are defined in a separate dtd file
                    * Local: <!DOCTYPE root-tag-name SYSTEM "location of the dtd file">
                    * Network: <!DOCTYPE root-tag-name PUBLIC "name of the dtd file" "URL of the dtd file">
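As an illustration of an internal DTD, the sketch below validates a document against rules written inside its own DOCTYPE, using the JDK's JAXP parser with validation switched on. The DTD rules, the element names, and the class name DtdValidation are made up for this example and are not taken from any real dtd file.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original notes.
class DtdValidation {

    // Assumed rules: each <student> has a required number attribute
    // and contains exactly one <name> followed by one <age>.
    static final String DTD =
            "<!DOCTYPE students ["
          + "<!ELEMENT students (student*)>"
          + "<!ELEMENT student (name, age)>"
          + "<!ATTLIST student number ID #REQUIRED>"
          + "<!ELEMENT name (#PCDATA)>"
          + "<!ELEMENT age (#PCDATA)>"
          + "]>";

    /** Builds a document whose single <student> has the given body. */
    static String document(String studentBody) {
        return "<?xml version='1.0'?>" + DTD
             + "<students><student number='s001'>" + studentBody + "</student></students>";
    }

    /** Returns true when the document satisfies its own DTD. */
    static boolean isValid(String xml) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Validation problems arrive as recoverable errors, so rethrow them.
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(document("<name>zhangsan</name><age>23</age>"))); // true
        System.out.println(isValid(document("<name>zhangsan</name>")));              // false, <age> is missing
    }
}
```

An external DTD would be checked the same way; only the DOCTYPE line changes to the SYSTEM or PUBLIC form.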
        * Schema:
            * Referencing a schema:
                1. Write the root element of the XML document
                2. Declare the xsi prefix: xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                3. Reference the xsd file by namespace and location: xsi:schemaLocation="http://www.itcast.cn/xml student.xsd"
                4. Declare a prefix for each xsd namespace to identify it. xmlns="http://www.itcast.cn/xml" uses the empty prefix, which makes it the default namespace.
            <students xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.itcast.cn/xml" xsi:schemaLocation="http://www.itcast.cn/xml student.xsd">
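Schema validation can also be driven from code. The sketch below uses the JDK's javax.xml.validation API and passes the schema in directly instead of relying on xsi:schemaLocation. The content of student.xsd is not shown in these notes, so the XSD string here is an assumed stand-in, and the class name SchemaValidation is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of the original notes.
class SchemaValidation {

    // Assumed schema: <students> holds <student> elements, each with a
    // string <name> and an integer <age>, all in the itcast namespace.
    static final String XSD =
            "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
          + " targetNamespace='http://www.itcast.cn/xml' elementFormDefault='qualified'>"
          + "<xsd:element name='students'><xsd:complexType><xsd:sequence>"
          + "<xsd:element name='student' maxOccurs='unbounded'><xsd:complexType><xsd:sequence>"
          + "<xsd:element name='name' type='xsd:string'/>"
          + "<xsd:element name='age' type='xsd:int'/>"
          + "</xsd:sequence></xsd:complexType></xsd:element>"
          + "</xsd:sequence></xsd:complexType></xsd:element>"
          + "</xsd:schema>";

    /** Builds a one-student document in the itcast namespace with the given age text. */
    static String document(String age) {
        return "<students xmlns='http://www.itcast.cn/xml'>"
             + "<student><name>zhangsan</name><age>" + age + "</age></student>"
             + "</students>";
    }

    /** Returns true when the document conforms to XSD. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // The default error handling throws SAXException on the first violation.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(document("23")));  // true
        System.out.println(isValid(document("abc"))); // false, age is not an integer
    }
}
```

Unlike a DTD, the schema can constrain the data type of the text, which is why the second document is rejected.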
3. Parsing
    Working with an XML document means reading the document's data into memory.
    * Operations on an XML document
        1. Parsing (reading): load the document's data into memory
        2. Writing: save in-memory data to an XML file for persistent storage
    * Ways to parse XML:
        1. DOM: loads the whole markup document into memory at once, forming a DOM tree
            * Advantages: easy to work with; supports all CRUD operations on the document
            * Disadvantages: uses a lot of memory
        2. SAX: reads the document sequentially, piece by piece, and is event driven
            * Advantages: uses very little memory
            * Disadvantages: read-only; cannot add, modify, or delete
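The difference between the two approaches shows up clearly in code. The sketch below reads the same names once through a DOM tree and once through SAX events, using the parsers bundled with the JDK. The class name DomVsSax and the sample data are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original notes.
class DomVsSax {

    static final String XML =
            "<users><user id='1'><name>zhangsan</name></user>"
          + "<user id='2'><name>lisi</name></user></users>";

    /** DOM: build the whole tree first, then query it. */
    static List<String> namesWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("name");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** SAX: no tree is built; the handler reacts to events as they stream past. */
    static List<String> namesWithSax(String xml) {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a <name> tag

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("name".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    names.add(text.toString());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(namesWithDom(XML)); // [zhangsan, lisi]
        System.out.println(namesWithSax(XML)); // [zhangsan, lisi]
    }
}
```

The DOM version keeps a Document that could still be modified afterwards; the SAX version keeps only what the handler chose to store.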
    * Common XML parsers:
        1. JAXP: the parser provided by Sun; supports both the DOM and SAX approaches
        2. DOM4J: a popular open-source parser
        3. Jsoup: a Java HTML parser that can parse a URL or HTML text directly. It provides a convenient API for extracting and manipulating data through DOM methods, CSS selectors, and jQuery-like operations.
        4. PULL: the parser built into the Android operating system; streaming, in the SAX style
    * Jsoup:
        * Getting started:
            * Steps:
                1. Import the jar
                2. Get the Document object
                3. Get the Element object for the tag you want
                4. Get the data
            * Code:
                import java.io.File;
                import java.io.IOException;
                import org.jsoup.Jsoup;
                import org.jsoup.nodes.Document;
                import org.jsoup.nodes.Element;
                import org.jsoup.select.Elements;

                public class JsoupDemo1 {
                    public static void main(String[] args) throws IOException {
                        // 2.1 Get the path of student.xml on the classpath
                        String path = JsoupDemo1.class.getClassLoader().getResource("student.xml").getPath();
                        // 2.2 Parse the XML document: load it into memory as a DOM tree (a Document)
                        Document document = Jsoup.parse(new File(path), "UTF-8");
                        // 3. Get the Element objects for the <name> tag
                        Elements elements = document.getElementsByTag("name");
                        System.out.println(elements.size());
                        // 3.1 Get the first <name> Element
                        Element element = elements.get(0);
                        // 3.2 Get its text
                        String name = element.text();
                        System.out.println(name);
                    }
                }
        * Objects used:
            1. Jsoup: utility class that parses an HTML or XML document and returns a Document
                * parse: parses an HTML or XML document and returns a Document
                    * parse(File in, String charsetName): parses an XML or HTML file
                    * parse(String html): parses an XML or HTML string
                    * parse(URL url, int timeoutMillis): fetches the HTML or XML document at a network path
            2. Document: the document object; represents the DOM tree in memory
                * Getting Element objects
                    * getElementById(String id): gets the single element with the given id attribute value
                    * getElementsByTag(String tagName): gets the collection of elements with the given tag name
                    * getElementsByAttribute(String key): gets the collection of elements that have the given attribute
                    * getElementsByAttributeValue(String key, String value): gets the collection of elements whose attribute has the given value
            3. Elements: a collection of Element objects; can be used like an ArrayList<Element>
            4. Element: an element object
                1. Getting child elements
                    * getElementById(String id): gets the single element with the given id attribute value
                    * getElementsByTag(String tagName): gets the collection of elements with the given tag name
                    * getElementsByAttribute(String key): gets the collection of elements that have the given attribute
                    * getElementsByAttributeValue(String key, String value): gets the collection of elements whose attribute has the given value
                2. Getting attribute values
                    * String attr(String key): gets the value of the attribute with the given name
                3. Getting text
                    * String text(): gets the text content
                    * String html(): gets everything inside the tag body, including child tags, as a string
            5. Node: the node object
                * Parent class of Document and Element
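The methods listed above can be tried on an in-memory string. This is a minimal sketch that assumes the jsoup jar is on the classpath; the student data, the number attribute, and the class name JsoupObjectsDemo are made up for the example. It passes Parser.xmlParser() so that jsoup treats the input as XML instead of wrapping it in an HTML page.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

// Hypothetical demo class, not part of the original notes.
class JsoupObjectsDemo {

    // Assumed sample data standing in for student.xml.
    static final String XML =
            "<students>"
          + "<student number='s001'><name id='itcast'>zhangsan</name><age>18</age></student>"
          + "<student number='s002'><name>lisi</name><age>20</age></student>"
          + "</students>";

    /** Parses the sample with jsoup's XML parser. */
    static Document load() {
        return Jsoup.parse(XML, "", Parser.xmlParser());
    }

    public static void main(String[] args) {
        Document doc = load();

        // Document: look elements up by tag, id, or attribute.
        Elements students = doc.getElementsByTag("student");
        System.out.println(students.size());                        // 2
        System.out.println(doc.getElementById("itcast").text());    // zhangsan
        System.out.println(doc.getElementsByAttribute("id").size()); // 1

        // Element: the same lookups work on a subtree, plus attr() and text().
        Element second = doc.getElementsByAttributeValue("number", "s002").first();
        System.out.println(second.attr("number"));                   // s002
        System.out.println(second.getElementsByTag("name").text());  // lisi
    }
}
```

Elements behaves like a list, so students.get(0) and a for-each loop over students both work as they would on an ArrayList.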
        * Shortcut queries:
            1. Selector: CSS-style selectors
                * Method: Elements select(String cssQuery)
                    * Syntax: see the syntax defined in the documentation of the Selector class
            2. XPath: the XML Path Language, a language for locating parts of an XML document (XML is a subset of the Standard Generalized Markup Language)
                * Using XPath with jsoup requires importing an additional jar.
                * See the w3school reference manual for the full XPath syntax.
                * Code:
                    // 1. Get the path of student.xml on the classpath
                    String path = JsoupDemo6.class.getClassLoader().getResource("student.xml").getPath();
                    // 2. Get the Document object
                    Document document = Jsoup.parse(new File(path), "UTF-8");
                    // 3. Create a JXDocument from the Document
                    JXDocument jxDocument = new JXDocument(document);
                    // 4. Query with XPath syntax
                    // 4.1 All <student> tags
                    List<JXNode> jxNodes = jxDocument.selN("//student");
                    for (JXNode jxNode : jxNodes) {
                        System.out.println(jxNode);
                    }
                    System.out.println("--------------------");
                    // 4.2 All <name> tags under <student> tags
                    List<JXNode> jxNodes2 = jxDocument.selN("//student/name");
                    for (JXNode jxNode : jxNodes2) {
                        System.out.println(jxNode);
                    }
                    System.out.println("--------------------");
                    // 4.3 <name> tags under <student> that have an id attribute
                    List<JXNode> jxNodes3 = jxDocument.selN("//student/name[@id]");
                    for (JXNode jxNode : jxNodes3) {
                        System.out.println(jxNode);
                    }
                    System.out.println("--------------------");
                    // 4.4 <name> tags under <student> whose id attribute value is itcast
                    List<JXNode> jxNodes4 = jxDocument.selN("//student/name[@id='itcast']");
                    for (JXNode jxNode : jxNodes4) {
                        System.out.println(jxNode);
                    }
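The same four XPath expressions can be tried without any extra jar. The sketch below swaps in the JDK's own javax.xml.xpath API in place of JXDocument, so it illustrates the query syntax only and not the jsoup add-on itself. The sample data and the class name XPathQueries are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original notes.
class XPathQueries {

    // Assumed sample data standing in for student.xml.
    static final String XML =
            "<students>"
          + "<student number='s001'><name id='itcast'>zhangsan</name><age>18</age></student>"
          + "<student number='s002'><name>lisi</name><age>20</age></student>"
          + "</students>";

    /** Evaluates an XPath expression and returns the text of each matching node. */
    static List<String> select(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select(XML, "//student").size());             // 2
        System.out.println(select(XML, "//student/name"));               // [zhangsan, lisi]
        System.out.println(select(XML, "//student/name[@id]"));          // [zhangsan]
        System.out.println(select(XML, "//student/name[@id='itcast']")); // [zhangsan]
    }
}
```

The predicate in square brackets filters the nodes selected by the path before it, which is why the last two queries return only the name that carries an id attribute.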