Configuration files -- XML

1. XML concept

XML: Extensible Markup Language
Extensible: tags are user-defined; you can name a tag whatever you like, as long as the name follows the tag-naming rules

2. What XML is for: storing data

  • As a configuration file
  • As a format for saving data and transmitting it over a network (XML is plain text, so it is independent of programming language and platform)

3. Differences between XML and HTML

Background -- W3C: the World Wide Web Consortium, founded in 1994, has published many influential web technology standards and implementation guidelines, including XML and HTML.

Differences between XML and HTML:

  • XML tags are user-defined; HTML tags are predefined
  • XML syntax is strict; HTML syntax is loose
  • XML is for storing data; HTML is for displaying data

4. XML syntax

4.1. Simple XML code

Create a file on the desktop with the ".xml" extension. While you are getting started, you can open it in Notepad and write the XML code:

<?xml version = '1.0'?>

<users>

	<user id = "1">
		<name>zhangsan</name>
		<age>23</age>
		<gender>male</gender>
	</user>
	
	<user id = "2">
		<name>lisi</name>
		<age>22</age>
		<gender>female</gender>
	</user>
	
</users>

How do you check that the XML is written correctly? Every browser can parse XML documents because it has a built-in XML parsing engine. Open the XML file in a browser: if no error is reported, the XML code is well-formed.
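The same check can be done in code rather than in a browser. The following is a minimal sketch that uses the JDK's built-in DOM parser; the class name WellFormedCheck and the sample strings are illustrative assumptions, not part of the original notes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether a string is well-formed XML.
class WellFormedCheck {
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<?xml version = '1.0'?>"
                + "<users><user id = \"1\"><name>zhangsan</name></user></users>";
        String bad = "<users><user></users>"; // <user> is never closed
        System.out.println(isWellFormed(good));
        System.out.println(isWellFormed(bad));
    }
}
```

This only checks that the document is well-formed (tags closed and nested correctly); it does not check the document against any constraint file.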

4.2. Basic syntax

  • XML documents use the ".xml" file extension
  • The document declaration, if present, must be on the first line of the document (no blank lines or spaces before it); including one is good practice
  • An XML document has exactly one root tag
  • Attribute values must be enclosed in quotation marks; either single or double quotes may be used
  • Tags must be closed correctly: either as a self-closing tag, or with a start tag and an end tag that match each other
  • XML tag names are case-sensitive

4.3. Components

4.3.1. Document declaration

4.3.1.1. Format

<?xml attribute-list ?>: note that there must be no space between the question mark and xml

4.3.1.2. Attribute list

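  • version: the version number; write 1.0. Omitting version from the declaration causes an error
  • encoding: tells the parsing engine which character set the document uses. When no encoding is declared, XML parsers assume UTF-8 (or UTF-16 if a byte order mark is present), not ISO-8859-1
  • standalone: whether the document is standalone; the two values are yes (does not depend on other files) and no (depends on other files). In practice it is rarely set explicitly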

4.3.2. Processing instructions

Used together with CSS, for example to attach a stylesheet to the document with the xml-stylesheet instruction

4.3.3. Tags

Tag names are user-defined, subject to the following rules:

  • Names can contain letters, digits, and other characters
  • Names cannot start with a digit or a punctuation character
  • Names cannot start with the letters xml (or XML, Xml, etc.)
  • Names cannot contain spaces

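4.3.4. Attributes

The value of an id attribute is unique within the document

4.3.5. Text

Special characters such as the less-than sign, the greater-than sign, and the ampersand must be escaped. To make this easier, XML provides CDATA sections, whose content is taken literally. The format is:

<![CDATA[
	data to display
]]>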

Examples are as follows:

<code>
    <!-- Goal: store the code if(a < b && a > c){} -->

    <!-- Escaped -->
    if(a &lt; b &amp;&amp; a &gt; c){}

    <!-- CDATA section -->
    <![CDATA[
        if(a < b && a > c){}
    ]]>
</code>
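To see that the two forms carry the same text, one can parse each and compare the results. This is a small sketch using the JDK's DOM parser; the class name EscapeVsCdata and the method textOf are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo: escaped text and a CDATA section yield the same characters.
class EscapeVsCdata {
    /** Parses the XML and returns the text content of its root element. */
    static String textOf(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String escaped = "<code>if(a &lt; b &amp;&amp; a &gt; c){}</code>";
        String cdata = "<code><![CDATA[if(a < b && a > c){}]]></code>";
        System.out.println(textOf(escaped));
        System.out.println(textOf(cdata));
        System.out.println(textOf(escaped).equals(textOf(cdata)));
    }
}
```

Escaping is fine for a stray character or two; a CDATA section is easier to read when the text contains many special characters, as code usually does.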

4.3.6. Comments

<!-- comment text -->

5. XML constraints

5.1. Basic concepts of constraint documents

Who writes the XML? The user of the software.
Who parses the XML? The software.
A constraint document specifies the rules that the XML document must follow so that the software can parse it.
As users of the software, we need to:

  • Be able to reference a constraint file from an XML document
  • Be able to read a constraint document at a basic level (many development environments offer auto-completion hints based on the constraint document, so being able to read it is enough)

5.2. Constraint document technologies

Constraint documents fall into two categories:

  • DTD: a simple constraint technology
  • Schema: a more complex constraint technology

5.2.1. DTD constraints

[Figure: a simple DTD constraint document]

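5.2.1.1. Referencing a DTD

Internal DTD (rarely used)

The constraint rules are defined inside the XML document itself

External DTD

The constraint rules are defined in an external DTD file. An external DTD can be referenced in two ways:

  • Local: <!DOCTYPE root-tag-name SYSTEM "location of the DTD file">
  • Network: <!DOCTYPE root-tag-name PUBLIC "name of the DTD" "URL of the DTD file">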
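As a rough illustration of what a DTD does, the sketch below validates a document against an internal DTD with the JDK's validating DOM parser. The DTD itself, the class name DtdValidation, and the sample documents are assumptions written for this example; they are not the DTD from the original notes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: validate documents against an internal DTD.
class DtdValidation {
    // Illustrative DTD: users holds any number of user elements,
    // and each user holds name, age, gender in that order.
    static final String DTD =
            "<!DOCTYPE users [\n"
          + "  <!ELEMENT users (user*)>\n"
          + "  <!ELEMENT user (name, age, gender)>\n"
          + "  <!ATTLIST user id CDATA #REQUIRED>\n"
          + "  <!ELEMENT name (#PCDATA)>\n"
          + "  <!ELEMENT age (#PCDATA)>\n"
          + "  <!ELEMENT gender (#PCDATA)>\n"
          + "]>\n";

    /** Returns the validation error messages for the given document body. */
    static List<String> validate(String body) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // validity errors are recoverable
                }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String valid = "<users><user id=\"1\"><name>zhangsan</name>"
                + "<age>23</age><gender>male</gender></user></users>";
        String missingGender = "<users><user id=\"1\"><name>zhangsan</name>"
                + "<age>23</age></user></users>";
        String oddAge = "<users><user id=\"1\"><name>zhangsan</name>"
                + "<age>not a number</age><gender>male</gender></user></users>";
        System.out.println(validate(valid).isEmpty());
        System.out.println(validate(missingGender).isEmpty());
        // The DTD checks structure only, so a non-numeric age still passes.
        System.out.println(validate(oddAge).isEmpty());
    }
}
```

An external DTD is validated the same way; only the DOCTYPE line changes to point at the file.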

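5.2.1.2. Disadvantage of DTD

Its constraints are not strong enough: for example, a DTD cannot restrict the data type or value range of an element's text content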

5.2.2. Schema constraints

[Figure: a simple Schema constraint document]
Each custom type is defined in more detail

5.2.2.1. Referencing a Schema

  • Write the root element of the XML document
  • Declare the xsi prefix: xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  • Reference the xsd file and its namespace with xsi:schemaLocation
  • Declare a prefix for the namespace of each xsd constraint to identify it
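To show the kind of check a Schema adds, here is a minimal sketch using the JDK's javax.xml.validation API. Instead of relying on xsi:schemaLocation inside the document, it hands the schema to the validator directly. The schema, the class name SchemaValidation, and the sample documents are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical demo: a Schema can constrain the data type of element text.
class SchemaValidation {
    // Illustrative schema: a user has a name (string) and an age (non-negative integer).
    static final String XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name='user'><xs:complexType><xs:sequence>"
          + "<xs:element name='name' type='xs:string'/>"
          + "<xs:element name='age' type='xs:nonNegativeInteger'/>"
          + "</xs:sequence></xs:complexType></xs:element>"
          + "</xs:schema>";

    static boolean isValid(String xml) {
        Schema schema;
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("the schema itself is broken", e);
        }
        try {
            Validator validator = schema.newValidator();
            // With no error handler set, validation errors are thrown.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<user><name>zhangsan</name><age>23</age></user>"));
        System.out.println(isValid("<user><name>zhangsan</name><age>abc</age></user>"));
    }
}
```

The second document has the right structure but a non-numeric age, which is exactly the kind of mistake a type-aware Schema can reject.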

6. Parsing XML

Parsing: operating on an XML document by reading its data into memory

Operations on XML documents:

  • Parsing (reading): read the data in the document into memory
  • Writing: save in-memory data to an XML document for persistent storage
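The writing direction can be sketched with the JDK's Transformer API, which serializes an in-memory DOM tree back to XML text. The class name WriteXmlDemo and its helper methods are hypothetical; the example writes to a string, and passing a StreamResult built from a File would persist to disk instead.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo: build a small document in memory, then write it out as XML.
class WriteXmlDemo {
    /** Builds users > user(id=1) > name in memory. */
    static Document buildUsers() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element users = doc.createElement("users");
            doc.appendChild(users);
            Element user = doc.createElement("user");
            user.setAttribute("id", "1");
            users.appendChild(user);
            Element name = doc.createElement("name");
            name.setTextContent("zhangsan");
            user.appendChild(name);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes the in-memory document to XML text. */
    static String toXml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(buildUsers()));
    }
}
```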

6.1. XML parsing approaches

6.1.1. DOM

DOM: loads the entire markup document into memory at once, forming a DOM tree in memory
Advantages of DOM:

  • Easy to work with; supports all CRUD operations on the document

Disadvantages of DOM:

  • Because everything is loaded at once, a very large file produces a tree that takes up a lot of memory
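A short sketch of the DOM approach, using the JDK's built-in parser: the whole document is loaded into a tree, which can then be both read and modified. The class name DomDemo, its helper methods, and the added user "wangwu" are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo: read and modify an XML document through its DOM tree.
class DomDemo {
    static final String XML = "<users>"
            + "<user id=\"1\"><name>zhangsan</name><age>23</age></user>"
            + "<user id=\"2\"><name>lisi</name><age>22</age></user>"
            + "</users>";

    /** Loads the whole document into memory as a DOM tree. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the text of every name element, in document order. */
    static List<String> names(Document doc) {
        List<String> result = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagName("name");
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    /** Adds a new user element to the tree (the "create" part of CRUD). */
    static void addUser(Document doc, String id, String name) {
        Element user = doc.createElement("user");
        user.setAttribute("id", id);
        Element nameElement = doc.createElement("name");
        nameElement.setTextContent(name);
        user.appendChild(nameElement);
        doc.getDocumentElement().appendChild(user);
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println(names(doc)); // [zhangsan, lisi]
        addUser(doc, "3", "wangwu");
        System.out.println(names(doc)); // [zhangsan, lisi, wangwu]
    }
}
```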

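6.1.2. SAX

SAX: reads the document sequentially, one piece at a time, and is event-driven
Advantages of SAX: only the piece currently being read is held in memory, so it uses very little memory; suitable for devices with limited memory
Disadvantages of SAX: read-only; the document cannot be added to, modified, or deleted from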
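The event-driven style can be sketched with the JDK's SAX parser: the parser calls back into a handler as it meets each start tag, run of text, and end tag, and nothing is kept unless the handler keeps it. The class name SaxDemo and the method names are assumptions chosen for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: collect the text of every name element using SAX events.
class SaxDemo {
    static List<String> names(String xml) {
        List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private boolean insideName;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("name".equals(qName)) {
                    insideName = true;
                    text.setLength(0); // start collecting fresh text
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in several chunks, so append rather than assign.
                if (insideName) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    result.add(text.toString());
                    insideName = false;
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<users>"
                + "<user id=\"1\"><name>zhangsan</name><age>23</age></user>"
                + "<user id=\"2\"><name>lisi</name><age>22</age></user>"
                + "</users>";
        System.out.println(names(xml)); // [zhangsan, lisi]
    }
}
```

The handler only ever sees the current event, which is why SAX uses little memory and also why it cannot go back and change the document.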

6.2. Common XML parsers

6.2.1. JAXP (rarely used)

Provided by Sun; supports both the DOM and SAX approaches

6.2.2. DOM4J (excellent)

An excellent DOM-based parser

6.2.3. jsoup

jsoup is a Java HTML parser that can directly parse a URL or HTML text. It provides a very convenient API for extracting and manipulating data using DOM methods, CSS selectors, and jQuery-like operations

6.2.3.1. jsoup quick start

Steps:

  • Import the jar package
  • Get the Document object, which represents the whole DOM tree

Ways to obtain and work with the data:

  • Parse from a URL, file, or string
  • Locate and extract data using DOM traversal or CSS selectors
  • Manipulate elements, attributes, and text
  • Get the corresponding tag as an Element object
  • Retrieve the data

6.2.3.2. Using jsoup objects

  • Jsoup: a utility class that parses an HTML or XML document and returns a Document; the main thing to learn is its parse method

parse: parses an HTML or XML document and returns a Document

  • parse(File in, String charsetName): parses an XML or HTML file
  • parse(String html): parses an XML or HTML string
  • parse(URL url, int timeoutMillis): fetches the HTML or XML document at the given URL over the network; this is quite common and is used when writing web crawlers
  • Document: extends Element; it is the document object that represents the DOM tree in memory

Mainly used to get Element objects:

  • getElementById(String id): gets the single Element whose id attribute has the given value (used very often)
  • getElementsByTag(String tagName): gets the collection of elements with the given tag name
  • getElementsByAttribute(String key): gets the collection of elements that have the given attribute name
  • getElementsByAttributeValue(String key, String value): gets the collection of elements whose given attribute has the given value
  • Elements: a collection of Element objects; it can be used like an ArrayList<Element>
  • Element: an element object, from which you can get child elements, attribute values, text, etc.

1. Get child element objects:

  • getElementById(String id): gets the single Element whose id attribute has the given value (used very often)
  • getElementsByTag(String tagName): gets the collection of elements with the given tag name
  • getElementsByAttribute(String key): gets the collection of elements that have the given attribute name
  • getElementsByAttributeValue(String key, String value): gets the collection of elements whose given attribute has the given value

2. Get attribute values:

  • String attr(String key): gets the attribute value for the given attribute name

3. Get text:

  • String text(): gets the plain text content
  • String html(): gets the entire content of the tag body, including child tags, as a string
  • Node: the node object; it is the parent class of Document and Element

6.2.3.3. Shortcut query methods in jsoup

  • selector: selectors; once you know the syntax, querying deeply nested content is more convenient

1. Method used: Elements select(String cssQuery)
2. Syntax: see the syntax defined in the Selector class
3. [Figure: a more complex selector example]

  • XPath: the XML Path Language, a language for locating parts of an XML document

Using XPath with jsoup requires importing an additional jar package, because although XPath is used to query XML, it is a separate technology from XML itself.
Consult the W3CSchool reference manual for the XPath syntax needed to write queries.
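To give a feel for XPath syntax, the sketch below evaluates two expressions. Because the jsoup XPath add-on is a separate jar, this example swaps in the JDK's built-in javax.xml.xpath API instead; it illustrates the XPath language only and is not jsoup's API. The class name XPathDemo and the expressions are assumptions chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo: locate parts of an XML document with XPath expressions.
class XPathDemo {
    static final String XML = "<users>"
            + "<user id=\"1\"><name>zhangsan</name><age>23</age></user>"
            + "<user id=\"2\"><name>lisi</name><age>22</age></user>"
            + "</users>";

    /** Evaluates the expression against the XML and returns the result as a string. */
    static String evaluate(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The name of the user whose id attribute equals 2.
        System.out.println(evaluate(XML, "/users/user[@id='2']/name")); // lisi
        // The number of user elements anywhere in the document.
        System.out.println(evaluate(XML, "count(//user)")); // 2
    }
}
```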

6.2.4. PULL

The parser built into the Android operating system; it works in the SAX style

Origin blog.csdn.net/LiLiLiLaLa/article/details/90219528