XML and how to parse it on Android

2019-08-01

Keywords: Android parsing XML, XML format, XML meaning


This article introduces XML and how to parse it on Android.

1, XML Introduction

 

XML is a data format for transmitting and storing data. Its full name is Extensible Markup Language. XML is plain text with a simple grammar and few formatting requirements, which is why it is so widely used on the Internet.

 

XML structure

The following is a standard XML data structure:

1 <?xml version="1.0" encoding="utf-8"?>
2 <address>
3     <server>
4         <ip>224.1.1.2</ip>
5         <port>2002</port>
6     </server>
7 </address>

 

An XML document consists mainly of two parts: the declaration and the content body.

 

In the example above, the first line is the declaration. It generally states the XML version and the character encoding. You usually do not need to handle this part yourself; the parser deals with it automatically when parsing the XML.

 

The content body corresponds to lines 2 to 7 of the example. It is the core of the XML document: all data to be transmitted or stored is recorded in this body. An XML file contains a single root tag; in the example above it is the <address></address> tag pair. The role of the root tag can be compared to the Linux file system: in Linux, the entire file system starts from the root directory /, and in XML, all the data to be transmitted or stored starts from the root tag.

 

As the example above shows, XML syntax supports nesting. This means we are free to design nested relationships in order to organize the data more clearly and sensibly.

 

XML syntax

XML syntax rules are not complicated. Here are a few common ones.

 

Rule 1: Tags must be closed

XML requires every tag to be closed. That is, both the start tag and the end tag must be written explicitly.

The start and end tags contain exactly the same name; the only difference is that the end tag has a / in front of the name.

For example: <server></server>, <ip></ip>.

Data can be placed directly between the start and end tags. For example, a phone number recorded in XML looks like this: <phone>13172717561</phone>.

In addition, the end tag can be abbreviated.

If a tag has nothing to record between its start and end tags, the pair can be abbreviated to a single self-closing tag. For example, the following two forms are completely equivalent:

   <server></server> , <server />

 

Rule 2: Case sensitivity

XML tags are case sensitive. Within a tag pair, the case of every character must match.

This is a correct tag pair: <server></server>

This is an incorrect tag pair: <server></Server>
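One way to see this rule in action is to hand both forms to a parser. The following is a minimal sketch in plain Java; the class name CaseCheck and the helper isWellFormed are made up for this illustration. It uses the standard javax.xml.parsers.DocumentBuilder and simply treats any SAXException as "not well-formed".

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class CaseCheck {
    // Hypothetical helper: returns true if the parser accepts the XML string.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            return false; // mismatched or unclosed tags end up here
        } catch (Exception e) {
            throw new IllegalStateException(e); // I/O or configuration problem
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<server></server>")); // start and end tag match
        System.out.println(isWellFormed("<server></Server>")); // end tag differs in case
    }
}
```

With a conforming parser, the first call should report true and the second false, because <server> and </Server> are treated as two different tag names.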

 

Rule 3: Nesting

XML tags can be nested, and there is no limit on the nesting depth. The example at the beginning of this article shows this.

In addition, the tag hierarchy in XML works much like the directory hierarchy of the Linux file system.

In a Linux file system, a directory can store data directly, or it can contain subdirectories that in turn store data. XML is the same: inside a tag you can create child tags or store data directly, and child tags and data can also coexist.

 

For example, the following XML is valid:

<server>
    <ip>224.1.1.2</ip>
    <port>2002</port>
    This is a test server.
</server>

In the XML shown above, the content of the ip tag is correctly read as 224.1.1.2, the content of the port tag as 2002, and the content of the server tag as "This is a test server.".

 

But one thing should be noted: everything between the start and end tags that is not wrapped in a child tag is content of that tag.

How should this be understood? In the example, the content of the server tag is not only the string "This is a test server.". Besides the visible text, it also includes the invisible characters before and after it: spaces, Tabs, line breaks and so on. If the sentence spans multiple lines, the content of the server tag spans multiple lines as well. So, assuming the string in the example is preceded by four space characters, the content of the server tag is actually "    This is a test server.".
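The effect can be checked with a small experiment. This is only a sketch in plain Java using the standard DOM classes; MixedContentDemo, parseRoot and directText are names invented here, and the XML string is an assumed version of the server example described above, indented with four spaces.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class MixedContentDemo {
    // Hypothetical helper: parses the string and returns the root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Concatenates only the text nodes that are direct children of the element,
    // i.e. the text that is not wrapped in a child tag.
    static String directText(Element element) {
        StringBuilder sb = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String xml = "<server>\n"
                + "    <ip>224.1.1.2</ip>\n"
                + "    <port>2002</port>\n"
                + "    This is a test server.\n"
                + "</server>";
        String raw = directText(parseRoot(xml));
        System.out.println("[" + raw + "]");        // includes line breaks and indentation
        System.out.println("[" + raw.trim() + "]"); // visible text only
    }
}
```

The raw string contains the line breaks and indentation around the child tags as well as the sentence itself. If only the visible text matters, calling trim() on the result is a common way to discard the surrounding whitespace.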

 

Rule 4: Attribute values must be quoted

The start tag of an XML element can carry any number of attributes, written as key-value pairs. These attributes can also store information. For example:

<server location="lab1" manager="chorm"></server>

Of course, to make the hierarchy clearer, we usually lay the attributes out over several lines, like this:

<server
    location="lab1"
    manager="chorm">
</server>

The value of every attribute must be enclosed in quotation marks. Double quotes are the usual choice; single quotes are also allowed.

 

Rule 5: Escaping

A few special characters cannot appear directly in the content, because they may cause an XML parse error. They should be replaced by their escaped forms, as follows:

Character    Escaped form
<            &lt;
>            &gt;
&            &amp;
'            &apos;
"            &quot;
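As an illustration of escaping, here is a minimal sketch in plain Java. The class EscapeDemo and its helpers escapeXml and readBack are hypothetical names, not part of any library. The sketch replaces the five characters with XML's predefined entities &lt; &gt; &amp; &apos; and &quot;, then parses the result to confirm that the original text comes back.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

class EscapeDemo {
    // Hypothetical helper: replaces the five special characters with their entities.
    static String escapeXml(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: parses the XML and returns the root element's text.
    static String readBack(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "if (a < b && b > 0) print(\"ok\")";
        String xml = "<code>" + escapeXml(raw) + "</code>";
        System.out.println(xml);                       // escaped form as stored in the file
        System.out.println(readBack(xml).equals(raw)); // parser restores the original text
    }
}
```

The parser undoes the escaping on its own, so code that reads XML normally never has to unescape anything by hand.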

 

  

Rule 6: Comments

XML comments are written in this format: <!-- comment -->

 

Rule 7: Whitespace is not merged

Consecutive spaces in XML content are not compressed. However many space characters you write, that is how many the parser returns. This is unlike HTML, where a run of consecutive spaces is collapsed into a single one.

 

 

2, How to parse XML on Android

XML syntax and structure are relatively simple, so if you are interested you can write an XML parser yourself. Of course, many mature XML parsers already exist, and if you want to shorten the development cycle you can use a ready-made parsing interface directly.

Android does not provide an XML parser of its own design, but it bundles the parsers from Java, and they can be used directly in your code.

There are currently three main ways to parse XML:

1, SAX

2, DOM

3, PULL

 

SAX

This approach sends a notification whenever the scanner reaches the start or the end of a tag. You implement the corresponding callback methods and do the parsing there, based on the tag name, content and attributes.

It is the more laborious approach, because the callbacks have to be written differently for each kind of XML document. That makes it somewhat more flexible, but the workload is relatively large.

It reads the document as a stream.

As a result, it parses quickly and uses relatively little memory.

SAX parsing is built around org.xml.sax.helpers.DefaultHandler.
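To make the callback style concrete, here is a minimal sketch of a SAX handler in plain Java. SaxDemo and its parse method are hypothetical names, and the rule of collecting "tag=text" pairs is just one assumed way to use the callbacks. The handler overrides three methods of DefaultHandler: startElement, characters and endElement.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxDemo {
    // Hypothetical helper: collects "tag=text" for every element with non-blank text.
    static List<String> parse(String xml) {
        final List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                text.setLength(0); // a new tag starts: drop text gathered so far
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may be called several times per text run
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    result.add(qName + "=" + value);
                }
                text.setLength(0);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<address><server><ip>224.1.1.2</ip><port>2002</port></server></address>";
        System.out.println(parse(xml));
    }
}
```

The parser drives the control flow here: it decides when each callback runs, and the handler has to keep its own state (the StringBuilder) between calls. This is the extra work the SAX approach requires.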

 

DOM

This approach reads the entire XML document into memory in one go. It is relatively simple to use and requires almost no extra work. The drawback is that memory usage is high for large XML files.

DOM parsing is provided by javax.xml.parsers.DocumentBuilderFactory and javax.xml.parsers.DocumentBuilder.
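For comparison, the following is a minimal DOM sketch in plain Java. DomDemo, textOf and attributeOf are names invented for this illustration, and the XML string is an assumed sample. The whole document is loaded first and then queried by tag name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DomDemo {
    // Hypothetical helper: loads the whole document into memory.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Text of the first element with the given tag name, or null if there is none.
    static String textOf(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    // Attribute value of the first element with the given tag name, or null if there is none.
    static String attributeOf(Document doc, String tag, String name) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : ((Element) nodes.item(0)).getAttribute(name);
    }

    public static void main(String[] args) {
        Document doc = parse("<address><server location=\"lab1\" manager=\"chorm\">"
                + "<ip>224.1.1.1</ip><port>2001</port></server></address>");
        System.out.println("ip: " + textOf(doc, "ip"));
        System.out.println("port: " + textOf(doc, "port"));
        System.out.println("location: " + attributeOf(doc, "server", "location"));
    }
}
```

Because the tree is fully built before any query runs, elements can be looked up in any order. That convenience is what is paid for with the higher memory use.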

 

PULL

This approach is similar to SAX. The benefit is that we do not need to implement callbacks: our code asks the parser for the next item, and tags, attributes and content are read one by one in document order.

It also reads the document as a stream.

PULL parsing is provided by org.xmlpull.v1.XmlPullParserFactory and org.xmlpull.v1.XmlPullParser.

 

 

3, XML parsing example

The XML encountered in daily work is usually nothing more than some nested tags, some with attributes and some with content. Here is a sample program that parses XML of this kind using the PULL approach.

Assume the XML to be parsed is as follows:

<?xml version="1.0" encoding="utf-8"?>
<address>
    
    <server
        location="lab1"
        manager="chorm">
        
        <ip>224.1.1.1</ip>
        <port>2001</port>
    </server>
    
    <Computer
        brand="lenovo"
        user="lemontea">
        
        <system>cent os 7</system>
        <password value="admin" />
        The memory is up to 32GB.
        
        
    </Computer>
    
</address>

 

Now let's write the parsing code on Android.

The first step is to create a PULL XML parser object.

import java.io.FileInputStream;
import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserFactory;
XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
XmlPullParser xpp = factory.newPullParser();
xpp.setInput(new FileInputStream("address.xml"), "UTF-8");

 

The second step is to read the current event type.

int et = xpp.getEventType();

 

The third step is to write the parsing logic for the tags.

while(et != XmlPullParser.END_DOCUMENT) {
    switch(et) {
        case XmlPullParser.START_DOCUMENT:{
            // do nothing.
        }break;
        case XmlPullParser.START_TAG:{
            String tagName = xpp.getName();
            
            if("server".equals(tagName)) {
                log("server");
                String location = xpp.getAttributeValue(null, "location");
                String manager = xpp.getAttributeValue(null, "manager"); 
                log("location:" + location + ", manager:" + manager);
            }else if("ip".equals(tagName)) {
                // Read the text content that follows the start tag directly.
                // After nextText() has run, the attributes of the ip tag can no longer be read.
                final String ip = xpp.nextText();
                log("ip:" + ip);
            }else if("port".equals(tagName)) {
                final String port = xpp.nextText();
                log("port:" + port);
            }else if("Computer".equals(tagName)) {
                log("Computer");
                // Iterate over all attributes of the tag.
                for(int i = 0; i < xpp.getAttributeCount(); i++) {
                    log(xpp.getAttributeName(i) + ":" + xpp.getAttributeValue(i));
                }
            }else if("system".equals(tagName)){
                String system = xpp.nextText();
                log("system:" + system);
            }else if("password".equals(tagName)){
                String psw = xpp.getAttributeValue(null, "value");
                log("password:" + psw);
            }else {
                
            }
        }break;
        case XmlPullParser.TEXT:{
            String txt = xpp.getText();
            log("+" + txt);
        }break;
        case XmlPullParser.END_TAG:{
            // do nothing.
        }break;
    }
    
    et = xpp.next();
}

 

The output printed after running this code is as follows:

D/VAM     (17133): +
D/VAM     (17133):     
D/VAM     (17133):     
D/VAM     (17133): server
D/VAM     (17133): location:lab1, manager:chorm
D/VAM     (17133): +
D/VAM     (17133):         
D/VAM     (17133):         
D/VAM     (17133): ip:224.1.1.1
D/VAM     (17133): +
D/VAM     (17133):         
D/VAM     (17133): port:2001
D/VAM     (17133): +
D/VAM     (17133):     
D/VAM     (17133): +
D/VAM     (17133):     
D/VAM     (17133):     
D/VAM     (17133): Computer
D/VAM     (17133): brand:lenovo
D/VAM     (17133): user:lemontea
D/VAM     (17133): +
D/VAM     (17133):         
D/VAM     (17133):         
D/VAM     (17133): system:cent os 7
D/VAM     (17133): +
D/VAM     (17133):         
D/VAM     (17133): password:admin
D/VAM     (17133): +
D/VAM     (17133):         The memory is up to 32GB.
D/VAM     (17133):         
D/VAM     (17133):         
D/VAM     (17133):     
D/VAM     (17133): +
D/VAM     (17133):     

 

Doesn't this output look a little strange? Why are there so many blank lines?

In fact, the root cause of these uneven lines is the point made earlier: PULL is also a streaming parser, and all characters between the start and end tags that are not wrapped in a child tag are content.

Opening the XML document in Notepad++ shows which characters it really contains, including the invisible ones.

Streaming parsing means the document is processed in order from beginning to end, and our code prints whatever is read.

First, the start tag <address> is read. Then the content is read: a CRLF followed by a Tab character, then another CRLF, then another Tab character. The next thing is the start tag <server>, so reading pauses there. Everything read before it is "content" and gets printed first. According to the code above, the XmlPullParser.TEXT branch prints "+" followed by the text, so what appears is a "+" line followed by two lines containing only whitespace.

So the output we got earlier is correct!

Besides the characters we can see, we must be careful not to overlook the ones we cannot see. Parsing the whole XML document step by step according to this rule yields exactly the output shown above.

The parsing code above also demonstrates several different method calls for reading attributes and content. Choose among them according to your actual needs.

 


 

References:  https://www.runoob.com/w3cnote/android-tutorial-xml.html


Origin www.cnblogs.com/chorm590/p/11281366.html