Python xml

Reprinted from: https://www.cnblogs.com/gouguoqilinux/p/9168332.html

 

XML is a protocol for exchanging data between different languages or programs. It is much like JSON, and JSON is simpler to use, but many traditional companies still expose mainly XML interfaces.

XML, like HTML, is a markup language built from tags.

What we mainly study here is ElementTree, Python's XML processing module, which provides a lightweight object model. To use it, import xml.etree.ElementTree.
An ElementTree represents the entire XML node tree, while an Element represents a single node in that tree.
Look at the XML text below. Its tags fall into two kinds.
1, paired tags with a separate closing tag: <rank updated="yes">2</rank>
2, self-closing tags: <neighbor direction="E" name="Austria" />
 
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year updated="yes">2010</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year updated="yes">2013</year>
        <gdppc>59900</gdppc>
        <neighbor direction="N" name="Malaysia" />
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year updated="yes">2013</year>
        <gdppc>13600</gdppc>
        <neighbor direction="W" name="Costa Rica" />
        <neighbor direction="E" name="Colombia" />
    </country>
</data> 
 

Second, using the xml.etree.ElementTree module

1, print the name of the root tag


import xml.etree.ElementTree as ET  # the module name is long, so import it under the alias ET
tree = ET.parse("xml_lesson.xml")  # parse() reads and parses the XML file and returns an ElementTree object
root = tree.getroot()  # getroot() returns the root element of the tree
print(root.tag)  # tag is an attribute of the element; its value is the name of the root tag

C:\python35\python3.exe D:/pyproject/day21 module/xml_test

data
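
To experiment without creating xml_lesson.xml first, the same kind of root element can be obtained from a string with ET.fromstring. This is only a minimal sketch: the variable name xml_text and the shortened sample data are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document, held in a string instead of a file.
xml_text = """
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year updated="yes">2010</year>
    </country>
</data>
"""

# fromstring() parses the text and returns the root Element directly,
# so there is no tree.getroot() step.
root = ET.fromstring(xml_text)
print(root.tag)  # data
```

ET.parse is the natural choice when the data lives in a file, while ET.fromstring is convenient for short snippets like this one.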
 

What is a tag and what is an attribute? For example:

<country name="Liechtenstein">

country is the tag

name="Liechtenstein" is an attribute of this tag



<neighbor direction="W" name="Costa Rica" />

neighbor is the tag

direction="W" and name="Costa Rica" are the two attributes of this tag
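
In ElementTree terms, the tag name is exposed as .tag and the attributes as the .attrib dictionary. A small sketch using the neighbor element from the example (the variable name elem is just illustrative):

```python
import xml.etree.ElementTree as ET

# Parse a single self-closing element from a string.
elem = ET.fromstring('<neighbor direction="W" name="Costa Rica" />')

print(elem.tag)               # neighbor
print(elem.attrib)            # a dict holding both attributes
print(elem.get("direction"))  # W
print(elem.get("missing"))    # None: get() does not raise for an absent attribute
```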

2, use a for loop to see what is under the root


for n in root:
    print(n)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test

<Element 'country' at 0x0000000000E339F8>

<Element 'country' at 0x0000000000E38408>

<Element 'country' at 0x0000000000E38598>
 

The loop yields three Element objects. To print the names of these three tags, use the tag attribute.


for n in root:
    print(n.tag)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test

country

country

country
 

Print the tags under each country


for n in root:
    for i in n:
        print(i.tag)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test

rank

year

gdppc

neighbor

neighbor

rank

year

gdppc

neighbor

rank

year

gdppc

neighbor

neighbor
 

3, attrib: print the attributes of a tag

Print the attributes of each country


for n in root:
    print(n.attrib)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test

{'name': 'Liechtenstein'}

{'name': 'Singapore'}

{'name': 'Panama'}
 

Print the attributes of the elements inside each country


for n in root:
    for i in n:
        print(i.attrib)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test

{'updated': 'yes'}

{'updated': 'yes'}

{}

{'name': 'Austria', 'direction': 'E'}

{'name': 'Switzerland', 'direction': 'W'}

{'updated': 'yes'}

{'updated': 'yes'}

{}

{'name': 'Malaysia', 'direction': 'N'}

{'updated': 'yes'}

{'updated': 'yes'}

{}

{'name': 'Costa Rica', 'direction': 'W'}

{'name': 'Colombia', 'direction': 'E'}
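
Because attrib is an ordinary dictionary, individual attributes can be picked out by name. The following sketch collects the neighbors of each country into a dict. The trimmed sample data and the names xml_text and neighbors are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample data, keeping only the neighbor elements.
xml_text = """
<data>
    <country name="Liechtenstein">
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <neighbor direction="N" name="Malaysia" />
    </country>
</data>
"""
root = ET.fromstring(xml_text)

neighbors = {}
for country in root:
    # get() reads one attribute and returns None if it is missing.
    names = [child.get("name") for child in country if child.tag == "neighbor"]
    # attrib["name"] is plain dict access and raises KeyError if missing.
    neighbors[country.attrib["name"]] = names

print(neighbors)
```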
 

4, text: the actual content enclosed by a tag


for n in root:
    for i in n:
        print(i.text)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test

2

2010

141100

None

None

5

2013

59900

None

69

2013

13600

None

None 
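
The None values in the output come from the self-closing neighbor tags, which enclose no content. Code that processes text usually has to allow for this. A minimal sketch, in which the "(no text)" placeholder is an arbitrary choice:

```python
import xml.etree.ElementTree as ET

country = ET.fromstring(
    '<country name="Singapore">'
    '<rank updated="yes">5</rank>'
    '<neighbor direction="N" name="Malaysia" />'
    '</country>'
)

for child in country:
    # Self-closing elements have no content, so .text is None for them.
    value = child.text if child.text is not None else "(no text)"
    print(child.tag, value)
```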
 

5, iter

To get the text of every year tag, extract the tags with the iter method


for n in root.iter("year"):
    print(n.tag, n.text)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test

year 2010

year 2013

year 2013
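
iter is needed here because the year tags are not direct children of the root. The sketch below contrasts iter with findall on a reduced document (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country><year>2010</year></country>"
    "<country><year>2013</year></country>"
    "</data>"
)

# iter() walks the whole subtree, so it reaches year elements nested inside country.
years_iter = [e.text for e in root.iter("year")]

# findall() with a bare tag name only looks at direct children of root.
years_direct = [e.text for e in root.findall("year")]

# A path expression lets findall() look one level deeper.
years_path = [e.text for e in root.findall("country/year")]

print(years_iter)    # ['2010', '2013']
print(years_direct)  # []
print(years_path)    # ['2010', '2013']
```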
 

6, modifying data in the XML file


import xml.etree.ElementTree as ET  # the module name is long, so import it under the alias ET

tree = ET.parse("xml_lesson.xml")  # parse() reads and parses the XML file and returns an ElementTree object
root = tree.getroot()  # getroot() returns the root element of the tree
# print(root.tag)  # would print the name of the root tag

for n in root.iter("year"):
    new_year = int(n.text) + 1
    n.text = str(new_year)  # replace the text of the year tag
    n.set("updated1", "yes")  # add an attribute to the year tag
tree.write("xml_lesson.xml")  # write the modified tree back to the file
 
 
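Each run adds 1 to every year. The listing below shows years two higher than the original data (2010 has become 2012), which suggests the script was run twice.

<data>
    <country name="Liechtenstein">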
        <rank updated="yes">2</rank>
        <year updated="yes" updated1="yes">2012</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year updated="yes" updated1="yes">2015</year>
        <gdppc>59900</gdppc>
        <neighbor direction="N" name="Malaysia" />
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year updated="yes" updated1="yes">2015</year>
        <gdppc>13600</gdppc>
        <neighbor direction="W" name="Costa Rica" />
        <neighbor direction="E" name="Colombia" />
    </country>
</data>
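
Since tree.write overwrites xml_lesson.xml, it can be useful to try a modification in memory first. The sketch below applies the same change to a one-country document and serializes it with ET.tostring instead of writing a file. The reduced data and the name result are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<data><country name="Panama"><year updated="yes">2013</year></country></data>'
)

for year in root.iter("year"):
    year.text = str(int(year.text) + 1)  # text is always a string, so convert both ways
    year.set("updated1", "yes")          # set() adds the attribute or overwrites it

# tostring() returns bytes by default; encoding="unicode" asks for a str instead.
result = ET.tostring(root, encoding="unicode")
print(result)
```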
 

7, deleting from the XML file. For example, to delete every country whose rank is greater than 50, read the text of the rank tag in each country


import xml.etree.ElementTree as ET  # the module name is long, so import it under the alias ET

tree = ET.parse("xml_lesson.xml")  # parse() reads and parses the XML file and returns an ElementTree object
root = tree.getroot()  # getroot() returns the root element of the tree
# print(root.tag)  # would print the name of the root tag

for n in root.findall("country"):  # find every country tag
    rank = int(n.find("rank").text)  # read the text of its rank tag as an integer
    if rank > 50:  # check whether the rank is greater than 50
        root.remove(n)  # delete this country tag
tree.write("xml_lesson.xml")  # write the result back to the file
 

After the deletion, only two countries remain in the file

 
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year updated="yes" updated1="yes">2012</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year updated="yes" updated1="yes">2015</year>
        <gdppc>59900</gdppc>
        <neighbor direction="N" name="Malaysia" />
    </country>
</data>
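
One detail of this kind of deletion is worth spelling out: findall returns a list, so the loop runs over a snapshot of the children while root itself is being changed. A small sketch with made-up country names A, B and C:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    '<country name="A"><rank>60</rank></country>'
    '<country name="B"><rank>70</rank></country>'
    '<country name="C"><rank>5</rank></country>'
    "</data>"
)

# findall() returns a list built before any removal happens,
# so removing from root inside the loop does not disturb the iteration.
for country in root.findall("country"):
    if int(country.find("rank").text) > 50:
        root.remove(country)

remaining = [c.get("name") for c in root]
print(remaining)  # ['C']
```

Looping over root directly while removing from it can skip elements, which is why a list such as the one from findall is the safer choice.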
 

8, creating an XML file with the module


import xml.etree.ElementTree as ET  # the module name is long, so import it under the alias ET
new_xml = ET.Element("namelist")  # create the root element, equivalent to <namelist></namelist>
name = ET.SubElement(new_xml, "name", attrib={"enrolled": "yes"})  # create a child tag name with one attribute
age = ET.SubElement(name, "age", attrib={"checked": "no"})
sex = ET.SubElement(name, "sex")
sex.text = "28"


et = ET.ElementTree(new_xml)  # wrap the root element in a document object
et.write("test.xml", encoding="utf-8", xml_declaration=True)
 

Look at the content of the generated file test.xml (the XML declaration line is omitted and the tags are indented here for readability)


<namelist>
    <name enrolled="yes">
        <age checked="no"/>
        <sex>28</sex>
    </name>
</namelist>
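
By default ElementTree writes the whole tree without line breaks or indentation. If an indented file is wanted, one option is ET.indent, which is only available in Python 3.9 and later. A sketch under that assumption, using ET.tostring so the result can be inspected without writing a file (the name pretty is illustrative):

```python
import xml.etree.ElementTree as ET

new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml, "name", attrib={"enrolled": "yes"})
age = ET.SubElement(name, "age", attrib={"checked": "no"})
sex = ET.SubElement(name, "sex")
sex.text = "28"

# ET.indent() (Python 3.9+) inserts whitespace into the tree in place
# so that the serialized output is indented.
ET.indent(new_xml, space="    ")

pretty = ET.tostring(new_xml, encoding="unicode")
print(pretty)
```

On older Python versions ET.indent does not exist, so this sketch would need a different approach there.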
 


Origin www.cnblogs.com/wztshine/p/11781501.html