Reprinted from: https://www.cnblogs.com/gouguoqilinux/p/9168332.html
XML is a protocol for exchanging data between different languages or programs. It serves much the same purpose as JSON. JSON is simpler to use, but many traditional companies still expose mainly XML interfaces.
Like HTML, XML is a markup language built from tags.
What we mainly study here is ElementTree, Python's XML processing module, which provides a lightweight object model. To use it, import xml.etree.ElementTree.
ElementTree represents the entire XML node tree, while Element represents a single node in that tree.
Look at the XML text below. Its tags fall into two kinds.
1. Tags with a separate closing tag: <rank updated="yes">2</rank>
2. Self-closing tags: <neighbor direction="E" name="Austria" />
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year updated="yes">2010</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year updated="yes">2013</year>
        <gdppc>59900</gdppc>
        <neighbor direction="N" name="Malaysia" />
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year updated="yes">2013</year>
        <gdppc>13600</gdppc>
        <neighbor direction="W" name="Costa Rica" />
        <neighbor direction="E" name="Colombia" />
    </country>
</data>
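As a quick check that a document of this shape is well-formed, here is a minimal sketch that parses a shortened copy of the sample from a string with ET.fromstring instead of reading a file. The name SAMPLE_XML is introduced here for illustration only and is not part of the original post.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document (two countries only), kept inline
# so the example runs without any file on disk. SAMPLE_XML is a
# hypothetical name used only in this sketch.
SAMPLE_XML = """
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <neighbor direction="E" name="Austria" />
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
    </country>
</data>
"""

root = ET.fromstring(SAMPLE_XML)   # returns the root Element directly
print(root.tag)                    # name of the root tag
print(len(root))                   # number of direct children of the root
```

Unlike ET.parse, ET.fromstring hands back the root Element itself, so no getroot() call is needed.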
Second, using the xml.etree.ElementTree module in practice
1. Print the name of the root tag
import xml.etree.ElementTree as ET   # the module name is long, so alias it as ET

tree = ET.parse("xml_lesson.xml")    # parse the XML file; tree is an ElementTree object
root = tree.getroot()                # getroot() returns the root element
print(root.tag)                      # the tag attribute holds the name of the root tag

C:\python35\python3.exe D:/pyproject/day21 module/xml_test
data
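To try the same steps without having xml_lesson.xml at hand, the sketch below first writes a tiny stand-in document to a temporary file and then parses it. The file content and the variable names path and root_tag are assumptions made for this example.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Write a tiny stand-in document to a temporary file (hypothetical content).
with tempfile.NamedTemporaryFile(
    "w", suffix=".xml", delete=False, encoding="utf-8"
) as fh:
    fh.write("<data><country name='Panama' /></data>")
    path = fh.name

try:
    tree = ET.parse(path)    # ElementTree object for the whole document
    root = tree.getroot()    # root Element of that document
    root_tag = root.tag
    print(root_tag)
finally:
    os.remove(path)          # clean up the temporary file
```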
What is a tag and what is an attribute? For example:
<country name="Liechtenstein">
country is the tag
name="Liechtenstein" is an attribute of this tag
<neighbor direction="W" name="Costa Rica" />
neighbor is the tag
direction="W" and name="Costa Rica" are its two attributes
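The distinction between a tag and its attributes maps directly onto an Element object. A minimal sketch, using the neighbor element from the example above (the variable name elem is chosen here for illustration):

```python
import xml.etree.ElementTree as ET

elem = ET.fromstring('<neighbor direction="W" name="Costa Rica" />')

print(elem.tag)             # the tag name as a string
print(elem.attrib)          # all attributes as a dict
print(elem.get("name"))     # a single attribute value
print(elem.get("missing"))  # None, because no such attribute exists
```

elem.get is a convenient alternative to elem.attrib[...] when an attribute may be absent, since it returns None instead of raising KeyError.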
2. Use a for loop to see what is under the root
for n in root:
    print(n)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test
<Element 'country' at 0x0000000000E339F8>
<Element 'country' at 0x0000000000E38408>
<Element 'country' at 0x0000000000E38598>
The loop yields three objects. To print the names of these three tags, use the tag attribute.
for n in root:
    print(n.tag)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test
country
country
country
Print the tags under each country
for n in root:
    for i in n:
        print(i.tag)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test
rank
year
gdppc
neighbor
neighbor
rank
year
gdppc
neighbor
rank
year
gdppc
neighbor
neighbor
3. Print a tag's attributes with attrib
Print the attributes of each country
for n in root:
    print(n.attrib)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test
{'name': 'Liechtenstein'}
{'name': 'Singapore'}
{'name': 'Panama'}
Print the attributes of the elements inside each country
for n in root:
    for i in n:
        print(i.attrib)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test
{'updated': 'yes'}
{'updated': 'yes'}
{}
{'name': 'Austria', 'direction': 'E'}
{'name': 'Switzerland', 'direction': 'W'}
{'updated': 'yes'}
{'updated': 'yes'}
{}
{'name': 'Malaysia', 'direction': 'N'}
{'updated': 'yes'}
{'updated': 'yes'}
{}
{'name': 'Costa Rica', 'direction': 'W'}
{'name': 'Colombia', 'direction': 'E'}
4. text: the actual content enclosed by a tag
for n in root:
    for i in n:
        print(i.text)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test
2
2010
141100
None
None
5
2013
59900
None
69
2013
13600
None
None
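The None values in this output come from the self-closing neighbor tags, which enclose no text. A minimal sketch of one way to guard against that, substituting an empty string (the variable names country and texts are chosen here for illustration):

```python
import xml.etree.ElementTree as ET

country = ET.fromstring(
    '<country name="Singapore">'
    '<rank updated="yes">5</rank>'
    '<neighbor direction="N" name="Malaysia" />'
    '</country>'
)

texts = []
for child in country:
    # child.text is None for <neighbor />, so fall back to an empty string
    texts.append((child.tag, child.text or ""))

print(texts)
```

Whether an empty string is the right fallback depends on the use case; code that later calls int() on the text would still need to skip such elements.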
5. Traversal with iter
To get the text of every year tag, use the iter method
for n in root.iter("year"):
    print(n.tag, n.text)

C:\python35\python3.exe D:/pyproject/day21 module/xml_test
year 2010
year 2013
year 2013
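iter is used here because it searches the whole subtree, whereas findall with a bare tag name only looks at direct children. The sketch below contrasts the two on a cut-down document; the names deep, direct and nested are chosen for this example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='Singapore'><year>2013</year></country>"
    "<country name='Panama'><year>2013</year></country>"
    "</data>"
)

deep = [e.text for e in root.iter("year")]               # searches all levels
direct = [e.text for e in root.findall("year")]          # direct children only
nested = [e.text for e in root.findall("country/year")]  # explicit path

print(deep)
print(direct)
print(nested)
```

Since year sits one level below country, findall("year") on the root finds nothing, while iter("year") and the explicit path country/year both reach it.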
6. Modifying data in the XML file
import xml.etree.ElementTree as ET   # the module name is long, so alias it as ET

tree = ET.parse("xml_lesson.xml")    # parse the XML file; tree is an ElementTree object
root = tree.getroot()                # getroot() returns the root element
# print(root.tag)                    # the tag attribute holds the name of the root tag
for n in root.iter("year"):
    new_year = int(n.text) + 1
    n.text = str(new_year)           # modify the text of the year tag
    n.set("updated1", "yes")         # add an attribute to the year tag
tree.write("xml_lesson.xml")         # write the modified tree straight back to the file
The file after the modification. Each run adds 1 to every year; the values here are 2 higher than in the original file, which suggests the script had been run twice:
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year updated="yes" updated1="yes">2012</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year updated="yes" updated1="yes">2015</year>
        <gdppc>59900</gdppc>
        <neighbor direction="N" name="Malaysia" />
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year updated="yes" updated1="yes">2015</year>
        <gdppc>13600</gdppc>
        <neighbor direction="W" name="Costa Rica" />
        <neighbor direction="E" name="Colombia" />
    </country>
</data>
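Because tree.write overwrites the source file, it can be useful to try a modification in memory first. The following is a minimal sketch of the same year update on an inline document, serialized with ET.tostring rather than written to disk; the inline XML and the name result are assumptions for this example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='Singapore'><year updated='yes'>2013</year></country>"
    "</data>"
)

for year in root.iter("year"):
    year.text = str(int(year.text) + 1)   # text is a string, so convert both ways
    year.set("updated1", "yes")           # add a new attribute

# encoding="unicode" makes tostring return str instead of bytes
result = ET.tostring(root, encoding="unicode")
print(result)
```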
7. Deleting from the XML file. For example, to delete every country whose rank is greater than 50, read the text of the rank tag inside each country
import xml.etree.ElementTree as ET   # the module name is long, so alias it as ET

tree = ET.parse("xml_lesson.xml")    # parse the XML file; tree is an ElementTree object
root = tree.getroot()                # getroot() returns the root element
# print(root.tag)                    # the tag attribute holds the name of the root tag
for n in root.findall("country"):    # find every country
    rank = int(n.find("rank").text)  # read the text of this country's rank
    if rank > 50:                    # check whether the rank is greater than 50
        root.remove(n)               # delete this country tag
tree.write("xml_lesson.xml")         # write to the file
After the deletion, only two countries remain in the file
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year updated="yes" updated1="yes">2012</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year updated="yes" updated1="yes">2015</year>
        <gdppc>59900</gdppc>
        <neighbor direction="N" name="Malaysia" />
    </country>
</data>
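The deletion loop works reliably because findall returns a list, so removing children from root inside the loop does not disturb the iteration. A minimal in-memory sketch of the same filter, with a cut-down document and the name remaining chosen for this example:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<data>"
    "<country name='Liechtenstein'><rank>2</rank></country>"
    "<country name='Singapore'><rank>5</rank></country>"
    "<country name='Panama'><rank>69</rank></country>"
    "</data>"
)

# findall returns a list, so removing from root inside the loop is safe
for country in root.findall("country"):
    if int(country.find("rank").text) > 50:
        root.remove(country)

remaining = [c.get("name") for c in root]
print(remaining)
```

Looping over root directly while calling root.remove can skip elements, which is why the list from findall is the safer thing to iterate over.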
8. How to create an XML file with the module
import xml.etree.ElementTree as ET   # the module name is long, so alias it as ET

new_xml = ET.Element("namelist")     # create the root, equivalent to <namelist></namelist>
name = ET.SubElement(new_xml, "name", attrib={"enrolled": "yes"})   # create a child tag name with an attribute
age = ET.SubElement(name, "age", attrib={"checked": "no"})
sex = ET.SubElement(name, "sex")
sex.text = "28"
et = ET.ElementTree(new_xml)         # build the document object
et.write("test.xml", encoding="utf8", xml_declaration=True)
View the content of the generated file test.xml
<namelist>
    <name enrolled="yes">
        <age checked="no"/>
        <sex>28</sex>
    </name>
</namelist>
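To inspect what write actually produces without creating a file, the tree can be written into an in-memory buffer. This is a sketch under the assumption that io.BytesIO is an acceptable stand-in for the output file; the names buf and output are introduced here.

```python
import io
import xml.etree.ElementTree as ET

new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml, "name", attrib={"enrolled": "yes"})
ET.SubElement(name, "age", attrib={"checked": "no"})
sex = ET.SubElement(name, "sex")
sex.text = "28"

# write() accepts a file object as well as a file name
buf = io.BytesIO()
ET.ElementTree(new_xml).write(buf, encoding="utf-8", xml_declaration=True)
output = buf.getvalue()

print(output.decode("utf-8"))
```

With xml_declaration=True the serialized bytes begin with an XML declaration line, and the elements follow without any added indentation, so a viewer that pretty-prints XML may show the file laid out differently from the raw bytes.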