Python and XML

What is XML

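  • XML stands for eXtensible Markup Language
  • XML is a markup language, much like HTML
  • XML is designed to carry data, not to display it
  • XML tags are not predefined; you define your own tags
  • XML is designed to be self-descriptive
  • XML is a W3C Recommendation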

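In Python, XML can be read and written with the standard library module xml.etree.ElementTree, which all the examples here use.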
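
As a minimal sketch of what "you define your own tags" means in practice, the snippet below parses a tiny document from a string with ET.fromstring. The tag and attribute names (note, to, body, lang) are made up for illustration and are not part of the original article.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the tag names are invented for this sketch.
SAMPLE = '<note lang="en"><to>Alice</to><body>Hello</body></note>'

root = ET.fromstring(SAMPLE)            # parse a string, get the root Element
print(root.tag)                         # note
print(root.attrib)                      # {'lang': 'en'}
print([child.tag for child in root])    # ['to', 'body']
print(root.find('body').text)           # Hello
```

ET.fromstring returns the root element directly, whereas ET.parse reads a file and returns an ElementTree whose root is obtained with getroot().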

The examples use the following XML file, saved as test.xml:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

1. Finding and traversing XML

Get the root node:

# xml.etree.cElementTree was removed in Python 3.9;
# xml.etree.ElementTree uses the C accelerator automatically when available.
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
print(dir(root))
# Besides the special (dunder) methods, the list includes:
# 'append', 'attrib', 'clear', 'extend', 'find', 'findall', 'findtext', 'get',
# 'insert', 'items', 'iter', 'iterfind', 'itertext', 'keys', 'makeelement',
# 'remove', 'set', 'tag', 'tail', 'text'
# (older Python versions also list 'getchildren' and 'getiterator',
# which were removed in Python 3.9)

print(root.tag)  # prints: data, which is the root node of the file

Traverse the document:

import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()  # get the root node
print(root.tag)

# Traverse the document
for child in root:
    print(child.tag, child.attrib)   # tag name and attribute dictionary
    print('----------------------')
    for i in child:
        print(i.tag, i.text)         # tag name and text content

Output:

data
country {'name': 'Liechtenstein'}
----------------------
rank 2
year 2008
gdppc 141100
neighbor None
neighbor None
country {'name': 'Singapore'}
----------------------
rank 5
year 2011
gdppc 59900
neighbor None
country {'name': 'Panama'}
----------------------
rank 69
year 2011
gdppc 13600
neighbor None
neighbor None
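
The neighbor lines print None because those elements are self-closing: they have no text, and their data lives in attributes. A minimal sketch of reading those attributes follows; it uses a trimmed copy of one country entry embedded as a string so it runs without test.xml.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of one <country> entry from the sample file.
XML = """<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
</data>"""

root = ET.fromstring(XML)
neighbors = []
for country in root:
    for neighbor in country.iter('neighbor'):
        # .text is None for a self-closing element; .get() reads one attribute.
        print(neighbor.text, neighbor.attrib)
        neighbors.append(
            (country.get('name'), neighbor.get('name'), neighbor.get('direction'))
        )

print(neighbors)
```

get() returns None for a missing attribute (or a default passed as its second argument), while indexing attrib directly raises KeyError.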

Traverse only one kind of node:

import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
print(root.tag)

# Traverse only the year nodes; iter() searches the whole subtree
for node in root.iter('year'):
    print(node.tag, node.text)

# Output:
# data
# year 2008
# year 2011
# year 2011
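
iter() walks the whole subtree, while find(), findall() and findtext() by default look only at direct children unless given a path. The sketch below contrasts them on a trimmed copy of the sample data, embedded as a string so it does not depend on test.xml.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample data.
XML = """<data>
    <country name="Singapore">
        <rank>5</rank>
        <year>2011</year>
    </country>
    <country name="Panama">
        <rank>69</rank>
        <year>2011</year>
    </country>
</data>"""
root = ET.fromstring(XML)

direct = root.findall('year')            # direct children of <data> only
nested = root.findall('country/year')    # a path reaches one level deeper
recursive = list(root.iter('year'))      # every <year> in the subtree

first_country = root.find('country')     # first matching child, or None
rank_text = first_country.findtext('rank')                 # text of first match
missing = first_country.findtext('gdppc', default='n/a')   # default if absent

print(len(direct), len(nested), len(recursive))
print(first_country.get('name'), rank_text, missing)
```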

------------------------------------------------------------------------------------------


2. Modifying and deleting XML content

Modify:

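import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
print(root.tag)

# Modify: add 1 to every year and set an attribute on it
for node in root.iter('year'):
    node.text = str(int(node.text) + 1)   # text is a string, so convert both ways
    node.set('colr', 'red')               # adds the attribute colr="red"
    print(node.tag, node.text)
tree.write('xml_test2.xml')               # save the modified tree to a new file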

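Result: in xml_test2.xml every year value is increased by 1 and each year element carries the attribute colr="red", for example <year colr="red">2009</year>.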
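
To inspect a modification without writing a file, the tree can be serialized to a string with ET.tostring. This is a minimal sketch on a two-element document; the attribute name updated is an assumption chosen for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<data><year>2008</year><year>2011</year></data>')

for node in root.iter('year'):
    node.text = str(int(node.text) + 1)   # element text is always a string
    node.set('updated', 'yes')            # hypothetical attribute name

# encoding='unicode' returns a str instead of bytes.
result = ET.tostring(root, encoding='unicode')
print(result)
# <data><year updated="yes">2009</year><year updated="yes">2012</year></data>
```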


Delete:

import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()

# Delete every country whose rank is greater than 50
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)
tree.write('xml_test3.xml')   # save the result to a new file

Result: xml_test3.xml no longer contains the Panama entry, the only country whose rank (69) is greater than 50.
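
The deletion loop iterates over root.findall('country'), which returns a list, so removing children from root inside the loop does not disturb the iteration. The sketch below wraps the same idea in a helper; the function name remove_countries_above is hypothetical, and the data is a trimmed copy of the sample embedded as a string.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample data.
XML = """<data>
    <country name="Liechtenstein"><rank>2</rank></country>
    <country name="Panama"><rank>69</rank></country>
    <country name="Singapore"><rank>5</rank></country>
</data>"""


def remove_countries_above(root, threshold):
    """Remove <country> children ranked above threshold; return their names."""
    removed = []
    # findall() returns a list snapshot, so root can be modified safely here.
    for country in root.findall('country'):
        if int(country.findtext('rank')) > threshold:
            root.remove(country)
            removed.append(country.get('name'))
    return removed


root = ET.fromstring(XML)
removed = remove_countries_above(root, 50)
remaining = [c.get('name') for c in root.findall('country')]
print(removed)     # ['Panama']
print(remaining)   # ['Liechtenstein', 'Singapore']
```

Note that remove() only deletes direct children of the element it is called on, so it has to be called on the parent of the element being removed.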


------------------------------------------------------------------------------------------

3. Creating XML

import xml.etree.ElementTree as ET
# Create an XML document

# Create the root element
new_xml = ET.Element('namelist')
# Add a child to the root, then a child of that child
name = ET.SubElement(new_xml, "name", attrib={"country": "Peking"})
age = ET.SubElement(name, "age", attrib={"type": "child"})
age.text = "22"

et = ET.ElementTree(new_xml)  # wrap the root element in a document object
et.write("create_xml.xml", encoding="utf-8", xml_declaration=True)

ET.dump(new_xml)  # print the generated XML to stdout

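Output (indented here for readability; ET.dump actually prints the elements on a single line, and the XML declaration appears only in the written file):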

<?xml version='1.0' encoding='utf-8'?>
<namelist>
    <name country="Peking">
        <age type="child">22</age>
    </name>
</namelist>

The script also writes the XML file create_xml.xml.
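
ElementTree does not add line breaks or indentation on its own. To get indented output, one option is ET.indent, which is available from Python 3.9 onward. A minimal sketch, rebuilding the same tree:

```python
import xml.etree.ElementTree as ET

new_xml = ET.Element('namelist')
name = ET.SubElement(new_xml, 'name', attrib={'country': 'Peking'})
age = ET.SubElement(name, 'age', attrib={'type': 'child'})
age.text = '22'

# Requires Python 3.9+; inserts whitespace into the tree in place.
ET.indent(new_xml, space='    ')

pretty = ET.tostring(new_xml, encoding='unicode')
print(pretty)
```

Calling ET.indent before et.write() would give the written file the same indentation.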


Reposted from www.cnblogs.com/friday69/p/9278595.html