Python xml module: parsing XML with ElementTree

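XML (Extensible Markup Language) marks up data and defines data types. It is mainly used to transmit and store data (much like JSON, it serves as a data-exchange format between different languages or programs).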

XML format (two sample elements; a complete document wraps them in a single root element):

<site>
    <name>hello</name>
    <url>yeah</url> 
</site>
<site>
    <name>你好</name>
    <url>嘿</url>
</site>
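
As a rough sketch of how a sample like this can be read from Python: the two site elements have no common root, so the snippet wraps them in a root element before parsing. The name sites for that wrapper is an assumption made for illustration only.

```python
import xml.etree.ElementTree as ET

# The two <site> elements from the sample, as a fragment without a single root.
fragment = """
<site>
    <name>hello</name>
    <url>yeah</url>
</site>
<site>
    <name>你好</name>
    <url>嘿</url>
</site>
"""

# Wrap the fragment in a hypothetical <sites> root so it parses as one document.
root = ET.fromstring("<sites>" + fragment + "</sites>")

# Collect the text of every <name> element under each <site>.
names = [site.find("name").text for site in root.findall("site")]
print(names)
```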

Parsing XML with ElementTree:

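ElementTree has historically had two implementations:

1. xml.etree.ElementTree: implemented in pure Python, slower than the C version
2. xml.etree.cElementTree: implemented in C, faster and uses less memory

Since Python 3.3, the ElementTree module automatically uses the C accelerator when it is available, so a plain import xml.etree.ElementTree already gives you the fast implementation. xml.etree.cElementTree has been deprecated since 3.3 and was removed in Python 3.9, so new code should not import it.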

Creating XML:

import xml.etree.ElementTree as ET    # cElementTree was removed in Python 3.9

grandfather = ET.Element('爷爷')        # root element
uncle = ET.SubElement(grandfather, '大伯', attrib={'性别': '男'})    # child of the root
age = ET.SubElement(uncle, 'age', attrib={'啦啦': '木办法'})    # child of uncle
work = ET.SubElement(uncle, 'work')   # sibling of age
child = ET.SubElement(uncle, 'child', attrib={'性别': '女'})
age.text = '45'     # set the text of age, work and child
work.text = '次饭饭'
child.text = '二丫'
uncle2 = ET.SubElement(grandfather, '二伯', attrib={'性别': '男'})
age = ET.SubElement(uncle2, 'age')
work = ET.SubElement(uncle2, 'work')   # sibling of age
child = ET.SubElement(uncle2, 'child', attrib={'性别': '男'})
age.text = '44'
work.text = '打豆豆'
child.text = '二蛋'


et = ET.ElementTree(grandfather)      # build the document object
et.write('test.xml', encoding='utf-8', xml_declaration=True)    # write the file with an XML declaration that states the encoding
# ET.dump(grandfather)   # print the generated XML to stdout


Result (test.xml):

<?xml version='1.0' encoding='utf-8'?>
<爷爷>
    <大伯 性别="男">
        <age 啦啦="木办法">45</age>
        <work>次饭饭</work>
        <child 性别="女">二丫</child>
    </大伯>
    <二伯 性别="男">
        <age>44</age>
        <work>打豆豆</work>
        <child 性别="男">二蛋</child>
    </二伯>
</爷爷>

Note: the file actually written has no indentation; it is indented here only for readability.
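
If indented output is wanted, one option is ET.indent(), which exists only in Python 3.9 and later. The sketch below is an illustration on a cut-down tree, not part of the original script; it guards the call with hasattr so it still runs on older versions, where the output simply stays on one line.

```python
import xml.etree.ElementTree as ET

# A smaller version of the tree built above.
root = ET.Element("爷爷")
uncle = ET.SubElement(root, "大伯", attrib={"性别": "男"})
ET.SubElement(uncle, "age").text = "45"

# Without indentation everything is serialized on a single line.
flat = ET.tostring(root, encoding="unicode")

# ET.indent() (Python 3.9+) inserts whitespace in place so the output is readable.
if hasattr(ET, "indent"):
    ET.indent(root, space="    ")
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

Calling ET.indent(grandfather) just before et.write(...) in the script above should have the same effect on test.xml.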


Some operations on the XML:

import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')            # parse the file
root = tree.getroot()                # get the root element
print(root.tag)                      # x.tag is the element's tag name

# Walk through all the content
for child in root:
    print(child.tag, child.attrib)       # print tag and attributes
    for i in child:
        print('--->', i.tag, i.text)     # print tag and text content

# # Read the values of one kind of node
# for node in root.iter('work'):        # iterate over only the elements tagged 'work'
#     print(node.tag, node.text)
#
# # Modify (update) a node's value
# for node in root.iter('age'):
#     new_age = int(node.text) + 1      # add one year to the age
#     node.text = str(new_age)
#     node.set('感叹：', '老了！')        # set (add) an attribute
# # tree.write('xmltext.xml', encoding='utf-8',xml_declaration=True)      # write the updated file


# Delete a node

# for uncle in root.findall('*'):        # findall('*') returns a list of all direct children (the uncles)
#     age = int(uncle.find('age').text)  # .text holds the content of age; convert it to int
#     if age > 44:
#         root.remove(uncle)     # remove the node when the condition holds
# tree.write('xtext.xml', encoding='utf-8', xml_declaration=True)
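
The update and delete steps can also be tried without touching any file. The following is a minimal sketch that assumes a cut-down, in-memory copy of the document (only the age elements are kept) and a threshold of 45 chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of test.xml, parsed from a string instead of a file.
xml_text = (
    "<爷爷>"
    "<大伯><age>45</age></大伯>"
    "<二伯><age>44</age></二伯>"
    "</爷爷>"
)
root = ET.fromstring(xml_text)

# Update: add one year to every age (text is always a string, so convert).
for node in root.iter("age"):
    node.text = str(int(node.text) + 1)

# Delete: iterate over a copy (list(root)) because root is modified in the loop.
for uncle in list(root):
    if int(uncle.find("age").text) > 45:
        root.remove(uncle)

remaining = [child.tag for child in root]
print(remaining)
```

Iterating over list(root) or over the list returned by findall() matters here: removing children while iterating over the element itself can skip siblings.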


Some methods provided by ElementTree:

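Element.findall(): returns a list of all matching direct children
Element.find(): returns the first direct child with the given tag, or None if nothing matches
Element.iter(tag=None): creates a tree iterator with the current element as the root; if tag is not None, only elements with that tag are yielded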
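
The difference between the three is easiest to see side by side. This is a small sketch on a made-up document (the tags root, item and group are assumptions for the example):

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<root>"
    "<item>a</item>"
    "<item>b</item>"
    "<group><item>c</item></group>"
    "</root>"
)

first = root.find("item")                        # first matching direct child
direct = [e.text for e in root.findall("item")]  # all matching direct children
nested = [e.text for e in root.iter("item")]     # all matching descendants, in document order
missing = root.find("nope")                      # None when nothing matches

print(first.text, direct, nested, missing)
```

Note that find() and findall() with a plain tag name only look at direct children, while iter() also descends into group.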

Attribute-related members of class xml.etree.ElementTree.Element:

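x.attrib: a dictionary holding the element's attributes; the attrib= argument of Element() and SubElement() sets attributes at creation time
del x.attrib[key]: deletes the given attribute
x.text: the text content of the element
x.keys(): returns the list of attribute names
x.items(): returns the attributes as a list of (name, value) pairs
x.get(key, default=None): returns the value of an attribute, or default if it is missing
x.set(key, value): updates (or adds) an attribute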
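
A short sketch of these members in use; the element and attribute names (child, gender, age, nickname) are placeholders invented for the example:

```python
import xml.etree.ElementTree as ET

x = ET.Element("child", attrib={"gender": "female"})
x.text = "Erya"

x.set("age", "10")                # add a new attribute
x.set("gender", "F")              # update an existing one
keys = sorted(x.keys())           # attribute names
items = sorted(x.items())         # (name, value) pairs
nick = x.get("nickname", "n/a")   # the default is returned for a missing key
del x.attrib["age"]               # attrib behaves like a plain dict

print(keys, items, nick, x.attrib)
```

Attribute values are stored as strings, so numbers have to be converted explicitly when read back.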

Methods for adding child elements:

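x.append(subelement): adds a single element at the end of the children

x.extend(subelements): adds several child elements at once, for example from a list

x.insert(index, subelement): adds a child element at the given position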
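
The three calls can be compared in a few lines. This sketch uses throwaway tag names (parent, a, b, c, d) purely for illustration:

```python
import xml.etree.ElementTree as ET

parent = ET.Element("parent")

parent.append(ET.Element("b"))                      # one element, added at the end
parent.extend([ET.Element("c"), ET.Element("d")])   # several elements from an iterable
parent.insert(0, ET.Element("a"))                   # one element at a given index

order = [child.tag for child in parent]
print(order)
```

ET.SubElement(parent, tag), used in the creation example earlier, is roughly a shortcut for creating an element and appending it in one step.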

Also:

Besides ElementTree, XML can also be parsed with SAX (Simple API for XML) and DOM (Document Object Model).
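
For comparison, here is a minimal sketch of the DOM approach using the standard library's xml.dom.minidom, applied to one site element like those in the sample near the top. It is only meant to show the flavor of the API, not a recommendation over ElementTree.

```python
from xml.dom import minidom

doc = minidom.parseString("<site><name>hello</name><url>yeah</url></site>")

# getElementsByTagName returns every matching element in the document.
name_node = doc.getElementsByTagName("name")[0]

# In DOM the text lives in a child Text node rather than on the element itself.
name = name_node.firstChild.data
root_tag = doc.documentElement.tagName
print(root_tag, name)
```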
