Parsing XML with Python
Use the lxml module.
Installation:
pip install lxml
pip install requests
from lxml import html
import requests

page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
tree = html.fromstring(page.content)
buyers = tree.xpath('//div[@title="buyer-name"]/text()')
prices = tree.xpath('//span[@class="item-price"]/text()')
Reference: http://docs.python-guide.org/en/latest/scenarios/scrape/#web-scraping
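To try the same XPath queries without a network request, here is a minimal sketch that parses an inline HTML string instead. The markup in SAMPLE, including the buyer names and prices, is made up for illustration and only assumes the page uses the same `title="buyer-name"` and `class="item-price"` attributes as the queries above.

```python
from lxml import html

# Hypothetical sample markup; the real page may be structured differently.
SAMPLE = """
<html><body>
  <div title="buyer-name">Alice</div>
  <span class="item-price">$1.50</span>
  <div title="buyer-name">Bob</div>
  <span class="item-price">$2.25</span>
</body></html>
"""

tree = html.fromstring(SAMPLE)

# text() returns the text nodes of every matching element, in document order.
buyers = tree.xpath('//div[@title="buyer-name"]/text()')
prices = tree.xpath('//span[@class="item-price"]/text()')

# Pair each buyer with the price at the same position.
pairs = list(zip(buyers, prices))
print(pairs)
```

Pairing by position only works if every buyer has exactly one price, which is an assumption about the page layout.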
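If the XML uses namespaces, pass a prefix-to-URI mapping through the namespaces argument of xpath(). The URI must match the xmlns declaration in the document exactly.
For example, given: <itunes:duration>14:00</itunes:duration>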
duration = tree.xpath('//itunes:duration/text()', namespaces={'itunes': 'http://www.itunes.com/DTDs/Podcast-1.0.dtd'})
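The namespace lookup can be checked end to end with a small self-contained sketch. The FEED document below is a made-up example, and its xmlns:itunes URI is simply copied from the call above; a real feed may declare a different URI, in which case the mapping has to be changed to match.

```python
from lxml import etree

# Hypothetical feed; the namespace URI is copied from the example above.
FEED = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:itunes="http://www.itunes.com/DTDs/Podcast-1.0.dtd" version="2.0">
  <channel>
    <item>
      <title>Episode 1</title>
      <itunes:duration>14:00</itunes:duration>
    </item>
  </channel>
</rss>"""

NS = {'itunes': 'http://www.itunes.com/DTDs/Podcast-1.0.dtd'}

root = etree.fromstring(FEED)
durations = root.xpath('//itunes:duration/text()', namespaces=NS)

# The prefix is only a local alias: any prefix works if the URI matches.
same = root.xpath('//p:duration/text()', namespaces={'p': NS['itunes']})

# A mapping with a different URI matches nothing.
other = root.xpath('//itunes:duration/text()',
                   namespaces={'itunes': 'http://example.com/other'})
print(durations, same, other)
```

If a query comes back empty unexpectedly, comparing the URI in the mapping against the xmlns declaration in the document, including letter case, is a reasonable first check.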