Python web crawler parsing module: lxml

08.06 self-summary

I. Installing the module

Windows installation:

Method one: pip3 install lxml

Method two: download the wheel file that matches your system and Python version from http://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml and install it:

pip3 install lxml-4.2.1-cp36-cp36m-win_amd64.whl  # run from the directory where the file is located, or give its full path

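Linux installation:

Method one: pip3 install lxml

Method two: if pip3 has to build lxml from source, install the development packages it needs first, then run the pip3 command from method one:

yum install -y epel-release libxslt-devel libxml2-devel openssl-devel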
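Whichever method is used, a quick check from Python can confirm that the module imports correctly. This is a minimal sketch, assuming lxml was installed into the same interpreter that runs the snippet; the version printed depends on your installation.

```python
from lxml import etree

# LXML_VERSION is a tuple of integers, for example (4, 2, 1, 0);
# the actual values depend on the installed release.
version = ".".join(str(part) for part in etree.LXML_VERSION)
print("lxml version:", version)
```

If the import fails, the package most likely went into a different Python installation than the one being run.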

II. Using the module

from lxml import etree

Example

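import requests
from lxml import etree

# Fetch the page and parse the response text into an element tree
rp = requests.get('http://www.baidu.com')
html = etree.HTML(rp.text)
# The parsed object can now be queried with XPath to match content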
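To show what querying the parsed object looks like, here is a small offline sketch. The `sample_html` string is a made-up stand-in for `rp.text`, so the example runs without a network request; the tags and links in it are assumptions for illustration only.

```python
from lxml import etree

# Hypothetical page content standing in for rp.text
sample_html = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <a href="/news">News</a>
    <a href="/maps">Maps</a>
  </body>
</html>
"""

html = etree.HTML(sample_html)

# For text and attribute selections, xpath() returns a list of strings
title = html.xpath("//title/text()")
links = html.xpath("//a/@href")

print(title)
print(links)
```

Because the result is a list even when there is only one match, a single value is usually taken with an index such as `title[0]` after checking that the list is not empty.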

XPath path syntax
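A few commonly used path forms can be tried out as follows. This is only a sketch of some XPath 1.0 expressions that lxml supports, not a full reference; the markup and variable names are hypothetical.

```python
from lxml import etree

# Hypothetical markup used only for illustration
doc = etree.HTML("""
<div id="menu">
  <ul>
    <li class="item">first</li>
    <li class="item active">second</li>
    <li>third</li>
  </ul>
</div>
""")

# // selects matching nodes at any depth
all_items = doc.xpath("//li/text()")

# [@attr="value"] keeps nodes whose attribute equals the value exactly
by_attr = doc.xpath('//li[@class="item"]/text()')

# contains() matches part of an attribute value
by_contains = doc.xpath('//li[contains(@class, "active")]/text()')

# [1] selects by position; XPath positions start at 1, not 0
by_index = doc.xpath("//ul/li[1]/text()")

# / steps to direct children; last() picks the final sibling
last_item = doc.xpath('//div[@id="menu"]/ul/li[last()]/text()')

print(all_items, by_attr, by_contains, by_index, last_item)
```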

Origin www.cnblogs.com/pythonywy/p/11311094.html