XPath -- string(.) usage

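from lxml import etree

html = '''
    <li class="tag_1">wanted content 1
       <a>wanted content 2</a>
    </li>
'''

selector = etree.HTML(html)
# xpath() returns a list of the matching elements
contents = selector.xpath('//li[@class = "tag_1"]')
# take the first (and only) matching <li> element
contents1 = selector.xpath('//li[@class = "tag_1"]')[0]
# string(.) concatenates all text under the current node, nested tags included
contents2 = contents1.xpath('string(.)')
# text() selects only the text nodes that are direct children of <li>
contents3 = selector.xpath('//li[@class = "tag_1"]/text()')
print(contents)  # [<Element li at 0x2c55e88>]
print(contents1)  # <Element li at 0x2c55e88>
print(contents2)  # one string, including the text inside <a>
print(contents3)  # a list of strings, without the text inside <a>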

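Output

contents2 is a single string holding all the text under <li>, including the text of the nested <a>. contents3 is a list containing only the text nodes directly under <li>, so the text inside <a> is missing from it. The strings in contents3 still contain '\n' and indentation whitespace (the commas in the printed output are just the list separators). Call replace on each string to swap these for the characters or spaces you want; for details on replace, see https://www.runoob.com/python/att-string-replace.html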
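The cleanup described above might look like the following minimal sketch. The two sample values are hand-written stand-ins shaped like the string(.) and text() results, not captured output, and clean_text is a hypothetical helper name introduced only for this illustration.

```python
# Sample values shaped like the string(.) and text() results
# (assumed for illustration, not captured from a real run).
contents2 = "wanted content 1\n       wanted content 2\n    "
contents3 = ["wanted content 1\n       ", "\n    "]


def clean_text(raw):
    """Hypothetical helper: turn newlines and runs of whitespace into single spaces."""
    # replace() swaps each newline for a space; split() + join() then
    # squeezes repeated whitespace and trims both ends.
    return " ".join(raw.replace("\n", " ").split())


cleaned2 = clean_text(contents2)
# contents3 is a list, so clean each item and drop the ones left empty.
cleaned3 = [t for t in (clean_text(s) for s in contents3) if t]

print(cleaned2)
print(cleaned3)
```

Since contents3 is a list rather than a string, replace cannot be called on it directly; it has to be applied to each element, as the list comprehension does here.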


Reposted from www.cnblogs.com/1061321925wu/p/12297383.html