Reptile -xpath basic examples demonstrate

xpath is an HTML page looking for ways to filter the data we need, and his result is a list

To be filtered HTML page:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8"/>
    <title>Xpath 测试</title>
</head>
<body>
    <div class="song">
        火药
        <b>指南针</b>
        <b>造纸术</b>
        <</printing>Bb > 

    </ div > 
    < div class = "Tang" > 
        < ul > 
            < li class = "balove" > Parking Fenglin love to sit late, Leaves red flowers in February </ li > 
            < li the above mentioned id = "hua" > business subjugation of women do not know hate, across the rear garden flowers still sing </ li > 
            < li class = "Love" name = "Yang" > a ride Red Feizixiao, no one knows is to litchi </ li > 
            < li the above mentioned id = "bei" >Magic Cup grape wine, To drink immediately pipa reminder </ li >
            < Li > < A href = "http://www.baidi.com" > Baidu, </ A > </ li > 
        </ ul > 
        < OL > 
            < li class = "lucy" > searching, desolate , desolately sad </ Li > 
            < Li class = "balily" > ye warm also when cold, most difficult to rate </ Li > 
            < Li class = "Lilei" > cups light wine </ Li > 
            < li >How the enemy he later arrivals, winds </ li> 
            < Li > Yan had also, is sad, but it is old acquaintance </ li > 
            < li > Love is a word, I only say this once </ li > 
            < li > loved, do not regret it, Bao Dai </ li > 
        </ OL > 
    </ div > 

</ body > 
</ HTML >

xpath example demonstrates:

# Contents of the local file xpath.html lookup 
from lxml Import etree 

# -generated objects 
Tree = etree.parse ( ' xpath.html ' )
 # Print (Tree) 
RET = tree.xpath ( ' // div [@ class = "Tang "] / UL / Li [. 1] ' ) # RET is a listing 
Print (RET [0] .text) 
RET = tree.xpath ( ' // div [@ class =" Tang "] / UL / Li [. 1] / text () ' )
 Print (RET) 

# the href attribute Baidu, 
baidu = tree.xpath ( ' // div [@ class = "Tang"] / UL / Li [. 5] / A / @ the href '),
 The print (to Baidu),

#逻辑and
luoji = tree.xpath('//div[@class="tang"]/ul/li[@class="love" and @name="yang"]/text()')
print(luoji)

#模糊contains
mohu = tree.xpath('//li[contains(@class,"l")]')
print(mohu)

mohu1 = tree.xpath('//li[contains(text(),"爱")]/text()')
print(mohu1)


start = tree.xpath('//li[starts-with(@class,"ba")]/text()''
text = tree.xpath (fetch text text ()#(Start)Print)



div // [@ class = "Song"] ' ) 
String = text [0] .xpath ( ' String (.) ' )
 Print (String.Replace ( ' \ n- ' , ' ' ) .replace ( " \ T " , "" )) # replace all line breaks and tabs

Filter your results:

Guess you like

Origin www.cnblogs.com/Qiuzhiyu/p/12182762.html