[XPath] Multiple Element objects from xpath, but the extraction results are all the same

Every blog, a motto: What doesn’t kill you makes you stronger.

0. Preface

When extracting with xpath, I got multiple Element objects and looped over them to pull out their contents, but every iteration returned the same result. Recording the cause here.

1. Text

1.1 Method 1:

# html is the page already parsed with lxml (from lxml import etree)
comment_xpath = html.xpath('//div[@node-type="root_comment"]')
# Iterate over each comment block
for ele in comment_xpath:
    # t = etree.tostring(ele,encoding='utf-8')
    # print(t.decode('utf-8'))
    # print('----'*100)
    # Commenter's nickname
    nick_name = ele.xpath('//div[@class="WB_text"]/a/text()')[0]
    print(nick_name)


1.2 Method 2 (add a leading ".")

comment_xpath = html.xpath('//div[@node-type="root_comment"]')
# Iterate over each comment block
for ele in comment_xpath:
    # t = etree.tostring(ele,encoding='utf-8')
    # print(t.decode('utf-8'))
    # print('----'*100)
    # Commenter's nickname (note the leading "." in the path)
    nick_name = ele.xpath('.//div[@class="WB_text"]/a/text()')[0]
    print(nick_name)


1.3 Analysis (open questions)

  1. Uncomment the three statements in the for loop to serialize each ele to a string. The output shows that every ele is different, so why can't the nickname be extracted correctly (the first one is always returned)?
  2. Adding "." to the xpath inside the for loop means "select the current node". Since each ele is already different, why is the "." needed?
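
The first observation can be reproduced on a small made-up page. This is only a sketch and assumes lxml is installed. The SAMPLE markup and the nicknames alice, bob and carol are invented for illustration and do not come from the real page.

```python
from lxml import etree

# Hypothetical markup that mimics the structure queried in this post.
SAMPLE = """
<div class="list">
  <div node-type="root_comment"><div class="WB_text"><a>alice</a>: first</div></div>
  <div node-type="root_comment"><div class="WB_text"><a>bob</a>: second</div></div>
  <div node-type="root_comment"><div class="WB_text"><a>carol</a>: third</div></div>
</div>
"""

html = etree.HTML(SAMPLE)
blocks = html.xpath('//div[@node-type="root_comment"]')

# Serializing each block shows that the three elements really are different.
serialized = [
    etree.tostring(b, encoding="utf-8", with_tail=False).decode("utf-8")
    for b in blocks
]

# The same path as in Method 1, called on each block in turn.
absolute = [b.xpath('//div[@class="WB_text"]/a/text()') for b in blocks]

for text, result in zip(serialized, absolute):
    print(text)
    print(result)  # every block yields ['alice', 'bob', 'carol']
```

Each block serializes differently, yet the query returns the full list of nicknames every time, so taking [0] always gives the first one.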

1.4 Reason

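  1. A path that begins with "//" is absolute: it is matched from the document root, even when xpath is called on ele. Every iteration therefore searches the whole page, and [0] always picks the first nickname.
  2. To match only what is under the current node ele, the path must start with ".", as in ".//".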
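
Both points can be checked with a short sketch, again assuming lxml is installed. The SAMPLE markup, the nicknames and the helper name extract_nicknames are hypothetical and only serve the illustration. The third block deliberately has no nickname link, which makes the difference between the two paths visible.

```python
from lxml import etree

# Hypothetical markup; the last comment block has no <a> nickname link.
SAMPLE = """
<div class="list">
  <div node-type="root_comment"><div class="WB_text"><a>alice</a>: first</div></div>
  <div node-type="root_comment"><div class="WB_text"><a>bob</a>: second</div></div>
  <div node-type="root_comment"><div class="WB_text">no link here</div></div>
</div>
"""


def extract_nicknames(page_source):
    """Return one nickname per comment block, or None if the block has none."""
    root = etree.HTML(page_source)
    nicknames = []
    for block in root.xpath('//div[@node-type="root_comment"]'):
        # Leading "." keeps the search inside the current block.
        found = block.xpath('.//div[@class="WB_text"]/a/text()')
        nicknames.append(found[0] if found else None)
    return nicknames


blocks = etree.HTML(SAMPLE).xpath('//div[@node-type="root_comment"]')

# An absolute path called on a block is still resolved from the document root.
root_tags = [b.xpath('/*')[0].tag for b in blocks]

# Without ".", the third block silently returns the first nickname on the page.
absolute_first = [b.xpath('//div[@class="WB_text"]/a/text()')[0] for b in blocks]

print(root_tags)
print(absolute_first)
print(extract_nicknames(SAMPLE))
```

Checking whether the result list is empty before taking [0] is a design choice of this sketch: with the relative path, a block without a nickname would otherwise raise IndexError.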


Origin blog.csdn.net/weixin_39190382/article/details/112634272