Crawling a Discuz forum with lxml etree and XPath

2020-02-25 10:57:20

Importing the module

Install the lxml library in PyCharm.
Import the module with from lxml import etree.
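
To confirm that the install and import worked before touching the network, a minimal sketch like the one below may help. The SAMPLE string is a hand-written, hypothetical document and is not taken from the forum.

```python
from lxml import etree

# A tiny hand-written document (hypothetical, only used to check the setup).
SAMPLE = "<html><body><p class='greeting'>hello</p></body></html>"

# etree.HTML parses a string and returns the root element of the tree.
root = etree.HTML(SAMPLE)

# xpath() returns a list; text() selects the text nodes of the matched elements.
greetings = root.xpath('//p[@class="greeting"]/text()')

# LXML_VERSION is a tuple with the installed lxml version.
print(etree.LXML_VERSION, greetings)
```

If the import fails, the library is most likely not installed in the interpreter that PyCharm is using for the project.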

Test

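import requests
from lxml import etree

url = "https://www.discuz.net/forum-developer-1.html"
# Download the page and fail early on an HTTP error status
response = requests.get(url, timeout=10)
response.raise_for_status()
html = etree.HTML(response.text)  # parse the page into an element tree
# Serialized copy of the parsed tree, handy for inspecting what lxml saw
context = etree.tostring(html, encoding="unicode")
# Two ways to select the children of the thread-list table: by path and by id
print(html.xpath('//div[@id="threadlist"]/div[2]/form/table/*'))
print(html.xpath('//*[@id="threadlisttableid"]/*'))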
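
Printing the result of xpath() only shows a list of Element objects, which is hard to read. The sketch below shows one way to summarize each matched node by its tag and id. It runs against a hand-written SAMPLE string whose layout is an assumption modeled on the XPath expressions in this post, not a copy of the real page, and child_summary is a hypothetical helper name.

```python
from lxml import etree

# Hypothetical markup, shaped like the structure the XPath expressions expect.
SAMPLE = """
<div id="threadlist">
  <table id="threadlisttableid">
    <tbody id="separatorline"><tr><td>separator</td></tr></tbody>
    <tbody id="normalthread_1"><tr><th><a class="s xst" href="thread-1-1-1.html">First thread</a></th></tr></tbody>
    <tbody id="normalthread_2"><tr><th><a class="s xst" href="thread-2-1-1.html">Second thread</a></th></tr></tbody>
  </table>
</div>
"""


def child_summary(page_source):
    """Return (tag, id) for every child element of the thread-list table."""
    root = etree.HTML(page_source)
    nodes = root.xpath('//*[@id="threadlisttableid"]/*')
    # Element.get returns the attribute value, or None if it is missing.
    return [(node.tag, node.get("id")) for node in nodes]


summary = child_summary(SAMPLE)
for tag, node_id in summary:
    print(tag, node_id)
```

On the live page, the same helper could be called with the downloaded page text instead of SAMPLE.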


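Both XPath expressions in the test script select all the tbody nodes of the forum's thread list. The following expression then takes the text of each thread-title link inside those nodes: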

print(html.xpath('//tbody/tr/th/a[@class="s xst"]/text()'))
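
The text() query returns only the titles. If the link of each thread is needed as well, one option is to select the a elements themselves and read both the text and the href attribute, as in the sketch below. SAMPLE is hand-written, hypothetical markup, extract_threads is a hypothetical helper, and using https://www.discuz.net/ as the base for relative links is an assumption.

```python
from urllib.parse import urljoin

from lxml import etree

BASE_URL = "https://www.discuz.net/"  # assumed base for relative thread links

# Hypothetical markup, shaped like the structure the XPath expression expects.
SAMPLE = """
<table id="threadlisttableid">
  <tbody id="normalthread_1"><tr><th><a class="s xst" href="thread-1-1-1.html">First thread</a></th></tr></tbody>
  <tbody id="normalthread_2"><tr><th><a class="s xst" href="thread-2-1-1.html">Second thread</a></th></tr></tbody>
</table>
"""


def extract_threads(page_source, base_url=BASE_URL):
    """Return a list of {'title', 'url'} dicts, one per thread-title link."""
    root = etree.HTML(page_source)
    threads = []
    for link in root.xpath('//tbody/tr/th/a[@class="s xst"]'):
        # itertext() also collects text nested inside child tags of the link.
        title = "".join(link.itertext()).strip()
        # urljoin turns a relative href into an absolute URL.
        threads.append({"title": title, "url": urljoin(base_url, link.get("href", ""))})
    return threads


threads = extract_threads(SAMPLE)
for thread in threads:
    print(thread["title"], thread["url"])
```

Note that [@class="s xst"] matches the class attribute as an exact string, so a link with extra or reordered classes would not be selected.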


浩翰 Redamancy


Origin blog.csdn.net/qq_43442524/article/details/103179535
