Scraping the Gushiwen classical poetry website

  1. The "TypeError: Type 'list' cannot be serialized." problem
  2. https://www.bilibili.com/video/BV1aJ411C7oM?p=24
  3. https://stackoverflow.com/questions/56726008/how-can-i-solve-this-particular-typeerror-type-nonetype-cannot-be-serialized
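
The error in item 1 typically appears when the list returned by `xpath()` is passed directly to `etree.tostring`, which expects a single element or tree. A minimal offline sketch, assuming a made-up HTML snippet (`SAMPLE_HTML`) in place of the live page:

```python
from lxml import etree

# Hypothetical stand-in for the downloaded page; not the real gushiwen.org markup.
SAMPLE_HTML = """
<html><body>
  <div class="left">
    <div class="sons"><p>first poem</p></div>
    <div class="sons"><p>second poem</p></div>
  </div>
</body></html>
"""

html = etree.HTML(SAMPLE_HTML)
divs = html.xpath("//div[@class='left']/div[@class='sons']")  # a Python list of elements

try:
    etree.tostring(divs)  # passing the list itself reproduces the error
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Serializing each element on its own works.
serialized = [etree.tostring(d, encoding="utf-8").decode("utf-8") for d in divs]

print(error_message)
for s in serialized:
    print(s.strip())
```

The same reasoning applies to any call that expects one element: loop over the XPath result, or index into it, before serializing.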
# Fetch the gushiwen.org front page and extract the poem text with lxml.
import requests
from lxml import etree

HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.162 Mobile Safari/537.36',
    'Referer': 'https://www.gushiwen.org/default_1.aspx'
}
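url = "https://www.gushiwen.org/default_1.aspx"
response = requests.get(url, headers=HEADERS, timeout=10)
response.raise_for_status()
text = response.text
html = etree.HTML(text)
# The 'no-touch' class on <html> is likely added by JavaScript in the browser,
# so it is not matched here; select the poem blocks by their own classes instead.
# xpath() returns a list of text nodes (strings), not a single element.
ul = html.xpath("//div[@class='main3']/div[@class='left']/div[@class='sons']//text()")
print(ul)
# Write the non-empty text nodes to a file, one per line.
with open('1.txt', 'w', encoding='utf-8') as fo:
    fo.write('\n'.join(t.strip() for t in ul if t.strip()))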
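# Browser-copied XPaths of two poem blocks, kept for reference:
# /html/body/div[2]/div[1]/div[2]/div[1]/div[2]
# /html/body/div[2]/div[1]/div[4]/div[1]/div[2]

# To serialize elements instead of text, query for the divs and pass each
# element (not the whole list) to etree.tostring:
# divs = html.xpath("//div[@class='left']/div[@class='sons']")
# for div in divs:
#     print(etree.tostring(div, encoding="utf-8").decode("utf-8"))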
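
A `//text()` query typically returns many whitespace-only strings alongside the real content, and opening the output file without an explicit encoding can fail on Chinese text on some platforms. A minimal sketch of cleaning and saving the result, assuming a hypothetical helper `clean_text_nodes`, made-up sample nodes, and a temporary output path:

```python
import os
import tempfile


def clean_text_nodes(nodes):
    """Strip surrounding whitespace and drop empty strings from text() results."""
    return [n.strip() for n in nodes if n.strip()]


# Hypothetical sample of what a //text() query can return; not real page output.
raw_nodes = ["\n", "静夜思", "\n  ", "李白", " 床前明月光,疑是地上霜。 ", "\n"]
lines = clean_text_nodes(raw_nodes)

# Write with an explicit UTF-8 encoding so non-ASCII text is saved reliably.
path = os.path.join(tempfile.mkdtemp(), "poems.txt")
with open(path, "w", encoding="utf-8") as fo:
    fo.write("\n".join(lines))

# Read the file back to confirm what was saved.
with open(path, encoding="utf-8") as fo:
    saved = fo.read()
print(saved)
```

Using `with` closes the file even if writing raises, which avoids the empty-file outcome of opening and closing a file without writing to it.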







Reposted from blog.csdn.net/szuwaterbrother/article/details/105321513