Python Web Scraping with requests: Shanbay Words - 代码天地


Other 2019-03-24 11:20:59

The code, using XPath to filter the results:

```python
import requests
from lxml import etree

# Vocabulary list, filled in by shanbei()
words = []


def shanbei(page):
    url = 'https://www.shanbay.com/wordlist/104899/202159/?page=%s' % page
    print(url)

    rsp = requests.get(url)
    # rsp.text is an attribute, not a method, so it is not called
    html = etree.HTML(rsp.text)
    # Find every <tr> element
    tr_list = html.xpath('//tr')
    for tr in tr_list:
        word = {}
        # Find the word itself
        strong = tr.xpath('.//strong')
        if len(strong):
            name = strong[0].text.strip()
            word['name'] = name
        # Find the word's definition
        td_content = tr.xpath('./td[@class="span10"]')
        if len(td_content):
            content = td_content[0].text.strip()
            word['content'] = content

        if word != {}:
            words.append(word)


if __name__ == '__main__':
    # Starts with page 1; change the page number as needed
    shanbei(1)
    print(words)
```
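
To check the XPath expressions without touching the network, the parsing step can be separated from the request. The following is only a minimal sketch: `parse_words` and `SAMPLE_HTML` are hypothetical names, and the sample markup assumes nothing beyond what the expressions above imply (a `<strong>` holding the word and a `<td class="span10">` holding the definition). The real page may be structured differently.

```python
from lxml import etree

# Hypothetical markup, modeled only on the XPath expressions used above.
SAMPLE_HTML = """
<table>
  <tr><td class="span2"><strong>apple</strong></td><td class="span10">n. a round fruit</td></tr>
  <tr><td class="span2"><strong> book </strong></td><td class="span10"> n. a written work </td></tr>
  <tr><td>row without a word</td></tr>
</table>
"""


def parse_words(html_text):
    """Return a list of {'name', 'content'} dicts found in html_text."""
    root = etree.HTML(html_text)
    result = []
    for tr in root.xpath('//tr'):
        word = {}
        # The word is expected inside a <strong> element
        strong = tr.xpath('.//strong')
        if strong and strong[0].text:
            word['name'] = strong[0].text.strip()
        # The definition is expected in a direct child <td class="span10">
        td_content = tr.xpath('./td[@class="span10"]')
        if td_content and td_content[0].text:
            word['content'] = td_content[0].text.strip()
        # Rows with neither field are skipped
        if word:
            result.append(word)
    return result


if __name__ == '__main__':
    print(parse_words(SAMPLE_HTML))
```

Unlike the original loop, this version also skips elements whose `.text` is `None`, which would otherwise raise an `AttributeError` on `.strip()`.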

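Because the page number is just a parameter in the URL, several pages can be collected in a loop. This is a sketch under assumptions rather than the author's method: `collect_pages`, `fetch`, and `fake_fetch` are hypothetical names, and `fetch` stands for any callable that returns HTML text for a URL, so the loop can be tried with a stub before it is pointed at the real site.

```python
import time
from lxml import etree

BASE_URL = 'https://www.shanbay.com/wordlist/104899/202159/?page=%s'


def collect_pages(fetch, first_page, last_page, delay=0.0):
    """Collect the text of every <strong> inside a <tr>, page by page."""
    names = []
    for page in range(first_page, last_page + 1):
        html = etree.HTML(fetch(BASE_URL % page))
        for strong in html.xpath('//tr//strong'):
            if strong.text:
                names.append(strong.text.strip())
        # Optional pause between requests (an assumption, not a site rule)
        if delay:
            time.sleep(delay)
    return names


def fake_fetch(url):
    # Stub standing in for a real HTTP request; echoes the page number back.
    page = url.rsplit('=', 1)[1]
    return '<table><tr><td><strong>word%s</strong></td></tr></table>' % page


if __name__ == '__main__':
    print(collect_pages(fake_fetch, 1, 3))
```

With the real site, `fetch` could be something like `lambda url: requests.get(url, timeout=10).text`, ideally with a non-zero `delay`.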

Reposted from blog.csdn.net/qq_31235811/article/details/88771174
