Example: Scraping Backpack Listings from Juanpi with a Python Crawler

Other 2020-02-19 21:50:51 Views: 0

This example uses requests and BeautifulSoup to scrape backpack names and prices from Juanpi (卷皮网).

Link: www.juanpi.com

Code:

import requests
from bs4 import BeautifulSoup

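# Fetch the search result page from the network
def getHtmlText(url):
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        # Guess the encoding from the response body instead of trusting the headers
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        # Return an empty page on any network or HTTP error so parsing finds nothing
        return ""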

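# Extract the product names and prices from the page and print them
def fillUnivList(html):
    soup = BeautifulSoup(html, "html.parser")
    for item in soup.find_all('div'):
        # Test the div's own class attribute, not str(div), so that ancestor
        # divs that merely contain a product card are not matched as well
        if 'list-good buy' not in ' '.join(item.get('class', [])):
            continue
        h3 = item.find('h3')
        link = h3.find('a') if h3 else None
        price = item.find('span', class_='price-current')
        if link is None or price is None:
            continue
        print('Product name: ' + link.get_text(strip=True))
        # get_text() avoids fixed character offsets into the tag's HTML;
        # the printed price may include a currency symbol from the page
        print('Price: ' + price.get_text(strip=True))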

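# Main routine
def main():
    goods = '书包'  # search keyword ("schoolbag"), sent to the site as-is
    depth = 2       # number of result pages to fetch
    url = 'http://www.juanpi.com/search?keywords=' + goods
    for i in range(1, depth + 1):
        print('Page ' + str(i) + ': ------------------------------------------------')
        html = getHtmlText(url)
        fillUnivList(html)
        # Page 1 has no page number in its path; later pages use /search/<n>
        url = 'http://www.juanpi.com/search/' + str(i + 1) + '?keywords=' + goods

if __name__ == '__main__':
    main()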
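
The two URL shapes that main() requests (no page number for the first page, /search/<n> for later pages) can be put into one small helper. This is only a sketch: build_search_url and BASE are hypothetical names that are not part of the original script, and it assumes the site accepts the keyword percent-encoded as UTF-8.

```python
from urllib.parse import quote

# Search endpoint taken from the script above
BASE = 'http://www.juanpi.com/search'

def build_search_url(keyword, page=1):
    """Return the search URL for a 1-based page number (hypothetical helper)."""
    # Percent-encode the keyword as UTF-8 (assumed to be what the site expects)
    query = '?keywords=' + quote(keyword)
    if page <= 1:
        # The first page has no page number in its path
        return BASE + query
    return BASE + '/' + str(page) + query

# URLs for the first two result pages of the keyword used in main()
urls = [build_search_url('书包', p) for p in range(1, 3)]
for u in urls:
    print(u)
```

With such a helper the loop in main() would no longer need to rebuild the URL by hand at the end of each iteration.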

This is example code I typed along while following the Beijing Institute of Technology web crawler MOOC. Link: https://www.bilibili.com/video/av9784617?from=search&seid=17441199644632730564
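
A note on extracting the price from the price-current span: fixed character offsets into the serialized tag, for example str(span)[38:-7], only work for one exact markup layout. The sketch below compares that approach with a regular expression. The sample markup is an assumption reconstructed from the offsets 38 and 7, not a copy of the real page, and price_by_offset, price_by_regex and PRICE_RE are hypothetical names.

```python
import re

# Hypothetical sample of the price markup; the real page may differ
sample = '<span class="price-current"><em>¥</em>39.00</span>'
# The same span with one extra attribute, as a site redesign might add
changed = '<span class="price-current" data-id="1"><em>¥</em>39.00</span>'

def price_by_offset(span_html):
    # Skip 38 leading characters (opening tag plus the <em> currency tag)
    # and drop the 7 trailing characters of '</span>'
    return span_html[38:-7]

# A number, optionally with decimals, directly before the closing </span>
PRICE_RE = re.compile(r'(\d+(?:\.\d+)?)\s*</span>\s*$')

def price_by_regex(span_html):
    m = PRICE_RE.search(span_html)
    return m.group(1) if m else None

print(price_by_offset(sample), price_by_regex(sample))
print(price_by_offset(changed), price_by_regex(changed))
```

On the sample markup both functions return the same price, but only the regular expression still does so once the opening tag changes length. Reading the tag's text with BeautifulSoup's get_text() is another way to avoid the offsets.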

Reposted from www.cnblogs.com/yue1234/p/12333318.html
