Scraping Taobao Product Information with a Web Crawler

Other 2018-11-11 11:10:50 Views: 0

Without further ado, here is the code:

import re
import requests

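def getHtml(url):
    """Fetch url and return its text, or "" if the request fails."""
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        # Log the HTTP status for debugging.
        print(r.status_code)
        # Use the encoding detected from the body instead of the header default.
        r.encoding = r.apparent_encoding
        return r.text
    except requests.RequestException:
        return ""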


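def parserPage(lis, html):
    """Append [price, title] pairs found in html to lis."""
    # Capture only the quoted values, so neither eval() nor splitting on ':'
    # is needed (a title containing ':' would break the split).
    plt = re.findall(r'"view_price":"([\d.]*)"', html)
    tlt = re.findall(r'"raw_title":"(.*?)"', html)

    # zip stops at the shorter list, so mismatched counts cannot raise IndexError.
    for price, title in zip(plt, tlt):
        lis.append([price, title])
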

def printGoods(lis):
    """Print the collected goods as a numbered table."""
    tplt = "{:4}\t{:8}\t{:16}"
    print(tplt.format("No.", "Price", "Title"))
    for count, (price, title) in enumerate(lis, start=1):
        print(tplt.format(count, price, title))
    print()

def main():
    goods = "书包"  # search keyword ("schoolbag")
    counts = 2  # number of result pages to fetch
    start_url = "https://s.taobao.com/search?q=" + goods
    infoList = []
    for i in range(counts):
        # Assumes 44 items per page, with `s` as the offset of the first item.
        url = start_url + "&s=" + str(44 * i)
        html = getHtml(url)
        parserPage(infoList, html)
    printGoods(infoList)


if __name__ == "__main__":
    main()
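
The extraction step can be tried offline, without sending any request to Taobao. The following is a minimal sketch, assuming the search page embeds item data as JSON-like text with "view_price" and "raw_title" keys, which is what the patterns in the script expect. The sample string and the helper names extract_items and build_page_url are hypothetical and exist only for illustration; the 44-item page size is taken from the commented-out lines of the script.

```python
import re
from urllib.parse import urlencode

# Hypothetical fragment imitating the JSON-like text the script's patterns expect.
SAMPLE_HTML = (
    '{"raw_title":"Canvas backpack: large","view_price":"59.00"},'
    '{"raw_title":"Leather schoolbag","view_price":"128.50"}'
)

PRICE_RE = re.compile(r'"view_price":"([\d.]*)"')
TITLE_RE = re.compile(r'"raw_title":"(.*?)"')


def extract_items(html):
    """Return [price, title] pairs found in html (hypothetical helper)."""
    prices = PRICE_RE.findall(html)
    titles = TITLE_RE.findall(html)
    # zip stops at the shorter list, so a missing title cannot raise IndexError.
    return [[price, title] for price, title in zip(prices, titles)]


def build_page_url(keyword, page, page_size=44):
    """Build a search URL for a zero-based page (hypothetical helper).

    Assumes the `s` parameter is the offset of the first item on the page.
    """
    query = urlencode({"q": keyword, "s": page * page_size})
    return "https://s.taobao.com/search?" + query


items = extract_items(SAMPLE_HTML)
print(items)
print(build_page_url("书包", 2))
```

Because the capture groups return only the quoted value, a title that contains a colon is handled without splitting the match or calling eval. Building the query with urlencode also percent-encodes the keyword explicitly instead of leaving that to the HTTP library.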

Reposted from blog.csdn.net/DZMNLFH/article/details/83903359
