Scraping Douban with Python

Category: Other, posted 2018-12-21 17:28:00

...

import urllib.request
import time
from bs4 import BeautifulSoup

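def url_open(url):
    # Douban may reject requests that lack a browser-like User-Agent,
    # so send one explicitly (the exact string here is only an example).
    request = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    return urllib.request.urlopen(request)


def parse_html(response):
    """Print each movie on one Top 250 page; return the next page's URL, or None when done."""
    html_content = response.read()
    html_soup = BeautifulSoup(html_content, 'html.parser', from_encoding='utf-8')
    for li in html_soup.find_all('li'):
        em = li.find('em')  # holds the rank
        title = li.find_all('span', class_='title')
        # other = li.find_all('span', class_='other')
        rating = li.find('span', class_='rating_num')
        if not title:
            continue  # this <li> is not a movie entry
        rank = em.get_text()
        print("Rank: " + rank + " ------ Rating: " + rating.get_text() + " ------- " + title[0].get_text())
        # rank is a string, so convert it before comparing with a number
        if int(rank) == 250:
            return None
        # each page lists 25 movies; the last rank on a page is the next start offset
        if int(rank) % 25 == 0:
            return "https://movie.douban.com/top250?start=" + rank + "&filter="
    return None  # no movie entries found, so stop paging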

url = "https://movie.douban.com/top250?start=0&filter="
if __name__ == '__main__':
    response = url_open(url)
    start_time = time.time()
    print("Start: " + str(start_time))
    while True:
        # parse_html returns the next page's URL, or None after the last page
        url = parse_html(response)
        if url is None:
            break
        response = url_open(url)
    end_time = time.time()
    print("End: " + str(end_time))
    print("Total time: " + str(end_time - start_time) + " seconds")
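
The script above finds each next page by reading the last rank on the current one. Because the list has a fixed size, the page URLs could also be computed up front. The following is only a minimal sketch of that idea: `page_urls` is a hypothetical helper, and the values 250 and 25 are assumptions taken from the script above, not from any Douban documentation.

```python
from urllib.parse import urlencode

BASE_URL = "https://movie.douban.com/top250"  # same endpoint as the script above


def page_urls(total=250, page_size=25):
    """Return the URL of every list page, in order.

    `total` and `page_size` are assumptions taken from the script above
    (250 movies, 25 per page); `page_urls` itself is a hypothetical helper.
    """
    urls = []
    for start in range(0, total, page_size):
        # urlencode produces "start=<n>&filter=", matching the original query string
        urls.append(BASE_URL + "?" + urlencode({"start": start, "filter": ""}))
    return urls


if __name__ == "__main__":
    for page_url in page_urls():
        print(page_url)
```

With a list like this, the main loop would not need `parse_html` to return a URL at all; it could fetch and parse each entry in turn.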

Reposted from www.cnblogs.com/mysterious-killer/p/10156985.html
