Python study notes: a simple crawler

A simple crawler that scrapes the Douban Top 250 movie list.
The code is as follows:

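import requests
from pyquery import PyQuery as pq

# Assumption: Douban tends to reject requests that lack a browser-like User-Agent.
headers = {'User-Agent': 'Mozilla/5.0'}

# 10 pages of 25 movies each: start = 0, 25, ..., 225 (the stop value 250 is excluded)
for start in range(0, 250, 25):
    url = 'https://movie.douban.com/top250?start={}'.format(start)
    html = requests.get(url, headers=headers).text
    for item in pq(html)('.item').items():
        num = item.find('.pic em').text()          # rank on the list
        title = item.find('.title').text()         # already a str in Python 3, no decoding needed
        img = item.find('.pic img').attr('src')    # poster image URL
        rating = item.find('.rating_num').text()   # rating score
        print(num, title, rating, img)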
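The `start` query parameter is the offset of the first movie on each page, and each page holds 25 movies, so covering all 250 entries needs the offsets 0 through 225. Because `range()` excludes its stop value, `range(0, 225, 25)` ends at 200 and yields only 9 pages. Below is a minimal sketch for checking the page URLs without touching the network; `BASE_URL` and `page_urls` are hypothetical names introduced here for illustration, not part of the original code.

```python
BASE_URL = 'https://movie.douban.com/top250?start={}'

def page_urls(total=250, per_page=25):
    """Return one URL per results page (hypothetical helper)."""
    # range() excludes its stop value, so the stop must be `total`,
    # not the last offset we want to visit.
    return [BASE_URL.format(offset) for offset in range(0, total, per_page)]

urls = page_urls()
print(len(urls))   # 10
print(urls[-1])    # https://movie.douban.com/top250?start=225
```

Building the list first makes it easy to confirm the page count before sending any requests.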



Reposted from blog.csdn.net/qq_40452317/article/details/80383593