Scrapy: Sending a POST Request (Simple Renren Login)

1. Create the Scrapy project

scrapy startproject renrenSpider

2. Enter the project directory and create the Spider with the genspider command

scrapy genspider renren "renren.com"

3. Write the Spider that logs in and saves the returned page (in the spiders folder: renren.py)

# -*- coding: utf-8 -*-
import scrapy
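# Sending a POST request with Scrapy: a simple Renren login.
# Only the POST data is needed; no hidden or other form fields have to be fetched first.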
class RenrenSpider(scrapy.Spider):
    name = 'renren'
    allowed_domains = ['renren.com']
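    username = input("Enter account: ")
    password = input("Enter password: ")
    # To send a POST request as soon as the spider starts, override the Spider class's
    # start_requests(self) method; the URLs in start_urls are then no longer requested.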
    def start_requests(self):
        url = 'http://www.renren.com/PLogin.do'
        yield scrapy.FormRequest(
            url=url,
            formdata={"email": self.username, "password": self.password},
            callback=self.parse_newpage,
        )
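    def parse_newpage(self, response):
        # Save the page returned after the login POST so it can be inspected
        with open('renren.html', 'w', encoding='utf-8') as f:
            # response.text is the body decoded with the response's detected encoding
            f.write(response.text)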
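
As a rough illustration of what the POST carries: when `formdata` is supplied, `FormRequest` sends the fields as a URL-encoded form body. The sketch below reproduces that kind of encoding with the standard library only. The helper `build_form_body` and the sample credentials are hypothetical, introduced here for illustration; they are not part of Scrapy or of the spider above.

```python
from urllib.parse import urlencode, parse_qs


def build_form_body(formdata):
    """Encode a dict of form fields as an application/x-www-form-urlencoded body."""
    # urlencode percent-encodes reserved characters and turns spaces into '+'
    return urlencode(formdata).encode("utf-8")


# Hypothetical credentials, used only to show the encoding
body = build_form_body({"email": "user@example.com", "password": "p@ss word"})
print(body)  # b'email=user%40example.com&password=p%40ss+word'

# Decoding the body recovers the original field values
fields = parse_qs(body.decode("utf-8"))
print(fields)
```

Special characters such as `@` and spaces are escaped in the body, which is why the raw account and password can be passed to `formdata` as plain strings.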

4. Configure the settings file (settings.py)

# Renren works without this change; some sites may require setting the following parameter to False
# Obey robots.txt rules
ROBOTSTXT_OBEY = False
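
To see what kind of check `ROBOTSTXT_OBEY` switches off, the sketch below evaluates a made-up robots.txt with Python's standard `urllib.robotparser`. Scrapy applies its own robots.txt handling internally, so this is only an analogy. The rules, the `is_allowed` helper, the `MyBot` user agent, and the example.com URLs are all assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real site publishes its own rules
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


allowed_public = is_allowed(SAMPLE_ROBOTS_TXT, "MyBot", "http://example.com/index.html")
allowed_private = is_allowed(SAMPLE_ROBOTS_TXT, "MyBot", "http://example.com/private/page.html")
print(allowed_public, allowed_private)
```

With `ROBOTSTXT_OBEY = True`, a request to a disallowed path would be dropped before it is sent; setting it to `False` skips that check, so it is worth confirming that crawling the target site is permitted.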

5. With the setup complete, start crawling: run the project command crawl to launch the Spider:

scrapy crawl renren

Reposted from blog.csdn.net/z564359805/article/details/80735871