Scrapy crawler framework (7) - POST requests

By default, Scrapy sends GET requests. This time we will send a POST request.
We again use the Youdao online dictionary and translation site as the example: http://fanyi.youdao.com/?keyfrom=fanyi.logo
The real URL that the page posts its translation requests to is: http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule
In actual use, the _o must be removed from this URL.
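
The URL adjustment described above can be sketched in a few lines of Python. The variable names browser_url and request_url are illustrative only and do not come from the original project.

```python
# URL as observed in the browser's network panel (taken from the text above).
browser_url = "http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule"

# Drop the "_o" suffix from the path; the query string is left untouched.
request_url = browser_url.replace("/translate_o?", "/translate?", 1)

print(request_url)
```

The result is the same URL that the spider below passes to FormRequest.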

First, create the project: make a new folder, hold down Shift and right-click inside it, choose to open a command window there, and run scrapy startproject youdaosipder.
Once the project is created, run scrapy genspider ydspider youdao.com. If the spider file does not appear, open a command window inside the newly created project folder and run the command again.

ydspider.py

# -*- coding: utf-8 -*-
import scrapy


# URL seen in the browser (drop "_o" before use): http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule
class YdspiderSpider(scrapy.Spider):
    name = 'ydspider'
    allowed_domains = ['fanyi.youdao.com']
    # start_urls = ['http://youdao.com/']

    def start_requests(self):
        url='http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
        # Add a POST request to the queue
        yield scrapy.FormRequest(
            url=url,
            formdata={
                    'i':'男人',
                    'from':'AUTO',
                    'to':'AUTO',
                    'smartresult':'dict',
                    'client':'fanyideskweb',
                    'salt':'15589655028559',
                    'sign':'6781389ab298673f7036bce9cd99815b',
                    'ts':'1558965502855',
                    'bv':'ab57a166e6a56368c9f95952de6192b5',
                    'doctype':'json',
                    'version':'2.1',
                    'keyfrom':'fanyi.web',
                    'action':'FY_BY_REALTlME'
            },
            callback=self.parse
        )

    def parse(self, response):
        print('-----------------------------------------------------------')
        print(response.body)
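
The parse method above prints the raw response bytes. Since the form asks for 'doctype':'json', the body can presumably be decoded as JSON. The following is a minimal sketch of that step under stated assumptions: the helper name extract_translations is hypothetical, and the "translateResult" / "tgt" layout and the sample payload are assumptions for illustration, not output captured from the site.

```python
import json


def extract_translations(body: bytes) -> list:
    """Return the translated strings found in a JSON response body."""
    data = json.loads(body.decode("utf-8"))
    translations = []
    # Assumed layout: "translateResult" holds a list of paragraphs, and each
    # paragraph is a list of segments with "src" (input) and "tgt" (output).
    for paragraph in data.get("translateResult", []):
        for segment in paragraph:
            translations.append(segment.get("tgt", ""))
    return translations


# Hypothetical payload used only to exercise the helper.
sample_body = json.dumps(
    {"translateResult": [[{"src": "男人", "tgt": "man"}]]},
    ensure_ascii=False,
).encode("utf-8")

print(extract_translations(sample_body))
```

Inside the spider, the same helper could be called as extract_translations(response.body) from parse, provided the real response follows the assumed layout.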

In settings.py, turn off the robots.txt rule by commenting out the ROBOTSTXT_OBEY line. A site's robots.txt file states which crawling behavior it allows and which it does not; it is worth reading up on.
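
As a sketch, the relevant line in settings.py could look like the following. Setting the value to False explicitly should have the same effect as commenting the line out, because Scrapy's built-in default for this setting is False, while the generated project template sets it to True.

```python
# settings.py (fragment)
# Do not consult the site's robots.txt before sending requests.
ROBOTSTXT_OBEY = False
```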

Enter scrapy crawl ydspider in the terminal (the black command window).

The output contains the translation result for 男人 ("man").

Reproduced from: https://www.jianshu.com/p/e96e33060ebe
