Small Python learning problems: reading a file line by line into a list, and fixing garbled responses when crawling Douban comments with Scrapy

Reading a file line by line into a list in Python

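# f is an already opened text file object
L = f.readlines()
# Strip trailing whitespace and keep only the part before the first ':'
L = [i.rstrip().split(':')[0] for i in L]
# print(L)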
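
The same idea can be wrapped in a small self-contained function. This is only a sketch: the function name `first_fields`, the `sep` parameter, and the sample file contents are assumptions made for illustration and are not part of the original snippet.

```python
import os
import tempfile


def first_fields(path, sep=":"):
    """Return the text before the first `sep` on every line of the file."""
    with open(path, encoding="utf-8") as f:
        # Iterating over the file object yields one line at a time.
        return [line.rstrip().split(sep)[0] for line in f]


# Hypothetical sample data, written to a temporary file for the demo.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("alice:1\nbob:2\n")

result = first_fields(tmp.name)
os.remove(tmp.name)
print(result)
```

Iterating over the file object avoids building the intermediate list that `readlines()` creates, which can matter for large files.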

Scrapy returns garbled text when crawling Douban comments
Solution:
Note: the Cookie and User-Agent headers, and possibly the matching Host header, are very important, so it is best to copy all of them from your own browser. Of course, this only suits small-scale crawling; for large-scale crawling, investigate further whether certain parameters, such as the User-Agent, can be swapped out.
Because the problem lies in the request headers, you can replace Scrapy's header configuration directly with the headers from the browser:
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
# With duplicate keys the later entry wins, so only one of each is kept. Scrapy decodes 'br' (Brotli) responses only if a Brotli package is installed.
'Accept-Encoding': 'gzip, deflate',
'Accept-Language': 'zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3',

'Cache-Control': 'max-age=0',
'Connection': 'keep-alive',
'Cookie': 'replace with your own',
'Host': 'www.douban.com',
'Upgrade-Insecure-Requests': '1',
'User-Agent': 'replace with your own',
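
One way to apply headers like these project-wide is Scrapy's `DEFAULT_REQUEST_HEADERS` setting in `settings.py`. The fragment below is a sketch under stated assumptions: the placeholder values must be replaced with the ones copied from your own browser, and `COOKIES_ENABLED = False` is assumed here so that the hand-written `Cookie` header is sent as-is, since Scrapy's cookie middleware may otherwise manage cookies itself.

```python
# settings.py (fragment, a sketch rather than a complete configuration)

# Assumption: turn off Scrapy's cookie middleware so the Cookie header
# below is sent unchanged.
COOKIES_ENABLED = False

DEFAULT_REQUEST_HEADERS = {
    "Accept": (
        "text/html,application/xhtml+xml,application/xml;q=0.9,"
        "image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3"
    ),
    "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.3",
    # 'br' is left out: Scrapy decodes Brotli only if a Brotli package is installed.
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3",
    "Cache-Control": "max-age=0",
    "Connection": "keep-alive",
    "Cookie": "replace with your own",       # placeholder
    "Host": "www.douban.com",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "replace with your own",   # placeholder
}
```

The same dictionary could instead be passed as the `headers` argument of an individual `scrapy.Request` if only some requests need it.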


Origin www.cnblogs.com/stillstep/p/11135942.html