Simple cookie handling for Python crawlers with the urllib module
1. Crawl directly (the cookies are printed to the console):
import http.cookiejar,urllib.request
cookie = http.cookiejar.CookieJar() # declare a CookieJar object
handler = urllib.request.HTTPCookieProcessor(cookie) # build the handler
opener = urllib.request.build_opener(handler)
response = opener.open('url') # open the link (replace 'url' with the target URL)
for item in cookie:
    print(item.name + "=" + item.value)
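The loop above can be exercised without any network access by inserting a cookie into the jar by hand; the name, value, and domain below are made-up illustrative values:

```python
import http.cookiejar

jar = http.cookiejar.CookieJar()
# Build a cookie manually (all field values here are illustrative).
c = http.cookiejar.Cookie(
    version=0, name='session', value='abc123',
    port=None, port_specified=False,
    domain='example.com', domain_specified=True, domain_initial_dot=False,
    path='/', path_specified=True,
    secure=False, expires=None, discard=True,
    comment=None, comment_url=None, rest={},
)
jar.set_cookie(c)

# Iterate exactly as in the snippet above.
for item in jar:
    print(item.name + "=" + item.value)  # session=abc123
```

In real use the jar is filled automatically by the `HTTPCookieProcessor` when `opener.open()` receives `Set-Cookie` headers; this sketch only shows what the iteration yields.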
2. Save the cookies to a specified file:
import http.cookiejar,urllib.request
filename = 'cookies.txt' # specify a filename (the file type is usually txt)
cookie = http.cookiejar.MozillaCookieJar(filename)
handler = urllib.request.HTTPCookieProcessor(cookie)
opener = urllib.request.build_opener(handler)
response = opener.open('url')
cookie.save(ignore_discard=True,ignore_expires=True)
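A saved file can be read back with `MozillaCookieJar.load()`. The following offline round trip (the temp-file path and cookie values are illustrative) saves one hand-made cookie and confirms it survives reloading:

```python
import http.cookiejar, os, tempfile

# Illustrative path; in practice this would be your saved cookie file.
filename = os.path.join(tempfile.gettempdir(), 'cookies_demo.txt')

jar = http.cookiejar.MozillaCookieJar(filename)
# Insert one hand-made cookie so there is something to save (values are made up).
jar.set_cookie(http.cookiejar.Cookie(
    0, 'session', 'abc123', None, False,
    'example.com', True, False, '/', True,
    False, None, True, None, None, {},
))
jar.save(ignore_discard=True, ignore_expires=True)

# Load into a fresh jar and confirm the cookie survived the round trip.
loaded = http.cookiejar.MozillaCookieJar()
loaded.load(filename, ignore_discard=True, ignore_expires=True)
for item in loaded:
    print(item.name + "=" + item.value)  # session=abc123
```

Note that `load()` takes the same `ignore_discard`/`ignore_expires` flags as `save()`; without them, session cookies written with those flags would be skipped on reload.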
3. LWPCookieJar storage (same steps as above, but construct the jar as):
cookie = http.cookiejar.LWPCookieJar(filename)
Brief notes:
1. CookieJar():
An object that manages HTTP cookie values: it stores cookies set by HTTP responses and adds them to outgoing HTTP requests.
2. MozillaCookieJar():
A subclass of CookieJar that can read and save cookies, storing them in the Mozilla browser's cookies.txt format.
3. cookie.save() parameter explanation:
ignore_discard=True saves cookies even if they are marked to be discarded; ignore_expires=True saves cookies even if they have already expired. If the file already exists, it is overwritten.
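Once saved, a cookie file can be loaded back and attached to a fresh opener so that later requests reuse the stored session. A sketch (no request is made here, and the 'cookies.txt' filename is an assumed example; the load line is commented out so the snippet runs without a saved file):

```python
import http.cookiejar, urllib.request

cookie = http.cookiejar.MozillaCookieJar()
# cookie.load('cookies.txt', ignore_discard=True, ignore_expires=True)  # load a previously saved file
handler = urllib.request.HTTPCookieProcessor(cookie)
opener = urllib.request.build_opener(handler)
# opener.open(...) would now send the stored cookies automatically.
```

This mirrors step 1 exactly: the only difference is that the jar starts pre-populated from disk instead of empty.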
Reference and recommended reading: https://cuiqingcai.com/5052.html