2022: an in-depth look at what cookies are, and how to set cookies in Python

One, what a cookie is and what it does

The HTTP protocol itself is stateless. What does stateless mean? It means the server keeps no information about earlier requests: the client sends a request, the Web server returns a response, and once that exchange is finished nothing about it is kept on the server side. HTTP requests are also always initiated by the client; the server cannot actively send data to the client. As a result, from the protocol alone the server cannot tell whether two requests come from the same user, so it cannot determine the user's identity. A cookie solves this problem. A cookie is a short piece of text information in key-value format. The client initiates a request to the server, and if the server needs to record the user's status, it issues a cookie to the client browser in the response. The browser saves the cookie. When the browser requests the website again, it submits the cookie to the server together with the requested URL, and the server checks the cookie to identify the status of the user.

For example, you go to ACBC Bank and apply for a bank card. Your identity information, mobile phone number, password and other details are associated with the card. The next time you do business, the machine recognizes your card and lets you proceed to the next step. The card plays the role of a cookie: only with this piece of authentication can you carry on with the next transaction.

Two, cookie mechanism

When a user visits and logs in to a website for the first time, the cookie setting and sending will go through the following 4 steps:

The client sends a request to the server -> the server sends an HTTP response to the client that contains a Set-Cookie header -> the client saves the cookie, and its next request to the server contains a Cookie header -> the server returns the response data
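The four steps above can be sketched without a real network connection. This is only an illustration using Python's standard `http.cookies` module; the helper names (`server_issue_cookie`, `client_store_cookie`, `client_cookie_header`), the cookie name `sessionid` and the value `abc123` are made up for the example, and a plain dict stands in for the browser's cookie store.

```python
from http.cookies import SimpleCookie


def server_issue_cookie(session_id):
    """Step 2: build the value of the Set-Cookie header the server sends."""
    cookie = SimpleCookie()
    cookie["sessionid"] = session_id  # hypothetical cookie name
    return cookie["sessionid"].OutputString()


def client_store_cookie(jar, set_cookie_value):
    """Step 3 (first half): parse Set-Cookie and save name/value in the jar."""
    parsed = SimpleCookie()
    parsed.load(set_cookie_value)
    for name, morsel in parsed.items():
        jar[name] = morsel.value


def client_cookie_header(jar):
    """Step 3 (second half): build the Cookie header for the next request."""
    return "; ".join(f"{name}={value}" for name, value in jar.items())


jar = {}  # stands in for the browser's cookie store
set_cookie = server_issue_cookie("abc123")
client_store_cookie(jar, set_cookie)
cookie_header = client_cookie_header(jar)

print("Set-Cookie:", set_cookie)
print("Cookie:", cookie_header)
```

A real browser also stores and checks the attributes described in the next section (domain, path, expiry) before it sends a cookie back; the sketch leaves that out.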

Three, cookie attribute items

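Attribute	Description
NAME=VALUE	Key-value pair: the Key/Value to be saved. Note that NAME must not be the same as the name of any other attribute
Expires	Expiration time: the cookie becomes invalid after the specified point in time
Domain	The domain that generated the cookie, e.g. domain="www.baidu.com"
Path	The path under which the cookie was generated, e.g. path=/wp-admin/
Secure	If this attribute is set, the cookie is only sent back over an SSL connection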
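As a rough illustration of how these attributes end up in a single Set-Cookie header, here is a small sketch using Python's standard `http.cookies` module. The cookie name `user`, the value `alice` and the expiry date are invented for the example; the domain and path are the ones from the table.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["user"] = "alice"  # NAME=VALUE (hypothetical name and value)
cookie["user"]["expires"] = "Wed, 01 Jan 2031 00:00:00 GMT"  # Expires (example date)
cookie["user"]["domain"] = "www.baidu.com"  # Domain
cookie["user"]["path"] = "/wp-admin/"  # Path
cookie["user"]["secure"] = True  # Secure: flag attribute, no value

# OutputString() returns the header value without the "Set-Cookie:" prefix.
header_value = cookie["user"].OutputString()
print("Set-Cookie:", header_value)
```

The attributes are appended after the NAME=VALUE pair and separated by semicolons, which is why NAME must not collide with an attribute name.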

Expires
This attribute sets the validity period of the cookie. In the Java Servlet Cookie class it is represented by maxAge, in seconds, and is read and written through getMaxAge() and setMaxAge(int maxAge). There are three cases for maxAge: positive, negative and 0.

  • If maxAge is a positive number, the cookie automatically expires after maxAge seconds.
  • If maxAge is a negative number, the cookie is only a temporary cookie and is not persisted. It is valid only in the current browser window and in child windows opened from it, and it becomes invalid as soon as the browser is closed.
  • If maxAge is 0, the cookie is deleted immediately.

So what is the difference between setting maxAge to a negative value and 0?

Setting maxAge to 0 deletes the cookie immediately.
Setting maxAge to a negative number does not delete it. You can see that the Expires property shown for the cookie changes, but the cookie still exists and stays usable until the browser is closed.
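The three cases can be tried out in Python with the standard `http.cookies` module. This is a sketch under one assumption: the negative-value convention belongs to the maxAge API described above, and here it is mirrored by leaving the Max-Age attribute out of the header altogether, which is how a temporary (session) cookie is expressed on the wire. The function name `set_cookie_header` and the cookie `token=xyz` are made up for the example.

```python
from http.cookies import SimpleCookie


def set_cookie_header(name, value, max_age):
    """Build a Set-Cookie header value following the three maxAge cases."""
    cookie = SimpleCookie()
    cookie[name] = value
    if max_age >= 0:
        # Positive: expire after max_age seconds. Zero: delete immediately.
        cookie[name]["max-age"] = max_age
    # Negative: no Max-Age attribute at all, so the browser treats the
    # cookie as temporary and drops it when the browser is closed.
    return cookie[name].OutputString()


persistent = set_cookie_header("token", "xyz", 3600)
deleted = set_cookie_header("token", "xyz", 0)
temporary = set_cookie_header("token", "xyz", -1)

print(persistent)
print(deleted)
print(temporary)
```

Comparing the three printed headers makes the difference between 0 and a negative value visible: the first two carry a Max-Age attribute, the last one carries none.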

Four, using cookies in Python

1. Write Cookie directly in the header

# coding:utf-8
import requests
from bs4 import BeautifulSoup

# Raw Cookie header value copied from the browser. It must be one line:
# a header value may not contain line breaks.
cookie = (
    'cisession=19dfd70a27ec0eecf1fe3fc2e48b7f91c7c83c60;'
    'CNZZDATA1000201968=1815846425-1478580135-https%253A%252F%252Fwww.baidu.com%252F%7C1483922031;'
    'Hm_lvt_f805f7762a9a237a0deac37015e9f6d9=1482722012,1483926313;'
    'Hm_lpvt_f805f7762a9a237a0deac37015e9f6d9=1483926368'
)
header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36',
    'Connection': 'keep-alive',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Cookie': cookie,
}
url = 'https://kankandou.com/book/view/22353.html'
# The cookie travels to the server as an ordinary request header.
wbdata = requests.get(url, headers=header).text
soup = BeautifulSoup(wbdata, 'lxml')
print(soup)

2. Pass cookies through the cookies parameter of requests

# coding:utf-8
import requests
from bs4 import BeautifulSoup

# Cookies as a dict of name -> value; requests builds the Cookie header from it.
cookie = {
    "cisession": "19dfd70a27ec0eecf1fe3fc2e48b7f91c7c83c60",
    "CNZZDATA100020196": "1815846425-1478580135-https%253A%252F%252Fwww.baidu.com%252F%7C1483922031",
    "Hm_lvt_f805f7762a9a237a0deac37015e9f6d9": "1482722012,1483926313",
    "Hm_lpvt_f805f7762a9a237a0deac37015e9f6d9": "1483926368",
}
url = 'https://kankandou.com/book/view/22353.html'
wbdata = requests.get(url, cookies=cookie).text
soup = BeautifulSoup(wbdata, 'lxml')
print(soup)
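The two methods take the same information in different shapes: method 1 wants the raw header string, method 2 wants a dict. If you have copied the raw string from the browser and would rather use the cookies parameter, a small helper can convert one into the other. This is a minimal sketch; the name `cookie_string_to_dict` is made up here, and it assumes pairs are separated by ";" and that each pair has the form name=value.

```python
def cookie_string_to_dict(raw):
    """Turn 'a=1; b=2' into {'a': '1', 'b': '2'}."""
    cookies = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # skip empty pieces, e.g. after a trailing ';'
        # Split on the first '=' only, so values may contain '=' themselves.
        name, _, value = pair.partition("=")
        cookies[name.strip()] = value.strip()
    return cookies


raw = (
    "cisession=19dfd70a27ec0eecf1fe3fc2e48b7f91c7c83c60; "
    "Hm_lpvt_f805f7762a9a237a0deac37015e9f6d9=1483926368"
)
cookies = cookie_string_to_dict(raw)
print(cookies)
```

The resulting dict can then be passed as `requests.get(url, cookies=cookies)`, exactly as in method 2.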

Origin blog.csdn.net/weixin_45598506/article/details/113121984