requests module - use of cookies parameter

Carry cookie in headers parameter

Websites often use the Cookie field in the request header to maintain the user's login state, so we can add Cookie to the headers parameter to simulate a request from an ordinary logged-in user.

First, let's understand how cookies are passed in HTTP requests.

When a browser visits a website for the first time, the server can return a Set-Cookie header in the response that carries a cookie, for example one named "sessionID". The browser saves this cookie locally and automatically sends it back to the server in the Cookie field of the request header on subsequent visits.
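The exchange described above can be sketched with the standard library alone. This is only an illustration: the header value, including the name "sessionID" and the value "abc123", is a made-up example, not something a real server sent.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie header value a server might return on the first visit
set_cookie_header = "sessionID=abc123; Path=/; HttpOnly"

# Parse it the way a client would, keeping the name/value pair and its attributes
jar = SimpleCookie()
jar.load(set_cookie_header)

# On later visits only "name=value" is sent back, in the Cookie request header
cookie_header = "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())
print(cookie_header)
```

Attributes such as Path and HttpOnly stay on the client side; they control when the cookie is sent, but they are not part of the Cookie header itself.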

Cookies play a role similar to the receipt you get from a store. When you visit a website, the server asks your browser to store a cookie containing information such as your activity on the site or your login status, and the browser saves it.

The next time you visit the same website again, the browser will automatically send the saved cookie to the server in the request. Through this cookie, the server can recognize you as a previous visitor, and provide personalized services based on your personal preferences or login status, such as displaying products you have browsed before, maintaining your login status, etc.

Through cookies, the website can remember your preferences and activity records to provide you with a better user experience. For example, on a shopping website, cookies can help save the contents of your shopping cart, so that you do not lose the items in it when you come back after leaving the website. Cookies can also be used to identify users, to ensure that only authorized users can access specific pages or functions.

However, cookies also have some disadvantages. They increase network traffic, because every request to the site carries the cookie information. Since cookies are stored in the user's browser, they can also be stolen and exploited by attackers. Finally, due to browser restrictions, a cookie is only sent to the domain that created it and cannot be read or used under other domains.
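The domain restriction can be illustrated with a small check. This is a simplified sketch of the usual browser rule (exact host, or a subdomain of the cookie's domain), not the full algorithm browsers implement; the function name `domain_matches` is hypothetical.

```python
def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """Return True if a cookie set for cookie_domain would be sent to request_host."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    # Either the same host, or a subdomain ending in ".<domain>"
    return host == domain or host.endswith("." + domain)


print(domain_matches("github.com", "github.com"))
print(domain_matches("gist.github.com", ".github.com"))
print(domain_matches("example.com", "github.com"))
```

Under this rule, a cookie copied from github.com is of no use when requesting a page on an unrelated domain.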

To sum up, a cookie is a mechanism for storing user information in the browser, which helps websites identify users and provide personalized services. Like a shopping receipt, it lets you pick up where you left off when you return to a website after some time. But we also need to pay attention to cookie security and browser restrictions.

Let's take GitHub login as an example:

GitHub login packet capture analysis

  1. Open the browser, right-click and choose Inspect, click Network, and check Preserve log
  2. Visit the GitHub login URL https://github.com/login
  3. After entering the account and password and clicking login, visit a URL that only returns the correct content when logged in, such as clicking Your profile in the upper right corner to visit https://github.com/USER_NAME
  4. After determining the URL, find the User-Agent and Cookie in the request headers that are needed to send the request


complete code

  • Copy the User-Agent and Cookie from the browser
  • The field names and values in the headers parameter must match the request header fields and values shown in the browser
  • The value for the Cookie key in the headers dictionary is a string
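To avoid retyping fields by hand, the copied header text can be turned into a headers dictionary. The helper below is a minimal sketch that assumes each header was copied as one `Name: value` line; the name `headers_from_copied_text` and the sample values are hypothetical.

```python
def headers_from_copied_text(raw: str) -> dict:
    """Build a headers dict from request headers copied out of the browser."""
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and HTTP/2 pseudo-headers such as ":authority"
        if not line or line.startswith(":") or ":" not in line:
            continue
        # Split on the first colon only, since values may contain colons
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers


raw_headers = """
:authority: github.com
User-Agent: Mozilla/5.0 (hypothetical browser string)
Cookie: logged_in=yes; tz=Asia%2FShanghai
"""
headers = headers_from_copied_text(raw_headers)
print(headers)
```

The Cookie value stays a single string, exactly as it appeared in the browser.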

Run the code to verify the result

Search for title in the printed output. If the text of the title element in the HTML is your GitHub account, you have successfully used the headers parameter to carry the cookie and fetch a page that can only be accessed after login.
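Instead of searching the printed output by eye, the title check can be automated. This is a rough sketch of the heuristic described above, using only the standard library; `extract_title`, `title_mentions_user` and the sample HTML are hypothetical, and the actual title GitHub returns may differ.

```python
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text of the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    parser = _TitleParser()
    parser.feed(html)
    return parser.title.strip()


def title_mentions_user(html: str, user_name: str) -> bool:
    # Case-insensitive check that the account name appears in the page title
    return user_name.lower() in extract_title(html).lower()


# Made-up response body standing in for resp.text
sample_html = "<html><head><title>USER_NAME - GitHub</title></head><body></body></html>"
print(extract_title(sample_html))
print(title_mentions_user(sample_html, "USER_NAME"))
```

In a real run, `resp.text` from the request would be passed in place of `sample_html`.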

Example code (the Cookie value is only a placeholder, so no valid cookie is sent):

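# Import the requests module, used to send HTTP requests
import requests

# URL of the GitHub user profile page to visit
url = 'https://github.com/USER_NAME'

# Build the request headers dictionary, containing User-Agent and Cookie
headers = {
    # Set the User-Agent so the request looks like it comes from a browser
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36',
    # Set the Cookie, used for authentication or other required information
    'Cookie': 'xxx (the cookie string copied from the browser goes here)'
}

# Send the GET request, carrying the request headers
resp = requests.get(url, headers=headers)

# Print the text content of the response
print(resp.text)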



Example code (with cookies):

# Import the requests module, used to send HTTP requests and receive responses
import requests

# Define the request headers, setting the User-Agent and the Cookie
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36',
    'cookie': '_octo=GH1.1.814865664.1688632559; preferred_color_mode=light; tz=Asia%2FShanghai; _device_id=17dd5862d283d3d5d80b4fcad269acb6; has_recent_activity=1; user_session=M09z7GkbQ7wsMNg0LAKMDj5d8_BKYppsYvo0jf7eLuvKASb8; __Host-user_session_same_site=M09z7GkbQ7wsMNg0LAKMDj5d8_BKYppsYvo0jf7eLuvKASb8; tz=Asia%2FShanghai; color_mode=%7B%22color_mode%22%3A%22auto%22%2C%22light_theme%22%3A%7B%22name%22%3A%22light%22%2C%22color_mode%22%3A%22light%22%7D%2C%22dark_theme%22%3A%7B%22name%22%3A%22dark%22%2C%22color_mode%22%3A%22dark%22%7D%7D; logged_in=yes; dotcom_user=emo-github; _gh_sess=vZDgHKIDHKFl5a5PjnZCG0JPH6SmoIxGFjgFUPRZJRmwEodGQdkye7SLZ3n6B9amGQe6P%2BZZsq%2BMn2Wkb9usRBRJ%2Bm0mEIpWENwa8jvuZt7z%2BJGrubtvG3%2BKx6KLvf6%2FTQiJRcn%2BZRs%2BaW1jEiyehy%2B5rPpwsjeruKvwb9BV1yf%2BQJOiNrq2i2u7waDegvtORzIj16VrNgajbN%2BUlaJ7DTuTeN4SLano6nVGgGUFOHltiCh2VVj9xiq7Rh1FkK5RurRqMHKzsythVkL2H2L0Y5Q%2B8ntqEy6XaOKvLnLAN9fUHSkrY7xvd2HHfp%2BMsCipZoPW%2B2FJ3cxZ9ku0VrvJ1F5ApOMzPnNysTw87NB94qt%2FkOEYHwpHHP%2FvA7TsnPsxWWYvt1zL5bjI2KaIRqxwmnYs7%2FJriwyEcX%2BONkx3MzrVqzCFbujAmiWDp3%2B7vTEIQq0u4xxhO3vMEm5ZDg%3D%3D--zixVSZWwAjTLtBjB--vrsVHgpwEzS6viX%2FrXY2SA%3D%3'
}

# Define the URL to request
url = 'https://github.com/emo-github'

# Send the GET request and get the response
response = requests.get(url, headers=headers)

# Decode the response body as UTF-8 (the default for decode()) and print it
print(response.content.decode())



Knowledge point: master carrying cookies in the headers parameter

Use of the cookies parameter

In the previous section, we carried cookies in the headers parameter. We can also use the dedicated cookies parameter.

  1. The format of the cookies parameter: a dictionary

    cookies = {"cookie name": "cookie value"}

    • The dictionary corresponds to the Cookie string in the request header, in which the name=value pairs are separated by a semicolon and a space
    • The left side of each equal sign is the name of a cookie, corresponding to a key of the cookies dictionary
    • The right side of each equal sign is the cookie's value, corresponding to the value in the cookies dictionary
  2. How to use the cookies parameter (it must be passed by keyword, because the second positional argument of requests.get is params)

    response = requests.get(url, cookies=cookies)

  3. Converting the cookie string into the dictionary needed by the cookies parameter (splitting on the first '=' only, so values that contain '=' are not cut off):

    cookies_dict = {cookie.split('=', 1)[0]: cookie.split('=', 1)[-1] for cookie in cookies_str.split('; ')}

  4. Note: cookies generally have an expiration time, and once expired, they need to be obtained again
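The conversion in step 3 can also be written as a pair of small helpers. This is a minimal sketch; the names `cookie_str_to_dict` and `cookie_dict_to_str` and the sample string are hypothetical. It uses `str.partition` so that a value containing `=` is kept whole.

```python
def cookie_str_to_dict(cookies_str: str) -> dict:
    """Turn 'a=1; b=2' (a Cookie header value) into {'a': '1', 'b': '2'}."""
    cookies = {}
    for pair in cookies_str.split(";"):
        # partition splits on the first '=' only
        name, sep, value = pair.strip().partition("=")
        if name and sep:
            cookies[name] = value
    return cookies


def cookie_dict_to_str(cookies: dict) -> str:
    """Inverse operation: join the pairs with a semicolon and a space."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


# Made-up cookie string; 'token' has a value that itself contains '='
cookies_str = "logged_in=yes; token=abc==; tz=Asia%2FShanghai"
cookies_dict = cookie_str_to_dict(cookies_str)
print(cookies_dict)
print(cookie_dict_to_str(cookies_dict) == cookies_str)
```

Taking the last piece of `split('=')` instead would turn the `token` value in this sample into an empty string, which is why only the first `=` is treated as the separator.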

# Import the requests module, used to send HTTP requests
import requests

# URL of the GitHub user profile page to visit
url = 'https://github.com/USER_NAME'

# Build the request headers dictionary
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36'
}

# Build the cookies dictionary
cookies_str = 'cookie string copied from the browser'

# Split on the first '=' only, so values that contain '=' stay intact
cookies_dict = {
    cookie.split('=', 1)[0]: cookie.split('=', 1)[-1] for cookie in cookies_str.split('; ')}

# Send the GET request, passing the cookies through the cookies parameter
resp = requests.get(url, headers=headers, cookies=cookies_dict)

print(resp.text)
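To compare the two approaches without logging in or sending anything, a request can be prepared and inspected locally. The sketch below assumes the requests library is installed and uses made-up cookie values; `requests.Request(...).prepare()` builds the outgoing request but does not touch the network.

```python
import requests

url = "https://github.com/USER_NAME"

# Made-up cookie values, for illustration only
cookies_str = "logged_in=yes; tz=Asia%2FShanghai"
cookies_dict = {"logged_in": "yes", "tz": "Asia%2FShanghai"}

# Approach 1: the cookie string goes into the headers parameter
via_headers = requests.Request("GET", url, headers={"Cookie": cookies_str}).prepare()

# Approach 2: the cookie dictionary goes into the cookies parameter
via_cookies = requests.Request("GET", url, cookies=cookies_dict).prepare()

# Both prepared requests end up with a Cookie request header
print(via_headers.headers["Cookie"])
print(via_cookies.headers["Cookie"])
```

Either way the server sees a Cookie header; the cookies parameter just lets requests assemble the string from the dictionary.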


Origin blog.csdn.net/m0_67268191/article/details/132144287