Using Python + requests to implement a simple simulated login and fetch API data

Foreword

When it comes to web crawlers, I wrote them with PHP + cURL in college and later with Java + WebMagic. I have recently started working with Python, learning its syntax and frameworks and using it to make some HTTP requests. Compared with PHP and Java, I find it more convenient, concise, and efficient for implementing this kind of functionality.

The server in this example uses sessions to keep state with each client. When a client accesses the server for the first time, the server creates a session for it and stores it in server memory. As long as the user does not clear the client's cookies, the server is not restarted, and the session has not expired, every subsequent request from that client belongs to the same session.
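The session behavior described above can be sketched without any network code. This is only an illustration under assumptions: `InMemorySessionStore` and `handle_request` are hypothetical names, and a real server would also handle expiry and transport the ID in a cookie such as JSESSIONID.

```python
import uuid


class InMemorySessionStore:
    """Toy model of a server that keeps sessions in memory (hypothetical)."""

    def __init__(self):
        self._sessions = {}  # session ID -> per-client data

    def handle_request(self, session_id=None):
        """Return (session_id, session_data) for one incoming request.

        `session_id` stands for the value the client sent in its cookie,
        or None if the client sent no cookie.
        """
        if session_id is None or session_id not in self._sessions:
            # First visit, or an ID this server instance does not know:
            # open a new session.
            session_id = uuid.uuid4().hex
            self._sessions[session_id] = {"visits": 0}
        session = self._sessions[session_id]
        session["visits"] += 1
        return session_id, session


store = InMemorySessionStore()
sid, _ = store.handle_request()             # first visit, no cookie yet
sid_again, data = store.handle_request(sid)  # client sends the cookie back
print(sid == sid_again, data["visits"])      # True 2

# A restarted server starts with empty memory, so the old ID is unknown
# and the client gets a new session.
restarted = InMemorySessionStore()
new_sid, new_data = restarted.handle_request(sid)
print(new_sid == sid, new_data["visits"])    # False 1
```

On the client side, the requests session object plays the role of the browser here: it stores the cookie from the first response and sends it back on later requests.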

The following uses Python's requests library to implement a simulated client login.

1. Operating environment

Python => 3.11.2
requests => 2.28.2
urllib3 => 1.26.15
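To compare a local setup with the versions listed above, the installed package versions can be read with the standard library. A small sketch; `installed_version` is a hypothetical helper name, not part of requests or urllib3.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("requests", "urllib3"):
    print(name, "=>", installed_version(name) or "not installed")
```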

2. Example code

(1) index.py

#! /usr/bin/env python3
# -*- coding: utf-8 -*-

import requests
import json
import urllib3

# Suppress the InsecureRequestWarning that urllib3 emits because the requests below use verify=False
# ("Unverified HTTPS request is being made to host 'xxx'. Adding certificate verification is strongly advised.")
urllib3.disable_warnings()

# Create a session object
session = requests.session()

# Print the session's request headers before and after replacing them
headers = {
    "Content-Type": 'application/json',
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36"
}
print('Original session request headers: ' + str(session.headers))
session.headers = headers
print('New session request headers: ' + str(session.headers))

# Log in with a username and password so that the session holds the authenticated cookies
url = 'https://xxx.com/api/login'  # login endpoint
data = {
    'username': '',
    'password': ''
}
response = session.post(url, data=json.dumps(data), headers=headers, verify=False)

if response.status_code == 200:
    print('Dear user, ' + 'simulated login succeeded~')
    print(response.text)
    print(response.cookies)
else:
    print('Simulated login failed!')
print('\n')

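# Fetch data from the getUserList endpoint
userListUrl = 'https://xxx.com/api/getUserList'  # target endpoint
# The session sends the cookies from the login response automatically, so no cookies= argument is needed
response = session.get(userListUrl, headers=headers, verify=False)

if response.status_code == 200:
    print('Fetching the endpoint succeeded~')
    print(response.text)
else:
    print('Fetching the endpoint failed!')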
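The two steps in index.py (log in, then call the target endpoint through the same session) can also be wrapped in one function. The following is a sketch under assumptions, not the author's code: `login_and_fetch` and `base_url` are hypothetical names, the function accepts any object that behaves like a requests session, and it leaves certificate verification at its default because it assumes the server has a valid certificate. It uses the `json=` parameter of requests, which serializes the payload itself, and `raise_for_status()`, which raises on 4xx/5xx responses.

```python
def login_and_fetch(session, base_url, username, password):
    """Log in, then fetch the user list through the same session object.

    `session` is expected to behave like requests.Session, that is,
    to provide post() and get() methods that return response objects.
    """
    login_response = session.post(
        f"{base_url}/api/login",
        json={"username": username, "password": password},  # sent as a JSON body
        timeout=10,  # seconds; avoids waiting forever on a dead server
    )
    login_response.raise_for_status()  # stop here on a 4xx/5xx response

    # The session's cookie jar now holds whatever cookies the login set,
    # so the next request needs no explicit cookies= argument.
    list_response = session.get(f"{base_url}/api/getUserList", timeout=10)
    list_response.raise_for_status()
    return list_response.json()
```

With requests installed, it could be called as `login_and_fetch(requests.session(), 'https://xxx.com', username, password)`. If the server uses a self-signed certificate, `verify=False` would still be needed as in index.py, or `session.verify` could point to a CA bundle.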

3. Output

D:\Python3.11\python.exe D:\workspace\python3_django4_web\python3_django4_web\simulated_client\index.py 

Original session request headers: {'User-Agent': 'python-requests/2.28.2', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
New session request headers: {'Content-Type': 'application/json', 'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36'}

Dear XXX user, simulated login succeeded~
{"success":true,"statusCode":200,"data":"..."}
<RequestsCookieJar[...]>


Fetching the endpoint succeeded~
{"success":true,"statusCode":200,"data": [...]}

Process finished with exit code 0
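index.py treats any HTTP 200 as success, while the response bodies shown above also carry a `success` field. A minimal sketch of checking both, assuming that `"success": true` is how this API signals a successful call (the field's semantics are not documented here); `is_api_success` is a hypothetical helper name.

```python
import json


def is_api_success(status_code, body_text):
    """Return True only if the HTTP status is 200 and the JSON body says success."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        # Not JSON at all, for example an HTML error or login page.
        return False
    return isinstance(payload, dict) and payload.get("success") is True


print(is_api_success(200, '{"success":true,"statusCode":200,"data":"..."}'))  # True
print(is_api_success(200, '{"success":false,"statusCode":401}'))              # False
print(is_api_success(200, '<html>login page</html>'))                         # False
```

In index.py it could be applied as `is_api_success(response.status_code, response.text)` in place of the bare status-code comparison.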

Origin blog.csdn.net/Cai181191/article/details/129865455