Simulating a login with the Python requests library

My earlier attempts to log in by sending requests directly with requests all failed, so I fell back on selenium, which is simple and crude. Today I read the simulated-login section of the book "Web Crawler Development in Practice" and learned a lot.

Simulating a login with requests comes down to two things:

First: find the address of the login request

Second: find the fields of the Form Data that the form submits, together with their values

Step one: find the address of the login request

Take Douban as an example. On the login page, leave the user name and password empty or enter wrong ones, click Sign in, and then look for the login request in the Network panel of the browser's developer tools.

With so many requests in the Network panel, how do you tell which one is the login request? My trick is to look for the request that carries Form Data.

The request URL of that entry is: https://accounts.douban.com/j/mobile/login/basic

This is the address we want to send our request to.
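The same check can be done outside the browser. Below is a minimal sketch, assuming the Network panel has been exported as a HAR file and loaded into a dict; the function name find_form_posts and the trimmed sample entries are made up for illustration:

```python
def find_form_posts(har):
    """Return the URLs of POST requests that carry url-encoded form data."""
    urls = []
    for entry in har.get('log', {}).get('entries', []):
        request = entry.get('request', {})
        post_data = request.get('postData') or {}
        mime_type = post_data.get('mimeType', '')
        is_form = mime_type.startswith('application/x-www-form-urlencoded')
        if request.get('method') == 'POST' and is_form:
            urls.append(request.get('url'))
    return urls


# Hypothetical, heavily trimmed HAR content; a real export has many more keys.
sample_har = {
    'log': {
        'entries': [
            {'request': {'method': 'GET', 'url': 'https://www.douban.com/'}},
            {'request': {
                'method': 'POST',
                'url': 'https://accounts.douban.com/j/mobile/login/basic',
                'postData': {
                    'mimeType': 'application/x-www-form-urlencoded',
                    'params': [{'name': 'name', 'value': 'xxx'}],
                },
            }},
        ]
    }
}

print(find_form_posts(sample_har))
```

Only the POST entry with form data survives the filter, which is the same rule of thumb applied by eye in the Network panel.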

Step two: find the fields of the Form Data that the form submits, together with their values

Look at the Form Data of the same request.

There are five fields in total:

ck: meaning unknown, send an empty string

name: user name

password: password

remember: whether to remember the password, true or false both work

ticket: meaning unknown, send an empty string
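To see how these five fields would travel over the wire, the request can be prepared without being sent. This is only a sketch, assuming requests is installed; the credentials are placeholders:

```python
from urllib.parse import parse_qs

import requests

form = {
    'ck': '',
    'name': 'user@example.com',  # placeholder
    'password': 'secret',        # placeholder
    'remember': 'false',
    'ticket': '',
}

# Build the request but do not send it, so no network access is needed.
prepared = requests.Request(
    'POST',
    'https://accounts.douban.com/j/mobile/login/basic',
    data=form,
).prepare()

# A dict passed as data= is sent as a url-encoded form body.
print(prepared.headers['Content-Type'])
print(prepared.body)

# Decoding the body gives back the five fields, empty ones included.
decoded = parse_qs(prepared.body, keep_blank_values=True)
print(decoded)
```

The printed body is what the browser's developer tools display as Form Data, so comparing the two is a quick way to confirm that no field is missing.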

With these two key steps done, the code can be written as follows:

import requests
from lxml import etree


class DouLogin(object):
    def __init__(self):
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36'}
        self.login_url = 'https://accounts.douban.com/j/mobile/login/basic'
        self.logined_url = 'https://www.douban.com/'
        self.session = requests.Session()

    def login(self, email, password):
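        post_data = {
            'ck': '',  # optional
            'name': email,
            'password': password,
            'ticket': '',  # optional
        }
        response = self.session.post(self.login_url, data=post_data, headers=self.headers)
        try:
            result = response.json()
        except ValueError:
            # The body is not JSON (for example an HTML error page)
            result = {}
        if response.status_code == 200 and result.get('status') == 'success':
            # The session now holds the login cookies, so fetch the home page
            response = self.session.get(self.logined_url)
            selector = etree.HTML(response.text)
            names = selector.xpath("//li[@class='nav-user-account']/a/span/text()")
            username = names[0] if names else '(user name not found)'
            print('Login successful: this is %s' % username)
        else:
            print('Login failed: %s' % result.get('description'))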


if __name__ == "__main__":
    login = DouLogin()
    login.login(email='[email protected]', password='xxx')

Output when the login succeeds:

Login successful: this is xxx's account

Output when the login fails:

Login failed: CAPTCHA
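The success check can also be pulled out into a small function that is easy to try without touching the network. This is a sketch only; the function name describe_login_result is hypothetical, and the sample bodies merely mimic the two keys the code above reads, status and description:

```python
import json


def describe_login_result(status_code, body_text):
    """Turn a login response into a (succeeded, message) pair."""
    try:
        result = json.loads(body_text)
    except ValueError:
        return False, 'response is not JSON'
    if not isinstance(result, dict):
        return False, 'unexpected JSON payload'
    if status_code == 200 and result.get('status') == 'success':
        return True, 'ok'
    return False, str(result.get('description'))


# Made-up response bodies for illustration.
print(describe_login_result(200, '{"status": "success"}'))
print(describe_login_result(200, '{"status": "failed", "description": "CAPTCHA"}'))
print(describe_login_result(500, '<html>error</html>'))
```

Keeping the parsing separate from the HTTP call means a non-JSON error page is reported as a failed login rather than raising an exception.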

PS: the login sometimes fails because of a secondary slider verification, which I have not handled for now.

PPS: my attempt to log in to GitHub this way failed. Its Form Data contains more fields, and I probably did not set some of them correctly. For cases like that, selenium is still the simple and crude way out.
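One common reason a replayed form is rejected is that the login page embeds hidden inputs (for example a one-time token) that have to be sent back together with the credentials. Here is a sketch of collecting them with the standard library; the HTML snippet and its field names are invented for illustration and are not GitHub's actual form:

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get('type') or '').lower() == 'hidden'
        if is_hidden and attributes.get('name'):
            self.fields[attributes['name']] = attributes.get('value') or ''


# Invented login form; a real page would be fetched with the same session first.
sample_html = '''
<form action="/session" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="login">
  <input type="password" name="password">
  <input type="hidden" name="return_to" value="">
</form>
'''

collector = HiddenInputCollector()
collector.feed(sample_html)

# Start from the hidden fields, then add the credentials (placeholders here).
post_data = dict(collector.fields)
post_data.update({'login': 'user@example.com', 'password': 'secret'})
print(post_data)
```

In a real run the page would be fetched and the form posted through the same requests.Session, so that any cookie tied to the token is sent along with it.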

Origin blog.csdn.net/u011519550/article/details/103076848