Crawler in Practice (1): Logging in to the New Zhihu

Zhihu is a classic crawler case study. Because the site is revised often, it keeps getting harder to crawl, and it may well have changed again by the time I finish this tutorial.

 

Difficulties with Zhihu

1. Login and redirect URL

2. Encrypted parameters

3. Captcha

This article walks through the simulated Zhihu login in detail.

 

Packet capture - analyzing the login process

Capturing with Fiddler

Capturing with the browser

1. Get the login URL

Enter the account, password, and so on, and submit the login

The capture shows the POST URL and the page redirect; the request marked with the arrow is the real login URL

 

2. Get the login parameters

You can see that the form data is encrypted

 

Approach

Two questions need to be answered: which parameters are submitted, and how they are encrypted

1. First open the Sources panel and find the js file that contains the encryption function;

2. Search for encryption-related English keywords (for search methods, see my blog post "Browser capture"). As long as the relevant function names are not obfuscated, they can be found by searching; here, search for encrypt;

3. Format the js code in the browser and locate the encryption function to get its line number; [a search often turns up several matches for encrypt, and the browser only jumps to the first one, so copy the code into an editor and search there]

4. Set a breakpoint at the corresponding line number; [note that the line numbers may not match exactly; the function will be within a few lines above or below]

5. Log in again, debug, and capture the login parameters;

Encryption function

var b = function(e) {
        return __g._encrypt(encodeURIComponent(e))
    };

Value of e at the breakpoint:

"client_id=c3cef7c66a1843f8b3a9e6a1e3160e20&grant_type=password&timestamp=1559629752508&source=com.zhihu.web&signature=15317e3484b64449697b285a69a09af8ff23a1af&username=yanshuangwu258%40sina.com&password=6712007&captcha=&lang=en&ref_source=homepage&utm_source="

The argument e is passed to the encryption function b, which first applies encodeURIComponent to it. From experience, the parameters are first assembled into key=value form and only then encrypted
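As a rough Python illustration of what goes into the encryption function: the form is first serialized as key=value pairs, and the whole string is then escaped again the way encodeURIComponent would do it. The field values below are placeholders, the helper name encode_uri_component is hypothetical, and the __g._encrypt step itself is not reproduced.

```python
from urllib.parse import quote, urlencode


def encode_uri_component(text: str) -> str:
    """Approximate JavaScript's encodeURIComponent (hypothetical helper).

    encodeURIComponent leaves A-Z a-z 0-9 - _ . ! ~ * ' ( ) unescaped.
    quote() already keeps letters, digits and "_.-~", so only the
    remaining five characters need to be listed as safe.
    """
    return quote(text, safe="!*'()")


# Placeholder form fields (not real credentials).
form = {
    "grant_type": "password",
    "username": "user@example.com",
    "captcha": "",
}

# Step 1: serialize to key=value pairs, the shape seen in the value of e.
serialized = urlencode(form)

# Step 2: escape the whole string again before it would be handed to
# the encryption routine (which is not reproduced here).
escaped = encode_uri_component(serialized)

print(serialized)
print(escaped)
```

Note that the second pass escapes the separators themselves, so "=" becomes %3D, "&" becomes %26, and an already escaped %40 becomes %2540.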

 

First, clarify what each parameter means

client_id   = c3cef7c66a1843f8b3a9e6a1e3160e20            client ID
grant_type  = password                                    grant type
timestamp   = 1559629752508                               timestamp
source      = com.zhihu.web                               source
signature   = 15317e3484b64449697b285a69a09af8ff23a1af    signature
username    = yanshuangwu258%40sina.com                   username
password    = 6712007                                     password
captcha     =                                             captcha
lang        = en                                          captcha type
ref_source  = homepage
utm_source  =

 

Log in several times and observe whether each parameter value is fixed;

By comparison, the timestamp, the signature, and the captcha are not fixed;

The timestamp is simply the current time, so it is known; that leaves the signature and the captcha to obtain
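The comparison between login attempts can also be done mechanically. The following is a minimal sketch, assuming the captured form strings have been copied out of the debugger by hand; the function name changed_keys and both capture strings are made up for illustration.

```python
from urllib.parse import parse_qsl


def changed_keys(first: str, second: str) -> list:
    """Return the form keys whose values differ between two captures."""
    a = dict(parse_qsl(first, keep_blank_values=True))
    b = dict(parse_qsl(second, keep_blank_values=True))
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


# Two made-up captures; only the shape matches the real form.
capture_1 = "client_id=abc&grant_type=password&timestamp=1000&signature=aaaa&captcha="
capture_2 = "client_id=abc&grant_type=password&timestamp=2000&signature=bbbb&captcha=x7k2"

print(changed_keys(capture_1, capture_2))  # ['captcha', 'signature', 'timestamp']
```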

 

3. Get the signature

From the above, the signature is a computed (hashed) value;

Crack it the same way as in step 2: search for signature in the sources;

Locate the line number, set a breakpoint, and debug;

 

The signature is generated in this function

function(e, t, n) {
    "use strict";
    var r = n(745)
      , o = n.n(r)
      , i = n(183)
      , a = n.n(i);
    Object.assign;
    a()("zhihu-redux-middlewares:oauth");
    var c = "c3cef7c66a1843f8b3a9e6a1e3160e20";
    var u = Object.assign || function(e) {
        for (var t = 1; t < arguments.length; t++) {
            var n = arguments[t];
            for (var r in n)
                Object.prototype.hasOwnProperty.call(n, r) && (e[r] = n[r])
        }
        return e
    }
    ;
    t.a = function(e, t) {
        var n = Date.now()
          , r = new o.a("SHA-1","TEXT");
        return r.setHMACKey("d1b964811afb40118a12068ff74a12f4", "TEXT"),
        r.update(e),
        r.update(c),
        r.update("com.zhihu.web"),
        r.update(String(n)),
        u({
            clientId: c,
            grantType: e,
            timestamp: n,
            source: "com.zhihu.web",
            signature: r.getHMAC("HEX")      // <-- the signature is computed here
        }, t)
    }
}

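The function t.a takes e and t and also uses some variables from the enclosing scope: e is the string 'password', t is shown in the screenshot, n is the timestamp, and c is defined in the code.

This function shows how the signature is computed. [Some familiarity with common encryption and hashing methods is needed here.]

Here the key d1b964811afb40118a12068ff74a12f4 is used with the SHA-1 hash function (that is, HMAC-SHA1), and r.update then feeds in e, c, 'com.zhihu.web', and String(n) in that order.

From the screenshot and the code, e is 'password', c is "c3cef7c66a1843f8b3a9e6a1e3160e20", and n is the timestamp, so the signature can be computed.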
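The same computation can be sketched in Python with the standard hmac module. The key, client id, and source string are the ones visible in the JS above; the function name make_signature is hypothetical, and the sketch assumes the four update calls simply concatenate their inputs, which is how HMAC update normally behaves.

```python
import hmac
import time
from hashlib import sha1

HMAC_KEY = b"d1b964811afb40118a12068ff74a12f4"   # key passed to setHMACKey in the JS
CLIENT_ID = "c3cef7c66a1843f8b3a9e6a1e3160e20"   # the variable c in the JS
SOURCE = "com.zhihu.web"


def make_signature(grant_type: str, timestamp_ms: int) -> str:
    """HMAC-SHA1 over grant type, client id, source and timestamp, as hex."""
    mac = hmac.new(HMAC_KEY, digestmod=sha1)
    # Same order as the r.update(...) calls in the JS function.
    for part in (grant_type, CLIENT_ID, SOURCE, str(timestamp_ms)):
        mac.update(part.encode("utf-8"))
    return mac.hexdigest()


timestamp_ms = int(time.time() * 1000)   # Date.now() equivalent, in milliseconds
signature = make_signature("password", timestamp_ms)
print(timestamp_ms, signature)
```

Because the timestamp is part of the signed message, the timestamp submitted in the login form has to be the same value that was passed to make_signature.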

 

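4. Get the login captcha

Characteristics of Zhihu captchas

1. Logging in to Zhihu does not require a captcha every time

2. Zhihu has two kinds of captcha: "click the upside-down characters" and "English letters"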

 

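Captcha analysis - procedure

1. Open the Zhihu login page, press F12, then refresh

You can see that the captcha URL returns false, meaning no captcha is needed

The request URL at this point is shown in the screenshot

 

2. Refresh the login page repeatedly and watch the response of the captcha URL until it is true

A response of true means a captcha is required

The request URL at this point is shown in the screenshot

It is the same URL as when no captcha is needed, and the method is GET in both cases

 

Right after it there is another request to the captcha URL. What is it?

This should be the captcha image, encoded in base64.

The image can be saved locally first and the captcha then typed in by hand

 

Look at the headers

The method has changed to PUT, while the request URL is still the same

 

In other words, if the captcha URL returns true, the page automatically requests the same URL again with the PUT method, and the response is the base64-encoded image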

 

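3. Enter the account and password; when the captcha pops up, enter it and click log in

First the captcha data is POSTed, to the same URL

 

The POST parameters are shown in the screenshot

The key is input_text and the value is the image size plus the positions of the upside-down characters; this is the upside-down-character captcha

 

The English-letter case is shown in the screenshot

The key is again input_text, and the value is the letters

 

The URLs for the two captcha types differ: cn for characters, en for letters

 

In other words, after getting the base64-encoded image, you must POST the captcha to this URL before you can log in

 

At this point we have obtained the captcha and POSTed it, so all the login parameters are in hand.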
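The captcha handling described above can be sketched offline as three small steps: check the GET response, decode the PUT response, and build the POST body. The field names show_captcha and img_base64 are the ones used by the login script in this article; the function names and the stand-in responses are hypothetical, and no request is actually sent.

```python
import base64


def needs_captcha(check_response: dict) -> bool:
    """GET response: True means a captcha must be solved before login."""
    return bool(check_response.get("show_captcha"))


def decode_captcha(put_response: dict) -> bytes:
    """PUT response: decode the base64 image into raw bytes."""
    return base64.b64decode(put_response["img_base64"])


def captcha_form(answer: str) -> dict:
    """POST body for the English-letter captcha: key input_text, value the letters."""
    return {"input_text": answer}


# Stand-in responses so the sketch runs without network access.
fake_check = {"show_captcha": True}
fake_put = {"img_base64": base64.b64encode(b"not-a-real-image").decode("ascii")}

if needs_captcha(fake_check):
    image_bytes = decode_captcha(fake_put)   # these bytes would be written to captcha.jpg
    print(len(image_bytes), captcha_form("ab12"))
```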

 

You can also try searching the js for keywords from the login URL to find the login parameters.

 

Implementing the login in code

import base64
import hmac
import time
from hashlib import sha1

import requests
from PIL import Image


class Zhihu(object):

    CLIENT_ID = 'c3cef7c66a1843f8b3a9e6a1e3160e20'
    # lang=en selects the English-letter captcha
    CAPTCHA_URL = 'https://www.zhihu.com/api/v3/oauth/captcha?lang=en'

    def __init__(self):
        self.session = requests.session()
        self.headers = {
            # 'authority':'www.zhihu.com',
            'user-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0',
        }
        self.session.headers.update(self.headers)
        self.picture = None       # captcha text typed by the user ('' if none is needed)
        self.signature = None
        self.timestamp = None     # millisecond timestamp shared by the signature and the form
        self.picture_url = None   # JSON response that carries the base64 captcha image

    def getcapture(self):
        # Get the captcha; sometimes the login works without one
        message = self.session.get(url=self.CAPTCHA_URL).json()    # GET: is a captcha required?
        print(message)
        if not message['show_captcha']:
            self.picture = ''
        else:
            self.picture_url = self.session.put(url=self.CAPTCHA_URL).json()    # PUT: fetch the captcha image
            # Decode the base64 data, save it as an image and display it
            with open('captcha.jpg', 'wb') as f:
                f.write(base64.b64decode(self.picture_url['img_base64']))
            image = Image.open('captcha.jpg')
            image.show()
            self.picture = input('Enter the captcha: ')
            time.sleep(2)
            message1 = self.session.post(url=self.CAPTCHA_URL, data={'input_text': self.picture}).json()    # POST: submit the captcha
            print(message1)

    def get_signature(self):
        # The key step of the Zhihu login: HMAC-SHA1 over grant type, client id, source and timestamp
        self.timestamp = str(int(time.time() * 1000))
        a = hmac.new('d1b964811afb40118a12068ff74a12f4'.encode('utf-8'), digestmod=sha1)
        a.update(b'password')
        a.update(self.CLIENT_ID.encode('utf-8'))
        a.update(b'com.zhihu.web')
        a.update(self.timestamp.encode('utf-8'))
        self.signature = a.hexdigest()

    def Login_phone(self):
        # Log in
        # Note: the browser passes this form through __g._encrypt (see above);
        # this script posts it as a plain form.
        data = {
            'client_id': self.CLIENT_ID,
            'grant_type': 'password',
            'timestamp': self.timestamp,    # must be the same value that was signed
            'source': 'com.zhihu.web',
            'signature': self.signature,
            'username': '[email protected]',    # replace with your own account
            'password': 'xxxxxxx',          # replace with your own password
            'captcha': self.picture,
            'lang': 'en',
            # 'ref_source':'homepage',
            # 'utm_source':''
        }

        headers = {
            # 'scheme':'https',
            # 'accept':'*/*',
            # 'accept-encoding':'gzip, deflate, br',
            # 'accept-language':'zh-CN,zh;q=0.8',
            # 'cache-control':'no-cache',
            # 'content-length':'412',
            # 'origin':'https://www.zhihu.com',
            'content-type': 'application/x-www-form-urlencoded',
            # 'referer':'https://www.zhihu.com/signin?next=%2F',
            'x-zse-83': '3_2.0',
        }
        message = self.session.post(url='https://www.zhihu.com/api/v3/oauth/sign_in', headers=headers, data=data)
        message.encoding = 'utf-8'
        print(message.text)
        result = message.json()
        if 'error' in result:    # only present when the login fails
            print(result['error']['message'])

    def target_url(self, url):
        # Fetch a page with the logged-in session
        text = self.session.get(url)
        return text.text


if __name__ == "__main__":
    zhihu = Zhihu()
    zhihu.getcapture()      # captcha
    zhihu.get_signature()   # signature
    zhihu.Login_phone()     # login
    # print(zhihu.target_url('https://www.zhihu.com/'))

Simulating the Zhihu login is still fairly complicated.

 

 

 



Origin www.cnblogs.com/yanshw/p/10950899.html