Reverse Engineering the JS Behind Zhihu Login and Article Scraping

**Disclaimer:** This article is for learning and discussion only and must not be used for commercial purposes.

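Login supports both username/password login and QR-code login via the Zhihu mobile app.

Article scraping reproduces each original article as closely as possible, including images, links, and comments. Scraping the real-time trending list is also supported.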

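I have not posted in a long time, but things are different now that the holidays have started. I planned to write this article a while ago, but I was busy and shelved it. Now that I have time I am picking it up again; otherwise it would be abandoned halfway and the earlier effort wasted. Enough preamble, let's get straight to it.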

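Zhihu is a high-quality Q&A community on the Chinese internet and an original-content platform where creators gather. It officially launched in January 2011 with the brand mission of "helping people better share knowledge, experience, and insights, and find their own answers." With its serious, professional, and friendly community atmosphere, its distinctive product mechanisms, and its structured, easily accessible, high-quality content, Zhihu has attracted the most creative people on the Chinese internet in technology, business, film and television, and other fields, and has become a comprehensive, all-category knowledge-sharing community with key influence in many domains.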

To simulate logging in to Zhihu, we first need to know where the login request is sent.

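On Zhihu's password login page, open the Network tab of the developer tools, then enter your account and a wrong password to see which requests are sent. Among them is a POST request whose headers carry several extra authentication fields: x-xsrftoken, x-ab-param, x-ab-pb, and x-zse-83. Testing shows that only x-xsrftoken and x-zse-83 are required; x-xsrftoken can be read from the cookies, and x-zse-83 is a fixed value. The form data appears to be encrypted and must contain at least our account and password.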
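
A minimal sketch of assembling the two required headers. The cookie name `_xsrf` and the function name `build_login_headers` are assumptions for illustration and are not stated in this article; the value `3_2.0` for x-zse-83 is taken from the plaintext sample shown later in the article, and it is assumed here that login uses the same value.

```python
def build_login_headers(cookies, x_zse_83="3_2.0"):
    """Return only the headers that testing showed to be required.

    cookies: a plain dict of cookie name -> value collected from the
    sign-in page. The "_xsrf" key is an assumed cookie name.
    """
    xsrf = cookies.get("_xsrf")
    if xsrf is None:
        raise KeyError("no _xsrf cookie; load the sign-in page first")
    return {
        "x-xsrftoken": xsrf,   # read from the cookies
        "x-zse-83": x_zse_83,  # fixed value
    }


if __name__ == "__main__":
    print(build_login_headers({"_xsrf": "example-token"}))
```

The optional headers (x-ab-param, x-ab-pb) are left out on purpose, since only these two turned out to be needed.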

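The next step is to find how the form data is encrypted and which parameters it needs.

We use the usual trick here (on Windows): press Ctrl+Shift+F to run a global search, and search for sign_in.

Open the JS file that the global search turns up. There may be more than one file, so open them one at a time and search within each (Ctrl+F) for sign_in again. If there is a match, click the {} button at the bottom left to pretty-print the file, then look through the formatted JS for what we need.

While looking, you will find one match that is identical to the URL of our login request.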

Set a breakpoint there and log in again. When execution pauses at the breakpoint, the local variables show that e contains the username and password we submitted. Two tasks remain:

  • fill in the other variables in this form,
  • find the encryption function.

Filling in the variables

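  • clientId, grantType, lang, refSource, source: all fixed parameters.
  • captcha: the captcha. It is empty here because the captcha mechanism was not triggered.
  • username, password: the username and the plaintext password.
  • timestamp: the current timestamp in milliseconds. new Date().getTime() in JavaScript is equivalent to int(time.time()*1000) in Python.
  • signature: the signature used for authentication, an encrypted parameter.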
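
The list above can be sketched as a small helper that assembles the form. The fixed values are copied from the sample plaintext that appears in a comment in this article's JS listing; the function name `build_login_form`, the `now_ms` parameter, and the default `lang="en"` are assumptions for illustration (the sample plaintext itself uses `lang=cn`).

```python
import time

# Fixed fields, using the snake_case names seen in the sample plaintext.
FIXED_FIELDS = {
    "client_id": "c3cef7c66a1843f8b3a9e6a1e3160e20",
    "grant_type": "password",
    "source": "com.zhihu.web",
    "ref_source": "other_https://www.zhihu.com/signin?next=%2F",
    "utm_source": "",
}


def build_login_form(username, password, captcha="", lang="en", now_ms=None):
    """Assemble the login form; the signature is left empty for now."""
    form = dict(FIXED_FIELDS)
    form["lang"] = lang
    form["username"] = username
    form["password"] = password  # plaintext password, as submitted
    form["captcha"] = captcha    # empty when no captcha was triggered
    # Same value as new Date().getTime() in JavaScript.
    ms = int(time.time() * 1000) if now_ms is None else now_ms
    form["timestamp"] = str(ms)
    form["signature"] = ""       # computed in a later step
    return form
```

Passing `now_ms` explicitly makes the result reproducible, which helps when comparing against values captured in the browser.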

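Run a global search for signature, set a breakpoint on the match, and log in again.

The code at the breakpoint shows that signature is computed from grant_type + client_id + source + timestamp. The reference to SHA-1 tells us that signature is produced with the Secure Hash Algorithm (as the Python code below shows, in keyed HMAC form). There are two ways to obtain signature: reproduce it in Python with the hmac and hashlib modules, or locate the logic in the JS and run that JS from Python with the execjs library. Since Python already provides this algorithm, we simply reproduce it in Python.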

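Code:

    import hashlib
    import hmac

    def signature(self):
        # Method of the author's sign class; self.e is the login form dict.
        self.e["timestamp"] = str(self.timestamp())
        # HMAC-SHA1 keyed with a fixed secret
        ha = hmac.new(b'd1b964811afb40118a12068ff74a12f4', digestmod=hashlib.sha1)
        grant_type = self.e['grant_type']
        client_id = self.e['client_id']
        source = self.e['source']
        timestamp = self.e["timestamp"]
        # Sign the concatenation grant_type + client_id + source + timestamp
        ha.update(bytes((grant_type + client_id + source + timestamp), 'utf-8'))
        self.e["signature"] = ha.hexdigest()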

The self may look puzzling: this method was pasted straight from my sign class, and it will make sense once you see the full source.
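
For readers who want to try the computation without the surrounding class, here is a standalone sketch of the same HMAC-SHA1 step. The key and the concatenation order come from the method above; the function name `make_signature` and its parameters are assumptions for illustration.

```python
import hashlib
import hmac

# Key as it appears in the method above.
HMAC_KEY = b"d1b964811afb40118a12068ff74a12f4"


def make_signature(grant_type, client_id, source, timestamp_ms):
    """HMAC-SHA1 over grant_type + client_id + source + timestamp."""
    message = grant_type + client_id + source + str(timestamp_ms)
    digest = hmac.new(HMAC_KEY, message.encode("utf-8"), hashlib.sha1)
    return digest.hexdigest()  # 40 lowercase hex characters


if __name__ == "__main__":
    print(make_signature(
        "password",
        "c3cef7c66a1843f8b3a9e6a1e3160e20",
        "com.zhihu.web",
        1611807648668,
    ))
```

The timestamp used for the signature has to be the same one sent in the form, so it is passed in rather than generated inside the function.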

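One more thing to handle is the captcha. Zhihu recently added a slider, but I found that sometimes it shows an alphanumeric captcha and sometimes a click-the-characters challenge. The slider leaves no trace in the encrypted parameters, whereas the click-the-characters and alphanumeric captchas do. Ajian also found that when scraping, whichever verification type you choose is the one you add to the encrypted parameters at verification time. This may be a bug. Here I chose the alphanumeric captcha, so let's look at the captcha request (lang=en means the alphanumeric captcha).

Before sending the login request, the page first requests the captcha. It starts with a GET request to check whether a captcha is required.

If the value returned is false, there is no captcha. If it is true, a PUT request follows, which returns the captcha image as a base64 string.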

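After obtaining this string, decode it with Python's base64 module and save it as an image file; that gives us the captcha image. Once we have the captcha text, we send a POST request to check whether it is correct.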
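
The decoding step can be sketched as follows. The function name `save_captcha_image` and the default file name are assumptions for illustration; the article does not say which image format is returned, so the extension is only a guess.

```python
import base64
from pathlib import Path


def save_captcha_image(img_base64, path="captcha.png"):
    """Decode a base64 captcha string and write the bytes to disk.

    Returns the number of bytes written. b64decode ignores characters
    outside the base64 alphabet by default, so embedded line breaks in
    the string are harmless.
    """
    data = base64.b64decode(img_base64)
    Path(path).write_bytes(data)
    return len(data)
```

After saving the file, open it, read the characters by eye (or pass it to whatever recognition you prefer), and submit them in the verification request.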

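With all of that solved, what remains is the form data encryption algorithm. Go back to the breakpoint where the form is submitted.

At first, decamelizeKeys looks like the form data encryption function. (Judging by its name and by the field names, it converts camelCase keys such as clientId into snake_case keys such as client_id.)

Hover over the call marked in the screenshot and click through to its definition. After several rounds of tracing I still could not follow the logic, so we change tactics and go on the offensive: run a global search for encrypt. Programmers rarely pick arbitrary names for encryption functions and usually include the word encrypt. Several files will match; open them one by one, pretty-print each, and search within it until you find the relevant code.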

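Set a breakpoint there and log in again; the debugger pauses at it twice in succession.

So our reasoning was right and this is where the encryption happens. Next we extract the JS code. After my cleanup it looks like the code below. There are pitfalls in it, but I had run into them before while reverse engineering JS; for anything new, search online or ask someone more experienced.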

Form data encryption code:

const jsdom = require("jsdom");
const { JSDOM } = jsdom;
const dom = new JSDOM(`<!DOCTYPE html><p>Hello world</p>`);
window = dom.window;
document = window.document;
// function(module, exports, __webpack_require__) {
//    "use strict";
function t(e) {
    
    
    return (t = "function" == typeof Symbol && "symbol" == typeof Symbol.A ? function (e) {
    
    
                return typeof e
            }
            : function (e) {
    
    
                return e && "function" == typeof Symbol && e.constructor === Symbol && e !== Symbol.prototype ? "symbol" : typeof e
            }
    )(e)
}

Object.defineProperty(exports, "__esModule", {
    value: !0
});
var A = "2.0"
    , __g = {};

function s() {
    
    
}

function i(e) {
    
    
    this.t = (2048 & e) >> 11,
        this.s = (1536 & e) >> 9,
        this.i = 511 & e,
        this.h = 511 & e
}

function h(e) {
    
    
    this.s = (3072 & e) >> 10,
        this.h = 1023 & e
}

function a(e) {
    
    
    this.a = (3072 & e) >> 10,
        this.c = (768 & e) >> 8,
        this.n = (192 & e) >> 6,
        this.t = 63 & e
}

function c(e) {
    
    
    this.s = e >> 10 & 3,
        this.i = 1023 & e
}

function n() {
    
    
}

function e(e) {
    
    
    this.a = (3072 & e) >> 10,
        this.c = (768 & e) >> 8,
        this.n = (192 & e) >> 6,
        this.t = 63 & e
}

function o(e) {
    
    
    this.h = (4095 & e) >> 2,
        this.t = 3 & e
}

function r(e) {
    
    
    this.s = e >> 10 & 3,
        this.i = e >> 2 & 255,
        this.t = 3 & e
}

s.prototype.e = function (e) {
    
    
    e.o = !1
}
    ,
    i.prototype.e = function (e) {
    
    
        switch (this.t) {
    
    
            case 0:
                e.r[this.s] = this.i;
                break;
            case 1:
                e.r[this.s] = e.k[this.h]
        }
    }
    ,
    h.prototype.e = function (e) {
    
    
        e.k[this.h] = e.r[this.s]
    }
    ,
    a.prototype.e = function (e) {
    
    
        switch (this.t) {
    
    
            case 0:
                e.r[this.a] = e.r[this.c] + e.r[this.n];
                break;
            case 1:
                e.r[this.a] = e.r[this.c] - e.r[this.n];
                break;
            case 2:
                e.r[this.a] = e.r[this.c] * e.r[this.n];
                break;
            case 3:
                e.r[this.a] = e.r[this.c] / e.r[this.n];
                break;
            case 4:
                e.r[this.a] = e.r[this.c] % e.r[this.n];
                break;
            case 5:
                e.r[this.a] = e.r[this.c] == e.r[this.n];
                break;
            case 6:
                e.r[this.a] = e.r[this.c] >= e.r[this.n];
                break;
            case 7:
                e.r[this.a] = e.r[this.c] || e.r[this.n];
                break;
            case 8:
                e.r[this.a] = e.r[this.c] && e.r[this.n];
                break;
            case 9:
                e.r[this.a] = e.r[this.c] !== e.r[this.n];
                break;
            case 10:
                e.r[this.a] = t(e.r[this.c]);
                break;
            case 11:
                e.r[this.a] = e.r[this.c] in e.r[this.n];
                break;
            case 12:
                e.r[this.a] = e.r[this.c] > e.r[this.n];
                break;
            case 13:
                e.r[this.a] = -e.r[this.c];
                break;
            case 14:
                e.r[this.a] = e.r[this.c] < e.r[this.n];
                break;
            case 15:
                e.r[this.a] = e.r[this.c] & e.r[this.n];
                break;
            case 16:
                e.r[this.a] = e.r[this.c] ^ e.r[this.n];
                break;
            case 17:
                e.r[this.a] = e.r[this.c] << e.r[this.n];
                break;
            case 18:
                e.r[this.a] = e.r[this.c] >>> e.r[this.n];
                break;
            case 19:
                e.r[this.a] = e.r[this.c] | e.r[this.n];
                break;
            case 20:
                e.r[this.a] = !e.r[this.c]
        }
    }
    ,
    c.prototype.e = function (e) {
    
    
        e.Q.push(e.C),
            e.B.push(e.k),
            e.C = e.r[this.s],
            e.k = [];
        for (var t = 0; t < this.i; t++)
            e.k.unshift(e.f.pop());
        e.g.push(e.f),
            e.f = []
    }
    ,
    n.prototype.e = function (e) {
    
    
        e.C = e.Q.pop(),
            e.k = e.B.pop(),
            e.f = e.g.pop()
    }
    ,
    e.prototype.e = function (e) {
    
    
        switch (this.t) {
    
    
            case 0:
                e.u = e.r[this.a] >= e.r[this.c];
                break;
            case 1:
                e.u = e.r[this.a] <= e.r[this.c];
                break;
            case 2:
                e.u = e.r[this.a] > e.r[this.c];
                break;
            case 3:
                e.u = e.r[this.a] < e.r[this.c];
                break;
            case 4:
                e.u = e.r[this.a] == e.r[this.c];
                break;
            case 5:
                e.u = e.r[this.a] != e.r[this.c];
                break;
            case 6:
                e.u = e.r[this.a];
                break;
            case 7:
                e.u = !e.r[this.a]
        }
    }
    ,
    o.prototype.e = function (e) {
    
    
        switch (this.t) {
    
    
            case 0:
                e.C = this.h;
                break;
            case 1:
                e.u && (e.C = this.h);
                break;
            case 2:
                e.u || (e.C = this.h);
                break;
            case 3:
                e.C = this.h,
                    e.w = null
        }
        e.u = !1
    }
    ,
    r.prototype.e = function (e) {
    
    
        switch (this.t) {
    
    
            case 0:
                for (var t = [], n = 0; n < this.i; n++)
                    t.unshift(e.f.pop());
                e.r[3] = e.r[this.s](t[0], t[1]);
                break;
            case 1:
                for (var r = e.f.pop(), o = [], i = 0; i < this.i; i++)
                    o.unshift(e.f.pop());
                e.r[3] = e.r[this.s][r](o[0], o[1]);
                break;
            case 2:
                for (var a = [], c = 0; c < this.i; c++)
                    a.unshift(e.f.pop());
                e.r[3] = new e.r[this.s](a[0], a[1])
        }
    }
;
var k = function (e) {
    
    
    for (var t = 66, n = [], r = 0; r < e.length; r++) {
    
    
        var o = 24 ^ e.charCodeAt(r) ^ t;
        n.push(String.fromCharCode(o)),
            t = o
    }
    return n.join("")
};

function Q(e) {
    
    
    this.t = (4095 & e) >> 10,
        this.s = (1023 & e) >> 8,
        this.i = 1023 & e,
        this.h = 63 & e
}

function C(e) {
    
    
    this.t = (4095 & e) >> 10,
        this.a = (1023 & e) >> 8,
        this.c = (255 & e) >> 6
}

function B(e) {
    
    
    this.s = (3072 & e) >> 10,
        this.h = 1023 & e
}

function f(e) {
    
    
    this.h = 4095 & e
}

function g(e) {
    
    
    this.s = (3072 & e) >> 10
}

function u(e) {
    
    
    this.h = 4095 & e
}

function w(e) {
    
    
    this.t = (3840 & e) >> 8,
        this.s = (192 & e) >> 6,
        this.i = 63 & e
}

function G() {
    
    
    this.r = [0, 0, 0, 0],
        this.C = 0,
        this.Q = [],
        this.k = [],
        this.B = [],
        this.f = [],
        this.g = [],
        this.u = !1,
        this.G = [],
        this.b = [],
        this.o = !1,
        this.w = null,
        this.U = null,
        this.F = [],
        this.R = 0,
        this.J = {
    
    
            0: s,
            1: i,
            2: h,
            3: a,
            4: c,
            5: n,
            6: e,
            7: o,
            8: r,
            9: Q,
            10: C,
            11: B,
            12: f,
            13: g,
            14: u,
            15: w
        }
}

Q.prototype.e = function (e) {
    
    
    switch (this.t) {
    
    
        case 0:
            e.f.push(e.r[this.s]);
            break;
        case 1:
            e.f.push(this.i);
            break;
        case 2:
            e.f.push(e.k[this.h]);
            break;
        case 3:
            e.f.push(k(e.b[this.h]))
    }
}
    ,
    C.prototype.e = function (A) {
    
    
        switch (this.t) {
    
    
            case 0:
                var t = A.f.pop();
                A.r[this.a] = A.r[this.c][t];
                break;
            case 1:
                var s = A.f.pop()
                    , i = A.f.pop();
                A.r[this.c][s] = i;
                break;
            case 2:
                var h = A.f.pop();
                A.r[this.a] = eval(h)
        }
    }
    ,
    B.prototype.e = function (e) {
    
    
        e.r[this.s] = k(e.b[this.h])
    }
    ,
    f.prototype.e = function (e) {
    
    
        e.w = this.h
    }
    ,
    g.prototype.e = function (e) {
    
    
        throw e.r[this.s]
    }
    ,
    u.prototype.e = function (e) {
    
    
        var t = this
            , n = [0];
        e.k.forEach((function (e) {
    
    
                n.push(e)
            }
        ));
        var r = function (r) {
    
    
            var o = new G;
            return o.k = n,
                o.k[0] = r,
                o.v(e.G, t.h, e.b, e.F),
                o.r[3]
        };
        r.toString = function () {
    
    
            return "() { [native code] }"
        }
            ,
            e.r[3] = r
    }
    ,
    w.prototype.e = function (e) {
    
    
        switch (this.t) {
    
    
            case 0:
                for (var t = {
    
    }, n = 0; n < this.i; n++) {
    
    
                    var r = e.f.pop();
                    t[e.f.pop()] = r
                }
                e.r[this.s] = t;
                break;
            case 1:
                for (var o = [], i = 0; i < this.i; i++)
                    o.unshift(e.f.pop());
                e.r[this.s] = o
        }
    }
    ,
    G.prototype.D = function (e) {
    
    
        for (var t = window.atob(e), n = t.charCodeAt(0) << 8 | t.charCodeAt(1), r = [], o = 2; o < n + 2; o += 2)
            r.push(t.charCodeAt(o) << 8 | t.charCodeAt(o + 1));
        this.G = r;
        for (var i = [], a = n + 2; a < t.length;) {
    
    
            var c = t.charCodeAt(a) << 8 | t.charCodeAt(a + 1)
                , s = t.slice(a + 2, a + 2 + c);
            i.push(s),
                a += c + 2
        }
        this.b = i
    }
    ,
    G.prototype.v = function (e, t, n) {
    
    
        for (t = t || 0,
                 n = n || [],
                 this.C = t,
                 "string" == typeof e ? this.D(e) : (this.G = e,
                     this.b = n),
                 this.o = !0,
                 this.R = Date.now(); this.o;) {
    
    
            var r = this.G[this.C++];
            if ("number" != typeof r)
                break;
            var o = Date.now();
            if (500 < o - this.R)
                return;
            this.R = o;
            try {
    
    
                this.e(r)
            } catch (e) {
    
    
                this.U = e,
                this.w && (this.C = this.w)
            }
        }
    }
    ,
    G.prototype.e = function (e) {
    
    
        var t = (61440 & e) >> 12;
        new this.J[t](e).e(this)
    }
    ,
"undefined" != typeof window && (new G).v("AxjgB5MAnACoAJwBpAAAABAAIAKcAqgAMAq0AzRJZAZwUpwCqACQACACGAKcBKAAIAOcBagAIAQYAjAUGgKcBqFAuAc5hTSHZAZwqrAIGgA0QJEAJAAYAzAUGgOcCaFANRQ0R2QGcOKwChoANECRACQAsAuQABgDnAmgAJwMgAGcDYwFEAAzBmAGcSqwDhoANECRACQAGAKcD6AAGgKcEKFANEcYApwRoAAxB2AGcXKwEhoANECRACQAGAKcE6AAGgKcFKFANEdkBnGqsBUaADRAkQAkABgCnBagAGAGcdKwFxoANECRACQAGAKcGKAAYAZx+rAZGgA0QJEAJAAYA5waoABgBnIisBsaADRAkQAkABgCnBygABoCnB2hQDRHZAZyWrAeGgA0QJEAJAAYBJwfoAAwFGAGcoawIBoANECRACQAGAOQALAJkAAYBJwfgAlsBnK+sCEaADRAkQAkABgDkACwGpAAGAScH4AJbAZy9rAiGgA0QJEAJACwI5AAGAScH6AAkACcJKgAnCWgAJwmoACcJ4AFnA2MBRAAMw5gBnNasCgaADRAkQAkABgBEio0R5EAJAGwKSAFGACcKqAAEgM0RCQGGAYSATRFZAZzshgAtCs0QCQAGAYSAjRFZAZz1hgAtCw0QCQAEAAgB7AtIAgYAJwqoAASATRBJAkYCRIANEZkBnYqEAgaBxQBOYAoBxQEOYQ0giQKGAmQABgAnC6ABRgBGgo0UhD/MQ8zECALEAgaBxQBOYAoBxQEOYQ0gpEAJAoYARoKNFIQ/zEPkAAgChgLGgkUATmBkgAaAJwuhAUaCjdQFAg5kTSTJAsQCBoHFAE5gCgHFAQ5hDSCkQAkChgBGgo0UhD/MQ+QACAKGAsaCRQCOYGSABoAnC6EBRoKN1AUEDmRNJMkCxgFGgsUPzmPkgAaCJwvhAU0wCQFGAUaCxQGOZISPzZPkQAaCJwvhAU0wCQFGAUaCxQMOZISPzZPkQAaCJwvhAU0wCQFGAUaCxQSOZISPzZPkQAaCJwvhAU0wCQFGAkSAzRBJAlz/B4FUAAAAwUYIAAIBSITFQkTERwABi0GHxITAAAJLwMSGRsXHxMZAAk0Fw8HFh4NAwUABhU1EBceDwAENBcUEAAGNBkTGRcBAAFKAAkvHg4PKz4aEwIAAUsACDIVHB0QEQ4YAAsuAzs7AAoPKToKDgAHMx8SGQUvMQABSAALORoVGCQgERcCAxoACAU3ABEXAgMaAAsFGDcAERcCAxoUCgABSQAGOA8LGBsPAAYYLwsYGw8AAU4ABD8QHAUAAU8ABSkbCQ4BAAFMAAktCh8eDgMHCw8AAU0ADT4TGjQsGQMaFA0FHhkAFz4TGjQsGQMaFA0FHhk1NBkCHgUbGBEPAAFCABg9GgkjIAEmOgUHDQ8eFSU5DggJAwEcAwUAAUMAAUAAAUEADQEtFw0FBwtdWxQTGSAACBwrAxUPBR4ZAAkqGgUDAwMVEQ0ACC4DJD8eAx8RAAQ5GhUYAAFGAAAABjYRExELBAACWhgAAVoAQAg/PTw0NxcQPCQ5C3JZEBs9fkcnDRcUAXZia0Q4EhQgXHojMBY3MWVCNT0uDhMXcGQ7AUFPHigkQUwQFkhaAkEACjkTEQspNBMZPC0ABjkTEQsrLQ==");
var b = function (e) {
    // console.log(__g._encrypt(encodeURIComponent(e)))
    return __g._encrypt(encodeURIComponent(e))
}
// exports.ENCRYPT_VERSION = A,
// exports.default = b
// e = "client_id=c3cef7c66a1843f8b3a9e6a1e3160e20&grant_type=password&timestamp=1611807648668&source=com.zhihu.web&signature=bf8a76bdc8f39b0f1083eb2529af5208ccfd4d5f&username=%2B8613213296461&password=fdjbcikjnverv&captcha=&lang=cn&utm_source=&ref_source=other_https%3A%2F%2Fwww.zhihu.com%2Fsignin%3Fnext%3D%252F"
// console.log(b(e))
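
On the Python side, the string handed to b() has the shape of the commented sample above: a URL-encoded query string with the fields in a fixed order. A minimal sketch of producing it follows. The names `FIELD_ORDER`, `serialize_form`, and `encrypt_form` are assumptions for illustration, and `encrypt` stands for any callable that runs the JS function b (for example through the execjs library mentioned earlier); it is not defined here.

```python
from urllib.parse import urlencode

# Field order as it appears in the sample plaintext in the JS comment.
FIELD_ORDER = [
    "client_id", "grant_type", "timestamp", "source", "signature",
    "username", "password", "captcha", "lang", "utm_source", "ref_source",
]


def serialize_form(form):
    """Turn the form dict into the plaintext query string for b()."""
    return urlencode([(key, form[key]) for key in FIELD_ORDER])


def encrypt_form(form, encrypt):
    """Serialize the form and pass it to a caller-supplied encrypt()."""
    return encrypt(serialize_form(form))
```

urlencode percent-encodes "+" as %2B and ":", "/", "?", "=", "%" as %3A, %2F, %3F, %3D, %25, which matches how username and ref_source look in the sample.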

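Now we can log in. But why stop at logging in? I went on to scrape the real-time trending list, the people I follow and their articles, and of course the comments and article images.

Scraping only the trending list is easy and involves no JS reverse engineering. Scraping the articles of people you follow is different: it involves some JS, which we again have to extract.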

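The first question is how to reach your own profile page. After some persistence I found a clue: something called url_token (my guess is that it is the first name we used when joining Zhihu). How do we get it? I noticed that one particular request is made when the Zhihu home page loads, and again when my profile page loads.

That request is me?include=visits_count, and it is what we are looking for. I ignored it at first, but it kept being requested and had me in its name, so I wondered whether it was about my account. Opening it confirmed that: it returns JSON data that includes url_token.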
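
A small sketch of pulling url_token out of that JSON response. It assumes url_token is a top-level key in the response, and that a profile page lives at `https://www.zhihu.com/people/<url_token>`; neither detail is stated in this article, and the function names are made up for illustration.

```python
import json


def extract_url_token(response_text):
    """Read url_token from the body of the me?include=visits_count call."""
    data = json.loads(response_text)
    token = data.get("url_token")  # assumed to be a top-level key
    if not token:
        raise ValueError("url_token not found in response")
    return token


def profile_url(url_token):
    """Build a profile URL; the path pattern is an assumption."""
    return "https://www.zhihu.com/people/" + url_token
```

Failing loudly when the key is missing makes it obvious when the session is not logged in or the response shape differs from what is assumed here.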

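With url_token solved, the next question is how to make the request, so look at the headers first.

There are many parameters, and some are very long. Don't worry: Ajian's testing shows that only x-zse-86 and x-zse-83 are indispensable. A global search for x-zse-86 finds just one file, which is convenient. Open it, pretty-print it, and search within it. The in-file search gives two matches; set a breakpoint on each and see which one is hit.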

At that location it is clear that x-zse-86 = "2.0_" + E. Looking upward, E = y.signature, so search within the file to see how signature is produced.

signature turns out to be a function that applies some transformations to d. My guess is that (0,o.default)((0,r.default)(d)) is what encrypts the plaintext d, and that d consists of several parts joined with "+". A breakpoint shows what the plaintext d looks like:

d = 3_2.0 + /api/v4/search_v3?t=general&q=%E5%B1%B1%E4%B8%9C%E5%A4%A7%E5%AD%A6&correction=1&offset=0&limit=20&lc_idx=0&show_all_topics=0 + "AEAkV5lhDg-PTu36jNX6b3n62LBIfQP7QOk=|1551443146"

In other words, the plaintext d is the x-zse-83 header + the URL + cookie.d_c0.

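We can call this r.default method in the console to see what it outputs.

The output looks familiar, a bit like an MD5 hash. Trying MD5 on the same input confirms it: the result matches the console output.

Next we need the o.default function, but it cannot be found directly. So try the earlier trick again and search for encrypt. Sure enough there is a match; set a breakpoint on it.

At the breakpoint, e = "965c2f5d20c0ef9f1b50def3025bc261" looks very much like an MD5 digest, and that is exactly what it is. Now we can extract the JS code.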
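
Putting the pieces above together, the header value can be sketched as "2.0_" + encrypt(md5(d)). The sketch below reads the "+" between the parts of d as a literal separator character, which is an assumption based on the description of d above. The function names are made up for illustration, and `encrypt` stands for a caller-supplied wrapper around the extracted JS function b.

```python
import hashlib


def build_plaintext(x_zse_83, path_and_query, d_c0):
    """Join the three parts of d; a literal "+" separator is assumed."""
    return "+".join([x_zse_83, path_and_query, d_c0])


def md5_hex(text):
    """Hex MD5 of the UTF-8 bytes, the role r.default plays above."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_x_zse_86(x_zse_83, path_and_query, d_c0, encrypt):
    """Return "2.0_" + encrypt(md5(d)); encrypt is supplied by the caller."""
    digest = md5_hex(build_plaintext(x_zse_83, path_and_query, d_c0))
    return "2.0_" + encrypt(digest)
```

In the sample above the d_c0 value appears wrapped in double quotes, so pass the cookie value exactly as the browser shows it rather than stripping them.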

x-zse-86 encryption code:

const jsdom = require("jsdom");
const {JSDOM} = jsdom;
const dom = new JSDOM(`<!DOCTYPE html><p>Hello world</p>`);
window = dom.window;
document = window.document;
XMLHttpRequest = window.XMLHttpRequest;

var exports = {}

function t(e) {
    return (t = "function" == typeof Symbol && "symbol" == typeof Symbol.A ? function (e) {
                return typeof e
            }
            : function (e) {
                return e && "function" == typeof Symbol && e.constructor === Symbol && e !== Symbol.prototype ? "symbol" : typeof e
            }
    )(e)
}

Object.defineProperty(exports, "__esModule", {
    value: !0
});
var A = "2.0"
    , __g = {};

function s() {
}

function i(e) {
    this.t = (2048 & e) >> 11,
        this.s = (1536 & e) >> 9,
        this.i = 511 & e,
        this.h = 511 & e
}

function h(e) {
    this.s = (3072 & e) >> 10,
        this.h = 1023 & e
}

function a(e) {
    this.a = (3072 & e) >> 10,
        this.c = (768 & e) >> 8,
        this.n = (192 & e) >> 6,
        this.t = 63 & e
}

function c(e) {
    this.s = e >> 10 & 3,
        this.i = 1023 & e
}

function n() {
}

function e(e) {
    this.a = (3072 & e) >> 10,
        this.c = (768 & e) >> 8,
        this.n = (192 & e) >> 6,
        this.t = 63 & e
}

function o(e) {
    this.h = (4095 & e) >> 2,
        this.t = 3 & e
}

function r(e) {
    this.s = e >> 10 & 3,
        this.i = e >> 2 & 255,
        this.t = 3 & e
}

s.prototype.e = function (e) {
    e.o = !1
}
    ,
    i.prototype.e = function (e) {
        switch (this.t) {
            case 0:
                e.r[this.s] = this.i;
                break;
            case 1:
                e.r[this.s] = e.k[this.h]
        }
    }
    ,
    h.prototype.e = function (e) {
        e.k[this.h] = e.r[this.s]
    }
    ,
    a.prototype.e = function (e) {
        switch (this.t) {
            case 0:
                e.r[this.a] = e.r[this.c] + e.r[this.n];
                break;
            case 1:
                e.r[this.a] = e.r[this.c] - e.r[this.n];
                break;
            case 2:
                e.r[this.a] = e.r[this.c] * e.r[this.n];
                break;
            case 3:
                e.r[this.a] = e.r[this.c] / e.r[this.n];
                break;
            case 4:
                e.r[this.a] = e.r[this.c] % e.r[this.n];
                break;
            case 5:
                e.r[this.a] = e.r[this.c] == e.r[this.n];
                break;
            case 6:
                e.r[this.a] = e.r[this.c] >= e.r[this.n];
                break;
            case 7:
                e.r[this.a] = e.r[this.c] || e.r[this.n];
                break;
            case 8:
                e.r[this.a] = e.r[this.c] && e.r[this.n];
                break;
            case 9:
                e.r[this.a] = e.r[this.c] !== e.r[this.n];
                break;
            case 10:
                e.r[this.a] = t(e.r[this.c]);
                break;
            case 11:
                e.r[this.a] = e.r[this.c] in e.r[this.n];
                break;
            case 12:
                e.r[this.a] = e.r[this.c] > e.r[this.n];
                break;
            case 13:
                e.r[this.a] = -e.r[this.c];
                break;
            case 14:
                e.r[this.a] = e.r[this.c] < e.r[this.n];
                break;
            case 15:
                e.r[this.a] = e.r[this.c] & e.r[this.n];
                break;
            case 16:
                e.r[this.a] = e.r[this.c] ^ e.r[this.n];
                break;
            case 17:
                e.r[this.a] = e.r[this.c] << e.r[this.n];
                break;
            case 18:
                e.r[this.a] = e.r[this.c] >>> e.r[this.n];
                break;
            case 19:
                e.r[this.a] = e.r[this.c] | e.r[this.n];
                break;
            case 20:
                e.r[this.a] = !e.r[this.c]
        }
    }
    ,
    c.prototype.e = function (e) {
        e.Q.push(e.C),
            e.B.push(e.k),
            e.C = e.r[this.s],
            e.k = [];
        for (var t = 0; t < this.i; t++)
            e.k.unshift(e.f.pop());
        e.g.push(e.f),
            e.f = []
    }
    ,
    n.prototype.e = function (e) {
        e.C = e.Q.pop(),
            e.k = e.B.pop(),
            e.f = e.g.pop()
    }
    ,
    e.prototype.e = function (e) {
        switch (this.t) {
            case 0:
                e.u = e.r[this.a] >= e.r[this.c];
                break;
            case 1:
                e.u = e.r[this.a] <= e.r[this.c];
                break;
            case 2:
                e.u = e.r[this.a] > e.r[this.c];
                break;
            case 3:
                e.u = e.r[this.a] < e.r[this.c];
                break;
            case 4:
                e.u = e.r[this.a] == e.r[this.c];
                break;
            case 5:
                e.u = e.r[this.a] != e.r[this.c];
                break;
            case 6:
                e.u = e.r[this.a];
                break;
            case 7:
                e.u = !e.r[this.a]
        }
    }
    ,
    o.prototype.e = function (e) {
        switch (this.t) {
            case 0:
                e.C = this.h;
                break;
            case 1:
                e.u && (e.C = this.h);
                break;
            case 2:
                e.u || (e.C = this.h);
                break;
            case 3:
                e.C = this.h,
                    e.w = null
        }
        e.u = !1
    }
    ,
    r.prototype.e = function (e) {
        switch (this.t) {
            case 0:
                for (var t = [], n = 0; n < this.i; n++)
                    t.unshift(e.f.pop());
                e.r[3] = e.r[this.s](t[0], t[1]);
                break;
            case 1:
                for (var r = e.f.pop(), o = [], i = 0; i < this.i; i++)
                    o.unshift(e.f.pop());
                e.r[3] = e.r[this.s][r](o[0], o[1]);
                break;
            case 2:
                for (var a = [], c = 0; c < this.i; c++)
                    a.unshift(e.f.pop());
                e.r[3] = new e.r[this.s](a[0], a[1])
        }
    }
;
var k = function (e) {
    for (var t = 66, n = [], r = 0; r < e.length; r++) {
        var o = 24 ^ e.charCodeAt(r) ^ t;
        n.push(String.fromCharCode(o)),
            t = o
    }
    return n.join("")
};

function Q(e) {
    this.t = (4095 & e) >> 10,
        this.s = (1023 & e) >> 8,
        this.i = 1023 & e,
        this.h = 63 & e
}

function C(e) {
    this.t = (4095 & e) >> 10,
        this.a = (1023 & e) >> 8,
        this.c = (255 & e) >> 6
}

function B(e) {
    this.s = (3072 & e) >> 10,
        this.h = 1023 & e
}

function f(e) {
    this.h = 4095 & e
}

function g(e) {
    this.s = (3072 & e) >> 10
}

function u(e) {
    this.h = 4095 & e
}

function w(e) {
    this.t = (3840 & e) >> 8,
        this.s = (192 & e) >> 6,
        this.i = 63 & e
}

function G() {
    this.r = [0, 0, 0, 0],
        this.C = 0,
        this.Q = [],
        this.k = [],
        this.B = [],
        this.f = [],
        this.g = [],
        this.u = !1,
        this.G = [],
        this.b = [],
        this.o = !1,
        this.w = null,
        this.U = null,
        this.F = [],
        this.R = 0,
        this.J = {
            0: s,
            1: i,
            2: h,
            3: a,
            4: c,
            5: n,
            6: e,
            7: o,
            8: r,
            9: Q,
            10: C,
            11: B,
            12: f,
            13: g,
            14: u,
            15: w
        }
}

Q.prototype.e = function (e) {
    switch (this.t) {
        case 0:
            e.f.push(e.r[this.s]);
            break;
        case 1:
            e.f.push(this.i);
            break;
        case 2:
            e.f.push(e.k[this.h]);
            break;
        case 3:
            e.f.push(k(e.b[this.h]))
    }
}
    ,
    C.prototype.e = function (A) {
        switch (this.t) {
            case 0:
                var t = A.f.pop();
                A.r[this.a] = A.r[this.c][t];
                break;
            case 1:
                var s = A.f.pop()
                    , i = A.f.pop();
                A.r[this.c][s] = i;
                break;
            case 2:
                var h = A.f.pop();
                A.r[this.a] = eval(h)
        }
    }
    ,
    B.prototype.e = function (e) {
        e.r[this.s] = k(e.b[this.h])
    }
    ,
    f.prototype.e = function (e) {
        e.w = this.h
    }
    ,
    g.prototype.e = function (e) {
        throw e.r[this.s]
    }
    ,
    u.prototype.e = function (e) {
        var t = this
            , n = [0];
        e.k.forEach(function (e) {
            n.push(e)
        });
        var r = function (r) {
            var o = new G;
            return o.k = n,
                o.k[0] = r,
                o.v(e.G, t.h, e.b, e.F),
                o.r[3]
        };
        r.toString = function () {
            return "() { [native code] }"
        }
            ,
            e.r[3] = r
    }
    ,
    w.prototype.e = function (e) {
        switch (this.t) {
            case 0:
                for (var t = {}, n = 0; n < this.i; n++) {
                    var r = e.f.pop();
                    t[e.f.pop()] = r
                }
                e.r[this.s] = t;
                break;
            case 1:
                for (var o = [], i = 0; i < this.i; i++)
                    o.unshift(e.f.pop());
                e.r[this.s] = o
        }
    }
    ,
    G.prototype.D = function (e) {
        console.log(window.atob(e));
        for (var t = window.atob(e), n = t.charCodeAt(0) << 8 | t.charCodeAt(1), r = [], o = 2; o < n + 2; o += 2)
            r.push(t.charCodeAt(o) << 8 | t.charCodeAt(o + 1));
        this.G = r;
        for (var i = [], a = n + 2; a < t.length;) {
            var c = t.charCodeAt(a) << 8 | t.charCodeAt(a + 1)
                , s = t.slice(a + 2, a + 2 + c);
            i.push(s),
                a += c + 2
        }
        this.b = i
    }
    ,
    G.prototype.v = function (e, t, n) {
        for (t = t || 0,
                 n = n || [],
                 this.C = t,
                 "string" == typeof e ? this.D(e) : (this.G = e,
                     this.b = n),
                 this.o = !0,
                 this.R = Date.now(); this.o;) {
            var r = this.G[this.C++];
            if ("number" != typeof r)
                break;
            var o = Date.now();
            if (500 < o - this.R)
                return;
            this.R = o;
            try {
                this.e(r)
            } catch (e) {
                this.U = e,
                this.w && (this.C = this.w)
            }
        }
    }
    ,
    G.prototype.e = function (e) {
        var t = (61440 & e) >> 12;
        new this.J[t](e).e(this)
    }
    ,
    (new G).v("AxjgB5MAnACoAJwBpAAAABAAIAKcAqgAMAq0AzRJZAZwUpwCqACQACACGAKcBKAAIAOcBagAIAQYAjAUGgKcBqFAuAc5hTSHZAZwqrAIGgA0QJEAJAAYAzAUGgOcCaFANRQ0R2QGcOKwChoANECRACQAsAuQABgDnAmgAJwMgAGcDYwFEAAzBmAGcSqwDhoANECRACQAGAKcD6AAGgKcEKFANEcYApwRoAAxB2AGcXKwEhoANECRACQAGAKcE6AAGgKcFKFANEdkBnGqsBUaADRAkQAkABgCnBagAGAGcdKwFxoANECRACQAGAKcGKAAYAZx+rAZGgA0QJEAJAAYA5waoABgBnIisBsaADRAkQAkABgCnBygABoCnB2hQDRHZAZyWrAeGgA0QJEAJAAYBJwfoAAwFGAGcoawIBoANECRACQAGAOQALAJkAAYBJwfgAlsBnK+sCEaADRAkQAkABgDkACwGpAAGAScH4AJbAZy9rAiGgA0QJEAJACwI5AAGAScH6AAkACcJKgAnCWgAJwmoACcJ4AFnA2MBRAAMw5gBnNasCgaADRAkQAkABgBEio0R5EAJAGwKSAFGACcKqAAEgM0RCQGGAYSATRFZAZzshgAtCs0QCQAGAYSAjRFZAZz1hgAtCw0QCQAEAAgB7AtIAgYAJwqoAASATRBJAkYCRIANEZkBnYqEAgaBxQBOYAoBxQEOYQ0giQKGAmQABgAnC6ABRgBGgo0UhD/MQ8zECALEAgaBxQBOYAoBxQEOYQ0gpEAJAoYARoKNFIQ/zEPkAAgChgLGgkUATmBkgAaAJwuhAUaCjdQFAg5kTSTJAsQCBoHFAE5gCgHFAQ5hDSCkQAkChgBGgo0UhD/MQ+QACAKGAsaCRQCOYGSABoAnC6EBRoKN1AUEDmRNJMkCxgFGgsUPzmPkgAaCJwvhAU0wCQFGAUaCxQGOZISPzZPkQAaCJwvhAU0wCQFGAUaCxQMOZISPzZPkQAaCJwvhAU0wCQFGAUaCxQSOZISPzZPkQAaCJwvhAU0wCQFGAkSAzRBJAlz/B4FUAAAAwUYIAAIBSITFQkTERwABi0GHxITAAAJLwMSGRsXHxMZAAk0Fw8HFh4NAwUABhU1EBceDwAENBcUEAAGNBkTGRcBAAFKAAkvHg4PKz4aEwIAAUsACDIVHB0QEQ4YAAsuAzs7AAoPKToKDgAHMx8SGQUvMQABSAALORoVGCQgERcCAxoACAU3ABEXAgMaAAsFGDcAERcCAxoUCgABSQAGOA8LGBsPAAYYLwsYGw8AAU4ABD8QHAUAAU8ABSkbCQ4BAAFMAAktCh8eDgMHCw8AAU0ADT4TGjQsGQMaFA0FHhkAFz4TGjQsGQMaFA0FHhk1NBkCHgUbGBEPAAFCABg9GgkjIAEmOgUHDQ8eFSU5DggJAwEcAwUAAUMAAUAAAUEADQEtFw0FBwtdWxQTGSAACBwrAxUPBR4ZAAkqGgUDAwMVEQ0ACC4DJD8eAx8RAAQ5GhUYAAFGAAAABjYRExELBAACWhgAAVoAQAg/PTw0NxcQPCQ5C3JZEBs9fkcnDRcUAXZia0Q4EhQgXHojMBY3MWVCNT0uDhMXcGQ7AUFPHigkQUwQFkhaAkEACjkTEQspNBMZPC0ABjkTEQsrLQ==");

function b(e) {
    console.log(e);
    console.log(encodeURIComponent(e));
    return __g._encrypt(encodeURIComponent(e))
};

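Once url_token can be obtained, much of the rest follows. The cracked x-zse-86 is also needed when scraping articles, and the encryption inputs are the same: x-zse-83 + url + cookie.d_c0. With the hardest part solved, scraping the articles is nothing to fear. For the articles I did not use XPath; I used regular expressions, which are actually a better fit here. Why? The article is first returned as JSON, and the full text is in that JSON whether or not you click to expand it; clicking only extracts and renders it into what you see on the page. This time I wrote the articles into Word documents. Most of Zhihu's data is returned as JSON.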
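
A rough sketch of the regex approach on a JSON payload. The key names `title` and `content`, the assumption that `content` holds an HTML fragment, and the function name `parse_article` are all hypothetical; check the real response before relying on them.

```python
import json
import re

# Simple patterns for double-quoted attributes; enough for a sketch.
IMG_SRC = re.compile(r'<img\s[^>]*?\bsrc="([^"]+)"')
LINK_HREF = re.compile(r'<a\s[^>]*?\bhref="([^"]+)"')
TAG = re.compile(r"<[^>]+>")


def parse_article(raw_json):
    """Pull title, image URLs, links, and plain text out of one article."""
    data = json.loads(raw_json)
    html = data.get("content", "")  # hypothetical key holding HTML
    return {
        "title": data.get("title", ""),
        "images": IMG_SRC.findall(html),
        "links": LINK_HREF.findall(html),
        "text": TAG.sub("", html),  # strip tags, keep the text
    }
```

Because the body arrives as a string inside JSON rather than as a full page, a few targeted patterns like these are often enough, and there is no document tree to build first.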

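By the way, the QR-code login is adapted from a project on GitHub (link: https://github.com/CharlesPikachu/DecryptLogin). I could not get it working on my own and did not want to keep at it. For a scraper, implementing one login method is enough; there is no need for many, and simpler is better.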

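That is the end of this article. If you would like the source code, add me on WeChat (through my WeChat official account) and send "知乎源码" (Zhihu source code); feel free to reach out with any questions.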

See you next time.

Reposted from blog.csdn.net/weixin_45886778/article/details/113575960