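[Author homepage]: 吴秋霖
[About the author]: Quality creator in the Python field, Alibaba Cloud blog expert, and Huawei Cloud expert (华为云享专家). Long dedicated to research and development in Python and web crawling.
[Author's recommendations]: If you are interested in JS reverse engineering, follow 《爬虫JS逆向实战》; if you are interested in distributed crawler platforms, follow 《分布式爬虫平台搭建与开发实战》.
More series are planned on CAPTCHA bypassing, app reverse engineering, and general Python topics.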
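1. Interface Analysis
First, I found a meat product I bought during Double 11 (双十一), opened the developer tools, and clicked the product's reviews. The request is sent with the following parameters:
At a glance, sign looks like the only generated parameter. It is a 32-character value, and anyone with some experience will guess that it is an MD5 digest. I covered techniques for locating JS encryption functions in an earlier article: "A Summary of Techniques for Quickly Searching and Locating Encryption Functions in JS Reverse Engineering" (JS逆向中快速搜索定位加密函数技巧总结).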
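The "32 characters, so probably MD5" guess can be checked mechanically before opening the debugger. Below is a minimal sketch; `looks_like_md5_hex` is a hypothetical helper name, and a match only shows that the value has the shape of an MD5 hex digest, not that MD5 actually produced it:

```python
import re

# An MD5 digest rendered as hex is exactly 32 hexadecimal characters.
_MD5_HEX = re.compile(r'[0-9a-fA-F]{32}')


def looks_like_md5_hex(value: str) -> bool:
    """Return True if value has the shape of an MD5 hex digest."""
    return _MD5_HEX.fullmatch(value) is not None


# The MD5 digest of the empty string has the expected shape.
print(looks_like_md5_hex('d41d8cd98f00b204e9800998ecf8427e'))
# A short numeric value such as an appKey does not.
print(looks_like_md5_hex('12574478'))
```

A positive result is only a hint; the breakpoint analysis is still needed to confirm which function produces the value and from what input.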
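Searching for MD5-related keywords turns up a function h that looks like a stock MD5 implementation, as shown below:
2. Breakpoint Analysis
Experience only gets us so far; the guess still has to be verified in practice. To confirm it, we dig a little deeper: do a global search for the request parameters (with a bit of luck this saves a lot of time), find the sign parameter, set a breakpoint there, and refresh to trigger the request, as shown below:
OK, j is our target. Looking slightly above it at how j is generated, h appears to be the hashing function. Leave h aside for now and look at the plaintext before hashing: take the expression passed to h and run it in the console, as shown below:
This is the plaintext before hashing. Passing it through h again produces exactly the sign value. We will look at h itself later; what matters now is how the plaintext is composed. token appears to be a fixed value:
i is a 13-digit (millisecond) timestamp, g is the appKey value from the request (fixed), and data is also taken directly from the request parameters.
From the debugging above, the input to MD5 is built by concatenating several parameters, summarized below: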
var token = ''; // fixed
var data = '{"itemId":"746109770909","bizCode":"ali.china.tmall","channel":"pc_detail","pageSize":20,"pageNum":2}'; // review-related parameters, including the item ID and the review page number
var timestamp = 1700909312000; // (new Date()).getTime(), timestamp in milliseconds
var appKey = ''; // fixed; it is a constant in the JS and stays the same across requests
var concatenatedString = token + "&" + timestamp + "&" + appKey + "&" + data; // concatenate
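Hash the concatenated string with a standard MD5 and compare the result with the sign generated by the page. If the two match, the page uses stock MD5 without modifications.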
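One way to run this comparison outside the browser is sketched below. `is_stock_md5` is a hypothetical helper: it takes any callable that maps a string to a hex digest (for example, a wrapper around the page's h function, however you choose to execute it) and checks it against the RFC 1321 test vectors and against hashlib. The sample plaintext uses placeholder token and appKey values, not real ones.

```python
import hashlib

# Test vectors published in RFC 1321, the MD5 specification.
RFC1321_VECTORS = {
    '': 'd41d8cd98f00b204e9800998ecf8427e',
    'a': '0cc175b9c0f1b6a831c399e269772661',
    'abc': '900150983cd24fb0d6963f7d28e17f72',
    'message digest': 'f96b697d7cb7938d525a2f31aaf161d0',
}


def reference_md5(text: str) -> str:
    """Standard MD5 of the UTF-8 encoded text, as lowercase hex."""
    return hashlib.md5(text.encode('utf-8')).hexdigest()


def is_stock_md5(candidate, extra_inputs=()) -> bool:
    """Check a str -> hex-digest callable against known MD5 results."""
    for text, expected in RFC1321_VECTORS.items():
        if candidate(text).lower() != expected:
            return False
    # Also compare on caller-supplied inputs, e.g. a captured plaintext.
    return all(candidate(t).lower() == reference_md5(t) for t in extra_inputs)


# Placeholder plaintext in the token&timestamp&appKey&data layout.
sample = 'TOKEN&1700909312000&APPKEY&{"itemId":"746109770909"}'
print(is_stock_md5(reference_md5, [sample]))
```

If a candidate fails on the standard vectors, the page's function has probably been altered (a salt, a changed constant, a different output encoding), and the JS would need to be reproduced more faithfully.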
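3. Algorithm Implementation
Now that all the plaintext parameters are known, go back to the h function: hover over it and click through to its definition, as shown below:
After jumping to the definition, h turns out to be a plain MD5 implementation. The code is as follows: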
function h(a) {
function b(a, b) {
return a << b | a >>> 32 - b
}
function c(a, b) {
var c, d, e, f, g;
return e = 2147483648 & a,
f = 2147483648 & b,
c = 1073741824 & a,
d = 1073741824 & b,
g = (1073741823 & a) + (1073741823 & b),
c & d ? 2147483648 ^ g ^ e ^ f : c | d ? 1073741824 & g ? 3221225472 ^ g ^ e ^ f : 1073741824 ^ g ^ e ^ f : g ^ e ^ f
}
function d(a, b, c) {
return a & b | ~a & c
}
function e(a, b, c) {
return a & c | b & ~c
}
function f(a, b, c) {
return a ^ b ^ c
}
function g(a, b, c) {
return b ^ (a | ~c)
}
function h(a, e, f, g, h, i, j) {
return a = c(a, c(c(d(e, f, g), h), j)),
c(b(a, i), e)
}
function i(a, d, f, g, h, i, j) {
return a = c(a, c(c(e(d, f, g), h), j)),
c(b(a, i), d)
}
function j(a, d, e, g, h, i, j) {
return a = c(a, c(c(f(d, e, g), h), j)),
c(b(a, i), d)
}
function k(a, d, e, f, h, i, j) {
return a = c(a, c(c(g(d, e, f), h), j)),
c(b(a, i), d)
}
function l(a) {
for (var b, c = a.length, d = c + 8, e = (d - d % 64) / 64, f = 16 * (e + 1), g = new Array(f - 1), h = 0, i = 0; c > i; )
b = (i - i % 4) / 4,
h = i % 4 * 8,
g[b] = g[b] | a.charCodeAt(i) << h,
i++;
return b = (i - i % 4) / 4,
h = i % 4 * 8,
g[b] = g[b] | 128 << h,
g[f - 2] = c << 3,
g[f - 1] = c >>> 29,
g
}
function m(a) {
var b, c, d = "", e = "";
for (c = 0; 3 >= c; c++)
b = a >>> 8 * c & 255,
e = "0" + b.toString(16),
d += e.substr(e.length - 2, 2);
return d
}
function n(a) {
a = a.replace(/\r\n/g, "\n");
for (var b = "", c = 0; c < a.length; c++) {
var d = a.charCodeAt(c);
128 > d ? b += String.fromCharCode(d) : d > 127 && 2048 > d ? (b += String.fromCharCode(d >> 6 | 192),
b += String.fromCharCode(63 & d | 128)) : (b += String.fromCharCode(d >> 12 | 224),
b += String.fromCharCode(d >> 6 & 63 | 128),
b += String.fromCharCode(63 & d | 128))
}
return b
}
var o, p, q, r, s, t, u, v, w, x = [], y = 7, z = 12, A = 17, B = 22, C = 5, D = 9, E = 14, F = 20, G = 4, H = 11, I = 16, J = 23, K = 6, L = 10, M = 15, N = 21;
for (a = n(a),
x = l(a),
t = 1732584193,
u = 4023233417,
v = 2562383102,
w = 271733878,
o = 0; o < x.length; o += 16)
p = t,
q = u,
r = v,
s = w,
t = h(t, u, v, w, x[o + 0], y, 3614090360),
w = h(w, t, u, v, x[o + 1], z, 3905402710),
v = h(v, w, t, u, x[o + 2], A, 606105819),
u = h(u, v, w, t, x[o + 3], B, 3250441966),
t = h(t, u, v, w, x[o + 4], y, 4118548399),
w = h(w, t, u, v, x[o + 5], z, 1200080426),
v = h(v, w, t, u, x[o + 6], A, 2821735955),
u = h(u, v, w, t, x[o + 7], B, 4249261313),
t = h(t, u, v, w, x[o + 8], y, 1770035416),
w = h(w, t, u, v, x[o + 9], z, 2336552879),
v = h(v, w, t, u, x[o + 10], A, 4294925233),
u = h(u, v, w, t, x[o + 11], B, 2304563134),
t = h(t, u, v, w, x[o + 12], y, 1804603682),
w = h(w, t, u, v, x[o + 13], z, 4254626195),
v = h(v, w, t, u, x[o + 14], A, 2792965006),
u = h(u, v, w, t, x[o + 15], B, 1236535329),
t = i(t, u, v, w, x[o + 1], C, 4129170786),
w = i(w, t, u, v, x[o + 6], D, 3225465664),
v = i(v, w, t, u, x[o + 11], E, 643717713),
u = i(u, v, w, t, x[o + 0], F, 3921069994),
t = i(t, u, v, w, x[o + 5], C, 3593408605),
w = i(w, t, u, v, x[o + 10], D, 38016083),
v = i(v, w, t, u, x[o + 15], E, 3634488961),
u = i(u, v, w, t, x[o + 4], F, 3889429448),
t = i(t, u, v, w, x[o + 9], C, 568446438),
w = i(w, t, u, v, x[o + 14], D, 3275163606),
v = i(v, w, t, u, x[o + 3], E, 4107603335),
u = i(u, v, w, t, x[o + 8], F, 1163531501),
t = i(t, u, v, w, x[o + 13], C, 2850285829),
w = i(w, t, u, v, x[o + 2], D, 4243563512),
v = i(v, w, t, u, x[o + 7], E, 1735328473),
u = i(u, v, w, t, x[o + 12], F, 2368359562),
t = j(t, u, v, w, x[o + 5], G, 4294588738),
w = j(w, t, u, v, x[o + 8], H, 2272392833),
v = j(v, w, t, u, x[o + 11], I, 1839030562),
u = j(u, v, w, t, x[o + 14], J, 4259657740),
t = j(t, u, v, w, x[o + 1], G, 2763975236),
w = j(w, t, u, v, x[o + 4], H, 1272893353),
v = j(v, w, t, u, x[o + 7], I, 4139469664),
u = j(u, v, w, t, x[o + 10], J, 3200236656),
t = j(t, u, v, w, x[o + 13], G, 681279174),
w = j(w, t, u, v, x[o + 0], H, 3936430074),
v = j(v, w, t, u, x[o + 3], I, 3572445317),
u = j(u, v, w, t, x[o + 6], J, 76029189),
t = j(t, u, v, w, x[o + 9], G, 3654602809),
w = j(w, t, u, v, x[o + 12], H, 3873151461),
v = j(v, w, t, u, x[o + 15], I, 530742520),
u = j(u, v, w, t, x[o + 2], J, 3299628645),
t = k(t, u, v, w, x[o + 0], K, 4096336452),
w = k(w, t, u, v, x[o + 7], L, 1126891415),
v = k(v, w, t, u, x[o + 14], M, 2878612391),
u = k(u, v, w, t, x[o + 5], N, 4237533241),
t = k(t, u, v, w, x[o + 12], K, 1700485571),
w = k(w, t, u, v, x[o + 3], L, 2399980690),
v = k(v, w, t, u, x[o + 10], M, 4293915773),
u = k(u, v, w, t, x[o + 1], N, 2240044497),
t = k(t, u, v, w, x[o + 8], K, 1873313359),
w = k(w, t, u, v, x[o + 15], L, 4264355552),
v = k(v, w, t, u, x[o + 6], M, 2734768916),
u = k(u, v, w, t, x[o + 13], N, 1309151649),
t = k(t, u, v, w, x[o + 4], K, 4149444226),
w = k(w, t, u, v, x[o + 11], L, 3174756917),
v = k(v, w, t, u, x[o + 2], M, 718787259),
u = k(u, v, w, t, x[o + 9], N, 3951481745),
t = c(t, p),
u = c(u, q),
v = c(v, r),
w = c(w, s);
var O = m(t) + m(u) + m(v) + m(w);
return O.toLowerCase()
}
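We could reproduce the algorithm by extracting the JS above and calling it directly. But the core is just an MD5 hash, so that effort is unnecessary: Python's hashlib does it in one line. The implementation is as follows: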
import time
import hashlib

token = ''  # fill in your own
app_key = ''  # fill in your own
data = '{"itemId":"746109770909","bizCode":"ali.china.tmall","channel":"pc_detail","pageSize":20,"pageNum":1}'
current_time = int(time.time() * 1000)  # 13-digit millisecond timestamp
# Same layout as in the JS: token&timestamp&appKey&data
to_be_hashed = f'{token}&{current_time}&{app_key}&{data}'
# Compute sign
sign = hashlib.md5(to_be_hashed.encode()).hexdigest()
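Because data appears both in the signed string and in the request itself, the two copies have to be byte-for-byte identical. Here is a small sketch of wrapping this up, assuming the field names and values from the captured request; `build_data` and `compute_sign` are hypothetical helper names, and the compact JSON separators are chosen to reproduce the string seen in the browser:

```python
import hashlib
import json
import time


def build_data(item_id: str, page_num: int, page_size: int = 20) -> str:
    """Serialize the review query the same way the captured request does."""
    payload = {
        'itemId': item_id,
        'bizCode': 'ali.china.tmall',
        'channel': 'pc_detail',
        'pageSize': page_size,
        'pageNum': page_num,
    }
    # No spaces after ',' or ':' so the output matches the browser's string;
    # ensure_ascii=False keeps non-ASCII characters unescaped.
    return json.dumps(payload, separators=(',', ':'), ensure_ascii=False)


def compute_sign(token: str, timestamp: int, app_key: str, data: str) -> str:
    """MD5 over token&timestamp&appKey&data, as lowercase hex."""
    plaintext = f'{token}&{timestamp}&{app_key}&{data}'
    return hashlib.md5(plaintext.encode('utf-8')).hexdigest()


data = build_data('746109770909', page_num=1)
timestamp = int(time.time() * 1000)
# Empty token and appKey are placeholders; supply your own values.
sign = compute_sign('', timestamp, '', data)
print(data)
print(sign)
```

The same timestamp and data values that go into compute_sign should then be sent as the t and data query parameters.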
The request code is as follows:
import requests

# data is the same string that was used when computing sign
# sign is the value computed above
# timestamp must be the same current_time that went into sign, since sign is derived from it
def make_request(data, sign, timestamp):
    url = 'https://h5api.m.tmall.com/h5/mtop.alibaba.review.list.for.new.pc.detail/1.0/'
    headers = {
        'authority': 'h5api.m.tmall.com',
        'accept': '*/*',
        'accept-language': 'en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7',
        'referer': 'https://detail.tmall.com/',
        'sec-ch-ua': '"Google Chrome";v="117", "Not;A=Brand";v="8", "Chromium";v="117"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': '"macOS"',
        'sec-fetch-dest': 'script',
        'sec-fetch-mode': 'no-cors',
        'sec-fetch-site': 'same-site',
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36',
        # Raw Cookie header string copied from the browser. It is sent as a header
        # because requests' cookies= argument expects name -> value pairs instead.
        'cookie': '...',
    }
    params = {
        'jsv': '2.7.0',
        'appKey': '12574478',  # must match the app_key used when computing sign
        't': timestamp,
        'sign': sign,
        'api': 'mtop.alibaba.review.list.for.new.pc.detail',
        'v': '1.0',
        'isSec': '0',
        'ecode': '0',
        'timeout': '10000',
        'ttid': '2022@taobao_litepc_9.17.0',
        'AntiFlood': 'true',
        'AntiCreep': 'true',
        'preventFallback': 'true',
        'type': 'jsonp',
        'dataType': 'jsonp',
        'callback': 'mtopjsonp4',
        'data': data,
    }
    response = requests.get(url, params=params, headers=headers)
    print(response.text)
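With type=jsonp and callback=mtopjsonp4 in the query string, the body is expected to be the JSON payload wrapped in a call to that callback rather than bare JSON. The following is a minimal sketch of unwrapping it; `parse_jsonp` is a hypothetical helper, and the sample text is made up for illustration, not a captured response:

```python
import json
import re

# Matches callbackName( ... ) with an optional trailing semicolon.
_JSONP = re.compile(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def parse_jsonp(text: str):
    """Strip a JSONP callback wrapper, if present, and decode the JSON."""
    match = _JSONP.match(text)
    if match is None:
        # Not wrapped: treat the body as plain JSON.
        return json.loads(text)
    return json.loads(match.group(1))


# Invented sample in the shape callback(payload).
sample = ' mtopjsonp4({"ret": ["EXAMPLE"], "data": {"items": []}})'
parsed = parse_jsonp(sample)
print(parsed['data'])
```

In make_request this would be applied to response.text in place of printing the raw body.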
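Running the request code in a terminal gives the following result:
One reminder: the site has slider verification (a slider CAPTCHA):
The product detail API works the same way, with slightly different parameters, as shown below:
That is all for this post. Writing takes effort, so please leave a like before you go. Your support keeps me writing, and I hope to bring you more quality articles.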