Pagoda (BT Panel) human-machine verification

Case address: https://www.amec-inc.com
Case content: analysis of the cookie set by the Pagoda human-machine verification on this site. (The case is very simple: clearing the cookies and refreshing the page triggers the verification.)

Comparing the cookies before and after the check shows that one additional cookie, e50222e9c8393a251cb491679ef73186, is present once the human-machine verification has passed.
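To make that before/after comparison repeatable, here is a minimal sketch that picks out cookie names shaped like the one above. It assumes the verification cookie name is always a 32-character lowercase hex string, which is only a guess based on this single observation. The helper name and the sample cookie values are hypothetical.

```python
import re

# Assumption: the verification cookie name is a 32-character lowercase hex
# string, as in the single example observed in this article.
HEX32 = re.compile(r"[0-9a-f]{32}")


def verification_cookie_names(cookies):
    """Return the cookie names that look like the verification cookie."""
    return [name for name in cookies if HEX32.fullmatch(name)]


# Hypothetical cookie jars; only the hex cookie name comes from the article.
before = {"PHPSESSID": "abc"}
after = {"PHPSESSID": "abc", "e50222e9c8393a251cb491679ef73186": "placeholder"}

print(verification_cookie_names(before))
print(verification_cookie_names(after))
```

With requests, something like `verification_cookie_names(sess.cookies.get_dict())` should work the same way.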


Start the analysis from the response to a fresh request, and first look at the JS file it loads.

 <script type="text/javascript" src="/renji_296d626f_32f3b5e96be67b639c72b3614aaac541.js?id=1660718301"></script>
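If you want to find this script tag programmatically, a regular expression is enough. The sketch below assumes the file name always follows the renji_<hex>_<32 hex chars>.js?id=<digits> shape seen in this one tag, which may not hold on other sites. The function name is hypothetical. Note that the trailing 32-character segment of the file name is the same string as the key used later in this article.

```python
import re

# Pattern inferred from the single script tag shown in the article; the
# naming scheme on other sites (or at other times) may differ.
SCRIPT_RE = re.compile(r'src="(/renji_[0-9a-f]+_([0-9a-f]{32})\.js\?id=(\d+))"')


def find_challenge_script(html):
    """Return (path, trailing_hash, id) for the challenge script, or None."""
    match = SCRIPT_RE.search(html)
    return match.groups() if match else None


sample = (
    '<script type="text/javascript" '
    'src="/renji_296d626f_32f3b5e96be67b639c72b3614aaac541.js?id=1660718301">'
    "</script>"
)
print(find_challenge_script(sample))
```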

Copy the code to a local file and format it. It must be copied from the response body: copying it directly from the rendered browser page gives incomplete code.

After formatting, reading through the code reveals the logic that triggers the human-machine verification. The part to focus on is the c.get call.

c.get issues a GET request through XMLHttpRequest, so clear the cookies again and inspect the network traffic.

After the JS is loaded, a request is sent with three query parameters: type, key, and value.

When the request succeeds, the verification cookie is returned.
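To see the three parameters without a browser, the captured request URL can be split with the standard library. This is only a sketch: the URL is the one used in the test code at the end of this article, and the function name is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def challenge_params(url):
    """Return the query parameters of a verification request as a dict."""
    query = urlsplit(url).query
    # parse_qs maps each name to a list of values; each name appears once here.
    return {name: values[0] for name, values in parse_qs(query).items()}


verify_url = (
    "https://www.amec-inc.com/a20be899_96a6_40b2_88ba_32f1f75f1552_yanzheng_ip.php"
    "?type=96c4e20a0e951f471d32dae103e83881"
    "&key=32f3b5e96be67b639c72b3614aaac541"
    "&value=3bf9901397de25f7fc8d5c31e2059f5d"
)

for name, value in challenge_params(verify_url).items():
    print(name, value)
```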

Next, let's look at how the request parameters are generated. At present they appear to be fixed.

"/a20be899_96a6_40b2_88ba_32f1f75f1552_yanzheng_ip.php?type=96c4e20a0e951f471d32dae103e83881&key=" + key + "&value=" + md5encode(stringtoHex(value))

Extract the code and run it locally, and it fails with errors.

Error: ReferenceError: window is not defined
Fix: window = {}

Error: TypeError: window.addEventListener is not a function
Fix: window.addEventListener = function (){}

Run it again and the result is output successfully.

// Stub the browser globals that the script expects.
window = {}
window.addEventListener = function () {}

// Code omitted here because it is too long.
// This is where the code copied from the response of the renji.js file goes.

var key = "32f3b5e96be67b639c72b3614aaac541";
var value = "b20f96e5878b0a47ff8626c8f757e35b";

// Despite its name, this concatenates the decimal char code of each character.
function stringtoHex(acSTR) {
    var val = "";
    for (var i = 0; i <= acSTR.length - 1; i++) {
        var str = acSTR.charAt(i);
        var code = str.charCodeAt();
        val += code;
    }
    return val;
}

// cx is expected to come from the omitted code above.
function md5encode(word) {
    return cx.MD5(word).toString();
}

var s = "/a20be899_96a6_40b2_88ba_32f1f75f1552_yanzheng_ip.php?type=96c4e20a0e951f471d32dae103e83881&key=" + key + "&value=" + md5encode(stringtoHex(value));
console.log("https://www.amec-inc.com" + s);

Once the flow is clear, the code can be reproduced locally.
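As a rough illustration, the two JS helpers can be ported to Python as follows. This is a sketch under one assumption: cx.MD5(word).toString() is taken to behave like a standard MD5 over the UTF-8 bytes, returning lowercase hex. That is not confirmed by the source. The Python function names are my own.

```python
import hashlib

BASE = "https://www.amec-inc.com"
PATH = "/a20be899_96a6_40b2_88ba_32f1f75f1552_yanzheng_ip.php"
TYPE = "96c4e20a0e951f471d32dae103e83881"


def string_to_hex(text):
    """Mirror stringtoHex: join the decimal char codes (not hex digits)."""
    # ord() matches charCodeAt() for the ASCII characters used here.
    return "".join(str(ord(ch)) for ch in text)


def md5_encode(word):
    # Assumption: cx.MD5 is a standard MD5 with lowercase hex output.
    return hashlib.md5(word.encode("utf-8")).hexdigest()


def build_verify_url(key, value):
    """Build the verification URL the same way the JS snippet does."""
    hashed = md5_encode(string_to_hex(value))
    return f"{BASE}{PATH}?type={TYPE}&key={key}&value={hashed}"


print(build_verify_url(
    "32f3b5e96be67b639c72b3614aaac541",
    "b20f96e5878b0a47ff8626c8f757e35b",
))
```

Comparing the printed URL with the one captured in the browser is a quick way to check whether the MD5 assumption holds.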

Note that key and value are currently fixed, so there is no need to extract them from the JS; you can request the endpoint directly to obtain the cookie.

It is hard to say when they might change, though, so keep dynamic analysis in mind.
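If the values do start changing, one option is to pull them out of the JS response on every run. The sketch below assumes the script declares them literally as var key = "..." and var value = "..." with 32-character hex strings, as in the local reproduction above. The real response may be laid out differently, so treat the pattern and the function name as hypothetical.

```python
import re

# Assumption: the values appear as `var <name> = "<32 hex chars>"` in the
# script text. Adjust the pattern if the real response differs.
VAR_TEMPLATE = r'var\s+{name}\s*=\s*"([0-9a-f]{{32}})"'


def extract_var(js_source, name):
    """Return the 32-char hex string assigned to `name`, or None if absent."""
    pattern = VAR_TEMPLATE.format(name=re.escape(name))
    match = re.search(pattern, js_source)
    return match.group(1) if match else None


sample_js = '''
var key = "32f3b5e96be67b639c72b3614aaac541";
var value = "b20f96e5878b0a47ff8626c8f757e35b";
'''

print(extract_var(sample_js, "key"))
print(extract_var(sample_js, "value"))
```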


Test code:

import requests
from lxml import etree

headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8,en-US;q=0.7",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
    "Host": "www.amec-inc.com",
    "Pragma": "no-cache",
    "Referer": "http://www.amec-inc.com/index/Lists/index/catid/97.html",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36"
}

# Target page, and the verification endpoint that returns the cookie.
url = 'https://www.amec-inc.com/index/Lists/index/catid/97.html'
curl = 'https://www.amec-inc.com/a20be899_96a6_40b2_88ba_32f1f75f1552_yanzheng_ip.php?type=96c4e20a0e951f471d32dae103e83881&key=32f3b5e96be67b639c72b3614aaac541&value=3bf9901397de25f7fc8d5c31e2059f5d'

# Use one session so the verification cookie carries over to the page request.
sess = requests.session()
sess.get(curl, headers=headers)
html = sess.get(url, headers=headers).text

e = etree.HTML(html)
print(e.xpath('//div[@class="mews_Eone_ul"]/ul/li/a'))


Origin blog.csdn.net/weixin_43582101/article/details/126385805