JS decryption + deobfuscation


Crawled website: https://www.aqistudy.cn/html/city_detail.html

For a better-formatted version, see the blog: https://www.cnblogs.com/bobo-zhang/p/11243138.html

Analysis:

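  • 1. Change the query conditions (city name + time range), click the query button, and capture the packet for the request that the click triggers. Clicking the button fires an ajax request, which loads the data matching the query conditions into the current page. (The data we want to crawl is the data returned by this ajax request.)

  • 2. Analyze the captured packet

    - Request URL: https://www.aqistudy.cn/apinew/aqistudyapi.php
    - Request method: POST
    - Request parameter: d, a value that changes on every request and is encrypted
    - Response data: encrypted ciphertext
    - Question: the packet returns ciphertext, so why does the page display plaintext data?
    - Reason: after receiving the ciphertext, the front end decrypts it with a specific JS function and then displays the plaintext on the page.
    - Next steps:
        - First handle the dynamically changing request parameter. Once it can be generated on demand, we can send the request with it and capture the encrypted response.
        - Then find the matching decryption function and use it to decrypt the captured ciphertext.
        - [Key point] Find the ajax request code that runs when the query button is clicked. That code reveals both how the dynamic request parameter is generated and how the encrypted response is decrypted.

  • 3. Find the code behind the ajax request and analyze it to learn how parameter d is generated and how the encrypted response is decrypted

    - Use Firefox to locate the click event bound to the query button.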
    
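    - Look for the ajax request code in the implementation of the getData function
        - The function body contains no ajax code, but it calls two other functions
            - getAQIData(); getWeatherData(); the ajax code must be inside the implementations of these two functions
            - type == 'HOUR': the query time is in units of hours
    - Analyze getAQIData() and getWeatherData() to find the ajax code
        - No ajax request code is found there either
        - Another function call turns up: getServerData(method, param, func, 0.5)
            - method = 'GETCITYWEATHER' or 'GETDETAIL'
            - param = {city, type, startTime, endTime}: the query conditions
    - Analyze getServerData to find the ajax code:
        - Do a global search using the packet-capture tool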
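
The call chain traced above can be summarized in a short Python sketch that builds the arguments getData eventually hands to getServerData. The helper name `build_query_calls` is hypothetical, and pairing 'GETDETAIL' with getAQIData and 'GETCITYWEATHER' with getWeatherData is an assumption; the analysis only establishes that these two method values are used.

```python
from typing import Dict, List, Tuple


def build_query_calls(city: str, type_: str, start_time: str,
                      end_time: str) -> List[Tuple[str, Dict[str, str]]]:
    """Return the (method, param) pairs passed down to getServerData."""
    param = {
        "city": city,
        "type": type_,            # 'HOUR' means the query is hourly
        "startTime": start_time,
        "endTime": end_time,
    }
    # Assumed pairing: getAQIData -> 'GETDETAIL', getWeatherData -> 'GETCITYWEATHER'.
    return [("GETDETAIL", dict(param)), ("GETCITYWEATHER", dict(param))]


calls = build_query_calls("北京", "HOUR",
                          "2018-01-25 00:00:00", "2018-01-25 23:00:00")
for method, param in calls:
    print(method, sorted(param))
```

Both calls share the same query conditions; only the method string differs.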
    

    Deobfuscating the implementation of getServerData

    • JS obfuscation: the core JS code is encrypted
    • JS deobfuscation: decrypting the obfuscated JS code
      • Brute-force deobfuscation: https://www.bm8.com.cn/jsConfusion/
      • After deobfuscation, the ajax code is finally visible
      • Conclusions of the analysis
        • data: the encrypted response data
          • decodeData(data) decrypts the encrypted response data
          • Parameter data: the encrypted response data
        • param: the dynamically changing, encrypted request parameter
          • getParam(method, object) returns the dynamically changing request parameter
            • Parameter method: method = 'GETCITYWEATHER' or 'GETDETAIL'
            • Parameter object: {city, type, startTime, endTime}, the query conditions
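
The round trip described in these conclusions can be sketched in Python as follows. This is only an illustration of the flow (encrypt the request, POST it as d, decrypt the response). The `fake_*` functions are hypothetical stand-ins that use Base64 in place of the site's real algorithms, which are not reproduced here, and no network request is made.

```python
import base64
import json
from typing import Callable, Dict


def get_server_data(method: str, obj: Dict[str, str],
                    get_param: Callable[[str, Dict[str, str]], str],
                    post: Callable[[str], str],
                    decode_data: Callable[[str], str]) -> str:
    """Mirror the flow: build d, send it, decrypt the encrypted response."""
    d = get_param(method, obj)      # dynamic, encrypted request parameter
    encrypted = post(d)             # the server replies with ciphertext
    return decode_data(encrypted)   # plaintext shown on the page


# Hypothetical stand-ins: Base64 plays the role of the real cipher.
def fake_get_param(method: str, obj: Dict[str, str]) -> str:
    payload = json.dumps({"method": method, "object": obj})
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


def fake_post(d: str) -> str:
    # Pretend server: reads the request and answers with an "encrypted" body.
    request = json.loads(base64.b64decode(d).decode("utf-8"))
    reply = json.dumps({"echo": request["method"]})
    return base64.b64encode(reply.encode("utf-8")).decode("ascii")


def fake_decode_data(data: str) -> str:
    return base64.b64decode(data).decode("utf-8")


result = get_server_data("GETDETAIL", {"city": "北京", "type": "HOUR"},
                         fake_get_param, fake_post, fake_decode_data)
print(result)
```

Passing the three steps in as callables keeps the flow separate from the cipher, so the stand-ins can later be swapped for calls into the site's own JS functions.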

    JS reverse engineering

    • Now we only need to call two JS functions (decodeData, getParam) to get the results. The question is how to call a JS function from a Python program.
    • JS reverse engineering: calling JS functions from Python
      • Method 1:
        • Manually rewrite the JS functions as Python functions
      • Method 2:
        • Use an existing module to run the JS automatically (recommended)
        • The PyExecJS library simulates JavaScript execution: use it to obtain the dynamically encrypted request parameter, then pass the encrypted response data to decodeData for decryption.
          • pip install PyExecJS
          • Install a Node.js environment on this machine
# Simulate running the decodeData JS function to decrypt the encrypted response data
import execjs
import requests

node = execjs.get()  # get the default JS runtime

# Query parameters
method = 'GETCITYWEATHER'
city = '北京'
query_type = 'HOUR'
start_time = '2018-01-25 00:00:00'
end_time = '2018-01-25 23:00:00'

# Compile the JavaScript
file = 'jsCode.js'  # JS source file; the JS functions being simulated must be in this file
with open(file, encoding='utf-8') as f:
    ctx = node.compile(f.read())

# Build the request parameter
js = 'getPostParamCode("{0}", "{1}", "{2}", "{3}", "{4}")'.format(method, city, query_type, start_time, end_time)
params = ctx.eval(js)  # request parameter d

# Send the POST request
url = 'https://www.aqistudy.cn/apinew/aqistudyapi.php'
response_text = requests.post(url, data={'d': params}).text  # encrypted response data

# Decrypt the encrypted response data
js = 'decodeData("{0}")'.format(response_text)
decrypted_data = ctx.eval(js)  # the decrypted plaintext data
print(decrypted_data)
# Running this reports an error: the page currently has no data. The decryption function only decrypts the original data in the page.
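
The script above builds its JS call strings with str.format, which breaks if an argument contains a double quote or a newline. A minimal sketch of a safer way to build the expression is shown below, assuming the arguments are plain strings or numbers. The helper name `js_call` is hypothetical; it relies on json.dumps producing a valid JavaScript literal. PyExecJS contexts also provide call(name, *args), which avoids building the string by hand.

```python
import json


def js_call(func_name: str, *args) -> str:
    """Build a JavaScript call expression with every argument safely quoted."""
    # json.dumps escapes quotes, backslashes, newlines and non-ASCII characters.
    return "{0}({1})".format(func_name, ", ".join(json.dumps(a) for a in args))


# A value containing a quote and a newline would break naive "{0}" formatting.
expr = js_call("decodeData", 'abc"def\nghi')
print(expr)
```

The resulting string can be passed to ctx.eval in place of the hand-formatted one.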

Mobile data crawling

Tool used: Fiddler

Reference blog: https://www.cnblogs.com/bobo-zhang/p/10068994.html

HTTPS reference: https://www.jianshu.com/p/4764825fb916


Origin www.cnblogs.com/zzsy/p/12688019.html