An entry-level JS decryption case: Youdao Translate JS decryption with Python

Foreword

Hello everyone, this is the Demon King~

Course Highlights:

  1. Systematic analysis of web page structure
  2. Dynamic data packet capture demo
  3. JSON data parsing
  4. JS decryption

Environment:

  • Python 3.8
  • PyCharm >>> the Node.js plugin needs to be installed
  • Node.js >>> the interpreter used to run JS code

Modules used:

  • requests >>> pip install requests
  • execjs >>> pip install pyexecjs

How to install third-party Python modules:

  1. Press Win + R, type cmd, and click OK; then type the install command pip install <module name> (for example, pip install requests) and press Enter
  2. Or click Terminal in PyCharm and enter the install command there
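
After installing, it can help to confirm that the interpreter actually sees the modules and that Node.js is reachable. The snippet below is only a minimal sketch using the standard library; the helper names `is_installed` and `find_node` are made up for this illustration.

```python
import importlib.util
import shutil


def is_installed(module_name: str) -> bool:
    """Return True if the current interpreter can find the module."""
    return importlib.util.find_spec(module_name) is not None


def find_node():
    """Return the path of the node executable, or None if it is not on PATH."""
    return shutil.which("node")


if __name__ == "__main__":
    # pyexecjs is imported under the name "execjs"
    for name in ("requests", "execjs"):
        status = "installed" if is_installed(name) else "missing"
        print(f"{name}: {status}")
    print("node:", find_node() or "not found on PATH")
```

If a module shows as missing, the usual cause is that pip installed it into a different interpreter than the one PyCharm is using.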

How to configure the Python interpreter in PyCharm:

  1. Select File >>> Settings >>> Project >>> Python Interpreter
  2. Click the gear icon and select Add
  3. Add the path of your Python installation

How to install plugins in PyCharm:

  1. Select File >>> Settings >>> Plugins
  2. Click Marketplace and enter the name of the plugin you want. For example: type translation for the translation plugin, or Chinese for the Chinese language plugin
  3. Select the matching plugin and click Install
  4. After the installation succeeds, PyCharm offers to restart; click OK, and the plugin takes effect after the restart

The basic process of the crawler case:

1. Data source analysis

  1. Determine what data we need from the site (here, the translation result)
  2. Capture packets with the browser developer tools and work out which request returns that data:
    which URL the POST request is sent to >>> a POST request needs to submit form data
    how the request parameters change: the sign parameter is different on every request
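
One way to see which parameters change is to copy the form data of two captured requests and compare them. This is a rough sketch, not part of the original script: the helper `changed_parameters` is invented for illustration, the sign values are placeholders rather than real hashes, and the `lts` values are assumed to be the salt without its last digit.

```python
def changed_parameters(first: dict, second: dict) -> dict:
    """Map each parameter whose value differs to its (first, second) values."""
    keys = sorted(set(first) | set(second))
    return {
        key: (first.get(key), second.get(key))
        for key in keys
        if first.get(key) != second.get(key)
    }


# Hypothetical captures: only the salt values come from the timestamp notes
# in the script; everything else is filled in for illustration.
capture_1 = {
    "i": "hello",
    "client": "fanyideskweb",
    "salt": "16473294570110",
    "lts": "1647329457011",
    "sign": "<sign from capture 1>",
}
capture_2 = {
    "i": "hello",
    "client": "fanyideskweb",
    "salt": "16473295059531",
    "lts": "1647329505953",
    "sign": "<sign from capture 2>",
}

if __name__ == "__main__":
    for name, (before, after) in changed_parameters(capture_1, capture_2).items():
        print(f"{name}: {before} -> {after}")
```

Parameters that stay the same can be copied as constants; the ones that change are the ones whose generation logic has to be tracked down in the JS.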

The first way is to extract the JS code:

  • JS decryption really comes down to extracting code: find where the sign parameter is generated and copy out that piece of JS
  • Run the extracted JS. It will almost certainly report errors at first; whatever is reported as not defined is what you need to add

Then call the JS code from Python to get the data it returns
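
Calling JS from Python with execjs might look roughly like the sketch below. The `makeSalt` function is a toy stand-in written for this example, not Youdao's actual code; in practice the JS source would be the code extracted from the site. It needs pyexecjs and a JS runtime such as Node.js.

```python
# A toy JS function standing in for the extracted code (hypothetical).
JS_SOURCE = """
function makeSalt(timestampMs, digit) {
    // String concatenation: "" + number + number
    return "" + timestampMs + digit;
}
"""


def make_salt_via_js(timestamp_ms: int, digit: int) -> str:
    """Compile the JS source with PyExecJS and call makeSalt from Python."""
    # Imported here so the example still loads where PyExecJS is not installed.
    import execjs

    context = execjs.compile(JS_SOURCE)
    return context.call("makeSalt", timestamp_ms, digit)


if __name__ == "__main__":
    print(make_salt_via_js(1647329457011, 0))
```

`execjs.compile` returns a context object, and `context.call` takes the JS function name followed by its arguments, which is the same pattern the commented-out lines in the script use.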

The second way is to rewrite the JS logic directly in Python
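
As a sketch of this second way, the sign logic used in the script further down can be wrapped in small functions. The function names `make_sign` and `make_params` are invented here; the client string, the secret string, and the salt formula are the ones that appear in the script.

```python
import hashlib
import time

CLIENT = "fanyideskweb"
SECRET = "Ygy_4c=r#e#4EX^NUGUc5"


def make_sign(word: str, salt: str) -> str:
    """Return the MD5 hex digest of client + word + salt + secret."""
    raw = CLIENT + word + salt + SECRET
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def make_params(word: str) -> dict:
    """Build the changing form fields, using one salt for both salt and sign."""
    salt = str(int(time.time() * 10000))
    return {"i": word, "salt": salt, "sign": make_sign(word, salt)}


if __name__ == "__main__":
    print(make_params("hello"))
```

The point to watch is that the salt hashed into sign and the salt submitted in the form must be the same value, so it is generated once and reused.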

2. Code implementation process: send request, get data, parse data, save data

  1. Send a request to the translation interface
  2. Get the data returned by the server
  3. Parse the data and extract the translation result we want
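
The parsing step can be sketched on its own with a sample payload. The sample below is made up: its shape is inferred only from the `['translateResult'][0][0]['tgt']` lookup in the script, and treating the outer list as paragraphs and the inner list as segments is an assumption. The helper name `extract_translation` is also hypothetical.

```python
import json

# Hypothetical response body, shaped after the lookup used in the script.
SAMPLE_RESPONSE = """
{
  "translateResult": [
    [
      {"tgt": "hello"}
    ]
  ]
}
"""


def extract_translation(payload: dict) -> str:
    """Join the 'tgt' field of every segment under 'translateResult'."""
    paragraphs = payload.get("translateResult")
    if not paragraphs:
        # Fail with the full payload so a rejected request is easy to spot.
        raise ValueError(f"no translateResult in response: {payload!r}")
    return "\n".join(
        "".join(segment["tgt"] for segment in paragraph)
        for paragraph in paragraphs
    )


if __name__ == "__main__":
    print(extract_translation(json.loads(SAMPLE_RESPONSE)))
```

Compared with indexing `[0][0]` directly, this keeps every segment of a longer input and gives a readable error when the key is absent.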

To do JS decryption, first work out which parameter is encrypted, then check whether the encrypted parameter is generated by JS code and how it is generated, and only then extract the code

This is the simplest case in JS reverse engineering, bar none

Code

I removed the URLs from the code so the post would pass review. If you want them, read the comments or message me privately~

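# Import the HTTP request module
import requests
# Import the pretty-printing module
import pprint
# Import execjs (runs JS code from Python)
import execjs
# Import the hashing module used for MD5
import hashlib  # built-in module
# Import the time module
import time

# Timestamp values noted during analysis:
# 1647329439.9328077
# 16473294570110
# 16473295059531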

while True:
    word = input('Enter the text you want to translate (enter 0 to quit): ')
    if word == '0':
        break
    # First way (commented out): run the extracted JS code through execjs
    # f = open('有道.js', encoding='utf-8')
    # js_code = f.read()
    # compile_code = execjs.compile(js_code)
    # json_data = compile_code.call('youdao', word)
    # Second way: rebuild the sign in Python.
    # Generate the salt once so the value hashed into sign matches the one submitted in the form.
    salt = int(time.time() * 10000)
    string = "fanyideskweb" + word + str(salt) + "Ygy_4c=r#e#4EX^NUGUc5"
    sign = hashlib.md5(string.encode('utf-8')).hexdigest()
    # print(json_data)
    url = ''  # Request URL (removed by the author; fill in the translation interface address)
    # Request headers disguise the Python code as a browser; without them the site recognizes the crawler and returns no data
    headers = {
        'Cookie': 'OUTFOX_SEARCH_USER_ID=1092484940@10.169.0.82; OUTFOX_SEARCH_USER_ID_NCOO=1350964471.5510483; JSESSIONID=aaa_jaG1Fa7rPdutNrm_x; ___rl__test__cookies=1647328160933',
        'Host': 'fanyi.youdao.com',
        'Origin': '',
        'Referer': '',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36',
    }
    # Form data: this POST request submits its parameters as form data
    data = {
        'i': word,
        'from': 'AUTO',
        'to': 'AUTO',
        'smartresult': 'dict',
        'client': 'fanyideskweb',
        'salt': salt,
        'sign': sign,
        'lts': salt // 10,  # millisecond timestamp, taken from the same clock reading as salt
        'bv': 'c2777327e4e29b7c4728f13e47bde9a5',
        'doctype': 'json',
        'version': '2.1',
        'keyfrom': 'fanyi.web',
        'action': 'FY_BY_REALTlME',
    }
    response = requests.post(url=url, data=data, headers=headers)  # <Response [200]>: status code 200 means the request succeeded
    # response.json() returns the JSON data as a dict; index into it by key
    translateResult = response.json()['translateResult'][0][0]['tgt']
    # pprint.pprint(response.json())
    print('Translation result: ', translateResult)

Epilogue

Well, that's the end of this article!

If you have suggestions or questions, feel free to comment or message me privately! Let's work hard together (ง •_•)ง

If you liked it, follow me, or like and comment on the article!

Origin blog.csdn.net/python56123/article/details/124100404