No.6 Youdao translation form forgery

1. Introduction

Website: http://fanyi.youdao.com/

Goal: simulate the web form submission to get real-time translation

Library used: requests

Difficulty factor: ✩✩✩

2. Tutorial

1. Introduction

Youdao, a well-known translation company in China, runs an online translation website. The goal of this crawler is to simulate the form submission made by the Youdao translation page, so that we can translate text in real time.

2. Website Analysis

website homepage

Try to translate

Capture the network request

Searching the captured requests shows that one of them contains the translation result we need, so reproducing this request gives us the desired effect.

Analyze the form

Compare the form parameters across several different requests

Comparing the requests shows which form fields change:

  • i : the text to translate
  • salt : timestamp with one extra random digit appended
  • sign : MD5 hash
  • lts : timestamp, one digit shorter than salt
  • bv : MD5 hash
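The comparison above can also be done mechanically. The following is a minimal sketch, assuming the two form payloads have been copied out of the browser's developer tools into Python dicts; the helper name `changed_fields` and both sample captures are hypothetical and shortened for illustration, not real traffic.

```python
def changed_fields(first: dict, second: dict) -> list:
    """Return the sorted names of fields whose values differ between two forms."""
    keys = set(first) | set(second)
    return sorted(k for k in keys if first.get(k) != second.get(k))


# Two made-up captures (hypothetical values, not real requests).
capture_a = {
    "i": "hello",
    "client": "fanyideskweb",
    "doctype": "json",
    "lts": "1599000000000",
    "salt": "15990000000003",
    "sign": "placeholder-sign-a",
}
capture_b = {
    "i": "world",
    "client": "fanyideskweb",
    "doctype": "json",
    "lts": "1599000005000",
    "salt": "15990000050008",
    "sign": "placeholder-sign-b",
}

print(changed_fields(capture_a, capture_b))  # ['i', 'lts', 'salt', 'sign']
```

Fields that stay identical across captures can be hard-coded in the crawler; only the ones reported here need to be regenerated for every request.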

Use breakpoints to find where the form data is generated

Press Ctrl + Shift + F to search all files and find the JS file that contains the keyword

After finding the file, press Ctrl + F to search for the keyword inside it, then set a breakpoint where the keyword appears.

Go back to the homepage and send the request again. The page pauses, which means the breakpoint we set has taken effect.

Back in the debugger, nothing has changed: no salt value has been generated by the time execution reaches the breakpoint. Put plainly, the breakpoint is in the wrong place, so the next step is to repeat the process at another match.

When we set the breakpoint at this position, we finally find what we are looking for. All the values we need are here, so this is the JS code we need to reverse:

Analyze JS code

var r = function(e) {
    var t = n.md5(navigator.appVersion)  // MD5 of navigator.appVersion (in Chrome, the User-Agent without its leading "Mozilla/")
        , r = "" + (new Date).getTime()  // current timestamp in milliseconds, as a string
        , i = r + parseInt(10 * Math.random(), 10);  // timestamp with one random digit appended
    return {
        ts: r,
        bv: t,
        salt: i,
        sign: n.md5("fanyideskweb" + e + i + "]BjuETDhU)zqSxf-=B#7m")  // MD5 of the concatenated string
    }
};

Python code implementation

import time
import random
from hashlib import md5

data = {
    "i": "爬虫",  # the text to translate ("crawler")
    "from": "AUTO",
    "to": "AUTO",
    "smartresult": "dict",
    "client": "fanyideskweb",
    "doctype": "json",
    "version": "2.1",
    "keyfrom": "fanyi.web",
    "action": "FY_BY_REALTlME"
}

# bv: MD5 of navigator.appVersion (the User-Agent without the leading "Mozilla/")
app_version = (
    "5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/84.0.4147.135 Safari/537.36"
)
data['bv'] = md5(app_version.encode()).hexdigest()

# lts: millisecond timestamp as a string, matching the JS
data['lts'] = str(int(time.time() * 1000))
# salt: lts with one random digit appended (string concatenation, not addition)
data['salt'] = data['lts'] + str(random.randint(0, 9))

# sign: MD5 of client + text + salt + the fixed string from the JS
sign = f"fanyideskweb{data['i']}{data['salt']}]BjuETDhU)zqSxf-=B#7m"
data['sign'] = md5(sign.encode()).hexdigest()
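Because salt and sign depend on both the current time and the text, they have to be regenerated for every translation. One way to do that is to wrap the steps in a function. This is a minimal sketch rather than the author's code: the name `forge_fields` and the optional `now_ms` and `digit` parameters are hypothetical and exist only so the output can be checked deterministically, while the two string constants are copied from the JS shown earlier.

```python
import random
import time
from hashlib import md5

# Constants copied from the JS function analyzed above.
CLIENT = "fanyideskweb"
SECRET = "]BjuETDhU)zqSxf-=B#7m"


def forge_fields(text, app_version, now_ms=None, digit=None):
    """Return the per-request fields i, lts, salt, bv and sign.

    now_ms and digit are hypothetical test hooks; leave them as None
    to use the real clock and a random digit, as the JS does.
    """
    lts = str(int(time.time() * 1000) if now_ms is None else now_ms)
    salt = lts + str(random.randint(0, 9) if digit is None else digit)
    return {
        "i": text,
        "lts": lts,
        "salt": salt,
        "bv": md5(app_version.encode()).hexdigest(),
        "sign": md5((CLIENT + text + salt + SECRET).encode()).hexdigest(),
    }


# Example: fresh fields for one piece of text (shortened appVersion for brevity).
fields = forge_fields("爬虫", "5.0 (Windows NT 10.0; Win64; x64)")
print(fields["lts"], fields["salt"])
```

The returned dict can be merged into the fixed part of the form with `data.update(fields)` before each request.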

After the form forgery is complete, we can request data. The data request part is relatively simple, so I won’t post a detailed tutorial here. The specific code can be viewed below.
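For readers who want a starting point, the request step might look roughly like the sketch below. Several things here are assumptions that are not confirmed by this tutorial: the endpoint URL must be copied from the request captured in the developer tools, the site may also expect cookies from a browser session, and the reply is assumed to carry the translation in a `translateResult` list of lists whose items have a `tgt` key. The function names `post_form` and `extract_translation` are hypothetical.

```python
def post_form(url, form, user_agent):
    """POST the forged form and return the decoded JSON reply.

    url is the endpoint copied from the captured request (not hard-coded
    here because this sketch does not assume a particular path).
    """
    import requests  # third-party; imported here so the parser below works without it

    headers = {
        "User-Agent": user_agent,
        "Referer": "http://fanyi.youdao.com/",
    }
    response = requests.post(url, data=form, headers=headers, timeout=10)
    response.raise_for_status()
    return response.json()


def extract_translation(reply):
    """Join the translated segments.

    Assumes reply["translateResult"] is a list of lines, each a list of
    dicts with a "tgt" key; adjust if the real reply differs.
    """
    return "".join(seg["tgt"] for line in reply["translateResult"] for seg in line)


# A made-up reply in the assumed shape, used only to exercise the parser.
sample_reply = {"translateResult": [[{"src": "爬虫", "tgt": "The crawler"}]]}
print(extract_translation(sample_reply))  # The crawler
```

Keeping the parsing separate from the network call makes it easy to adapt if the actual reply turns out to be structured differently.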

3. Complete code

Complete code

Origin blog.csdn.net/qq_43580193/article/details/108352647