Introduction to JS deobfuscation for crawlers: obfuscator (ob) obfuscation

Background:

I have written several articles about JS reverse engineering. Anyone who enjoys writing crawlers will sooner or later run into JS obfuscation. There are many tools in this space, for example UglifyJS, JScrambler, jsbeautifier.org, JSDetox and obfuscator.io; see https://blog.csdn.net/hefeng6500/article/details/80024810 for details. Since JS can be obfuscated, it can also be deobfuscated. This article is an entry-level walkthrough of deobfuscating code produced by obfuscator (ob).


Solution:


1. Start with the obfuscator website (obfuscator.io). Opening the page shows a default example, which is used below.

2. Obfuscate the site's default JS sample and look at the result. Variable and function names have all been replaced, and the first line declares a conspicuous string array. That array is a common hallmark of ob obfuscation.
var _0x30bb = ['log', 'Hello\x20World!'];

(function (_0x38d89d, _0x30bbb2) {
    var _0xae0a32 = function (_0x2e4e9d) {
        while (--_0x2e4e9d) {
            _0x38d89d['push'](_0x38d89d['shift']());
        }
    };
    _0xae0a32(++_0x30bbb2);
}(_0x30bb, 0x153));

var _0xae0a = function (_0x38d89d, _0x30bbb2) {
    _0x38d89d = _0x38d89d - 0x0;
    var _0xae0a32 = _0x30bb[_0x38d89d];
    return _0xae0a32;
};

function hi() {
    console[_0xae0a('0x1')](_0xae0a('0x0'));
}

hi();
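
The leading string array described in step 2 can also be checked for mechanically. The following is a minimal sketch only: find_string_array and both regular expressions are hypothetical helpers written for this sample, and they assume a simple declaration of the form var _0x... = [...]; containing plain string literals.

```python
import re

# Heuristic: a declaration such as  var _0x30bb = ['log', 'Hello\x20World!'];
ARRAY_DECL = re.compile(r"var\s+(_0x[0-9a-f]+)\s*=\s*\[(.*?)\];", re.S)
# A single- or double-quoted JS string literal, allowing backslash escapes.
STRING_LITERAL = re.compile(r"'(?:[^'\\]|\\.)*'|\"(?:[^\"\\]|\\.)*\"")


def find_string_array(js_source):
    """Return (array_name, raw_literals) for the first _0x-style array, or None."""
    match = ARRAY_DECL.search(js_source)
    if match is None:
        return None
    name, body = match.groups()
    return name, STRING_LITERAL.findall(body)


sample = r"var _0x30bb = ['log', 'Hello\x20World!'];"
result = find_string_array(sample)
print(result)
```

A hit is only a hint that the file may be ob-obfuscated; the literals are returned raw (quotes and escapes included) and still need decoding.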

3. First look at the overall structure of the obfuscated code. Everything revolves around the array variable _0x30bb. The self-invoking function at the top rotates the array first (note the push/shift calls), and _0xae0a is the decoding function that hi() calls to look strings up in the rotated array.
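
To make that structure concrete, here is a small Python model of the two pieces. It is a sketch, not the obfuscator's own code: rotate_array and lookup are illustrative names, and the rotation count is read off the sample above (the function receives 0x153, pre-increments it, and loops with while (--n), so the body runs 0x153 times).

```python
from collections import deque


def rotate_array(strings, count):
    """Mimic the self-invoking function: move the first element to the end, count times."""
    rotated = deque(strings)
    rotated.rotate(-count)  # a negative value rotates left, like push(shift())
    return list(rotated)


def lookup(rotated, hex_index):
    """Mimic _0xae0a: turn the '0x..' string into an integer and index the array."""
    return rotated[int(hex_index, 16)]


original = ['log', 'Hello World!']
rotated = rotate_array(original, 0x153)
print(rotated)                                        # ['Hello World!', 'log']
print(lookup(rotated, '0x1'), lookup(rotated, '0x0'))  # log Hello World!
```

With only two elements, an odd number of rotations simply swaps them, which is why '0x1' resolves to 'log'.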

4. Let's look at the rotated array first. Paste the array declaration and the rotation function into the browser's Console, run them, and then evaluate _0x30bb. The result is ["Hello World!", "log"].

5. We now have the rotated array and a ready-made decoding function. To restore the original strings, each _0xae0a call in hi() only needs to be evaluated once. Doing that by hand for every call is impractical, so the script below performs the replacement automatically:
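# -*- coding: utf-8 -*-
'''
The original source cannot be restored 100%; this only replaces
variable and method names to improve readability.
'''
import ast
import re
import execjs

func_js = """
// Replace the original array directly with the rotated array
var _0x30bb = ["Hello World!", "log"]

// Decoding function used by the obfuscated code
var _0xae0a = function (_0x38d89d, _0x30bbb2) {
    _0x38d89d = _0x38d89d - 0x0;
    var _0xae0a32 = _0x30bb[_0x38d89d];
    return _0xae0a32;
};
"""

# 1. Compile the decoding function in the JS runtime (e.g. Node.js)
js_func_name = '_0xae0a'  # name of the decoding function in the obfuscated JS
ctx = execjs.compile(func_js)
# 2. Use a regex to find every call that needs to be replaced
with open('source.js', encoding='utf-8') as f1:
    js = f1.read()

be_replaced_func_set = set(re.findall(re.escape(js_func_name) + r"\([\s\S]+?\)", js))

print(be_replaced_func_set)
# 3. Loop over the calls and replace each one
for be_replaced_func in be_replaced_func_set:
    args_tuple = re.findall(r"\(([\s\S]+?)\)", be_replaced_func)[0]
    # Take the first argument; literal_eval parses it without running arbitrary code
    args0 = ast.literal_eval(args_tuple.split(',')[0].strip())

    res = ctx.call(js_func_name, args0)  # call the decoder and get its return value
    js = js.replace(be_replaced_func, "'" + res + "'")
    print('{} replaced'.format(res))

with open('code.js', 'w', encoding='utf-8') as f2:  # write the rewritten JS
    f2.write(js)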
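
For a sample this simple, the same replacement can be done without a JS runtime. The following is a hedged sketch rather than a general tool: ROTATED is copied from step 4, inline_strings is a hypothetical helper, and the pattern assumes every decoder call takes a single quoted hex string such as '0x1'.

```python
import re

ROTATED = ["Hello World!", "log"]  # rotated array obtained in step 4
DECODER = "_0xae0a"                # decoder name in this sample

# Matches calls such as _0xae0a('0x1') and captures the hex index.
CALL = re.compile(re.escape(DECODER) + r"\(\s*['\"](0x[0-9a-fA-F]+)['\"]\s*\)")


def inline_strings(js_source):
    """Replace every decoder call with the quoted string it would return."""
    def replace(match):
        value = ROTATED[int(match.group(1), 16)]
        # Escape backslashes and single quotes so the output is a valid JS literal.
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"
    return CALL.sub(replace, js_source)


obfuscated = "console[_0xae0a('0x1')](_0xae0a('0x0'));"
print(inline_strings(obfuscated))  # console['log']('Hello World!');
```

Using re.sub with a callback handles all calls in one pass and leaves the decoder's own definition untouched, because the pattern requires a quoted hex argument immediately after the name.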


6. Run the script, then compare the decoded hi() function with the original source. They are nearly identical, and we are done.
// After decoding
function hi() {
    console['log']('Hello World!'); // equivalent to console.log('Hello World!')
}
hi();

// Original source, before obfuscation
function hi() {
  console.log("Hello World!");
}
hi();
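
The remaining difference is bracket access versus dot access. As an optional clean-up, a sketch like the following can rewrite obj['prop'] to obj.prop. to_dot_notation is a hypothetical helper; it assumes the property name is a plain identifier and is purely text-based, so it may also rewrite matching text that happens to sit inside a string literal.

```python
import re

# ['name'] or ["name"] directly after an identifier, ')' or ']',
# where name is a plain identifier (letters, digits, _ and $).
BRACKET_ACCESS = re.compile(
    r"(?<=[A-Za-z0-9_$\)\]])\[\s*(['\"])([A-Za-z_$][A-Za-z0-9_$]*)\1\s*\]"
)


def to_dot_notation(js_source):
    """Rewrite obj['prop'] as obj.prop when prop is a valid identifier name."""
    return BRACKET_ACCESS.sub(lambda m: "." + m.group(2), js_source)


decoded = "console['log']('Hello World!');"
print(to_dot_notation(decoded))  # console.log('Hello World!');
```

The lookbehind keeps array literals such as var a = ['log']; intact, since their opening bracket does not follow an identifier.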


PS: The next article will cover more complicated JS obfuscation. If you found this useful, please follow along. Thank you!


Origin blog.csdn.net/qq_26079939/article/details/108644855