Detailed Explanation of JS Reverse Complementary Environment Overpass

"Ruishu" is a big mountain on the reverse road, a wall that many JS reversers cannot avoid, and it is also a bright spot on the job-hopping resume. We must overcome it before the next job-hopping! ! Fortunately, there are many articles on the Internet that explain Ruishu, and they teach us step by step to analyze the Ruishu process and how to deduct the logic of Ruishu, in an attempt to teach us (manual dog head ) . However, there are few articles that explain in detail how to pass Ruishu by purely supplementing the environment. Today, it's here!

In order to let everyone thoroughly handle big brother Ruishu, this article covers the following five parts:

  1. The process logic of rs
  2. A brief look at deducting code to pass rs
  3. The process of supplementing the environment in detail
  4. Comparison of code deduction and supplementary environment
  5. Curve overtaking

It's a long one, so sit tight; off we go!
Note: this article takes a newcomer-friendly rs4 website, Online Real Estate, as its example.

1. The process logic of rs

When we do reverse engineering, the first step is always to work out which encrypted parameters need to be reversed, and then reverse them. The same is true for Ruishu.
So our first step is to clarify the reversing target:

  • Phenomenon: a website protected by rs requests page_url twice, and only the second request for page_url returns the correct page content;
  • Analysis: the second request to page_url carries cookie_s and cookie_t, and cookie_s comes from the response header set by the first request to page_url;
  • Conclusion: our goal is determined: cracking the generation logic of cookie_t;

Now we know that the encrypted parameter to reverse is cookie_t. So where does cookie_t come from? Let's first analyze the website's requests.

Analysis of the Ruishu website's request flow:

First request: request page_url; the response status code is 202, and cookie_s is set in the response header.
The response body is the HTML source code, which can be divided into four parts from top to bottom. Let me spoil their roles first:

  1. A meta tag whose content is long and dynamic (it changes with each request); it will be used when eval executes the second layer of JS code;
  2. An externally linked js file, whose content is generally fixed for the same page; the self-executing function below decrypts this file's content to generate the JS source required by eval, i.e. the second-layer VM code;
  3. A large, dynamic self-executing function (it changes on every request of the home page); it mainly decrypts the externally linked JS and adds some attributes such as $_ts to window, which will be used in the VM;
  4. The function calls in the two script tags at the end, which update the cookie and make it longer. We can ignore them here.

Second request: request the externally linked js; its content is generally fixed.

Third request: request page_url carrying cookie_s and cookie_t; it returns 200 and the request succeeds.
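The three requests above can be sketched as a hypothetical Node script. Note that `fetchPage` and `computeCookieT` are names I made up: `computeCookieT` stands in for the cookie_t generation logic that the rest of this article reverses.

```javascript
// Sketch of the rs handshake, assuming Node 18+ (built-in fetch).
// computeCookieT is a placeholder for the reversed cookie_t logic.
async function fetchPage(pageUrl, computeCookieT) {
    // 1st request: server answers 202 and sets cookie_s in the headers
    const first = await fetch(pageUrl);
    const setCookie = first.headers.get('set-cookie') || '';
    const cookieS = setCookie.split(';')[0];     // e.g. "FSSBBIl1UgzbN7N80S=..."
    const html = await first.text();             // layer-1 source: meta + self-executing fn

    // the 2nd request (external js) and the VM execution happen inside computeCookieT
    const cookieT = await computeCookieT(html);

    // 3rd request: send cookie_s + cookie_t, server answers 200 with the real content
    const second = await fetch(pageUrl, {
        headers: { cookie: `${cookieS}; ${cookieT}` },
    });
    return second.text();
}
```

This is only the transport skeleton; everything interesting lives in `computeCookieT`.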

So what exactly happens when we visit this website in a browser? What deserves our attention?
Let's first walk through what happens when the browser loads the page_url source:

  1. The browser loads the meta tag;
  2. The browser requests the externally linked js and executes its content;
  3. The browser executes the self-executing function in the page_url source. Internally, it decrypts the externally linked js into the tens of thousands of lines of js string required by eval and adds many attributes to window.$_ts; it then calls eval to enter the VM and execute the decrypted js, which generates the cookie; after eval finishes, the self-executing function continues;
  4. The browser executes the code in the script tags at the end, which updates cookie_t (you can ignore it);
  5. The browser executes the setTimeout and event-listener callbacks (you can ignore them).

The Ruishu execution flow is illustrated as follows:
Ruishu execution flow diagram

Here we need to focus on the location of the eval call (that is, the entry of the VM) and the location where the cookie is generated.

Note: when the browser's V8 engine calls eval to execute code, it opens a virtual machine (VM + number) to run the JS code.

We can hook eval directly, or search for .call, to locate the position where eval is called.
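A minimal eval hook for the console looks like the sketch below. It is illustrative, not the one used in any particular framework; the length threshold is an assumption for filtering out small eval calls, and in the browser `globalThis` is the same object as `window`.

```javascript
// Hook eval to break at the VM entry; run before the page's self-executing
// function. In the browser, globalThis === window.
(function () {
    'use strict';
    const rawEval = globalThis.eval;
    globalThis.eval = function (code) {
        // assumption: the decrypted second-layer VM JS is a very long string
        if (typeof code === 'string' && code.length > 10000) {
            debugger;   // pause here: `code` is the decrypted VM JS
        }
        return rawEval.call(globalThis, code);
    };
    // rs-style scripts may check toString() on hooked natives, so disguise it
    globalThis.eval.toString = () => 'function eval() { [native code] }';
})();
```

Once the debugger pauses, the call stack leads straight to the eval call site in the self-executing function.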
_$Ln is the decrypted js code string; what enters the VM is a self-executing function that contains the cookie-generation logic, and the cookie location can be found inside the VM code by hooking document.cookie or by searching.

The cookie hook code:

// hook the assignment of the target cookie
(function () {
    'use strict';
    var rawDescriptor = Object.getOwnPropertyDescriptor(Document.prototype, 'cookie');
    Object.defineProperty(document, 'cookie', {
        get: function () {
            return rawDescriptor.get.call(document);
        },
        set: function (cookie) {
            if (cookie.indexOf("FSSBBIl1UgzbN7N80T") != -1) {
                debugger;   // break when cookie_t is being written
            }
            return rawDescriptor.set.call(document, cookie);
        }
    });
})();

Well, the above is Ruishu's overall process logic. So how do we pass Ruishu by deducting the code that generates cookie_t?

2. A brief look at deducting code to pass Ruishu

Since Ruishu's page_url source changes dynamically on every request, the VM code changes dynamically too, so we first need to save a static copy of the code for easy debugging;
as shown in the figure, just override (fix) the page_url response.

With the response fixed this way, the page_url, the externally linked JS, the self-executing function, and the code in the VM are all fixed. So is the cookie_t generated each time by this fixed code also fixed?
Answer: no, because the generation of cookie_t uses random numbers, timestamps, and a value in localStorage (which is incremented on each call). If we also hook the random number and the timestamp, and run in a fresh incognito browser each time, then the final generated cookie_t is fixed.

At this point, we can start deducting this static code. If the cookie_t generated by running the deducted code in Node is consistent with the cookie_t generated by the browser running the static code, the deduction is successful.

Deducting the code requires deducting two parts: the page_url source part, and the part in the VM that generates cookie_t.
According to our earlier analysis, the code in the VM uses window.$_ts, so we must first ensure that the code deducted from the page_url source can generate window.$_ts normally by the time execution reaches eval.

We first deduct the self-executing function from page_url together with the externally linked JS content and the meta content, and then do some light environment supplementing according to the runtime errors, so that when the deducted code reaches eval, the window.$_ts and the decrypted VM JS code it produces are consistent with those generated by the local static code.
This means the page_url source part is deducted; next we can deduct the part of the VM that generates cookie_t.

From the cookie-location analysis above, we know that cookie_t is generated in the function _$bO, so that is the starting point of our deduction: pull the function out to the end of the file, execute it directly, and supplement whatever is missing. When we encounter code that uses a BOM or DOM API, we can use equivalent-logic replacement. For example, where the original logic reads the content of the meta tag and then deletes the tag, we can equivalently replace it with a direct return of meta_content.
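The meta-tag case from the paragraph above can be sketched as follows. `META_CONTENT` and `getMetaContent` are illustrative names, and the content string is a truncated placeholder, not a real value:

```javascript
// "Equivalent logic replacement": in the browser, the VM code roughly does
//   var node = document.getElementsByTagName('meta')[0];
//   var content = node.getAttribute('content');
//   node.parentNode.removeChild(node);
// In the deducted Node version we only need the value, so the whole DOM
// round-trip collapses to returning the saved string.
const META_CONTENT = 'qYnKTJPAw84QfF5jm0I2_...'; // truncated placeholder
function getMetaContent() {
    return META_CONTENT;   // equivalent to read-then-delete in the browser
}
```

The same collapse applies to any BOM/DOM call whose only observable effect on the cookie logic is a returned value.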

If, at the end, the cookie_t generated by Node is inconsistent with the one generated from the local static code, it means the code we deducted has not passed some environment checks. We can locate them from the differences between Node and browser execution: if at some point a value in Node differs from the value the browser gets when executing the static code, there is a difference in the logic before that point. Keep moving forward and repeat this process to find all the missed environment detections, and the deduction of the VM part is complete.

At this point, we have completed the deduction of the static rs code and succeeded. But the rs website's code is dynamic: window.$_ts and the VM js change on every request. Do we have to deduct every copy? No. In fact, we are now only one mapping away from real success.
Although the tens of thousands of lines of VM js change every time, only the variable names change; everything else stays the same.
The mapping is to match the fixed window.$_ts attribute names used in the VM JS we deducted with the dynamic window.$_ts attribute names of the fresh page.

So, during deduction, we need to note which places in the VM use external variables (that is, variables inside a function that come from other scopes). We need to track where these external variables come from: if a variable is defined inside the VM's own self-executing function, we don't care; if it comes from window.$_ts, we record it. These are what need to be mapped.
The calculation logic here uses window.$_ts._$IK, so when we do the mapping we need to pass this value in: window.$_ts = {_$IK: the corresponding dynamic attribute name}.
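A minimal mapping sketch, under one stated assumption: the $_ts properties are defined in the same order on every page, with only the random names changing, so position i in our saved static copy corresponds to position i in the fresh page. `buildTsMapping` is an illustrative helper name, not the author's tool:

```javascript
// Map the fixed $_ts names our deducted VM code reads (e.g. _$IK)
// to the values carried by the fresh page's dynamic names.
function buildTsMapping(fixedTs, dynamicTs, watchedKeys) {
    const fixedKeys = Object.keys(fixedTs);   // names from the saved static copy
    const dynKeys = Object.keys(dynamicTs);   // names from the fresh page
    const mapping = {};
    for (const key of watchedKeys) {          // e.g. ['_$IK']
        const i = fixedKeys.indexOf(key);
        if (i !== -1 && i < dynKeys.length) {
            // hand the fresh page's value to the fixed name our code reads
            mapping[key] = dynamicTs[dynKeys[i]];
        }
    }
    return mapping;   // pass this in as window.$_ts to the deducted VM code
}
```

If the positional assumption does not hold for a given rs version, the mapping has to be derived from the decryption logic instead.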

After the mapping is solved, the deducted code passes rs successfully.
Well, that's all for code deduction; next comes the focus of this article.

3. The process of supplementing the environment in detail

Comrades who don't know the principle of supplementing the environment can refer to my previous article: JS逆向之浏览器补环境详解 (JS reverse engineering: supplementing the browser environment in detail).
In fact, the principle of passing Ruishu purely by supplementing the environment is very simple. Look again at the Ruishu execution flow diagram: executing these dynamic JS on top of the browser environment generates a usable cookie_t. So as long as the environment we supplement is complete enough that, as far as these dynamic JS can tell, our supplemented environment === the browser environment, then executing these dynamic JS in our supplemented environment also generates a usable cookie_t, which we then extract through document.cookie.

Expressed in pseudocode:

// the supplemented environment header
window = this;
... a large number of environment headers omitted
// simulate the meta tag and its content
document.createElement('meta');
Meta$content = "{qYnKTJPAw84QfF5jm0I2_1IqhgTvRw8Y0yCBPxIVn6od8AeJE6CBz8ZSU6U...omitted";
// the fixed externally linked js
$_ts=window['$_ts'];if(!$_ts)$_ts={};$_ts.scj=[];$_ts['dfe1675']='þþ+...omitted';
// the dynamic page_url self-executing function
(function(){
    var _$CK=0,_$WI=[[9,3,6,0,4,1; ...ret = _$su.call(_$fr, _$WR); very long...omitted}}}}}}}})();

// get the cookie
function get_cookie(){
    return document.cookie;
}
// get the MmEwMD parameter
function get_mme(){
    XMLHttpRequest.prototype.open("GET", "http://redacted/", true);
    return XMLHttpRequest.prototype.uri;
}

This is the final file we will complete. Since Meta$content and the dynamic page_url self-executing function change on every request, we use regular expressions to extract these two strings from each page_url response and splice them into the file; then Python's PyExecJS calls get_cookie to get a usable cookie_t.
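A minimal Node sketch of this extract-and-splice step, assuming the static template marks the two dynamic slots with placeholders. `buildResultFile`, `__META_CONTENT__`, and `__SELF_EXEC__` are names I made up, and the regexes are illustrative; they would need tightening against the real page_url source:

```javascript
// Splice the two dynamic pieces of a fresh page_url response into the
// static result-file template from this section.
function buildResultFile(pageHtml, template) {
    // the dynamic meta content
    const meta = pageHtml.match(/<meta[^>]+content="([^"]+)"/)[1];
    // the dynamic self-executing function: the script block starting "(function("
    const selfExec = pageHtml.match(/<script[^>]*>(\s*\(function\([\s\S]*?\)\(\);?)\s*<\/script>/)[1];
    return template
        .replace('__META_CONTENT__', meta)
        .replace('__SELF_EXEC__', selfExec);
}
```

The resulting string is what gets handed to PyExecJS (or Node) so that get_cookie can be called.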

As mentioned in the code-deduction section, the generated cookie_t changes because random numbers and timestamps take part in the computation. We can hook the static JS code to fix the timestamp and the random number, so that the same static JS always generates the same cookie_t.

// fix the random number and the timestamp
Date.prototype.getTime = function(){
    return 1672931693;
};
Math.random = function(){
    return 0.5;
};

We can also use the finally generated cookie_t to judge whether our supplemented environment === the browser environment.
The principle is simple; what follows is how to practice it. We need to build an environment header complete enough that the cookie_t obtained by executing this static JS is consistent with the browser's.
Because supplementing the environment is systematic work with general routines, we can use the supplementary-environment framework mentioned in the previous article to supplement the environment systematically. What we have to do is keep improving this framework according to the problems revealed by its log output.

As shown in the figure: start the framework; the upper part is the environment header we supplemented, and the lower part is the code we deducted.

Continue debugging. When the Proxy interceptor catches a use of a BOM or DOM API, the debugger pauses. We check the call stack to see which line of code uses the browser environment, then compare the framework's simulated result with the browser's; if they are inconsistent, that piece of environment needs to be supplemented.
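The interception idea can be sketched with a plain Proxy. This is a minimal illustration of the logging-and-breaking mechanism, not the framework's 13-interceptor implementation; `watch` is a name I made up:

```javascript
// Wrap an environment object so every property access is logged, and
// break on properties we have not simulated yet.
function watch(obj, name) {
    return new Proxy(obj, {
        get(target, prop, receiver) {
            const value = Reflect.get(target, prop, receiver);
            console.log(`[env] get ${name}.${String(prop)} ->`, typeof value);
            if (value === undefined && !(prop in target)) {
                debugger;   // a detection point we have not supplemented yet
            }
            // recurse so deep paths like navigator.plugins[0] are logged too
            return (typeof value === 'object' && value !== null)
                ? watch(value, `${name}.${String(prop)}`)
                : value;
        },
        set(target, prop, value) {
            console.log(`[env] set ${name}.${String(prop)} =`, value);
            return Reflect.set(target, prop, value);
        },
    });
}
```

Wrapping, say, a simulated navigator with `watch(fakeNavigator, 'navigator')` then prints every property the VM code touches, which is exactly the to-do list for the next round of supplementing.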

Keep supplementing the environment back and forth like this until cookie_t can be generated at the end, then judge whether it is consistent with the one the browser generates locally. If not, use binary search to locate which piece of browser environment has not been supplemented yet, until you finally get a correct cookie_t. The final execution renderings are posted here.
This is part of the environment detection points printed at the end:

And here is the cookie_t finally obtained:

The same logic applies to supplementing the environment for the MmEwMD parameter. When the environment header is complete, execute the final result file from Python to get the following results:

4. Comparison of code deduction and supplementary environment

For JS reverse engineering, these are the two conventional, practical means, and each has its own advantages and disadvantages.

Whichever method we use, we first pull the encrypted JS code out of the website, and then choose: either keep deducting code and replace the logic that uses browser-environment APIs, or supplement the environment so that the encrypted JS code believes it is running in a browser.

  • Both code deduction and environment supplementing depend on JS proficiency: code deduction leans on js syntax and code logic, while environment supplementing leans on simulating the prototype chain and the BOM and DOM objects.
  • Code-deduction proficiency depends on reverse-engineering experience; environment-supplementing proficiency depends almost only on JS proficiency.
  • Code deduction requires debugging and tracking a lot of logic; for rs, if you cannot read through the obfuscation, you will sit there until you get hemorrhoids;
  • Code deduction needs to replace the environment-detection logic, so it also needs to know where the browser environment is used; the supplementary-environment framework can monitor the use of the browser environment, so it also serves as an auxiliary tool for code deduction.
  • Since Ruishu is dynamic, code deduction can only deduct one static copy, so all the dynamic attributes used in the VM must be found and mapped. The supplementary environment is universal: the more you supplement, the more websites you can kill.
  • Deducted code executes more efficiently than a supplemented environment; after all, the supplemented environment runs far more code than the deducted version. The gap can be narrowed by trimming unnecessary environment;
  • But the manual time spent deducting code is much higher than that of supplementing the environment.

All in all, code deduction focuses on js syntax and code logic; its proficiency depends on reverse-engineering experience; it differs for every website, is hard to reuse, and is labor-inefficient, but the resulting program executes efficiently.
Environment supplementing focuses on the prototype chain and browser-environment simulation; its proficiency depends almost only on mastery of JS fundamentals; the more different websites you supplement, the more websites you can kill; the manual efficiency is extremely high, but the program's execution efficiency is lower.

5. Curve overtaking

Passing Ruishu is the small goal of almost every novice reverser, and a common interview question. Through this article we have learned Ruishu's process and the cracking approach. We can try to implement a complete supplementary-environment framework from scratch and pass Ruishu purely black-box with the environment, but it takes a long time to develop, and much of the repetitive work is boring (copy, paste, compare, and so on).

Take the fast lane: the supplementary-environment framework of this article's version is systematically improved on the previous article's framework. It is now quite complete and has supplemented a lot of environments. If you want to save a lot of time and greatly improve efficiency to overtake on the curve, contact me on WeChat: dengshengfeng666 for paid source-code reference; fixed price 199. After payment I send the framework project source directly (you can get started with the latest illustrated readme), and you can ask me questions directly afterwards, or private-message me on CSDN.

Improvements in this version over the previous one:

  1. Optimized the 13 kinds of Proxy interceptors, pushed Proxy to its limit, and made the proxy recursive, supporting detection at a.b.c.d.e... depth;
  2. Improved the calling mechanism of the final result file.js, so it can be called by v8 and Node directly without modification;
  3. Added some BOM and DOM objects, such as XMLHttpRequest, XMLHttpRequestEventTarget, etc.;
  4. Completed all browser-environment methods used by rs, so rs can be passed black-box;
  5. Optimized the debugging method: break where you want to break, and skip the detections where you don't;
  6. Added a case of calling the result file directly from Python, via v8 and Node;
  7. Optimized the readme, introducing environment configuration and usage with pictures and text.

Curve overtaking starts with me.
The framework directory of this version:

Origin blog.csdn.net/qq_36291294/article/details/128600583