JS reverse engineering: analyzing the Ruishu 5th-generation environment detection on a website

1. Preface

  Reverse engineering is genuinely challenging. I regularly see all kinds of tricks shared in crawler and reverse-engineering groups, and in the crawling field more sharing is how we all improve. The encryption used by the website discussed in this issue is relatively difficult. While looking for a case, I was also studying the ideas and techniques of other experts. There is no shortcut to learning reverse engineering; we can only rely on our own analysis and accumulated experience.

Target site (Base64-encoded):

aHR0cHM6Ly9xaWthbi5jcXZpcC5jb20vUWlrYW4vSm91cm5hbC9TdW1tYXJ5P2tpbmQ9MSZnY2g9OTUyNDNYJmZyb209UWlrYW5fSm91cm5hbF9TdW1tYXJ5

2. Target Analysis

First, open the case website and click to turn the page. The main goal this time is to analyze the encryption of the request that loads more content when the page is turned. The search feature of this website is encrypted as well. The page-turning request URL analyzed below is what lets us obtain more data.

Many people online have already analyzed the search area of this website; some extracted the encryption code and reran it directly, and some used RPC.

Opening this URL in another browser returns no data. This is because the VIP journal website has not only encrypted this parameter but also applied special handling so that data cannot be obtained directly from the URL. Data is returned only when the URL is requested from within the VIP journal website itself, which protects its data to a certain extent and serves as an anti-crawling measure.

Fixed part:

https://*.com/Journal/RightArticle?

Encrypted part:

X2sCXRB4=0IsPQ0alqEtID3EakumIk.NZiLQ9yQeCdROMNRaLtSO0U74P8MpIcElFncx8UMr8l36GA6zSqfi_DEzmmnT2wbnwsgTuYbvql

After turning pages several times, we find that the fixed part never changes, while the encrypted part changes on every request; the parameter name X2sCXRB4 is also fixed.

Evidently the original URL is encrypted into a piece of ciphertext, which is then spliced onto the fixed part.
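To compare captured requests it can help to split each URL into the three pieces described above. The following is a minimal sketch; the helper name splitProtectedUrl is hypothetical, and the sample URL is simply the fixed part and encrypted part shown above joined together.

```javascript
// Split a protected URL into the fixed prefix, the parameter name, and the ciphertext.
function splitProtectedUrl(url) {
  const queryStart = url.indexOf("?");
  if (queryStart === -1) {
    return { fixedPart: url, paramName: null, cipherText: null };
  }
  const query = url.slice(queryStart + 1);
  const equalsAt = query.indexOf("=");
  return {
    fixedPart: url.slice(0, queryStart + 1),
    paramName: equalsAt === -1 ? query : query.slice(0, equalsAt),
    cipherText: equalsAt === -1 ? "" : query.slice(equalsAt + 1),
  };
}

// Fixed part + encrypted part, exactly as captured in the article.
const sampleUrl =
  "https://*.com/Journal/RightArticle?" +
  "X2sCXRB4=0IsPQ0alqEtID3EakumIk.NZiLQ9yQeCdROMNRaLtSO0U74P8MpIcElFncx8UMr8l36GA6zSqfi_DEzmmnT2wbnwsgTuYbvql";

const parts = splitProtectedUrl(sampleUrl);
console.log(parts.fixedPart);
console.log(parts.paramName);
console.log(parts.cipherText.length);
```

Running this over several captured URLs makes it easy to confirm that only cipherText differs between requests.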

The website implements page turning with Ajax. No matter how the Ajax request is wrapped (here by jQuery), underneath it is sent with XMLHttpRequest; that is what allows the request to reach the server, and the returned data to be rendered, without refreshing the page.

The process of triggering page turning to return data is as follows:

Click to turn the page -> some initialization -> encrypt and splice the complete URL -> send to the server -> server handles the request and responds -> render into the web page

The next step is to extract the code that encrypts and splices the URL and call it ourselves, so that we can build complete page-turning URLs and keep fetching more data.

On many websites the encryption code is obfuscated, just as valuable web data usually requires an account login (various simulated-login schemes will be explained in detail later), so a direct keyword search generally finds nothing.

The same holds for this website: the encryption code cannot be located directly and there are no searchable keywords, so we can only start from the request interface.

Using an XHR breakpoint:

Because every page turn goes through Journal/RightArticle, we set the XHR breakpoint on that string; clicking another page then pauses execution at the Ajax request.
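The same effect can be approximated in script by wrapping open, which is a different technique from the DevTools breakpoint used in this article. The sketch below is only an illustration: FakeXHR is a stand-in class so the code runs outside a browser, installOpenTrap is a hypothetical helper name, and the page=2 query is made up.

```javascript
// Stand-in for the browser's XMLHttpRequest so the sketch also runs under Node.
class FakeXHR {
  open(method, url) {
    this.method = method;
    this.url = url;
  }
  send() {}
}

// Wrap `open` so every request whose URL contains `keyword` is recorded with
// its call stack. In DevTools, a `debugger` statement inside the `if` would
// pause execution much like an XHR breakpoint does.
function installOpenTrap(XhrClass, keyword, hits) {
  const originalOpen = XhrClass.prototype.open;
  XhrClass.prototype.open = function (method, url) {
    if (String(url).indexOf(keyword) !== -1) {
      hits.push({ url: String(url), stack: new Error("open trap").stack });
    }
    return originalOpen.apply(this, arguments);
  };
  // Return a function that removes the trap again.
  return function restore() {
    XhrClass.prototype.open = originalOpen;
  };
}

const hits = [];
const restore = installOpenTrap(FakeXHR, "Journal/RightArticle", hits);

const xhr = new FakeXHR();
xhr.open("get", "https://*.com/Journal/RightArticle?page=2"); // recorded
xhr.open("get", "https://*.com/static/app.js"); // ignored
restore();

console.log(hits.length, hits[0].url);
```

In a real page you would pass XMLHttpRequest instead of FakeXHR; the recorded stack plays the role of the call stack shown in the Sources panel.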


Alternatively, you can use the Initiator column of the request in the Network panel to reach the same call stack.

In XMLHttpRequest, send is the last step in communicating with the server: it is the call that actually sends the request for the URL. So we first locate send in the call stack. The last function in the stack is the encryption function; however, since encryption functions are usually obfuscated, it is not recommended to study it first. Analyze the logic around send first instead.

From this request flow it follows that the encryption must be completed before send is called.

A typical XMLHttpRequest flow:

var xhr = new XMLHttpRequest();
// open(method, url, async): the URL is already fixed here, before send
xhr.open('get', 'http://*.com...', false);
xhr.send(data);

Locate the send call in the stack.

Because jquery.js is a third-party framework, encryption functions are generally not written in it. Since the encryption must happen before send, the likely place is the open function, which the website rewrites.

To analyze the open function, set a breakpoint on the line where open is called and click to turn the page again.

Hovering the mouse over open shows that it has been rewritten. Click the link in the tooltip to jump into the rewritten function.
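A quick scripted check for the same thing is to look at the function's source text: built-ins print as "[native code]". This is only a heuristic sketch (a site can spoof toString, and bound functions also print as native); looksNative and patchedOpen are hypothetical names.

```javascript
// Returns true when `fn` still looks like a built-in, judging by its source text.
function looksNative(fn) {
  return (
    typeof fn === "function" &&
    /\{\s*\[native code\]\s*\}/.test(Function.prototype.toString.call(fn))
  );
}

// A simplified replacement of the kind a site might install over a built-in.
function patchedOpen() {
  return undefined;
}

console.log(looksNative(Array.prototype.push)); // a real built-in
console.log(looksNative(patchedOpen)); // a user-defined function
```

In the browser console, the equivalent check would be looksNative(XMLHttpRequest.prototype.open).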

Step through the code until execution reaches n.apply(this, arguments).

This function is likely the entry point of the encryption. Step further down and inspect the values in the console.

This confirms the guess above: the following call is the encryption entry point.

_$8f(arguments[1])

Sending a request to the encrypted URL it produces gives the same result as the page's own request, which confirms that this is the URL encryption we want to reverse.

arguments[1] is obviously a plaintext string; after passing through the function _$8f it becomes an object, and the _$bt value in that object is the encrypted URL.

return _$6y[_$V8[36]](this, arguments)

_$6y in this line of code is the original open function.

From this we can conclude that open has been rewritten, and that the rewritten version still ends by calling the original open.
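The overall shape of such a rewritten open can be sketched as follows. This is an illustration under assumptions, not the site's code: FakeXHR stands in for XMLHttpRequest, fakeProtect stands in for _$8f and does no real encryption, FAKE_TOKEN and page=2 are made up, and it is assumed that _$V8[36] resolves to "apply".

```javascript
// Stand-in for the browser's XMLHttpRequest so the sketch runs under Node.
class FakeXHR {
  open(method, url) {
    this.openedUrl = url;
    return "native-open-called";
  }
}

// Placeholder for _$8f: takes the plaintext URL and returns an object whose
// _$bt field holds the rewritten URL. No real encryption happens here.
function fakeProtect(plainUrl) {
  return {
    _$ni: plainUrl, // the original URL
    _$bt: plainUrl.split("?")[0] + "?X2sCXRB4=" + "FAKE_TOKEN", // URL actually sent
  };
}

const nativeOpen = FakeXHR.prototype.open; // plays the role of _$6y
FakeXHR.prototype.open = function () {
  const wrapped = fakeProtect(arguments[1]);
  arguments[1] = wrapped._$bt; // swap in the rewritten URL
  return nativeOpen.apply(this, arguments); // still ends in the original open
};

const xhr = new FakeXHR();
const result = xhr.open("get", "https://*.com/Journal/RightArticle?page=2", false);
console.log(result, xhr.openedUrl);
```

The caller (jQuery in this case) never sees the difference: it passes a plaintext URL, and the wrapper silently replaces it before the original open runs.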

Analyzing the encryption function _$8f

function _$8f(_$H_, _$wH) {
    var _$Pv, _$pq = null;
    var _$8_ = _$H_;
    function _$Rv(_$tx, _$65) {
        var _$xW = [];
        var _$DY = '';
        var _$dF = _$u3(_$4i());
        _$xW = _$xW[_$V8[9]](_$65, _$tx, _$wH || 0, _$dF);
        var _$eq = _$4b(923, _$2l[186], true, _$xW);
        var _$Q6 = _$rW + _$eq;
        _$pq = _$Tt(_$XH(_$Q6), _$2l[27]);
        return _$hT[_$V8[5]](_$DY, _$ph, _$V8[26], _$Q6);
    }
    function _$_L(_$tx) {
        if (_$tx._$V8) {
            var _$xW = _$ht(_$ht(_$tx._$wa, _$V8[38])[0], _$V8[78])[1];
            if (_$xW[_$V8[3]](_$Z_) >= 0 && _$xW[_$V8[3]](_$ph) >= 0) {
                return true;
            }
        }
        return false;
    }
    function _$Hc() {
        try {
            if (typeof _$H_ !== _$V8[0])
                _$H_ += '';
            _$Pv = _$wa(_$H_);
            if (_$_L(_$Pv)) {
                return;
            }
            if (_$45) {
                _$H_ = _$MJ(_$H_, _$Pv);
            }
        } catch (_$xW) {
            return;
        }
        if (_$Pv === null || _$Pv._$uF > _$2l[40]) {
            _$4b(953, _$2l[186]);
            return;
        }
        if (_$_u(_$Pv)) {
            _$4b(953, _$2l[186]);
            return;
        }
        _$H_ = _$Pv._$iB + _$Pv._$fp;
        var _$DY = _$s3(_$Pv);
        var _$dF = _$DY ? _$V8[78] + _$DY : '';
        var _$eq = _$3v(_$zB(_$iL(_$Pv._$nu + _$dF)));
        var _$Q6 = 0;
        if (_$Pv._$9h) {
            _$Q6 |= 1;
        }
        if (_$bP & _$2l[49]) {
            _$Q6 |= _$2l[40];
        }
        _$H_ += _$V8[78] + _$Rv(_$Q6, _$eq, _$wH);
        if (_$DY.length > 0) {
            if (_$24 && _$24 <= _$2l[149]) {
                _$H_ = _$e7(_$H_);
            }
            if (!(_$bP & _$2l[9])) {
                _$DY = _$e7(_$DY);
            }
            _$DY = _$V8[66] + _$zL(_$DY, _$pq, _$2l[40]);
        }
        _$H_ += _$DY;
    }
    function _$xa(_$tx) {
        _$5H(_$2l[27], _$Ka());
        if (_$pq === null || _$lh(_$Pv) === false) {
            return _$tx;
        }
        if (typeof _$tx === _$V8[0] || typeof _$tx === _$V8[447] || typeof _$tx === _$V8[347]) {
            _$tx = '' + _$tx;
            if (_$tx.length <= _$pC) {
                _$tx = _$zL(_$tx, _$pq, _$2l[178]);
            }
        }
        return _$tx;
    }
    function _$BZ() {
        return _$pq !== null;
    }
    function _$dF(_$tx, _$65) {
        if ((_$tx === 'get' || _$tx === _$V8[106]) && _$BZ() && (_$4o & 1) && (_$bP & _$2l[49]) && _$Pv && _$Pv._$uF < _$2l[178] && _$P4(_$Pv)) {
            if (_$Pv._$9h) {
                this._$Tt = true;
            } else {
                if (_$65 === _$Sc || _$65 === null || _$65 === '') {
                    _$65 = _$V8[105];
                }
                if (_$65 === _$V8[105]) {
                    this._$Tt = true;
                    return _$65;
                }
            }
        }
        return '';
    }
    _$Hc();
    return {
        _$ni: _$8_,
        _$bt: _$H_,
        _$rI: _$xa,
        _$K$: _$dF,
        _$Wu: _$Qw,
        _$Tt: false
    };
}

Several functions and variables are defined inside _$8f, and _$Hc() is called at the end. Why is it called here? As seen above, _$8f returns a value that contains the encrypted URL, so the call to _$Hc() is most likely what generates the encrypted parameter.

Finally an object is returned. Its _$bt key holds the encrypted URL; in other words, by that point _$H_ is the encrypted URL.

Where does this _$H_ value come from? Within _$8f, only _$Hc() is actually called; the other functions are merely declared. So the URL in _$H_ must be generated inside _$Hc().

Analyzing the _$Hc() function

What actually generates the url in this function is the following line of code:

_$H_ += _$V8[78] + _$Rv(_$Q6, _$eq, _$wH)

_$V8[78] is a question mark, and the _$Rv function that follows is what actually generates the encrypted URL parameter.
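The _$V8[...] lookups follow a common obfuscation pattern: string literals and property names are moved into an array and fetched by index. The sketch below illustrates the idea with a hypothetical table; only index 78 (the question mark) is taken from this article, and the entry at index 9 is made up for the example.

```javascript
// Hypothetical string table. Only index 78 ("?") comes from the article;
// the entry at index 9 is invented for illustration.
const table = [];
table[9] = "concat";
table[78] = "?";

// Obfuscated style: property names and literals are fetched from the table.
function obfuscatedJoin(base, token) {
  return base[table[9]](table[78], token);
}

// Readable equivalent after resolving the lookups by hand.
function readableJoin(base, token) {
  return base.concat("?", token);
}

const a = obfuscatedJoin("/Journal/RightArticle", "X2sCXRB4=abc");
const b = readableJoin("/Journal/RightArticle", "X2sCXRB4=abc");
console.log(a === b, a);
```

When reading the real code, evaluating expressions such as _$V8[78] in the console at a breakpoint resolves each lookup the same way.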

So the function that produces the encrypted parameter, _$Rv, turns out to be the first function defined inside the large _$8f function above.

Analyzing the _$Rv function

Inside this function, _$4b is the real encryption function; the following line then splices the result together using concat:

return _$hT[_$V8[5]](_$DY, _$ph, _$V8[26], _$Q6)

_$ph is:

X2sCXRB4

This is exactly the parameter name that follows the question mark in the URL. From this we can also conclude that the function that actually generates the encrypted string is _$4b.
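One plausible reading of the return line is a concat call made through call. The sketch below rests on assumptions the article does not confirm: that _$hT is String.prototype.concat, that _$V8[5] is "call", and that _$V8[26] is "=". The token value is a shortened sample, and spliceParam is a hypothetical name.

```javascript
// Assumed resolution of the obfuscated names (not confirmed by the article):
//   _$hT     -> String.prototype.concat
//   _$V8[5]  -> "call"
//   _$V8[26] -> "="
const concat = String.prototype.concat;

function spliceParam(paramName, token) {
  // Same shape as _$hT[_$V8[5]](_$DY, _$ph, _$V8[26], _$Q6) with _$DY === ''
  return concat["call"]("", paramName, "=", token);
}

const query = spliceParam("X2sCXRB4", "0IsPQ0alqEtID3Eaku");
console.log(query);
```

Under these assumptions the line only joins the parameter name, an equals sign, and the token; all the real work has already happened inside _$4b.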

Stepping into _$4b reveals thousands of lines of code: it is anti-crawling code obfuscated with control-flow flattening.

The URL encryption parameter is generated inside this flattened code. A simplified view of _$4b is as follows:

var _$1V, _$IF, _$f4 = _$P2, _$je = _$oY[0];
function _$4b(_$xI, _$H_, _$wH, _$_M) {
    function _$bc() {}
    function _$Lp() {}
    function _$gd() {}
    function _$q9() {}
    function _$xQ() {}
    function _$Po() {}
    var _$Pv, _$Xl, _$Rv, _$BZ, _$_L, _$pq, _$gy, _$xp, _$Xt, _$xW, _$8_, _$cy, _$iO, _$Xi, _$bX, _$DY, _$dF, _$Q6, _$p4, _$aB, _$9P, _$eq, _$zH, _$Hc, _$ss, _$xq, _$Hu, _$ya, _$xa, _$hI;
    var _$LP, _$vc, _$Rb = _$xI, _$eS = _$oY[1];
    while (1) {
        _$vc = _$eS[_$Rb++];
        if (_$vc < 256) {
            ...
        }
    }
}

(A large number of if branches are omitted here.)
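To make the structure easier to read, here is a toy version of the same pattern: one loop reads an opcode from a shared array and dispatches through nested if branches, and the first argument selects where execution starts (compare _$4b(923, ...) and _$4b(953, ...) above). The opcode set and the tape contents are entirely made up; this shows the shape of the technique, not the site's logic.

```javascript
// Hypothetical opcode tape shared by two entry points.
// Opcodes: 0 = return top of stack, 1 = push next value,
//          2 = add top two values, 3 = multiply top two values.
const tape = [
  1, 2, 1, 3, 2, 0, // entry 0: computes 2 + 3
  1, 4, 1, 5, 3, 0, // entry 6: computes 4 * 5
];

function run(entry) {
  const stack = [];
  let pc = entry; // plays the role of _$Rb
  let op; // plays the role of _$vc
  while (1) {
    op = tape[pc++];
    if (op < 2) {
      if (op === 0) {
        return stack.pop();
      } else {
        stack.push(tape[pc++]);
      }
    } else {
      const right = stack.pop();
      const left = stack.pop();
      if (op === 2) {
        stack.push(left + right);
      } else {
        stack.push(left * right);
      }
    }
  }
}

console.log(run(0), run(6)); // prints "5 20"
```

Because every branch lives inside the same loop, the original order of statements is hidden; tracing which opcodes a given entry index visits is the usual first step when untangling such code.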

At this point the actual encryption code has been located; what remains is extracting the obfuscated code. The flattened control flow is the main difficulty, and time is limited, so I will cover it in a later post when I find the time.

If you use RPC, you do not need to extract the code at all: just find the encryption entry point and call it directly.
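The core of the RPC approach is a small dispatcher that runs inside the page and calls the entry point on request. The sketch below shows only that dispatcher, with no network code: in a browser it would typically sit inside a WebSocket onmessage handler. The message format, the names expose and handleRpcMessage, and the placeholder encryptUrl function are all assumptions; in the real page the exposed function would wrap the entry point found above, assuming it has been made reachable from the global scope.

```javascript
// Registry of functions exposed to remote callers (no inherited keys).
const exposed = Object.create(null);

function expose(name, fn) {
  exposed[name] = fn;
}

// Handle one JSON request of the form {"id": 1, "method": "...", "params": [...]}
// and return a JSON reply string.
function handleRpcMessage(raw) {
  let request;
  try {
    request = JSON.parse(raw);
  } catch (err) {
    return JSON.stringify({ id: null, error: "invalid JSON" });
  }
  if (request === null || typeof request !== "object") {
    return JSON.stringify({ id: null, error: "invalid request" });
  }
  const fn = exposed[request.method];
  if (typeof fn !== "function") {
    return JSON.stringify({ id: request.id, error: "unknown method" });
  }
  try {
    const result = fn.apply(null, request.params || []);
    return JSON.stringify({ id: request.id, result: result });
  } catch (err) {
    return JSON.stringify({ id: request.id, error: String(err.message) });
  }
}

// Placeholder for the real entry point; it performs no real encryption.
expose("encryptUrl", function (url) {
  return url.split("?")[0] + "?X2sCXRB4=FAKE_TOKEN";
});

const reply = handleRpcMessage(
  JSON.stringify({
    id: 1,
    method: "encryptUrl",
    params: ["https://*.com/Journal/RightArticle?page=2"],
  })
);
console.log(reply);
```

The crawler side then only sends plaintext URLs and reads back encrypted ones, so none of the obfuscated code has to be understood or ported.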

Interested readers can take a look at this website themselves. Note that the function and variable names shown in this article change dynamically: the code is generated dynamically (Chrome shows it as a VM script), and its variable and function names are regenerated every time the page is refreshed.

  Well, that's all for this post. Writing takes effort, so please leave a like before you go. Your support is what keeps me writing, and I hope to bring you more high-quality articles.


Origin blog.csdn.net/qiulin_wu/article/details/131837792