Crawler: analysis of website encryption requests

Quote:

Recently, because of some requirements, I needed to crawl data from certain websites and analyze user behavior. This post records the problems I ran into. It was my first time handling this kind of encrypted request, and it cost me a lot of time, so I am writing down the process. For friends with similar needs, it may also serve as a reference.

Analysis steps:

1. The first thing to look at is the general shape of the requested URL.

Here it is a GET request. The URI is composed of: website domain name + API address + analysis parameter, and the encrypted part is exactly this analysis parameter. The website has both GET and POST requests, but in either case an analysis parameter is added to distinguish an internal call from a request made by someone else. So as long as we crack the generation rule of this parameter, we can get the website's data (a sketch of the URL structure follows this list).
2. Since we cannot tell, from the website's js files alone, where the rules for splicing and encrypting the parameter live, we start from the first request. In developer mode, open the Network panel, select the first interface, and click Initiator; it jumps to the corresponding js. At this point we find the js is minified. After clicking the built-in pretty-print (format code) button of the Chrome developer tools, we can easily see where the encryption rules come from.
3. Now we can add an XHR breakpoint in the sidebar on the right, using a keyword from the URL we saw earlier, and then refresh the page. When that URL is requested again, execution stops at the breakpoint. On the left are the variables and their values in the code; on the right is the call stack, from which we can trace, step by step, where the parameter gets encrypted.
4. The functions in this website's js do not make much sense to me as a backend developer, and they are a bit hard to follow. Drawing on write-ups from others online, I started from the first frame of the call stack and checked the parameters passed at each level to see at which step the encryption happened.

This process takes a lot of time. There are many anonymous functions in the js file, and when debugging it is impossible to read them all; the key is to track how the parameters are passed. By comparing the two call stacks in the figure and checking in which of them the analysis parameter already exists, we can tell that the encryption is done in the other call stack. It turns out to be located in the get call, so it is easy to know which file does the encryption.
5. We put a breakpoint inside the get call we just found and step through it to watch the parameter values. The readability is honestly quite poor, but by selecting the parameter we want to inspect with the mouse, we can work out, step by step, an approximation of the encryption process. Deducing it also takes a lot of time.
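
To make the URL composition from step 1 concrete, here is a minimal sketch. The domain and the query-parameter name "analysis" are assumptions for illustration; only the API path and the idea of a generated analysis parameter come from the post.

import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class UrlSketch {
    public static void main(String[] args) {
        String domain = "https://www.example.com";       // hypothetical website domain
        String api = "/rank/indexPlus/brand_id/1";       // api address (from the Result section below)
        String analysis = "GENERATED_ANALYSIS_VALUE";    // placeholder for the encrypted analysis parameter
        // website domain name + api address + analysis parameter
        String url = domain + api + "?analysis=" + URLEncoder.encode(analysis, StandardCharsets.UTF_8);
        System.out.println(url);
    }
}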

Result:

Translated into Java code, it looks roughly like this:

private static String generatorAnalysis() {
    final String FLAG = "@#";
    // Example of the spliced string: @#/rank/indexPlus/brand_id/1@#52217050198@#1
    // Difference between the current timestamp and a fixed timestamp
    Long time = System.currentTimeMillis() - 1515125653845L;
    // The requested api
    String baseUrl = "/rank/indexPlus/brand_id/1";
    StringBuilder builder = new StringBuilder(FLAG);
    // Splicing rule: api + @# + timestamp difference + @#1
    builder.append(baseUrl).append(FLAG).append(time).append(FLAG).append(1);
    // Custom encryption; the encryption function is not given here,
    // you can deduce it yourself from the values at the breakpoints
    String encodeStr = diyEncode(builder.toString());
    // Finally, Base64-encode it (uses java.util.Base64)
    return new String(Base64.getEncoder().encode(encodeStr.getBytes()));
}

1. First compute the difference between the current 13-digit timestamp and a fixed timestamp
2. The second parameter is the requested api
3. Then splice a specific string together following the rules in the js, with @# as the separator
4. Encrypt this string with the custom function
5. Finally, Base64-encode the result and you are done (a usage sketch follows this list)
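
As a usage sketch only: one plausible way to send the GET request with the generated parameter, using Java 11's HttpClient. The domain and the query-parameter name are assumptions; generatorAnalysis() is the method shown above.

import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class RequestSketch {
    public static void main(String[] args) throws Exception {
        // Assumed: the analysis parameter is appended to the API URL as a query parameter.
        String url = "https://www.example.com/rank/indexPlus/brand_id/1"
                + "?analysis=" + URLEncoder.encode(generatorAnalysis(), StandardCharsets.UTF_8);
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }

    // Stub so the sketch compiles; replace with the generatorAnalysis() from the Result section above.
    private static String generatorAnalysis() {
        return "";
    }
}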

To sum up:

    Don't panic if you run into encrypted parameters in the future. As long as the website is not encrypting the parameter inside an app, there is still a high probability that the rule can be found. Analyze the structure of the URL, then reverse the encryption process from the referenced js files. This process may require tools such as Fiddler or Charles to replace the website's js (used to spy on the values of some parameters; that is not covered in this post, and I plan to write a dedicated introduction next time). Finally, translate the encryption rule into code. The time-consuming parts are locating the encryption code in the js files and translating it. Note that the encrypted part may be only one line, yet it is the most critical point. I am writing this down as a record, and I hope it also helps other friends who crawl similar websites for data, but don't use it for anything that breaks the rules.

Origin blog.csdn.net/sc9018181134/article/details/100631528