Copyright notice: original work takes effort. Plagiarism and reposting of this article are prohibited, and infringement will be pursued!
1. JS Reverse Goal - Will be at the end
Introduction to oklink:
oklink is a blockchain organization that provides block explorers for eth, btc, tron, polygon, bsc and other chains. Most of their transactions, addresses, tags and other data can be found on this website.
Suppose you have a task: given the addresses on each chain, crawl the tag data of every address, so that a colleague of yours (a machine learning/data mining engineer) can model the addresses and uncover hidden information, such as whether an address is a money laundering address or a hacker address.
JS reverse-engineering example address:
https://www.oklink.com/cn/eth/address/0xdac17f958d2ee523a2206206994597c13d831ec7
As mentioned above, we use an Ethereum address as the example, that is, a 42-character address starting with 0x.
JS reverse-engineering example page:
[Screenshot omitted. Note: the red box marks the data we want to crawl.]
Encrypted parameters:
[Screenshots omitted: the encrypted request header parameter and the encrypted response data.]
2. JS Reverse Analysis - I don't know the true face of Mount Lushan
There are generally several ways to capture data: locating page elements (CSS/XPath/regex, etc.), requesting the interface directly (ajax, fetch, etc.), and automated simulation (Selenium/Playwright, etc.). The tag data in the returned JSON is encrypted, so at this point we might try to capture it by locating page elements; but when we inspect the HTML of the page, we find that the tag data cannot be located there. Capturing the data through automated simulation is too inefficient: with 100,000 records we would have to simulate 100,000 clicks, and the larger the data set, the slower the crawl.
There are generally several ways to analyze the JS: global search (or local search), breakpoints (ajax breakpoints, DOM breakpoints, listener breakpoints, etc.), hooks (interception techniques), and so on. Global search is the most direct, easiest and preferred approach in reverse analysis. Let's first look at the API of this interface. As shown in the figure above, the API endpoint is:
https://www.oklink.com/api/explorer/v1/eth/address/0xdac17f958d2ee523a2206206994597c13d831ec7/more?t=1680606330657
It has one URL parameter, t, which is a 13-digit millisecond timestamp,
and the encrypted request header parameter is:
x-apikey: LWIzMWUtNDU0Ny05Mjk5LWI2ZDA3Yjc2MzFhYmEyYzkwM2NjfDI3OTE3MTc0NDE3NjQxNjU=
The parameter looks like base64 encoding, so let's decode it with an online tool:
Online encoding and decoding tool URL:
https://33tool.com/base64/
The string decoded from base64 is:
-b31e-4547-9299-b6d07b7631aba2c903cc|2791717441764165
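The same decoding can be done locally instead of through an online tool. The snippet below is a minimal sketch using Python's standard base64 module; the variable names are illustrative, and the header value is the one captured above.

```python
import base64

# The x-apikey header value captured from the browser request above
header_value = "LWIzMWUtNDU0Ny05Mjk5LWI2ZDA3Yjc2MzFhYmEyYzkwM2NjfDI3OTE3MTc0NDE3NjQxNjU="

# base64 is an encoding, not encryption, so no key is needed to reverse it
decoded = base64.b64decode(header_value).decode("utf-8")

# The decoded text has two parts separated by "|"
key_part, time_part = decoded.split("|")
print(key_part)   # -b31e-4547-9299-b6d07b7631aba2c903cc
print(time_part)  # 2791717441764165
```

The two parts already hint at the structure analyzed next: something key-like, followed by a 16-digit number that resembles a shifted timestamp.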
If we send the decoded string directly as the header value, the request fails, so the value must have gone through some other encryption or encoding logic before being base64-encoded. Use global search, as shown in the figure below: enter the parameter name x-apikey, run the global search, then format the JS code and view it:
Very good, it can be found directly. The variable names and method names have not been obfuscated, which is relatively friendly to crawlers.
Looking at the JS code, it is easy to see that when the value of x-apikey is set through the setRequestHeader() method, it comes from the getApiKey() method. Use Ctrl+F to search for the method name getApiKey.
In getApiKey(), the final return value is this.comb(e, t); this is similar to self in Python. The comb method takes two parameters: e is obtained from encryptApiKey(), while t is first set to the current time as a 13-digit timestamp and then passed through encryptTime(t), whose result becomes the second argument of comb.
Let's first look at how encryptApiKey() performs its encryption:
{
    key: "encryptApiKey",
    value: function() {
        var t = this.API_KEY
          , e = t.split("")
          , n = e.splice(0, 8);
        return t = e.concat(n).join("")
    }
}
The initial variable t is a hard-coded string, shown here character by character:
['a', '2', 'c', '9', '0', '3', 'c', 'c', '-', 'b', '3', ………………]
The variable e is t split into a list of single characters:
['a', '2', 'c', '9', '0', '3', 'c', 'c', ………………]
Then splice removes the first 8 elements from the list e (modifying e in place) and returns them as the variable n, which is also a list:
['a', '2', 'c', '9', '0', '3', 'c', 'c']
Finally, the shortened list e comes first and the list n is appended after it, and the result is joined with no separator to give the final t, which is the variable e in the method getApiKey():
-b31e-4547-9299-b6d07b7631aba2c903cc
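The steps above can be mirrored one for one in Python. This is only a sketch: the function name encrypt_api_key and the shift parameter are illustrative, and the key is the hard-coded API_KEY value from the site's JS.

```python
API_KEY = "a2c903cc-b31e-4547-9299-b6d07b7631ab"

def encrypt_api_key(api_key: str, shift: int = 8) -> str:
    """Move the first `shift` characters of the key to the end."""
    chars = list(api_key)         # JS: t.split("")
    head = chars[:shift]          # JS: e.splice(0, 8) returns the removed items
    del chars[:shift]             # ...and also removes them from e in place
    return "".join(chars + head)  # JS: e.concat(n).join("")

rotated = encrypt_api_key(API_KEY)
print(rotated)  # -b31e-4547-9299-b6d07b7631aba2c903cc
```

In other words, the "encryption" is just a rotation of the string by 8 characters, which is why plain slicing is enough to reproduce it.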
Let's take a look at the logic in the encryptTime() method:
{
    key: "encryptTime",
    value: function(t) {
        var e = (1 * t + a).toString().split("")
          , n = parseInt(10 * Math.random(), 10)
          , r = parseInt(10 * Math.random(), 10)
          , o = parseInt(10 * Math.random(), 10);
        return e.concat([n, r, o]).join("")
    }
}
From the JS code, the variable t here is a 13-digit integer timestamp. Analysis shows that the variable a is also a hard-coded integer, 1111111111111. The sum of the two is converted to a string and split into a list of single characters to form the variable e. The variables n, r and o are each a random integer from 0 to 9. Finally, e and the three random digits are joined with no separator to give the return value, which is the variable t in the method getApiKey().
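As a rough Python equivalent of encryptTime(), consider the sketch below. The names TIME_OFFSET and encrypt_time are illustrative, and the sample timestamp is chosen only so that the result lines up with the header value decoded earlier.

```python
import random

TIME_OFFSET = 1111111111111  # the hard-coded value of `a` found in the JS

def encrypt_time(timestamp_ms: int) -> str:
    """Shift a 13-digit millisecond timestamp and append three random digits."""
    shifted = str(timestamp_ms + TIME_OFFSET)
    # parseInt(10 * Math.random(), 10) yields 0..9, as does randint(0, 9)
    suffix = "".join(str(random.randint(0, 9)) for _ in range(3))
    return shifted + suffix

token = encrypt_time(1680606330653)
print(token[:13])  # 2791717441764 (the last three digits vary per call)
```

The three random digits carry no information; they only make every header value look different.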
Finally, let's look at the last method, comb():
{
    key: "comb",
    value: function(t, e) {
        var n = "".concat(t, "|").concat(e);
        return window.btoa(n)
    }
}
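For comparison, comb() can be mirrored in Python roughly as follows. This is a sketch: the parameter names are illustrative, and base64.b64encode stands in for window.btoa, which is equivalent here only because the input is plain ASCII. The sample arguments are the two halves of the decoded header value shown earlier.

```python
import base64

def comb(key_part: str, time_part: str) -> str:
    """Join the two parts with "|" and base64-encode the result."""
    joined = f"{key_part}|{time_part}"  # JS: "".concat(t, "|").concat(e)
    return base64.b64encode(joined.encode("utf-8")).decode("ascii")

header = comb("-b31e-4547-9299-b6d07b7631aba2c903cc", "2791717441764165")
print(header)
# LWIzMWUtNDU0Ny05Mjk5LWI2ZDA3Yjc2MzFhYmEyYzkwM2NjfDI3OTE3MTc0NDE3NjQxNjU=
```

The printed value matches the x-apikey header captured from the browser, which suggests the three methods account for the whole header.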
One thing to note about this.comb(e, t): when the variables e and t are passed into comb, the argument e becomes the parameter t and the argument t becomes the parameter e. The two are then joined with the separator "|", parameter t first and parameter e second, and the value finally returned is that string passed through the btoa() method, that is, base64 encoding.
Because the reverse engineering here is relatively simple, with no logic obfuscation or JS obfuscation, it can be done by writing Python code; there is no need to use execjs, node.js, etc. to execute the JS code. Based on the analysis above, the reverse-engineered code is as follows:
import requests
import time
import random
import base64


def get_apikey():
    API_KEY = "a2c903cc-b31e-4547-9299-b6d07b7631ab"
    # encryptApiKey(): move the first 8 characters to the end
    key1 = API_KEY[0:8]
    key2 = API_KEY[8:]
    new_key = key2 + key1
    # encryptTime(): shift the millisecond timestamp, then append 3 random digits
    current_time = int(time.time() * 1000)
    new_time = str(1 * current_time + 1111111111111)
    random1 = str(random.randint(0, 9))
    random2 = str(random.randint(0, 9))
    random3 = str(random.randint(0, 9))
    current_time = new_time + random1 + random2 + random3
    # comb(): join the two parts with "|" and base64-encode
    last_key = new_key + '|' + current_time
    x_apiKey = base64.b64encode(last_key.encode('utf-8'))
    return str(x_apiKey, encoding='utf-8')
Running the get_apikey() method gives:
LWIzMWUtNDU0Ny05Mjk5LWI2ZDA3Yjc2MzFhYmEyYzkwM2NjfDI3OTE3ODQ3MTU2NDMzMjk=
The reverse-engineered code is now written.
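One way to sanity-check the analysis offline is to run the steps backwards on a header value captured from the browser. The helper below is a hypothetical sketch (unpack_apikey, TIME_OFFSET and ROTATION are illustrative names, not part of the site's code); it decodes the base64, undoes the 8-character rotation, drops the three random digits and subtracts the offset.

```python
import base64

TIME_OFFSET = 1111111111111  # the hard-coded `a` in encryptTime()
ROTATION = 8                 # characters moved by encryptApiKey()

def unpack_apikey(x_apikey: str) -> tuple:
    """Recover (original API key, millisecond timestamp) from a header value."""
    decoded = base64.b64decode(x_apikey).decode("utf-8")
    rotated_key, time_token = decoded.split("|")
    # Undo the rotation: the last 8 characters were originally the first 8
    original_key = rotated_key[-ROTATION:] + rotated_key[:-ROTATION]
    # Drop the 3 random digits, then remove the offset
    timestamp_ms = int(time_token[:-3]) - TIME_OFFSET
    return original_key, timestamp_ms

captured = "LWIzMWUtNDU0Ny05Mjk5LWI2ZDA3Yjc2MzFhYmEyYzkwM2NjfDI3OTE3MTc0NDE3NjQxNjU="
key, ts = unpack_apikey(captured)
print(key)  # a2c903cc-b31e-4547-9299-b6d07b7631ab
print(ts)   # 1680606330653
```

The recovered timestamp is within a few milliseconds of the t=1680606330657 in the captured URL, which is consistent with the header and the URL parameter being generated moments apart in the browser.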
3. JS reverse testing - just because I am in this mountain
All that is left is testing. Both the API URL and the reverse-engineered code use a timestamp, so to keep the two consistent we pass the timestamp into get_apikey(now_time) as a parameter and then write some simple test code. The code is as follows:
# -*- coding: utf-8 -*-
import requests
import time
import random
import base64


def get_apikey(now_time):
    API_KEY = "a2c903cc-b31e-4547-9299-b6d07b7631ab"
    # encryptApiKey(): move the first 8 characters to the end
    key1 = API_KEY[0:8]
    key2 = API_KEY[8:]
    new_key = key2 + key1
    # encryptTime(): shift the timestamp, then append 3 random digits
    new_time = str(1 * now_time + 1111111111111)
    random1 = str(random.randint(0, 9))
    random2 = str(random.randint(0, 9))
    random3 = str(random.randint(0, 9))
    now_time = new_time + random1 + random2 + random3
    # comb(): join the two parts with "|" and base64-encode
    last_key = new_key + '|' + now_time
    x_apiKey = base64.b64encode(last_key.encode('utf-8'))
    return str(x_apiKey, encoding='utf-8')


# The same timestamp is used for the header and the URL parameter t
now_time = int(time.time()) * 1000
headers = {
    'x-apikey': get_apikey(now_time),
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36'
}
api = f'https://www.oklink.com/api/explorer/v1/eth/address/0xdac17f958d2ee523a2206206994597c13d831ec7/more?t={now_time}'
res = requests.get(url=api, headers=headers)
print(res.json())
Console output (formatted):
{
    'code': 0,
    'msg': '',
    'detailMsg': '',
    'data': {
        'entityTags': ['QayvIUbQGpJhs4QOJk7Ccw==: dlWG6vsFQhA+YAnbzdnYNg==. igTdUMG1sXqlL+ISnaIU8Q=='],
        'propertyTags': ['BYqzosCjwa3Hdj/jGp99Xg==', 'B3N0UYJLaM9LPazO98GU9Q==']
    }
}
The response now comes back normally, but the output shows that even though we have cracked the encrypted request header parameter, the tag data in the response is still encrypted. So do we now need to go on and crack the encryption logic of the tag data as well?
4. JS anti-reverse - Willows and flowers bright another village
Further analyzing the encryption logic of the tag data in the response is certainly feasible, but think about it: whatever encryption logic the JS applies, its initial string never changes:
API_KEY = "a2c903cc-b31e-4547-9299-b6d07b7631ab"
What happens if we send this string directly as x-apikey? @>_<@
Yes, we simply set x-apikey to API_KEY:
import time
import requests

now_time = int(time.time()) * 1000
headers = {
    # Send the hard-coded API_KEY as-is, with no rotation or base64 step
    'x-apikey': 'a2c903cc-b31e-4547-9299-b6d07b7631ab',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36'
}
api = f'https://www.oklink.com/api/explorer/v1/eth/address/0xdac17f958d2ee523a2206206994597c13d831ec7/more?t={now_time}'
res = requests.get(url=api, headers=headers)
print(res.json())
Console output (formatted):
{
    'code': 0,
    'msg': '',
    'detailMsg': '',
    'data': {
        'entityTags': ['DeFi: Tether. USDT Stablecoin'],
        'propertyTags': ['ERC20', 'Tether USDT']
    }
}
As we can see, this idea works, and it is the approach we want. Not only do we get the real, decrypted data, the data also includes the type of the blockchain address (DeFi, ERC20). This type data is not visible on the web page and can only be obtained by requesting the interface. So far, we have completed the JS reverse-engineering task by, in the end, not relying on the reversed JS logic at all.
So why does this work? Why can a request with the hard-coded string succeed, and why is the returned data unencrypted? The reason may be that this website deliberately uses an anti-reverse-engineering design: those who concentrate on reconstructing the JS encryption logic are led astray, overlook the simplest and most direct route, and end up outflanked.
This kind of anti-reverse-engineering logic is unusual. Although it is relatively simple, it deserves our attention and reflection.
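To reuse this for the original task (many addresses across several chains), the request can be factored into a small helper. The sketch below is hypothetical: build_tag_request and BASE_URL are illustrative names, the shortened User-Agent is a placeholder, and substituting another chain for eth in the path is an assumption that has only been observed for eth in this article. The function only builds the URL and headers, so it can be checked without touching the network.

```python
import time

API_KEY = "a2c903cc-b31e-4547-9299-b6d07b7631ab"
BASE_URL = "https://www.oklink.com/api/explorer/v1"

def build_tag_request(chain, address, now_ms=None):
    """Return (url, headers) for the tag endpoint of one address."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # 13-digit millisecond timestamp
    # Assumption: other chains follow the same path pattern as eth
    url = f"{BASE_URL}/{chain}/address/{address}/more?t={now_ms}"
    headers = {
        "x-apikey": API_KEY,          # the plain key, not the encoded form
        "User-Agent": "Mozilla/5.0",  # placeholder; use a full browser UA
    }
    return url, headers

url, headers = build_tag_request(
    "eth", "0xdac17f958d2ee523a2206206994597c13d831ec7", now_ms=1680606330657
)
print(url)
```

The returned pair can be passed to requests.get(url, headers=headers) as in the test code above; whether the site still accepts the plain key is something to verify at crawl time.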
5. oklink reverse-engineering complete code download
oklink complete reverse-engineering source code download
Disclaimer: This article is for study and research purposes only; do not use it for illegal purposes!
6. Author Info
Author: Xiaohong's fishing routine. Goal: make programming more interesting!
Focus on algorithms, web crawlers, websites, game development, data analysis, natural language processing, AI, etc. Looking forward to your follow; let's grow and code together!