Preparing a Bitcoin transaction data set

1. Data source

The data source is https://www.oklink.com/btc/tx-list. We process the transaction information listed there into a Bitcoin transaction data set.

2. Data fields and their meaning

After processing, each record consists of the following fields:

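Transaction hash: the Bitcoin transaction hash, e.g. 7f5a92db44be25414d5c322cc474bfcb94e538f5e511ff2776db6bf7c507bba0
Block height: the block that contains the transaction, e.g. 667300
Transaction timestamp: the time at which the block containing the transaction was produced (in seconds), e.g. 1611404276
Input address: the address of the paying party, e.g. 16M3qXrGkAYppA1aJug49JtYDxQkdKGLW8
Output address: the address of the receiving party, e.g. 1387LuWrcYBcGtBsADix6Yo1iLox6VM4m1
Transaction amount: the amount paid by a single sender to a single receiver after proportional conversion, e.g. 0.002489832589155737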

3. Data processing

A Bitcoin transaction can have several inputs and several outputs, and it does not record which sender paid which receiver, so one-to-one transfers cannot be read off directly. We therefore split the amounts proportionally: each receiver's amount is multiplied by each sender's share of the total amount sent, which yields one record per (sender, receiver) pair.

Formula: single transaction amount = amount received by a single receiver * (amount sent by a single sender / total amount sent by all senders)
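The formula can be illustrated with a small worked example. This is only a sketch: the function name `split_amounts` and the amounts are made up for illustration and do not come from the scraped data.

```python
def split_amounts(input_values, output_values):
    """Return rows[i][j] = output_values[j] * input_values[i] / sum(input_values)."""
    total_in = sum(input_values)
    return [
        [out_val * in_val / total_in for out_val in output_values]
        for in_val in input_values
    ]


# Hypothetical transaction: two senders pay 3.0 and 1.0, two receivers get 2.0 and 1.0.
rows = split_amounts([3.0, 1.0], [2.0, 1.0])
print(rows)  # [[1.5, 0.75], [0.5, 0.25]]
```

Each column still sums to the receiver's original amount (1.5 + 0.5 = 2.0), so the split only attributes the outputs to senders and does not change the totals.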

Special cases:
1. The input information is empty and only recipient information is present. This is the miner receiving the mining reward: the sender address is recorded as null.
2. The recipient information on the website may fail to parse, leaving no address: such entries are dropped and not stored.
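A minimal sketch of how these two cases could be combined with the proportional split. The function `pair_rows` and the `(address, value)` tuple layout are assumptions made for this illustration. Unlike the complete code at the end of the post, which divides by the `inputsValue` field returned by the API, this sketch sums only the inputs that have an address.

```python
def pair_rows(inputs, outputs):
    """Build (sender, receiver, amount) rows.

    inputs / outputs: lists of (address, value); an empty or None address
    stands for an entry that is missing or could not be parsed.
    """
    valid_inputs = [(addr, val) for addr, val in inputs if addr]
    valid_outputs = [(addr, val) for addr, val in outputs if addr]  # case 2: drop
    if not valid_inputs:
        # Case 1: no sender, so record the sender as the string "null".
        return [("null", out_addr, out_val) for out_addr, out_val in valid_outputs]
    total_in = sum(val for _, val in valid_inputs)
    return [
        (in_addr, out_addr, out_val * in_val / total_in)
        for in_addr, in_val in valid_inputs
        for out_addr, out_val in valid_outputs
    ]


# Hypothetical data: a reward-style transaction with one unparsable output.
reward_rows = pair_rows([], [("addrA", 6.25), ("", 0.0)])
# Hypothetical data: one sender, one valid receiver, one unparsable receiver.
normal_rows = pair_rows([("in1", 1.0)], [("out1", 0.5), (None, 0.1)])
```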

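4. Data storage

The data is stored in a text file, one record per line, with the fields separated by spaces.

The fields, in order, are:
transaction hash, block height, transaction timestamp, input address, output address, transaction amount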
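Reading a stored record back might look like the sketch below. It assumes the space-separated layout written by the code at the end of the post; the names `TxRow` and `parse_line` are hypothetical, and the sample line is assembled from the example values in the field list above rather than taken from a real output file.

```python
from typing import NamedTuple, Optional


class TxRow(NamedTuple):
    tx_hash: str
    block_height: int
    timestamp: int
    input_address: Optional[str]  # None when the file holds the literal "null"
    output_address: str
    amount: float


def parse_line(line: str) -> TxRow:
    """Split one space-separated record into typed fields."""
    tx_hash, height, ts, in_addr, out_addr, amount = line.split()
    return TxRow(
        tx_hash,
        int(height),
        int(ts),
        None if in_addr == "null" else in_addr,
        out_addr,
        float(amount),
    )


sample = (
    "7f5a92db44be25414d5c322cc474bfcb94e538f5e511ff2776db6bf7c507bba0 "
    "667300 1611404276 16M3qXrGkAYppA1aJug49JtYDxQkdKGLW8 "
    "1387LuWrcYBcGtBsADix6Yo1iLox6VM4m1 0.002489832589155737\n"
)
row = parse_line(sample)
```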


5. Code

Finally, the complete code is attached:

import requests
import time
import random
import base64


# Build the GET request parameters and return them as a dict
def get_params(limit, offset):
    # Current timestamp in milliseconds
    get_time = round(time.time() * 1000)
    # Assemble the GET request parameters
    params = {
        't': get_time,
        'limit': limit,
        'offset': offset
    }
    return params


# Build the GET request headers and return them as a dict
def get_headers():
    # Get the x-apiKey value, which changes on every request
    x_apikey = get_x_apikey()
    # Assemble the request headers
    headers = {
        'Accept': 'application/json',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'zh-CN,zh;q=0.9',
        'App-Type': 'web',
        'Connection': 'keep-alive',
        'devId': 'e1e4a5cd-2303-42f7-b6c8-fd19bb6b7e6f',
        'ftID': '52103795853138.011509f1cf101a3f80efe0c3e228e2084ac81.1010L8o0.FB62638978454009',
        'Host': 'www.oklink.com',
        'Referer': 'https://www.oklink.com/btc/tx-list',
        'Sec-Fetch-Dest': 'empty',
        'Sec-Fetch-Mode': 'cors',
        'Sec-Fetch-Site': 'same-origin',
        'User-Agent': 'Mozilla/5.0(Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36',
        'x-apiKey': x_apikey
    }
    return headers


# Build the time-dependent, encoded x-apiKey value
def get_x_apikey():
    # Fixed API_KEY string
    API_KEY = "a2c903cc-b31e-4547-9299-b6d07b7631ab"
    Key1 = API_KEY[0:8]
    Key2 = API_KEY[8:]
    # Swap the two parts of API_KEY
    new_Key = Key2 + Key1
    # Current time in milliseconds
    cur_time = round(time.time() * 1000)
    # Offset the timestamp by a fixed constant
    new_time = str(1 * cur_time + 1111111111111)
    # Generate three random digits (0-9)
    random1 = str(random.randint(0, 9))
    random2 = str(random.randint(0, 9))
    random3 = str(random.randint(0, 9))
    # Append the random digits to the time string
    cur_time = new_time + random1 + random2 + random3
    # Join the rearranged API_KEY and the time string
    this_Key = new_Key + '|' + cur_time
    # Encode the string to bytes
    n_k = this_Key.encode('utf-8')
    # Base64-encode the bytes
    x_apiKey = base64.b64encode(n_k)
    # Return the encoded x-apiKey as a string
    return str(x_apiKey, encoding='utf8')


# Append the data to a text file
def insert_txt(data):
    # Open (or create) the text file that stores the Bitcoin transaction data
    with open('./b—coin.txt', 'a', encoding='utf-8') as fp:
        # Iterate over every transaction
        for transaction in data:
            transaction_address = transaction['hash']   # transaction hash
            block_height = transaction['blockHeight']   # block height
            block_time = transaction['blocktime']  # transaction timestamp
            input_count = transaction['inputsCount']    # number of inputs
            input_count_error = 0  # inputs that are empty or failed to parse
            output_count = transaction['outputsCount']  # number of outputs
            output_count_error = 0  # outputs that are empty or failed to parse
            inputs_all_value = transaction['inputsValue']   # total input amount
            # outputs_all_value = transaction['outputsValue']  # total output amount
            # Amounts of the input accounts
            input_value_list = []
            # Addresses of the input accounts
            input_address_list = []
            # Walk the inputs, collecting each address and amount
            for in_value in transaction['inputs']:
                # Skip inputs that have no address
                if in_value['prevAddresses']:
                    input_value_list.append(in_value['prevValue'])
                    input_address_list.append(in_value['prevAddresses'])
                else:
                    input_count_error += 1
            # Amounts of the output accounts
            output_value_list = []
            # Addresses of the output accounts
            output_address_list = []
            # Walk the outputs, collecting each address and amount
            for out_value in transaction['outputs']:
                # Skip outputs that have no address
                if out_value['addresses']:
                    output_value_list.append(out_value['value'])
                    output_address_list.append(out_value['addresses'])
                else:
                    output_count_error += 1
            # Recompute the input and output counts without the skipped entries
            input_count = input_count - input_count_error
            output_count = output_count - output_count_error
            if input_count == 0:  # miner reward: no sender
                # One row per receiver, with the sender recorded as null
                for j in range(0, output_count):
                    tran_string = transaction_address + " " + str(block_height) + " " + str(block_time) + " null " + output_address_list[j][0] + " " + str(output_value_list[j]) + "\n"
                    fp.write(tran_string)
            else:  # standard case: one row per (sender, receiver) pair
                for i in range(0, input_count):
                    for j in range(0, output_count):
                        tran_string = transaction_address + " " + str(block_height) + " " + str(block_time) + " " + input_address_list[i][0] + " " + output_address_list[j][0] + " " + str(output_value_list[j]*input_value_list[i]/inputs_all_value) + "\n"
                        fp.write(tran_string)


# Main function
def main():
    set_url = 'https://www.oklink.com/api/explorer/v1/btc/transactionsNoRestrict'
    for i in range(0, 100):
        headers = get_headers()
        params = get_params(100, i*100)
        json_obj = requests.get(url=set_url, params=params, headers=headers).json()
        data = json_obj['data']['hits']
        # Write the data to the text file
        # print(data)
        insert_txt(data)
        print("Page " + str(i+1) + " done")


if __name__ == "__main__":
    main()
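As a side note on the x-apiKey header built in get_x_apikey(): Base64 is a reversible encoding, so the value can be decoded to inspect its structure. The sketch below rebuilds the value with a fixed timestamp and suffix so that it is deterministic, then reverses each step. The helpers `build_key` and `decode_key` are hypothetical names for this illustration, and nothing here is checked against the live site.

```python
import base64

OFFSET = 1111111111111  # constant added to the millisecond timestamp in the code above


def build_key(api_key: str, now_ms: int, suffix: str) -> str:
    """Same steps as get_x_apikey(), with the time and random digits passed in."""
    rotated = api_key[8:] + api_key[:8]
    stamp = str(now_ms + OFFSET) + suffix
    return base64.b64encode(f"{rotated}|{stamp}".encode("utf-8")).decode("utf-8")


def decode_key(header_value: str):
    """Reverse build_key: return the original API key and the timestamp in ms."""
    rotated, stamp = base64.b64decode(header_value).decode("utf-8").split("|")
    api_key = rotated[-8:] + rotated[:-8]
    now_ms = int(stamp[:-3]) - OFFSET  # drop the three random digits first
    return api_key, now_ms


key = "a2c903cc-b31e-4547-9299-b6d07b7631ab"
header = build_key(key, 1611404276000, "042")
decoded = decode_key(header)
```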


Origin blog.csdn.net/qq_47935193/article/details/113095283