Presto Ethereum connector

Unleash the power of Presto's interactive SQL querying on the Ethereum blockchain

Introduction

Presto is a powerful interactive query engine that can run SQL queries against almost any data source (MySQL, HDFS, local files, Kafka, and so on), as long as a connector for that source exists.

This is the Presto connector for Ethereum blockchain data. With it, you can start analyzing the Ethereum blockchain without having to learn the sophisticated JavaScript API.

Prerequisites

You need an Ethereum client to connect to. There are two options:

  1. Run Geth or Parity locally.
  2. Use Infura (an Ethereum client hosted in the cloud).

Note

Specify a block range wherever you can (for example WHERE block.block_number > x AND block.block_number < y, or WHERE transaction.tx_blocknumber > x AND transaction.tx_blocknumber < y, or WHERE erc20.erc20_blocknumber > x AND erc20.erc20_blocknumber < y). The block number is the default and only predicate that can be pushed down to narrow the data scan. A query without a block range makes Presto retrieve every block starting from the first one, which takes a very long time.
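
As a minimal sketch of this advice, the query below bounds the scan with a block range on the transaction table. The block numbers are arbitrary illustrative values, and the column names are taken from the transaction schema described later in this document.

```sql
-- Bounded scan: the block range lets the connector fetch only these blocks.
SELECT tx_hash, tx_from, tx_to, tx_value
FROM transaction
WHERE tx_blocknumber >= 4147340 AND tx_blocknumber <= 4147350;

-- By contrast, a query with no block range, such as
--   SELECT count(*) FROM transaction;
-- would have to walk the chain from the first block.
```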

Usage

  1. Install Presto. Follow the official Presto deployment instructions to create the relevant configuration files.
    At the end of this step, your Presto installation folder structure should look like this:

    ├── bin
    ├── lib
    ├── etc
    │   ├── config.properties
    │   ├── jvm.config
    │   └── node.properties
    ├── plugin
    
  2. Install the Presto CLI.

  3. Clone this repository and run mvn clean package to build the plugin. You will find the built plugin in the target folder.

  4. Load the plugin into Presto
    a. Create the Ethereum connector configuration in etc.
    $ mkdir -p etc/catalog && touch etc/catalog/ethereum.properties
    Paste the following into ethereum.properties:

    connector.name=ethereum
    
    # You can connect through Ethereum HTTP JSON RPC endpoint
    # IMPORTANT - for local testing start geth with rpcport
    # geth --rpc --rpcaddr "127.0.0.1" --rpcport "8545"
    ethereum.jsonrpc=http://localhost:8545/
    
    
    # Or you can connect through IPC socket
    # ethereum.ipc=/path/to/ipc_socketfile
    
    # Or you can connect to Infura
    # ethereum.infura=https://mainnet.infura.io/<your_token>
    

    b. Copy and unzip the built plugin into the presto plugin folder

    $ mkdir -p plugin/ethereum \
      && cp <path_to_this_repo>/target/presto-ethereum-*-plugin.tar.gz . \
      && tar xfz presto-ethereum-*-plugin.tar.gz -C plugin/ethereum --strip-components=1
    

    At the end of this step, your presto installation folder structure should look like this:

    ├── bin
    ├── lib
    ├── etc
    │   ├── catalog
    │   │   └── ethereum.properties
    │   ├── config.properties
    │   ├── jvm.config
    │   └── node.properties
    ├── plugin
    │   ├── ethereum
    │   │   └── <some jars>
    
  5. There you go. You can now start the Presto server and query it through presto-cli:

    $ bin/launcher start
    $ presto-cli --server localhost:8080 --catalog ethereum --schema default

Examples

Inspired by the article "An Analysis of the First 100000 Blocks", the following SQL queries reproduce part of what that article describes.

  • Block times of the first 50 blocks (in seconds)
SELECT b.bn, (b.block_timestamp - a.block_timestamp) AS delta
FROM
    (SELECT block_number AS bn, block_timestamp
    FROM block
    WHERE block_number>=1 AND block_number<=50) AS a
JOIN
    (SELECT (block_number-1) AS bn, block_timestamp
    FROM block
    WHERE block_number>=2 AND block_number<=51) AS b
ON a.bn=b.bn
ORDER BY b.bn;
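
The same deltas can likely be computed without a self-join by using Presto's lead window function. This is a sketch of an alternative technique, not the query from the original analysis; like the query above, it assumes block_timestamp is expressed in seconds.

```sql
-- For each block n, subtract its timestamp from that of block n+1.
-- Block 51 is read only so that block 50 has a successor; its own
-- delta is NULL and is filtered out.
SELECT bn, delta
FROM
    (SELECT block_number AS bn,
            lead(block_timestamp) OVER (ORDER BY block_number) - block_timestamp AS delta
    FROM block
    WHERE block_number>=1 AND block_number<=51) AS t
WHERE delta IS NOT NULL
ORDER BY bn;
```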

  • Average block time (every 200th block from genesis to block 10000)

WITH
X AS (SELECT b.bn, (b.block_timestamp - a.block_timestamp) AS delta
        FROM
            (SELECT block_number AS bn, block_timestamp
            FROM block
            WHERE block_number>=1 AND block_number<=10000) AS a
        JOIN
            (SELECT (block_number-1) AS bn, block_timestamp
            FROM block
            WHERE block_number>=2 AND block_number<=10001) AS b
        ON a.bn=b.bn
        ORDER BY b.bn)
SELECT min(bn) AS chunkStart, avg(delta)
FROM
    (SELECT ntile(10000/200) OVER (ORDER BY bn) AS chunk, * FROM X) AS T
GROUP BY chunk
ORDER BY chunkStart;

  • Biggest miners in the first 100,000 blocks (address, blocks, %)

SELECT block_miner, count(*) AS num, count(*)/100000.0 AS PERCENT
FROM block
WHERE block_number<=100000
GROUP BY block_miner
ORDER BY num DESC
LIMIT 15;

  • ERC20 token movement in the last 100 blocks
SELECT erc20_token, SUM(erc20_value)
FROM erc20
WHERE erc20_blocknumber >= 4147340 AND erc20_blocknumber <= 4147350
GROUP BY erc20_token;
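
Along the same lines, here is a sketch that counts transfers and distinct senders per token instead of summing values. It uses only columns of the erc20 table listed in the schema section, and the block range is the same illustrative one as above.

```sql
-- Rank tokens by number of transfers within the block range.
SELECT erc20_token,
       count(*) AS transfers,
       count(DISTINCT erc20_from) AS senders
FROM erc20
WHERE erc20_blocknumber >= 4147340 AND erc20_blocknumber <= 4147350
GROUP BY erc20_token
ORDER BY transfers DESC;
```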

Describe the database structure

SHOW TABLES;
    Table
-------------
 block
 erc20
 transaction

DESCRIBE block;
Column                 | Type               | Extra | Comment
-----------------------------------------------------------
block_number           | bigint             |       |
block_hash             | varchar(66)        |       |
block_parenthash       | varchar(66)        |       |
block_nonce            | varchar(18)        |       |
block_sha3uncles       | varchar(66)        |       |
block_logsbloom        | varchar(514)       |       |
block_transactionsroot | varchar(66)        |       |
block_stateroot        | varchar(66)        |       |
block_miner            | varchar(42)        |       |
block_difficulty       | bigint             |       |
block_totaldifficulty  | bigint             |       |
block_size             | integer            |       |
block_extradata        | varchar            |       |
block_gaslimit         | double             |       |
block_gasused          | double             |       |
block_timestamp        | bigint             |       |
block_transactions     | array(varchar(66)) |       |
block_uncles           | array(varchar(66)) |       |


DESCRIBE transaction;

Column              |    Type     | Extra | Comment
--------------------------------------------------
tx_hash             | varchar(66) |       |
tx_nonce            | bigint      |       |
tx_blockhash        | varchar(66) |       |
tx_blocknumber      | bigint      |       |
tx_transactionindex | integer     |       |
tx_from             | varchar(42) |       |
tx_to               | varchar(42) |       |
tx_value            | double      |       |
tx_gas              | double      |       |
tx_gasprice         | double      |       |
tx_input            | varchar     |       |


DESCRIBE erc20;
      Column       |    Type     | Extra | Comment
-------------------+-------------+-------+---------
 erc20_token       | varchar     |       |
 erc20_from        | varchar(42) |       |
 erc20_to          | varchar(42) |       |
 erc20_value       | double      |       |
 erc20_txhash      | varchar(66) |       |
 erc20_blocknumber | bigint      |       |
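
As a small illustration of how these columns can be combined, the sketch below counts the transactions in each block through the block_transactions array, using Presto's built-in cardinality function. The block range is an arbitrary example.

```sql
-- One row per block: number of transaction hashes in the array, plus gas used.
SELECT block_number,
       cardinality(block_transactions) AS tx_count,
       block_gasused
FROM block
WHERE block_number >= 4147340 AND block_number <= 4147350
ORDER BY block_number;
```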

Web3 Functions

In addition to the various built-in Presto functions, some web3 functions have been ported so that they can be called inline in SQL statements. The currently supported web3 functions are:

  1. fromWei
  2. toWei
  3. eth_gasPrice
  4. eth_blockNumber
  5. eth_getBalance
  6. eth_getTransactionCount

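A hedged sketch of how such functions might be called inline. This document does not give their signatures, so the calls below assume that the ported functions are named fromWei and eth_blockNumber and that they mirror the arguments of their web3.js counterparts (a value plus a unit name for fromWei, no arguments for eth_blockNumber). Check the connector source before relying on them.

```sql
-- Assumed signatures (not confirmed by this document):
--   eth_blockNumber()     -> latest block number
--   fromWei(value, unit)  -> value converted from wei to the given unit
SELECT eth_blockNumber();

-- Convert transaction values from wei to ether within a bounded block range.
SELECT tx_hash, fromWei(tx_value, 'ether') AS value_in_ether
FROM transaction
WHERE tx_blocknumber >= 4147340 AND tx_blocknumber <= 4147350;
```
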
Troubleshooting

  • You must use Python 2. With Python 3, the launcher fails with a syntax error:
-> bin/launcher start
  File "/your_path/presto-server-0.196/bin/launcher.py", line 38
    except OSError, e:
                  ^
SyntaxError: invalid syntax
  • You must use Java 8. With a different Java version, you may see the following error:
Unrecognized VM option 'ExitOnOutOfMemoryError'
Did you mean 'OnOutOfMemoryError=<value>'?
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

 Source address: https://github.com/xiaoyao1991/presto-ethereum

Origin blog.csdn.net/weixin_39842528/article/details/108382291