Ethereum source code analysis: ethdb

1. ethdb (Ethereum database) source code analysis and LevelDB

The ethdb package lives at go-ethereum/ethdb.

We start with the ethdb Database interface:

//Database interface location: go-ethereum/ethdb/database.go
type Database interface {
    Reader    // Reader: reads data from the database
    Writer    // Writer: writes data to the database
    Batcher   // Merges multiple operations into a single batch that is submitted to the
              // database in one call, reducing the number of I/O operations
    Iterator  // Methods for creating iterators over the database contents in binary-alphabetical key order
    Stater    // Methods for retrieving state from the key-value database and immutable raw data
    Compacter // Compacts the underlying data store for a given key range: it discards deleted
              // and overwritten versions and rearranges the data to reduce the cost of the
              // operations needed to access it
    Snapshotter // Creates a database snapshot based on the current state; the created snapshot
                // is not affected by subsequent mutations to the database
}
//All of the above are interfaces, so any ethdb database backend must implement every method in them.

Three underlying databases are supported: leveldb (a key-value store), memorydb (an in-memory database) and pebble (a key-value store). leveldb is the most commonly used and mature option, while memorydb suits scenarios that need high speed and do not require persistence.

Next we introduce the leveldb implementation underlying ethdb:

1.1 go-ethereum/leveldb package

Package location: go-ethereum/ethdb/leveldb

The actual underlying database is goleveldb (GitHub - syndtr/goleveldb: LevelDB key/value database in Go); that is, go-ethereum/ethdb/leveldb is a thin one-level wrapper around goleveldb. The Database structure in go-ethereum/ethdb/leveldb is as follows:

type Database struct {
    fn string      // filename for reporting
    db *leveldb.DB // LevelDB instance

    // The fields below are meters measuring database performance.
    compTimeMeter       metrics.Meter // Meter for measuring the total time spent in database compaction
    compReadMeter       metrics.Meter // Meter for measuring the data read during compaction
    compWriteMeter      metrics.Meter // Meter for measuring the data written during compaction
    writeDelayNMeter    metrics.Meter // Meter for measuring the write delay number due to database compaction
    writeDelayMeter     metrics.Meter // Meter for measuring the write delay duration due to database compaction
    diskSizeGauge       metrics.Gauge // Gauge for tracking the size of all the levels in the database
    diskReadMeter       metrics.Meter // Meter for measuring the effective amount of data read
    diskWriteMeter      metrics.Meter // Meter for measuring the effective amount of data written
    memCompGauge        metrics.Gauge // Gauge for tracking the number of memory compaction
    level0CompGauge     metrics.Gauge // Gauge for tracking the number of table compaction in level0
    nonlevel0CompGauge  metrics.Gauge // Gauge for tracking the number of table compaction in non0 level
    seekCompGauge       metrics.Gauge // Gauge for tracking the number of table compaction caused by read opt
    manualMemAllocGauge metrics.Gauge // Gauge to track the amount of memory that has been manually allocated (not a part of runtime/GC)

    levelsGauge [7]metrics.Gauge // Gauge for tracking the number of tables in levels

    quitLock sync.Mutex      // Mutex protecting the quit channel access
    quitChan chan chan error // Quit channel to stop the metrics collection before closing the database

    log log.Logger // Contextual logger tracking the database path
}

We start from the BenchmarkLevelDB function in go-ethereum/ethdb/leveldb/leveldb_test.go:

func BenchmarkLevelDB(b *testing.B) {
    dbtest.BenchDatabaseSuite(b, func() ethdb.KeyValueStore {
        // storage.NewMemStorage() backs goleveldb with in-memory storage,
        // so the benchmark does not touch disk.
        db, err := leveldb.Open(storage.NewMemStorage(), nil)
        if err != nil {
            b.Fatal(err)
        }
        return &Database{
            db: db,
        }
    })
}
1.2 leveldb features

  1. Basic operations: Put(key, value), Get(key), Delete(key); both key and value can be arbitrary byte arrays

  2. Data is sorted by key, and the sort order can be customized

  3. Batching: multiple changes can be applied in one atomic batch

  4. Forward and backward iteration

  5. Instantaneous snapshots that provide a consistent view

  6. Automatic data compaction

1.2.1 API

Create a database:
db, err := leveldb.OpenFile(file, options)

Read (Get) and write (Put):
dat, err := db.Get(key, nil) // the returned slice is a copy, so it is safe to modify
err := db.Put(key, value, nil)

Delete:
err := db.Delete(key, nil)

Batching is done through the leveldb.Batch type, analyzed below.
1.2.2 Source code analysis of leveldb.Batch (atomic batching)

File: github.com/syndtr/goleveldb/leveldb/batch.go

The batch creation helper provided by leveldb is:

func leveldb.MakeBatch(n int) *leveldb.Batch
//Pre-sizes the internal data slice for n records, which reduces re-allocation overhead.


The NewBatch method implemented in ethdb:
func (db *Database) NewBatch() ethdb.Batch {
    return &batch{
        db: db.db,
        b:  new(leveldb.Batch),
    }
}
The batch struct in ethdb; creating one binds a batch to a db:
type batch struct {
    db   *leveldb.DB
    b    *leveldb.Batch
    size int
}

How does leveldb implement batching (recording operations)?

A batch supports Put and Delete operations, so it must expose Put and Delete to callers, plus a Replay operation that executes the recorded operations.

1. The Batch struct:
type Batch struct {
    data  []byte       // slice holding the encoded batch operations
    index []batchIndex // records where each operation ends inside data

    // internalLen is the sum of the key/value pair lengths plus 8 bytes of
    // internal-key overhead per record.
    internalLen int
}

2. The batchIndex type:
type batchIndex struct {
    keyType            keyType
    keyPos, keyLen     int // position of an entry inside data: start position + length
    valuePos, valueLen int
}
batchIndex records, for each Put/Delete request stored in the batch, where its key and value live.
The keyType values:
type keyType uint

const (
    keyTypeDel = keyType(0)
    keyTypeVal = keyType(1)
)

3. Resetting a Batch:
func (b *Batch) Reset() {
    b.data = b.data[:0]
    b.index = b.index[:0]
    b.internalLen = 0
}

4. Batch length:
func (b *Batch) Len() int {
    return len(b.index)
}

5. Put and Delete:
func (b *Batch) Put(key, value []byte) 
func (b *Batch) Delete(key []byte)

Both of them delegate to the appendRec method.

The appendRec method:

import "encoding/binary"

func (b *Batch) appendRec(kt keyType, key, value []byte) {
    // binary.MaxVarintLen32 is the maximum length (5 bytes) of a
    // varint-encoded 32-bit integer.
    n := 1 + binary.MaxVarintLen32 + len(key)
    if kt == keyTypeVal {
        // A Put also needs room for the value length and the value itself.
        n += binary.MaxVarintLen32 + len(value)
    }
    b.grow(n) // make room
    index := batchIndex{keyType: kt}
    o := len(b.data)     // in leveldb both key and value are []byte
    data := b.data[:o+n] // extend len(data) from o to o+n
    data[o] = byte(kt)   // the first byte of each record encodes the operation type
    o++
    // binary.PutUvarint writes the uint64 key length as a varint into data
    o += binary.PutUvarint(data[o:], uint64(len(key)))
    index.keyPos = o         // start position of the key
    index.keyLen = len(key)  // length of the key
    o += copy(data[o:], key) // copy the key into data[o:]
    if kt == keyTypeVal {
        // For a Put we also store the value.
        o += binary.PutUvarint(data[o:], uint64(len(value)))
        index.valuePos = o
        index.valueLen = len(value)
        o += copy(data[o:], value)
    }
    b.data = data[:o]
    b.index = append(b.index, index)
    b.internalLen += index.keyLen + index.valueLen + 8
} 

 The grow method is called (recall that a slice is a triple of len, cap, and a pointer to the underlying array):
func (b *Batch) grow(n int) {
    o := len(b.data) // current length of data
    if cap(b.data)-o < n {
        div := 1
        if len(b.index) > batchGrowRec {
            div = len(b.index) / batchGrowRec
        }
        // Allocate extra headroom beyond the requested n bytes: the new
        // capacity is o + n + o/div, where div = len(b.index)/batchGrowRec
        // once the batch holds more than batchGrowRec records.
        ndata := make([]byte, o, o+n+o/div)
        copy(ndata, b.data)
        b.data = ndata
    }
}

How does leveldb replay (execute) a batch?

//Replaying walks the batch's index array; each element records one operation.
func (b *Batch) Replay(r BatchReplay) error {
    for _, index := range b.index {
        switch index.keyType {
        case keyTypeVal: // a Put: pass the key and value []byte slices
            r.Put(index.k(b.data), index.v(b.data))
        case keyTypeDel: // a Delete: pass the key []byte slice
            r.Delete(index.k(b.data))
        }
    }
    return nil
}
1.2.3 Source code analysis of iterator.Iterator (iterator) in leveldb of go-ethereum

The iterator implemented by leveldb supports both forward and backward traversal.

When the Ethereum ethdb backend is leveldb, it exposes the following interface for creating leveldb iterators:

//Creates an iterator over the subset of database content whose keys match the prefix
func (db *Database) NewIterator(prefix []byte, start []byte) ethdb.Iterator {
    return db.db.NewIterator(bytesPrefixRange(prefix, start), nil)
    //bytesPrefixRange() returns the matching key range.
    //The returned type is iterator.Iterator from the leveldb package.
}

The bytesPrefixRange helper above returns the key range covering all keys with the given prefix, beginning at start.

From the code above we can see that the leveldb iterator instance is assigned to an ethdb.Iterator, which means it implements the methods of ethdb.Iterator. The ethdb.Iterator interface is as follows:

type Iterator interface {
    // Next moves the iterator to the next key/value pair. It returns whether
    // the iterator is exhausted.
    Next() bool

    // Error returns any accumulated error. Exhausting all the key/value pairs
    // is not considered to be an error.
    Error() error

    // Key returns the key of the current key/value pair, or nil if done. Do
    // not modify the returned slice; if you need to change it, copy it first.
    Key() []byte

    // Value returns the value of the current key/value pair, or nil if done.
    // Do not modify the returned slice; if you need to change it, copy it first.
    Value() []byte

    // Release releases associated resources. It may be called multiple times
    // and does not return an error.
    Release()
}
1.2.4 snapshot in leveldb of go-ethereum

The Snapshot interface is defined in go-ethereum/ethdb/snapshot.go:

type Snapshot interface {
    //Retrieves whether a key exists in the snapshot backed by the key-value data store
    Has(key []byte) (bool, error)
    //Retrieves the value corresponding to key from the snapshot backed by the key-value data store
    Get(key []byte) ([]byte, error)
    //Releases associated resources
    Release()
}

Origin blog.csdn.net/Blockchain210/article/details/134106688