Ethereum Source Code Analysis, Part 3: Network Analysis (1), Network Discovery and Maintenance



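The previous parts covered startup. Once startup completes, the node has to join the network, because only a working network makes the distributed, decentralized work possible. This part focuses on network discovery, network maintenance, and data exchange. Analysis of RPC and related topics is deferred to a later part.
1. Network discovery (UDP-based discovery)
As mentioned earlier, Ethereum discovers peers over UDP and maintains its network with a Kademlia-style DHT. A DHT (distributed hash table) is a distributed storage method that is widely used in P2P networks today, for example in eMule and BitTorrent.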
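2. Main modules of Ethereum network discovery
The relevant code lives in the P2P module, in discover and discv5. The latter is an optimization of the former and is still under development, so only the former is analyzed here. See the figure below for details.
The node commands are fairly simple; there are only the following few:
PING: probes a node to determine whether it is online
ping struct {
	Version    uint
	From, To   rpcEndpoint
	Expiration uint64
	// Ignore additional fields (for forward compatibility).
	Rest []rlp.RawValue `rlp:"tail"`
}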
PONG: the reply to the PING command
// pong is the reply to ping.
pong struct {
	// This field should mirror the UDP envelope address
	// of the ping packet, which provides a way to discover
	// the external address (after NAT).
	To rpcEndpoint

	ReplyTok   []byte // This contains the hash of the ping packet.
	Expiration uint64 // Absolute timestamp at which the packet becomes invalid.
	// Ignore additional fields (for forward compatibility).
	Rest []rlp.RawValue `rlp:"tail"`
}
FINDNODE: asks a node for nodes whose IDs are close to the target ID
// findnode is a query for nodes close to the given target.
findnode struct {
	Target     NodeID // doesn't need to be an actual public key
	Expiration uint64
	// Ignore additional fields (for forward compatibility).
	Rest []rlp.RawValue `rlp:"tail"`
}
NEIGHBORS: the reply to the FINDNODE command; it returns the nodes from the K-buckets that are close to the target ID
// reply to findnode
neighbors struct {
	Nodes      []rpcNode
	Expiration uint64
	// Ignore additional fields (for forward compatibility).
	Rest []rlp.RawValue `rlp:"tail"`
}


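3. Kademlia, the algorithm used between network nodes
The algorithm is not covered in depth here; interested readers can consult the source code and related material. Only a rough outline follows.
A Kad network has a few basic concepts: node distance, node ID, K-bucket, and neighbor node. Note that node distance is not the traditional number of hops or a physical distance; it is a value obtained by XORing node IDs (other metrics could be used as well). A K-bucket is simply the routing data structure for network nodes; it records information such as the node ID, distance, IP, and endpoint. Ethereum's K-buckets are ordered by distance to the target node: there are 256 K-buckets, and each K-bucket holds 16 nodes. A neighbor node is a node that the local node can reach.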
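To make the XOR metric concrete, here is a small sketch that decides which of two nodes is closer to a target. The 4-byte nodeHash type and the distCmp name are chosen for this example only; real node hashes are much longer.

```go
package main

import "fmt"

// nodeHash is a shortened 4-byte stand-in for the hash of a node ID.
type nodeHash [4]byte

// distCmp compares XOR distances to target.
// It returns -1 if a is closer, 1 if b is closer, and 0 if both are equally far.
func distCmp(target, a, b nodeHash) int {
	for i := range target {
		da := a[i] ^ target[i]
		db := b[i] ^ target[i]
		if da > db {
			return 1
		} else if da < db {
			return -1
		}
	}
	return 0
}

func main() {
	target := nodeHash{0x00, 0x00, 0x00, 0x01}
	a := nodeHash{0x00, 0x00, 0x00, 0x03} // XOR distance 0x00000002
	b := nodeHash{0x00, 0x00, 0x01, 0x01} // XOR distance 0x00000100
	fmt.Println(distCmp(target, a, b))    // -1: a is closer
}
```

Because the comparison walks the bytes from the most significant end, the first differing byte decides the result, which is the same as comparing the two XOR values as big-endian integers.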
The Kademlia algorithm
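4. Workflow
On its first start, an Ethereum node randomly generates the local node's LocalId (that is, this node's NODEID).
Based on the command line or the relevant configuration, the program connects to seed nodes or other known nodes and writes them into the K-buckets once the ping/pong handshake completes.

Main steps:
First, random nodes are selected (on the first start from the seed nodes listed at the end of this article; on later starts from the leveldb database) and written into the in-memory bucket data structure:
tab.seedRand()
Next, a goroutine that checks for expiration is started; it periodically updates the stale data in leveldb according to what has actually expired.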
// Start the background expiration goroutine after loading seeds so that the search for
// seed nodes also considers older nodes that would otherwise be removed by the
// expiration.
tab.db.ensureExpirer()
Then a goroutine, loop, handles node updates, validation, and similar actions. The steps described below run inside this goroutine; the main one is doRefresh:
go tab.loop()
After that, the seed nodes are loaded:
tab.loadSeedNodes(false)


Then doRefresh performs a lookup for self.ID:
go tab.doRefresh(refreshDone)
Finally, it loops three times, performing further lookups with randomly generated target IDs:
// doRefresh performs a lookup for a random target to keep buckets
// full. seed nodes are inserted if the table is empty (initial
// bootstrap or discarded faulty peers).
func (tab *Table) doRefresh(done chan struct{}) {
	defer close(done)

	// Load nodes from the database and insert
	// them. This should yield a few previously seen nodes that are
	// (hopefully) still alive.
	tab.loadSeedNodes(true)

	// Run self lookup to discover new neighbor nodes.
	tab.lookup(tab.self.ID, false)

	// The Kademlia paper specifies that the bucket refresh should
	// perform a lookup in the least recently used bucket. We cannot
	// adhere to this because the findnode target is a 512bit value
	// (not hash-sized) and it is not easily possible to generate a
	// sha3 preimage that falls into a chosen bucket.
	// We perform a few lookups with a random target instead.
	for i := 0; i < 3; i++ {
		var target NodeID
		crand.Read(target[:])
		tab.lookup(target, false)
	}
}
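The refresh pattern described in this section (one lookup for the node's own ID, then a few lookups for random targets) can be tried in isolation. The sketch below is not the Table code: nodeID, refresh, and the lookup callback are names invented for this example, and the callback stands in for the real network query.

```go
package main

import (
	crand "crypto/rand"
	"fmt"
)

// nodeID is a 512-bit identifier, matching the size mentioned in the comment above.
type nodeID [64]byte

// refresh runs one self lookup followed by `rounds` lookups for random targets.
// lookup is a caller-supplied stand-in for the real network query.
func refresh(self nodeID, rounds int, lookup func(nodeID)) error {
	lookup(self)
	for i := 0; i < rounds; i++ {
		var target nodeID
		// Unlike the excerpt above, this sketch checks the error from the random source.
		if _, err := crand.Read(target[:]); err != nil {
			return err
		}
		lookup(target)
	}
	return nil
}

func main() {
	var self nodeID
	self[0] = 0x01
	count := 0
	err := refresh(self, 3, func(nodeID) { count++ })
	fmt.Println(count, err) // 4 <nil>
}
```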
5. Node discovery is done mainly by the lookup function mentioned above
func (tab *Table) lookup(targetID NodeID, refreshIfEmpty bool) []*Node {
	var (
		target         = crypto.Keccak256Hash(targetID[:])
		asked          = make(map[NodeID]bool)
		seen           = make(map[NodeID]bool)
		reply          = make(chan []*Node, alpha)
		pendingQueries = 0
		result         *nodesByDistance
	)
	// don't query further if we hit ourself.
	// unlikely to happen often in practice.
	asked[tab.self.ID] = true

	for {
		tab.mutex.Lock()
		// generate initial result set
		result = tab.closest(target, bucketSize)
		tab.mutex.Unlock()
		if len(result.entries) > 0 || !refreshIfEmpty {
			break
		}
		// The result set is empty, all nodes were dropped, refresh.
		// We actually wait for the refresh to complete here. The very
		// first query will hit this case and run the bootstrapping
		// logic.
		<-tab.refresh()
		refreshIfEmpty = false
	}

	for {
		// ask the alpha closest nodes that we haven't asked yet
		for i := 0; i < len(result.entries) && pendingQueries < alpha; i++ {
			n := result.entries[i]
			if !asked[n.ID] {
				asked[n.ID] = true
				pendingQueries++
				go func() {
					// Find potential neighbors to bond with
					// (findnode is called here to query the remote node).
					r, err := tab.net.findnode(n.ID, n.addr(), targetID)
					if err != nil {
						// Bump the failure counter to detect and evacuate non-bonded entries
						fails := tab.db.findFails(n.ID) + 1
						tab.db.updateFindFails(n.ID, fails)
						log.Trace("Bumping findnode failure counter", "id", n.ID, "failcount", fails)

						if fails >= maxFindnodeFailures {
							log.Trace("Too many findnode failures, dropping", "id", n.ID, "failcount", fails)
							tab.delete(n)
						}
					}
					reply <- tab.bondall(r) // bond with the returned nodes and maintain them
				}()
			}
		}
		if pendingQueries == 0 {
			// we have asked all closest nodes, stop the search
			break
		}
		// wait for the next reply
		for _, n := range <-reply {
			if n != nil && !seen[n.ID] {
				seen[n.ID] = true
				result.push(n, bucketSize)
			}
		}
		pendingQueries--
	}
	return result.entries
}
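The lookup above relies on result.push(n, bucketSize) to keep only the closest nodes seen so far. The excerpt does not show that helper, so the following is a sketch of how such a bounded, distance-ordered set could work. The closestSet type, the 4-byte hash type, and the xor helper are assumptions made for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// hash is a shortened 4-byte stand-in for a node hash.
type hash [4]byte

// xor returns the XOR distance between a and b as big-endian bytes.
func xor(a, b hash) []byte {
	d := make([]byte, len(a))
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// closestSet keeps at most a fixed number of entries, ordered by distance to target.
type closestSet struct {
	target  hash
	entries []hash
}

// push inserts n at its sorted position. When the set already holds max
// entries, the farthest one is dropped, or n is ignored if it is the farthest.
func (s *closestSet) push(n hash, max int) {
	dn := xor(n, s.target)
	ix := sort.Search(len(s.entries), func(i int) bool {
		return bytes.Compare(xor(s.entries[i], s.target), dn) > 0
	})
	if len(s.entries) < max {
		s.entries = append(s.entries, n)
	}
	if ix < len(s.entries) {
		// Shift farther entries down by one slot, then place n.
		copy(s.entries[ix+1:], s.entries[ix:])
		s.entries[ix] = n
	}
}

func main() {
	s := &closestSet{} // target is the all-zero hash
	for _, last := range []byte{8, 1, 4, 2} {
		s.push(hash{0, 0, 0, last}, 3)
	}
	for _, e := range s.entries {
		fmt.Println(e[3]) // prints 1, 2, 4 on separate lines
	}
}
```

Keeping the set sorted at insertion time means the outer loop can always take the first not-yet-asked entries as the next nodes to query.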
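6. Maintaining nodes
The code below comes with a long English comment in the source. Its gist: a remote node may be inserted into a bucket only after the local node has bonded with it, which guarantees that PING/PONG succeeds. Bonding means that remote operations actually work, not merely that the two nodes have exchanged a ping and a pong. Half-completed bonds, where only one half of the PING/PONG exchange finished, are ignored.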
func (tab *Table) bond(pinged bool, id NodeID, addr *net.UDPAddr, tcpPort uint16) (*Node, error) {
	if id == tab.self.ID {
		return nil, errors.New("is self")
	}
	if pinged && !tab.isInitDone() {
		return nil, errors.New("still initializing")
	}
	// Start bonding if we haven't seen this node for a while or if it failed findnode too often.
	node, fails := tab.db.node(id), tab.db.findFails(id)
	age := time.Since(tab.db.bondTime(id))
	var result error
	if fails > 0 || age > nodeDBNodeExpiration {
		log.Trace("Starting bonding ping/pong", "id", id, "known", node != nil, "failcount", fails, "age", age)

		tab.bondmu.Lock()
		w := tab.bonding[id]
		if w != nil {
			// Wait for an existing bonding process to complete.
			tab.bondmu.Unlock()
			<-w.done
		} else {
			// Register a new bonding process.
			w = &bondproc{done: make(chan struct{})}
			tab.bonding[id] = w
			tab.bondmu.Unlock()
			// Do the ping/pong. The result goes into w.
			tab.pingpong(w, pinged, id, addr, tcpPort)
			// Unregister the process after it's done.
			tab.bondmu.Lock()
			delete(tab.bonding, id)
			tab.bondmu.Unlock()
		}
		// Retrieve the bonding results
		result = w.err
		if result == nil {
			node = w.n
		}
	}
	// Add the node to the table even if the bonding ping/pong
	// fails. It will be replaced quickly if it continues to be
	// unresponsive.
	if node != nil {
		tab.add(node)
		tab.db.updateFindFails(id, 0)
	}
	return node, result
}
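The bonding map in bond acts as a deduplication guard: if a handshake with a given node is already running, later callers wait for its result instead of starting a second one. The sketch below isolates that pattern. The bonder type, its string IDs, and the ping callback are assumptions for this example; the callback stands in for the real ping/pong exchange.

```go
package main

import (
	"fmt"
	"sync"
)

// bondProc records one in-flight handshake; done is closed when it finishes.
type bondProc struct {
	err  error
	done chan struct{}
}

// bonder deduplicates concurrent handshakes with the same peer.
type bonder struct {
	mu      sync.Mutex
	bonding map[string]*bondProc
	ping    func(id string) error // stand-in for the real ping/pong exchange
}

// bond runs the handshake for id, or waits for one that is already running.
func (b *bonder) bond(id string) error {
	b.mu.Lock()
	w := b.bonding[id]
	if w != nil {
		// Someone else is already bonding with this peer: wait for the result.
		b.mu.Unlock()
		<-w.done
		return w.err
	}
	// Register a new process before releasing the lock.
	w = &bondProc{done: make(chan struct{})}
	b.bonding[id] = w
	b.mu.Unlock()

	w.err = b.ping(id)
	close(w.done) // publishes w.err to all waiters

	// Unregister so that a later call starts a fresh handshake.
	b.mu.Lock()
	delete(b.bonding, id)
	b.mu.Unlock()
	return w.err
}

func main() {
	calls := 0
	b := &bonder{
		bonding: make(map[string]*bondProc),
		ping:    func(string) error { calls++; return nil },
	}
	err := b.bond("peer-1")
	fmt.Println(err, calls, len(b.bonding)) // <nil> 1 0
}
```

The lock is held only while the map is read or written, never during the handshake itself, so a slow peer does not block bonding with other peers.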
7. Updating nodes
The bond function above calls the add function below, followed by tab.db.updateFindFails(id, 0).
// add attempts to add the given node to its corresponding bucket. If the
// bucket has space available, adding the node succeeds immediately.
// Otherwise, the node is added if the least recently active node in
// the bucket does not respond to a ping packet.
//
// The caller must not hold tab.mutex.
func (tab *Table) add(new *Node) {
	tab.mutex.Lock()
	defer tab.mutex.Unlock()

	b := tab.bucket(new.sha) // select the bucket by distance
	// Check whether the node is already in the bucket.
	if !tab.bumpOrAdd(b, new) {
		// Node is not in table. Add it to the replacement list.
		tab.addReplacement(b, new)
	}
}
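The excerpt calls bumpOrAdd and addReplacement without showing them. As a rough illustration of the behavior their names and the comments suggest, the sketch below models one bucket with a capacity of 16 (the figure given in section 3) plus a replacement list. The string IDs, the maxReplacements cap of 10, and every function body here are assumptions for this example; the real code may apply further checks that are omitted here.

```go
package main

import "fmt"

const (
	bucketSize      = 16 // per-bucket capacity, as stated in section 3
	maxReplacements = 10 // assumed cap on the replacement list
)

type bucket struct {
	entries      []string // live nodes, most recently active first
	replacements []string // candidates waiting for a free slot
}

// bumpOrAdd moves id to the front if it is present, or adds it when there is room.
// It returns false when the bucket is full and id is not in it.
func (b *bucket) bumpOrAdd(id string) bool {
	for i, e := range b.entries {
		if e == id {
			copy(b.entries[1:], b.entries[:i]) // shift entries 0..i-1 down by one
			b.entries[0] = id
			return true
		}
	}
	if len(b.entries) >= bucketSize {
		return false
	}
	b.entries = append([]string{id}, b.entries...)
	return true
}

// addReplacement remembers id as a candidate, newest first, without duplicates.
func (b *bucket) addReplacement(id string) {
	for _, r := range b.replacements {
		if r == id {
			return
		}
	}
	b.replacements = append([]string{id}, b.replacements...)
	if len(b.replacements) > maxReplacements {
		b.replacements = b.replacements[:maxReplacements]
	}
}

// add mirrors the control flow of Table.add above for a single bucket.
func (b *bucket) add(id string) {
	if !b.bumpOrAdd(id) {
		b.addReplacement(id)
	}
}

func main() {
	b := &bucket{}
	for i := 0; i < 17; i++ {
		b.add(fmt.Sprintf("n%d", i))
	}
	b.add("n3") // already present, so it is bumped to the front
	fmt.Println(len(b.entries), b.entries[0], b.replacements) // 16 n3 [n16]
}
```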
8. Distance calculation code
// logdist returns the logarithmic distance between a and b, log2(a ^ b).
func logdist(a, b common.Hash) int {
	lz := 0
	for i := range a {
		x := a[i] ^ b[i]
		if x == 0 {
			lz += 8
		} else {
			lz += lzcount[x]
			break
		}
	}
	return len(a)*8 - lz
}

// hashAtDistance returns a random hash such that logdist(a, b) == n
func hashAtDistance(a common.Hash, n int) (b common.Hash) {
	if n == 0 {
		return a
	}
	// flip bit at position n, fill the rest with random bits
	b = a
	pos := len(a) - n/8 - 1
	bit := byte(0x01) << (byte(n%8) - 1)
	if bit == 0 {
		pos++
		bit = 0x80
	}
	b[pos] = a[pos]&^bit | ^a[pos]&bit // TODO: randomize end bits
	for i := pos + 1; i < len(a); i++ {
		b[i] = byte(rand.Intn(255))
	}
	return b
}
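A few concrete values help when reading logdist. The sketch below reimplements the same computation with math/bits in place of the lzcount lookup table, which the excerpt does not show. The 32-byte hash type and the logDist name are stand-ins chosen for this example.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hash is a 32-byte stand-in for common.Hash.
type hash [32]byte

// logDist returns the bit length of a XOR b, which is 0 when a == b.
func logDist(a, b hash) int {
	lz := 0 // leading zero bits of a XOR b
	for i := range a {
		x := a[i] ^ b[i]
		if x == 0 {
			lz += 8
			continue
		}
		lz += bits.LeadingZeros8(x)
		break
	}
	return len(a)*8 - lz
}

func main() {
	var a, b, c hash
	b[31] = 0x01 // differs from a only in the lowest bit
	c[0] = 0x80  // differs from a in the highest bit
	fmt.Println(logDist(a, a), logDist(a, b), logDist(a, c)) // 0 1 256
}
```

The result is therefore the position of the highest differing bit, counted from 1, so two hashes that share a longer common prefix are at a smaller distance.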
9. Seed nodes
In params/bootnodes.go:
// MainnetBootnodes are the enode URLs of the P2P bootstrap nodes running on
// the main Ethereum network.
var MainnetBootnodes = []string{
	// Ethereum Foundation Go Bootnodes
	"enode://a979fb575495b8d6db44f750317d0f4622bf4c2aa3365d6af7c284339968eef29b69ad0dce72a4d8db5ebb4968de0e3bec910127f134779fbcb0cb6d3331163c@52.16.188.185:30303", // IE
	"enode://3f1d12044546b76342d59d4a05532c14b85aa669704bfe1f864fe079415aa2c02d743e03218e57a33fb94523adb54032871a6c51b2cc5514cb7c7e35b3ed0a99@13.93.211.84:30303",  // US-WEST
	"enode://78de8a0916848093c73790ead81d1928bec737d565119932b98c6b100d944b7a95e94f847f689fc723399d2e31129d182f7ef3863f2b4c820abbf3ab2722344d@191.235.84.50:30303", // BR
	"enode://158f8aab45f6d19c6cbf4a089c2670541a8da11978a2f90dbf6a502a4a3bab80d288afdbeb7ec0ef6d92de563767f3b1ea9e8e334ca711e9f8e2df5a0385e8e6@13.75.154.138:30303", // AU
	"enode://1118980bf48b0a3640bdba04e0fe78b1add18e1cd99bf22d53daac1fd9972ad650df52176e7c7d89d1114cfef2bc23a2959aa54998a46afcf7d91809f0855082@52.74.57.123:30303",  // SG

	// Ethereum Foundation C++ Bootnodes
	"enode://979b7fa28feeb35a4741660a16076f1943202cb72b6af70d327f053e248bab9ba81760f39d0701ef1d8f89cc1fbd2cacba0710a12cd5314d5e0c9021aa3637f9@5.1.83.226:30303", // DE
}
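Each entry in the list has the form enode://<hex node ID>@<IP>:<port>. The sketch below splits one such URL into its parts using only the Go standard library. The bootnode type and the parseEnode function are hypothetical names for this example; go-ethereum has its own parser, which is not reproduced here.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"net"
	"net/url"
	"strconv"
)

// bootnode is a hypothetical container for the parts of an enode URL.
type bootnode struct {
	ID   [64]byte // 512-bit node ID
	IP   net.IP
	Port uint16
}

// parseEnode splits "enode://<hex id>@<ip>:<port>" into its parts.
func parseEnode(raw string) (bootnode, error) {
	var n bootnode
	u, err := url.Parse(raw)
	if err != nil {
		return n, err
	}
	if u.Scheme != "enode" || u.User == nil {
		return n, errors.New("not an enode URL")
	}
	id, err := hex.DecodeString(u.User.Username())
	if err != nil {
		return n, err
	}
	if len(id) != len(n.ID) {
		return n, errors.New("node ID must be 64 bytes")
	}
	copy(n.ID[:], id)
	if n.IP = net.ParseIP(u.Hostname()); n.IP == nil {
		return n, errors.New("invalid IP address")
	}
	port, err := strconv.ParseUint(u.Port(), 10, 16)
	if err != nil {
		return n, err
	}
	n.Port = uint16(port)
	return n, nil
}

func main() {
	// First entry of the list above.
	n, err := parseEnode("enode://a979fb575495b8d6db44f750317d0f4622bf4c2aa3365d6af7c284339968eef29b69ad0dce72a4d8db5ebb4968de0e3bec910127f134779fbcb0cb6d3331163c@52.16.188.185:30303")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(n.IP, n.Port) // 52.16.188.185 30303
}
```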
With this, UDP-based network discovery and maintenance is essentially in place.


Reposted from blog.csdn.net/fpcc/article/details/80555403