Ethereum Source Code Analysis: The P2P Network (Part 2)

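The previous article briefly introduced some basic type definitions; starting with this one we look at the P2P network in more detail. The different node-related types carry different meanings. A Node is an isolated node: knowing about one does not mean we will connect to it. A Peer is a node we will definitely try to connect to, although the connection may not succeed. Only once a connection is established is a session created, and Ethereum data is exchanged over that session.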

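Anyone familiar with P2P systems is likely to wonder how the P2P layer underneath a blockchain works. In a centralized P2P network there is a server that collects peer information: a peer normally first fetches a list of peers from this server, then connects to them one by one, and finally exchanges data. But a blockchain is, as everyone knows, a decentralized system, and such a server would undermine the very basis of its trust model. So how does Ethereum solve the problem of finding nodes? The answer is the Kademlia algorithm, a distributed storage and routing algorithm that guarantees the requested data is found within a bounded number of steps (O(log n) hops for a network of n nodes). For the algorithm itself, see https://www.jianshu.com/p/f2c31e632f1d, which is an accessible introduction.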

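Here we focus on how Ethereum implements node discovery. All of that logic lives in NodeTable. We first look at the member variables and functions of the NodeTable class and then walk through the whole flow following the code. If time permits I will add a flow chart later.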

The NodeTable class

// NodeTable manages everything related to node discovery at the bottom of Ethereum's p2p network.
// Discovery runs over UDP, so the class inherits UDPSocketEvents in order to react to socket events.
class NodeTable: UDPSocketEvents, public std::enable_shared_from_this<NodeTable>
{
    friend std::ostream& operator<<(std::ostream& _out, NodeTable const& _nodeTable);
    using NodeSocket = UDPSocket<NodeTable, 1280>;              // UDPSocket is defined in UDP.h; 1280 is the maximum datagram size
    using TimePoint = std::chrono::steady_clock::time_point;    // < Steady time point.
    using NodeIdTimePoint = std::pair<NodeID, TimePoint>;
    struct EvictionTimeout                                      // records when an eviction started and the ID of the new node that would replace the evicted one
    {
        NodeID newNodeID;
        TimePoint evictedTimePoint;
    };

public:
    enum NodeRelation { Unknown = 0, Known };   // relation to a node; needed as a parameter by some functions
    enum DiscoverType { Random = 0 };
    NodeTable(ba::io_service& _io, KeyPair const& _alias, NodeIPEndpoint const& _endpoint, bool _enabled = true);   // takes the io_service used for I/O, the key pair, and the IP address and port to listen on
    ~NodeTable();
    // Returns the XOR-based distance between two node IDs. This is the distance stored in NodeEntry and defines how "far apart" two nodes are logically; the details can be skipped.
    static int distance(NodeID const& _a, NodeID const& _b) { u256 d = sha3(_a) ^ sha3(_b); unsigned ret; for (ret = 0; d >>= 1; ++ret) {}; return ret; }
    void setEventHandler(NodeTableEventHandler* _handler) { m_nodeEventHandler.reset(_handler); }   // sets the handler for NodeEntryAdded and NodeEntryDropped events; both are handled by the upper layer, so we skip them for now
    void processEvents();                      // also called from the upper layer, which is how it gets to process the events registered via setEventHandler
    std::shared_ptr<NodeEntry> addNode(Node const& _node, NodeRelation _relation = NodeRelation::Unknown);  // adds a node; there is a lot going on here, covered in the walkthrough below
    std::list<NodeID> nodes() const;           // returns the IDs of the active nodes in the node table
    unsigned count() const { return m_nodes.size(); }  // returns the number of nodes
    std::list<NodeEntry> snapshot() const;             // returns a snapshot of the nodes; it deals in NodeEntry because the node table cares about distance
    bool haveNode(NodeID const& _id) { Guard l(x_nodes); return m_nodes.count(_id) > 0; }  // whether the node is already known
    Node node(NodeID const& _id);              // returns the node for this ID, or an empty node if it does not exist
    // Constants required by the Kademlia algorithm
    static unsigned const s_addressByteSize = h256::size;                   // < Size of address type in bytes (32 bytes).
    static unsigned const s_bits = 8 * s_addressByteSize;                   // < Denoted by n in [Kademlia] (256 bits).
    static unsigned const s_bins = s_bits - 1;                              // < Size of m_state (excludes root, which is us): 255 buckets.
    static unsigned const s_maxSteps = boost::static_log2<s_bits>::value;   // < Max iterations of discovery (doDiscover): log2 of s_bits.
    // Tunable parameters
    static unsigned const s_bucketSize = 16;            // < Denoted by k in [Kademlia]. Number of nodes stored in each bucket.
    static unsigned const s_alpha = 3;                  // < Denoted by \alpha in [Kademlia]. Number of concurrent FindNode requests.
    // Timer intervals
    std::chrono::milliseconds const c_evictionCheckInterval = std::chrono::milliseconds(75);      // interval between eviction timeout checks
    std::chrono::milliseconds const c_reqTimeout = std::chrono::milliseconds(300);                // how long to wait for each request
    std::chrono::milliseconds const c_bucketRefresh = std::chrono::milliseconds(7200);            // bucket refresh interval; keeps node data from going stale
    struct NodeBucket   // a bucket: each distance holds a number of nodes, at most s_bucketSize (16)
    {
        unsigned distance;
        std::list<std::weak_ptr<NodeEntry>> nodes;
    };
    void ping(NodeIPEndpoint _to) const;     // pings an endpoint
    void ping(NodeEntry* _n) const;          // pings a known node; called by the node table while refreshing buckets or during eviction
    NodeEntry center() const { return NodeEntry(m_node.id, m_node.publicKey(), m_node.endpoint); }
    std::shared_ptr<NodeEntry> nodeEntry(NodeID _id);
    void doDiscover(NodeID _target, unsigned _round = 0, std::shared_ptr<std::set<std::shared_ptr<NodeEntry>>> _tried =  std::shared_ptr<std::set<std::shared_ptr<NodeEntry>>>());    // discovers nodes close to the given target
    std::vector<std::shared_ptr<NodeEntry>> nearestNodeEntries(NodeID _target);          // returns the known nodes closest to target
    void evict(std::shared_ptr<NodeEntry> _leastSeen, std::shared_ptr<NodeEntry> _new);  // asynchronously drops _leastSeen if it does not respond and adds _new; otherwise drops _new
    void noteActiveNode(Public const& _pubk, bi::udp::endpoint const& _endpoint);        // called whenever activity is seen from a node, to keep the node table up to date
    void dropNode(std::shared_ptr<NodeEntry> _n);     // called when a timeout occurs
    NodeBucket& bucket_UNSAFE(NodeEntry const* _n);   // returns a reference to the bucket; as we will see, this is the only entry point for adding a node to a bucket
    void onReceived(UDPSocketFace*, bi::udp::endpoint const& _from, bytesConstRef _packet); // called when m_socket receives a packet; inherited from UDPSocketEvents
    void onDisconnected(UDPSocketFace*) {}            // called after the socket disconnects; also inherited from UDPSocketEvents
    void doCheckEvictions();                          // called by evict to make sure eviction checks are scheduled; stops when no evictions remain (asynchronous)
    void doDiscovery();                               // looks up a random node at c_bucketRefresh intervals
    std::unique_ptr<NodeTableEventHandler> m_nodeEventHandler;      // < Event handler for node events.
    Node m_node;                                                    // < This node. LOCK x_state if endpoint access or mutation is required. Do not modify id.
    Secret m_secret;                                                // < This nodes secret key.
    mutable Mutex x_nodes;                                          // < LOCK x_state first if both locks are required. Mutable for thread-safe copy in nodes() const.
    std::unordered_map<NodeID, std::shared_ptr<NodeEntry>> m_nodes; // known node endpoints: the nodes we have contacted or are trying to contact
    mutable Mutex x_state;                                          // < LOCK x_state first if both x_nodes and x_state locks are required.
    std::array<NodeBucket, s_bins> m_state;                         // state of the p2p node network: the nodes of each bucket (being listed does not guarantee they are reachable); filled in noteActiveNode
    Mutex x_evictions;                                              // < LOCK x_evictions first if both x_nodes and x_evictions locks are required.
    std::unordered_map<NodeID, EvictionTimeout> m_evictions;        // < Eviction timeouts.
    Mutex x_pubkDiscoverPings;                                      // < LOCK x_nodes first if both x_nodes and x_pubkDiscoverPings locks are required.
    std::unordered_map<bi::address, TimePoint> m_pubkDiscoverPings; // the public key may be unknown, in which case a node goes through a public key discovery step; while it does, it is tracked here
    Mutex x_findNodeTimeout;
    std::list<NodeIdTimePoint> m_findNodeTimeout;                   // timeouts of outstanding FindNode requests
    std::shared_ptr<NodeSocket> m_socket;                           // < Shared pointer for our UDPSocket; ASIO requires shared_ptr.
    NodeSocket* m_socketPointer;                                    // < Set to m_socket.get(). Socket is created in constructor and disconnected in destructor to ensure access to pointer is safe.
    Logger m_logger{createLogger(VerbosityDebug, "discov")};
    DeadlineOps m_timers; ///< this should be the last member - it must be destroyed first
};

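As the number of member variables and functions shows, NodeTable is fairly complex. For now it is enough to have a rough idea of what each member variable means and what each member function does. Below we go through the code in detail, following the flow.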
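Before moving on, the one-line distance function in the class is worth unpacking, since everything else is organised around it. The following is a minimal sketch, not the project's code: Hash256 and logDistance are hypothetical names, and a plain byte array stands in for the sha3 hash and u256 type used by the original. It computes the index of the highest set bit of the XOR of two hashes, which is what the loop in distance does.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 256-bit hash in big-endian byte order.
// The original hashes the NodeID with sha3() and works on a u256.
using Hash256 = std::array<std::uint8_t, 32>;

// Returns the index of the highest set bit of (a XOR b), or 0 when the
// XOR is 0 or 1. This mirrors `for (ret = 0; d >>= 1; ++ret) {}`.
unsigned logDistance(Hash256 const& a, Hash256 const& b)
{
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        std::uint8_t x = a[i] ^ b[i];
        if (x == 0)
            continue;  // this byte is identical, look at the next one
        unsigned bit = 0;
        while (x >>= 1)
            ++bit;     // position of the top set bit inside this byte
        return static_cast<unsigned>((a.size() - 1 - i) * 8) + bit;
    }
    return 0;          // identical hashes
}
```

Two hashes that differ only in their lowest bits are "close", while a difference in the very first bit gives the maximum distance of 255.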

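Start with the NodeTable constructor. Leaving the member initializers aside, two things in the body deserve attention. First, m_socketPointer calls connect. m_socketPointer points to a wrapper around a UDP socket, and connect essentially puts the socket into a state where it waits for incoming data (the function body is in UDP.h/cpp), so from this point on NodeTable can receive requests from other nodes. Second, doDiscovery is called, which is where node discovery takes place.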

NodeTable::NodeTable(ba::io_service& _io, KeyPair const& _alias, NodeIPEndpoint const& _endpoint, bool _enabled):
    m_node(Node(_alias.pub(), _endpoint)),
    m_secret(_alias.secret()),
    m_socket(make_shared<NodeSocket>(_io, *reinterpret_cast<UDPSocketEvents*>(this), (bi::udp::endpoint)m_node.endpoint)),
    m_socketPointer(m_socket.get()),
    m_timers(_io)
{
    for (unsigned i = 0; i < s_bins; i++)
        m_state[i].distance = i;  // each bucket records its distance; the hashed node ID is 32 bytes (256 bits), which gives s_bins = 255 buckets
    if (!_enabled)
        return;
    try
    {
        m_socketPointer->connect(); // open the socket; from now on messages from outside can be received. The callback handler of m_socketPointer is this NodeTable
        doDiscovery();  // node discovery
    }
    catch (std::exception const& _e)
    {
        cwarn << "Exception connecting NodeTable socket: " << _e.what();
        cwarn << "Discovery disabled.";
    }
}

Let's follow doDiscovery and see what it does.

void NodeTable::doDiscovery()
{
    // Timer: buckets are refreshed periodically so they do not go stale. The interval is 7200 ms and each scheduled timer fires only once.
    m_timers.schedule(c_bucketRefresh.count(), [this](boost::system::error_code const& _ec)
    {
        if (_ec)
            // we can't use m_logger here, because captured this might be already destroyed
            clog(VerbosityDebug, "discov")
                << "Discovery timer was probably cancelled: " << _ec.value() << " "
                << _ec.message();
        if (_ec.value() == boost::asio::error::operation_aborted || m_timers.isStopped())
            return;
        LOG(m_logger) << "performing random discovery";
        NodeID randNodeId;
        crypto::Nonce::get().ref().copyTo(randNodeId.ref().cropped(0, h256::size));
        crypto::Nonce::get().ref().copyTo(randNodeId.ref().cropped(h256::size, h256::size));
        doDiscover(randNodeId);
    });
}
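The two Nonce copies in the callback fill the two 32-byte halves of randNodeId, so the random target is 64 bytes wide. A minimal sketch of the same idea follows. ToyNodeID and randomTarget are hypothetical names, and a standard library generator replaces crypto::Nonce, which is an assumption made purely for illustration (it is not a cryptographic source).

```cpp
#include <array>
#include <cstdint>
#include <random>

// Hypothetical 512-bit ID: two h256-sized halves, as in the listing above.
using ToyNodeID = std::array<std::uint8_t, 64>;

// Fills every byte of the ID from the supplied generator.
template <class Rng>
ToyNodeID randomTarget(Rng& rng)
{
    std::uniform_int_distribution<unsigned> byte(0, 255);
    ToyNodeID id{};
    for (auto& b : id)
        b = static_cast<std::uint8_t>(byte(rng));
    return id;
}
```

Because the target is uniformly random, repeated refreshes end up probing different regions of the ID space.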

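As the body shows, this function starts a timed task that refreshes the node buckets. On first start none of the buckets contains any node, but that does not affect the refresh. The timer callback shows how a refresh works: it first picks a random NodeID (randNodeId) from the NodeID value space and then calls doDiscover with it. Let's see what doDiscover(randNodeId) does.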

void NodeTable::doDiscover(NodeID _node, unsigned _round, shared_ptr<set<shared_ptr<NodeEntry>>> _tried)
{
    // NOTE: ONLY called by doDiscovery!
    
    if (!m_socketPointer->isOpen())  // the listening socket is gone, so there is no point in continuing
        return;
    if (_round == s_maxSteps)  // maximum number of rounds reached, stop
    {
        LOG(m_logger) << "Terminating discover after " << _round << " rounds.";
        doDiscovery();  // this re-registers the bucket refresh timer
        return;
    }
    else if (!_round && !_tried)  // this is the first call of the function
        // initialized _tried on first round
        _tried = make_shared<set<shared_ptr<NodeEntry>>>();

    auto nearest = nearestNodeEntries(_node);  // the nodes we know that are close to _node
    list<shared_ptr<NodeEntry>> tried; // the nodes queried in this round
    for (unsigned i = 0; i < nearest.size() && tried.size() < s_alpha; i++)  // at most s_alpha requests per round
        if (!_tried->count(nearest[i]))  // skip nodes that were already queried
        {
            auto r = nearest[i];
            tried.push_back(r); // add to tried (not the same as _tried)
            FindNode p(r->endpoint, _node);
            p.sign(m_secret);
            DEV_GUARDED(x_findNodeTimeout)
                m_findNodeTimeout.push_back(make_pair(r->id, chrono::steady_clock::now()));  // record the request so that a timeout can be detected
            m_socketPointer->send(p); // send the FindNode packet
        }

    if (tried.empty()) // no untried nearby node is left: end this discover run; doDiscovery will probe a new random ID
    {
        LOG(m_logger) << "Terminating discover after " << _round << " rounds.";
        doDiscovery();
        return;
    }

    while (!tried.empty()) // odd: why not insert into _tried directly when pushing to tried?
    {
        _tried->insert(tried.front());  // move into _tried
        tried.pop_front();
    }

    // Schedule the next round once the replies have had time to arrive; the delay is 600 ms
    m_timers.schedule(c_reqTimeout.count() * 2, [this, _node, _round, _tried](boost::system::error_code const& _ec)
    {
        if (_ec)
            // we can't use m_logger here, because captured this might be already destroyed
            clog(VerbosityDebug, "discov")
                << "Discovery timer was probably cancelled: " << _ec.value() << " "
                << _ec.message();

        if (_ec.value() == boost::asio::error::operation_aborted || m_timers.isStopped())
            return;

        // error::operation_aborted means that the timer was probably aborted. 
        // It usually happens when "this" object is deallocated, in which case 
        // subsequent call to doDiscover() would cause a crash. We can not rely on 
        // m_timers.isStopped(), because "this" pointer was captured by the lambda,
        // and therefore, in case of deallocation m_timers object no longer exists.

        doDiscover(_node, _round + 1, _tried);  // run the next round of discover
    });
}

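This function is where things get a bit more involved. doDiscover takes three parameters, while doDiscovery calls it with only a random node ID; the header shows that the other two have default arguments. Following the function, we see that it calls itself again (through a timer), and what it does boils down to one thing: collect the neighbours of this random ID. _round is the recursion depth, which is capped at s_maxSteps, and _tried is the set of nodes we have already queried, used mainly for de-duplication. From this we get the rough flow of node discovery:

A timer fires and runs the refresh: it picks a random node ID and calls doDiscover.
The nodes nearest to that ID are collected and, as configured, at most s_alpha of them are queried in each round.
Another timer is set to run the next round of discover.
Discovery stops when either no untried nearby node is left or the maximum number of rounds has been reached. In both cases the current discover run ends and doDiscovery is called again to generate a new random ID for the next run.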
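The steps above can be condensed into a small loop. This is a sketch under stated assumptions, not the real control flow: the original chains rounds through timers, whereas this version runs them synchronously, node IDs are plain ints, and runDiscovery, nearest and sendFindNode are hypothetical names standing in for doDiscover, nearestNodeEntries and the FindNode send.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <vector>

constexpr unsigned kAlpha = 3;     // same value as s_alpha
constexpr unsigned kMaxSteps = 8;  // log2(256), same value as s_maxSteps

// Runs discovery rounds one after another and returns how many rounds
// actually sent at least one query.
unsigned runDiscovery(std::function<std::vector<int>()> const& nearest,
                      std::function<void(int)> const& sendFindNode)
{
    std::set<int> tried;  // plays the role of _tried
    unsigned round = 0;
    for (; round < kMaxSteps; ++round)
    {
        std::vector<int> batch;  // plays the role of the local `tried` list
        for (int id : nearest())
        {
            if (batch.size() >= kAlpha)
                break;
            if (!tried.count(id))
                batch.push_back(id);
        }
        if (batch.empty())
            break;  // nothing new to ask: this discover run is over
        for (int id : batch)
        {
            tried.insert(id);
            sendFindNode(id);
        }
    }
    return round;
}
```

With five known candidates the sketch queries three in the first round, two in the second, and stops in the third because nothing untried is left. In the real code the candidate list can grow between rounds as Neighbours replies come in.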

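Is node discovery finished at this point? Not quite. Two things in the function above are worth noting. First, what is sent to the nearby nodes is not a ping packet but a FindNode packet. The reply contains node information held by the remote node, and this gets merged into the locally stored node information, which is why successive calls to nearestNodeEntries return different results. Second, the interval between two rounds of doDiscover is c_reqTimeout * 2, which leaves enough time for the FindNode replies to arrive so that nearestNodeEntries has as much information as possible to work with. You may now be curious how nearestNodeEntries actually finds the nearest nodes, so let's continue.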

vector<shared_ptr<NodeEntry>> NodeTable::nearestNodeEntries(NodeID _target)
{
    // send s_alpha FindNode packets to nodes we know, closest to target
    static unsigned lastBin = s_bins - 1;   // index of the last bucket
    unsigned head = distance(m_node.id, _target);  // logical XOR distance between this node and the target
    unsigned tail = head == 0 ? lastBin : (head - 1) % s_bins; // the previous bucket, wrapping around

    map<unsigned, list<shared_ptr<NodeEntry>>> found;  // nodes found so far, keyed by their distance to the target
    
    // if d is 0, then we roll look forward, if last, we reverse, else, spread from d
    if (head > 1 && tail != lastBin)
        while (head != tail && head < s_bins)
        {
            Guard l(x_state);
            for (auto const& n: m_state[head].nodes)
                if (auto p = n.lock())
                    found[distance(_target, p->id)].push_back(p);

            if (tail)
                for (auto const& n: m_state[tail].nodes)
                    if (auto p = n.lock())
                        found[distance(_target, p->id)].push_back(p);

            head++;
            if (tail)  // tail stops at 0; head carries on until it reaches the end, and if head gets there first the loop ends
                tail--;
        }
    else if (head < 2) // head == 0 or head == 1: head == 0 means _target is m_node itself; head == 1 means tail would be m_node.id itself, so nothing needs adding on that side
        while (head < s_bins)
        {
            Guard l(x_state);
            for (auto const& n: m_state[head].nodes)
                if (auto p = n.lock())
                    found[distance(_target, p->id)].push_back(p);
            head++;
        }
    else // head >=2 && tail == lastBin
        while (tail > 0)
        {
            Guard l(x_state);
            for (auto const& n: m_state[tail].nodes)
                if (auto p = n.lock())
                    found[distance(_target, p->id)].push_back(p);
            tail--;
        }
    
    vector<shared_ptr<NodeEntry>> ret;
    // return at most s_bucketSize nodes per call
    for (auto& nodes: found)
        for (auto const& n: nodes.second)
            if (ret.size() < s_bucketSize && !!n->endpoint && n->endpoint.isAllowed())
                ret.push_back(n);
    return ret;
}

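The gist of this function: first compute the distance between the random target and the local node, which tells us which bucket of the local table the target falls into, then walk outwards from that bucket in both directions so that the nodes closest to _target are found. Once the whole local table has been traversed, up to s_bucketSize of them are returned.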
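The final selection step relies on the fact that found is an ordered map keyed by distance, so iterating over it visits the closest nodes first. A minimal sketch of just that step, with 8-bit IDs instead of hashed 512-bit ones (toyDistance and nearestTo are hypothetical names, and the bucket walk itself is left out):

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Toy log-distance on 8-bit IDs: index of the highest set bit of a XOR b.
unsigned toyDistance(std::uint8_t a, std::uint8_t b)
{
    unsigned d = static_cast<unsigned>(a ^ b);
    unsigned ret = 0;
    while (d >>= 1)
        ++ret;
    return ret;
}

// Groups the known IDs by distance to the target and returns the closest ones.
std::vector<std::uint8_t> nearestTo(std::uint8_t target,
                                    std::vector<std::uint8_t> const& known,
                                    std::size_t limit)
{
    // std::map iterates in ascending key order, i.e. nearest group first.
    std::map<unsigned, std::vector<std::uint8_t>> found;
    for (auto id : known)
        found[toyDistance(target, id)].push_back(id);

    std::vector<std::uint8_t> ret;
    for (auto const& group : found)
        for (auto id : group.second)
            if (ret.size() < limit)
                ret.push_back(id);
    return ret;
}
```

For target 8 and known IDs 128, 9, 12 and 72, the distances are 7, 0, 2 and 6, so asking for two nodes yields 9 and 12.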

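So much for the main flow of node discovery. Now let's go back and see what happens when the nearest nodes reply to our FindNode packets. The NodeTable constructor registered an event handler with the UDPSocket, namely NodeTable itself, so whenever the UDP socket receives a message it calls NodeTable's onReceived function, shown below.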

// Handles messages received on the socket
void NodeTable::onReceived(UDPSocketFace*, bi::udp::endpoint const& _from, bytesConstRef _packet)
{
    try {
        // parse the UDP packet
        unique_ptr<DiscoveryDatagram> packet = DiscoveryDatagram::interpretUDP(_from, _packet);
        if (!packet)
            return;
        if (packet->isExpired())  // drop expired packets: every packet carries a timestamp that must still be in the future
        {
            LOG(m_logger) << "Invalid packet (timestamp in the past) from "
                          << _from.address().to_string() << ":" << _from.port();
            return;
        }
        
        // dispatch on the packet type
        switch (packet->packetType())
        {
            case Pong::type:  // the reply to a Ping
            {
                auto in = dynamic_cast<Pong const&>(*packet);
                // whenever a pong is received, check if it's in m_evictions
                bool found = false;
                NodeID leastSeenID;
                EvictionTimeout evictionEntry;   // the eviction timeout entry; we come back to this part later
                DEV_GUARDED(x_evictions)
                { 
                    auto e = m_evictions.find(in.sourceid);
                    if (e != m_evictions.end())
                    { 
                        if (e->second.evictedTimePoint > std::chrono::steady_clock::now())
                        {
                            found = true;
                            leastSeenID = e->first;
                            evictionEntry = e->second;
                            m_evictions.erase(e);  // remove from m_evictions
                        }
                    }
                }

                if (found)
                {
                    if (auto n = nodeEntry(evictionEntry.newNodeID))
                        dropNode(n); // if the new node is also stored in m_nodes, drop it
                    if (auto n = nodeEntry(leastSeenID))
                        n->pending = false;
                }
                else
                {
                    // if not, check if it's known/pending or a pubk discovery ping
                    if (auto n = nodeEntry(in.sourceid))
                        n->pending = false;
                    else
                    {
                        DEV_GUARDED(x_pubkDiscoverPings)
                        {
                            if (!m_pubkDiscoverPings.count(_from.address()))  // a pong nobody asked for: return right away
                                return; // unsolicited pong; don't note node as active
                            m_pubkDiscoverPings.erase(_from.address());
                        }
                        if (!haveNode(in.sourceid))   // not in m_nodes yet: add the node
                            addNode(Node(in.sourceid, NodeIPEndpoint(_from.address(), _from.port(), _from.port())));
                    }
                }
                
                // update our endpoint address and UDP port
                // (is this really needed on every pong? it seems wasteful)
                DEV_GUARDED(x_nodes)
                {
                    if ((!m_node.endpoint || !m_node.endpoint.isAllowed()) && isPublicAddress(in.destination.address))
                        m_node.endpoint.address = in.destination.address;
                    m_node.endpoint.udpPort = in.destination.udpPort;
                }

                LOG(m_logger) << "PONG from " << in.sourceid << " " << _from;
                break;
            }
                
            case Neighbours::type:  // a list of neighbouring nodes
            {
                auto in = dynamic_cast<Neighbours const&>(*packet);
                bool expected = false;  // whether we sent a FindNode packet to this node
                auto now = chrono::steady_clock::now();
                DEV_GUARDED(x_findNodeTimeout)
                    // std::list<>::remove_if erases every element for which the predicate returns true
                    m_findNodeTimeout.remove_if([&](NodeIdTimePoint const& t)
                    {
                        if (t.first == in.sourceid && now - t.second < c_reqTimeout)
                            expected = true;
                        else if (t.first == in.sourceid)
                            return true; // an expired request to the sender of this packet: remove it
                        return false;
                    });
                if (!expected)    // no matching FindNode request: either the packet was not meant for us or the request has already been removed
                {
                    cnetdetails << "Dropping unsolicited neighbours packet from "
                                << _from.address();
                    break;
                }

                for (auto n: in.neighbours)
                    addNode(Node(n.node, n.endpoint));  // add them to our own m_nodes one by one
                break;
            }

            case FindNode::type: // a FindNode packet: reply with the nodes close to the requested target
            {
                auto in = dynamic_cast<FindNode const&>(*packet);
                vector<shared_ptr<NodeEntry>> nearest = nearestNodeEntries(in.target);  // the nodes in the local table closest to the target
                static unsigned const nlimit = (m_socketPointer->maxDatagramSize - 109) / 90;  // cap on the number of nodes sent per packet
                for (unsigned offset = 0; offset < nearest.size(); offset += nlimit)
                {
                    Neighbours out(_from, nearest, offset, nlimit); 
                    out.sign(m_secret);
                    if (out.data.size() > 1280)
                        cnetlog << "Sending truncated datagram, size: " << out.data.size();
                    m_socketPointer->send(out);
                }
                break;
            }

            case PingNode::type: // a ping: some node is trying to get in touch with us
            {
                auto in = dynamic_cast<PingNode const&>(*packet);
                in.source.address = _from.address();
                in.source.udpPort = _from.port();
                addNode(Node(in.sourceid, in.source));  // add the node
                
                Pong p(in.source);
                p.echo = in.echo;
                p.sign(m_secret);
                m_socketPointer->send(p); // reply with a pong packet
                break;
            }
        }

        noteActiveNode(packet->sourceid, _from);  // mark the sender as active
    }
    catch (std::exception const& _e)
    {
        LOG(m_logger) << "Exception processing message from " << _from.address().to_string() << ":"
                      << _from.port() << ": " << _e.what();
    }
    catch (...)
    {
        LOG(m_logger) << "Exception processing message from " << _from.address().to_string() << ":"
                      << _from.port();
    }
}
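In the FindNode case above, the reply is cut into several Neighbours packets so that each one fits into a datagram. With the 1280-byte maximum from NodeSocket, the expression (1280 - 109) / 90 works out to 13 nodes per packet. The sketch below only reproduces that arithmetic; chunkSizes is a hypothetical helper, and reading 109 and 90 as the fixed overhead and the per-node size is my assumption, since the listing does not say so.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

constexpr std::size_t kMaxDatagramSize = 1280;  // from UDPSocket<NodeTable, 1280>
constexpr std::size_t kNeighboursLimit = (kMaxDatagramSize - 109) / 90;

// Returns how many nodes go into each Neighbours packet when `total`
// nodes are sent in chunks of at most `limit`, as the offset loop does.
std::vector<std::size_t> chunkSizes(std::size_t total,
                                    std::size_t limit = kNeighboursLimit)
{
    std::vector<std::size_t> sizes;
    for (std::size_t offset = 0; offset < total; offset += limit)
        sizes.push_back(std::min(limit, total - offset));
    return sizes;
}
```

Since nearestNodeEntries returns at most s_bucketSize (16) nodes, a full answer would go out as two packets holding 13 and 3 nodes.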

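As this function shows, node discovery uses four packet types: Ping, Pong, FindNode and Neighbours. For now we only look at the FindNode request sent during discover. When another node receives a FindNode packet, it first extracts the target node ID being looked up, then finds the nodes in its own table that are closest to that target, wraps them in Neighbours packets and sends them back to the requesting node. How Neighbours is parsed is analysed later. The function does one more thing: whenever a node receives a message from another node, whether sent on that node's own initiative or in reply, the sender is evidently alive, so noteActiveNode is called to record this. Let's look at that function next.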

void NodeTable::noteActiveNode(Public const& _pubk, bi::udp::endpoint const& _endpoint)
{
    if (_pubk == m_node.address() || !NodeIPEndpoint(_endpoint.address(), _endpoint.port(), _endpoint.port()).isAllowed())
        return; // ignore our own ID and endpoints that are not allowed

    shared_ptr<NodeEntry> newNode = nodeEntry(_pubk);  // look up the NodeEntry for this ID among the nodes in m_nodes
    if (newNode && !newNode->pending) // only known, non-pending nodes are handled. Question: why only nodes found in m_nodes? If a node is not found, isn't that all the more reason to add it?
    {
        LOG(m_logger) << "Noting active node: " << _pubk << " " << _endpoint.address().to_string()
                      << ":" << _endpoint.port();
        newNode->endpoint.address = _endpoint.address();
        newNode->endpoint.udpPort = _endpoint.port();   // refresh the address, since a node's endpoint may change

        shared_ptr<NodeEntry> nodeToEvict; // where a node becomes active, another may have to go
        {
            Guard l(x_state);
            // Find a bucket to put a node to
            NodeBucket& s = bucket_UNSAFE(newNode.get());  // the bucket this node belongs to
            auto& nodes = s.nodes; // the node list of that bucket

            // check if the node is already in the bucket
            auto it = std::find(nodes.begin(), nodes.end(), newNode);
            if (it != nodes.end())
            {
                // if it was in the bucket, move it to the last position
                // (the end of the list is the most recently seen node)
                nodes.splice(nodes.end(), nodes, it);
            }
            else
            {
                if (nodes.size() < s_bucketSize) // the bucket is not full yet
                {
                    // if it was not there, just add it as a most recently seen node
                    // (i.e. to the end of the list)
                    nodes.push_back(newNode);
                    if (m_nodeEventHandler)
                        m_nodeEventHandler->appendEvent(newNode->id, NodeEntryAdded); // and report a node-added event to the upper layer
                }
                else
                {
                    // if bucket is full, start eviction process for the least recently seen node
                    // (the front of the list is the least recently seen node)
                    nodeToEvict = nodes.front().lock(); // shared_ptr to the node that is up for eviction
                    // It could have been replaced in addNode(), then weak_ptr is expired.
                    // If so, just add a new one instead of expired
                    if (!nodeToEvict) // already gone, so the new node can take its place directly
                    {
                        nodes.pop_front();
                        nodes.push_back(newNode);
                        if (m_nodeEventHandler)
                            m_nodeEventHandler->appendEvent(newNode->id, NodeEntryAdded);  // report a node-added event
                    }
                }
            }
        }

        if (nodeToEvict) // there is a node that should be evicted
            evict(nodeToEvict, newNode); // start the eviction flow for it
    }
}
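The bucket handling in noteActiveNode is a least-recently-seen list with a fixed capacity. Here is a minimal sketch of that policy, assuming plain int IDs instead of weak_ptr<NodeEntry>. TouchResult and touch are hypothetical names, and the special case where the front entry has expired is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>

enum class TouchResult { Refreshed, Added, BucketFull };

// Records activity of `id` in `bucket`. The front of the list is the least
// recently seen node, the back the most recently seen one.
TouchResult touch(std::list<int>& bucket, int id, std::size_t capacity)
{
    auto it = std::find(bucket.begin(), bucket.end(), id);
    if (it != bucket.end())
    {
        // Already present: move it to the back without copying.
        bucket.splice(bucket.end(), bucket, it);
        return TouchResult::Refreshed;
    }
    if (bucket.size() < capacity)
    {
        bucket.push_back(id);  // room left: append as most recently seen
        return TouchResult::Added;
    }
    // Full: the caller would now ping bucket.front() and only replace it
    // if it fails to answer. The newcomer is not inserted here.
    return TouchResult::BucketFull;
}
```

The point of the design is that a long-lived node that still answers is never displaced by a newcomer; the newcomer only gets in when the oldest entry stops responding.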

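As we can see, this function decides whether the active node needs to be, and can be, placed into m_state, i.e. into its bucket. If placing it would push out an old node, that old node goes through the eviction flow, which is handled in evict. As for the question I raised in the code, I worked it out as follows: when a node starts, it begins discovery from just a few well-known nodes, which are already connected and stored in buckets. When they return nearby nodes, those go through addNode, which sends each of them a ping; once the pong comes back, noteActiveNode runs again. So noteActiveNode only needs to care about nodes that have already been contacted. Next comes the evict flow; pong handling is covered in detail later.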

void NodeTable::evict(shared_ptr<NodeEntry> _leastSeen, shared_ptr<NodeEntry> _new)
{
    if (!m_socketPointer->isOpen())  // the socket is closed
        return;
    
    unsigned evicts = 0;
    DEV_GUARDED(x_evictions)
    {
        EvictionTimeout evictTimeout{_new->id, chrono::steady_clock::now()};  
        m_evictions.emplace(_leastSeen->id, evictTimeout); // register the node as an eviction candidate
        evicts = m_evictions.size();
    }

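    if (evicts == 1)  // the check is only started when the count is 1, because doCheckEvictions keeps rescheduling itself until m_evictions is empty
        doCheckEvictions(); // run the actual eviction check
    ping(_leastSeen.get()); // one last chance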
}

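Evicting a node is not done lightly; after all, it took effort to connect to it in the first place. evict makes one last attempt on a node that has been silent for a long time: if it does not answer within the allotted time, it really is removed. doCheckEvictions sets a timer that periodically checks how the nodes awaiting eviction are doing. Let's take a look.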

void NodeTable::doCheckEvictions()
{
    m_timers.schedule(c_evictionCheckInterval.count(), [this](boost::system::error_code const& _ec)
    {  // c_evictionCheckInterval is 75 ms
        if (_ec)
            // we can't use m_logger here, because captured this might be already destroyed
            clog(VerbosityDebug, "discov")
                << "Check Evictions timer was probably cancelled: " << _ec.value() << " "
                << _ec.message();

        if (_ec.value() == boost::asio::error::operation_aborted || m_timers.isStopped())
            return;

        bool evictionsRemain = false; // whether m_evictions still holds entries that have not been dealt with
        list<shared_ptr<NodeEntry>> drop;
        {
            Guard le(x_evictions);
            Guard ln(x_nodes);
            for (auto& e: m_evictions) // an entry whose last ping has timed out leads to a node being dropped
                if (chrono::steady_clock::now() - e.second.evictedTimePoint > c_reqTimeout) // request timeout, 300 ms
                    if (m_nodes.count(e.second.newNodeID))
                        drop.push_back(m_nodes[e.second.newNodeID]); // collected from m_nodes here; the bucket entry in m_state is removed later by dropNode
            evictionsRemain = (m_evictions.size() - drop.size() > 0); // whether entries remain that have not timed out yet
        }

        drop.unique();
        for (auto n: drop)
            dropNode(n); // the upper layer has to be told: the node is unusable and can no longer carry p2p data

        if (evictionsRemain)
            doCheckEvictions(); // entries are still waiting for their timeout check, so schedule another check, until m_evictions is empty
    });
}

A quick look at dropNode.

void NodeTable::dropNode(shared_ptr<NodeEntry> _n)
{
    // remove from nodetable. Note that noteActiveNode only pops nodeToEvict once its entry has expired; otherwise the node stays in the bucket, so a perfectly healthy node is not removed prematurely
    {
        Guard l(x_state);
        NodeBucket& s = bucket_UNSAFE(_n.get());
        s.nodes.remove_if(
            [_n](weak_ptr<NodeEntry> const& _bucketEntry) { return _bucketEntry == _n; });
    }
    
    // notify host: tell the upper layer that the node was dropped
    LOG(m_logger) << "p2p.nodes.drop " << _n->id;
    if (m_nodeEventHandler)
        m_nodeEventHandler->appendEvent(_n->id, NodeEntryDropped);
}

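Back in evict, the last line makes one final attempt by sending a ping to the node about to be removed. If the node does not respond, doCheckEvictions gets rid of it. But what happens if it replies with a pong? That logic is also in onReceived above. So far we only covered how that function handles FindNode; now let's see what it does for the other three packet types, starting with Ping.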

            case PingNode::type: // a ping: some node is trying to get in touch with us
            {
                auto in = dynamic_cast<PingNode const&>(*packet);
                in.source.address = _from.address();
                in.source.udpPort = _from.port();
                addNode(Node(in.sourceid, in.source));  // add the node

                Pong p(in.source);
                p.echo = in.echo;
                p.sign(m_secret);
                m_socketPointer->send(p); // reply with a pong packet
                break;
            }

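When a node sends us a Ping, we first add it to our own nodes and then send a pong back. We will not go into addNode just yet, since other packet types call it too; just keep it in mind. Ping is simple; now for Pong, which is more complicated.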

            case Pong::type:  // the reply to a Ping
            {
                auto in = dynamic_cast<Pong const&>(*packet);
                // whenever a pong is received, check if it's in m_evictions
                bool found = false;
                NodeID leastSeenID;
                EvictionTimeout evictionEntry;
                DEV_GUARDED(x_evictions)
                {
                    auto e = m_evictions.find(in.sourceid);
                    if (e != m_evictions.end())
                    {
                        if (e->second.evictedTimePoint > std::chrono::steady_clock::now())
                        {
                            found = true;
                            leastSeenID = e->first;  // the ID of the node that was about to be evicted
                            evictionEntry = e->second;
                            m_evictions.erase(e);  // the node answered the last ping from evict, so it is spared and removed from m_evictions
                        }
                    }
                }

                if (found)
                {
                    if (auto n = nodeEntry(evictionEntry.newNodeID))
                        dropNode(n); // if the new node is also stored in m_nodes, drop it. I do not quite see why the two cannot coexist
                    if (auto n = nodeEntry(leastSeenID))
                        n->pending = false;  // the old node is still alive and kicking
                }
                else
                {
                    // if not, check if it's known/pending or a pubk discovery ping
                    if (auto n = nodeEntry(in.sourceid))
                        n->pending = false;  // the node was known and possibly pending: clear the flag
                    else
                    {
                        DEV_GUARDED(x_pubkDiscoverPings)
                        {
                            if (!m_pubkDiscoverPings.count(_from.address()))  // a pong nobody asked for: return right away
                                return; // unsolicited pong; don't note node as active
                            m_pubkDiscoverPings.erase(_from.address());
                        }
                        if (!haveNode(in.sourceid))   // not in m_nodes yet: add the node
                            addNode(Node(in.sourceid, NodeIPEndpoint(_from.address(), _from.port(), _from.port())));
                    }
                }

                // update our endpoint address and UDP port
                // (is this really needed on every pong? it seems wasteful)
                DEV_GUARDED(x_nodes)
                {
                    if ((!m_node.endpoint || !m_node.endpoint.isAllowed()) && isPublicAddress(in.destination.address))
                        m_node.endpoint.address = in.destination.address;
                    m_node.endpoint.udpPort = in.destination.udpPort;
                }

                LOG(m_logger) << "PONG from " << in.sourceid << " " << _from;
                break;
            }

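A lot happens here. First it checks whether the node is on the list of nodes awaiting eviction; if so, it is rescued. If not, it checks whether the node is already in m_nodes but still waiting to be confirmed. That state is normally triggered by addNode, which can add a node directly, put it in the table, send it a ping and then wait for the reply to clear the pending flag. Alternatively, sometimes only an IP address is known and not the node ID; in that case the pong carries the node ID, and the local node adds the node, together with its address, to m_nodes. We can also summarise the three situations in which a ping is sent: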

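On first contact: the node is added, a ping is sent, and its state is updated when the reply arrives.
Eviction check: a ping is sent to the node awaiting eviction; if it replies, it is taken off the eviction list.
Unknown node ID: a ping is sent first, and the node ID from the pong is then combined with the address.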

            case Neighbours::type:  // a list of neighbouring nodes
            {
                auto in = dynamic_cast<Neighbours const&>(*packet);
                bool expected = false;  // whether we sent a FindNode packet to this node
                auto now = chrono::steady_clock::now();
                DEV_GUARDED(x_findNodeTimeout)
                    // std::list<>::remove_if erases every element for which the predicate returns true
                    m_findNodeTimeout.remove_if([&](NodeIdTimePoint const& t)
                    {
                        if (t.first == in.sourceid && now - t.second < c_reqTimeout)
                            expected = true;
                        else if (t.first == in.sourceid)
                            return true; // an expired request to the sender of this packet: remove it
                        return false;
                    });
                if (!expected)    // no matching FindNode request: either the packet was not meant for us or the request has already been removed
                {
                    cnetdetails << "Dropping unsolicited neighbours packet from "
                                << _from.address();
                    break;
                }

                for (auto n: in.neighbours)
                    addNode(Node(n.node, n.endpoint));  // add them to our own m_nodes one by one
                break;
            }
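The remove_if call in this case does two jobs at once, which is easy to miss: it sets expected for a request that is still fresh and erases only the expired requests to the same sender. A minimal sketch of that bookkeeping in isolation, with int IDs and hypothetical names (Pending, isExpectedReply):

```cpp
#include <chrono>
#include <list>
#include <utility>

using Clock = std::chrono::steady_clock;
using Pending = std::pair<int, Clock::time_point>;  // (node id, time FindNode was sent)

// Returns true when `sender` has an outstanding request younger than
// `timeout`. Expired requests to that sender are erased; a fresh one is
// kept in the list, exactly as the predicate in the listing leaves it.
bool isExpectedReply(std::list<Pending>& pending, int sender,
                     Clock::time_point now, Clock::duration timeout)
{
    bool expected = false;
    pending.remove_if([&](Pending const& p) {
        if (p.first != sender)
            return false;  // request to some other node: untouched
        if (now - p.second < timeout)
        {
            expected = true;
            return false;  // still fresh: keep it
        }
        return true;       // expired: erase it
    });
    return expected;
}
```

Keeping the fresh entry appears to be what lets the second and later Neighbours packets of one reply through, since a single FindNode can be answered with several packets.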

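This part starts with a check: is the sender of this Neighbours packet actually a node we previously sent a FindNode packet to? Every node we send FindNode to is recorded in m_findNodeTimeout; the insertion can be seen in doDiscover. If the sender is not in there, or the request has timed out, the data is treated as invalid. Otherwise the returned neighbours are read and addNode is called for each of them. So next, let's take a close look at what addNode actually does.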

shared_ptr<NodeEntry> NodeTable::addNode(Node const& _node, NodeRelation _relation)
{
    if (_relation == Known) // the node is already known to be connected, so just register it
    {
        auto ret = make_shared<NodeEntry>(m_node.id, _node.id, _node.endpoint); // build the NodeEntry
        ret->pending = false;
        DEV_GUARDED(x_nodes)
            m_nodes[_node.id] = ret; // add to m_nodes
        noteActiveNode(_node.id, _node.endpoint);  // add to m_state
        return ret;
    }

    if (!_node.endpoint)  // without an address there is nothing to contact; returning null signals that adding the node failed
        return shared_ptr<NodeEntry>();

    // ping address to recover nodeid if nodeid is empty
    if (!_node.id)
    {
        DEV_GUARDED(x_nodes)
        LOG(m_logger) << "Sending public key discovery Ping to "
                      << (bi::udp::endpoint)_node.endpoint
                      << " (Advertising: " << (bi::udp::endpoint)m_node.endpoint << ")";
        DEV_GUARDED(x_pubkDiscoverPings)
            m_pubkDiscoverPings[_node.endpoint.address] = std::chrono::steady_clock::now();
        ping(_node.endpoint);
        return shared_ptr<NodeEntry>();
    }

    DEV_GUARDED(x_nodes)
        if (m_nodes.count(_node.id))  // already present, return the existing entry
            return m_nodes[_node.id];

    auto ret = make_shared<NodeEntry>(m_node.id, _node.id, _node.endpoint);
    DEV_GUARDED(x_nodes)
        m_nodes[_node.id] = ret; // not present yet: add it, then send a ping
    LOG(m_logger) << "addNode pending for " << _node.endpoint;
    ping(_node.endpoint);
    return ret;
}
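The branches of addNode can be summarised as a small decision function. This is only a sketch of the order in which the checks happen; AddAction, ToyNode and classifyAdd are hypothetical names, and the side effects (locking, logging, sending the ping) are reduced to the name of the action.

```cpp
enum class AddAction
{
    RegisterKnown,      // relation is Known: store as non-pending, note as active
    RejectNoEndpoint,   // no address: cannot be contacted, return null
    PingForPublicKey,   // no ID: remember the address and ping to learn the ID
    ReturnExisting,     // already in m_nodes: return the stored entry
    AddPendingAndPing   // new node: store it and ping it
};

// Hypothetical stand-in for the two properties of Node that addNode tests.
struct ToyNode
{
    bool hasId;
    bool hasEndpoint;
};

// Mirrors the order of the checks in addNode.
AddAction classifyAdd(ToyNode const& node, bool relationKnown, bool alreadyInTable)
{
    if (relationKnown)
        return AddAction::RegisterKnown;
    if (!node.hasEndpoint)
        return AddAction::RejectNoEndpoint;
    if (!node.hasId)
        return AddAction::PingForPublicKey;
    if (alreadyInTable)
        return AddAction::ReturnExisting;
    return AddAction::AddPendingAndPing;
}
```

Note that the Known check comes first, so a node passed in as Known is registered even before its endpoint is validated.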

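That covers all the details of node discovery in NodeTable; the low-level communication details are left for later. Perhaps because this is the C++ implementation, the code is fairly complete yet some cases still seem not to be fully covered, for example: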

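addNode defaults the relation to Unknown, so a node added to m_nodes stays in the pending state (only the Known branch clears the flag) while a ping is sent and a reply awaited. But these nodes come from other nodes' tables, and nothing guarantees that they are still alive. A node that never answers with a pong seems to be removed only when a newer node arrives and triggers the eviction flow. Is that reasonable?
Pending nodes from addNode can at least still be removed, but entries in m_pubkDiscoverPings and m_findNodeTimeout whose nodes never reply may never get a chance to be deleted. On a public chain, where the set of nodes changes very frequently, is there a risk that these two containers keep growing?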

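Also, it is easy to be confused about what m_nodes and m_state are each for. m_nodes holds every node the local node would like to be associated with; that does not mean the node can be reached, but a contact attempt is always made. m_state is the list of nodes that have actually been in contact with the local node at least once. This follows from the fact that m_state is only added to in noteActiveNode. If such a node later becomes unreachable, though, m_state does not know. One more point: the node-added event is also raised only in that function, which means only live, reachable nodes are handed to the upper layer for P2P communication.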

Thanks for reading. Suggestions and questions are welcome in the comments!
