RocketMQ 源码分析 BrokerHousekeepingService(五)

版权声明:本文为博主原创文章,未经博主允许不得转载。 https://blog.csdn.net/MakeContral/article/details/84982117

前言

RocketMQ在早期版本使用的是Zookeeper,考虑到Zookeeper不够轻量,后面用namesrv来代替。熟悉Zookeeper的知道它使用心跳检测来判断服务是否正常,那么namesrv肯定也少不了这块功能,那么它是怎么判断一个Broker节点是否Live呢?如果一个Broker断开了namesrv是如何处理的呢?

在前面篇章提到NamesrvController的构造函数中有一个BrokerHousekeepingService,这个就是来处理Broker连接发生变化的服务。

BrokerHousekeepingService

BrokerHousekeepingService是NamesrvController构造函数的时候生成的:

public NamesrvController(NamesrvConfig namesrvConfig, NettyServerConfig nettyServerConfig) {
    // Name Server配置
    this.namesrvConfig = namesrvConfig;
    // Netty配置 
    this.nettyServerConfig = nettyServerConfig;
    // KV配置管理
    this.kvConfigManager = new KVConfigManager(this);
    // 路由信息管理 
    this.routeInfoManager = new RouteInfoManager();
    // Broker连接事件处理服务 
    this.brokerHousekeepingService = new BrokerHousekeepingService(this);
    this.configuration = new Configuration(
        log,
        this.namesrvConfig, this.nettyServerConfig
    );
    this.configuration.setStorePathFromConfig(this.namesrvConfig, "configStorePath");
}

看一下BrokerHousekeepingService这个类的情况:

public class BrokerHousekeepingService implements ChannelEventListener {
    private static final InternalLogger log = InternalLoggerFactory.getLogger(LoggerName.NAMESRV_LOGGER_NAME);
    private final NamesrvController namesrvController;

    public BrokerHousekeepingService(NamesrvController namesrvController) {
        this.namesrvController = namesrvController;
    }

    @Override
    public void onChannelConnect(String remoteAddr, Channel channel) {
    }

    //监测close事件
    @Override
    public void onChannelClose(String remoteAddr, Channel channel) {
        this.namesrvController.getRouteInfoManager().onChannelDestroy(remoteAddr, channel);
    }

    //监测异常事件
    @Override
    public void onChannelException(String remoteAddr, Channel channel) {
        this.namesrvController.getRouteInfoManager().onChannelDestroy(remoteAddr, channel);
    }

    //监测闲置事件
    @Override
    public void onChannelIdle(String remoteAddr, Channel channel) {
        this.namesrvController.getRouteInfoManager().onChannelDestroy(remoteAddr, channel);
    }
}

可以看到这个类实现了ChannelEventListener接口,除了onChannelConnect外,其余各个方法均委托给namesrvController的routeInfoManager的onChannelDestroy方法。

**这里需要netty的一些基础,简单来说每一个broker与namesrv通过一个“通道”channel进行“沟通”。namesrv通过监测这些通道是否发生某些事件,去做出相应的变动。**可以点进routeInfoManager的onChannelDestroy方法看看,对于宕机的broker是如何处理的。

onChannelDestroy

onChannelDestroy大概有130行的代码,我打算将它拆开来分析,那么我拆的依据是什么呢?

通过上一篇博客我们知道一个broker注册到namesrv的时候会触发registerBroker方法,registerBroker会往brokerAddrTable、clusterAddrTable、brokerLiveTable、topicQueueTable中添加新的broker信息。那么现在发现了一个宕机的broker,就是将上述的添加的信息一个个去除即可!!

放一张registerBroker注册的图,方便我们知道broker和这些路由的关联。

在这里插入图片描述

移除brokerLiveTable

onChannelDestroy第一个要删除的就是brokerLiveTable中的信息,截取部分onChannelDestroy的代码:

public void onChannelDestroy(String remoteAddr, Channel channel) {
    String brokerAddrFound = null;
    if (channel != null) {
        try {
            try {
                this.lock.readLock().lockInterruptibly();
                //遍历了存活的BrokerTable,如果channel是相同的,获取对应的BrokerAddress
                Iterator<Entry<String, BrokerLiveInfo>> itBrokerLiveTable =
                    this.brokerLiveTable.entrySet().iterator();
                while (itBrokerLiveTable.hasNext()) {
                    Entry<String, BrokerLiveInfo> entry = itBrokerLiveTable.next();
                    //如果LiveTable中的channel和传入的channel相同,则返回对应的brokerAddr
                    if (entry.getValue().getChannel() == channel) {
                        brokerAddrFound = entry.getKey();
                        break;
                    }
                }
            } finally {
                this.lock.readLock().unlock();
            }
        } catch (Exception e) {
            log.error("onChannelDestroy Exception", e);
        }
    }
    
   ///省略...............
}

这部分是比较简单的,就是迭代了brokerLiveTable,判断live表中channel和传入的channel是否为同一个对象,如果是则从把这个channel对应的brokerAddr返回到brokerAddrFound中。如果没有找到那么brokerAddrFound为null。

你们可能会有疑问为什么这里不在brokerLiveTable移除这个channel呢?!!

注意看这步的操作是加了一个readLock,意味着现在允许其他线程对brokerLiveTable是可读,但是不允许其他线程变动brokerLiveTable。因为它希望再要修改brokerLiveTable的时候,读表的操作是不会被阻塞的

移除brokerAddrTable

接着往下看如何修改brokerAddrTable的,截取部分onChannelDestroy的代码:

try {
    this.lock.writeLock().lockInterruptibly();
    //根据找到的brokerAddrFound移除LiveTable中的Broker
    this.brokerLiveTable.remove(brokerAddrFound);
    this.filterServerTable.remove(brokerAddrFound);
    String brokerNameFound = null;
    boolean removeBrokerName = false;
    Iterator<Entry<String, BrokerData>> itBrokerAddrTable =
        this.brokerAddrTable.entrySet().iterator();
    while (itBrokerAddrTable.hasNext() && (null == brokerNameFound)) {
        BrokerData brokerData = itBrokerAddrTable.next().getValue();

        Iterator<Entry<Long, String>> it = brokerData.getBrokerAddrs().entrySet().iterator();
        while (it.hasNext()) {
            Entry<Long, String> entry = it.next();
            Long brokerId = entry.getKey();
            String brokerAddr = entry.getValue();
            //现在是针对brokerData中的BrokerAddrs,找到对应的从BrokerAddrs移除
            if (brokerAddr.equals(brokerAddrFound)) {
                brokerNameFound = brokerData.getBrokerName();
                it.remove();
                log.info("remove brokerAddr[{}, {}] from brokerAddrTable, because channel destroyed",
                    brokerId, brokerAddr);
                break;
            }
        }

        //如果BrokerAddrs为空,那么brokerDataTable就需要把这个没有用的brokerData移除
        if (brokerData.getBrokerAddrs().isEmpty()) {
            removeBrokerName = true;
            itBrokerAddrTable.remove();
            log.info("remove brokerName[{}] from brokerAddrTable, because channel destroyed",
                brokerData.getBrokerName());
        }
    }
    //省略.....
}

首先要明白brokerAddrTable是一个hashmap1,一个brokername对应一个brokerData。每一个brokerData存放了一个BrokerAddrs为hashmap2,一个brokerID对应一个brokerAddr。

根据前面获得brokerAddrFound,第一个就是要移除BrokerAddrs中的一组信息。如果移除后,当前的BrokerAddrs已经是空的了:if (brokerData.getBrokerAddrs().isEmpty()),那么意味着一个brokername已经找不到对应的broker的addr和id了。那么就要把brokerAddrTable对应的broker也要移除。解释的可能不太清楚,不过看看注册图可能就会明白。

移除clusterAddrTable

clusterAddrTable存放着集群的信息,如果宕机的broker的brokername只有它一个,那么集群信息中就需要把clusterAddrTable中的set移除,截取部分onChannelDestroy的代码:

//如果removeBrokerName,那么意味着以这个brokername为name的broker没有一个了。
//那么对应的集群中已经没有了cluter - > broker了。所以这里要把集群中一个broker删除掉
if (brokerNameFound != null && removeBrokerName) {
    Iterator<Entry<String, Set<String>>> it = this.clusterAddrTable.entrySet().iterator();
    while (it.hasNext()) {
        Entry<String, Set<String>> entry = it.next();
        String clusterName = entry.getKey();
        Set<String> brokerNames = entry.getValue();
        //直接移除在set中移除brokerNameFound
        boolean removed = brokerNames.remove(brokerNameFound);
        if (removed) {
            log.info("remove brokerName[{}], clusterName[{}] from clusterAddrTable, because channel destroyed",
                brokerNameFound, clusterName);

            if (brokerNames.isEmpty()) {
                log.info("remove the clusterName[{}] from clusterAddrTable, because channel destroyed and no broker in this cluster",
                    clusterName);
                it.remove();
            }

            break;
        }
    }
}

移除topicQueueTable

一个topic可能会存在多个broker中,如果removeBrokerName为true,意味着原本存在brokername的信息就要移除,变量topicQueueTable中的List,如果有对应的brokername就移除。截取部分onChannelDestroy的代码:

//如果removeBrokerName,那么topic就不会再存放在一个叫brokerNameFound的broker中了,移除
if (removeBrokerName) {
    Iterator<Entry<String, List<QueueData>>> itTopicQueueTable =
        this.topicQueueTable.entrySet().iterator();
    while (itTopicQueueTable.hasNext()) {
        Entry<String, List<QueueData>> entry = itTopicQueueTable.next();
        String topic = entry.getKey();
        List<QueueData> queueDataList = entry.getValue();

        Iterator<QueueData> itQueueData = queueDataList.iterator();
        while (itQueueData.hasNext()) {
            QueueData queueData = itQueueData.next();
            if (queueData.getBrokerName().equals(brokerNameFound)) {
                itQueueData.remove();
                log.info("remove topic[{} {}], from topicQueueTable, because channel destroyed",
                    topic, queueData);
            }
        }

        if (queueDataList.isEmpty()) {
            itTopicQueueTable.remove();
            log.info("remove topic[{}] all queue, from topicQueueTable, because channel destroyed",
                topic);
        }
    }
}

onChannelDestroy整体代码

public void onChannelDestroy(String remoteAddr, Channel channel) {
    String brokerAddrFound = null;
    if (channel != null) {
        try {
            try {
                this.lock.readLock().lockInterruptibly();
                //遍历了存活的BrokerTable,如果channel是相同的,获取对应的BrokerAddress
                Iterator<Entry<String, BrokerLiveInfo>> itBrokerLiveTable =
                    this.brokerLiveTable.entrySet().iterator();
                while (itBrokerLiveTable.hasNext()) {
                    Entry<String, BrokerLiveInfo> entry = itBrokerLiveTable.next();
                    if (entry.getValue().getChannel() == channel) {
                        brokerAddrFound = entry.getKey();
                        break;
                    }
                }
            } finally {
                this.lock.readLock().unlock();
            }
        } catch (Exception e) {
            log.error("onChannelDestroy Exception", e);
        }
    }

    if (null == brokerAddrFound) {//说明现有存活的表中没有这个broker,直接用传入的remoteAddr
        brokerAddrFound = remoteAddr;
    } else {
        log.info("the broker's channel destroyed, {}, clean it's data structure at once", brokerAddrFound);
    }

    if (brokerAddrFound != null && brokerAddrFound.length() > 0) {

        try {
            try {
                this.lock.writeLock().lockInterruptibly();
                //根据找到的brokerAddrFound移除LiveTable中的Broker
                this.brokerLiveTable.remove(brokerAddrFound);
                //TODO filterServerTable是干什么的?
                this.filterServerTable.remove(brokerAddrFound);
                String brokerNameFound = null;
                boolean removeBrokerName = false;
                Iterator<Entry<String, BrokerData>> itBrokerAddrTable =
                    this.brokerAddrTable.entrySet().iterator();
                while (itBrokerAddrTable.hasNext() && (null == brokerNameFound)) {
                    BrokerData brokerData = itBrokerAddrTable.next().getValue();

                    Iterator<Entry<Long, String>> it = brokerData.getBrokerAddrs().entrySet().iterator();
                    while (it.hasNext()) {
                        Entry<Long, String> entry = it.next();
                        Long brokerId = entry.getKey();
                        String brokerAddr = entry.getValue();
                        //上面的操作是将liveTable中移除,现在是针对brokerData中的BrokerAddrs,找到对应的从BrokerAddrs移除
                        if (brokerAddr.equals(brokerAddrFound)) {
                            brokerNameFound = brokerData.getBrokerName();
                            it.remove();
                            log.info("remove brokerAddr[{}, {}] from brokerAddrTable, because channel destroyed",
                                brokerId, brokerAddr);
                            break;
                        }
                    }

                    //如果BrokerAddrs为空,那么brokerDataTable就需要把这个没有用的brokerData移除
                    if (brokerData.getBrokerAddrs().isEmpty()) {
                        removeBrokerName = true;
                        itBrokerAddrTable.remove();
                        log.info("remove brokerName[{}] from brokerAddrTable, because channel destroyed",
                            brokerData.getBrokerName());
                    }
                }

                //如果removeBrokerName,那么意味着以这个brokername为name的broker没有一个了。
                //那么对应的集群中已经没有了cluter - > broker了。所以这里要把集群中一个broker删除掉
                if (brokerNameFound != null && removeBrokerName) {
                    Iterator<Entry<String, Set<String>>> it = this.clusterAddrTable.entrySet().iterator();
                    while (it.hasNext()) {
                        Entry<String, Set<String>> entry = it.next();
                        String clusterName = entry.getKey();
                        Set<String> brokerNames = entry.getValue();
                        boolean removed = brokerNames.remove(brokerNameFound);
                        if (removed) {
                            log.info("remove brokerName[{}], clusterName[{}] from clusterAddrTable, because channel destroyed",
                                brokerNameFound, clusterName);

                            if (brokerNames.isEmpty()) {
                                log.info("remove the clusterName[{}] from clusterAddrTable, because channel destroyed and no broker in this cluster",
                                    clusterName);
                                it.remove();
                            }

                            break;
                        }
                    }
                }

                //如果removeBrokerName,那么topic就不会再存放在一个叫brokerNameFound的broker中了,移除
                if (removeBrokerName) {
                    Iterator<Entry<String, List<QueueData>>> itTopicQueueTable =
                        this.topicQueueTable.entrySet().iterator();
                    while (itTopicQueueTable.hasNext()) {
                        Entry<String, List<QueueData>> entry = itTopicQueueTable.next();
                        String topic = entry.getKey();
                        List<QueueData> queueDataList = entry.getValue();

                        Iterator<QueueData> itQueueData = queueDataList.iterator();
                        while (itQueueData.hasNext()) {
                            QueueData queueData = itQueueData.next();
                            if (queueData.getBrokerName().equals(brokerNameFound)) {
                                itQueueData.remove();
                                log.info("remove topic[{} {}], from topicQueueTable, because channel destroyed",
                                    topic, queueData);
                            }
                        }

                        if (queueDataList.isEmpty()) {
                            itTopicQueueTable.remove();
                            log.info("remove topic[{}] all queue, from topicQueueTable, because channel destroyed",
                                topic);
                        }
                    }
                }
            } finally {
                this.lock.writeLock().unlock();
            }
        } catch (Exception e) {
            log.error("onChannelDestroy Exception", e);
        }
    }
}

猜你喜欢

转载自blog.csdn.net/MakeContral/article/details/84982117