Dubbo Source Code Series 10: Load Balancing in Cluster Fault Tolerance

1. Preface

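LoadBalance means load balancing. Its job is to spread network requests evenly across different machines, so that some machines in a cluster are not overloaded while others sit idle. With load balancing, each machine receives a load that matches its processing capacity, which also avoids wasting resources. Load balancing comes in two forms, software and hardware. Developers rarely work with hardware load balancers, but software load balancers are easy to come across; Nginx is a typical example.

To spread consumer calls evenly across providers, Dubbo also ships its own load balancing implementations, which keeps a few providers from being overloaded and timing out. Dubbo provides four load balancing implementations: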

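RandomLoadBalance: based on the weighted random algorithm

RoundRobinLoadBalance: based on the weighted round-robin algorithm

LeastActiveLoadBalance: based on the least active calls algorithm

ConsistentHashLoadBalance: based on the consistent hashing algorithm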

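These four load balancing algorithms are defined in the configuration file, as shown in the figure below:

2. Load Balancing Structure

The load balancing interface, LoadBalance, is shown in the figure below:

The load balancing parent class, AbstractLoadBalance, is shown in the figure below:

The load balancing implementation classes are shown in the figure below:

3. Load Balancing Source Code

As we saw in the Dubbo cluster article, an Invoker is chosen mainly through load balancing. All load balancing implementations in Dubbo extend AbstractLoadBalance, so we start with the source code of AbstractLoadBalance.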

3.1 AbstractLoadBalance

AbstractLoadBalance implements the LoadBalance interface and encapsulates the shared weight calculation logic. The source code is as follows:

public abstract class AbstractLoadBalance implements LoadBalance {
    /**
     * Calculate the weight according to the uptime proportion of warmup time
     * the new weight will be within 1(inclusive) to weight(inclusive)
     *
     * @param uptime the uptime in milliseconds
     * @param warmup the warmup time in milliseconds
     * @param weight the weight of an invoker
     * @return weight which takes warmup into account
     */
    static int calculateWarmupWeight(int uptime, int warmup, int weight) {
        // Compute the weight; the expression below is logically equivalent to (uptime / warmup) * weight
        // As the provider's uptime grows, the computed weight ww gradually approaches the configured weight
        int ww = (int) ((float) uptime / ((float) warmup / (float) weight));
        return ww < 1 ? 1 : (ww > weight ? weight : ww);
    }

    @Override
    public <T> Invoker<T> select(List<Invoker<T>> invokers, URL url, Invocation invocation) {
        if (CollectionUtils.isEmpty(invokers)) {
            return null;
        }
        // With only one invoker, return it directly; no load balancing is needed
        if (invokers.size() == 1) {
            return invokers.get(0);
        }
        // Delegate to doSelect, an abstract method implemented by each concrete subclass
        return doSelect(invokers, url, invocation);
    }

    protected abstract <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation);


    /**
     * Get the weight of the invoker's invocation which takes warmup time into account
     * if the uptime is within the warmup time, the weight will be reduce proportionally
     *
     * @param invoker    the invoker
     * @param invocation the invocation of this invoker
     * @return weight
     */
    protected int getWeight(Invoker<?> invoker, Invocation invocation) {
        // Read the configured weight from the URL; the default is 100
        int weight = invoker.getUrl().getMethodParameter(invocation.getMethodName(), WEIGHT_KEY, DEFAULT_WEIGHT);
        if (weight > 0) {
            // Read the provider's startup timestamp (remote.timestamp) from the URL; the default is 0
            long timestamp = invoker.getUrl().getParameter(REMOTE_TIMESTAMP_KEY, 0L);
            if (timestamp > 0L) {
                // Compute how long the provider has been running
                int uptime = (int) (System.currentTimeMillis() - timestamp);
                // Read the service warmup time; the default is 10 minutes
                int warmup = invoker.getUrl().getParameter(WARMUP_KEY, DEFAULT_WARMUP);
                // If the provider's uptime is shorter than the warmup time, recompute the weight, i.e. reduce it
                if (uptime > 0 && uptime < warmup) {
                    // Compute the reduced weight
                    weight = calculateWarmupWeight(uptime, warmup, weight);
                }
            }
        }
        // Return the weight if it is greater than or equal to 0; otherwise return 0
        return weight >= 0 ? weight : 0;
    }

}

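AbstractLoadBalance mainly implements the following logic:

1) The select method calls the doSelect method of the concrete implementation class to choose an invoker

2) The getWeight method computes the weight

The weight calculation ensures that a provider whose uptime is shorter than the warmup time gets a reduced weight, so it is not put under heavy load right after startup. Service warmup is an optimization technique, similar to JVM warmup. The goal is to let a service run at "low power" for a while after startup so that its efficiency gradually rises to its best level.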
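As a rough illustration of the warmup reduction described above, the sketch below copies the calculateWarmupWeight formula from the listing into a standalone class and prints the effective weight at a few uptimes. The class name WarmupWeightDemo and the sample uptimes are made up for this example; the 10-minute warmup and the weight of 100 mirror the defaults mentioned in the comments.

```java
// Illustrative sketch only; WarmupWeightDemo is a hypothetical class, not part of Dubbo.
class WarmupWeightDemo {

    // Same formula as calculateWarmupWeight in the AbstractLoadBalance listing.
    static int calculateWarmupWeight(int uptime, int warmup, int weight) {
        int ww = (int) ((float) uptime / ((float) warmup / (float) weight));
        return ww < 1 ? 1 : (ww > weight ? weight : ww);
    }

    public static void main(String[] args) {
        int warmup = 10 * 60 * 1000; // 10 minutes, in milliseconds
        int weight = 100;            // configured weight
        int[] uptimesInMinutes = {0, 1, 5, 9, 10};
        for (int minutes : uptimesInMinutes) {
            int uptime = minutes * 60 * 1000;
            // Prints weights 1, 10, 50, 90, 100 for the five sample uptimes
            System.out.println(minutes + " min -> weight "
                    + calculateWarmupWeight(uptime, warmup, weight));
        }
    }
}
```

Note that getWeight only calls this formula when 0 < uptime < warmup; the two boundary uptimes are included here just to show the clamping to the range [1, weight].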

The AbstractLoadBalance source code is simple. Next we analyze the four implementation classes, starting with the default one, RandomLoadBalance.

3.2 RandomLoadBalance

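RandomLoadBalance is the concrete implementation of the weighted random algorithm, and the idea behind it is simple. Suppose we have a group of servers, servers = [A, B, C], with weights = [5, 3, 2], so the weights sum to 10. Lay these weights out along a one-dimensional axis: the interval [0, 5) belongs to server A, [5, 8) to server B, and [8, 10) to server C. Next, use a random number generator to produce a number in [0, 10) and work out which interval it falls into. For example, the number 3 falls into server A's interval, so server A is returned. The larger a machine's weight, the wider its interval on the axis, and the more likely a generated number is to land in it. As long as the random numbers are well distributed, after many selections the share of selections each server receives approaches its share of the total weight. For example, after ten thousand selections, server A is chosen about 5000 times, server B about 3000 times, and server C about 2000 times.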
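The interval idea can be tried out in isolation with a minimal sketch like the one below. It is not Dubbo code: WeightedRandomDemo and pick are hypothetical names, and the servers and weights are the sample values from the paragraph above.

```java
import java.util.Random;

// Illustrative sketch only; WeightedRandomDemo is a hypothetical class, not part of Dubbo.
class WeightedRandomDemo {

    // Returns the index of the weight interval that contains offset.
    // Assumes 0 <= offset < sum of weights.
    static int pick(int[] weights, int offset) {
        for (int i = 0; i < weights.length; i++) {
            offset -= weights[i];
            if (offset < 0) {
                return i;
            }
        }
        throw new IllegalArgumentException("offset must be smaller than the sum of weights");
    }

    public static void main(String[] args) {
        String[] servers = {"A", "B", "C"};
        int[] weights = {5, 3, 2};
        int totalWeight = 10;
        int[] hits = new int[servers.length];
        Random random = new Random();
        // Draw 10,000 offsets in [0, 10) and count which server each one maps to
        for (int n = 0; n < 10_000; n++) {
            hits[pick(weights, random.nextInt(totalWeight))]++;
        }
        for (int i = 0; i < servers.length; i++) {
            System.out.println(servers[i] + " selected " + hits[i] + " times");
        }
    }
}
```

The counts vary from run to run, but they should stay close to the 5:3:2 ratio.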

The source code is as follows:

/**
 * This class select one provider from multiple providers randomly.
 * You can define weights for each provider:
 * If the weights are all the same then it will use random.nextInt(number of invokers).
 * If the weights are different then it will use random.nextInt(w1 + w2 + ... + wn)
 * Note that if the performance of the machine is better than others, you can set a larger weight.
 * If the performance is not so good, you can set a smaller weight.
 */
public class RandomLoadBalance extends AbstractLoadBalance {

    public static final String NAME = "random";

    /**
     * Select one invoker between a list using a random criteria
     * @param invokers List of possible invokers
     * @param url URL
     * @param invocation Invocation
     * @param <T>
     * @return The selected invoker
     */
    @Override
    protected <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation) {
        // Number of invokers
        int length = invokers.size();
        // Every invoker has the same weight?
        boolean sameWeight = true;
        // the weight of every invokers
        int[] weights = new int[length];
        // the first invoker's weight
        int firstWeight = getWeight(invokers.get(0), invocation);
        weights[0] = firstWeight;
        // The sum of weights
        int totalWeight = firstWeight;
        // Loop over the invokers, store each weight in the weights array, and sum up all weights
        for (int i = 1; i < length; i++) {
            int weight = getWeight(invokers.get(i), invocation);
            // save for later use
            weights[i] = weight;
            // Sum
            totalWeight += weight;
            if (sameWeight && weight != firstWeight) {
                // Not all invokers have the same weight
                sameWeight = false;
            }
        }
        // If the total weight is greater than 0 and the invokers do not all share the same weight, pick by weight
        if (totalWeight > 0 && !sameWeight) {
            // If (not every invoker has the same weight & at least one invoker's weight>0), select randomly based on totalWeight.
            // Pick a random number in [0, totalWeight)
            int offset = ThreadLocalRandom.current().nextInt(totalWeight);
            // Return an invoker based on the random value.
            // Subtract each provider's weight from offset in turn; when offset drops below 0, return that Invoker.
            // Example: servers = [A, B, C], weights = [5, 3, 2], offset = 7.
            // First iteration: offset - 5 = 2 >= 0, i.e. offset >= 5, so it does not fall into server A's interval.
            // Second iteration: 2 - 3 = -1 < 0, i.e. 5 <= offset < 8, so it falls into server B's interval and B's invoker is returned
            for (int i = 0; i < length; i++) {
                // Subtract the invoker's weight from offset
                offset -= weights[i];
                if (offset < 0) {
                    // offset lies in this machine's weight interval; return its invoker
                    return invokers.get(i);
                }
            }
        }
        // If all invokers have the same weight value or totalWeight=0, return evenly.
        return invokers.get(ThreadLocalRandom.current().nextInt(length));
    }

}

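The idea behind RandomLoadBalance is simple: after many requests, calls are distributed "evenly" according to the weights.

Drawback: when there are only a few calls, the random numbers may cluster, so most requests land on the same server. This drawback is not serious and can be ignored in most cases.

RandomLoadBalance is a simple and efficient load balancing implementation, which is why Dubbo uses it as the default.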

3.3 LeastActiveLoadBalance

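LeastActiveLoadBalance is least-active load balancing. A smaller active call count suggests that the provider is more efficient and can handle more requests per unit of time, so requests should go to that provider first.

Basic idea: each provider has an active count, active. Initially every provider's active count is 0. Each time a request is received the count is incremented by 1, and when the request completes it is decremented by 1. After the service has run for a while, providers with better performance handle requests faster, so their active counts drop faster, and they get new requests first.

Besides the least active count, LeastActiveLoadBalance also takes weights into account. Strictly speaking, LeastActiveLoadBalance implements a weighted least-active algorithm. For example, suppose a provider cluster has two high-performance providers whose active counts are equal at some moment. Dubbo then distributes requests according to their weights: the larger the weight, the higher the chance of receiving a new request. If the two providers have the same weight, one of them is picked at random.

The source code is as follows: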

/**
 * LeastActiveLoadBalance
 * <p>
 * Filter the number of invokers with the least number of active calls and count the weights and quantities of these invokers.
 * If there is only one invoker, use the invoker directly;
 * if there are multiple invokers and the weights are not the same, then random according to the total weight;
 * if there are multiple invokers and the same weight, then randomly called.
 */
public class LeastActiveLoadBalance extends AbstractLoadBalance {

    public static final String NAME = "leastactive";

    @Override
    protected <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation) {
        // Number of invokers
        int length = invokers.size();
        // The least active value of all invokers
        int leastActive = -1;
        // The number of invokers having the same least active value (leastActive)
        // (providers are referred to as Invokers below)
        int leastCount = 0;
        // The index of invokers having the same least active value (leastActive)
        // leastIndexes records the positions, within the invokers list, of the Invokers that share the least active value
        int[] leastIndexes = new int[length];
        // the weight of every invokers
        int[] weights = new int[length];
        // The sum of the warmup weights of all the least active invokes
        int totalWeight = 0;
        // The weight of the first least active invoke
        int firstWeight = 0;
        // Every least active invoker has the same weight value?
        boolean sameWeight = true;


        // Filter out all the least active invokers
        for (int i = 0; i < length; i++) {
            Invoker<T> invoker = invokers.get(i);
            // Get the active number of the invoke
            int active = RpcStatus.getStatus(invoker.getUrl(), invocation.getMethodName()).getActive();
            // Get the weight of the invoke configuration. The default value is 100.
            // The value returned by getWeight already takes warmup into account
            int afterWarmup = getWeight(invoker, invocation);
            // save for later use
            weights[i] = afterWarmup;
            // If it is the first invoker or the active number of the invoker is less than the current least active number
            // Track the least active value in leastActive
            if (leastActive == -1 || active < leastActive) {
                // Reset the active number of the current invoker to the least active number
                leastActive = active;
                // Reset the number of least active invokers
                leastCount = 1;
                // Put the first least active invoker first in leastIndexes
                leastIndexes[0] = i;
                // Reset totalWeight
                totalWeight = afterWarmup;
                // Record the weight the first least active invoker
                firstWeight = afterWarmup;
                // Each invoke has the same weight (only one invoker here)
                sameWeight = true;
                // If current invoker's active value equals with leastActive, then accumulating.
            } else if (active == leastActive) {
                // Record the index of the least active invoker in leastIndexes order
                leastIndexes[leastCount++] = i;
                // Accumulate the total weight of the least active invoker
                totalWeight += afterWarmup;
                // If every invoker has the same weight?
                // If this invoker's weight differs from firstWeight, set sameWeight to false
                if (sameWeight && i > 0
                        && afterWarmup != firstWeight) {
                    sameWeight = false;
                }
            }
        }
        // Choose an invoker from all the least active invokers
        // Only one invoker has the least active value; return it directly
        if (leastCount == 1) {
            // If we got exactly one invoker having the least active value, return this invoker directly.
            return invokers.get(leastIndexes[0]);
        }
        // Several invokers share the least active value, but their weights differ
        if (!sameWeight && totalWeight > 0) {
            // If (not every invoker has the same weight & at least one invoker's weight>0), select randomly based on
            // totalWeight.
            // Pick a random number in [0, totalWeight)
            int offsetWeight = ThreadLocalRandom.current().nextInt(totalWeight);
            // Return an invoker based on the random value.
            for (int i = 0; i < leastCount; i++) {
                // Read the value at position i, i.e. the invoker's index
                int leastIndex = leastIndexes[i];
                // Subtract this invoker's weight from offsetWeight
                offsetWeight -= weights[leastIndex];
                if (offsetWeight < 0) {
                    // Return this invoker
                    return invokers.get(leastIndex);
                }
            }
        }
        // If all invokers have the same weight value or totalWeight=0, return evenly.
        // If the least active invokers share the same weight, pick one of them at random
        return invokers.get(leastIndexes[ThreadLocalRandom.current().nextInt(leastCount)]);
    }
}

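In short, LeastActiveLoadBalance chooses an invoker according to three rules:

1) If only one invoker has the least active count, return it directly

2) If several invokers share the least active count but have different weights, choose among them by weight

3) If several invokers share the least active count and also have the same weight, pick one of them at random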
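The three rules can be reproduced on plain arrays, without any Dubbo types. The following is a simplified sketch under that assumption: LeastActiveDemo is a hypothetical class, and the actives and weights arrays stand in for the values that the real code reads from RpcStatus and getWeight. It assumes at least one provider.

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch only; LeastActiveDemo is a hypothetical class, not part of Dubbo.
class LeastActiveDemo {

    // Returns the index of the chosen provider. Both arrays must be non-empty and equally long.
    static int select(int[] actives, int[] weights) {
        int leastActive = -1;
        int leastCount = 0;
        int[] leastIndexes = new int[actives.length];
        int totalWeight = 0;
        int firstWeight = 0;
        boolean sameWeight = true;
        for (int i = 0; i < actives.length; i++) {
            if (leastActive == -1 || actives[i] < leastActive) {
                // New minimum found: restart the bookkeeping from this provider
                leastActive = actives[i];
                leastCount = 1;
                leastIndexes[0] = i;
                totalWeight = weights[i];
                firstWeight = weights[i];
                sameWeight = true;
            } else if (actives[i] == leastActive) {
                // Another provider ties with the current minimum
                leastIndexes[leastCount++] = i;
                totalWeight += weights[i];
                if (weights[i] != firstWeight) {
                    sameWeight = false;
                }
            }
        }
        // Rule 1: a single least active provider
        if (leastCount == 1) {
            return leastIndexes[0];
        }
        ThreadLocalRandom random = ThreadLocalRandom.current();
        // Rule 2: several candidates with different weights, choose by weight
        if (!sameWeight && totalWeight > 0) {
            int offset = random.nextInt(totalWeight);
            for (int i = 0; i < leastCount; i++) {
                offset -= weights[leastIndexes[i]];
                if (offset < 0) {
                    return leastIndexes[i];
                }
            }
        }
        // Rule 3: several candidates with the same weight, choose uniformly
        return leastIndexes[random.nextInt(leastCount)];
    }

    public static void main(String[] args) {
        // Provider 1 has the fewest active calls, so it is always chosen
        System.out.println(select(new int[]{3, 1, 2}, new int[]{100, 100, 100}));
        // Providers 0 and 1 tie; provider 0 should win roughly three times out of four
        System.out.println(select(new int[]{1, 1, 9}, new int[]{300, 100, 100}));
    }
}
```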

3.4 ConsistentHashLoadBalance

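The consistent hashing algorithm was proposed in 1997 by Karger and his collaborators at MIT, originally for load balancing in large-scale cache systems.

The basic idea is as follows:

1) First, generate a hash for each cache node from its IP or other information, and project that hash onto a ring covering [0, 2^32 - 1]

2) When a query or write request arrives, generate a hash for the cache item's key. Then find the first cache node whose hash is greater than or equal to that value, and query or write the cache item on that node

3) If that node goes down, the next query or write for the cache item simply finds the next cache node whose hash is greater than the item's hash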
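The three steps above can be sketched with a sorted map. In the example below, HashRingDemo is a hypothetical class and the node positions (1000, 2000, and so on) are hand-picked instead of being computed from an IP, purely so that the behavior is easy to follow. It assumes the ring is never empty when locate is called.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; HashRingDemo is a hypothetical class with hand-picked positions.
class HashRingDemo {

    // Ring position -> node name, kept sorted by position
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(long position, String node) {
        ring.put(position, node);
    }

    void removeNode(long position) {
        ring.remove(position);
    }

    // Finds the first node at or after keyHash, wrapping around to the first node on the ring.
    String locate(long keyHash) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(keyHash);
        if (entry == null) {
            entry = ring.firstEntry();
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        HashRingDemo demo = new HashRingDemo();
        demo.addNode(1000L, "cache-1");
        demo.addNode(2000L, "cache-2");
        demo.addNode(3000L, "cache-3");
        demo.addNode(4000L, "cache-4");

        System.out.println(demo.locate(2500L)); // cache-3
        demo.removeNode(3000L);                 // cache-3 goes down
        System.out.println(demo.locate(2500L)); // cache-4
        System.out.println(demo.locate(4500L)); // wraps around to cache-1
    }
}
```

Only the keys that belonged to the removed node move to a different node; keys owned by the other nodes stay where they were, which is the main appeal of this scheme for caches.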

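The overall effect is shown in the figure below, where each cache node occupies a position on the ring. A cache item is stored on or read from the first cache node whose hash is greater than or equal to the hash of the item's key. For example, the cache item marked by the green dot below is stored on node cache-2. Because cache-3 is down, cache items that would have gone to it end up on node cache-4.

Now let's see how consistent hashing is applied in Dubbo. Replacing the cache nodes in the figure above with Dubbo service providers gives the figure below:

In the figure above, nodes of the same color belong to the same service provider, for example Invoker1-1, Invoker1-2, ..., Invoker1-160. The purpose of these virtual nodes is to spread each Invoker around the ring and avoid data skew. Data skew means that, because nodes are not spread out enough, a large share of requests lands on one node while the other nodes receive only a few. This is shown in the figure below:

In the figure above, Invoker-1 and Invoker-2 are unevenly distributed on the ring, so 75% of the requests in the system go to Invoker-1 and only 25% go to Invoker-2. The fix is to introduce virtual nodes, which balance the request volume across nodes.

With consistent hashing covered, we now analyze the ConsistentHashLoadBalance source code in Dubbo: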

/**
 * ConsistentHashLoadBalance
 */
public class ConsistentHashLoadBalance extends AbstractLoadBalance {
    public static final String NAME = "consistenthash";

    /**
     * Hash nodes name
     */
    public static final String HASH_NODES = "hash.nodes";

    /**
     * Hash arguments name
     */
    public static final String HASH_ARGUMENTS = "hash.arguments";

    private final ConcurrentMap<String, ConsistentHashSelector<?>> selectors = new ConcurrentHashMap<String, ConsistentHashSelector<?>>();

    @SuppressWarnings("unchecked")
    @Override
    protected <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation) {
        // Get the method name
        String methodName = RpcUtils.getMethodName(invocation);
        // key = service key + "." + method name, e.g. com.xxx.DemoService.sayHello
        String key = invokers.get(0).getUrl().getServiceKey() + "." + methodName;
        // Get the identity hash code of the invokers list, i.e. the value the default Object.hashCode() would return,
        // regardless of the hashCode() override in the List implementation
        int identityHashCode = System.identityHashCode(invokers);
        ConsistentHashSelector<T> selector = (ConsistentHashSelector<T>) selectors.get(key);
        // Check whether invokers has changed. A different list object means the set of providers was updated;
        // providers may have been added or removed.
        // In that case selector.identityHashCode != identityHashCode holds
        if (selector == null || selector.identityHashCode != identityHashCode) {
            // Create a ConsistentHashSelector and store it in the selectors map
            selectors.put(key, new ConsistentHashSelector<T>(invokers, methodName, identityHashCode));
            selector = (ConsistentHashSelector<T>) selectors.get(key);
        }
        // Call ConsistentHashSelector.select to choose an Invoker
        return selector.select(invocation);
    }

    private static final class ConsistentHashSelector<T> {

        // A TreeMap stores the Invokers' virtual nodes
        private final TreeMap<Long, Invoker<T>> virtualInvokers;

        private final int replicaNumber;

        private final int identityHashCode;

        private final int[] argumentIndex;

        ConsistentHashSelector(List<Invoker<T>> invokers, String methodName, int identityHashCode) {
            this.virtualInvokers = new TreeMap<Long, Invoker<T>>();
            this.identityHashCode = identityHashCode;
            URL url = invokers.get(0).getUrl();
            // Read the number of virtual nodes from the URL; the default is 160
            this.replicaNumber = url.getMethodParameter(methodName, HASH_NODES, 160);
            // Read the indexes of the arguments that take part in hashing; by default only the first argument is hashed
            String[] index = COMMA_SPLIT_PATTERN.split(url.getMethodParameter(methodName, HASH_ARGUMENTS, "0"));
            argumentIndex = new int[index.length];
            for (int i = 0; i < index.length; i++) {
                argumentIndex[i] = Integer.parseInt(index[i]);
            }
            // Each invoker is expanded into replicaNumber virtual nodes (160 by default)
            for (Invoker<T> invoker : invokers) {
                // Get the provider address (host:port)
                String address = invoker.getUrl().getAddress();
                for (int i = 0; i < replicaNumber / 4; i++) {
                    // Run MD5 over address + i to get a 16-byte array
                    byte[] digest = md5(address + i);
                    // Hash four slices of digest to get four long values in [0, 2^32 - 1]
                    for (int h = 0; h < 4; h++) {
                        // h = 0: bit operations on the 4 bytes of digest at indexes 0, 1, 2, 3
                        // h = 1: bit operations on the 4 bytes of digest at indexes 4, 5, 6, 7
                        // h = 2: bit operations on the 4 bytes of digest at indexes 8, 9, 10, 11
                        // h = 3: bit operations on the 4 bytes of digest at indexes 12, 13, 14, 15
                        long m = hash(digest, h);
                        // Store the hash-to-invoker mapping in virtualInvokers.
                        // virtualInvokers must support efficient lookups, so a TreeMap is used
                        virtualInvokers.put(m, invoker);
                    }
                }
            }
        }

        public Invoker<T> select(Invocation invocation) {
            // Convert the arguments into a key
            String key = toKey(invocation.getArguments());
            // Compute the MD5 digest of the key
            byte[] digest = md5(key);
            // Hash the first four bytes of digest, then pass the hash to selectForKey to find a suitable Invoker
            return selectForKey(hash(digest, 0));
        }

        private String toKey(Object[] args) {
            // Concatenate the arguments selected by argumentIndex
            StringBuilder buf = new StringBuilder();
            for (int i : argumentIndex) {
                if (i >= 0 && i < args.length) {
                    buf.append(args[i]);
                }
            }
            return buf.toString();
        }

        private Invoker<T> selectForKey(long hash) {
            // Find the first entry in the TreeMap whose key is greater than or equal to hash
            Map.Entry<Long, Invoker<T>> entry = virtualInvokers.ceilingEntry(hash);
            // If hash is greater than the largest Invoker position on the ring, entry is null and the TreeMap's first entry is used instead
            if (entry == null) {
                entry = virtualInvokers.firstEntry();
            }
            // Return the Invoker
            return entry.getValue();
        }

        private long hash(byte[] digest, int number) {
            return (((long) (digest[3 + number * 4] & 0xFF) << 24)
                    | ((long) (digest[2 + number * 4] & 0xFF) << 16)
                    | ((long) (digest[1 + number * 4] & 0xFF) << 8)
                    | (digest[number * 4] & 0xFF))
                    & 0xFFFFFFFFL;
        }

        private byte[] md5(String value) {
            MessageDigest md5;
            try {
                md5 = MessageDigest.getInstance("MD5");
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException(e.getMessage(), e);
            }
            md5.reset();
            byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
            md5.update(bytes);
            return md5.digest();
        }

    }

}

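ConsistentHashLoadBalance implements the following logic:

1) First check whether the invoker list has changed; if so, create a new ConsistentHashSelector and store it in the map

2) When creating the ConsistentHashSelector, read the number of virtual nodes and the indexes of the arguments that take part in hashing (by default only the first argument is hashed), expand each invoker into 160 virtual nodes, compute the hash of each virtual node, and store it in the TreeMap

3) Call ConsistentHashSelector.select to choose an Invoker. Note that the hash used by ConsistentHashLoadBalance depends only on argument values, so requests with the same argument values are dispatched to the same provider as long as the provider list does not change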
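To see the virtual nodes and the "same arguments, same provider" property together, here is a condensed sketch that works on plain address strings instead of Invokers. VirtualNodeRingDemo and the sample addresses are hypothetical; the hash and md5 helpers follow the same byte layout as the listing above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; VirtualNodeRingDemo is a hypothetical class, not part of Dubbo.
class VirtualNodeRingDemo {

    private final TreeMap<Long, String> virtualNodes = new TreeMap<>();

    VirtualNodeRingDemo(List<String> addresses, int replicaNumber) {
        for (String address : addresses) {
            // Each MD5 digest yields four ring positions, hence replicaNumber / 4 digests
            for (int i = 0; i < replicaNumber / 4; i++) {
                byte[] digest = md5(address + i);
                for (int h = 0; h < 4; h++) {
                    virtualNodes.put(hash(digest, h), address);
                }
            }
        }
    }

    // Maps an argument value to a provider address; equal arguments give equal results.
    String select(String argument) {
        long keyHash = hash(md5(argument), 0);
        Map.Entry<Long, String> entry = virtualNodes.ceilingEntry(keyHash);
        if (entry == null) {
            entry = virtualNodes.firstEntry();
        }
        return entry.getValue();
    }

    int size() {
        return virtualNodes.size();
    }

    // Reads 4 bytes of digest, starting at number * 4, as an unsigned little-endian 32-bit value.
    static long hash(byte[] digest, int number) {
        return (((long) (digest[3 + number * 4] & 0xFF) << 24)
                | ((long) (digest[2 + number * 4] & 0xFF) << 16)
                | ((long) (digest[1 + number * 4] & 0xFF) << 8)
                | (digest[number * 4] & 0xFF))
                & 0xFFFFFFFFL;
    }

    static byte[] md5(String value) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return md5.digest(value.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        List<String> addresses = Arrays.asList("10.0.0.1:20880", "10.0.0.2:20880", "10.0.0.3:20880");
        VirtualNodeRingDemo ring = new VirtualNodeRingDemo(addresses, 160);
        System.out.println("virtual nodes on the ring: " + ring.size());
        // The same argument is always routed to the same address
        System.out.println(ring.select("user-42"));
        System.out.println(ring.select("user-42"));
    }
}
```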

3.5 RoundRobinLoadBalance

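RoundRobinLoadBalance is the implementation of weighted round-robin load balancing. Let's first look at what weighted round robin means.

Round robin: requests are assigned to each server in turn. For example, with three servers A, B, and C, the first request goes to server A, the second to server B, the third to server C, and the fourth to server A again. This process is called round robin. Round robin is stateless in the sense that it ignores each server's current load, is simple to implement, and suits scenarios where all servers have similar performance.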
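Plain round robin needs little more than a shared counter. The sketch below is one possible way to write it, not how Dubbo does it; RoundRobinDemo is a hypothetical class and the server names are the ones from the example above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; RoundRobinDemo is a hypothetical class, not part of Dubbo.
class RoundRobinDemo {

    private final String[] servers;
    // Counts how many requests have been dispatched so far
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinDemo(String[] servers) {
        this.servers = servers;
    }

    // Returns the next server in turn; floorMod keeps the index valid even after the counter overflows.
    String next() {
        int index = Math.floorMod(counter.getAndIncrement(), servers.length);
        return servers[index];
    }

    public static void main(String[] args) {
        RoundRobinDemo demo = new RoundRobinDemo(new String[]{"A", "B", "C"});
        StringBuilder order = new StringBuilder();
        for (int i = 0; i < 4; i++) {
            order.append(demo.next());
        }
        System.out.println(order); // ABCA
    }
}
```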

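Weighted round robin: in practice, servers often differ in performance, and it is unreasonable to keep sending poorly performing servers the same number of requests as the others. The round-robin process therefore needs weights to control each server's load. With weights, the share of requests each server receives is close or equal to its share of the total weight. For example, if servers A, B, and C have weights in the ratio 5:2:1, then out of 8 requests server A receives 5, server B receives 2, and server C receives 1.

With weighted round robin covered, we now analyze the RoundRobinLoadBalance source code: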

/**
 * Round robin load balance.
 */
public class RoundRobinLoadBalance extends AbstractLoadBalance {
    public static final String NAME = "roundrobin";
    
    private static final int RECYCLE_PERIOD = 60000;
    
    protected static class WeightedRoundRobin {
        // The provider's weight
        private int weight;
        // The current weight value
        private AtomicLong current = new AtomicLong(0);
        // Time of the last update
        private long lastUpdate;
        public int getWeight() {
            return weight;
        }
        public void setWeight(int weight) {
            this.weight = weight;
            // Reset the current weight to 0 whenever the weight is set
            current.set(0);
        }
        public long increaseCurrent() {
            // current weight + the provider's weight
            return current.addAndGet(weight);
        }
        public void sel(int total) {
            // current weight minus the sum of all weights
            current.addAndGet(-1 * total);
        }
        public long getLastUpdate() {
            return lastUpdate;
        }
        public void setLastUpdate(long lastUpdate) {
            this.lastUpdate = lastUpdate;
        }
    }

    // The outer key is service key + method name; the inner map goes from url to WeightedRoundRobin. The url can be regarded as the provider's id
    private ConcurrentMap<String, ConcurrentMap<String, WeightedRoundRobin>> methodWeightMap = new ConcurrentHashMap<String, ConcurrentMap<String, WeightedRoundRobin>>();
    // Atomic update lock
    private AtomicBoolean updateLock = new AtomicBoolean();
    
    /**
     * get invoker addr list cached for specified invocation
     * <p>
     * <b>for unit test only</b>
     * 
     * @param invokers
     * @param invocation
     * @return
     */
    protected <T> Collection<String> getInvokerAddrList(List<Invoker<T>> invokers, Invocation invocation) {
        String key = invokers.get(0).getUrl().getServiceKey() + "." + invocation.getMethodName();
        Map<String, WeightedRoundRobin> map = methodWeightMap.get(key);
        if (map != null) {
            return map.keySet();
        }
        return null;
    }
    
    @Override
    protected <T> Invoker<T> doSelect(List<Invoker<T>> invokers, URL url, Invocation invocation) {
        // key = service key + "." + method name, e.g. com.xxx.DemoService.sayHello
        String key = invokers.get(0).getUrl().getServiceKey() + "." + invocation.getMethodName();
        ConcurrentMap<String, WeightedRoundRobin> map = methodWeightMap.get(key);
        if (map == null) {
            methodWeightMap.putIfAbsent(key, new ConcurrentHashMap<String, WeightedRoundRobin>());
            map = methodWeightMap.get(key);
        }
        int totalWeight = 0;
        // Largest current value seen so far
        long maxCurrent = Long.MIN_VALUE;
        long now = System.currentTimeMillis();
        Invoker<T> selectedInvoker = null;
        WeightedRoundRobin selectedWRR = null;
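        // The loop below does the following:
        //   1. Iterates over the Invoker list and checks whether each Invoker has a WeightedRoundRobin; creates one if not
        //   2. Checks whether the Invoker's weight has changed; if so, updates the weight field of its WeightedRoundRobin
        //   3. Adds the Invoker's own weight to its current field, i.e. current += weight
        //   4. Sets the lastUpdate field, i.e. lastUpdate = now
        //   5. Finds the Invoker with the largest current, together with its WeightedRoundRobin, and keeps them for later use
        //   6. Computes the total weight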
        for (Invoker<T> invoker : invokers) {
            String identifyString = invoker.getUrl().toIdentityString();
            WeightedRoundRobin weightedRoundRobin = map.get(identifyString);
            // Get the weight of the current invoker
            int weight = getWeight(invoker, invocation);

            if (weightedRoundRobin == null) {
                // Create a WeightedRoundRobin, set its weight, and store it in the map
                weightedRoundRobin = new WeightedRoundRobin();
                weightedRoundRobin.setWeight(weight);
                map.putIfAbsent(identifyString, weightedRoundRobin);
            }
            if (weight != weightedRoundRobin.getWeight()) {
                //weight changed
                // The weight has changed; update it to the new value
                weightedRoundRobin.setWeight(weight);
            }
            // current += weight; this value decides whether the invoker gets picked in this round
            long cur = weightedRoundRobin.increaseCurrent();
            // Set the last update time
            weightedRoundRobin.setLastUpdate(now);
            // Track the invoker with the largest current weight
            if (cur > maxCurrent) {
                // Record the largest current weight in maxCurrent
                maxCurrent = cur;
                // The invoker picked in this round
                selectedInvoker = invoker;
                selectedWRR = weightedRoundRobin;
            }
            // Accumulate the total weight
            totalWeight += weight;
        }

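        // Check the <identifyString, WeightedRoundRobin> entries and drop nodes that have not been updated for a long time. Such a node may be down: invokers no longer contains it, so its lastUpdate is never refreshed.
        // Once a node has gone without an update for longer than the threshold, it is removed. The threshold is 60 seconds.
        if (!updateLock.get() && invokers.size() != map.size()) {
            // The lock was acquired via CAS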
            if (updateLock.compareAndSet(false, true)) {
                try {
                    // copy -> modify -> update reference
                    ConcurrentMap<String, WeightedRoundRobin> newMap = new ConcurrentHashMap<String, WeightedRoundRobin>();
                    // Copy map into newMap
                    newMap.putAll(map);
                    Iterator<Entry<String, WeightedRoundRobin>> it = newMap.entrySet().iterator();
                    while (it.hasNext()) {
                        Entry<String, WeightedRoundRobin> item = it.next();
                        // Remove invokers whose last update is more than 60 seconds before now
                        if (now - item.getValue().getLastUpdate() > RECYCLE_PERIOD) {
                            it.remove();
                        }
                    }
                    // Publish the cleaned-up map in methodWeightMap
                    methodWeightMap.put(key, newMap);
                } finally {
                    updateLock.set(false);
                }
            }
        }
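        // The invoker picked in this round, i.e. the one with the largest current weight
        if (selectedInvoker != null) {
            // Subtract the total weight from current, i.e. current -= totalWeight
            selectedWRR.sel(totalWeight);
            // Return the Invoker with the largest current weight
            return selectedInvoker;
        }
        // should not happen here
        // No invoker was selected by weight; fall back to the first invoker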
        return invokers.get(0);
    }

}

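Weighted round robin in RoundRobinLoadBalance is harder to follow, so let's walk through a concrete example. Suppose there are three machines A, B, and C with weights = [5,2,1], so the total weight is 8. The process is as follows:

Round 1: after the loop, the current values of A, B, and C are [5,2,1]. A has the largest current, so A is returned; before returning, A's current becomes 5-8 = -3, giving new current values [-3,2,1]

Round 2: after the loop, the current values of A, B, and C are [2,4,2]. B has the largest current, so B is returned; before returning, B's current becomes 4-8 = -4, giving new current values [2,-4,2]

Round 3: after the loop, the current values of A, B, and C are [7,-2,3]. A has the largest current, so A is returned; before returning, A's current becomes 7-8 = -1, giving new current values [-1,-2,3]

Round 4: after the loop, the current values of A, B, and C are [4,0,4]. A and C tie, and because the comparison cur > maxCurrent is strict, the machine visited first, A, is kept and returned; before returning, A's current becomes 4-8 = -4, giving new current values [-4,0,4]

Round 5: after the loop, the current values of A, B, and C are [1,2,5]. C has the largest current, so C is returned; before returning, C's current becomes 5-8 = -3, giving new current values [1,2,-3]

Round 6: after the loop, the current values of A, B, and C are [6,4,-2]. A has the largest current, so A is returned; before returning, A's current becomes 6-8 = -2, giving new current values [-2,4,-2]

Round 7: after the loop, the current values of A, B, and C are [3,6,-1]. B has the largest current, so B is returned; before returning, B's current becomes 6-8 = -2, giving new current values [3,-2,-1]

Round 8: after the loop, the current values of A, B, and C are [8,0,0]. A has the largest current, so A is returned; before returning, A's current becomes 8-8 = 0, giving new current values [0,0,0]

After these 8 rounds of weighted round robin, the machines served requests in the order [A, B, A, A, C, A, B, A], so requests to different servers are interleaved. Across the 8 requests, A, B, and C handled 5, 2, and 1 requests respectively, which matches the weight ratio 5:2:1 exactly, and the current values are back to [0,0,0], so the cycle then repeats.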
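The selection rule can be replayed with a few lines of code. This is a single-threaded simulation with fixed weights, not the Dubbo class: SmoothWeightedRoundRobinDemo and simulate are hypothetical names, and the warmup, weight-change, and cleanup logic of the real implementation is left out. It assumes at least one server.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; SmoothWeightedRoundRobinDemo is a hypothetical class, not part of Dubbo.
class SmoothWeightedRoundRobinDemo {

    // Replays the "current += weight, pick the max, current -= totalWeight" rule for the given number of rounds.
    static List<String> simulate(String[] names, int[] weights, int rounds) {
        long[] current = new long[weights.length];
        int totalWeight = 0;
        for (int w : weights) {
            totalWeight += w;
        }
        List<String> order = new ArrayList<>();
        for (int r = 0; r < rounds; r++) {
            int selected = -1;
            long maxCurrent = Long.MIN_VALUE;
            for (int i = 0; i < weights.length; i++) {
                // Every server gains its own weight in each round
                current[i] += weights[i];
                // Strict comparison: on a tie, the server visited first is kept
                if (current[i] > maxCurrent) {
                    maxCurrent = current[i];
                    selected = i;
                }
            }
            // The selected server pays back the total weight
            current[selected] -= totalWeight;
            order.add(names[selected]);
        }
        return order;
    }

    public static void main(String[] args) {
        List<String> order = simulate(new String[]{"A", "B", "C"}, new int[]{5, 2, 1}, 8);
        System.out.println(order); // prints [A, B, A, A, C, A, B, A]
    }
}
```

Because the sum of all current values is zero after every round, a heavily weighted server cannot be picked many times in a row without the others catching up, which is what produces the interleaved order.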

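4. Summary

This article analyzed the source code of the four load balancing strategies in Dubbo in detail. The key to understanding the load balancing code is understanding the algorithms. Hopefully readers now have a working understanding of load balancing; corrections are welcome.

Reference: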

https://dubbo.apache.org/zh-cn/docs/source_code_guide/loadbalance.html