SOFA Source Code Analysis - Load Balancing and Consistent Hash

Foreword

SOFA ships with built-in load balancing and supports five algorithms: random (the default), local preference, round-robin, consistent hash, and weighted round-robin (not recommended; it has been marked as deprecated).

Let's walk through how each is implemented (the focus is on consistent hashing).

Source code analysis

The shared logic lives in the AbstractLoadBalancer class, and each subclass implements the doSelect method:


public abstract ProviderInfo doSelect(SofaRequest invocation, List<ProviderInfo> providerInfos);

Random is the default algorithm, implemented by the RandomLoadBalancer class. Ignoring weights, it boils down to providerInfos.get(random.nextInt(size)). With weights, it draws a random number in [0, totalWeight), then walks the provider list subtracting each provider's weight until the number drops below 0, and picks that provider. Since a provider with a larger weight covers a larger slice of the range, it is chosen proportionally more often, which is exactly weighted random selection.
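
Here is a minimal, self-contained sketch of that weighted walk (WeightedNode is a stand-in for ProviderInfo, not SOFA's actual class):

import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Simplified stand-in for ProviderInfo; not SOFA's actual class.
record WeightedNode(String host, int weight) {}

class WeightedRandomSketch {
    static WeightedNode select(List<WeightedNode> nodes) {
        // assumes all weights are positive
        int totalWeight = nodes.stream().mapToInt(WeightedNode::weight).sum();
        // pick a point in [0, totalWeight)
        int offset = ThreadLocalRandom.current().nextInt(totalWeight);
        // walk the list, subtracting weights until the point falls inside a node's slice
        for (WeightedNode node : nodes) {
            offset -= node.weight();
            if (offset < 0) {
                return node; // a node's chance is weight / totalWeight
            }
        }
        return nodes.get(nodes.size() - 1); // unreachable when weights are positive
    }
}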

The local-preference algorithm matches providers against the local machine's address: providers running on the same host are preferred, and one of them is picked at random; if none match, it falls back to a plain random pick across all providers.
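
A rough sketch of that idea (the parameter names and string-based hosts are my simplification, not SOFA's exact code):

import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

class LocalPreferenceSketch {
    // prefer providers whose host equals the local address; otherwise fall back to plain random
    static String select(List<String> providerHosts, String localAddress) {
        List<String> local = providerHosts.stream()
            .filter(localAddress::equals)
            .toList();
        List<String> candidates = local.isEmpty() ? providerHosts : local;
        return candidates.get(ThreadLocalRandom.current().nextInt(candidates.size()));
    }
}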

Round-robin picks providers one by one in order, implemented with an incrementing counter and the modulo operator.
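
A minimal round-robin sketch (illustrative only, not SOFA's exact code):

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class RoundRobinSketch {
    private final AtomicInteger counter = new AtomicInteger(0);

    // each call advances the counter; masking keeps the index non-negative when the counter overflows
    <T> T select(List<T> nodes) {
        int index = (counter.getAndIncrement() & Integer.MAX_VALUE) % nodes.size();
        return nodes.get(index);
    }
}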

Then there is consistent hash, the focus of this post. It may help to review the consistent hash demo we wrote before: implement a consistent hash algorithm yourself.

The concrete implementation is the ConsistentHashLoadBalancer class. It maintains a Map in which each service corresponds to a selector, and each selector internally maintains a TreeMap: the hash ring. SOFA hashes all provider nodes evenly onto this ring using virtual nodes. To pick a node for a request (assuming the provider list has not changed), the request's key is hashed and the first node on the ring with a greater hash value is taken, so the same request finds the same node every time (the hash is computed from the first argument).

Let's take a look at the specific implementation.

First look at the doSelect method:

@Override
public ProviderInfo doSelect(SofaRequest request, List<ProviderInfo> providerInfos) {
    String interfaceId = request.getInterfaceName();
    String method = request.getMethodName();
    String key = interfaceId + "#" + method;
    int hashcode = providerInfos.hashCode(); // used to detect whether the provider list has changed
    Selector selector = selectorCache.get(key);
    if (selector == null // no cached selector yet
        ||
        selector.getHashCode() != hashcode) { // or the provider list has changed
        selector = new Selector(interfaceId, method, providerInfos, hashcode);
        selectorCache.put(key, selector);
    }
    return selector.select(request);
}

It looks up the selector for the interface name plus method name in the map. If there is none, or the provider list has changed, it creates a new one. This differs a bit from the consistent hash design used for caches.

The point of consistent hashing for caches is: when the node list changes, say a node is added or removed, cached keys can still find their cache node through the same hash algorithm, and at most the data on one node's slice of the ring is invalidated.

But the point of consistent hashing for an RPC service is simply that the same request should always land on the same node.

Here it is impossible to tell which node was added or removed, so SOFA simply rebuilds the selector from scratch.

Calling the selector's select method then returns a provider node.

First look at the constructor of the selector:

public Selector(String interfaceId, String method, List<ProviderInfo> actualNodes, int hashcode) {
    this.interfaceId = interfaceId;
    this.method = method;
    this.hashcode = hashcode;
    // build the virtual-node ring (by default each provider gets 128 virtual nodes; more nodes distribute more evenly)
    this.virtualNodes = new TreeMap<Long, ProviderInfo>();
    int num = 128;
    for (ProviderInfo providerInfo : actualNodes) {
        for (int i = 0; i < num / 4; i++) {
            byte[] digest = messageDigest(providerInfo.getHost() + providerInfo.getPort() + i);
            for (int h = 0; h < 4; h++) {
                long m = hash(digest, h);
                virtualNodes.put(m, providerInfo);
            }
        }
    }
}

The main job is to build the virtual nodes into a TreeMap, just like our earlier implementation. So how are the virtual nodes designed?

SOFA assigns 128 virtual nodes to each provider and stores them all in the TreeMap, so 128 keys reference the same ProviderInfo object. The hashing runs MD5 and then some extra bit manipulation for balance: each of the num / 4 loop iterations computes one MD5 digest and slices 4 hash values out of it, giving 32 * 4 = 128 ring positions per provider.
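
The hash(digest, h) helper is not quoted above. A typical Ketama-style derivation builds each hash from 4 bytes of the MD5 digest; this is a sketch of the common pattern, and SOFA's exact bit layout may differ:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class KetamaHashSketch {
    // MD5 of the input: 16 bytes, enough for 4 distinct 4-byte hashes
    static byte[] messageDigest(String value) {
        try {
            return MessageDigest.getInstance("MD5")
                .digest(value.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // build an unsigned 32-bit value from bytes [h*4, h*4+3] of the digest
    static long hash(byte[] digest, int h) {
        return (((long) (digest[3 + h * 4] & 0xFF) << 24)
            | ((long) (digest[2 + h * 4] & 0xFF) << 16)
            | ((long) (digest[1 + h * 4] & 0xFF) << 8)
            | ((long) (digest[h * 4] & 0xFF)))
            & 0xFFFFFFFFL;
    }
}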

When select is called, how does it find the same node every time?

Code:

private ProviderInfo sekectForKey(long hash) {
    // exact hit on a virtual node
    ProviderInfo providerInfo = virtualNodes.get(hash);
    if (providerInfo == null) {
        // otherwise take the first virtual node clockwise from the hash
        SortedMap<Long, ProviderInfo> tailMap = virtualNodes.tailMap(hash);
        if (tailMap.isEmpty()) {
            // past the last node: wrap around to the start of the ring
            hash = virtualNodes.firstKey();
        } else {
            hash = tailMap.firstKey();
        }
        providerInfo = virtualNodes.get(hash);
    }
    return providerInfo;
}

The request's first argument is hashed, and the lookup finds the first virtual node whose hash value is greater than that hash; if no larger node exists, it wraps around to the smallest one (back to the start of the ring).
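
The select method that drives this lookup is not quoted above; roughly, it hashes the request's first argument and delegates to sekectForKey. A sketch of what it might look like inside the same Selector class (buildKeyOfHash is an assumed helper name):

public ProviderInfo select(SofaRequest request) {
    // derive the ring key from the request's first argument
    String key = buildKeyOfHash(request.getMethodArgs());
    byte[] digest = messageDigest(key);
    // only the first 4 bytes of the digest are used for the lookup
    return sekectForKey(hash(digest, 0));
}

// assumed helper: stringify the first argument (empty string if there is none)
private String buildKeyOfHash(Object[] args) {
    return (args == null || args.length == 0) ? "" : String.valueOf(args[0]);
}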

This is the standard consistent hash lookup: as long as the provider list is unchanged, every identical request lands on the same node.

Summary

RPC consistent hashing and cache consistent hashing serve different purposes.
The purpose for caches: when cache nodes are added to or removed from the cluster, requests for the same key still reach the same node (only a small slice of keys is invalidated by the change). An ordinary modulo-style mapping would instead remap almost every key, causing a cache avalanche and possibly taking the database down with it.
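
To make that concrete, here is a tiny self-contained demo (illustrative only, not SOFA code) counting how many keys a naive hash % n mapping moves when the cluster grows from 4 to 5 nodes:

class ModuloRemapDemo {
    public static void main(String[] args) {
        int keys = 10_000;
        int moved = 0;
        for (int key = 0; key < keys; key++) {
            // naive mapping: node = key % nodeCount; compare 4 nodes vs 5 nodes
            if (key % 4 != key % 5) {
                moved++;
            }
        }
        // prints "8000 of 10000 keys moved": 80% of keys change node
        System.out.println(moved + " of " + keys + " keys moved");
    }
}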

The purpose for RPC: the same request (same first argument) should hit the same node every time.

Looked at from another angle, though, they are really the same: the goal is for the same request to reach the same node every time.

Well, that's it for SOFA's load balancing.

bye!!!
