"Learn Sentinel Together" Principle - Call Chain

If you reprint this article, please credit the original source. Thank you!

Articles in this series

Sentinel Principle - Full Analysis

Sentinel Principle - Sliding Window

Sentinel Principle - Entity Classes

Sentinel in Action - Current Limiting

Sentinel in Action - Console

Sentinel in Action - Rule Persistence

Sentinel in Action - Cluster Current Limiting

The Sentinel tutorial series is now available on GitHub and Gitee:

sentinel-tutorial.png

We already know how Sentinel implements current limiting and downgrading: its core is a call chain made up of a series of Slots.

Here is a general introduction to the functional responsibilities of each slot:

  • NodeSelectorSlot is responsible for collecting resource paths and storing these call paths in a tree structure, which is used for current limiting and downgrading by call path;
  • ClusterBuilderSlot stores the statistics of the resource and the caller information, such as the RT, QPS, and thread count of the resource, which serve as the basis for multi-dimensional current limiting and downgrading;
  • StatisticSlot records and counts runtime information in different dimensions;
  • SystemSlot controls the total inbound traffic according to the state of the system, such as load1;
  • AuthoritySlot performs black and white list control according to the configured black and white lists;
  • FlowSlot limits traffic according to the preset current limiting rules and the statistics collected by the preceding slots;
  • DegradeSlot performs circuit breaking and downgrading according to the statistics and the preset rules;

After each Slot finishes its own processing, it calls the fireEntry() method, which triggers the entry method of the next node. That node in turn calls its own fireEntry, and so on until the last Slot. This is how Sentinel's chain of responsibility is formed.
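The chaining described above can be illustrated with a minimal, self-contained sketch. The names DemoSlot, LoggingSlot, and SlotChainDemo are hypothetical and do not come from Sentinel; the sketch only mirrors the idea that each slot does its own work and then calls fireEntry() to hand the call over to the next slot.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: each slot holds a reference to the next slot.
abstract class DemoSlot {
    private DemoSlot next;

    void setNext(DemoSlot next) {
        this.next = next;
    }

    // Slot-specific logic; implementations call fireEntry() when they are done.
    abstract void entry(String resource, List<String> trace);

    // Hands the call over to the next slot, if there is one.
    void fireEntry(String resource, List<String> trace) {
        if (next != null) {
            next.entry(resource, trace);
        }
    }
}

// Hypothetical slot that records its own name and then passes the call on.
class LoggingSlot extends DemoSlot {
    private final String name;

    LoggingSlot(String name) {
        this.name = name;
    }

    @Override
    void entry(String resource, List<String> trace) {
        trace.add(name);
        fireEntry(resource, trace);
    }
}

class SlotChainDemo {
    // Builds a three-slot chain and returns the order in which the slots ran.
    static List<String> run() {
        DemoSlot first = new LoggingSlot("select");
        DemoSlot second = new LoggingSlot("statistic");
        DemoSlot third = new LoggingSlot("flow");
        first.setNext(second);
        second.setNext(third);

        List<String> trace = new ArrayList<>();
        first.entry("demo-resource", trace);
        return trace;
    }
}
```

Calling SlotChainDemo.run() visits the slots in the order they were linked, which is the same property the real chain relies on: a later slot only runs if every earlier slot called fireEntry().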

Next, we will study the principle of these Slots in detail.

NodeSelectorSlot

NodeSelectorSlot is used to construct the call chain. Specifically, it wraps the call path of resources into nodes one by one and assembles them into a tree, forming a complete call chain. NodeSelectorSlot is the most critical and most complex of all the Slots, and it involves the following core concepts:

  • Resource

Resources are a key concept in Sentinel. It can be anything in a Java application, such as a service provided by the application, or another service called by the application, or even a piece of code.

Any code wrapped by the Sentinel API is a resource and can be protected by Sentinel. In most cases, method signatures, URLs, or even service names can be used as resource names.

Simply put, a resource is the medium Sentinel uses to protect a system. The class used to wrap resources in the source code is com.alibaba.csp.sentinel.slotchain.ResourceWrapper, which has two subclasses: StringResourceWrapper and MethodResourceWrapper. As the names suggest, you can wrap either a string or a method as a resource.

For example, suppose service A receives a lot of requests and is often overwhelmed by sudden increases in traffic. To prevent this, we can simply define a Sentinel resource and use it to regulate requests, so that the requests allowed through will not crash service A.

resource.png

The status of each resource also differs, depending on the service behind it. Some resources are relatively stable and others are not. Across the entire call chain, Sentinel needs to control the unstable resources. When a resource in the call chain becomes unstable, for example when its timeouts or exception ratio increase, calls to this resource are restricted and requests fail fast, so that other resources are not affected and an avalanche is avoided.

  • Context

A context is a class that holds metadata about the current state of a call chain. A context is obtained each time a resource is entered, and it is created if the current thread does not have one yet. **The same resource name may create multiple contexts.** A Context contains three core objects:

1) The root node of the current call chain: EntranceNode

2) Current entry: Entry

3) The node associated with the current entry: Node

The context saves only the one entry currently being processed, together with the root node of the call chain. Note that when a resource is entered and the current thread has no context yet, a new context is created.

  • Entry

Each call to SphU#entry() generates an Entry, which saves the following data: the creation time of the entry, the node associated with the current entry, and the node corresponding to the call source associated with the current entry. Entry is an abstract class with only one implementation class, CtEntry, a static class in CtSph.

  • Node

The node is used to save various real-time statistical information of a resource. It is an interface. By accessing the node, you can obtain the real-time status of the corresponding resource, and perform current limiting and downgrade operations based on this.

You may be confused at this point: what is the use of so many classes? Next, let's go a step further and explore their roles. Before that, here is the relationship between them, as shown in the figure:

relations.png

Here is a brief introduction to the functions of several Nodes:

| Node | Role |
| --- | --- |
| StatisticNode | Performs the actual resource statistics operations |
| DefaultNode | Holds the statistics of a given resource in a given context. When the entry method is called multiple times in the same context, a series of child nodes may be created under this node. <br />In addition, each DefaultNode is associated with a ClusterNode |
| ClusterNode | Holds the overall runtime statistics of a resource, including RT, thread count, QPS, etc. The same resource shares the same ClusterNode globally, no matter which context it belongs to |
| EntranceNode | Represents the entry node of a call chain tree, through which all child nodes in the tree can be obtained |

Context creation and destruction

The first thing to be clear about is that every time the entry() method is executed to try to pass through a resource, a context is required; if the current thread does not have one yet, it is created. This context holds the root node of the call chain and the current entry.

The context is created by ContextUtil, in the trueEnter method. The code is as follows:

protected static Context trueEnter(String name, String origin) {
    // First try to get the Context from ThreadLocal
    Context context = contextHolder.get();
    if (context == null) {
        // If there is no Context in ThreadLocal,
        // look up the root node in the map by name; as long as the resource name is the same, the node can be obtained directly from the map
        Map<String, DefaultNode> localCacheNameMap = contextNameNodeMap;
        DefaultNode node = localCacheNameMap.get(name);
        if (node == null) {
            // Some code omitted
            try {
                LOCK.lock();
                node = contextNameNodeMap.get(name);
                if (node == null) {
                    // Some code omitted
                    // Create a new entrance node
                    node = new EntranceNode(new StringResourceWrapper(name, EntryType.IN), null);
                    Constants.ROOT.addChild(node);
                    // Some code omitted
                }
            } finally {
                LOCK.unlock();
            }
        }
        // Create a new Context and set its root node, i.e. the EntranceNode
        context = new Context(node, name);
        context.setOrigin(origin);
        // Save the Context to ThreadLocal
        contextHolder.set(context);
    }
    return context;
}

In the above code, I omitted part of the code, and only kept the core part. From the source code, the process of generating Context can be clearly seen:

  • 1. First try to get the context from ThreadLocal. If it is there, return it directly; otherwise continue to step 2.
  • 2. Look up the EntranceNode in a static map by the name of the context. If it is found, go to step 4; otherwise continue to step 3.
  • 3. After locking, perform a double check. If the node still cannot be obtained from the map, create an EntranceNode, add it to the global ROOT node, and then add it to the map (this part of the code is omitted above).
  • 4. Create a context from the EntranceNode and save it to ThreadLocal, so that the next request on this thread can obtain it directly.
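The four steps above can be sketched in a self-contained form. This is only an illustration under stated assumptions: DemoContext, DemoEntranceNode, and DemoContextUtil are hypothetical stand-ins, and the copy-on-write update of the map is an assumption of the sketch, since that part of the real code is omitted above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for EntranceNode.
class DemoEntranceNode {
    final String name;

    DemoEntranceNode(String name) {
        this.name = name;
    }
}

// Hypothetical stand-in for Context.
class DemoContext {
    final DemoEntranceNode entranceNode;
    final String name;

    DemoContext(DemoEntranceNode entranceNode, String name) {
        this.entranceNode = entranceNode;
        this.name = name;
    }
}

class DemoContextUtil {
    private static final ThreadLocal<DemoContext> HOLDER = new ThreadLocal<>();
    // Assumption: the map is replaced as a whole on write (copy-on-write).
    private static volatile Map<String, DemoEntranceNode> nodeMap = new HashMap<>();
    private static final ReentrantLock LOCK = new ReentrantLock();

    static DemoContext enter(String name) {
        // Step 1: reuse the context bound to the current thread, if any.
        DemoContext context = HOLDER.get();
        if (context == null) {
            // Step 2: look up the entrance node by context name.
            DemoEntranceNode node = nodeMap.get(name);
            if (node == null) {
                LOCK.lock();
                try {
                    // Step 3: double check under the lock, then create and publish.
                    node = nodeMap.get(name);
                    if (node == null) {
                        node = new DemoEntranceNode(name);
                        Map<String, DemoEntranceNode> copy = new HashMap<>(nodeMap);
                        copy.put(name, node);
                        nodeMap = copy;
                    }
                } finally {
                    LOCK.unlock();
                }
            }
            // Step 4: create the context and bind it to the current thread.
            context = new DemoContext(node, name);
            HOLDER.set(context);
        }
        return context;
    }

    // Clears the context of the current thread.
    static void exit() {
        HOLDER.remove();
    }
}
```

In this sketch, entering twice on the same thread returns the same context object, while entering again after exit() creates a new context that still shares the cached entrance node.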

When will the context saved in ThreadLocal be cleared? From the code, we can see that the specific clearing work is in the exit method of ContextUtil. When this method is executed, the context object saved in ThreadLocal will be cleared. The specific code is very simple, so I will not post the code here.

When is the ContextUtil.exit method called? There are two cases. One is an explicit call to ContextUtil.exit. The other is when an Entry exits: its trueExit method triggers ContextUtil.exit, but only if the parent of the current Entry is null, which means the Entry is already the topmost one and the context can be cleared.

call chain tree

A call chain tree is created when the SphU#entry() method is called multiple times in one context. The relevant code runs when the CtEntry object is created in the entry method:

CtEntry(ResourceWrapper resourceWrapper, ProcessorSlot<Object> chain, Context context) {
    super(resourceWrapper);
    this.chain = chain;
    this.context = context;
    // Get the previous entry from the context
    parent = context.getCurEntry();
    if (parent != null) {
        // Then set the current entry as the child of the previous entry
        ((CtEntry)parent).child = this;
    }
    // Set the current entry of the context to this object
    context.setCurEntry(this);
}

The code alone may not be very intuitive, so the process is illustrated with some diagrams below.

Construct the trunk

create context

The creation of the context has been analyzed above. During initialization, the curEntry attribute in the context has no value, as shown in the following figure:

create-context.png

Create Entry

Every time a new Entry object is created, the curEntry of the context is reset, and the original curEntry of the context is set as the parent node of the new Entry object, as shown in the following figure:

new-entry.png

Exit Entry

When an Entry exits, the curEntry of the context will be reset. When the Entry is the topmost entry, the context saved in ThreadLocal will also be cleared, as shown in the following figure:

entry-exit.png
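The create and exit steps of the trunk can also be sketched in code. TrunkContext, TrunkEntry, and TrunkDemo are hypothetical stand-ins, not Sentinel classes. Moving curEntry back to the parent on exit is an assumption of this sketch, since the text above only says that curEntry is reset.

```java
// Hypothetical stand-in for Context: it only tracks the current entry.
class TrunkContext {
    TrunkEntry curEntry;
}

// Hypothetical stand-in for CtEntry.
class TrunkEntry {
    final String resource;
    final TrunkContext context;
    final TrunkEntry parent;
    TrunkEntry child;

    TrunkEntry(String resource, TrunkContext context) {
        this.resource = resource;
        this.context = context;
        // The previous current entry becomes the parent of this entry.
        this.parent = context.curEntry;
        if (parent != null) {
            parent.child = this;
        }
        // This entry becomes the current entry of the context.
        context.curEntry = this;
    }

    // Assumption: on exit, the current entry of the context moves back to the parent.
    void exit() {
        context.curEntry = parent;
        if (parent != null) {
            parent.child = null;
        }
    }
}

class TrunkDemo {
    // Enters two resources, exits the inner one, and reports the current entry.
    static String currentAfterInnerExit() {
        TrunkContext context = new TrunkContext();
        TrunkEntry outer = new TrunkEntry("outer", context);
        TrunkEntry inner = new TrunkEntry("inner", context);
        inner.exit();
        return context.curEntry.resource;
    }
}
```

After the outermost entry exits as well, curEntry is null again, which corresponds to the point where the real code clears the context from ThreadLocal.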

Construct leaf nodes

The above process is to construct a tree of call chains, but this tree has only a trunk and no leaves. When were the leaf nodes created? DefaultNode is a leaf node, in which the statistical information of the target resource in the current state is saved. Through analysis, we know that the leaf node is created in the entry method of NodeSelectorSlot. The specific code is as follows:

@Override
public void entry(Context context, ResourceWrapper resourceWrapper, Object obj, int count, Object... args) throws Throwable {
    // Get the DefaultNode by the name of the context.
    // In a multi-threaded environment, each thread creates its own context;
    // as long as the resource name is the same, the context name is the same, so the node obtained is the same
    DefaultNode node = map.get(context.getName());
    if (node == null) {
        synchronized (this) {
            node = map.get(context.getName());
            if (node == null) {
                // If there is no such node for the current context, create a DefaultNode
                node = Env.nodeBuilder.buildTreeNode(resourceWrapper, null);
                // Some code omitted
            }
            // Add the current node as a child of the last node of the context:
            // if the context's curEntry.parent.curNode is null, add it to the entranceNode,
            // otherwise add it to the context's curEntry.parent.curNode
            ((DefaultNode)context.getLastNode()).addChild(node);
        }
    }
    // Set this node as the current node of the context.
    // This actually assigns the node to the curNode of the context's curEntry;
    // the curNode set here is used in Context.getLastNode
    context.setCurNode(node);
    fireEntry(context, resourceWrapper, node, count, args);
}

The above code can be broken down into the following steps:

1) Obtain the DefaultNode corresponding to the current context. If there is none, create a new DefaultNode for the current call. Its job is to collect the various statistics on the resource that flow control relies on.

2) Add the newly created DefaultNode to the context as a child node of "entranceNode" or "curEntry.parent.curNode".

3) Set the DefaultNode as the curNode of the "curEntry" in the context.

Step 2 is not executed every time. Let's look at step 3 first: setting the current DefaultNode as the curNode of the context actually assigns the node to the curNode of the curEntry in the context, which can be drawn like this:

create-default-node.png

After creating different Entry multiple times and executing the entry method of NodeSelectorSlot, it will become such a call chain tree:

create-multi-default-node.png

PS: The node0, node1, and node2 in the figure here may be the same node, because the node obtained from the map in the same context is the same, and different node names are used here just for the sake of clarity.

save child nodes

The construction process of the leaf node has been analyzed above, and the leaf node is stored in the curNode attribute of each Entry.

We know that only the entry node and the current Entry are saved in the context. When is the child node saved? In fact, the child node is saved in step 2 in the above code.

Let's analyze the situation of step 2 above:

When the entry method of NodeSelectorSlot is called for the first time, there must be no DefaultNode in the map, so it will enter the second step, create a node, and add the node to the child node of the lastNode of the context after the creation is completed. Let's take a look at the context's getLastNode method:

public Node getLastNode() {
    // If curEntry does not exist, return entranceNode.
    // Otherwise return the lastNode of curEntry.
    // Note that the lastNode of curEntry is the curNode of its parent.
    // If a different resource is entered each time, a new CtEntry is created each time and parent is null,
    // so curEntry.getLastNode() is also null
    if (curEntry != null && curEntry.getLastNode() != null) {
        return curEntry.getLastNode();
    } else {
        return entranceNode;
    }
}

From the code we can see that lastNode may be either the entranceNode of the context or curEntry.parent.curNode. Both are nodes of type "DefaultNode", and all child nodes of a DefaultNode are stored in a HashSet.

When the getLastNode method is called for the first time, the current entry has no parent, so curEntry.getLastNode() returns null and lastNode falls back to the entranceNode of the context. After the node is added to the children of entranceNode, the tree looks like this:

add-child-1.png

Then enter again, the resource name is different, a new Entry will be generated again, and the above graph will become the following:

add-child-2.png

At this point, the getLastNode method of the context is called again. Because the parent of curEntry is no longer null, the lastNode obtained is curEntry.parent.curNode, which the figure above shows to be node0. The current node, node1, is then added to the children of lastNode, and the graph becomes the following:

add-child-3.png

Then set the current node to the curNode of the context, and the above graph becomes the following:

add-child-4.png

If you create another Entry, and then enter a different resource name again, the above picture becomes as follows:

add-child-5.png

So far, the basic functions of NodeSelectorSlot have been roughly analyzed.

PS: The above analysis assumes that the resource name is different each time SphU.entry(name) is executed. If the resource names are the same, the node obtained is the same, so it is added to the children of the entranceNode only the first time. After that, only a new Entry is created and the value of curEntry in the context is replaced.
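To make the child-node bookkeeping concrete, here is a minimal single-threaded sketch of the walk-through above. All class names are hypothetical. The sketch assumes one selector instance per resource, each caching one node per context name, and it leaves out the synchronization used in the real entry method.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for DefaultNode: a name plus a set of children.
class TreeNode {
    final String name;
    final Set<TreeNode> children = new LinkedHashSet<>();

    TreeNode(String name) {
        this.name = name;
    }
}

// Hypothetical stand-in for CtEntry: knows its parent and its own node.
class TreeEntry {
    final TreeEntry parent;
    TreeNode curNode;

    TreeEntry(TreeEntry parent) {
        this.parent = parent;
    }

    // The "last node" of an entry is the node of its parent entry.
    TreeNode getLastNode() {
        return parent == null ? null : parent.curNode;
    }
}

// Hypothetical stand-in for Context.
class TreeContext {
    final String name;
    final TreeNode entranceNode;
    TreeEntry curEntry;

    TreeContext(String name) {
        this.name = name;
        this.entranceNode = new TreeNode("entrance:" + name);
    }

    // Same fallback logic as the getLastNode method shown above.
    TreeNode getLastNode() {
        if (curEntry != null && curEntry.getLastNode() != null) {
            return curEntry.getLastNode();
        }
        return entranceNode;
    }
}

// Hypothetical stand-in for the selector slot of one resource.
class TreeSelector {
    private final String resource;
    private final Map<String, TreeNode> nodesByContext = new HashMap<>();

    TreeSelector(String resource) {
        this.resource = resource;
    }

    TreeNode enter(TreeContext context) {
        // Creating the entry links it to the previous current entry.
        context.curEntry = new TreeEntry(context.curEntry);
        TreeNode node = nodesByContext.get(context.name);
        if (node == null) {
            // Step 2: only a newly created node is attached to the tree.
            node = new TreeNode(resource);
            nodesByContext.put(context.name, node);
            context.getLastNode().children.add(node);
        }
        // Step 3: the node becomes the curNode of the current entry.
        context.curEntry.curNode = node;
        return node;
    }
}
```

Entering resource A and then resource B in the same context attaches A's node under the entrance node and B's node under A's node, which matches the sequence of figures above.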

ClusterBuilderSlot

After the entry method of NodeSelectorSlot is executed, the fireEntry method will be called, which will trigger the entry method of ClusterBuilderSlot.

The entry method of ClusterBuilderSlot is relatively simple. The specific code is as follows:

@Override
public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count, Object... args) throws Throwable {
    if (clusterNode == null) {
        synchronized (lock) {
            if (clusterNode == null) {
                // Create the cluster node.
                clusterNode = Env.nodeBuilder.buildClusterNode();
                // Save the clusterNode to the global map
                HashMap<ResourceWrapper, ClusterNode> newMap = new HashMap<ResourceWrapper, ClusterNode>(16);
                newMap.putAll(clusterNodeMap);
                newMap.put(node.getId(), clusterNode);

                clusterNodeMap = newMap;
            }
        }
    }
    // Attach the clusterNode to the DefaultNode
    node.setClusterNode(clusterNode);

    // Some code omitted

    fireEntry(context, resourceWrapper, node, count, args);
}

The responsibilities of ClusterBuilderSlot are relatively simple. It mainly does two things:

1. Create a clusterNode for each resource, and then plug the clusterNode into the DefaultNode

2. Keep the clusterNode in the global map and use the resource as the key of the map

PS: There is only one ClusterNode for a resource, but there can be multiple DefaultNodes
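The one-to-many relationship between a ClusterNode and its DefaultNodes can be shown with a small sketch. The classes below are hypothetical stand-ins, the map is keyed by resource name for brevity (the code above uses the resource wrapper as the key), and forwarding each count from the per-context node to the shared node is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for ClusterNode: one per resource, shared by all contexts.
class DemoClusterNode {
    long passCount;
}

// Hypothetical stand-in for DefaultNode: one per resource and context.
class DemoDefaultNode {
    final String resource;
    final String contextName;
    DemoClusterNode clusterNode;
    long passCount;

    DemoDefaultNode(String resource, String contextName) {
        this.resource = resource;
        this.contextName = contextName;
    }

    // Counts a passed request on this node and on the shared cluster node.
    void addPass() {
        passCount++;
        clusterNode.passCount++;
    }
}

class ClusterDemo {
    // Global map from resource name to its single cluster node.
    private static final Map<String, DemoClusterNode> CLUSTER_NODES = new HashMap<>();

    // Creates a per-context node and attaches the shared cluster node to it.
    static DemoDefaultNode nodeFor(String resource, String contextName) {
        DemoDefaultNode node = new DemoDefaultNode(resource, contextName);
        node.clusterNode = CLUSTER_NODES.computeIfAbsent(resource, key -> new DemoClusterNode());
        return node;
    }
}
```

Two nodes created for the same resource in different contexts keep separate counters but point at the same cluster node, so the cluster node sees the total across contexts.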

StatisticSlot

StatisticSlot is responsible for counting the real-time status of resources. The specific code is as follows:

@Override
public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count, Object... args) throws Throwable {
    try {
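        // Trigger the entry method of the next Slot
        fireEntry(context, resourceWrapper, node, count, args);
        // If the entry methods of the following Slots in the SlotChain pass, the request was not limited or downgraded
        // Record statistics
        node.increaseThreadNum();
        node.addPassRequest();
        // Some code omitted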
    } catch (BlockException e) {
        context.getCurEntry().setError(e);
        // Add block count.
        node.increaseBlockedQps();
        // Some code omitted
        throw e;
    } catch (Throwable e) {
        context.getCurEntry().setError(e);
        // Should not happen
        node.increaseExceptionQps();
        // Some code omitted
        throw e;
    }
}

@Override
public void exit(Context context, ResourceWrapper resourceWrapper, int count, Object... args) {
    DefaultNode node = (DefaultNode)context.getCurNode();
    if (context.getCurEntry().getError() == null) {
        long rt = TimeUtil.currentTimeMillis() - context.getCurEntry().getCreateTime();
        if (rt > Constants.TIME_DROP_VALVE) {
            rt = Constants.TIME_DROP_VALVE;
        }
        node.rt(rt);
        // Some code omitted
        node.decreaseThreadNum();
        // Some code omitted
    } 
    fireExit(context, resourceWrapper, count);
}

The code has two parts. The first part is the entry method. It first triggers the entry methods of the subsequent slots, that is, the rule checks of SystemSlot, FlowSlot, DegradeSlot, and so on. If a rule check fails, a BlockException is thrown and the block count is recorded in the node. Otherwise, information such as the number of passed requests and the number of threads is recorded in the node. The second part is the exit method. When the Entry exits, the RT is recorded and the thread count is decreased.

These real-time statistics are used by the subsequent rule checks. The statistics themselves are implemented with a sliding window, whose principle I will analyze in detail later.
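The "check first, count afterwards" pattern of the entry and exit methods can be sketched as follows. DemoBlockException, DemoCheck, DemoStats, and StatsDemo are hypothetical names. Plain counters stand in for the sliding window, and the sketch is not thread-safe.

```java
// Hypothetical stand-in for BlockException.
class DemoBlockException extends Exception {
    DemoBlockException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for the rule checks performed by the later slots.
interface DemoCheck {
    void check() throws DemoBlockException;
}

class DemoStats {
    int threadNum;
    long passCount;
    long blockCount;
    long totalRt;

    // Runs the downstream checks first, then records the outcome.
    void entry(DemoCheck downstream) throws DemoBlockException {
        try {
            downstream.check();
            // Reached only if no later check blocked the request.
            threadNum++;
            passCount++;
        } catch (DemoBlockException e) {
            blockCount++;
            throw e;
        }
    }

    // Records the response time and releases the thread count.
    void exit(long rtMillis) {
        totalRt += rtMillis;
        threadNum--;
    }
}

class StatsDemo {
    // One request passes and exits after 12 ms, a second one is blocked.
    static DemoStats run() {
        DemoStats stats = new DemoStats();
        try {
            stats.entry(() -> { });
            stats.exit(12);
            stats.entry(() -> { throw new DemoBlockException("flow"); });
        } catch (DemoBlockException expected) {
            // The blocked request has already been counted in entry().
        }
        return stats;
    }
}
```

The ordering matters: because the counters are updated only after the downstream checks return, a blocked request never increases the pass count or the thread count.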

SystemSlot

SystemSlot performs flow control based on the total request statistics, mainly to prevent the system from being overwhelmed. The specific code is as follows:

@Override
public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count, Object... args)
    throws Throwable {
    SystemRuleManager.checkSystem(resourceWrapper);
    fireEntry(context, resourceWrapper, node, count, args);
}

public static void checkSystem(ResourceWrapper resourceWrapper) throws BlockException {
    // Some code omitted
    // total qps
    double currentQps = Constants.ENTRY_NODE.successQps();
    if (currentQps > qps) {
        throw new SystemBlockException(resourceWrapper.getName(), "qps");
    }
    // total thread
    int currentThread = Constants.ENTRY_NODE.curThreadNum();
    if (currentThread > maxThread) {
        throw new SystemBlockException(resourceWrapper.getName(), "thread");
    }
    double rt = Constants.ENTRY_NODE.avgRt();
    if (rt > maxRt) {
        throw new SystemBlockException(resourceWrapper.getName(), "rt");
    }
    // Based entirely on RT, following the BBR algorithm
    if (highestSystemLoadIsSet && getCurrentSystemAvgLoad() > highestSystemLoad) {
        if (currentThread > 1 &&
            currentThread > Constants.ENTRY_NODE.maxSuccessQps() * Constants.ENTRY_NODE.minRt() / 1000) {
            throw new SystemBlockException(resourceWrapper.getName(), "load");
        }
    }
}

Among them, Constants.ENTRY_NODE is a global ClusterNode, and the values of this node are collected in StatisticSlot.
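The last condition in checkSystem compares the current thread count with maxSuccessQps * minRt / 1000. Read as an estimate, the highest throughput seen multiplied by the shortest response time in seconds approximates how many requests the system can hold in flight at once. The following sketch covers that arithmetic only; LoadCheckDemo and exceedsCapacity are hypothetical names, the numbers are made up, and the system load precondition is left out.

```java
class LoadCheckDemo {
    // Mirrors the inner condition of the load check shown above:
    // block when concurrency exceeds maxSuccessQps * minRt (in ms) / 1000.
    static boolean exceedsCapacity(int currentThread, double maxSuccessQps, double minRtMillis) {
        double estimatedCapacity = maxSuccessQps * minRtMillis / 1000;
        return currentThread > 1 && currentThread > estimatedCapacity;
    }
}
```

With a peak of 200 QPS and a minimum RT of 10 ms, the estimated capacity is 200 * 10 / 1000 = 2 concurrent requests, so a third concurrent thread would be rejected once the load threshold is exceeded, while two threads would still pass.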

AuthoritySlot

What AuthoritySlot does is relatively simple. It mainly filters according to the black and white list. As long as there is a rule that fails to pass the verification, an exception will be thrown.

@Override
public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count, Object... args) throws Throwable {
    AuthorityRuleManager.checkAuthority(resourceWrapper, context, node, count);
    fireEntry(context, resourceWrapper, node, count, args);
}

public static void checkAuthority(ResourceWrapper resource, Context context, DefaultNode node, int count) throws BlockException {
    if (authorityRules == null) {
        return;
    }
    // Get the rules for the resource by its name
    List<AuthorityRule> rules = authorityRules.get(resource.getName());
    if (rules == null) {
        return;
    }
    for (AuthorityRule rule : rules) {
        // Throw an AuthorityException as soon as one rule fails the check
        if (!rule.passCheck(context, node, count)) {
            throw new AuthorityException(context.getOrigin());
        }
    }
}
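The body of passCheck is not shown above. As a rough illustration of black and white list filtering in general, and not of Sentinel's actual implementation, the following sketch checks a call origin against a set of names. All class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical rule: a set of origins treated as either a white list or a black list.
class DemoAuthorityRule {
    enum Strategy { WHITE, BLACK }

    private final Strategy strategy;
    private final Set<String> origins;

    DemoAuthorityRule(Strategy strategy, String... origins) {
        this.strategy = strategy;
        this.origins = new HashSet<>(Arrays.asList(origins));
    }

    // White list: only listed origins pass. Black list: listed origins are rejected.
    boolean passCheck(String origin) {
        boolean listed = origins.contains(origin);
        return strategy == Strategy.WHITE ? listed : !listed;
    }
}

class DemoAuthorityChecker {
    // Mirrors the loop above: the request is rejected as soon as one rule fails.
    static boolean allowed(String origin, DemoAuthorityRule... rules) {
        for (DemoAuthorityRule rule : rules) {
            if (!rule.passCheck(origin)) {
                return false;
            }
        }
        return true;
    }
}
```

In the real slot a failed check raises an exception instead of returning false; the boolean return here just keeps the sketch short.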

FlowSlot

FlowSlot mainly matches and checks the set current limiting rules based on the previously calculated information. If the rule verification fails, the current limiting is performed. The specific code is as follows:

@Override
public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count, Object... args) throws Throwable {
    FlowRuleManager.checkFlow(resourceWrapper, context, node, count);
    fireEntry(context, resourceWrapper, node, count, args);
}

public static void checkFlow(ResourceWrapper resource, Context context, DefaultNode node, int count) throws BlockException {
    List<FlowRule> rules = flowRules.get(resource.getName());
    if (rules != null) {
        for (FlowRule rule : rules) {
            if (!rule.passCheck(context, node, count)) {
                throw new FlowException(rule.getLimitApp());
            }
        }
    }
}

DegradeSlot

DegradeSlot mainly matches and verifies the set downgrade rules according to the previously collected information. If the rule verification fails, the downgrade is performed. The specific code is as follows:

@Override
public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count, Object... args) throws Throwable {
    DegradeRuleManager.checkDegrade(resourceWrapper, context, node, count);
    fireEntry(context, resourceWrapper, node, count, args);
}

public static void checkDegrade(ResourceWrapper resource, Context context, DefaultNode node, int count) throws BlockException {
    List<DegradeRule> rules = degradeRules.get(resource.getName());
    if (rules != null) {
        for (DegradeRule rule : rules) {
            if (!rule.passCheck(context, node, count)) {
                throw new DegradeException(rule.getLimitApp());
            }
        }
    }
}

Summary

Sentinel's current limiting and downgrading functions are mainly implemented through a SlotChain. The chain contains 7 core Slots, each with its own responsibility, and they can be divided into the following types:

1. NodeSelectorSlot and ClusterBuilderSlot for resource call path construction

2. StatisticSlot for real-time status statistics of resources

3. SystemSlot, AuthoritySlot, FlowSlot, DegradeSlot for system protection, current limiting, downgrade and other rule verification

The latter Slots depend on the results of the previous Slot statistics. So far, the functions of each Slot have been basically analyzed clearly.

For more original good articles, please pay attention to "Houyi Code by Code"
