Storm's message reliability mechanism: how ack and fail work

1. nextTuple in a Storm spout and execute in a bolt

Storm's API is very rich, but keep in mind that the ACK mechanism described in this article needs explicit support in your code. In other words, ACK is an optional mechanism in Storm: you can ignore it entirely and use Storm in a lightweight, best-effort mode with no confirmation. This article covers the ACK mechanism on its own because of its elegance.

When you write nextTuple in a spout and finally emit a Tuple, remember to do it like this:

this.collector.emit(new Values(...), msgID);

The msgID parameter is required, absolutely required! Without it, Storm does not track the Tuple. Once the spout has called emit this way, Storm creates an entry like the following in the system:

msgID-Tuple:ACKvalue

The initial ACKvalue is the random number generated for the msg that the spout emits. If the spout emits several msgs at once, ACKvalue is the XOR of all of their values. This is described in detail later.

Likewise, when a downstream bolt emits a new msg in its execute method, be sure to pass the input tuple as the first argument; otherwise the rest of the topology loses track of the Tuple. Write:
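The bookkeeping just described can be pictured with a small stand-alone Java sketch. It does not use the Storm API: the class `PendingTableSketch` and its methods are hypothetical names chosen for illustration, and the table is only assumed to behave like a map from msgID to a 64-bit ACKvalue.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Illustrative model only (an assumption, not Storm's real data structure).
class PendingTableSketch {
    // msgID -> ACKvalue: XOR of the random IDs of every msg emitted under this msgID
    private final Map<Object, Long> pending = new HashMap<>();
    private final Random random = new Random();

    // Called once per msg emitted under msgId; returns the random ID given to that msg.
    long recordEmit(Object msgId) {
        long id = random.nextLong();
        pending.merge(msgId, id, (a, b) -> a ^ b);
        return id;
    }

    // Current ACKvalue for msgId, or 0 if nothing is tracked under it.
    long ackValue(Object msgId) {
        return pending.getOrDefault(msgId, 0L);
    }

    public static void main(String[] args) {
        PendingTableSketch table = new PendingTableSketch();
        long first = table.recordEmit("msg-1");
        long second = table.recordEmit("msg-1"); // a second msg emitted under the same msgID
        System.out.println(table.ackValue("msg-1") == (first ^ second)); // prints true
    }
}
```

With a single emit the ACKvalue is just that msg's random ID; each further emit under the same msgID folds one more ID into the same 64-bit value, so the entry never grows.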

collector.emit(tuple, new Values(...));

Instead of:

collector.emit(new Values(...));

Of course, if you deliberately do not want to track a Tuple's results in the rest of the topology (for example, you only want to guarantee that a few specific steps succeed), you can omit the tuple argument from emit in the execute method of a particular bolt. It is entirely up to you.
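To see what the tuple argument changes, here is a minimal plain-Java model of one tracked Tuple tree. It is a sketch under assumptions, not Storm code: `AnchoringSketch`, `anchoredEmit`, and `unanchoredEmit` are hypothetical names that stand in for the two emit forms above, and the model folds IDs into the ACKvalue at emit time for simplicity.

```java
import java.util.Random;

// Illustrative model only: one tracked Tuple tree, reduced to its ACKvalue.
class AnchoringSketch {
    private final Random random = new Random();
    private long ackValue;

    // Spout side: the first msg of the tree starts the ACKvalue.
    long spoutEmit() {
        long id = random.nextLong();
        ackValue ^= id;
        return id;
    }

    // Models collector.emit(tuple, values): the new msg joins the tracked tree.
    long anchoredEmit() {
        long id = random.nextLong();
        ackValue ^= id;
        return id;
    }

    // Models collector.emit(values): the new msg gets an ID, but nothing tracks it.
    long unanchoredEmit() {
        return random.nextLong();
    }

    void ack(long id) {
        ackValue ^= id;
    }

    boolean fullyProcessed() {
        return ackValue == 0L;
    }

    public static void main(String[] args) {
        AnchoringSketch tree = new AnchoringSketch();
        long root = tree.spoutEmit();
        long tracked = tree.anchoredEmit(); // downstream work we care about
        tree.unanchoredEmit();              // downstream work we chose not to track
        tree.ack(root);
        System.out.println(tree.fullyProcessed()); // prints false: the anchored msg is still pending
        tree.ack(tracked);
        System.out.println(tree.fullyProcessed()); // prints true: the unanchored msg was never acked
    }
}
```

The unanchored msg never enters the ACKvalue, so the tree completes whether or not anything downstream of it succeeds.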

2. How the ACK principle evolved

If a sender sends a message and wants to confirm that the recipient has received it, the simplest solution is the following:

[Figure: ACK Principle 1]

A lot could be said here. TCP, for example, works this way, but TCP is old news and not worth dwelling on, so let us simply say that this approach uses in-band ACKs to guarantee reliability. Extended to a pipeline of relays, it looks like this:

[Figure: ACK Principle 2]

But think about it: this has a problem.

Isn't the load on the intermediate nodes too heavy? Even TCP does not work like this. TCP is an end-to-end protocol and its acknowledgments are end-to-end as well. Storm has no such layered design and runs entirely at the application layer, so hop-by-hop confirmation is a poor fit. So we have to look at an end-to-end scheme instead:

[Figure: ACK Principle 3]

It looks pretty good, but we can go further. Application-layer logic is usually far less fixed than the layers beneath it, and complex business logic should be decoupled through middleware as far as possible, that is, the receiver and the sender should avoid communicating directly. So we naturally switch to out-of-band ACK confirmation:

[Figure: ACK Principle 4]

Now the data path is completely clean. Confirmation is handled by an intermediate service called Acker, which may be a single instance or a cluster. Out-of-band confirmation is fully separated from the in-band data, which improves the robustness of the system and keeps the structure clear.

But is this perfect?

A Storm topology is ultimately a directed acyclic graph, not a straight line. After processing its input, a bolt may split it into multiple msgs sent to multiple bolts, and a bolt may also receive multiple Tuples from multiple bolts; merging several logs is a common example. A typical scenario is shown in the diagram below:

[Figure: Directed acyclic graph]

What do we do then?

In fact, it is simple: as long as each bolt can track which Tuple every msg it emits belongs to, we can identify the Tuple behind each msg that reaches a final leaf node, and then find a way to report the processing result back to the spout. The obvious way, of course, is to relay it through Acker:

[Figure: Forwarding through Acker]

This approach is obvious and general, but it has a problem, which is also noted in the figure: solving it this way requires complex data structures and complex bookkeeping. Storm uses a very clever trick instead.

3. Storm's ACK implementation

Storm uses a very simple XOR to solve this problem.

When an upstream bolt emits a msg, how does it confirm that the downstream bolt received it? This is really a question of flow. The downstream bolt does not need to tell the upstream bolt explicitly "I received it"; the following method is enough:

[Figure]

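When bolt1 emits a msg, it reports a randomly generated ID to Acker and carries that ID in the msg as it travels downstream. Once bolt2 receives a msg belonging to the same Tuple, it passes the ID it parsed out of the msg to Acker as well. If nothing goes wrong, the XOR of the two will be 0!

That is the principle. Now let us look at a more general diagram: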
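A minimal Java sketch of this two-report exchange, with no Storm dependency. The class name `TwoReportSketch` and the helper `combine` are hypothetical and exist only for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative model only: what Acker sees when one msg travels from bolt1 to bolt2.
class TwoReportSketch {
    // XOR of the ID reported by the sender and the ID reported by the receiver.
    static long combine(long reportedBySender, long reportedByReceiver) {
        return reportedBySender ^ reportedByReceiver;
    }

    public static void main(String[] args) {
        long id = ThreadLocalRandom.current().nextLong(); // random ID bolt1 attaches to the msg
        System.out.println(combine(id, id));       // prints 0: bolt2 reported the same ID
        System.out.println(combine(id, 0L) == id); // prints true: only bolt1 has reported so far
    }
}
```

Any value XOR-ed with itself is 0, so a matching pair of reports cancels out no matter what the random ID was.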

[Figure]

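All Acker has to do is collect the IDs that every bolt reports to it, both the ones it received and the ones it emitted, and XOR them together. Of course, when a bolt sends IDs to Acker in its ack call, it also includes the Tuple information, and Acker uses that to find the matching entry in its in-memory table. Operating this way, the example above becomes: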

[Figure: How ack works]

Finally, here is the generalized flow:

[Figure: Generalized flow]
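The generalized flow can be sketched in plain Java as a table keyed by Tuple, where each ack folds the received ID and the newly emitted IDs into one running XOR. This is a simplified model under assumptions, not Storm's implementation: `AckerSketch`, `init`, and `ack` are hypothetical names, and the IDs are small fixed constants instead of random 64-bit values so the run is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only: class and method names are hypothetical, not Storm's.
class AckerSketch {
    // root Tuple ID -> running XOR of every ID reported for that Tuple tree
    private final Map<Long, Long> ackValues = new HashMap<>();

    // The spout registers a new tree with the ID of the first msg it emits.
    void init(long rootId, long firstMsgId) {
        ackValues.put(rootId, firstMsgId);
    }

    // A bolt acks one received msg and reports the msgs it emitted while processing it.
    // Returns true when the whole tree is processed, i.e. the ACKvalue reached 0.
    boolean ack(long rootId, long receivedId, long... emittedIds) {
        long update = receivedId;
        for (long id : emittedIds) {
            update ^= id;
        }
        long value = ackValues.getOrDefault(rootId, 0L) ^ update;
        if (value == 0L) {
            ackValues.remove(rootId); // done: the entry can be dropped
            return true;
        }
        ackValues.put(rootId, value);
        return false;
    }

    public static void main(String[] args) {
        AckerSketch acker = new AckerSketch();
        long root = 1L;
        acker.init(root, 0xA1L);                                  // spout -> bolt1
        System.out.println(acker.ack(root, 0xA1L, 0xB2L, 0xC3L)); // prints false: bolt1 fans out to bolt2 and bolt3
        System.out.println(acker.ack(root, 0xB2L, 0xD4L));        // prints false: bolt2 forwards to bolt4
        System.out.println(acker.ack(root, 0xC3L, 0xE5L));        // prints false: bolt3 forwards to bolt4
        System.out.println(acker.ack(root, 0xD4L));               // prints false: bolt4 handled bolt2's msg
        System.out.println(acker.ack(root, 0xE5L));               // prints true: bolt4 handled bolt3's msg
    }
}
```

However wide the graph fans out, the table holds one 64-bit value per Tuple, which is what removes the need for the complex data structures mentioned earlier.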

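If an error at any step prevents a msg from reaching its leaf endpoint, Acker never receives the IDs that would bring the XOR result to 0, so the Tuple eventually times out and the spout resends the msg. But...

But a more common problem is that you forget to call the ack method! It is not that an error kept the ID from being reported; you simply never reported it. That causes this round of msg processing to fail, but that is not the worst consequence. The worst is clearly OOM, that is, OutOfMemoryError...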
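The effect of a missing ack can be illustrated with a small plain-Java model of the pending table. It is only a sketch under assumptions: `TimeoutSketch` and its methods are hypothetical names, and time is passed in explicitly instead of being read from a clock.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: hypothetical names, not Storm's implementation.
class TimeoutSketch {
    private final long timeoutMillis;
    // msgID -> time the msg was emitted; entries stay here until acked or expired
    private final Map<String, Long> emittedAt = new LinkedHashMap<>();

    TimeoutSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    void emitted(String msgId, long nowMillis) {
        emittedAt.put(msgId, nowMillis);
    }

    // Called when the ACKvalue of msgId reaches 0.
    void acked(String msgId) {
        emittedAt.remove(msgId);
    }

    // Removes and returns every msgID that has waited longer than the timeout;
    // in this model these are the msgs that would be failed and resent.
    List<String> expire(long nowMillis) {
        List<String> failed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = emittedAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                failed.add(entry.getKey());
                it.remove();
            }
        }
        return failed;
    }

    int pendingCount() {
        return emittedAt.size();
    }

    public static void main(String[] args) {
        TimeoutSketch tracker = new TimeoutSketch(30_000);
        tracker.emitted("msg-1", 0);
        tracker.emitted("msg-2", 0);
        tracker.acked("msg-1");                     // every bolt acked msg-1
        System.out.println(tracker.pendingCount()); // prints 1: msg-2 was never acked
        System.out.println(tracker.expire(31_000)); // prints [msg-2]
    }
}
```

Every msg that is never acked occupies an entry until it expires, and each resend adds a new one, so a bolt that never calls ack keeps the table growing for as long as the spout keeps emitting.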

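4. The magic of XOR

Many people know the concept of xor and how to compute it, but few truly understand it.

The reason is simple: not that it is hard, but that the Chinese translation of XOR falls short on expressiveness (达), the second of the three translation ideals of faithfulness, expressiveness, and elegance (信达雅).

XOR is rendered in Chinese as 异或. Perhaps computer science students are generally weak in classical Chinese, but the character 异 by itself already carries the sense of exclusivity. Without understanding that, it is hard to grasp the essence of XOR directly!

The essential meaning of XOR is "exclusive or": you are either A or B, but you cannot be both. A Venn diagram makes this much easier to explain: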

[Figure: A xor B]

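As a simple experiment, lay B on top once more, that is, compute A xor B xor B. By the definition, that is, by mutual exclusion, the figure below is the answer: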

[Figure: A xor B xor B]

Doesn't it look like a total solar eclipse? That is exactly how a total eclipse works. With this cover-it-up picture in mind, can you now see why the xor operation can be used to swap two numbers?

What if we cover it with a C instead of a second B? XOR-ing the three together gives:
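Both observations, that covering with B twice restores A and that xor can swap two numbers, fit in a few lines of Java. The class name `XorSwapSketch` is hypothetical and the values are arbitrary.

```java
// Illustrative sketch: A xor B xor B == A, and the classic XOR swap built on it.
class XorSwapSketch {
    // Swaps values[i] and values[j] without a temporary variable.
    static void swap(int[] values, int i, int j) {
        if (i == j) {
            return; // XOR-ing a slot with itself would zero it
        }
        values[i] ^= values[j]; // values[i] = a ^ b
        values[j] ^= values[i]; // values[j] = b ^ (a ^ b) = a
        values[i] ^= values[j]; // values[i] = (a ^ b) ^ a = b
    }

    public static void main(String[] args) {
        int a = 12, b = 10;
        System.out.println((a ^ b ^ b) == a); // prints true: covering with B twice restores A
        int[] pair = {a, b};
        swap(pair, 0, 1);
        System.out.println(pair[0] + " " + pair[1]); // prints 10 12
    }
}
```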

[Figure: A xor B xor C]

Have you spotted the pattern? No matter which set is laid on top, the overlapping part is inverted, and covering it again inverts it back. From this you can see that XOR fully obeys the associative and commutative laws, and along the way you can even discover a group...

Because XOR is commutative, it does not matter which bolt in Storm calls ack first. As soon as Acker finds that ACKvalue has become 0, it reports to the corresponding spout that the msg was processed successfully. The final picture:

[Figure: XOR is independent of order]
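Order independence is easy to check with a short Java sketch: every ID is reported twice, once at emit and once at ack, and the reports are folded in a random order. `AckOrderSketch` and `fold` are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch: the XOR of all reports does not depend on arrival order.
class AckOrderSketch {
    // XORs the reports in the order given.
    static long fold(List<Long> reports) {
        long ackValue = 0L;
        for (long id : reports) {
            ackValue ^= id;
        }
        return ackValue;
    }

    public static void main(String[] args) {
        Random random = new Random();
        List<Long> reports = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            long id = random.nextLong();
            reports.add(id); // reported once when the msg is emitted
            reports.add(id); // reported once more when the msg is acked
        }
        for (int round = 0; round < 3; round++) {
            Collections.shuffle(reports, random);
            System.out.println(fold(reports)); // prints 0 every time
        }
    }
}
```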

Reprinted from the article:
Using the magic of xor (exclusive or) to solve Tuple confirmation in Storm

Reproduced from: https://www.jianshu.com/p/59ba096841de


Origin blog.csdn.net/weixin_34082789/article/details/91246133