Reactive Architecture (1): Basic Concepts

1 What is reactive?

1.1 Introduction to Reactive

To build an intuitive understanding of what reactive means, let's start with an analogy everyone knows. Open Excel and enter the following formulas in columns B, C, and D:

The value of each cell in columns B, C, and D depends on the cell to its left. When we enter 1, 2, and 3 in column A in turn, each change is automatically propagated to columns B, C, and D and triggers the corresponding state change there, as shown below:

Reading column A from top to bottom, we can picture it as a data stream. Every arriving value triggers an event, the event is propagated to the cells on the right, and each of those cells handles the event and changes its own state. This chain of events is the core idea of reactive.
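The propagation described above can be sketched in a few lines of Java. This is only an illustration under assumptions: the Cell class is made up for the example, and the formulas B = A + 1, C = B * 2, and D = C * C are placeholders, since the formulas of the original spreadsheet are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

// Hypothetical spreadsheet cell: holds a value and notifies dependent cells when it changes.
class Cell {
    private int value;
    private final List<Runnable> listeners = new ArrayList<>();

    int get() {
        return value;
    }

    // Setting a value is the "event"; it is pushed to every dependent cell.
    void set(int newValue) {
        value = newValue;
        listeners.forEach(Runnable::run);
    }

    // Creates a cell whose value is always formula(left), like a cell in the next column.
    static Cell derivedFrom(Cell left, IntUnaryOperator formula) {
        Cell cell = new Cell();
        left.listeners.add(() -> cell.set(formula.applyAsInt(left.get())));
        cell.set(formula.applyAsInt(left.get()));
        return cell;
    }

    public static void main(String[] args) {
        Cell a = new Cell();
        Cell b = derivedFrom(a, x -> x + 1);   // assumed formula: B = A + 1
        Cell c = derivedFrom(b, x -> x * 2);   // assumed formula: C = B * 2
        Cell d = derivedFrom(c, x -> x * x);   // assumed formula: D = C * C
        for (int input : new int[] {1, 2, 3}) {
            a.set(input);                      // each input flows through B, C and D
            System.out.println(a.get() + " " + b.get() + " " + c.get() + " " + d.get());
        }
    }
}
```

Nothing polls for changes here: writing to A is enough, and every dependent cell reacts on its own.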

From this example you should get a sense that the core of reactive is the data stream. Let's look at another example. Many of us take the subway to work every day. A train arrives every two minutes, and many trains share the same track. Have you ever worried about a rear-end collision and avoided the last two carriages? If the subway system were built on a reactive architecture, there would be no need to worry. In a reactive system, every train sends its own speed and position to the other trains upstream and downstream in real time, receives their status in real time, and reacts to it immediately. For example, when a train detects that the train downstream has suddenly slowed down, it immediately adjusts its own speed and notifies the train upstream of the deceleration. In this way all trains on the track form a back pressure mechanism and automatically adjust their speed to the state of the trains upstream and downstream. Now let's look at the Wikipedia definition of reactive programming:

Reactive programming is a declarative programming paradigm based on data streams and the propagation of change.

From this definition we can see that the core of reactive programming is data streams and the propagation of change. The Wikipedia definition is deliberately general and does not distinguish between synchronous and asynchronous data streams. More precisely, asynchronous data streams, or reactive streams, are the best practice for reactive programming. Careful readers may object that all of this simply describes the Observer pattern. That is not quite accurate. Reactive does not refer to one specific technology but to a set of architectural design principles, and the Observer pattern is one means of implementing them. In the later section on Reactive Streams we will see that reactive streams extend the Observer pattern with more features, making it more powerful and easier to use.

1.2 History of Reactive

As early as 1985, David Harel and Amir Pnueli published the paper "On the Development of Reactive Systems", in which they divided complex computational systems into two kinds: transformational systems and reactive systems. A reactive system is one that continuously interacts with its environment and responds in a timely manner. A video surveillance system, for example, monitors continuously and triggers an alarm as soon as a stranger breaks in.

Table 1 History of Reactive

Time	Event
1985	"On the Development of Reactive Systems" by David Harel & Amir Pnueli
1997	Functional reactive programming (FRP) by Conal Elliott
2009	Rx 1.0 for .NET by Erik Meijer’s team at Microsoft
2013	Rx for Java by Netflix
2013	Reactive Manifesto v1.0
2014	Reactive Manifesto v2.0
2015	Reactive Streams
Now	RxJava 3, Akka Streams, Reactor, Vert.x 3, Ratpack

Figure 1 Google search trends

As the Google search trends show, searches for reactive programming have grown explosively since June 2013, the month in which the first version of the Reactive Manifesto was released.

1.3 Introduction to ReactiveX

ReactiveX is short for Reactive Extensions and is usually abbreviated as Rx. It began as an extension of LINQ, was developed by a team led by Microsoft architect Erik Meijer, and was open-sourced in November 2012. Rx is a programming model whose goal is to provide a consistent programming interface that helps developers handle asynchronous data streams more easily. Rx supports almost all popular programming languages, and most of the language libraries are maintained by the ReactiveX organization. The more popular ones are RxJava, RxJS, Rx.NET, RxScala, and RxSwift. The community site is http://reactivex.io/.

1.4 The Reactive Manifesto

In June 2013, Roland Kuhn and others published the Reactive Manifesto, which defines the architectural design principles that reactive systems should follow. Systems that satisfy these principles are called reactive systems. According to the manifesto, a reactive system must have four traits: Responsive, Resilient, Elastic, and Message Driven. The following is excerpted from the official Reactive Manifesto website. The description is abstract, so there is no need to dwell on the details; a general understanding is enough.

Responsive. The system responds to user requests in a timely manner. Responsiveness is the cornerstone of usability and utility, and, more importantly, it means that problems can be detected quickly and dealt with effectively.
Resilient. The system stays responsive when a failure occurs. Recovery of each component is delegated to another, external component, and high availability is ensured through replication where necessary. The client of a component therefore no longer bears the burden of handling that component's failures.
Elastic. The system stays responsive under varying workloads. A reactive system can react to changes in the input rate, for example by scaling the underlying computing resources horizontally. This implies a design without central bottlenecks, so that individual components can be sharded or replicated and the load balanced among them.
Message Driven. Reactive systems rely on asynchronous message passing to establish a clear boundary between components, ensuring loose coupling, isolation, and location transparency. The boundary also provides the means to delegate failures as messages. Explicit message passing makes load management, elasticity, and flow control possible by shaping and monitoring the message queues in the system and applying back pressure when necessary. With location-transparent messaging as the means of communication, failures can be managed with the same constructs and semantics across a cluster or within a single host. Non-blocking communication lets recipients consume resources only while they are active, which reduces system overhead.

1.5 Reactive Streams

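The Reactive Manifesto describes design principles only and gives no concrete implementation specification. As a result, each reactive framework implemented its own API, and the frameworks could not interoperate. The Reactive Streams specification was created to solve this problem.

The goal of Reactive Streams is to define a minimal set of asynchronous stream-processing interfaces so that different frameworks, and even different languages, can interoperate. The specification comprises 4 interfaces, 7 methods, 43 rules, and a standard suite for compatibility testing, the TCK (Technology Compatibility Kit). It has become an industry standard, and its interfaces are included in Java 9 as java.util.concurrent.Flow. One caveat: the fact that Java 9 includes Reactive Streams does not make stream-processing frameworks such as RxJava, Reactor, and Akka Streams pointless. Quite the opposite. Reactive Streams exists to improve interoperability between frameworks and provides only a minimal feature set, which cannot cover everyday stream-processing needs: composition, filtering, caching, rate limiting, and similar features all have to be implemented separately. Stream-processing frameworks exist to provide these extra features and to interoperate across frameworks through the Reactive Streams specification.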
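As a rough illustration of these interfaces, the sketch below uses java.util.concurrent.Flow together with SubmissionPublisher, the publisher that ships with the JDK since Java 9. The class names FlowDemo and OneByOneSubscriber are made up for this example. The subscriber requests one element at a time, which is how a consumer expresses back pressure under the specification.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

class FlowDemo {
    // Subscriber that pulls one element at a time; request(1) is the back-pressure signal.
    static class OneByOneSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new CopyOnWriteArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1);                 // ask for the first element only
        }

        @Override
        public void onNext(Integer item) {
            received.add(item);
            subscription.request(1);      // ask for the next one once this one is handled
        }

        @Override
        public void onError(Throwable t) {
            done.countDown();
        }

        @Override
        public void onComplete() {
            done.countDown();
        }
    }

    // Publishes 1, 2, 3 and returns what the subscriber received.
    static List<Integer> run() {
        OneByOneSubscriber subscriber = new OneByOneSubscriber();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= 3; i++) {
                publisher.submit(i);      // blocks if the subscriber's buffer is full
            }
        }                                 // close() signals onComplete to the subscriber
        try {
            subscriber.done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The sketch touches six of the seven methods (subscribe, onSubscribe, onNext, onError, onComplete, request); cancel is the remaining one, and the Processor interface adds no methods of its own.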

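For example, MongoDB's Java driver implements the Reactive Streams specification, so with any stream-processing framework a developer needs only a few lines of code to watch database changes in real time. Below is an implementation based on Akka Streams: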

mongo
  .collection("users")
  .watch()
  .toSource
  .groupedWithin(10, 1.second)
  .throttle(1, 1.second)
  .runForeach { docs =>
    // process the incremental data
  }

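These few lines of code do the following:

Buffer the incoming stream elements for batch processing. Buffering ends and the batch is passed downstream as soon as either of these conditions is met:
- the buffer holds 10 elements
- buffering has lasted longer than 1000 milliseconds
Throttle the buffered elements, letting only 1 element (that is, one batch) through per second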

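1.6 Summary

This chapter first used concrete examples to give an intuitive picture of reactive systems, then reviewed the history of reactive, and finally introduced three reactive projects: ReactiveX, the Reactive Manifesto, and Reactive Streams. ReactiveX (Reactive Extensions) aims to provide reactive programming tools for every programming language. The Reactive Manifesto takes a higher-level view and describes in abstract terms what a reactive system is and which design principles it should follow. The Reactive Streams specification aims to improve interoperability among reactive frameworks and is not itself suited for direct use as a development framework. Developers should choose a mature reactive framework and interoperate with other frameworks through the Reactive Streams specification.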

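2 Why Do We Need Reactive?

2.1 Imperative Programming vs. Declarative Programming

Most programmers use traditional imperative programming, which mirrors how computers work. Imperative programming is an abstraction over hardware operations: the programmer must tell the computer precisely what to do, instruction by instruction. This is also the most tedious part of programming, because programmers have to rack their brains to translate complex, volatile business requirements into precise instructions.

Declarative programming is a powerful tool for freeing programmers from this burden. It focuses on what I want (What) rather than how to do it (How). SQL is the most typical declarative language: we describe what we want in SQL, and the database engine executes the statement and returns the result.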

SELECT COUNT(*)  FROM USER u  WHERE u.age > 30

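The Akka Streams code in section 1.5 that watches MongoDB is also typical declarative programming. Rewriting it imperatively would not only take time and effort but also greatly increase the amount of code and, most importantly, require more unit tests to ensure correctness.

Reactive architecture recommends declarative programming. Business logic is described in a way that is closer to natural language, the code is clear, readable, and expressive, and, most importantly, later maintenance costs drop significantly.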

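2.2 Synchronous Programming vs. Asynchronous Programming

Any discussion of synchronous versus asynchronous has to mention blocking versus non-blocking, because the two pairs of concepts are easily confused. They are confused because they describe the same thing with a different focus. Blocking and non-blocking concern the state of the current thread while a method executes, whereas synchronous and asynchronous concern how the caller is notified of the method's result. Because they describe a method call from different angles, the two pairs can be combined, pairing a thread state with a notification mechanism. For example, BIO in JDK 1.3 and earlier is synchronous blocking, NIO released in JDK 1.4 is synchronous non-blocking, and NIO.2 released in JDK 1.7 is asynchronous non-blocking.

Like imperative programming, synchronous programming is the traditional style in widespread use today. Its advantage is that the code is simple and easy to understand, executing statement by statement in order. Its drawback is very low CPU utilization, with most of the time wasted waiting on IO.

Asynchronous programming makes full use of CPU resources by executing tasks in parallel, and it far outperforms the synchronous approach in both execution time and resource utilization. For example, on a 10-core server, fetching 10 web pages synchronously at 1 second per page takes 10 seconds in total; done asynchronously, with each of the 10 fetch tasks running on its own thread, the total is only 1 second. Building a reactive system is not easy, especially when retrofitting a legacy system, which is a fairly long process. The core idea of reactive architecture is the asynchronous, non-blocking reactive stream. As a transitional step, we can first refactor the system to be fully asynchronous, laying the foundation for evolving further toward a reactive architecture. Next, we first analyze a traditional synchronous example and then refactor it to be asynchronous.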
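The ten-page arithmetic above can be checked with a small Java sketch. Everything in it is an assumption made for illustration: fetch only sleeps instead of downloading anything, and the delay is scaled down from 1 second to 100 ms so the demo finishes quickly. Note that each task still blocks inside its own thread, so this shows the gain from running the waits concurrently, not yet a non-blocking design.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class FetchTiming {
    static final int PAGES = 10;
    static final long FETCH_MILLIS = 100;   // scaled down from the 1 second in the text

    // Stand-in for a blocking page download (hypothetical; no real network call).
    static String fetch(int page) {
        try {
            Thread.sleep(FETCH_MILLIS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "page-" + page;
    }

    // Fetches the pages one after another on the calling thread; returns elapsed milliseconds.
    static long sequentialMillis() {
        long start = System.nanoTime();
        for (int i = 0; i < PAGES; i++) {
            fetch(i);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Starts all fetches at once on a pool of 10 threads and waits for all of them.
    static long concurrentMillis() {
        ExecutorService pool = Executors.newFixedThreadPool(PAGES);
        try {
            long start = System.nanoTime();
            List<CompletableFuture<String>> futures = IntStream.range(0, PAGES)
                    .mapToObj(i -> CompletableFuture.supplyAsync(() -> fetch(i), pool))
                    .collect(Collectors.toList());
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("sequential: " + sequentialMillis() + " ms");
        System.out.println("concurrent: " + concurrentMillis() + " ms");
    }
}
```

On an otherwise idle machine the sequential run should take roughly ten times the single-fetch delay and the concurrent run roughly one, which matches the 10 seconds versus 1 second in the text.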

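2.3 Synchronous Programming Example

Suppose we want to implement a method that queries the remaining balance of a mobile phone plan. The method takes a phone number and returns the plan's remaining balance, including remaining talk time, remaining text messages, and remaining data. Because the query has to issue three synchronous, blocking database queries in a row, the implementation uses a cache to improve read performance. The code is as follows: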

private PhonePlanCache cache;

public PhonePlan retrievePhonePlan(String phoneNo) {
    // Return the cached plan if there is one
    PhonePlan plan = cache.get(phoneNo);
    if (plan != null) {
        return plan;
    }
    // Otherwise issue three blocking database queries, one after another
    Long leftTalk = readLeftTalk(phoneNo);
    Long leftText = readLeftText(phoneNo);
    Long leftData = readLeftData(phoneNo);
    return new PhonePlan(leftTalk, leftText, leftData);
}

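We first check whether the plan balance can be read directly from the cache and, if so, return it. Otherwise we issue three synchronous, blocking remote calls in a row to read the talk, text, and data balances from the database. The logic is very simple, but because synchronous blocking code depends heavily on thread pools, we still have to estimate the thread pool and connection pool sizes from the SLA. The estimate is not easy, but fortunately we have Little's law.

Little's law, a result from queueing theory for which John Little published a proof in 1961, states that in a stable system the number of requests the system handles concurrently, L, equals the average arrival rate of requests, λ, multiplied by the average processing time per request, W, that is: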

L = λ * W

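The same law can be used to size thread pools and connection pools. For example, if the system receives 1000 requests per second and each request takes 10 ms on average, the appropriate database connection pool size is 10. In other words, the system handles 10 requests concurrently, and over the long run an average of 10 threads are waiting for a response on a database connection. Note, however, that Little's law applies only to a stable system and cannot account for peaks, and the peak request volume is usually much higher than the average. Suppose that to cope with peaks we raise the thread pool size to 50. Because the connection pool size is still 10, many threads end up waiting for a free connection, and we have to enlarge the connection pool again to improve performance. Parameters tuned back and forth in this way usually end up far from what Little's law suggests, and system performance degrades severely. Under high concurrency, a slight network jitter or database delay then causes a large backlog of requests almost instantly, and without effective countermeasures the system risks collapse.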
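The sizing calculation above is plain arithmetic, sketched here in Java. The class and method names are made up for this example, and the second call uses a hypothetical 50 ms latency that does not appear in the text; it is there only to show how quickly the required concurrency grows when the database slows down.

```java
class LittleLaw {
    // L = lambda * W, with W given in milliseconds and the result rounded up.
    static long concurrentRequests(long requestsPerSecond, long avgMillisPerRequest) {
        return (requestsPerSecond * avgMillisPerRequest + 999) / 1000;
    }

    public static void main(String[] args) {
        // Figures from the text: 1000 requests per second, 10 ms per request.
        System.out.println(concurrentRequests(1000, 10));  // prints 10
        // Hypothetical slowdown: at 50 ms per request the same load needs 50.
        System.out.println(concurrentRequests(1000, 50));  // prints 50
    }
}
```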

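2.4 Challenges of Synchronous Programming

Traditional applications are usually deployed in a Servlet container, and Servlet is a thread-based request processing model. As the discussion above shows, a fairly large thread pool is usually needed for good performance, and a large thread pool causes the following three problems:

Extra memory overhead. In Java, every thread has its own stack, 1 MB by default. With a thread pool of 200 threads, the application has to set aside about 200 MB for thread stacks alone, which wastes memory and slows down startup. Imagine deploying 1000 nodes at once: these problems are multiplied by 1000.
Low CPU utilization. Two factors lead to very low CPU utilization. First, since JDK 1.2 the JVM implementations on all platforms use a 1:1 thread model (Solaris is a special case), which means that each Java thread is mapped to one lightweight process, and the number of lightweight processes that can run effectively depends on the number of CPUs and cores. If the number of Java threads far exceeds that number, frequent thread context switching wastes a great deal of CPU time. Second, traditional remote calls and IO operations are blocking, so the executing thread is suspended and cannot run other tasks, which greatly reduces CPU utilization.
Fierce resource contention. Once the thread pool is enlarged, other shared resources, such as the database connection pool, become the performance bottleneck. If a shared resource is the bottleneck, no thread pool size, however large, improves performance. Multiple threads then compete for database connections, and the database connection becomes the bottleneck of the system.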

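Beyond these problems, synchronous programming also profoundly affects our architecture.

Suppose we are going to develop a single sign-on microservice with Dubbo 2.x as the microservice framework. That version does not yet support reactive programming, so calls between microservice interfaces are still synchronous and blocking. Suppose we need to implement the following two interfaces:

User login interface
Token verification interface

The user login interface has to access the database or cache several times and verify passwords with a slow hash algorithm such as Argon2, so its average response time is long, about 500 ms. The token verification interface only performs a simple signature check, so its average response time is short, about 5 ms. Suppose the business requires only 1000 tps for the login interface but 100,000 tps for the token verification interface.

Typically the two interfaces are implemented in the same microservice class and published in the same container. How much hardware do we need to meet the business requirement? To simplify the discussion, we treat the token verification interface as free and consider only the login interface. By Little's law, total threads (L) = TPS (λ) * average response time (W), that is:

Total threads (L) = 1000 * 0.5 = 500

Suppose each compute node has 4 cores and 8 GB of memory (4C8G). Then 500 / 4 = 125 compute nodes are needed in total. A mere 1000 tps requires 125 compute nodes! And that is not the end of it. 1000 tps is only the everyday load; what about peaks? Suppose the peak load is 10,000 tps and lasts 10 seconds. During those 10 seconds the system can also be regarded as stable, so by Little's law 1250 compute nodes would have to be deployed. It gets worse. If login requests back up on some node because of database latency or network jitter, they exhaust all request-processing threads, and token verification requests that could have been answered quickly can no longer be handled in time. Since the token verification interface runs at 100,000 tps, 100,000 verification requests pile up in a single second. The system is on the edge and could crash at any moment.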
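The starvation described above can be reproduced in miniature. The sketch below is an assumption-laden toy, not the Dubbo setup from the text: a fixed pool of 2 threads stands in for the container's request threads, a latch stands in for the slow database, and the task bodies are placeholders.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class SharedPoolStarvation {
    // Returns true if a fast "verify" task is stuck behind blocked "login" tasks.
    static boolean verifyIsStarved() {
        ExecutorService requestPool = Executors.newFixedThreadPool(2); // tiny pool for the demo
        CountDownLatch databaseResponds = new CountDownLatch(1);
        try {
            // Two slow logins occupy every request thread while they wait for the database.
            for (int i = 0; i < 2; i++) {
                requestPool.submit(() -> {
                    databaseResponds.await();
                    return "logged-in";
                });
            }
            // The token check itself is instant, but no thread is free to run it.
            Future<String> verify = requestPool.submit(() -> "token-ok");
            try {
                verify.get(200, TimeUnit.MILLISECONDS);
                return false;
            } catch (TimeoutException e) {
                return true;
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            databaseResponds.countDown();   // let the logins finish
            requestPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("verify starved: " + verifyIsStarved());
    }
}
```

The verify task needs no time at all, yet it cannot even start until a login thread is released, which is exactly why a slow interface can take a fast one down with it when both share a pool.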

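To keep token verification responsive, we have no choice but to change the architecture, splitting login and verification into two separate microservices, each deployed in its own container. Is everything fine now? Unfortunately, single sign-on gets a new requirement: employee accounts must be authenticated through a remote LDAP call, and a remote LDAP call is also a synchronous blocking operation. Every such call suspends a thread, and a large number of them exhausts all threads. The suspended threads do nothing but wait for the remote response. This is the root cause of avalanches in microservice call chains. Calls between just two microservices are already this tricky; what if the call chain contains 10 or more microservice calls? That would be a nightmare!

The root of all these problems is the traditional synchronous blocking style of programming. In microservice scenarios in particular, the risk keeps growing as the call chain gets longer. A synchronous blocking operation on any one node blocks the threads of all its downstream nodes, and if requests back up on the faulty node, the threads of all downstream nodes are exhausted. This is the dreaded avalanche.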

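2.5 Asynchronous Programming Example

By asynchronous programming we usually mean asynchronous, non-blocking programming, which requires that no code in the system block a thread. In practice, fully asynchronous non-blocking code is very hard to achieve, because many third-party libraries and drivers still use synchronous blocking calls. We need to give these libraries and drivers their own thread pools so that they do not affect other service interfaces.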
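One way to isolate such a library is to run its calls on a dedicated executor and hand callers a CompletableFuture. The following is a minimal sketch under assumptions: blockingLookup is a placeholder for a real synchronous driver call, and the pool size of 4 and the thread name blocking-io are arbitrary choices for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BlockingIsolation {
    // Dedicated pool for blocking calls; the size 4 and the thread name are illustrative.
    static final ExecutorService BLOCKING_POOL = Executors.newFixedThreadPool(4, runnable -> {
        Thread thread = new Thread(runnable, "blocking-io");
        thread.setDaemon(true);
        return thread;
    });

    // Stand-in for a synchronous third-party call such as a blocking driver (hypothetical).
    static String blockingLookup(String key) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return key + "@" + Thread.currentThread().getName();
    }

    // Wraps the blocking call so that callers get a future and never block their own thread.
    static CompletableFuture<String> lookupAsync(String key) {
        return CompletableFuture.supplyAsync(() -> blockingLookup(key), BLOCKING_POOL);
    }

    public static void main(String[] args) {
        System.out.println(lookupAsync("user-1").join());
    }
}
```

If the wrapped library stalls, only the threads of this pool are tied up, and interfaces served by other pools keep responding.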

Using two Java 8 features, CompletableFuture and lambdas, we refactor the example from section 2.3 to be asynchronous. The refactored code is as follows:

private PhonePlanCache cache;

public CompletableFuture<PhonePlan> retrievePhonePlan(String phoneNo) {
    // Cache hit: return an already completed future
    PhonePlan cachedPlan = cache.get(phoneNo);
    if (cachedPlan != null) {
        return CompletableFuture.completedFuture(cachedPlan);
    }
    // Start the three queries without waiting for one another
    CompletableFuture<Long> leftTalkFuture = readLeftTalk(phoneNo);
    CompletableFuture<Long> leftTextFuture = readLeftText(phoneNo);
    CompletableFuture<Long> leftDataFuture = readLeftData(phoneNo);
    // Combine the three results once they are all available
    CompletableFuture<PhonePlan> planFuture = leftTalkFuture
        .thenCombine(leftTextFuture, (leftTalk, leftText) -> {
            PhonePlan plan = new PhonePlan();
            plan.setLeftTalk(leftTalk);
            plan.setLeftText(leftText);
            return plan;
        })
        .thenCombine(leftDataFuture, (plan, leftData) -> {
            plan.setLeftData(leftData);
            return plan;
        });
    return planFuture;
}

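Although asynchronous programming improves performance, the coding complexity rises considerably, and if the asynchronous call chain grows too long it easily leads to callback hell.

ES2017 provides the async/await keywords at the language level to simplify asynchronous programming, letting developers write asynchronous code in a synchronous style, for example: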

const leftTalk = await readLeftTalkPromise(phoneNo);
const leftText = await readLeftTextPromise(phoneNo);
const leftData = await readLeftDataPromise(phoneNo);
const phonePlan = new PhonePlan(leftTalk, leftText, leftData);

In Scala, a for expression can also simplify asynchronous programming, for example:

for {
  leftTalk <- leftTalkFuture
  leftText <- leftTextFuture
  leftData <- leftDataFuture
} yield new PhonePlan(leftTalk, leftText, leftData)

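Seeing how simple asynchronous programming is in other languages, are you envious? Don't worry: in the next article we will see how reactive programming simplifies asynchronous calls.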

Summary

The two sections above introduced the basic concepts of reactive. The first section described what reactive is, including its history and some related projects. The second section explained why reactive is needed, using a traditional programming example to show the problems and challenges of synchronous programming. In microservice scenarios in particular, faced with the complex call chains of thousands of microservice interfaces, we are forced into architectural changes that add no value just to avoid the risk of avalanches. This raises development and deployment costs and makes operations harder. Synchronous programming has profoundly affected our architecture. Either way, moving to reactive is a long process. Along the way we need to keep improving the infrastructure and also invest in developer training, because reactive programming changes the traditional way of programming and requires a shift in thinking, which is a challenge for developers too. The transformation is painful, but a successful one brings new life.