Understand distributed CAP and BASE theory in 10 minutes

CAP theory

In July 2000, Professor Eric Brewer proposed the CAP conjecture. Two years later, Seth Gilbert and Nancy Lynch proved it, and it has since been accepted as the CAP theorem of distributed computing. The theorem says that a distributed system cannot guarantee consistency, availability, and partition tolerance all at the same time; at most two of the three can be satisfied.

  • C: Consistency
  • A: Availability
  • P: Partition Tolerance

Let us construct a very simple distributed system.

  • G1 and G2 are both servers
  • Two servers can communicate with each other
  • Clients can randomly access any server 

Consistency

Gilbert and Lynch describe consistency as follows.

any read operation that begins after a write operation completes must return that value, or the result of a later write operation

In other words, once a write has completed, every read that starts afterwards must return the written value or a newer one.

A client sends a write to server G1, changing a variable from its initial value v0 to v1. The next client read may be served by either G1 or G2:

  • If the read goes to G1, it returns the changed value v1, which satisfies consistency.

  • If the read goes to G2, the result depends on whether G1's synchronization message has reached G2:

  1. If synchronization has completed, the read returns v1, which also satisfies consistency.

  2. If synchronization has not yet completed, G2 still returns v0, which violates consistency.
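The scenario above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Server` class, its `outbox` queue, and the `sync` method are hypothetical names invented for illustration, and synchronization is a plain function call rather than a real network message.

```python
class Server:
    """A toy replica holding a single variable (hypothetical model)."""

    def __init__(self, name, value):
        self.name = name
        self.value = value
        self.peer = None    # the other replica, wired up below
        self.outbox = []    # writes not yet synchronized to the peer

    def write(self, value):
        # Apply the write locally and queue a synchronization message.
        self.value = value
        self.outbox.append(value)

    def sync(self):
        # Deliver every queued write to the peer, oldest first.
        while self.outbox:
            self.peer.value = self.outbox.pop(0)

    def read(self):
        return self.value


g1, g2 = Server("G1", "v0"), Server("G2", "v0")
g1.peer, g2.peer = g2, g1

g1.write("v1")
read_g1 = g1.read()       # read served by G1
stale_read = g2.read()    # read served by G2 before synchronization
g1.sync()
fresh_read = g2.read()    # read served by G2 after synchronization

print(read_g1, stale_read, fresh_read)  # v1 v0 v1
```

The middle read is the inconsistent case: the write has completed on G1, yet G2 still answers with v0.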

Partition Tolerance

Gilbert and Lynch describe partition tolerance as follows.

the network will be allowed to lose arbitrarily many messages sent from one node to another

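In other words, the network may drop any number of messages sent from one node to another.

In a distributed environment, communication between nodes can fail, and the system then splits into what are called partitions. We have to plan for this at design time; if the system still satisfies the remaining A and C well when that happens, we can say it has good partition tolerance.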

Availability

Gilbert and Lynch describe availability as follows.

every request received by a non-failing node in the system must result in a response

In an available system, as long as a server has not crashed, it must return a response to every request a client sends it.

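Why CAP cannot all be satisfied at once

From the definitions above we know that partitions cannot be avoided, so P always has to be considered. Why can't C and A both be achieved? The partition is to blame.

If we guarantee consistency, then after a write to G1, G2 may respond to clients only once the data has been synchronized to it. While a partition keeps that synchronization from completing, G2 cannot respond, so availability is lost.

Conversely, if we guarantee availability, G2 must answer even though it may not have the latest data, so consistency can no longer be guaranteed. The two are rivals that cannot both prevail.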
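The trade-off can be made concrete with a small simulation. The sketch below is an assumption-laden toy, not a real protocol: `Replica`, `Unavailable`, the `mode` flag, and `run` are hypothetical names, the partition is modeled as a boolean, and only reads are gated in "CP" mode (a real consistent system would also have to decide what to do with the write itself).

```python
class Unavailable(Exception):
    """Raised when a replica refuses to answer (hypothetical error type)."""


class Replica:
    def __init__(self, mode):
        self.value = "v0"
        self.mode = mode             # "CP" or "AP"
        self.peer_reachable = True   # False while the network is partitioned

    def write(self, value):
        self.value = value

    def read(self):
        # A CP replica refuses to answer when it cannot rule out a missed write.
        if self.mode == "CP" and not self.peer_reachable:
            raise Unavailable("peer unreachable, latest value unknown")
        # An AP replica always answers, possibly with stale data.
        return self.value


def run(mode):
    g1, g2 = Replica(mode), Replica(mode)
    # The partition: messages between G1 and G2 are lost.
    g1.peer_reachable = g2.peer_reachable = False
    g1.write("v1")                   # never reaches G2
    try:
        return g2.read()
    except Unavailable:
        return "no response"


cp_result = run("CP")
ap_result = run("AP")
print(cp_result, "|", ap_result)  # no response | v0
```

With the partition in place, G2 can either stay silent (consistent but unavailable) or answer with v0 (available but inconsistent); neither mode in this model returns v1.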

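Summary

Based on the analysis above, in distributed systems we usually choose AP and sacrifice consistency. Sacrificing consistency does not mean ignoring it; it means satisfying A and P first and then working out how to deal with C. For that, see the BASE theory below.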

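BASE theory

Dan Pritchett, an architect at eBay, distilled his practical experience with large-scale distributed systems into an article published by ACM, in which he proposed BASE. BASE is an extension of CAP. Its core idea is that even when strong consistency cannot be achieved (Strong Consistency; the consistency in CAP is strong consistency), an application can use an approach suited to it to reach eventual consistency (Eventual Consistency).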

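Basically Available

When a distributed system suffers a failure, it is allowed to lose part of its availability, as long as the core functionality stays available.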
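One common way to keep the core available is to degrade optional features when a dependency fails. The sketch below assumes a hypothetical order page whose recommendations are optional; `handle_order_page`, `fetch_order`, and `fetch_recommendations` are made-up names for illustration, not part of any real API.

```python
def handle_order_page(fetch_order, fetch_recommendations):
    """Always serve the core feature; degrade the optional one on failure."""
    page = {"order": fetch_order()}          # core: must succeed
    try:
        page["recommendations"] = fetch_recommendations()
    except ConnectionError:
        page["recommendations"] = []         # degraded, but still a response
    return page


def failing_recommendations():
    # Stand-in for a dependency that is down during the failure.
    raise ConnectionError("recommendation service unreachable")


page = handle_order_page(lambda: {"id": 42}, failing_recommendations)
print(page)  # {'order': {'id': 42}, 'recommendations': []}
```

The client still gets its order, only without the extras, which is the "lose part of the availability, keep the core" idea in miniature.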

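Soft State

The system accepts that state may be out of sync for a period of time, that is, it allows an intermediate state, provided that this intermediate state does not affect the overall availability of the system. The intermediate state here is the data inconsistency from CAP theory.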

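Eventually Consistent

Soft state cannot last forever; there has to be a time limit. Once that limit has passed, the system guarantees that, provided no new updates arrive, the data will eventually reach a consistent state, so every client accessing the data will eventually get the latest value.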
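As a rough illustration of soft state converging, the sketch below has replicas exchange versioned values and keep the newer one. Everything here is an assumption made for the example: the `Replica` class, the integer `version` counter, and the `anti_entropy` function are hypothetical, and the scheme only works if writes do not happen concurrently on different replicas (real systems need something stronger, such as vector clocks).

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.value = "v0"

    def write(self, value):
        # Each local write bumps the version so newer data can be recognized.
        self.version += 1
        self.value = value

    def merge(self, other):
        # Adopt the other copy only if it is strictly newer.
        if other.version > self.version:
            self.version, self.value = other.version, other.value


def anti_entropy(a, b):
    """Background exchange that brings both replicas to the newest value."""
    a.merge(b)
    b.merge(a)


g1, g2 = Replica(), Replica()
g1.write("v1")
soft_state = (g1.value, g2.value)    # replicas temporarily disagree
anti_entropy(g1, g2)                 # runs within the time limit
converged = (g1.value, g2.value)     # no new writes, so they now agree

print(soft_state, converged)  # ('v1', 'v0') ('v1', 'v1')
```

Between the write and the exchange the system is in soft state; after the exchange, with no further updates, every read returns v1.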

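Conclusion

CAP is a design theory for distributed systems, and BASE extends the AP option within CAP: for C, the approach and strategy we adopt is to guarantee eventual consistency.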

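References

An illustrated proof of the CAP theorem (in English): https://mwhittaker.github.io/blog/an_illustrated_proof_of_the_cap_theorem/

CAP Twelve Years Later: How the "Rules" Have Changed: https://www.infoq.cn/article/cap-twelve-years-later-how-the-rules-have-changed

The meaning of the CAP theorem: http://www.ruanyifeng.com/blog/2018/07/cap.html

从零开始学架构 (Learning Architecture from Scratch), PDF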

Origin www.cnblogs.com/chengtian/p/11278072.html