2020.1.3 Distributed systems theory: CAP

Distributed systems theory CAP

About CAP

In 1998, three properties of a distributed system were proposed, known together as CAP (Consistency, Availability, Partition tolerance):

Consistency

  • Consistency
    All replicas of the data in the distributed system must hold the same value at the same moment.

For example, take an image-hosting service backed by three machines A, B, and C. An image is uploaded and the gateway happens to route the upload to A. The image is then requested immediately; because the gateway uses a round-robin algorithm, this request will most likely not land on A. Whichever of B or C it lands on, the image must be present there and be returned; otherwise the system is not consistent.
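The scenario above can be sketched in a few lines. This is a minimal illustration, not a real gateway: the `Node` and `Gateway` classes, the `replicate` flag, and the file name `cat.png` are all hypothetical names introduced here.

```python
from itertools import cycle


class Node:
    """One backend machine holding its own copy of the uploaded images."""

    def __init__(self, name):
        self.name = name
        self.images = {}


class Gateway:
    """Hypothetical gateway that routes each request to the next node in turn."""

    def __init__(self, nodes):
        self.nodes = nodes
        self._next = cycle(nodes)

    def upload(self, key, data, replicate):
        target = next(self._next)
        target.images[key] = data
        if replicate:
            # Copy the image to every node before acknowledging the upload.
            for node in self.nodes:
                node.images[key] = data
        return target.name

    def fetch(self, key):
        target = next(self._next)
        return target.name, target.images.get(key)


def demo(replicate):
    gateway = Gateway([Node("A"), Node("B"), Node("C")])
    stored_on = gateway.upload("cat.png", b"image-bytes", replicate)
    served_by, data = gateway.fetch("cat.png")
    return stored_on, served_by, data is not None


print(demo(replicate=False))  # ('A', 'B', False): the read on B misses
print(demo(replicate=True))   # ('A', 'B', True): every node has the image
```

Without replication the upload lands on A and the very next read lands on B, which does not have the image yet; that is exactly the inconsistency the example describes.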

When a write has to reach more than one partition, whether through shard synchronization in a distributed storage system or master-slave synchronization in a cluster, the copies are always updated in some order, and while that sequence is in progress the copies are inconsistent. At this point you have two options:

  • Either accept that the same data is temporarily inconsistent across storage nodes while synchronization is in progress.
  • Or insist on strong consistency, in which case availability suffers during synchronization.
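The two options can be contrasted with a small sketch. It assumes a deliberately simplified model: `Replica`, `write_sync`, `write_async`, and `ReplicaUnreachable` are hypothetical names, and "unreachable" stands in for any replica that synchronization has not reached yet.

```python
class Replica:
    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.value = None


class ReplicaUnreachable(Exception):
    pass


def write_sync(replicas, value):
    """Strong consistency: refuse the write unless every replica can take it."""
    down = [r.name for r in replicas if not r.reachable]
    if down:
        raise ReplicaUnreachable(f"cannot reach {down}, write rejected")
    for r in replicas:
        r.value = value


def write_async(replicas, value):
    """Accept the write on reachable replicas and return the ones that lag."""
    pending = []
    for r in replicas:
        if r.reachable:
            r.value = value
        else:
            pending.append(r)
    return pending


replicas = [Replica("A"), Replica("B"), Replica("C", reachable=False)]

# Option 1: the write succeeds, but C is temporarily inconsistent.
lagging = write_async(replicas, "v1")
print([r.value for r in replicas], [r.name for r in lagging])

# Option 2: the write is rejected, so the system is unavailable for writes.
try:
    write_sync(replicas, "v2")
except ReplicaUnreachable as exc:
    print("rejected:", exc)
```

In the first case the caller gets an answer but the replicas disagree for a while; in the second the replicas never disagree but the caller is turned away.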

Availability

  • Availability
    After some nodes in the cluster fail, the cluster as a whole can still respond to client read and write requests; even if a node goes down, none of your requests are affected.

Partition tolerance

  • Partition tolerance
    If the machines in the cluster are split into two parts that cannot communicate with each other, the system can still continue to work.

Analyzing the doubts

  • Many people say
    • Of consistency, availability, and partition tolerance, you can only pick any two. At first glance that looks right, but can AC really be satisfied?
  • Baidu Encyclopedia exact words
    • The essence of the CAP principle is: either AP, or CP, or AC, but never all of CAP.
  • AP and CP in practice
    • AP systems (Eureka) choose availability, while CP systems (Zookeeper, HDFS) choose consistency. But does the AC model really not exist? And how would it cope when communication fails?
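  • The original statement of the theory
    • The author's point is that only CP or AP is possible. Network problems split the members of the system into two regions that cannot learn each other's state, and this is very common in a distributed environment. P is therefore mandatory, and the only choice left is between A and C. Building a network that never suffers multiple correlated failures is impractical for a distributed system, so designers must choose between consistency (C) and availability (A).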

To sum up

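In a distributed environment P is a given: as soon as we have more than one machine, network partitions are unavoidable, so when designing a system we have to decide whether it is an AP system or a CP system. In practice, though, once we understand CAP more deeply, we find that system design sometimes does not need to agonize over this choice, mainly for two reasons: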

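Network partitions do happen from time to time, since nobody can guarantee a network that is 100% stable and never partitions. But we can detect them through heartbeats, forcibly take the affected nodes offline, and spread the traffic evenly across the remaining nodes. As a result there is no need to deliberately give up one of A and C; both can be done reasonably well. This is the AC system that Baidu Encyclopedia mentions.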
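The heartbeat-and-evict idea can be sketched as follows. This is only an illustration of the mechanism under simple assumptions: `HeartbeatRegistry`, `spread`, and the 3-second timeout are hypothetical, and timestamps are passed in explicitly so the example is deterministic.

```python
import time


class HeartbeatRegistry:
    """Tracks the last heartbeat of each node and evicts silent ones."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}

    def heartbeat(self, node, now=None):
        self.last_seen[node] = time.monotonic() if now is None else now

    def live_nodes(self, now=None):
        now = time.monotonic() if now is None else now
        # Evict every node whose last heartbeat is older than the timeout.
        expired = [n for n, seen in self.last_seen.items()
                   if now - seen > self.timeout]
        for n in expired:
            del self.last_seen[n]
        return sorted(self.last_seen)


def spread(requests, nodes):
    """Assign requests to the live nodes in turn so the load stays even."""
    if not nodes:
        raise RuntimeError("no live nodes left to serve traffic")
    return {req: nodes[i % len(nodes)] for i, req in enumerate(requests)}


registry = HeartbeatRegistry(timeout_seconds=3.0)
for node in ("A", "B", "C"):
    registry.heartbeat(node, now=0.0)
# A and B keep reporting; C has gone silent (for example, it is partitioned).
registry.heartbeat("A", now=4.0)
registry.heartbeat("B", now=4.0)

live = registry.live_nodes(now=5.0)
print(live)                     # ['A', 'B']
print(spread(range(4), live))   # {0: 'A', 1: 'B', 2: 'A', 3: 'B'}
```

Once C misses its heartbeats it is dropped from the registry, and new requests are divided between A and B only.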

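CAP defines A as 100% availability, but in practice we only need to provide high availability. As with a gateway, going a whole year without any downtime is impossible, but meeting a target of 99.99% or 99.999% (a certain number of nines) is enough.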
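To make the "number of nines" concrete, the downtime each target allows per year can be computed directly. A small sketch, assuming a 365-day year; the function name `allowed_downtime_minutes` is made up for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def allowed_downtime_minutes(availability_percent):
    """Minutes per year a service may be down and still meet the target."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * MINUTES_PER_YEAR


for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% -> {minutes:.2f} minutes of downtime per year")
```

Under this assumption, 99.99% allows roughly 52.6 minutes of downtime per year and 99.999% roughly 5.3 minutes.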


Origin www.cnblogs.com/qichunlin/p/12145295.html