[Repost] The Meaning of the CAP Theorem

Original link: The Meaning of the CAP Theorem

Author:  Ruan Yifeng

Date:  July 16, 2018

Distributed systems are becoming increasingly important; large websites are almost always distributed.

The hardest part of a distributed system is keeping the state of every node synchronized. The CAP theorem is the fundamental theorem in this area, and it is the starting point for understanding distributed systems.

This article introduces the theorem. It is actually easy to understand, and even obvious. What follows draws mainly on Michael Whittaker's article.

1. Three indicators of a distributed system

In 1998, Eric Brewer, a computer scientist at the University of California, proposed that a distributed system has three indicators.

  • Consistency
  • Availability
  • Partition tolerance

Their initials are C, A, and P.

Eric Brewer said that these three indicators cannot all be achieved at the same time. This conclusion is called the CAP theorem.

2. Partition tolerance

Start with partition tolerance, which the Chinese term renders literally as "partition fault tolerance."

Most distributed systems are spread across multiple subnetworks. Each subnetwork is called a partition. Partition tolerance means that communication between partitions may fail. For example, one server is in China and another is in the United States; these are two partitions, and they may be unable to communicate with each other.

Suppose G1 and G2 are two servers in different partitions. G1 sends a message to G2, but G2 may not receive it. System design must take this situation into account.

In general, partitions cannot be avoided, so the P in CAP can be assumed to always hold. The CAP theorem then tells us that the remaining two, C and A, cannot both be achieved.
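
To make the failure mode concrete, here is a minimal Python sketch of a link between G1 and G2 that can be cut. The `Network` class, its `partitioned` flag, and the inbox list are hypothetical names invented for illustration, not part of any real library.

```python
class Network:
    """Hypothetical link between two servers in different partitions."""

    def __init__(self):
        self.partitioned = False  # True models a failed link

    def send(self, inbox, message):
        """Try to deliver a message; return False if it was lost."""
        if self.partitioned:
            return False          # the message never arrives
        inbox.append(message)
        return True


g2_inbox = []                     # messages that G2 has received
net = Network()

delivered_before = net.send(g2_inbox, "hello")        # link is healthy
net.partitioned = True                                # the link fails
delivered_after = net.send(g2_inbox, "hello again")   # this one is lost
```

The sender only learns that the second message was lost because this toy `send` reports it. On a real network G1 typically sees nothing but a missing reply, which is why the design has to plan for the case up front.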

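3. Consistency

Consistency means that a read issued after a write must return the value that was written. For example, a record holds v0, and the user issues a write to G1 that changes it to v1.

The user's next read then returns v1. This is consistency.

The problem is that the user may send the read to G2 instead. Since G2's value has not changed, it returns v0. The reads from G1 and G2 disagree, so consistency is not satisfied.

To make G2 change to v1 as well, G1 must send G2 a message during its write, asking G2 to change its value to v1 too.

That way, a read sent to G2 also returns v1.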
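
The two cases can be sketched in a few lines of Python. This is an illustration only: the `Server` class and the `replicate` flag are hypothetical, and the replication message is assumed to always arrive.

```python
class Server:
    """Hypothetical server that holds a single value."""

    def __init__(self, name, value):
        self.name = name
        self.value = value
        self.peers = []           # servers that should receive our writes

    def write(self, value, replicate=True):
        self.value = value
        if replicate:
            for peer in self.peers:
                peer.value = value  # assumed to always be delivered

    def read(self):
        return self.value


g1 = Server("G1", "v0")
g2 = Server("G2", "v0")
g1.peers.append(g2)

g1.write("v1", replicate=False)   # G1 does not tell G2
stale = g2.read()                 # G2 still answers with the old value

g1.write("v1")                    # G1 forwards the write to G2
fresh = g2.read()                 # now G2 agrees with G1
```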

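4. Availability

Availability means that whenever a server receives a request from a user, it must give a response.

The user can send a read to either G1 or G2. Whichever server receives the request must tell the user whether the value is v0 or v1; otherwise availability is not satisfied.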

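5. The conflict between consistency and availability

Why can't consistency and availability both hold? The answer is simple: communication may fail (that is, a partition may occur).

To guarantee G2's consistency, G1 must lock G2's reads and writes while it performs a write. Reads and writes reopen only after the data has been synchronized. While it is locked, G2 cannot be read or written, so it is not available.

To guarantee G2's availability, G2 must not be locked, so consistency does not hold.

In summary, G2 cannot be both consistent and available. System design can pursue only one of the two goals. If you pursue consistency, you cannot guarantee the availability of every node; if you pursue availability of every node, you cannot achieve consistency.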
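
The trade-off can be sketched as follows. This is a simplified model under stated assumptions: the `G2` class, its `prefer` setting, the `Unavailable` exception, and `write_via_g1` are hypothetical names, and the lock is reduced to a flag that is set before the write and cleared only once the update reaches G2.

```python
class Unavailable(Exception):
    """Raised when a locked server refuses to answer."""


class G2:
    """Hypothetical replica that prefers either consistency or availability."""

    def __init__(self, prefer):
        self.prefer = prefer      # "consistency" or "availability"
        self.value = "v0"
        self.locked = False

    def read(self):
        if self.locked:
            raise Unavailable("G2 is locked until it is synchronized")
        return self.value


def write_via_g1(g2, value, link_up):
    """G1 accepts a write and tries to bring G2 up to date."""
    if g2.prefer == "consistency":
        g2.locked = True          # block reads and writes during the update
    if link_up:
        g2.value = value          # the update reaches G2
        g2.locked = False         # reopen reads and writes after the sync


# Both replicas see the same partition: the update cannot reach them.
cp = G2("consistency")
write_via_g1(cp, "v1", link_up=False)
try:
    cp_result = cp.read()
except Unavailable:
    cp_result = "no response"     # consistent, but not available

ap = G2("availability")
write_via_g1(ap, "v1", link_up=False)
ap_result = ap.read()             # available, but returns the stale v0
```

With the link up, both variants return v1; the two goals only pull apart while the partition lasts.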

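[Update 2018.7.17]

A reader asked: in what situations does availability outweigh consistency?

For example, a web page is published to a CDN, and multiple servers hold copies of it. Later an error is found and the page needs to be updated; the only option is to update every server, one after another.

In general, web page updates do not put much emphasis on consistency. For a short period, some users get the old version and others get the new one, which is not a big problem. Of course, everyone eventually sees the new version. So this is a situation where availability outweighs consistency.

(End)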
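
A rough Python sketch of such a rollout is shown below. The server names and the `rolling_update` helper are made up for illustration; a real CDN would propagate changes by its own mechanisms.

```python
def rolling_update(edges, new_page):
    """Update one server at a time and report which versions users can see."""
    for name in edges:
        edges[name] = new_page
        yield sorted(set(edges.values()))  # versions visible after this step


# Hypothetical CDN with three servers, all serving the old page.
edges = {"edge-a": "old", "edge-b": "old", "edge-c": "old"}

steps = list(rolling_update(edges, "new"))
```

During the rollout both versions are being served at once, and every server keeps answering requests throughout. Only after the last server is updated do all users see the same page.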

Origin www.cnblogs.com/youring2/p/10991224.html