Distributed System Design Tradeoffs: CAP

Consistency, Availability, Partition Tolerance

 

1. Why learn and document a series of distributed design concepts

During system design reviews at work, colleagues often invoke concepts such as high availability and consistency, and use these basic concepts to argue against the original design. But many people's understanding of availability, consistency, and similar terms is something they came up with themselves, or differs from the original definitions. Debating under those conditions is like talking on different frequencies: no substantive progress is made. It is therefore worth knowing the theoretical basis well, so as not to embarrass oneself. (There are many similar examples. Technical people in China like to blur such terms when they talk about them; for example, the XX cloud actually sells VPS and a small amount of SaaS. Is that cloud computing?)

2. What this article covers

In distributed system design reviews, the most debated topic is the famous CAP theorem. This article mainly presents my own understanding and application of the CAP theorem.

CAP theory

What is a distributed system

A system whose components run on different nodes and cooperate over a network is called a distributed system.

What does CAP stand for?

• Consistency
  • (all nodes see the same data at the same time)
• Availability
  • (reads and writes always succeed)
• Partition tolerance
  • (the system continues to operate despite arbitrary message loss or failure of part of the system)

Consistency: once an update succeeds and returns to the client, all distributed nodes hold exactly the same data at the same moment.

Availability: both read and write operations succeed.

Partition tolerance: the system can continue to serve requests when a network failure prevents the distributed nodes from communicating with each other.

What is the relationship among C, A, and P?

The CAP theorem states that although consistency, high availability, and partition tolerance are all desirable in every system, no system can achieve all three at the same time; at most two of the three properties can be guaranteed together.

Note: do not mix weak consistency or eventual consistency into the CAP theorem (there are many pitfalls where these concepts get confused). They are unrelated to it: the C in CAP means that after an update completes, every node sees exactly the same data. Weak consistency and eventual consistency are by definition at odds with the C of CAP, so you can see how ridiculous it is when people falsely claim that their systems have all three CAP properties at once. A more common scenario in China may be this: the moment a developer steps on stage to give a talk, they turn into a marketer and discard even the most basic concepts.
There is an article with a great title, cap-twelve-years-later-how-the-rules-have-changed. What has changed in that article is mostly the way of thinking; the CAP theorem itself has not changed.

Why this is the case

Consider a simple problem: a DB service is deployed in two data centers (Beijing and Guangzhou), and both DB instances serve reads and writes at the same time.

  1.  Assume that a DB update must be written to both the Beijing and Guangzhou DBs before it returns success.
      With no network failure, this satisfies CA. C: once any write succeeds and returns to the client, all distributed nodes hold exactly the same data at that moment. A: read and write operations succeed. But when a network failure occurs, C and A cannot both be guaranteed; that is, P is not satisfied.


  2.  Assume that a DB update returns success once it is written to the local data center, and is synchronized to the other data center by binlog/oplog replay.
      In the event of a network failure, both data centers can still serve requests and both reads and writes succeed, so AP is satisfied. C is not: after an update returns success, the DBs in the two data centers are temporarily inconsistent, and during a network failure the inconsistency window can become very large (only eventual consistency is guaranteed).


  3.  Assume that a DB update must be written to both the Beijing and Guangzhou DBs before it returns success, and that the service degrades when the network fails, for example by rejecting writes and serving reads only.
      This keeps the data consistent and still provides service during a network failure, so CP is satisfied, but the availability requirement is not.
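
The three scenarios above can be sketched as a toy in-memory simulation. This is only an illustration under stated assumptions, not how any real database replicates: the `Cluster` and `Replica` classes, the mode names `"sync"`, `"async"`, and `"sync_readonly"`, and the `demo` helper are all hypothetical names invented here, and a network partition is modeled as a single boolean flag.

```python
from dataclasses import dataclass, field


class Unavailable(Exception):
    """Raised when the service refuses a request during a partition."""


@dataclass
class Replica:
    name: str
    data: dict = field(default_factory=dict)
    backlog: list = field(default_factory=list)  # writes not yet replayed to the peer


class Cluster:
    """Two replicas (Beijing, Guangzhou) joined by a link that can be cut."""

    def __init__(self, mode):
        self.mode = mode  # hypothetical modes: "sync", "async", "sync_readonly"
        self.partitioned = False
        self.replicas = {n: Replica(n) for n in ("beijing", "guangzhou")}

    def _peer(self, site):
        return self.replicas["guangzhou" if site == "beijing" else "beijing"]

    def write(self, site, key, value):
        local, peer = self.replicas[site], self._peer(site)
        if self.mode == "async":
            # Scenario 2: acknowledge after the local write, replicate later.
            local.data[key] = value
            local.backlog.append((key, value))
            self.replay()
            return
        # Scenarios 1 and 3: both rooms must apply the write before success.
        if self.partitioned:
            raise Unavailable("cannot reach the peer room; write rejected")
        local.data[key] = value
        peer.data[key] = value

    def read(self, site, key):
        if self.mode == "sync" and self.partitioned:
            # Scenario 1: no degraded mode, so the service is down entirely.
            raise Unavailable("service offline during partition")
        return self.replicas[site].data.get(key)

    def replay(self):
        """Ship queued writes to the peer while the link is up (log replay)."""
        if self.partitioned:
            return
        for site, replica in self.replicas.items():
            for key, value in replica.backlog:
                self._peer(site).data[key] = value
            replica.backlog.clear()


def demo(mode):
    """Write 100, cut the link, try to write 50 in Beijing, read in Guangzhou."""
    cluster = Cluster(mode)
    cluster.write("beijing", "balance", 100)
    cluster.partitioned = True
    try:
        cluster.write("beijing", "balance", 50)
        write_ok = True
    except Unavailable:
        write_ok = False
    try:
        seen = cluster.read("guangzhou", "balance")
    except Unavailable:
        seen = "unavailable"
    return write_ok, seen


for mode in ("sync", "async", "sync_readonly"):
    print(mode, demo(mode))
```

During the partition, `"sync"` rejects both the write and the read (consistent, but not available), `"async"` accepts the write while Guangzhou still returns the stale value 100 (available, but not consistent), and `"sync_readonly"` rejects the write but still serves the consistent value 100 (consistent with degraded availability).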

Choosing the tradeoff

The examples above show that we can never have all three CAP properties at the same time, so how do we weigh the choice?
The key is the business scenario.

For most Internet applications (such as the NetEase portal), there are many machines and widely scattered deployment nodes, so network failures are the norm, and availability must be guaranteed. The only option is to give up consistency and choose AP to keep the service running. The high-availability services that boast five-nines or six-nines SLAs are all giving up C and choosing AP.
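
To put the "nines" in perspective, the yearly downtime budget they imply can be worked out directly. The sketch below assumes a 365-day year and reads "N nines" as an availability of 1 - 10^-N; the function name `allowed_downtime_seconds` is made up for this illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: non-leap year


def allowed_downtime_seconds(nines):
    """Yearly downtime budget for `nines` nines, e.g. 5 means 99.999%."""
    unavailability = 10 ** -nines
    return SECONDS_PER_YEAR * unavailability


for n in (5, 6):
    print(f"{n} nines: {allowed_downtime_seconds(n):.1f} seconds per year")
```

Five nines leaves roughly 315 seconds (a little over five minutes) of downtime per year, and six nines roughly 32 seconds, which is why a design that blocks requests whenever the network fails has trouble meeting such targets.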

For scenarios that require strong consistency, such as banking, the choice is usually between the CA and CP models. The CA model is completely unavailable during a network failure, while the CP model retains partial availability. The actual choice has to be weighed against the business scenario. CP is not always better than CA: a service where users can view information but cannot update it is sometimes worse, at the product level, than refusing service outright.

Extensions

BASE (Basically Available, Soft State, Eventual Consistency) is an extension of the AP side of CAP. Redis and many other systems are built on top of this theory.
ACID is the common design philosophy of traditional databases. ACID and BASE represent two diametrically opposed design philosophies that sit at opposite ends of the consistency-availability spectrum.
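
As a rough illustration of the "soft state, eventual consistency" part of BASE, the sketch below lets two replicas diverge and then converge by exchanging state. The last-writer-wins rule keyed on a (version, site) stamp is an assumption chosen here for simplicity, not something the systems named above are claimed to use, and `merge` is a hypothetical helper.

```python
def merge(mine, theirs):
    """Last-writer-wins: per key, keep the entry with the larger (version, site) stamp."""
    merged = dict(mine)
    for key, entry in theirs.items():
        if key not in merged or entry[0] > merged[key][0]:
            merged[key] = entry
    return merged


# Each replica maps key -> ((version, site), value).
# Soft state: the two rooms have diverged while they could not talk.
beijing = {"cart": ((1, "beijing"), ["book"])}
guangzhou = {
    "cart": ((2, "guangzhou"), ["book", "pen"]),
    "theme": ((1, "guangzhou"), "dark"),
}

# Once the link is back, both sides exchange state and apply the same merge.
beijing, guangzhou = merge(beijing, guangzhou), merge(guangzhou, beijing)

converged = beijing == guangzhou
print("converged:", converged)
```

Because both sides apply the same deterministic rule, they end up with identical data without coordinating on each write. The cost is that the Beijing version of `cart` is silently overwritten, which is exactly the kind of tradeoff an ACID system refuses to make.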

Extended reading

Daniel Abadi thinks CAP should be called PACELC  http://dbmsmusings.blogspot.jp/2010/04/problems-with-cap-and-yahoos-little.html
Brewer's CAP Theorem  http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
FoundationDB's CAP trade-off options  https://foundationdb.com/white-papers/the-cap-theorem
