NoSQL Databases Should Not Give Up Consistency

Disclaimer: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/yunqiinsight/article/details/91492195

Whenever NoSQL comes up, consistency is sure to be mentioned. Citing the CAP theorem, some NoSQL databases gave up consistency. But is giving up consistency really an inevitable choice for NoSQL?

Since their invention in the 1970s, relational databases (RDB, Relational Database) have usually been the default choice for building applications. Relational databases give users ACID guarantees, which is very convenient for developers. Beginning in the 1990s, NoSQL systems started to appear. NoSQL systems position themselves in opposition to relational database systems: at the architectural level they gave up the relational model and the SQL interface of traditional relational databases.

Two terms often accompany NoSQL systems: BASE and CAP. Both have had a profound influence on distributed systems. I believe that, under their influence, many NoSQL systems gave up consistency in their initial architecture and opted for eventual consistency and availability. Although I very much agree with CAP and BASE, I do not believe that they make giving up consistency inevitable for NoSQL systems.

First, let's look at the history of these two concepts. Both were proposed by Eric Brewer, who is currently Vice President (VP) of Infrastructure at Google. In 1997, at SOSP (Symposium on Operating Systems Principles), a talk [1] summarized recent work by Brewer and others. The talk explained that the cluster services they were building at the time did not adopt the proven ACID architecture of relational databases, but instead gave up ACID at the architectural level. They coined a new word for the architecture they had chosen: BASE. The word was chosen deliberately. In English "acid" refers to an acidic substance and "base" to an alkaline one, so BASE is clearly the opposite of ACID.

ACID and BASE are acronyms formed from the initial letters of the following terms:
ACID: Atomicity, Consistency, Isolation, Durability
BASE: Basically Available, Soft State, Eventual Consistency

BASE advocates giving up ACID, chiefly its Consistency, so that the system can be Basically Available, hold Soft State, and reach Eventual Consistency. System builders no longer have only ACID to choose from: BASE is also an option, so one picks either ACID or BASE. In essence, this is a choice between the consistency that ACID represents and the availability that BASE represents. Although the BASE proposal did not yet explicitly frame this as a choice between consistency and availability, it set the stage for the later CAP proposal.

In 2000, Brewer gave a keynote [2] at PODC (Principles of Distributed Computing) on how to build robust distributed systems. In this talk, Brewer analyzed and compared ACID and BASE further and abstracted the core characteristic of each: consistency for ACID and availability for BASE. He then added a third dimension, the network partition, and from this proposed the CAP conjecture, which says:

In a distributed system, at most two of the following three properties can be satisfied at the same time:
C (Consistency), A (Availability), P (Tolerance to Network Partitions)

According to this conjecture, there are three types of systems:

Give up P: systems with CA characteristics, such as single-node databases
Give up A: systems with CP characteristics, such as distributed databases and distributed locks
Give up C: systems with AP characteristics, such as web caches and DNS

Availability is a very important property, especially in the Internet industry, where downtime has a major impact on the business, so giving up consistency on the basis of the CAP theorem looked like the natural choice. In particular, after Amazon CTO Werner Vogels published the article Eventually Consistent [5] and Amazon published the paper on its Dynamo system [12], a large number of NoSQL systems appeared that gave up consistency in pursuit of availability.

In 2002, Gilbert and Lynch [3] redefined the three CAP properties (their definitions are much narrower in scope than the properties in Brewer's conjecture) and proved that the three cannot all be achieved at the same time. With that, the CAP conjecture became the CAP theorem.

The CAP theorem defines the three properties as follows [3,6]:
Consistency: atomic consistency, also called linearizable consistency (linearizability). This is a very high level of consistency that few systems achieve.
Availability: full availability, meaning that every read or write request that reaches a non-failed node must receive a response within a reasonable amount of time. The key point is "every request reaching every non-failed node". This is also a very high level of availability that few systems achieve.
Partition Tolerance: the system continues to respond correctly when a network partition occurs, that is, it preserves one of its properties, either consistency or availability.
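
The tension between these definitions can be shown with a toy model. The sketch below is only an illustration, not a real replication protocol: the `Replica` class, the `"CP"`/`"AP"` mode labels, and the `read_after_partitioned_write` helper are all hypothetical names. A write completes on one replica while the other is cut off. A read on the isolated but non-failed replica then either returns stale data, which breaks linearizability, or is refused, which breaks full availability.

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """A live replica declined to answer."""


@dataclass
class Replica:
    mode: str                  # "CP" or "AP"; labels are assumptions of this sketch
    value: int = 0
    partitioned: bool = False  # True while this replica cannot reach its peer

    def read(self) -> int:
        if self.partitioned and self.mode == "CP":
            # CP choice: refuse rather than risk returning stale data.
            raise Unavailable("cannot confirm the latest value")
        # AP choice (or no partition): answer from local state.
        return self.value


def read_after_partitioned_write(mode: str):
    """Write 1 on replica a while replica b is cut off, then read from b."""
    a, b = Replica(mode), Replica(mode)
    b.partitioned = True
    a.value = 1                # the write completes on a's side only
    if not b.partitioned:
        b.value = a.value      # replication is skipped during the partition
    try:
        return b.read()
    except Unavailable:
        return "unavailable"
```

In AP mode the read on the isolated replica returns the old value 0 even though the write of 1 has completed; in CP mode the same read is refused.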

Gilbert and Lynch's redefinition in the CAP theorem is very strict, but it proves only that the three properties cannot all hold at once. By contrast, the property definitions in Brewer's conjecture, the "pick 2 of 3" description, and the three-way classification (AP, CP, CA) are not rigorous, which is why many people have doubted and challenged CAP ever since it appeared. In 2012, Brewer wrote another article [4] in which he acknowledged that the original presentation of CAP was misleading. In fact, the scope of the CAP theorem is quite narrow. Although CAP has had many problems since its birth, it still propelled the NoSQL movement: many system architectures gave up consistency on the basis of the CAP theorem, yet these systems often fall outside the conditions under which the theorem applies.

The CAP story does not end there. By 2017, Brewer was Vice President (VP) of Infrastructure at Google, and Google's first-generation Spanner system [9] had already been built. Brewer wrote an article about Spanner [7] that went a step further and explained what kind of system Spanner is in terms of the CAP theorem. In it, Brewer describes Spanner as an "effectively CA" system. Architecturally, Spanner is a CP system: when a network partition occurs, Spanner chooses to preserve data consistency and give up availability. In practice, however, Spanner delivers very high availability. Its architecture does not meet the full availability that the CAP theorem requires, but because it is designed with multiple replicas, a network partition that isolates individual replicas does not affect the availability users perceive. By the CAP theorem's definition, when individual replicas are partitioned those nodes are unavailable, so the system falls short of full availability. But user requests can still be served by the other replicas, so users still perceive Spanner as available. User-perceived availability and CAP-theorem availability are therefore not the same concept, and user-perceived availability is what we should pursue.
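
The gap between the two notions of availability can be made concrete with a small sketch. It assumes a simple majority-quorum rule, which is a simplification chosen for illustration and not a description of Spanner's actual protocol; the function names and the replica labels are hypothetical.

```python
def can_serve(group: set[str], cluster_size: int) -> bool:
    """A replica serves requests only if its side of the partition holds a majority."""
    return len(group) > cluster_size // 2


def cap_available(groups: list[set[str]], cluster_size: int) -> bool:
    """CAP-style availability: every live replica must be able to respond."""
    return all(can_serve(g, cluster_size) for g in groups)


def user_perceived_available(groups: list[set[str]], cluster_size: int) -> bool:
    """The service looks up as long as some side can still serve requests."""
    return any(can_serve(g, cluster_size) for g in groups)


# Five replicas; a partition isolates replica "e" from the other four.
groups = [{"a", "b", "c", "d"}, {"e"}]
```

With this partition, `cap_available(groups, 5)` is false because replica "e" must refuse requests, while `user_perceived_available(groups, 5)` is true because the four-replica side still serves them, assuming clients can reach that side.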

User-perceived availability is usually expressed as an SLA, the familiar "number of nines". Brewer's article also gives Google's data on the SLA of the Spanner system. The data show that network partitions account for a relatively small share of service unavailability; most outages are caused by software bugs, configuration errors, and operational mistakes. In other words, even if an architecture achieves the availability required by the CAP theorem, the availability users actually experience, the SLA, will not improve much. This matches my own experience over many years as a practitioner: system instability comes mostly from developers' day-to-day changes, and improving code quality, development process discipline, and production operations standards does far more to raise availability. So, at the architectural level, giving up consistency for the sake of availability is often not worth it.
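
To put numbers on the "number of nines", the following sketch converts an SLA into a yearly downtime budget and shows how little the budget changes if partition-caused outages are removed. The function names are hypothetical, and the 10% partition share used in the note after the code is an invented figure for illustration only, not a number from Brewer's article.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # non-leap year


def downtime_minutes_per_year(nines: int) -> float:
    """Downtime budget for an availability of `nines` nines, e.g. 3 means 99.9%."""
    return MINUTES_PER_YEAR * 10.0 ** (-nines)


def downtime_without_partitions(total_minutes: float, partition_share: float) -> float:
    """Downtime left if every partition-caused outage were eliminated."""
    return total_minutes * (1.0 - partition_share)
```

Four nines allow about 52.6 minutes of downtime per year. If partitions caused a hypothetical 10% of that downtime, removing them entirely would leave about 47.3 minutes, which is still a four-nines service and far from the roughly 5.3 minutes that five nines require.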

In the era of cloud computing, not giving up consistency is also the wise choice. For a hosted data storage service in the cloud, if you give up consistency in favor of availability, users will hardly notice the benefit: they do not pay for an architecture that reaches CAP-theorem availability, they pay only for the SLA your service meets. Whether the storage service is consistent, however, is something users feel very clearly. Amazon's internal Dynamo [12] architecture can meet the CAP theorem's availability requirement, yet the DynamoDB service sold on AWS does not use that architecture, perhaps for this very reason [10].

So what are the benefits of choosing consistency? Discussions of consistency often use money and finance examples to argue that consistency is necessary, but I believe the financial industry does not depend on strong consistency [10]. What consistency brings, in my view, is convenience of development. Although Brewer proposed the BASE concept, he did not elaborate on it. In 2008, eBay's Dan Pritchett wrote an article [8] that used an example to show in detail how, after giving up ACID, the same requirements can be met with a BASE architecture, and recommended this architectural pattern. From that article I can see that if we give up ACID and choose BASE, what would have been a very simple function now needs tools such as a message queue to bring the system to eventual consistency, and the overall application architecture becomes much more complex.
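
The extra machinery can be sketched as follows. This is not code from Pritchett's article; it is a minimal in-memory illustration in which an order and a derived per-buyer total are updated in two separate steps connected by a queue. All names (`orders`, `totals`, `queue`, `applied`, `place_order`, `drain_queue`) are hypothetical, and a `deque` stands in for a durable message queue.

```python
from collections import deque

orders: list[dict] = []
totals: dict[str, int] = {}   # per-buyer running total, derived from orders
queue: deque = deque()        # stands in for a durable message queue
applied: set[int] = set()     # ids of messages already processed (idempotency)


def place_order(order_id: int, buyer: str, amount: int) -> None:
    """Step 1: record the order and enqueue the follow-up work."""
    orders.append({"id": order_id, "buyer": buyer, "amount": amount})
    queue.append({"id": order_id, "buyer": buyer, "amount": amount})


def drain_queue() -> None:
    """Step 2 (later, possibly retried): apply each message at most once."""
    while queue:
        msg = queue.popleft()
        if msg["id"] in applied:
            continue          # duplicate delivery: skip
        totals[msg["buyer"]] = totals.get(msg["buyer"], 0) + msg["amount"]
        applied.add(msg["id"])
```

Between `place_order` and `drain_queue` the total is out of date (soft state), and the worker needs duplicate detection so that a redelivered message is not counted twice. With an ACID database, both updates would sit in one transaction and none of this would be needed.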

As Pritchett's article describes, with a NoSQL system that does not provide consistency you need to vet your usage scenario carefully to determine whether it can tolerate giving up consistency. Even if you do want a BASE architecture, it is not simply a matter of swapping the ACID database for an eventually consistent NoSQL system. You need to design various mechanisms to handle the anomalies that an eventually consistent NoSQL system introduces, so that your application as a whole achieves soft state and eventual consistency. There is also a subtle difference between the eventual consistency meant by BASE and the eventual consistency of many NoSQL systems. Put simply, eventual consistency in BASE guarantees that the system ends up in a correct state, whereas many NoSQL systems guarantee only that replicas eventually agree, not that the state they agree on is the correct one you want [11].
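
One common way this difference shows up is a lost update under last-write-wins (LWW) conflict resolution. The sketch below is a generic illustration under that assumption and is not tied to any particular NoSQL product; `lww_merge` and the timestamps are hypothetical.

```python
def lww_merge(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    """Keep the (timestamp, value) pair with the larger timestamp."""
    return a if a[0] >= b[0] else b


# Both replicas start with a counter value of 10 written at timestamp 0.
replica_a = (0, 10)
replica_b = (0, 10)

# Two clients concurrently read 10 and write back their own result.
replica_a = (1, 10 + 1)   # client 1 adds 1 on replica a at timestamp 1
replica_b = (2, 10 + 5)   # client 2 adds 5 on replica b at timestamp 2

# Anti-entropy: both replicas adopt the merged result and now agree.
merged = lww_merge(replica_a, replica_b)
replica_a = replica_b = merged
```

The replicas converge on the value 15, so they are eventually consistent in the NoSQL sense. But the state that reflects both updates is 16: client 1's increment has been silently lost, so the agreed state is not the correct one.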

Finally, my personal view: if a NoSQL system is used as a cache, where consistency can be given up in pursuit of low latency (big data and offline computing are similar scenarios), many NoSQL systems are a very good fit. However, if a NoSQL system is used as a database, it is best not to give up consistency for the sake of availability. Instead, it should reach high availability in practice through multi-replica techniques and good operations, that is, be "effectively CA", which greatly reduces the burden on its users.

Due to space limitations, this article could not go into many of the technical details of consistency, CAP, BASE, and ACID; those will be discussed in separate articles. This was written in haste, so corrections of any errors and omissions are welcome.


This article is original content of the Yunqi community and may not be reproduced without permission.
