From ACID to CAP/BASE

As systems have evolved from centralized to distributed architectures, the guiding principles for transactions have evolved as well, from the original ACID to CAP and BASE.

ACID
A transaction is a logical unit of program execution made up of a series of operations that access and update data in a system. In the narrow sense, the term refers to database transactions.
Transactions have four properties: Atomicity, Consistency, Isolation and Durability, known collectively as the ACID properties.

Atomicity: A transaction is an indivisible sequence of operations: either all of them succeed or none of them take effect.
Consistency: Executing a transaction must not break the integrity and consistency of the data in the database. The database must be in a consistent state both before and after the transaction runs.
Isolation: Concurrent transactions are isolated from one another; the execution of one transaction must not be interfered with by others. The SQL standard defines four isolation levels: read uncommitted, read committed, repeatable read and serializable.
Durability: Once a transaction is committed, its changes to the data in the database are permanent.
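
The atomicity and consistency properties above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The accounts table, the account names and the transfer helper are assumptions made up for this example, not part of any particular system.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction (hypothetical helper)."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            cur = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            if cur.rowcount != 1:
                # The debit above has already run; raising here undoes it.
                raise LookupError(f"unknown account: {dst}")
        return True
    except (sqlite3.IntegrityError, LookupError):
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
# The CHECK constraint is the consistency rule: no balance may go negative.
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )

print(transfer(conn, "alice", "bob", 30))     # valid transfer
print(transfer(conn, "alice", "carol", 10))   # unknown destination: debit is rolled back
print(transfer(conn, "alice", "bob", 1000))   # would violate the CHECK constraint
print(balances(conn))
```

Only the first transfer changes the data. The second one fails after the debit has already executed, and the rollback restores alice's balance, which is the "all succeed or all fail" behavior that atomicity describes.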

CAP and BASE theory
Transactions in centralized systems can easily satisfy the ACID properties. For a high-traffic, high-concurrency distributed Internet system, however, implementing distributed transactions that strictly satisfy ACID is very likely to bring availability and strict consistency into conflict, because requiring strict consistency from a distributed system usually means sacrificing some availability. Availability and consistency are both essential attributes of a distributed system, yet no solution delivers the best of both at once, which is why classic distributed systems theories such as CAP and BASE emerged.

CAP Theorem
A distributed system cannot satisfy all three of the basic requirements of Consistency, Availability and Partition tolerance at the same time; it can satisfy at most two of them.

Consistency: All nodes see the same data at the same time. Note: this is not the same as the consistency in ACID.
Availability: The service is always available and responds within a normal response time.
Partition tolerance: When a node fails or the network partitions, the distributed system can still provide service to the outside that meets its consistency and availability guarantees.

CAP Proof
As for why at most two of the three CAP properties can be satisfied at the same time, here is a simple argument:
Premise: For a distributed system, partition tolerance is the most basic requirement. Because the system is distributed, its components must be deployed on different nodes; otherwise it would not be a distributed system at all. With partition tolerance taken as given, the question becomes whether consistency and availability can both hold.

Suppose there are two nodes, N1 and N2, in the network. Databases D1 (primary) and D2 (standby) are installed on N1 and N2 respectively, forming a primary-standby pair. D1 handles writes and reads, and D2 shares part of the read load.
Normal situation: D1 finishes writing data and synchronizes it to D2, so a read from D2 returns the latest data.
Abnormal situation: The biggest difference between a distributed system and a stand-alone system is the network. Now suppose an extreme case in which the network between N1 and N2 is disconnected: D1 finishes writing data, but D2 does not receive the update. There are two options for a read that reaches D2. First, sacrifice consistency and respond to the user with the old data. Second, sacrifice availability: block until the network connection is restored and the update has been applied, and then respond to the user with the latest data.
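
The two options in the abnormal situation can be sketched as a small in-memory simulation. This is an illustrative model only: the Cluster and Replica classes and the "CP"/"AP" mode names are assumptions for this example, and the blocking wait of the second option is modeled as a raised TimeoutError rather than an actual wait.

```python
class Replica:
    """One database node holding a single value."""

    def __init__(self):
        self.value = None


class Cluster:
    """D1 (primary) and D2 (standby); `mode` decides how D2 reads behave under partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.d1 = Replica()
        self.d2 = Replica()
        self.partitioned = False

    def write(self, value):
        # Writes always go to D1; D2 is updated only while the network is up.
        self.d1.value = value
        if not self.partitioned:
            self.d2.value = value

    def heal(self):
        # Network restored: D2 catches up with D1.
        self.partitioned = False
        self.d2.value = self.d1.value

    def read_from_standby(self):
        if self.partitioned and self.mode == "CP":
            # Option 2: give up availability instead of returning stale data.
            raise TimeoutError("D2 cannot confirm it holds the latest data")
        # Option 1 (or normal operation): answer with whatever D2 has.
        return self.d2.value


for mode in ("CP", "AP"):
    cluster = Cluster(mode)
    cluster.write("v1")
    cluster.partitioned = True   # the link between N1 and N2 goes down
    cluster.write("v2")          # D1 accepts the write, D2 never sees it
    try:
        print(mode, "during partition:", cluster.read_from_standby())
    except TimeoutError as exc:
        print(mode, "during partition: unavailable:", exc)
    cluster.heal()
    print(mode, "after recovery:", cluster.read_from_standby())
```

In "AP" mode the read during the partition returns the stale value, while in "CP" mode it fails; after the network heals, both modes return the latest value.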

BASE theory
BASE is an acronym for three phrases: Basically Available, Soft state and Eventually consistent. BASE leans toward the AP side of CAP: systems that satisfy both availability and partition tolerance generally accept weaker requirements for consistency.
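
As a rough illustration of soft state and eventual consistency, the sketch below acknowledges a write on the primary copy immediately and applies it to the replicas later. The EventuallyConsistentStore class and its method names are hypothetical, and real systems replicate in the background over a network rather than through an explicit replicate() call.

```python
from collections import deque


class EventuallyConsistentStore:
    """Primary copy plus replicas that are updated asynchronously (illustrative only)."""

    def __init__(self, replica_count=2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = deque()  # updates accepted but not yet applied to replicas

    def put(self, key, value):
        # Basically available: the write succeeds without waiting for replicas.
        self.primary[key] = value
        self.pending.append((key, value))

    def get_from_replica(self, index, key):
        # Soft state: a replica may return an older value (or None) for a while.
        return self.replicas[index].get(key)

    def replicate(self):
        # Stand-in for background replication: apply queued updates in order.
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica[key] = value

    def converged(self):
        return all(replica == self.primary for replica in self.replicas)


store = EventuallyConsistentStore()
store.put("stock", 10)
print(store.get_from_replica(0, "stock"), store.converged())  # replica still lags
store.replicate()
print(store.get_from_replica(0, "stock"), store.converged())  # replicas have caught up
```

Between put() and replicate() the replicas disagree with the primary; once the queue is drained, all copies hold the same data, which is the eventually consistent state.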

The relationship between CAP, ACID and BASE
I came across a picture online that describes the relationship between the three very clearly, as shown in the following figure:

According to the different tendencies in CAP theory:
CA: a single-site cluster that satisfies consistency and availability, usually with limited scalability (ACID)
AP: a system that satisfies availability and partition tolerance, usually with lower requirements for consistency (BASE)
CP: a system that satisfies consistency and partition tolerance, usually without particularly high performance (personally, I think this one can also be labeled BASE)

In fact, we can also look at it from the database perspective: relational and non-relational databases can likewise be placed within CAP theory, as shown in the following figure (source: the Internet):

Relational databases (RDBMS) generally follow the ACID principle, while non-relational databases generally follow the BASE principle.

Summary
ACID is a strong consistency model, while BASE proposes sacrificing strong consistency to gain availability, with the data eventually reaching a consistent state. In real distributed scenarios, different business units and components have different requirements for data consistency, so in the design of a specific distributed system architecture, ACID properties and BASE theory are often used together.

 

https://my.oschina.net/OutOfMemory/blog/796248

 
