Big Data Foundations (2): Data Replication and Consistency

1. Overall introduction

  In a big data storage system, the same piece of data is stored as multiple copies (replicas) to improve availability; in practice there are usually 3 replicas. Besides improving availability, replication also increases read concurrency, but it introduces a data consistency problem: with multiple copies of the same data being read and written concurrently, it is hard to maintain a consistent view of the data. In other words, no matter how many replicas exist, the data should behave externally as if there were only a single copy.

  This section describes the consistency issues that arise from data replication and how they are addressed. It first introduces the basic theoretical models (CAP, ACID, BASE) and then presents several typical replica update strategies.

2. Basic theoretical model

2.1 Classic CAP

  • Consistency (strong consistency): with multiple replicas in a distributed system, an update has the same effect as if there were only a single copy of the data;
  • Availability: every read and write operation receives a response within a bounded time;
  • Partition Tolerance: the service keeps working even when machines in different partitions cannot communicate with each other;

  The three properties of CAP cannot all be achieved at the same time: a given system can satisfy at most two of them, and the third has to be relaxed.

  If there is only one copy of a data item v, strong consistency C is obviously satisfied, but if machines in the cluster go down, availability A can no longer be guaranteed.

  If there are multiple copies of v, say on machines X and Y, and the data on X is updated while the network between X and Y is partitioned, then a trade-off must be made between A and C. If availability A is chosen, the value of v is temporarily inconsistent across replicas; if strong consistency C is chosen, read requests for v on machine Y must be rejected until the partition heals. Whichever one is chosen, the other must be given up.
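  A minimal sketch of this trade-off, using made-up class and method names rather than any real system: a CP-style read refuses to answer when it cannot rule out a newer value elsewhere, while an AP-style read always answers, possibly with a stale value.

```python
class Replica:
    """Toy replica of a value v; 'partitioned' simulates losing contact with the peer holding the latest write."""
    def __init__(self, value):
        self.value = value
        self.up_to_date = True    # becomes False if a peer updated v while we were partitioned
        self.partitioned = False

    def read_cp(self):
        # Choose C over A: refuse to answer when a newer value may exist elsewhere.
        if self.partitioned and not self.up_to_date:
            raise RuntimeError("unavailable: cannot guarantee consistency during partition")
        return self.value

    def read_ap(self):
        # Choose A over C: always answer, possibly with a stale value.
        return self.value


y = Replica(value=1)
y.partitioned = True      # X updated v to 2, but Y cannot hear about it
y.up_to_date = False

print(y.read_ap())        # 1 -> available but stale
try:
    y.read_cp()
except RuntimeError as e:
    print(e)              # consistent but unavailable
```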

2.2 Reinterpreted CAP

  Eric Brewer, who proposed the CAP theorem, revisited it in 2012 and argued that the "choose 2 out of 3" formulation is actually misleading.

  Because the probability of a network partition P occurring is very small, strong consistency C and availability A should not be given up preemptively just because a partition might happen.

  Therefore, a system should be designed to satisfy both C and A under normal operation. When a network partition does occur, it is handled as a special case: the system may serve requests in an inconsistent state, and once the partition ends it returns to providing service with both C and A.

2.3 ACID principle

  ACID is the set of principles followed by relational databases, which emphasize high reliability and strong consistency. The four properties are as follows:

  • Atomicity: a transaction is either executed in full or not executed at all;
  • Consistency: a transaction must satisfy the consistency constraints both when it starts and when it ends;
  • Isolation: two transactions do not interleave with each other, since interleaving could lead to inconsistent data;
  • Durability: once a transaction commits successfully, its updates to the system state are permanent and will not be rolled back for no reason.

  To understand atomicity, consider a transfer: the transfer transaction either succeeds completely or fails completely. It must not happen that the sender is debited but the payee never receives the money, or that the payee receives the money but the sender is never debited (if that were possible, the bank really might go bankrupt).

  To understand consistency, suppose you transfer 100 yuan: 100 yuan should be deducted from your card. If the transaction changes the transfer amount, the deducted amount must change accordingly to keep this constraint satisfied. Note that the C here has a different meaning from the C in CAP.
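  As a rough illustration of atomicity and the consistency constraint in the transfer example, here is a sketch using Python's standard sqlite3 module; the table, account names, and amounts are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Either both updates take effect or neither does (atomicity);
    the CHECK constraint keeps balances non-negative (consistency)."""
    try:
        with conn:  # opens a transaction: commits on success, rolls back on exception
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
    except sqlite3.IntegrityError:
        print("transfer rejected, nothing was deducted")

transfer(conn, "alice", "bob", 100)   # succeeds: alice 0, bob 100
transfer(conn, "alice", "bob", 50)    # would make alice negative: rolled back as a whole
print(list(conn.execute("SELECT * FROM accounts")))
```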

2.4 BASE principle

  Cloud storage and NoSQL systems in the big data environment follow the BASE principle, which differs considerably from ACID.

  The BASE principles are as follows:

  • Basically Available: the system is available most of the time, but occasional failures are allowed;
  • Soft State: replicas of the data are not required to be fully synchronized at every moment;
  • Eventual Consistency: although the soft state does not require data to be synchronized at all times, the data must reach a consistent state within a given time window.

  In principle, BASE achieves high availability by sacrificing strong consistency.

  At present, a common approach in distributed systems is to satisfy BASE globally and ACID locally, absorbing the benefits of each and striking a balance.

2.5 CAP/ACID/BASE relationship

  ACID puts more emphasis on data consistency, which is the design philosophy of traditional databases. BASE puts more emphasis on availability and weakens the notion of strong data consistency, which matches the requirements of large-scale distributed data systems in the Internet era.

  1. The C in CAP refers to strong consistency of data and is a subset of the C in ACID.
  2. When a network partition occurs, transactions cannot be satisfied, because serializing transactions requires network communication.
  3. When a network partition occurs, ACID should be upheld within each partition as far as possible, with recovery and reconciliation after the partition ends.

2.6 Idempotency

  In abstract algebra, a unary operation f is idempotent if it satisfies f(f(x)) = f(x); the absolute value is an example. A binary operation f is idempotent if it satisfies f(x, x) = x; an example over the real numbers is max, since max(x, x) = x.

  In a distributed system, idempotency means that if a caller performs the same operation repeatedly, the effect is the same as performing it correctly exactly once. If, due to network problems, a call does not receive a success response, the caller treats it as failed and retries, so the operation may be executed more than once. In this situation, idempotency keeps the system correct.

  ZooKeeper and Raft support idempotency for many of their operations.
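  A minimal sketch of one common way to obtain idempotency for retried requests: the server remembers request IDs it has already applied. The class and method names are illustrative, not any particular system's API.

```python
class KVStore:
    """Applies each (request_id, operation) at most once, so client retries are safe."""
    def __init__(self):
        self.data = {}
        self.applied = set()   # request IDs that have already taken effect

    def put(self, request_id, key, value):
        if request_id in self.applied:
            return "ok (duplicate ignored)"
        self.data[key] = value
        self.applied.add(request_id)
        return "ok"


store = KVStore()
print(store.put("req-1", "x", 42))   # ok
print(store.put("req-1", "x", 42))   # retry after a lost response: no double effect
```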

3. Consistency Model

  Consistency models include strong consistency and weak consistency; weak consistency further includes eventual consistency, causal consistency, "read your writes" consistency, session consistency, monotonic read consistency, and monotonic write consistency. In a real system, the appropriate consistency model should be chosen according to the specific situation. The figure below shows the relationships between them:

[Figure: relationships between the consistency models]

3.1 Strong Consistency

  All processes connected to the storage system see the same value for a given data item: no matter how many replicas exist, accessing any replica at any time returns the same result.

3.2 Eventual Consistency

  Eventual consistency does not guarantee that after a value x is updated, all subsequent operations on x immediately read the new value; consistency is only reached after some time window (the inconsistency window), and before that the data may be inconsistent.
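  A sketch of eventual consistency under two assumptions made up for the example: conflicts are resolved by last-write-wins timestamps, and replicas exchange state through a hypothetical anti-entropy sync step.

```python
import time

class EventualReplica:
    def __init__(self):
        self.data = {}   # key -> (timestamp, value)

    def write(self, key, value):
        self.data[key] = (time.time(), value)

    def read(self, key):
        return self.data.get(key, (None, None))[1]

    def sync_from(self, other):
        # Anti-entropy: keep the newer version of every key (last write wins).
        for key, (ts, value) in other.data.items():
            if key not in self.data or self.data[key][0] < ts:
                self.data[key] = (ts, value)


r1, r2 = EventualReplica(), EventualReplica()
r1.write("x", 10)
print(r2.read("x"))   # None -> inconsistency window, r2 has not seen the write yet
r2.sync_from(r1)
print(r2.read("x"))   # 10 -> the replicas have converged
```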

3.3 Causal Consistency

  Causal consistency applies when there are dependencies between processes. For example, if processes A and B depend on each other and A updates a variable, A notifies B after the update, so B sees the new value; but if there is another process C, the value seen by C may be either the new value or the old one.

3.4 "Read What You Write" Consistency

  "Read what you wrote" consistency is a special case of causal consistency. After process A updates the data, it will immediately send a notification to itself, so the subsequent operations of process A are based on the new value.

3.5 Session Consistency

  When process A connects to the database through a session, read and write consistency is guaranteed within that session. Within the inconsistency window, if the session is terminated by a system failure or some other reason, process A may still read the old value in a new session.

3.6 Monotonic Read Consistency

  If a process has read version V2 of a data item, none of its subsequent reads will return a version older than V2.
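  One common way to obtain monotonic reads is a client-side version token: the client remembers the highest version it has seen and refuses answers that are older. A minimal sketch with invented names:

```python
class VersionedReplica:
    def __init__(self, version, value):
        self.version = version
        self.value = value

class Client:
    def __init__(self):
        self.min_version = 0   # highest version this client has already observed

    def read(self, replica):
        # Monotonic reads: refuse answers older than what we have already seen.
        if replica.version < self.min_version:
            raise RuntimeError("replica too stale, retry on another replica")
        self.min_version = replica.version
        return replica.value


fresh = VersionedReplica(version=2, value="v2")
stale = VersionedReplica(version=1, value="v1")

c = Client()
print(c.read(fresh))     # "v2", client now requires version >= 2
try:
    c.read(stale)        # would violate monotonic reads
except RuntimeError as e:
    print(e)
```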

3.7 Monotonic Write Consistency

  For a single process, monotonic write consistency guarantees that its multiple write operations are applied in the order they were issued. Without this guarantee, it is very hard for application developers to write correct programs.

4. Replica update strategy

  In a large-scale distributed storage system, there are three replica update strategies: the simultaneous update strategy, the master-slave update strategy, and the any-node update strategy.

4.1 Simultaneous update

  • Type A: update all replicas directly and simultaneously, without any consistency protocol. This creates a potential inconsistency: suppose two clients send update requests update1 and update2 at the same time; the system cannot determine their order, so one replica may end up reflecting update1 while another reflects update2.

  • Type B: a consistency protocol first determines the execution order of the update operations, which guarantees data consistency. The protocol has a processing cost, so latency increases. (A sketch of both cases follows.)
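  A toy sketch of both cases: without an agreed order, two replicas may apply update1 and update2 in different orders and diverge; a sequencer (standing in here for a real consensus protocol) fixes one order for everyone.

```python
# Two concurrent updates to the same key.
update1 = ("x", "A")
update2 = ("x", "B")

# Type A: each replica applies the updates in the order it happens to receive them.
replica1, replica2 = {}, {}
for key, value in (update1, update2):
    replica1[key] = value
for key, value in (update2, update1):
    replica2[key] = value
print(replica1["x"], replica2["x"])   # 'B' vs 'A' -> the replicas have diverged

# Type B: a sequencer (stand-in for a consensus protocol) fixes one global order.
ordered = sorted([(1, update1), (2, update2)])   # sequence numbers assigned by the protocol
replica1, replica2 = {}, {}
for _, (key, value) in ordered:
    replica1[key] = value
    replica2[key] = value
print(replica1["x"], replica2["x"])   # 'B' and 'B' -> consistent
```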

4.2 Master-slave update strategy

  In this strategy, one of the replicas is the master copy and the others are slave copies. All updates to the data are first submitted to the master copy, which then notifies the slave copies to update. Depending on how the master notifies the slaves, there are the following three modes:

Synchronous mode

  The master copy waits until all slave copies have been updated before confirming the update operation. This ensures strong data consistency but increases request latency.

Asynchronous mode

  The master replica confirms the update operation before the slave replicas have been updated. In this case, if a replica crashes before the update reaches it, a data consistency problem arises; usually the update operation is first recorded in another reliable storage location to guard against this.

  Regarding the trade-off between request latency and consistency, there are two cases (a sketch of both notification modes follows this list):

  1. If all read requests are responded to by the primary copy, the consistency can be guaranteed and the delay will increase;

  2. If any copy can respond to the read request, the delay is reduced, and there will be data consistency problems;
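  A minimal sketch of the synchronous and asynchronous notification modes, using in-memory stand-ins instead of a real network; a production system would also persist the pending updates durably.

```python
class Follower:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

class Primary:
    def __init__(self, followers):
        self.data = {}
        self.followers = followers
        self.pending = []   # updates not yet pushed to followers (async mode)

    def write_sync(self, key, value):
        # Synchronous: acknowledge only after every follower has applied the update.
        self.data[key] = value
        for f in self.followers:
            f.apply(key, value)            # higher latency, strong consistency
        return "ack"

    def write_async(self, key, value):
        # Asynchronous: acknowledge immediately, replicate later.
        self.data[key] = value
        self.pending.append((key, value))  # a real system would log this to reliable storage
        return "ack"

    def flush(self):
        for key, value in self.pending:
            for f in self.followers:
                f.apply(key, value)
        self.pending.clear()


followers = [Follower(), Follower()]
p = Primary(followers)
p.write_sync("a", 1)
print(followers[0].data)   # {'a': 1} immediately
p.write_async("b", 2)
print(followers[0].data)   # still {'a': 1}: reading a follower now returns stale data
p.flush()
print(followers[0].data)   # {'a': 1, 'b': 2}
```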

Hybrid mode

  The hybrid mode combines the synchronous and asynchronous approaches: the master first synchronously updates a subset of the slave replicas and then confirms the update operation; the remaining replicas are updated asynchronously. Here too, the trade-off between request latency and data consistency depends on how read operations are handled:

  1. If every read is served from at least one synchronously updated node, data consistency is guaranteed; request latency increases, but it is lower than in the fully synchronous mode (see the quorum-style sketch after this list);

  2. If reads are not required to touch at least one synchronously updated node, the same consistency problems as in the asynchronous mode can still occur;
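  The hybrid mode is close in spirit to quorum replication: if a write synchronously reaches W of N replicas and a read consults R replicas with W + R > N, every read quorum overlaps at least one synchronously updated replica. A small sketch that checks this overlap property:

```python
import itertools

N, W, R = 3, 2, 2          # W + R > N, so every read quorum intersects every write quorum
replicas = range(N)

for write_set in itertools.combinations(replicas, W):
    for read_set in itertools.combinations(replicas, R):
        overlap = set(write_set) & set(read_set)
        assert overlap, "a read quorum missed every synchronously updated replica"
print("every R-replica read sees at least one of the W synchronously updated replicas")
```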

4.3 Any node update strategy

  The data update request is sent to any node holding a replica, and that node is responsible for notifying the other replicas to update. The difference from the master-slave strategy is that the coordinating replica is not fixed: any node can take on the role. The special difficulty is that two different clients may send update requests for the same data at the same time to two different replicas, each of which accepts the request.

  Under such an update strategy, there are two situations as follows:

  • Type A: notify the other replicas synchronously; similar to the synchronous master-slave mode, with high latency;
  • Type B: notify the other replicas asynchronously; this combines the problems of the simultaneous update strategy (Type A) and the asynchronous master-slave mode, so data consistency problems may arise;

Original source: blog.csdn.net/initiallht/article/details/123604178