Distributed Basics (5) - Distributed Consistency, CAP Theory, and BASE Theory

This article draws on reference books and online articles, along with some of my own understanding. If there are any omissions or mistakes, please forgive them and point them out. Thank you!

One. Distributed consistency

Distributed consistency refers to the consistency of data across the machines of a distributed system. Let's start with ACID in the stand-alone model:

1. ACID in the traditional stand-alone model

In traditional web applications, many projects still use a stand-alone architecture. ACID names the four properties that a single-node relational database must guarantee for its transactions:

1. Atomicity

All operations of a transaction either complete entirely or do not take effect at all.

A transaction is a set of data operations that together form one logical unit of work. For example, a bank transfer consists of two operations: deducting the amount from the source account and adding the amount to the target account.
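The bank transfer can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the accounts table, the transfer helper, and the "insufficient funds" rule are assumptions made for the example, not part of the original text.

```python
import sqlite3

def transfer(conn, source, target, amount):
    """Move `amount` between two accounts as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
            row = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (source,)
            ).fetchone()
            if row is None or row[0] < 0:
                raise ValueError("insufficient funds")  # hypothetical rule
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

ok = transfer(conn, "A", "B", 30)       # both updates are applied together
failed = transfer(conn, "A", "B", 500)  # the deduction is rolled back
```

After the failed transfer the balances are the same as before it started, which is the "all or nothing" behavior that atomicity describes.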

2. Consistency

The integrity constraints on the data are not violated, either before the transaction begins or after it ends.

This has two meanings:

a) at the database level, the data satisfies the defined constraints, such as unique constraints and foreign keys, both before and after the transaction;

b) at the business level, the application developer ensures business consistency. Taking the bank transfer again as an example, the total balance of accounts A and B after the transfer must equal the total before the transfer.
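Both meanings can be shown in one small sqlite3 sketch. The schema, the non-negative balance rule, and the helper names are assumptions chosen for illustration: the PRIMARY KEY and CHECK constraints stand for the database-level guarantees, and the total-balance comparison stands for the business-level rule that the application enforces itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"                # database level: unique names
    " balance INTEGER CHECK (balance >= 0)"  # database level: assumed rule
    ")"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def total(conn):
    """Sum of all balances, used as the business-level invariant."""
    return conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]

def transfer(conn, source, target, amount):
    before = total(conn)
    try:
        with conn:  # rolls back everything if any check below fails
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
            # Business level: the application verifies its own invariant.
            if total(conn) != before:
                raise ValueError("total balance changed")
        return True
    except (sqlite3.IntegrityError, ValueError):
        return False

ok = transfer(conn, "A", "B", 30)             # constraints hold, committed
rejected = transfer(conn, "A", "B", 500)      # CHECK constraint is violated
unknown = transfer(conn, "A", "nobody", 10)   # total would change
```

In all three cases the total balance of A and B stays the same, whether the transfer was committed or refused.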

3. Isolation

The database prevents concurrently executing transactions from interleaving in ways that leave the data inconsistent.
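To see what goes wrong without isolation, the sketch below replays the read and write steps of two deposit transactions in a fixed order. The transaction names T1 and T2, the deposit amounts, and the run helper are hypothetical; a real database reaches the serial result through locking or multi-version concurrency control rather than by replaying a schedule.

```python
def run(schedule, balance=100):
    """Execute the read/write steps of two deposits in the given order."""
    deposits = {"T1": 10, "T2": 20}
    local = {}  # each transaction's private copy of the balance it read
    for txn, step in schedule:
        if step == "read":
            local[txn] = balance
        elif step == "write":
            balance = local[txn] + deposits[txn]
    return balance

# Isolated execution: T2 only starts after T1 has finished.
serial = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]

# No isolation: both transactions read the old balance before either writes.
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

isolated_result = run(serial)
lost_update_result = run(interleaved)
```

The serial schedule ends with both deposits applied, while the interleaved schedule silently loses T1's deposit because T2 overwrites it with a value computed from stale data.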

4. Durability

Once a transaction has committed, its modifications to the data are permanent and will not be rolled back to a previous state.

ACID is the classic model for stand-alone relational databases. In a distributed architecture the ACID model becomes hard to guarantee, which is why new theories and models emerged for distributed architectures.

2. Evolution toward a distributed architecture

As the request volume and concurrency pressure on an application grow, the system architecture starts to evolve toward a distributed architecture.

For a storage system that is part of the infrastructure, persistence on a single machine is not enough; the data needs to be copied to multiple machines to improve the availability and reliability of the system.

For example, as shown in the figure, MySQL master-slave replication forms a distributed system consisting of four machines:

[Figure: MySQL master-slave replication across four machines]

Because copies of the data exist, inconsistencies between the slave servers and the master server will inevitably appear. For example, a network outage can prevent a slave from synchronizing the master's data in time. This is the problem of distributed consistency.

3. Distributed consistency levels

1. Strong consistency

This is the strictest consistency level: every read must return the most recently written value, which means every write must be synchronized to all machines immediately.

2. Weak consistency

At this level, after a write succeeds the system does not promise that the written value can be read immediately, nor does it promise how long it will take for the data to become consistent. It only tries to ensure that the data reaches a consistent state within a certain time scale (for example, on the order of seconds). For example, reads from a Zookeeper cluster can lag briefly behind the latest write, because the node that serves the read may not have applied it yet.

3. Eventual consistency

Eventual consistency is a special case of weak consistency: the system guarantees that the data will reach a consistent state within a certain period of time. For example, a Redis cluster guarantees eventual consistency.
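The difference between the levels can be sketched with a toy replicated store. The ReplicatedStore class and its synchronous flag are hypothetical and only model the timing of replication, not any real product: with synchronous writes every copy is updated before the write returns (strong consistency), while with asynchronous writes the other copies catch up in a later step (weak or eventual consistency).

```python
class ReplicatedStore:
    """Toy key-value store with one copy of the data per machine."""

    def __init__(self, machines=3, synchronous=True):
        self.copies = [{} for _ in range(machines)]
        self.synchronous = synchronous
        self.pending = []  # writes not yet applied to the other copies

    def write(self, key, value):
        self.copies[0][key] = value            # machine 0 accepts the write
        if self.synchronous:
            for copy in self.copies[1:]:       # strong: update every copy now
                copy[key] = value
        else:
            self.pending.append((key, value))  # eventual: update them later

    def replicate(self):
        """Background step that brings the other copies up to date."""
        for key, value in self.pending:
            for copy in self.copies[1:]:
                copy[key] = value
        self.pending.clear()

    def read(self, machine, key):
        return self.copies[machine].get(key)

strong = ReplicatedStore(synchronous=True)
strong.write("x", 1)
strong_read = strong.read(2, "x")      # the latest value, on any machine

eventual = ReplicatedStore(synchronous=False)
eventual.write("x", 1)
stale_read = eventual.read(2, "x")     # machine 2 has not seen the write
eventual.replicate()
later_read = eventual.read(2, "x")     # consistent once replication ran
```

The price of the strong mode is that every write has to wait for all machines, which is exactly the cost the CAP discussion weighs against availability.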

Two. CAP theory

CAP is one of the most important basic principles of distributed systems. Proposed by computer scientist Eric Brewer, it states that a distributed computer system cannot provide all three of the following guarantees at the same time; at most two of the three can hold simultaneously.

  1. Consistency: all machines in the distributed system see the same data at the same time
  2. Availability: the distributed system keeps providing service to the outside
  3. Partition tolerance: the system keeps working even when a network partition (network failure) exists between its machines

Note that consistency in CAP theory means strong consistency: at any time, the data read from any machine in the distributed cluster is the same.

Common CAP models

1. CA model

The CA model means that, in the absence of a network partition, both consistency (C) and availability (A) can be guaranteed. This does not hold for a real distributed system, however, because network failures are very likely to occur. The CA model therefore usually applies only to stand-alone services; a distributed system usually chooses the CP model or the AP model.

2. CP model

The CP model means that when a network partition (P) occurs, strong consistency (C) is guaranteed but availability (A) is not.

This model is used where the data must be completely correct. For example, a Zookeeper cluster guarantees CP: any request that Zookeeper answers obtains a consistent result, and the system tolerates network partitions, but it does not guarantee that every request will be served. In practice, when Zookeeper is used to fetch a list of services, the data is unavailable if zk is in the middle of a leader election or more than half of the machines in the zk cluster are down. So zk cannot guarantee availability (A).
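A minimal sketch of the CP trade-off, assuming a hypothetical CPCluster class that serves requests only while a majority of its nodes is reachable. This is a simplification of the majority rule described above and is not Zookeeper's actual protocol or API.

```python
class Unavailable(Exception):
    """Raised when the cluster refuses to serve a request."""

class CPCluster:
    def __init__(self, size=5):
        self.size = size
        self.reachable = size  # nodes on the caller's side of the partition
        self.value = None

    def _require_majority(self):
        # Without a majority the cluster cannot rule out a newer value on
        # the other side of the partition, so it refuses instead of guessing.
        if self.reachable <= self.size // 2:
            raise Unavailable("no majority of nodes reachable")

    def write(self, value):
        self._require_majority()
        self.value = value

    def read(self):
        self._require_majority()
        return self.value

cluster = CPCluster(size=5)
cluster.write("service-list-v1")

cluster.reachable = 2  # a partition leaves only a minority reachable
try:
    cluster.read()
    served = True
except Unavailable:
    served = False     # consistency is kept by giving up availability
```

Once a majority is reachable again the cluster answers as before, and it never returned a value that might have been out of date in the meantime.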

3. AP model

The AP model means that when a network partition (P) occurs, availability (A) is guaranteed but consistency (C) is not.

This model is used by applications that do not need the data to be correct at every moment but do require high availability. For example, consider a Redis cluster with read/write splitting managed by the sentinel mechanism: if the master node goes down before its data has been synchronized to the slave nodes, and sentinel then promotes a slave to be the new master, data inconsistencies appear. Note that Redis can guarantee eventual consistency (the data is synchronized once the connection is restored).
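The failover scenario can be sketched as follows. The Node and APCluster classes are hypothetical and heavily simplified; they do not use the Redis or Sentinel APIs, and the assumption that the old master resynchronizes from the new master when it rejoins is part of this model.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True

class APCluster:
    def __init__(self):
        self.master = Node("node-1")
        self.slave = Node("node-2")
        self.unsynced = []  # writes the slave has not received yet

    def write(self, key, value):
        self.master.data[key] = value  # acknowledged before replication
        self.unsynced.append((key, value))

    def replicate(self):
        for key, value in self.unsynced:
            self.slave.data[key] = value
        self.unsynced.clear()

    def failover(self):
        """The master fails; promote the slave, as a sentinel would."""
        self.master.alive = False
        self.master, self.slave = self.slave, self.master
        self.unsynced.clear()  # unreplicated writes are lost with the master

    def rejoin(self):
        """The old master comes back and copies the new master's data."""
        self.slave.alive = True
        self.slave.data = dict(self.master.data)

    def read(self, key):
        return self.master.data.get(key)  # always answered: availability

cluster = APCluster()
cluster.write("a", 1)
cluster.replicate()
cluster.write("a", 2)           # the master fails before this is replicated
cluster.failover()
stale = cluster.read("a")       # still served, but with the older value
cluster.rejoin()
converged = cluster.master.data == cluster.slave.data
```

The cluster keeps answering throughout, at the cost of serving an older value, and the two copies agree again after the old master rejoins.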

Three. BASE theory

BASE theory is a further refinement of CAP theory. BASE allows network partitions to occur and sacrifices strong consistency, so that a distributed system has weak consistency together with availability, rather than having to choose only one of strong consistency and availability as in CAP theory.

  1. BA (Basically Available): the system can provide its most basic functions, ensuring basic availability
  2. S (Soft State): at some moments the data in the system may not be synchronized
  3. E (Eventually Consistent): the data is guaranteed to become consistent in the end

1. Basically Available

What does basically available mean? Suppose the system suffers an unpredictable fault but can still be used; compared with the normal system, it differs in response time or in functionality:

  1. Response time increases

The response time increases, probably because the system is reconnecting, failing over, or the like.

  2. Only basic functions are guaranteed

When an abnormal situation occurs, not every feature is guaranteed to stay online; a request may get an exception page, or be downgraded to a default service page.
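Downgrading to a default page can be sketched as a fallback around a failing call. The home_page function, the two fetch functions, and the default items are all hypothetical names chosen for illustration.

```python
DEFAULT_ITEMS = ["bestseller-1", "bestseller-2"]  # static default page

def healthy_recommendations(user_id):
    """Stand-in for a non-essential service that is working normally."""
    return [f"item-for-user-{user_id}"]

def broken_recommendations(user_id):
    """Stand-in for the same service during a fault."""
    raise ConnectionError("recommendation service unreachable")

def home_page(user_id, fetch):
    """Build the page, degrading to default content if `fetch` fails."""
    try:
        return {"degraded": False, "items": fetch(user_id)}
    except ConnectionError:
        # Basic availability: the page still renders with default content.
        return {"degraded": True, "items": DEFAULT_ITEMS}

normal = home_page(7, healthy_recommendations)
fallback = home_page(7, broken_recommendations)
```

The user still gets a usable page in both cases; only the personalized part is lost while the fault lasts.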

2. Soft State

Soft state is defined relative to hard state. Hard state requires that the copies of the data on multiple nodes are all identical.

Soft state means that the system allows data to be in an intermediate state, and that this state does not affect the overall availability of the system; that is, the system allows a delay in synchronizing the copies of the data across different nodes.

3. Eventual consistency

Soft state can exist only for a limited period of time; the system cannot remain in a soft state forever. After this period, all copies of the data should be consistent, so that eventual consistency of the data is achieved. The length of the period depends on network latency, system load, the data replication design, and other factors.

ACID requires strong consistency and is usually used in traditional stand-alone database systems. BASE requires eventual consistency, gaining availability by sacrificing strong consistency, and is often used in large-scale distributed systems.

In real distributed scenarios, different business units and components have different consistency requirements, so ACID and BASE are often used together.
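One way replicas can leave the soft state is a periodic merge in which both sides keep the newer version of every key. The sketch below assumes that each value carries a version number and that the highest version wins; this "last write wins" rule and the merge helper are assumptions for illustration, and real systems use a variety of reconciliation schemes.

```python
def merge(a, b):
    """Make two replicas agree by keeping the newer version of every key.

    Each replica maps key -> (version, value); a missing key counts as
    version 0 so that any stored value is considered newer.
    """
    for key in a.keys() | b.keys():
        newest = max(
            a.get(key, (0, None)),
            b.get(key, (0, None)),
            key=lambda versioned: versioned[0],
        )
        a[key] = b[key] = newest

# Soft state: the replicas temporarily disagree.
replica_1 = {"x": (1, "old"), "y": (2, "kept")}
replica_2 = {"x": (3, "new")}

merge(replica_1, replica_2)  # after the merge both replicas are identical
```

How long the replicas stay different depends on how often such a merge runs, which corresponds to the time period described above.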

Origin blog.csdn.net/pbrlovejava/article/details/105000013