CAP theory of distributed systems

Distributed systems are becoming increasingly important; large websites are almost always distributed.

The hardest problem in a distributed system is how to synchronize state across nodes. The CAP theorem is the fundamental theorem in this area, and it is the starting point for understanding distributed systems.

One, the three indicators of a distributed system

In 1998, Eric Brewer, a computer scientist at the University of California, Berkeley, proposed that a distributed system has three indicators: consistency, availability, and partition tolerance. He stated that the three cannot all be achieved at the same time. This conclusion is called the CAP theorem, also known as Brewer's theorem.

1. Consistency

In a distributed environment, consistency refers to whether data can be kept identical across multiple replicas. For any read request sent by a client to the distributed system, the system either returns the latest data or fails. If, once an update to a data item has succeeded, all users can read the latest value, the system is considered to have strong consistency (also called strict consistency).

Features: a read either returns the latest data or fails; the emphasis is on data correctness.

2. Availability

Availability means that the services provided by the system must always be usable: for every user request, the system must return a non-error response within a bounded time, although it does not guarantee that the returned data is the latest.

Availability is commonly described by the number of nines, for example three nines (99.9%) or four nines (99.99%), and it is written into an SLA (service level agreement). For example, a monthly SLA of 99.95% means that the service may be unavailable for at most 0.05% of the total time each month; for a 30-day month, that is 21.6 minutes.
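The arithmetic behind the 21.6 minutes can be checked with a few lines of Python. This is only a sketch; the function name allowed_downtime_minutes is made up for illustration.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Return the downtime budget, in minutes, for an SLA over `days` days."""
    total_minutes = days * 24 * 60          # 30 days = 43,200 minutes
    unavailable_fraction = (100.0 - sla_percent) / 100.0
    return total_minutes * unavailable_fraction


for sla in (99.9, 99.95, 99.99):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% over 30 days -> {minutes:.2f} minutes of downtime")
```

For 99.95% this gives 43,200 x 0.0005 = 21.6 minutes, matching the figure above.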

Features: the system always returns data and never returns an error, but the data is not guaranteed to be the latest; the emphasis is on never failing, so that reads and writes succeed at any time.

3. Partition tolerance

Most distributed systems are spread across multiple sub-networks, and each sub-network is called a partition. Partition tolerance means that "when some messages between nodes are lost or a partition fails, the distributed system continues to operate". In other words, the system tolerates network partitions: when nodes or partitions become unreachable from one another, it can still provide a service that meets its consistency or availability guarantees.

Features: the system keeps running regardless of the state of its internal data synchronization; the emphasis is on not going down.

In a distributed system, because the system is split into layers across nodes, P is a given, so the architectures that apply CAP in practice are CP and AP. For distributed systems, the question is how, on the premise of partition tolerance, to better achieve A and ...

Two, proof of the CAP theorem

The CAP theorem can be proved in several ways, and proof by contradiction is the most intuitive. This proof was first given by Lynch, using a practical scenario. Suppose C, A, and P can all be satisfied at once. Because P is allowed, messages between servers may be lost, and in that case C cannot be guaranteed while every server keeps responding, which contradicts the assumption.

Stand-alone system

First, consider a stand-alone system, configured as shown above. ClientA sends an instruction to the Server to update the value of X, and Client1 reads that value from the Server. With a single node there is no network partition, so a simple transaction mechanism guarantees that Client1 always reads the latest value, and consistency is not a problem.

Distributed Systems

Now we add a second group of nodes to the system. Because partition tolerance is allowed, a Write operation may succeed on Server1 but fail on Server2. Client1 and Client2 will then read different values of X, which is an inconsistent state. To keep the value of X consistent, the Write operation must fail, which reduces the availability of the system.

It follows that a distributed system cannot satisfy all three CAP properties (consistency, availability, and partition tolerance) at the same time.
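The two-server scenario can be played out in a toy simulation. The following is a minimal sketch rather than a real replication protocol: the Replica and Cluster classes and the mode flag are assumptions made for illustration, and the "network" is just a boolean.

```python
class Replica:
    """One server holding its own copy of X."""
    def __init__(self, name):
        self.name = name
        self.x = 0


class Cluster:
    """Two replicas; `mode` is the (hypothetical) policy used during a partition."""
    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.server1 = Replica("Server1")
        self.server2 = Replica("Server2")
        self.partitioned = False  # True means Server1 cannot reach Server2

    def write(self, value):
        """Write X through Server1 and replicate it to Server2."""
        if self.partitioned:
            if self.mode == "CP":
                # Keep C: refuse the write, so availability is reduced.
                raise RuntimeError("write rejected: Server2 unreachable")
            # Keep A: accept the write locally, so the replicas diverge.
            self.server1.x = value
            return
        self.server1.x = value
        self.server2.x = value


def demo(mode):
    cluster = Cluster(mode)
    cluster.write(1)            # link is healthy, both replicas get X = 1
    cluster.partitioned = True  # the network partition happens
    try:
        cluster.write(2)
        outcome = "accepted"
    except RuntimeError:
        outcome = "rejected"
    return outcome, cluster.server1.x, cluster.server2.x


print(demo("CP"))  # ('rejected', 1, 1): consistent, but the write failed
print(demo("AP"))  # ('accepted', 2, 1): available, but the replicas disagree
```

Either the write is refused (C is kept, A is lost) or the two servers return different values (A is kept, C is lost); with the partition in place there is no third outcome.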

Three, CAP theory applications

CAP theory reminds us that, in architecture design, we should not waste energy trying to design a perfect distributed system that satisfies all three properties, but should make a reasonable trade-off. CAP is similar to an impossible triangle in mathematics: you can choose only two of the three, not all of them.

Different businesses have different consistency requirements. For example, for comments and likes on Weibo, users are not sensitive to inconsistency and can tolerate it for a relatively long time; as long as local interaction works, the user experience is not affected. When shopping on an e-commerce site, however, product price data requires strong consistency: if a merchant's price change does not take effect immediately, it has a large impact on the transaction success rate.

Note that CAP theory ignores network latency: it assumes that when a transaction commits, the data is replicated to the other nodes instantly. In reality, replication takes time. Even copying from node A to node B in the same machine room is not instantaneous, so there is always some period of inconsistency.

Four, BASE theory

The point of BASE theory is that we do not have to choose strictly between A and C; we can achieve part of A and part of C.

BASE is an abbreviation of three phrases: Basically Available, Soft state, and Eventually consistent. It is an extension of AP in CAP.

1. Basically Available

Basic availability does not pursue the CAP notion that "reads and writes succeed at any time". It emphasizes that when unpredictable failures occur, a distributed system is allowed to lose part of its availability: compared with normal operation, response times may be longer, or services may be degraded, so that core functionality remains available.

For example, during a Double Eleven (November 11) flash sale, if the number of buyers exceeds the system's peak QPS, the system may show a queueing prompt or apply rate limiting. This protects the stability of the system by reasonable means and keeps the main service running normally, which ensures basic availability.
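Rate limiting of this kind is often done with a token bucket. The sketch below is one possible illustration, not the mechanism of any particular system; TokenBucket, handle_order, and the rate and capacity values are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Allow at most `rate_per_sec` requests per second, with bursts up to `capacity`."""
    def __init__(self, rate_per_sec, capacity):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = 0.0  # time of the previous call, in seconds

    def allow(self, now):
        # Refill tokens for the elapsed time, never exceeding the capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def handle_order(bucket, now):
    """Serve the request if capacity allows, otherwise degrade gracefully."""
    if bucket.allow(now):
        return "order accepted"
    # Degraded response: the core service stays up, the user is asked to wait.
    return "busy, please queue and retry"


bucket = TokenBucket(rate_per_sec=2, capacity=2)
for t in (0.0, 0.0, 0.0, 1.0):
    print(t, handle_order(bucket, t))
```

The third request at time 0.0 is turned away with a queueing prompt instead of an error, and requests are accepted again once the bucket refills.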

2. Soft state

Soft state can be understood in contrast to the atomicity of ACID transactions. An ACID transaction achieves strong consistency: either everything is done or nothing is done, and all users see the same data. In this framing, atomicity requires that the copies of the data on multiple nodes are identical, which emphasizes data consistency.

ACID is a strong consistency model that emphasizes atomicity, consistency, isolation, and durability, and it is mainly used in database implementations.

Atomicity can be understood as a "hard state". Soft state allows the data in the system to be in an intermediate state, on the understanding that this state does not affect the overall availability of the system; that is, the system allows a replication delay between the copies of the data on different nodes.

BASE theory is oriented toward highly available, scalable distributed systems, while ACID suits traditional finance and similar businesses. In practice, different businesses have different requirements for data consistency.

3. Eventual consistency

Data cannot stay in a soft state forever. After a period of time (called the "inconsistency window"), all nodes must become consistent: once the window has passed, all replicas should hold the same data, and the data is then said to have reached eventual consistency.

In system design, the time needed to reach eventual consistency depends on design factors such as network latency, system load, the choice of storage, and the data replication scheme.
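The inconsistency window can be made concrete with a small simulation. This is a sketch under simplifying assumptions: one primary, one replica, and a fixed replication delay counted in abstract ticks. The class EventuallyConsistentStore and its methods are invented for the example.

```python
from collections import deque


class EventuallyConsistentStore:
    """A primary plus one replica that receives writes after a fixed delay."""
    def __init__(self, replication_delay):
        self.primary = {}
        self.replica = {}
        self.delay = replication_delay
        self.clock = 0
        self.pending = deque()  # entries are (apply_at_tick, key, value)

    def write(self, key, value):
        # The primary is updated at once; the replica update is only scheduled.
        self.primary[key] = value
        self.pending.append((self.clock + self.delay, key, value))

    def tick(self):
        """Advance time by one tick and apply replication that is now due."""
        self.clock += 1
        while self.pending and self.pending[0][0] <= self.clock:
            _, key, value = self.pending.popleft()
            self.replica[key] = value

    def read_replica(self, key):
        return self.replica.get(key)


store = EventuallyConsistentStore(replication_delay=2)
store.write("x", 1)
print(store.read_replica("x"))  # None: still inside the inconsistency window
store.tick()
store.tick()
print(store.read_replica("x"))  # 1: the replica has caught up
```

During the two ticks after the write, the replica is in the soft state described above; after that, both copies agree.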

CAP theory does not address network latency; BASE uses soft state and eventual consistency to guarantee consistency after a delay. BASE and ACID are opposites. Unlike ACID's strong consistency model, BASE obtains availability by sacrificing strong consistency: it allows data to be inconsistent for a period of time, provided that it eventually reaches a consistent state.

Consistency Model

Eventual consistency can be further divided into several models according to the guarantees provided, including causal consistency and session consistency.

Causal consistency

Causal consistency requires that operations with a causal relationship be seen in causal order; the order of operations with no causal relationship does not matter.

If process A notifies process B after updating a data item, then process B's subsequent reads of that item must return the value written by process A, and if process B goes on to update the item, its update must be based on the latest value written by process A.

As an application scenario, consider commenting on Weibo or WeChat. For example, you post a photo in WeChat Moments, a friend comments on it, and you reply to that comment. When this is displayed in Moments, your reply must appear after your friend's comment, because the two are causally related; other data with no causal relationship is allowed to be inconsistent.
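One simple way to respect causal order on the reading side is to hold back a message until the message it depends on is visible. The sketch below assumes each message names at most one dependency; CausalTimeline and its fields are hypothetical, and real systems typically track dependencies with vector clocks or similar metadata.

```python
class CausalTimeline:
    """Show messages only after the message they depend on is visible."""
    def __init__(self):
        self.visible = []   # message ids in display order
        self.buffered = []  # (msg_id, depends_on) pairs waiting for a dependency

    def receive(self, msg_id, depends_on=None):
        self.buffered.append((msg_id, depends_on))
        self._deliver_ready()

    def _deliver_ready(self):
        # Keep scanning until no buffered message can be released.
        progressed = True
        while progressed:
            progressed = False
            for item in list(self.buffered):
                msg_id, dep = item
                if dep is None or dep in self.visible:
                    self.visible.append(msg_id)
                    self.buffered.remove(item)
                    progressed = True


timeline = CausalTimeline()
timeline.receive("reply", depends_on="comment")  # arrives too early, buffered
timeline.receive("other_post")                   # unrelated, shown at once
timeline.receive("comment", depends_on="photo")  # still waiting for the photo
timeline.receive("photo")                        # releases comment, then reply
print(timeline.visible)
# ['other_post', 'photo', 'comment', 'reply']
```

The reply never appears before the comment it answers, while the unrelated post is free to appear at any point.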

Session consistency

Session consistency frames access to the system's data within a session. Within a session, the system guarantees "read your writes" consistency: after the client performs an update, it can always read the latest value of that data item within the same session.

In practice, distributed session consistency is used in development, and it can be regarded as an application of session consistency.
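A common way to get "read your writes" is for the session to remember the version it last wrote and to refuse a replica that is older than that. The following sketch assumes one primary and one lagging replica; Store, Session, and the explicit sync method are illustrative names, not any real library's API.

```python
class Store:
    """A primary and a replica that only catches up when sync() is called."""
    def __init__(self):
        self.primary = {}  # key -> (version, value)
        self.replica = {}  # may lag behind the primary

    def write(self, key, value):
        version = self.primary.get(key, (0, None))[0] + 1
        self.primary[key] = (version, value)
        return version

    def sync(self):
        self.replica = dict(self.primary)


class Session:
    """Tracks the versions this session wrote, to guarantee read-your-writes."""
    def __init__(self, store):
        self.store = store
        self.last_written = {}  # key -> version written in this session

    def write(self, key, value):
        self.last_written[key] = self.store.write(key, value)

    def read(self, key):
        needed = self.last_written.get(key, 0)
        version, value = self.store.replica.get(key, (0, None))
        if version < needed:
            # The replica is older than our own write: fall back to the primary.
            version, value = self.store.primary[key]
        return value


store = Store()
alice, bob = Session(store), Session(store)
alice.write("nickname", "v1")
print(alice.read("nickname"))  # v1: Alice always sees her own write
print(bob.read("nickname"))    # None: Bob's session may still see stale data
store.sync()
print(bob.read("nickname"))    # v1: eventually everyone sees it
```

The guarantee is scoped to the session that performed the write; other sessions only get eventual consistency.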

Five, choosing between CP and AP architectures

A business's consistency requirements are reflected directly in the system design; the typical architectures are CP and AP.

CP architecture

CP architecture: a CP system gives up availability and pursues consistency and partition tolerance.

ZooKeeper is a CP system. It is a distributed service framework used mainly to solve coordination and consistency problems in distributed clusters. Its core algorithm is Zab, and its design is centered on consistency. In the CAP model ZooKeeper is CP, which means that when a network partition occurs, it sacrifices availability in order to maintain consistency. After a partition, only the side that still holds a quorum of nodes continues to serve external requests.
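The quorum rule can be written down in one line. This sketch assumes a simple majority quorum, which is the usual default; has_quorum is a made-up helper, not part of ZooKeeper's API.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """True if this side of a partition holds a strict majority of the cluster."""
    return reachable_nodes > cluster_size // 2


# A 5-node ensemble split 3 / 2: only the 3-node side keeps serving.
print(has_quorum(3, 5), has_quorum(2, 5))  # True False

# A 4-node ensemble split 2 / 2: neither side has a majority.
print(has_quorum(2, 4), has_quorum(2, 4))  # False False
```

Because two disjoint sides can never both hold a strict majority, at most one side accepts writes, which is how consistency is preserved at the cost of availability on the minority side.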

AP architecture

AP architecture: an AP system gives up strong consistency (consistency here means strong consistency) and pursues partition tolerance and availability. This is the choice made in many distributed system designs, and BASE is also an extension of AP.

In contrast to ZooKeeper, Eureka is AP. Eureka is the service discovery component of the Spring Cloud microservices stack. All Eureka nodes are equal, and the failure of several nodes does not affect normal operation: the remaining nodes can still provide registration and query services. As long as one Eureka node is still up, the registry remains available, but the information it returns may not be the latest. Eureka does not guarantee strong consistency and aims for eventual consistency.
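The AP behavior described here can be imitated with a toy peer-to-peer registry. This is not Eureka's API or replication protocol; RegistryNode, lookup_any, the service name, and the addresses are all invented to illustrate "available but possibly stale".

```python
class RegistryNode:
    """One registry peer with its own copy of the service table."""
    def __init__(self, name):
        self.name = name
        self.up = True
        self.services = {}  # service name -> address

    def register(self, service, address):
        self.services[service] = address

    def lookup(self, service):
        return self.services.get(service)


def lookup_any(nodes, service):
    """Ask the first peer that is still up; any surviving peer can answer."""
    for node in nodes:
        if node.up:
            return node.name, node.lookup(service)
    raise RuntimeError("no registry node available")


node_a, node_b = RegistryNode("A"), RegistryNode("B")
for node in (node_a, node_b):
    node.register("orders", "10.0.0.1")

# A new address reaches node A, but A fails before it is copied to node B.
node_a.register("orders", "10.0.0.2")
node_a.up = False

print(lookup_any([node_a, node_b], "orders"))
# ('B', '10.0.0.1'): the lookup still succeeds, but the answer is stale
```

The query is answered as long as one peer survives, which is the availability side of the trade-off; the stale address is the price paid in consistency.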

Origin juejin.im/post/5e89e3bb518825736512cd39