The CAP theorem in distributed systems

According to Baidu Encyclopedia, the CAP principle, also known as the CAP theorem, states that in a distributed system, Consistency, Availability, and Partition tolerance cannot all be achieved together: at most two of the three properties can be satisfied at the same time.

First, the definition of CAP

Consistency:

"All nodes see the same data at the same time." That is, once an update succeeds and is returned to the client, every node holds exactly the same data at the same moment. This is distributed consistency. Consistency problems are unavoidable under concurrency. From the client's point of view, consistency is about how updated data is observed under concurrent access. From the server's point of view, it is about how an update is replicated across the whole system so that the data ends up consistent.

Availability:

"Reads and writes always succeed." That is, the service is always available and responds within a normal time. Good availability means the system serves its users well: operations do not fail, requests do not time out, and the user experience does not suffer.

Partition tolerance:

When a node fails or the network partitions, the distributed system can still provide service to the outside.

Partition tolerance requires that the application, although it is a distributed system, appears to work as a single functioning whole. For example, if one or a few machines in the system go down, the remaining machines keep operating normally and still meet the system's requirements, with no noticeable effect on the user experience.

Second, proving the CAP theorem

Now let us prove it: why can the three properties not all be satisfied at the same time?

Suppose there are two servers, N1 and N2. N1 hosts application A and a copy of database V, and N2 hosts application B and a copy of database V. The two servers can communicate over the network, so together they form the two parts of a distributed system.

If consistency is satisfied, the data on N1 and N2 is the same. Initially both hold DB0. If availability is satisfied, a user gets an immediate response whether the request goes to N1 or to N2. If partition tolerance is satisfied, then when either N1 or N2 goes down, or the network between them is cut, the other keeps operating normally.

Under normal operation, a user sends an update to N1 through application A, and the data on N1 changes from DB0 to DB1. The distributed system then synchronizes the update, so the data on N2 also changes from DB0 to DB1. A user who then reads through application B on N2 gets the latest data, DB1.
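The normal case can be sketched in a few lines of Python. This is only a minimal illustration: the `Node` class and its `write` and `read` methods are names made up for this example, and replication is modeled as a direct in-memory copy rather than a real network call.

```python
class Node:
    """One server holding its own copy of database V (hypothetical model)."""

    def __init__(self, name, value):
        self.name = name
        self.value = value
        self.peers = []

    def write(self, value):
        # Update the local copy, then synchronously copy it to every peer.
        self.value = value
        for peer in self.peers:
            peer.value = value

    def read(self):
        # Answer from the local copy.
        return self.value


n1 = Node("N1", "DB0")
n2 = Node("N2", "DB0")
n1.peers.append(n2)
n2.peers.append(n1)

n1.write("DB1")     # the user updates the data through application A on N1
print(n2.read())    # a read through application B on N2 prints DB1
```

As long as the copy to the peer always succeeds, both nodes agree and both answer immediately.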

That is the normal case. The biggest problem in a distributed system, however, is the network. Now assume an extreme case: the network between N1 and N2 is cut, but the system must still tolerate this failure, that is, it must satisfy partition tolerance. Can consistency and availability both still be satisfied?

Suppose the network between N1 and N2 suddenly fails, and a user then sends an update request to N1. The data on N1 is updated from DB0 to DB1, but because the network is down, the data on N2 is still DB0.

If a user now sends a read request to N2, the data has not been synchronized, so the application cannot immediately return the latest data, DB1. What can it do? There are two options. The first sacrifices consistency: respond with the old data, DB0. The second sacrifices availability: block until the network is restored and the update has been synchronized, then respond with the latest data, DB1.
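The two options can be made concrete with a small simulation. The sketch below rests on several assumptions: the `Cluster` class, the `"CP"` and `"AP"` mode labels, and the `Unavailable` exception are hypothetical names, and the blocking option is modeled as an immediate refusal instead of an actual wait for the network to recover.

```python
class Unavailable(Exception):
    """Raised when a node refuses to answer rather than risk stale data."""


class Cluster:
    """Two replicas, N1 and N2, joined by a link that can be cut."""

    def __init__(self, mode):
        self.mode = mode                          # "CP" or "AP"
        self.data = {"N1": "DB0", "N2": "DB0"}
        self.link_up = True

    def write(self, node, value):
        self.data[node] = value
        if self.link_up:
            # Replication only reaches the other node while the link is up.
            for other in self.data:
                self.data[other] = value

    def read(self, node):
        if not self.link_up and self.mode == "CP":
            # Sacrifice availability: refuse instead of returning old data.
            raise Unavailable(f"{node} cannot confirm it has the latest data")
        # No partition, or consistency is sacrificed: answer from the local copy.
        return self.data[node]


def scenario(mode):
    cluster = Cluster(mode)
    cluster.link_up = False         # the network between N1 and N2 fails
    cluster.write("N1", "DB1")      # N1 becomes DB1, N2 stays at DB0
    try:
        return cluster.read("N2")
    except Unavailable:
        return "unavailable"


print(scenario("AP"))   # DB0
print(scenario("CP"))   # unavailable
```

In the AP run the reader gets an answer, but it is the old value. In the CP run the reader never sees stale data, but gets no answer until the link returns.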

This scenario is simple, but it shows that a distributed system that must tolerate partitions can keep only one of consistency and availability. In other words, a distributed system cannot satisfy all three properties at once, so a choice has to be made when building one. How, then, should the strategy be chosen?

Third, the policy trade-offs

Since only two of the three CAP properties can be satisfied, there are three possible strategies:

CA without P: If P (partition tolerance) is not required, then C (strong consistency) and A (availability) can both be guaranteed. But giving up P also means giving up the scalability of the system: the data cannot be spread across multiple nodes and no additional nodes can be deployed, which contradicts the original intention of designing a distributed system.

CP without A: If A (availability) is not required, every request must maintain strong consistency between servers, and a partition can make synchronization take indefinitely long, because normal service resumes only after data synchronization completes. Once a network fault occurs or a message is lost, the user experience has to be sacrificed: users are let into the system only after all the data is consistent again. Many systems are designed as CP. Distributed databases such as Redis and HBase are commonly cited as typical examples. For these distributed databases, data consistency is a basic requirement. If they could not meet even this standard, one might as well use a relational database directly, with no need to spend resources deploying a distributed database.

AP without C: To have both high availability and partition tolerance, consistency must be given up. Once a partition occurs, nodes may lose contact with each other. To stay highly available, each node can then serve requests only from its local data, which leads to globally inconsistent data. A typical example is a flash sale for mobile phones. A few seconds earlier the product page told you the item was in stock, but when you have finished choosing and are ready to place the order, the system tells you the order failed because the item is sold out. This is a system that first guarantees A (availability), so that it keeps serving normally, and then makes some sacrifices in data consistency. The user experience suffers somewhat, but the shopping process is not seriously blocked.
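The flash-sale behavior can be sketched as follows. This is an illustrative model only: the `Shop` class, its two stock counters, and the `sync` method are assumptions made up for the example, standing in for an authoritative store and a lagging copy that product pages read from.

```python
class Shop:
    """Hypothetical shop whose product pages read from a lagging copy."""

    def __init__(self, stock):
        self.primary_stock = stock    # authoritative count, checked when ordering
        self.replica_stock = stock    # copy used to render product pages

    def sync(self):
        # Replication eventually catches up with the authoritative count.
        self.replica_stock = self.primary_stock

    def page_shows_in_stock(self):
        # Always answers immediately, even if the copy is out of date.
        return self.replica_stock > 0

    def place_order(self):
        if self.primary_stock <= 0:
            return "order failed: sold out"
        self.primary_stock -= 1
        return "order placed"


shop = Shop(stock=1)
print(shop.place_order())           # another buyer takes the last item
print(shop.page_shows_in_stock())   # True: the page copy is stale
print(shop.place_order())           # order failed: sold out
shop.sync()
print(shop.page_shows_in_stock())   # False once the copy catches up
```

The page never blocks, which is the availability being protected. The cost is a short window in which it shows stock that no longer exists.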

Fourth, the summary

Most large Internet applications today run on many hosts with dispersed deployments, and cluster sizes keep growing, so node failures and network failures are the norm. Partition tolerance is therefore something a distributed system is bound to face, which leaves only a trade-off between C and A. Traditional projects may be different. Take a bank transfer system: where money is involved, data consistency cannot be compromised in the slightest, so C must be guaranteed. If the network fails, the system would rather stop serving, and the trade-off is made between A and P.

All in all, there is no single best strategy. A good system should be architected according to the business scenario, and the strategy that fits is the best one.

Origin www.cnblogs.com/LoveShare/p/11011454.html