Zhang Taipan and the CAP Theorem

The computer industry has plenty of lofty, hard-to-understand terms, and CAP is one of them. Consistency, Availability, and Partition tolerance are hard enough to grasp on their own, and the CAP theorem built on them leaves people even more confused. Today we will try to explain it in plain terms.

Zhang Taipan worked hard at his company and, after years of effort, was finally promoted to architect.

 

Before his architect's chair was even warm, a project came along that needed an architecture.

 

The boss called Taipan in and said earnestly: "Taipan, our data is a valuable asset. You must design a system that guarantees the data cannot be lost!"

 

Taipan said: "Don't worry, boss, I have experience in this area. In general we handle this with data redundancy; put simply, we store multiple copies of the data. I will design a distributed system that backs the data up to multiple machine nodes."

 

A few days later, Taipan drew a diagram showing how the distributed system works:

 

The data is stored as redundant copies on different machines, with data replication between them keeping the copies in sync.

 

Although there are only two machines, they already make up a simple distributed environment.
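Taipan's two-machine design can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, the `peer` link, and the `order:1` key are hypothetical names invented here, and replication is modeled as an immediate in-memory copy rather than a real network call.

```python
class Node:
    """One machine holding a full copy of the data (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peer = None  # the other machine in the two-node setup

    def write(self, key, value):
        # Store locally, then copy the write to the peer (replication).
        self.data[key] = value
        if self.peer is not None:
            self.peer.data[key] = value

    def read(self, key):
        return self.data.get(key)


a = Node("A")
b = Node("B")
a.peer, b.peer = b, a

a.write("order:1", "paid")
print(b.read("order:1"))  # prints "paid": B holds a copy of what A wrote
```

As long as the link between the two machines works, a write to either one is visible on both.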

 

The boss did not understand the technology, but seeing the data backed up across different machines, he was reassured.

 

After several months of development and testing, the system went live successfully. But the team soon discovered that a distributed system is not as simple as a stand-alone one: network problems or machine faults can easily cause communication failures or leave a node unusable.

 

One day, a user's first request went to machine A, on the left, and wrote some data. Then, very unfortunately, machine A's network cable got kicked out, which led directly to two serious consequences:

 

1. The load balancer could not reach machine A, assumed it was dead, and sent the user's next request to machine B.

 

2. Data replication could not reach machine A either and had to stop working. The data the user had just written could not be copied to machine B, so machine B still held the old data.
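The incident can be replayed as a small simulation. It is a rough sketch, not the real system: `Machine`, `Cluster`, the `a_reachable` flag, and the `balance` key are all made-up names, and the "network" is just a boolean.

```python
class Machine:
    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """Two machines plus a toy load balancer and replicator (hypothetical)."""

    def __init__(self):
        self.a = Machine("A")
        self.b = Machine("B")
        self.a_reachable = True  # becomes False when A's cable is kicked out

    def replicate(self, key, value):
        """Copy a write from A to B; return False if A cannot be reached."""
        if not self.a_reachable:
            return False
        self.b.data[key] = value
        return True

    def route(self):
        """Load balancer: use A while it is reachable, otherwise fall back to B."""
        return self.a if self.a_reachable else self.b


cluster = Cluster()

# Normal operation: a write on A is replicated to B.
cluster.a.data["balance"] = 100
cluster.replicate("balance", 100)

# The user writes new data to A, then the cable is kicked out
# before replication gets a chance to run.
cluster.a.data["balance"] = 80
cluster.a_reachable = False
replicated = cluster.replicate("balance", 80)

# Consequence 1: the next request is routed to B.
# Consequence 2: B still holds the old value.
node = cluster.route()
print(node.name, node.data["balance"], replicated)  # prints: B 100 False
```

The user wrote 80 but now reads 100 from machine B, which is exactly the inconsistency the team has to decide how to handle.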

 

What to do? This time it was an accident, and plugging the network cable back in fixed it. But who can guarantee that communication between the two machines will always be smooth from now on?

 

Wang, from the team, said: "Machine B is still alive and can still provide service. If the data has not been replicated to machine B, users just see a few fewer records. It is harmless and does not affect the big picture; the system is still just about usable. Once the cable is plugged back in, data replication will resume and everything will return to normal."

 

Without realizing it, Wang had chosen the system's availability (Availability, A for short): as long as the system can keep providing service, inconsistent data can be tolerated.

 

Zhang Taipan said: "No. The boss said our system's data is extremely important, and inconsistent data would have serious consequences. So the functions on machine B that depend on this critical data must be stopped as well. They can only be restarted after machine A's cable is plugged back in and the data has been synchronized."

 

Clearly, Zhang Taipan was following the boss's instructions and putting consistency (Consistency, C for short) first.

 

So the problem is now quite clear: when the nodes cannot communicate over the network, the functions that depend on data replication must choose either availability (A) or consistency (C); they cannot have both.
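The two positions can be put side by side in code. The sketch below is an assumption-laden illustration: `read_from_b`, the `mode` strings `"AP"` and `"CP"`, and the `Unavailable` exception are hypothetical names, not part of any real library.

```python
class Unavailable(Exception):
    """Raised when a node refuses to serve rather than risk stale data."""


def read_from_b(b_data, key, partitioned, mode):
    """Serve a read on machine B under the chosen policy (hypothetical).

    mode "AP": Wang's choice, always answer, even if the data may be stale.
    mode "CP": Taipan's choice, refuse to answer while the nodes are partitioned.
    """
    if mode not in ("AP", "CP"):
        raise ValueError("mode must be 'AP' or 'CP'")
    if partitioned and mode == "CP":
        raise Unavailable("waiting for machine A to come back and sync")
    return b_data.get(key)


# Machine B still holds the old value; the user's newer write lives only on A.
b_data = {"balance": 100}

print(read_from_b(b_data, "balance", partitioned=True, mode="AP"))  # prints 100 (stale)

try:
    read_from_b(b_data, "balance", partitioned=True, mode="CP")
except Unavailable as exc:
    print("refused:", exc)
```

When there is no partition, both modes behave the same way; the choice only matters once the nodes lose contact.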

 

Taipan thought it over carefully. Behind these two choices hides another fact: when the nodes cannot communicate and become isolated from each other, producing a network partition, the system as a whole can still keep working. Taipan gave this a name: partition tolerance (Partition tolerance, P for short).

 

If we choose availability (A) + partition tolerance (P), we must give up consistency (C).

 

If we choose consistency (C) + partition tolerance (P), we must give up availability (A). In this case, some functions of the system cannot be used because they have to wait for data synchronization, but the functions that do not depend on data synchronization remain accessible. In effect, the system's functionality is degraded.
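This kind of degradation under CP might look like the sketch below. The feature names and the `NEEDS_SYNCED_DATA` set are invented for illustration; a real system would decide per feature which ones depend on replicated data.

```python
class ServiceUnavailable(Exception):
    pass


# Hypothetical split: features that read or write the replicated data.
NEEDS_SYNCED_DATA = {"view_balance", "transfer"}


def handle(feature, partitioned):
    """CP-style handling: pause only the features that need synchronized data."""
    if partitioned and feature in NEEDS_SYNCED_DATA:
        raise ServiceUnavailable(f"{feature} is paused until replication resumes")
    return f"{feature}: ok"


print(handle("help_page", partitioned=True))  # prints "help_page: ok"

try:
    handle("transfer", partitioned=True)
except ServiceUnavailable as exc:
    print(exc)  # prints "transfer is paused until replication resumes"
```

Only the data-dependent features go dark during the partition, so users see a degraded system rather than a dead one.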

 

Since there are AP and CP, could there also be a CA (consistency + availability) combination? That is, no partition tolerance, only availability and consistency? On reflection, that case effectively degenerates into a stand-alone application, so it is not meaningful for a distributed system.

 

Taipan felt he had found a pattern: in a distributed computer system, consistency (C), availability (A), and partition tolerance (P) cannot all be guaranteed at the same time; at most two of the three can be satisfied.

 

He decided to call this pattern the CAP theorem. The name sounds lofty and makes the whole thing appear unfathomable.

 

If you still do not understand CAP, Zhang Taipan will give you an easier version: when the nodes of a distributed system cannot communicate (that is, a network partition occurs), you can choose only availability or consistency, not both at once.

 

(The end)

Origin www.cnblogs.com/wjlwo2ni/p/11353539.html