Thinking about Distributed Systems

Abstract: This article discusses some theories and ideas behind distributed systems, including CAP, BASE, and NWR, and briefly analyzes the pros and cons of some mainstream distributed database solutions, so that we can think, choose, and design more thoroughly during development.

Before discussing common architectures, let's briefly understand the CAP theory:

CAP is an acronym for Consistency, Availability, and Partition tolerance, which refer to:

  1. Consistency: every read operation returns the latest data;
  2. Availability: every non-failing node returns a normal result within a reasonable time;
  3. Partition tolerance: the system continues to provide service even when network partitions occur between nodes.

CAP theory states that a distributed system can guarantee at most two of the three properties, never all three. This is actually easy to understand:

  • First of all, a stand-alone system can only guarantee CP.
  • With two or more nodes, when a network partition occurs, the nodes in the cluster cannot communicate with each other. If data consistency (C) is to be guaranteed, then at least one node must be marked as unavailable, violating the availability requirement (A); only CP can be guaranteed.
  • Conversely, if availability (A) is to be guaranteed, both nodes continue to handle requests independently, but the network failure prevents data synchronization, inevitably leading to inconsistency; only AP can be guaranteed.
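As a toy illustration of this trade-off (not from the original article; all class and mode names here are made up), the following sketch models a two-node cluster under a network partition: in "CP" mode the cluster rejects writes it cannot replicate, while in "AP" mode each node accepts writes locally and the replicas diverge.

```python
# Toy model of a two-node cluster under a network partition.
# CP mode: writes are rejected when replication is impossible (lose A).
# AP mode: each node accepts writes locally and diverges (lose C).

class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}

class Cluster:
    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP"
        self.partitioned = False
        self.a, self.b = Node("a"), Node("b")

    def write(self, node, key, value):
        if self.partitioned:
            if self.mode == "CP":
                return False        # refuse the write: stay consistent, lose availability
            node.data[key] = value  # AP: accept locally, replicas diverge
            return True
        self.a.data[key] = value    # no partition: replicate synchronously
        self.b.data[key] = value
        return True

cp = Cluster("CP")
cp.partitioned = True
assert cp.write(cp.a, "x", 1) is False          # CP sacrifices availability

ap = Cluster("AP")
ap.partitioned = True
ap.write(ap.a, "x", 1)
ap.write(ap.b, "x", 2)
assert ap.a.data["x"] != ap.b.data["x"]         # AP sacrifices consistency
```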


1. Single instance

Obviously, a single-machine system can only guarantee CP, at the expense of availability (A). The stand-alone versions of MySQL, Redis, MongoDB, and other databases all work in this mode.


In practice, we need a system with high availability that can continue to provide services even after some machines fail.

2. Multiple copies


Compared with a single instance, there is one more node that backs up the data.

For read operations, availability is improved because either of the two nodes can be accessed.

For write operations, there are three cases according to the update strategy:

  1. Synchronous update: the write operation returns only after both nodes have been updated successfully. In this way, in the event of a network partition, writes become unavailable, sacrificing A.
  2. Asynchronous update: the write operation returns immediately without waiting for the other node, which updates its data asynchronously.
    In this way, C is sacrificed to guarantee A: there is no guarantee that the update succeeded, and network failures or other problems may leave the data inconsistent.
  3. Compromise: return after some of the nodes have been updated successfully.
    Here, let's first introduce NWR, the strategy that Dynamo-style systems use to control the consistency level of a distributed storage system:

    • N: the number of replicas of the same piece of data
    • W: the number of replicas that must acknowledge a write for it to succeed
    • R: the number of replicas that must be read during a read operation

    When W+R>N, the replica sets covered by the read and write operations must intersect, so a read only needs to compare the modification times or version numbers of the replicas it reads and pick the latest; the system is strongly consistent.
    Conversely, when W+R<=N, the system is only weakly consistent.

    For example: (N,W,R)=(1,1,1) is a stand-alone system, which is strongly consistent; (N,W,R)=(2,1,1) is the common master-slave mode, which is weakly consistent.
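The W+R rule above boils down to a one-line overlap check. The helper below is purely illustrative, not part of any real database API:

```python
def is_strongly_consistent(n, w, r):
    """Per the NWR rule: the read and write quorums must overlap (W+R > N)."""
    return w + r > n

assert is_strongly_consistent(1, 1, 1)      # stand-alone system: strong
assert not is_strongly_consistent(2, 1, 1)  # master-slave mode: weak
assert is_strongly_consistent(3, 2, 2)      # typical quorum setting: strong
```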


Example:

  • For example, a compromise like Cassandra's QUORUM consistency level returns as soon as more than half of the replicas have been updated successfully, and a read returns the value agreed on by the majority of replicas. Inconsistent replicas can then be fixed via read repair.

    read repair: when reading a piece of data, query it on all replicas, compare each replica's data against the latest value held by the majority, and restore consistency on any replica that is out of date.

    Here, W+R>N, so this is strongly consistent.

  • Another example is Redis's master-slave mode: a write returns as soon as the master node is updated successfully, and the other nodes back up the data asynchronously. This approach only guarantees eventual consistency.

    Eventual consistency: in contrast to strong consistency, where the data is consistent at every moment, eventual consistency allows the data to be inconsistent for a period of time; over time, the data eventually converges to a consistent state.

    Here, W+R<=N, so only eventual consistency can be guaranteed.

    In addition, the larger N is, the better the data reliability. However, larger W or R means higher write or read overhead and worse performance. In practice, consistency, availability, and read/write performance must be weighed together; a common choice is to set both W and R to N/2 + 1.
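A read-repair-style quorum read like the one described above can be sketched as follows. This is a simplified illustration with made-up data structures; real systems such as Cassandra track timestamps per column and use more elaborate repair machinery:

```python
# Quorum read sketch: read R replicas, pick the value with the highest
# version number, then repair any stale replica (read repair).

replicas = [
    {"value": "old", "version": 1},
    {"value": "new", "version": 2},
    {"value": "new", "version": 2},
]

def quorum_read(replicas, r):
    contacted = replicas[:r]                      # pretend these R replicas answered
    latest = max(contacted, key=lambda rep: rep["version"])
    for rep in replicas:                          # read repair: fix stale copies
        if rep["version"] < latest["version"]:
            rep.update(latest)
    return latest["value"]

# With (N, W, R) = (3, 2, 2), W+R > N, so the read quorum always
# overlaps the write quorum and sees the newest version.
assert quorum_read(replicas, r=2) == "new"
```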

In fact, the compromise approach and asynchronous update are essentially the same: both give up some C in exchange for improved A. Both also suffer from the 'split-brain' problem: during a network partition, the nodes handle requests independently and cannot synchronize data, so when the network recovers the data ends up inconsistent.

In general, the database will provide solutions for partition recovery:

  1. Prevent at the source: set a timeout for inter-node communication; once it expires, the 'minority' side of the partition stops serving requests. This prevents data inconsistency but reduces availability.
  2. Resolve during recovery: when communication is restored, compare and merge the data on the different nodes. This preserves availability, but until the merge completes the data is inconsistent and conflicts may occur.
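The second strategy, merging divergent data after the partition heals, is often implemented as last-write-wins on a timestamp. A minimal sketch with made-up data (illustrative only; real systems may use vector clocks to detect genuine conflicts instead of silently discarding one side):

```python
def merge_last_write_wins(a, b):
    """Merge two divergent key -> (value, timestamp) maps produced on
    opposite sides of a partition, keeping the newer write for each key."""
    merged = dict(a)
    for key, (value, ts) in b.items():
        if key not in merged or ts > merged[key][1]:
            merged[key] = (value, ts)
    return merged

# Hypothetical state after a partition: both nodes wrote "x" independently.
node_a = {"x": ("1", 100), "y": ("a", 105)}
node_b = {"x": ("2", 110)}             # "x" was updated later on node b

merged = merge_last_write_wins(node_a, node_b)
assert merged == {"x": ("2", 110), "y": ("a", 105)}
```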

This is still not enough. When the data volume is large, a single machine's limited resources cannot hold all the data, so we want to split the data across several machines.

3. Sharding


Compared with a single instance, there is one more node, and the data is split between the nodes.

Since there is only one copy of each piece of data, consistency is guaranteed; and since the nodes do not need to communicate with each other, partition tolerance holds as well.

However, if any node goes down, part of the data is lost and system availability can no longer be guaranteed.

To sum up, like the stand-alone version, this can only guarantee CP.

So, what are the benefits?

  1. A node failure only affects part of the service, i.e. graceful degradation;
  2. Because the data is sharded, the load can be balanced across nodes;
  3. As the data volume grows or shrinks, capacity can be scaled out or in accordingly.

Most database services provide sharding, such as Redis's hash slots, Cassandra's partitions, and MongoDB's shards.
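Routing a key to a shard usually boils down to hashing. The sketch below mimics Redis-style hash slots in spirit; note that Redis Cluster actually uses CRC16 over 16384 slots, while CRC32 and the node names here are stand-ins for illustration:

```python
import zlib

NUM_SLOTS = 16384                       # Redis Cluster also uses 16384 slots
nodes = ["node-0", "node-1", "node-2"]  # hypothetical shard names

def slot(key):
    # Redis uses CRC16(key) % 16384; CRC32 stands in here for illustration.
    return zlib.crc32(key.encode()) % NUM_SLOTS

def route(key):
    # Assign contiguous slot ranges to nodes, as Redis Cluster does.
    return nodes[slot(key) * len(nodes) // NUM_SLOTS]

for key in ("user:1", "user:2", "order:42"):
    print(key, "-> slot", slot(key), "->", route(key))
```

Because each key maps to exactly one slot and each slot to exactly one node, every piece of data has a single home, which is why sharding alone gives CP rather than AP.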

Sharding solves the problem of large data volumes, but we still want the system to be highly available. So, how do we trade away some consistency to guarantee availability?

4. Cluster


As you can see, this approach combines the previous two. By the same analysis as above, different data synchronization strategies give the system different CAP guarantees. Most database systems make these strategies configurable, so we can choose different strategies for different scenarios to obtain different properties.

In fact, for most non-financial Internet companies, the requirement is not strong consistency but availability together with eventual consistency. This is a major reason why NoSQL is so popular in Internet applications. Compared with the ACID principles of strongly consistent systems, such systems lean toward BASE:

  • Basically Available: partition failures are allowed, and the service is merely degraded when problems occur;
  • Soft state: intermediate states are allowed, i.e. updates may propagate asynchronously;
  • Eventual Consistency: the data is allowed to become consistent eventually, rather than being consistent at all times.

5. Summary

Basically, the approaches discussed above cover most distributed storage systems. As we can see, every solution sacrifices one part in exchange for another; none achieves 100% of CAP.

Which solution to choose depends on which properties matter more in the specific scenario.

 

https://my.oschina.net/MaxLeap/blog/737574
