The difference between state machine replication and primary backup systems

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This work (a blog post by Li Zhaolong, created by Li Zhaolong) is confirmed by Li Zhaolong. Please include the copyright notice when reposting.

Introduction

I first ran into this question while reading technical blogs that discuss the differences between ZAB, Paxos, and Raft. It puzzled me for a long time, and I was never clear on how the two differ. To be honest, every article I found online describes the issue vaguely. This article gives my own view, based on the content of the wiki. My conclusions are not guaranteed to be correct; please point out any errors.

Active replication and Passive replication

To understand the difference between state machine replication and primary backup systems, we must first understand Active replication and Passive replication. [2] gives the following description:

Computer scientists further describe replication as being either:

  • Active replication, which is performed by processing the same request at every replica
  • Passive replication, which involves processing every request on a single replica and transferring the result to the other replicas

This is already quite clear. The latter requires a single node to process each request and then transfer the result to the other replicas. The former is less restrictive: it only requires that every replica be a state machine that produces the same output for the same input.
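The two styles can be contrasted with a minimal sketch. Everything here is an assumption made for illustration: the `Counter` state machine and the `active_replication` / `passive_replication` helpers are hypothetical names, and message passing, ordering agreement, and failures are left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Counter:
    """A tiny deterministic state machine (hypothetical, for illustration)."""
    value: int = 0

    def apply(self, op: str, arg: int) -> int:
        if op == "add":
            self.value += arg
        elif op == "mul":
            self.value *= arg
        else:
            raise ValueError(f"unknown operation: {op}")
        return self.value


def active_replication(replicas, requests):
    """Active: every replica executes every request itself, in the same order."""
    for op, arg in requests:
        for replica in replicas:
            replica.apply(op, arg)


def passive_replication(primary, backups, requests):
    """Passive: only the primary executes; backups just store the result."""
    for op, arg in requests:
        new_state = primary.apply(op, arg)
        for backup in backups:
            backup.value = new_state  # nothing is executed on the backup


requests = [("add", 2), ("mul", 5), ("add", 1)]

active = [Counter(), Counter(), Counter()]
active_replication(active, requests)

primary, backups = Counter(), [Counter(), Counter()]
passive_replication(primary, backups, requests)

active_values = [r.value for r in active]
passive_values = [primary.value] + [b.value for b in backups]
print(active_values)   # [11, 11, 11]
print(passive_values)  # [11, 11, 11]
```

Both styles end in the same state here. The difference lies in who does the computing: with active replication, determinism of every replica is what keeps them aligned, while with passive replication only the primary needs to run the logic.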

[2] also describes state machine replication as one of the replication models in distributed systems. Here is how [2] describes it:

State machine replication: assumes that the replicated process is a deterministic finite automaton and that atomic broadcast of every event is possible. It is based on distributed consensus and has a great deal in common with the transactional replication model. This is sometimes mistakenly used as a synonym of active replication. State machine replication is usually implemented by a replicated log consisting of multiple subsequent rounds of the Paxos algorithm. This was popularized by Google’s Chubby system, and is the core behind the open-source Keyspace data store.

[8] gives precise definitions of Active replication and Passive replication:

  • With active replication, also known as state machine replication, each replica implements a deterministic state machine. All replicas process the same operations in the same order.
  • With passive replication, also known as primary backup, a primary replica runs a deterministic state machine, while backups only store states. The primary computes a sequence of new application states by processing operations and forwards these states to each backup in order of generation.

This further corroborates what we said above. A figure from [8] makes it even clearer:
[Figure from [8]: active replication vs. passive replication]

It clearly describes the relationship between Active replication and Passive replication: the former has no primary, while the latter has one. We also know that in multi-Paxos the leader exists to improve efficiency, since it lets proposals skip the first phase. Running without a leader has no effect on safety, but it may affect liveness (livelock). ZAB and Raft, on the other hand, both require a leader.

Let's look at a well-known figure from [8]:
[Figure from [8]]
It clearly shows the relationship between the various protocols.

The following text comes from [4]; many bloggers have quoted it:

What is the difference between primary-backup and state machine replication?

  • A state machine is a software component that processes a sequence of requests. For every processed request, it can modify its internal state and produce a reply. A state machine is deterministic in the sense that, given two runs where it receives the same sequence of requests, it always makes the same internal state transitions and produces the same replies.
  • A state machine replication system is a client-server system ensuring that each state machine replica executes the same sequence of client requests, even if these requests are submitted concurrently by clients and received in different orders by the replicas. Replicas agree on the execution order of client requests using a consensus algorithm like Paxos. Client requests that are sent concurrently and overlap in time can be executed in any order. If a leader fails, a new leader that executes recovery is free to arbitrarily reorder any uncommitted request since it is not yet completed.
  • In the case of primary-backup systems, such as Zookeeper, replicas agree on the application order of incremental (delta) state updates, which are generated by a primary replica and sent to its followers. Unlike client requests, state updates must be applied in the exact original generation order of the primary, starting from the original initial state of the primary. If a primary fails, a new primary that executes recovery cannot arbitrarily reorder uncommitted state updates, or apply them starting from a different initial state.
  • In conclusion, agreement on state updates (for primary-backup systems) requires stricter ordering guarantees than agreement on client requests (for state machine replication systems).

I personally think the text above points at one specific problem, which is also the reason ZooKeeper does not use Paxos as its consensus protocol. I also described this problem in [9]: multi-Paxos only guarantees that the log sequence is exactly the same on all nodes, but from the point of view of any single node, its log has no real "order". The cause is that Paxos allows multiple nodes to write. Two requests A and B may be submitted at two different nodes, and the logically later request B may end up with a smaller log index than A, either because B obtained its proposal first and committed successfully, or because it invalidated A's proposal and was committed ahead of it.
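To make the stricter ordering requirement of state updates concrete, here is a small sketch. It is an illustration under stated assumptions, not ZooKeeper's actual data model: the two requests, the `("set", value)` update format, and all function names are hypothetical.

```python
# Two non-commutative client requests on a single integer x (hypothetical).
REQUESTS = {
    "A": lambda x: x * 2,
    "B": lambda x: x + 1,
}


def run_requests(x, order):
    """State machine replication: replicas execute the requests themselves."""
    for name in order:
        x = REQUESTS[name](x)
    return x


def generate_updates(x, order):
    """Primary backup: the primary executes requests and emits state updates."""
    updates = []
    for name in order:
        x = REQUESTS[name](x)
        updates.append(("set", x))  # each update records the resulting state
    return updates


def apply_updates(x, updates):
    """Backups apply state updates without executing any request."""
    for _, value in updates:
        x = value
    return x


initial = 1

# Any agreed order of the two uncommitted requests is a valid execution.
valid_outcomes = sorted({run_requests(initial, ["A", "B"]),
                         run_requests(initial, ["B", "A"])})

# The primary executed A then B and generated updates in that order.
updates = generate_updates(initial, ["A", "B"])
in_order = apply_updates(initial, updates)
reordered = apply_updates(initial, list(reversed(updates)))

print(valid_outcomes)  # [3, 4]
print(in_order)        # 3
print(reordered)       # 2
```

Reordering requests still yields a state that some legal execution could have produced (3 or 4). Reordering the updates yields 2, which no execution of A and B from the initial state can produce, so a recovering primary may not reorder them.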

Of course, it can be seen that a primary backup system is actually a special case of state machine replication. See Figure 1.

Consensus Algorithm and State Machine

So what is the relationship between the consensus algorithm and the state machine? [10] gives this explanation:
[Figure from [10]: replicated state machine architecture]
It is very clear:

The consensus algorithm manages a replicated log of client commands. The state machines process the same commands in the same order from the log, so they produce the same results.

The role of the state machine is to take the operations recorded in the log as input, so that the state machines on different machines end up in the same state; ensuring that the replicated logs are identical is the job of the consensus algorithm. On each server, the consensus module receives commands sent by clients and appends them to its log. It communicates with the consensus modules on the other servers to ensure that the log on every server eventually contains the same requests in the same order, which in turn ensures that the state machines stay consistent.
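The division of labor described above can be sketched as follows. This is a simplified illustration with assumed names (`Server`, `replicate`, `apply_committed`); in particular, `replicate` is only a stand-in that copies a command to every log and is not a consensus algorithm.

```python
class Server:
    def __init__(self):
        self.log = []          # replicated log of client commands
        self.last_applied = 0  # number of log entries already executed
        self.kv = {}           # the state machine: a key-value store

    def apply_committed(self, commit_index):
        """Feed committed log entries, in log order, to the state machine."""
        while self.last_applied < commit_index:
            op, key, value = self.log[self.last_applied]
            if op == "set":
                self.kv[key] = value
            elif op == "del":
                self.kv.pop(key, None)
            self.last_applied += 1


def replicate(servers, command):
    """Stand-in for the consensus module (NOT a real consensus algorithm):
    it appends the command at the same position in every server's log."""
    for server in servers:
        server.log.append(command)
    return len(servers[0].log)  # everything appended so far counts as committed


servers = [Server() for _ in range(3)]
commands = [("set", "x", 1), ("set", "y", 2), ("del", "x", None)]

for command in commands:
    commit_index = replicate(servers, command)
    for server in servers:
        server.apply_committed(commit_index)

states = [server.kv for server in servers]
print(states)  # [{'y': 2}, {'y': 2}, {'y': 2}]
```

The state machine never talks to other servers. It only consumes the local log, which is why identical logs are enough to give identical states.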

From my experience writing ChubbyGo, the state machine is essentially the set of data structures against which the committed log entries are executed.

It seems that one optimization direction for Raft may be to remove redundant log entries, similar to Redis's AOF rewrite: when a divergence point between logs is detected during leader-follower synchronization, fewer entries could be sent based on the state of the state machine. Log compaction could be done with the same kind of rewrite, so that fewer entries need to be sent.

The difference between sending logs

[Figure: comparison of what the two approaches send]
You can see that the biggest difference between them is that one sends the log entries directly, while the other sends them after evaluation. I think this makes a big difference in implementation. With the former, the state machine only holds the result of applying the agreed log and takes no part in the decision; with the latter, the state machine takes part in the decision, because the leader has the most authoritative log. It also follows that the former cannot implement nondeterministic instructions such as Random. Of course, we can also find the following problems with Primary backup in [1]:

  1. If an operation is not committed in the end, the state machine must roll back the operation
  2. If a new node becomes the primary, the state machine of the old primary must roll back its most recent uncommitted operations

This is similar to the ghost reappearance problem discussed in [9].
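The point about Random instructions can be shown with a short sketch. It assumes each replica has its own independent random source, modeled here by differently seeded `random.Random` instances; the function names are hypothetical.

```python
import random


def replicate_operation(replica_rngs):
    """Send the operation itself: every replica runs 'random' locally."""
    return [rng.random() for rng in replica_rngs]


def replicate_result(primary_rng, n_backups):
    """Send the evaluated result: the primary runs 'random' once,
    and the backups only store the value it produced."""
    value = primary_rng.random()
    return [value] * (1 + n_backups)


# Independent random sources, one per replica (the seeds are arbitrary).
rngs = [random.Random(seed) for seed in (1, 2, 3)]

executed_everywhere = replicate_operation(rngs)
evaluated_once = replicate_result(random.Random(1), n_backups=2)

print(len(set(executed_everywhere)) > 1)  # True: the replicas diverge
print(len(set(evaluated_once)) == 1)      # True: the replicas agree
```

Executing a nondeterministic operation on every replica breaks the "same input, same output" assumption, whereas shipping the primary's result sidesteps it.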

In fact, it can also be seen that Primary backup requires a stricter ordering guarantee; otherwise data may be lost. For example, Redis follows the Primary backup model. The problem there is that data may not yet have been synchronized to the backup when the Primary goes down, in which case data loss can occur. From the user's point of view, the operation clearly succeeded, yet its result is nowhere to be seen, because no rollback was performed. Of course, Redis master-slave replication does not give any consistency guarantee in the first place.
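The data loss scenario can be modeled with a toy sketch. It is not Redis code; it only assumes a primary that acknowledges a write before the write reaches the backup, and the names `Node`, `write`, and `flush` are hypothetical.

```python
class Node:
    def __init__(self):
        self.data = {}


def write(primary, pending, key, value):
    """Apply on the primary and acknowledge at once; replication is deferred."""
    primary.data[key] = value
    pending.append((key, value))
    return "OK"


def flush(pending, backup):
    """Deliver queued writes to the backup in the order they were generated."""
    while pending:
        key, value = pending.pop(0)
        backup.data[key] = value


primary, backup, pending = Node(), Node(), []

ack_a = write(primary, pending, "a", 1)
flush(pending, backup)                   # "a" reaches the backup
ack_b = write(primary, pending, "b", 2)  # the client is told "OK" ...

# ... but the primary crashes before "b" is sent. The queued write dies
# with it, and the backup is promoted with whatever it already has.
pending.clear()
new_primary = backup

print(ack_b)             # OK
print(new_primary.data)  # {'a': 1}
```

The client saw `OK` for `b`, yet the promoted node has no trace of it, which is exactly the acknowledged-but-lost write described above.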

References:

  1. Blog "state machine replication vs primary backup system"
  2. Wiki "Replication (computing)"
  3. Wiki "State machine replication"
  4. Blog "Zab vs. Paxos"
  5. StackOverflow "Relationship between primary-backup and state machine replication"
  6. Slides "KAU Replication State Machines via Primary-Backup"
  7. Blog "Notable details of the crowd agreement Paxos/ZAB/Raft/VR"
  8. Paper "Vive La Difference: Paxos vs. Viewstamped Replication vs. Zab"
  9. Blog "The Difference of Paxos, Raft, ZAB"
  10. Paper "In Search of an Understandable Consensus Algorithm (Extended Version)"

Origin blog.csdn.net/weixin_43705457/article/details/113994669