Rookie Series Fabric - Fabric 1.4 consensus mechanism (5)

Fabric consensus mechanisms

Since Fabric is a distributed system, it needs a consensus mechanism to guarantee that every node saves state to its ledger in the same order, thereby reaching consistency. In the current version, Fabric 1.4, there are three common mechanisms: Solo, Kafka, and etcdraft. Consensus transaction processing consists of three phases: the proposal phase, the packaging phase, and the validation phase.

1. Solo consensus mode

Solo consensus mode means the network contains only a single ordering node: all messages sent by Peer nodes are sorted and assembled into blocks by that one node. Because a single ordering node serves all Peers, Solo offers no high availability and no scalability, so it is not suitable for production environments; it is typically used for development and test environments.

The Solo-mode consensus flow works as follows (a toy sketch in Go follows the list):

  1. Peer nodes connect to the ordering service over gRPC and, once connected, send transaction messages.
  2. The ordering service listens for messages from Peer nodes through its Recv interface and assembles the received data into blocks.
  3. The ordering service writes each generated block to its ledger (Ledger) and returns the processing result.
  4. Peer nodes fetch the blocks generated by the ordering service through the Deliver interface.
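
To make the flow concrete, here is a toy sketch in Go. It is not Fabric's actual implementation: the Envelope and Block types and the channel-based queues below are simplified stand-ins for the real gRPC Broadcast/Deliver streams and the protobuf common.Envelope and common.Block messages.

```go
package main

import "fmt"

// Envelope and Block are simplified stand-ins for Fabric's
// protobuf types (common.Envelope, common.Block).
type Envelope struct{ Payload string }
type Block struct {
	Number int
	Txs    []Envelope
}

// soloOrder mimics the single ordering node: it receives envelopes
// in arrival order (the Broadcast side), packs each fixed-size batch
// into a block, appends the block to its local ledger, and hands
// finished blocks to subscribers (the Deliver side).
func soloOrder(in <-chan Envelope, deliver chan<- Block, batchSize int) {
	var ledger []Block // the orderer's local ledger
	var batch []Envelope
	for env := range in {
		batch = append(batch, env)
		if len(batch) == batchSize {
			blk := Block{Number: len(ledger), Txs: batch}
			ledger = append(ledger, blk) // step 3: write block to ledger
			deliver <- blk               // step 4: peers fetch via Deliver
			batch = nil
		}
	}
	close(deliver)
}

func main() {
	in := make(chan Envelope)
	deliver := make(chan Block)
	go soloOrder(in, deliver, 2)
	go func() {
		for _, p := range []string{"tx1", "tx2", "tx3", "tx4"} {
			in <- Envelope{Payload: p} // step 1: peers send transactions
		}
		close(in)
	}()
	for blk := range deliver {
		fmt.Printf("peer received block %d with %d txs\n", blk.Number, len(blk.Txs))
	}
}
```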

2. Kafka consensus mode

The core of the Kafka consensus mode in Hyperledger Fabric is, simply put, to have a Kafka cluster sort all transaction messages (if the system has multiple channels, each channel is sorted separately).
Kafka is a distributed stream-processing platform whose goal is to provide a unified, real-time data pipeline with high throughput and low latency.
The Kafka roles fall into the following categories:

  • Broker: a message-processing node. Its main tasks are to receive messages from producers, write them to the partition of the corresponding topic, and deliver the ordered messages of a topic to the subscribing consumers. A larger number of Broker nodes increases data throughput, and Brokers hold redundant copies of each other's partition data (similar to RAID).
  • Zookeeper: provides cluster-management and consensus services to the Brokers (via a Paxos-like algorithm), for example electing the leader node that processes messages and synchronizing the result to the follower nodes, removing failed nodes, adding new nodes, and pushing the latest network topology to all Brokers.
  • Producer: the message producer; applications send messages to the Brokers by calling the Producer API.
  • Consumer: the message consumer; applications subscribe to topics through the Consumer API and process the messages they receive.

Messages stored in Kafka are grouped into multiple topics, and each topic contains multiple partitions; a partition is a continuously appended sequence of messages, organized like a log directory. A topic can be subscribed to by multiple consumers. In short, a partition is a FIFO message pipe: producers write messages at one end and consumers take messages off the other end (note that taking a message does not remove it from the partition; it only advances the consumer's offset pointer).
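
The offset-pointer behaviour is easy to see with a small Go program built on sarama, the Kafka client library the Fabric Kafka orderer itself is built on. The broker address and topic name below are illustrative assumptions (Fabric uses one topic per channel).

```go
package main

import (
	"fmt"
	"log"

	"github.com/Shopify/sarama"
)

func main() {
	brokers := []string{"localhost:9092"} // assumed broker address
	topic := "mychannel"                  // assumed topic name

	// Producer side: append messages to the end of the partition.
	cfg := sarama.NewConfig()
	cfg.Producer.Return.Successes = true // required by SyncProducer
	producer, err := sarama.NewSyncProducer(brokers, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer producer.Close()
	for _, tx := range []string{"tx1", "tx2", "tx3"} {
		partition, offset, err := producer.SendMessage(&sarama.ProducerMessage{
			Topic: topic,
			Value: sarama.StringEncoder(tx),
		})
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("wrote %s to partition %d at offset %d\n", tx, partition, offset)
	}

	// Consumer side: read from the start of the partition. Consuming
	// only advances this consumer's offset; nothing is deleted.
	consumer, err := sarama.NewConsumer(brokers, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer consumer.Close()
	pc, err := consumer.ConsumePartition(topic, 0, sarama.OffsetOldest)
	if err != nil {
		log.Fatal(err)
	}
	defer pc.Close()
	for i := 0; i < 3; i++ {
		msg := <-pc.Messages()
		fmt.Printf("read offset %d: %s\n", msg.Offset, string(msg.Value))
	}
}
```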

The actual operating logic of Kafka in Hyperledger Fabric is as follows:

  1. Each chain (channel) has a corresponding partition.
  2. Each chain maps to a topic with a single partition.
  3. The ordering nodes relay the transactions of a particular chain (received through the Broadcast RPC) into the corresponding partition.
  4. Every ordering node can then read the partition back and obtain the same ordered list of transactions, agreed upon by all ordering nodes.
  5. Transactions on a chain are batched by time: when the first transaction of a new batch arrives, a timer starts.
  6. A batch is cut and a new block generated when either the maximum number of transactions is reached or the timer expires.
  7. The timeout itself is delivered as another transaction, a special time-to-cut message generated by the timer described above (see the sketch after this list).
  8. Each ordering node maintains a local log for every chain and stores the resulting blocks in its local ledger.
  9. Blocks are distributed to clients through the Deliver RPC.
  10. If a node crashes, clients can be served blocks by a different ordering node, because every ordering node maintains a local copy of the log.
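
Steps 5 through 7 — a batch timer whose expiry is realized as a special "time-to-cut" message posted back through the partition, so that every ordering node cuts the block at exactly the same point in the stream — can be sketched in plain Go. The msg type below is a simplified stand-in; Fabric's real orderer uses protobuf-encoded messages, including a dedicated time-to-cut type.

```go
package main

import (
	"fmt"
	"time"
)

// msg is a simplified partition message: either an ordinary transaction
// or a time-to-cut (TTC) marker carrying the block number it applies to.
type msg struct {
	ttc     bool
	blockNo int
	tx      string
}

func main() {
	partition := make(chan msg, 16) // stand-in for the channel's single Kafka partition
	const maxTxs = 3
	const timeout = 200 * time.Millisecond

	// Step 3: ordering nodes relay received transactions into the partition.
	go func() {
		for _, tx := range []string{"tx1", "tx2", "tx3", "tx4"} {
			partition <- msg{tx: tx}
		}
	}()

	var batch []string
	blockNo := 0
	var timer <-chan time.Time
	cut := func() {
		fmt.Printf("block %d: %v\n", blockNo, batch) // a new block is generated
		blockNo++
		batch, timer = nil, nil
	}

	for blockNo < 2 {
		select {
		case m := <-partition:
			if m.ttc {
				// Step 7: the timeout arrives through the partition itself,
				// so every ordering node cuts the same block at the same point.
				if m.blockNo == blockNo {
					cut()
				}
				continue
			}
			batch = append(batch, m.tx)
			if len(batch) == 1 {
				timer = time.After(timeout) // step 5: first tx of a batch starts the timer
			}
			if len(batch) >= maxTxs {
				cut() // step 6: maximum batch size reached
			}
		case <-timer:
			partition <- msg{ttc: true, blockNo: blockNo} // post a TTC rather than cutting locally
		}
	}
}
```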

3. Etcdraft consensus mode

Raft was introduced in v1.4.1; it is a crash fault tolerant (CFT) ordering service based on etcd. Raft follows a "leader and follower" model, where a leader is dynamically elected among the ordering nodes of a channel (this set of nodes is called the "consenter set") and the leader replicates messages to the follower nodes. Because the system can tolerate the loss of nodes, including the leader, as long as a majority of ordering nodes (the so-called "quorum") remains, Raft is said to be "crash fault tolerant" (CFT). In other words, if a channel has three nodes, it can tolerate the loss of one node (leaving two).

3.1 Raft concepts

  • Log entry: the primary unit of work in a Raft ordering service is a "log entry", and the complete sequence of these entries is called the "log". We consider the log consistent if a majority of members (a quorum, in other words) agree on the entries and their order.

  • Consenter set: the ordering nodes of a given channel that actively participate in the consensus mechanism and receive replicated logs for the channel. This can be all of the available nodes (either in a single cluster or in multiple clusters contributing to the system channel), or a subset of those nodes.

  • Finite-state machine (FSM): every ordering node in Raft has an FSM; collectively, they are used to ensure that the sequence of log entries on each ordering node is deterministic (written in the same order).

  • Quorum: the minimum number of consenters that must acknowledge a proposal before a transaction can be committed. For every consenter set this is a majority of the nodes. In a cluster of five nodes, three must be available for a quorum to exist. If a quorum of nodes is unavailable for any reason, the ordering service becomes unavailable for both read and write operations on the channel, and no new logs can be committed. (A small quorum calculation is sketched after this list.)

  • Leader: the leader is responsible for ingesting new log entries, replicating them to the follower ordering nodes, and managing when an entry is considered committed. This is not a special type of orderer; it is only a role that an orderer may hold at a given point in time, as circumstances determine, and then lose.

  • Follower: followers receive log entries from the leader and replicate them deterministically, ensuring that the logs remain consistent. Followers also receive "heartbeat" messages from the leader. If the leader stops sending these messages for a configurable amount of time, the followers initiate a leader election and one of them is elected the new leader.
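
The quorum arithmetic above is simply "a strict majority of the consenter set"; a small Go helper makes the failure-tolerance numbers explicit:

```go
package main

import "fmt"

// quorum returns the minimum number of consenters that must be
// available for a Raft consenter set of size n: a strict majority.
func quorum(n int) int {
	return n/2 + 1
}

func main() {
	for _, n := range []int{3, 5, 7} {
		q := quorum(n)
		fmt.Printf("%d consenters: quorum %d, tolerates %d failure(s)\n", n, q, n-q)
	}
}
```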

3.2 The Raft transaction process

Each channel runs a separate instance of the Raft protocol, which allows each instance to elect a different leader. This design also allows the service to be further decentralized in deployments where the cluster consists of ordering nodes controlled by different organizations. While all Raft nodes must be part of the system channel, they do not necessarily have to be part of every application channel. Channel creators (and channel administrators) can select a subset of the available orderers and add or remove orderers as needed (as long as only a single node is added or removed at a time).

In Raft, a transaction (or configuration update) is automatically routed by the ordering node that receives it to the current leader of that channel. This means that peers and applications do not need to know who the leader is; only the ordering nodes need to know, as the toy sketch below illustrates.
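A toy sketch of this routing, with a made-up ordererNode type (real orderers forward envelopes over the Raft cluster's gRPC links, not through an in-process pointer):

```go
package main

import "fmt"

// ordererNode is a toy Raft participant: a non-leader node forwards
// submitted transactions to the current leader instead of rejecting them.
type ordererNode struct {
	id     int
	leader *ordererNode // nil on the leader itself
}

func (o *ordererNode) submit(tx string) {
	if o.leader != nil {
		fmt.Printf("node %d: not leader, forwarding %q to node %d\n", o.id, tx, o.leader.id)
		o.leader.submit(tx)
		return
	}
	fmt.Printf("node %d (leader): ordering %q\n", o.id, tx)
}

func main() {
	leader := &ordererNode{id: 1}
	follower := &ordererNode{id: 2, leader: leader}
	follower.submit("tx1") // the client need not know who the leader is
}
```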

3.3 Raft log replication and leader election

A Raft node is always in one of three states: follower, candidate, or leader. All nodes start as followers. In this state, they accept log entries from a leader (if one has been elected) and cast votes for leaders. If no heartbeat or log entry arrives within a set timeout (for example, five seconds), a node promotes itself to the candidate state. In the candidate state, it requests votes from the other nodes; if it receives a quorum of votes, it is promoted to leader. The leader then accepts new log entries and replicates them to the followers. A toy sketch of the election timeout follows.
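
Here is a didactic Go sketch of the follower-to-candidate-to-leader transition. It is not etcd's raft implementation; the timeout values, vote counting, and single-node driver are greatly simplified assumptions.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

type state int

const (
	follower state = iota
	candidate
	leader
)

// node is a toy Raft participant: it becomes a candidate when no
// heartbeat arrives within its (randomized) election timeout, and is
// promoted to leader once it holds a quorum of votes.
func node(id int, heartbeats <-chan struct{}, votes, clusterSize int) {
	s := follower
	// Randomized timeouts reduce the chance of split votes in real Raft.
	timeout := time.Duration(150+rand.Intn(150)) * time.Millisecond
	for {
		switch s {
		case follower:
			select {
			case <-heartbeats:
				// Heartbeat received: remain a follower.
			case <-time.After(timeout):
				fmt.Printf("node %d: no heartbeat for %v, becoming candidate\n", id, timeout)
				s = candidate
			}
		case candidate:
			// In real Raft the node requests votes over RPC; here the
			// vote count is supplied by the caller for illustration.
			if votes >= clusterSize/2+1 {
				s = leader
			} else {
				s = follower
			}
		case leader:
			fmt.Printf("node %d: elected leader\n", id)
			return
		}
	}
}

func main() {
	heartbeats := make(chan struct{}) // nobody sends: the leader has failed
	node(1, heartbeats, 2, 3)         // 2 of 3 votes is a quorum
}
```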

Although all logs could be retained indefinitely, Raft uses a process called "snapshotting" to save disk space: users can define how many bytes of data to retain in the log before it is compacted, and each snapshot then covers a certain number of blocks.
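A minimal sketch of the snapshot trigger, assuming a byte-count threshold in the spirit of etcdraft's SnapshotIntervalSize option (the snapshotter type and threshold value below are made up for illustration):

```go
package main

import "fmt"

// snapshotter accumulates the size of appended blocks and reports when
// a snapshot should be taken, mirroring the idea behind a byte-count
// snapshot interval.
type snapshotter struct {
	threshold   uint64 // bytes of log data per snapshot
	accumulated uint64
}

func (s *snapshotter) append(blockBytes uint64) bool {
	s.accumulated += blockBytes
	if s.accumulated >= s.threshold {
		s.accumulated = 0 // the log up to this point is compacted into a snapshot
		return true
	}
	return false
}

func main() {
	s := &snapshotter{threshold: 1 << 20} // 1 MiB, an illustrative value
	for i, size := range []uint64{400_000, 400_000, 400_000} {
		if s.append(size) {
			fmt.Printf("snapshot taken after block %d\n", i)
		}
	}
}
```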

Origin: www.cnblogs.com/jiliguo/p/11367623.html