Gossip protocol

1 What is the Gossip Protocol

The Gossip protocol is also called the epidemic protocol. It was originally used to synchronize data between nodes in distributed databases, and has since been widely used for database replication, information dissemination, cluster membership confirmation, failure detection, and so on.

The name describes the idea: picture how gossip (or an epidemic) spreads from person to person, and the working principle of the gossip protocol is much the same. The protocol spreads information through the whole network by random exchanges between nodes, and brings all nodes in the system to a consistent state within a certain period of time. Gossip is a decentralized distributed protocol that solves two problems: propagating state through the cluster and guaranteeing that the state converges.

2 Gossip principle

There are two ways of disseminating messages in the Gossip protocol: anti-entropy propagation and rumor propagation.

  1. Anti-entropy propagation

    All data is propagated with a fixed probability. Each participating node is in one of two states: Susceptible (has not yet received the data) or Infective (has the data and is spreading it). A seed node shares all of its data with other nodes in order to eliminate any inconsistency between them, which guarantees eventual, complete consistency. The disadvantage is that the number of messages is very large and unbounded, so this mode is usually used only to initialize the data of newly joined nodes.

  2. Rumor propagation

    Only newly arrived data is propagated, with a fixed probability. Each participating node is in one of three states: Susceptible (has not yet received the update), Infective (is spreading the update), or Removed (has the update but no longer spreads it). A message contains only the latest updates, and a rumor is marked as removed after a certain point in time and is no longer propagated. The disadvantage is that there is some probability that the system stays inconsistent; this mode is usually used for incremental data synchronization between nodes.
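The difference between the two modes can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `Node` class, the `hot` table, and the `rumor_ttl` round counter (standing in for "a certain point in time") are assumptions made for the example.

```python
class Node:
    """Hypothetical node holding versioned data."""

    def __init__(self):
        self.store = {}  # key -> (value, version)
        self.hot = {}    # key -> rounds left to keep spreading (Infective)

    def merge(self, updates, rumor_ttl=0):
        """Apply entries newer than the local copy."""
        for key, (value, version) in updates.items():
            if key not in self.store or self.store[key][1] < version:
                self.store[key] = (value, version)
                if rumor_ttl > 0:
                    # A fresh rumor makes the receiver Infective too.
                    self.hot[key] = rumor_ttl

    def write(self, key, value, rumor_ttl):
        """Local update: bump the version and start a rumor."""
        version = self.store.get(key, (None, 0))[1] + 1
        self.store[key] = (value, version)
        self.hot[key] = rumor_ttl


def anti_entropy(sender, receiver):
    """Send the full store; the message grows with the data set."""
    receiver.merge(dict(sender.store))
    return len(sender.store)  # number of entries sent


def rumor_round(sender, receiver, rumor_ttl):
    """Send only hot updates, then age them; expired ones become Removed."""
    message = {key: sender.store[key] for key in sender.hot}
    receiver.merge(message, rumor_ttl)
    for key in list(sender.hot):
        sender.hot[key] -= 1
        if sender.hot[key] <= 0:
            del sender.hot[key]
    return len(message)
```

In this sketch a node that is never contacted while a rumor is still hot simply misses the update, which is the inconsistency risk described above; a later `anti_entropy` exchange repairs it at the cost of sending everything.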

3 Communication methods

The ultimate goal of the Gossip protocol is to distribute data to every node in the network. Depending on the application scenario, two nodes in the network can communicate in one of three modes: push mode, pull mode, and push/pull mode.

  1. Push

    Node A pushes its data (key, value, version) to Node B, and Node B updates every entry for which A's version is newer than its own.

  2. Pull

    Node A sends only a digest of its data (key, version) to Node B. Node B replies with the local entries (key, value, version) that are newer than A's, and A updates its local data.

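  3. Push/Pull

    Works like Pull, with one extra step: A then pushes the local entries that are newer than B's to B, and B updates its local data.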
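The exchange modes can be sketched as follows. This is only an illustration under stated assumptions: each store is a plain dict mapping key to (value, version), versions start at 1, and the helper names (`digest`, `newer_entries`, `apply_newer`) are invented for the example. In `push_pull`, B's digest is assumed to travel back with B's reply.

```python
def digest(store):
    """Key -> version summary, without the values."""
    return {key: version for key, (_, version) in store.items()}


def newer_entries(source, peer_digest):
    """Entries in `source` that are newer than what the peer reports."""
    return {key: (value, version)
            for key, (value, version) in source.items()
            if version > peer_digest.get(key, 0)}


def apply_newer(store, entries):
    """Keep an incoming entry only if it is newer than the local one."""
    for key, (value, version) in entries.items():
        if version > store.get(key, (None, 0))[1]:
            store[key] = (value, version)


def push(a, b):
    # A sends (key, value, version) for everything; B keeps what is newer.
    apply_newer(b, a)


def pull(a, b):
    # A sends only (key, version); B replies with entries newer than A's.
    apply_newer(a, newer_entries(b, digest(a)))


def push_pull(a, b):
    # Pull first, then A pushes back whatever is newer than B's copy.
    a_digest, b_digest = digest(a), digest(b)
    apply_newer(a, newer_entries(b, a_digest))
    apply_newer(b, newer_entries(a, b_digest))
```

After `push` only B is up to date, after `pull` only A is, and after `push_pull` both stores hold the newest version of every key, which is why the combined mode converges in the fewest exchanges.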

4 Advantages and disadvantages

In summary, Gossip is a decentralized distributed protocol in which data spreads from node to node like a virus. Because the number of informed nodes grows exponentially, overall propagation is very fast.
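A small simulation makes the exponential growth concrete. It is a rough sketch under simplifying assumptions: rounds are synchronous, every informed node pushes to `fanout` peers chosen uniformly at random (possibly itself or an already informed node), and no messages are lost. The function name and parameters are made up for this example.

```python
import random


def rounds_to_spread(n, fanout=1, seed=0):
    """Count the rounds until all n nodes have received the message."""
    rng = random.Random(seed)
    informed = {0}  # node 0 starts with the message
    rounds = 0
    while len(informed) < n:
        targets = set()
        for _ in informed:
            for _ in range(fanout):
                targets.add(rng.randrange(n))
        informed |= targets
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, rounds_to_spread(n))
```

With `fanout=1` the informed set can at most double per round, so at least log2(n) rounds are needed; random picks waste some messages, so the observed count is somewhat higher, but it still grows roughly with the logarithm of the cluster size rather than with the size itself.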

  1. Advantages

    • Scalability: Nodes can be added or removed freely, and the state of a newly added node eventually becomes consistent with the other nodes.

    • Fault tolerance: The crash or restart of any node does not affect the propagation of Gossip messages, so the protocol has the natural fault tolerance of a distributed system.

    • Decentralization: No central node is required and all nodes are equal. No node needs to know the state of the entire network; as long as the network is connected, any node can spread a message to the whole network.

    • Eventual consistency: The Gossip protocol disseminates information exponentially fast, so new information quickly reaches every node, and all nodes hold the latest data within a limited time.

  2. Disadvantages

    • Message delay: Each node sends a message to only a few randomly chosen nodes, and the message reaches the entire network only after several rounds of dissemination, which inevitably introduces delay.

    • Message redundancy: Each node periodically picks random peers and sends them messages, and the nodes that receive a message repeat this step. The same node therefore inevitably receives the same message several times, which increases the message-processing load.
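Both costs can be measured in a toy simulation. The sketch below assumes synchronous rounds, a uniform random choice of peers, and no message loss; `gossip_cost` and its parameters are hypothetical names chosen for the example. A message counts as a duplicate when its target already has the data.

```python
import random


def gossip_cost(n, fanout=2, seed=0):
    """Return (rounds, messages, duplicates) for one full dissemination."""
    rng = random.Random(seed)
    informed = {0}
    rounds = messages = duplicates = 0
    while len(informed) < n:
        newly_informed = set()
        for _ in informed:
            for _ in range(fanout):
                peer = rng.randrange(n)
                messages += 1
                if peer in informed or peer in newly_informed:
                    duplicates += 1  # the peer already had the message
                else:
                    newly_informed.add(peer)
        informed |= newly_informed
        rounds += 1
    return rounds, messages, duplicates


if __name__ == "__main__":
    rounds, messages, duplicates = gossip_cost(500)
    print(f"rounds={rounds} messages={messages} duplicates={duplicates}")
```

Only n - 1 of the messages actually inform a new node; everything beyond that is redundancy, and `rounds` is the delay paid for not broadcasting directly.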

Given these advantages and disadvantages, the Gossip protocol is well suited to data consistency in AP scenarios (availability and partition tolerance, in CAP terms). Common applications include P2P network communication, Redis Cluster, and Consul.

Origin blog.csdn.net/agonie201218/article/details/129944150