Solana leader rotation

Translated from the original Solana documentation: https://docs.solana.com/cluster/leader-rotation

Leader rotation

At any given moment, the cluster expects only one validator to produce ledger entries. With only one leader at a time, all validators can replay identical copies of the ledger. The drawback of a single leader, however, is that a malicious leader can censor votes and transactions. Since censorship cannot be distinguished from network packet loss, the cluster cannot simply elect a single node to hold the leader role indefinitely. Instead, the cluster minimizes the influence of a malicious leader by rotating which node takes the lead.

Each validator selects the expected leader using the same algorithm, described below. When a validator receives a new signed ledger entry, it can be certain that the entry was produced by the expected leader. The assignment of a leader to each slot, ordered by slot, is called the leader schedule.
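
To make this concrete, below is a minimal sketch of how a validator might check a received entry against the leader schedule. The struct layout and field names are illustrative assumptions, not Solana's actual types.

```rust
/// A leader schedule: for each slot, the identity (public key) of the
/// validator expected to produce entries in that slot. Illustrative only.
struct LeaderSchedule {
    slot_leaders: Vec<[u8; 32]>, // one identity per slot, in slot order
}

impl LeaderSchedule {
    /// Every validator runs the same deterministic computation, so they all
    /// agree on which identity owns a given slot. (The modulo is a
    /// simplification; a real schedule covers exactly one epoch.)
    fn leader_at(&self, slot: u64) -> [u8; 32] {
        self.slot_leaders[(slot as usize) % self.slot_leaders.len()]
    }
}

/// A signed ledger entry as seen by a validator (illustrative).
struct SignedEntry {
    slot: u64,
    signer: [u8; 32],
    // ... entry payload and signature would go here
}

/// Reject entries whose signer is not the scheduled leader for that slot.
fn is_from_expected_leader(schedule: &LeaderSchedule, entry: &SignedEntry) -> bool {
    schedule.leader_at(entry.slot) == entry.signer
}
```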

Leader schedule rotation

Validators reject blocks that are not signed by the slot leader. The list of identities of all slot leaders is called the leader schedule. The leader schedule is recomputed locally and periodically. It assigns slot leaders for a span of time called an epoch. The schedule must be computed far in advance of the slots it assigns, to ensure that the ledger state it is calculated from is deterministic. That duration is called the leader schedule offset. Solana sets the offset to the duration of slots until the next epoch. In other words, the leader schedule for an epoch is calculated from the ledger state at the start of the previous epoch. An offset of one epoch is fairly arbitrary and is assumed to be long enough that all validators will have finalized their ledger state before the next schedule is generated. A cluster may choose to shorten the offset to reduce the time between stake changes and leader schedule updates.
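
A small sketch of this offset rule, assuming fixed-length epochs; `slots_per_epoch` and the function names are hypothetical, not part of Solana's API.

```rust
/// Which epoch a slot belongs to, assuming fixed-length epochs.
fn epoch_of(slot: u64, slots_per_epoch: u64) -> u64 {
    slot / slots_per_epoch
}

/// With an offset of one epoch, the schedule used during `epoch` is derived
/// from the ledger (stake) state captured at the start of the previous
/// epoch. In practice the first rooted block at or after this slot is used,
/// since the slot itself may have been skipped.
fn schedule_source_slot(epoch: u64, slots_per_epoch: u64) -> u64 {
    epoch.saturating_sub(1) * slots_per_epoch
}
```

For example, with 100-slot epochs, `schedule_source_slot(2, 100)` is 100: the schedule used during slots 200 through 299 derives from the ledger state at the start of epoch 1.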

As long as the cluster runs without partitioning for longer than an epoch, a schedule only needs to be generated when the root fork crosses an epoch boundary. Since the schedule is for the next epoch, any new stakes committed to the root fork will not be active until the next epoch. The block used to generate the leader schedule is the first block that crosses the epoch boundary.

If no partition lasts longer than one epoch, the cluster will work as follows:

  • Validators continuously update their root fork as they vote.
  • A validator updates its leader schedule each time its root fork's slot height crosses an epoch boundary.
For example:

Assume an epoch duration of 100 slots (in practice it is orders of magnitude larger). The root fork is updated from a fork computed at slot height 99 to a fork computed at slot height 102. Forks at slot heights 100 and 101 were skipped because of failures. The new leader schedule is computed using the fork at slot height 102. It is active from slot 200 until it is updated again.

There is no inconsistency, because every validator that is voting with the cluster skips slots 100 and 101 when its root passes 102. All validators, regardless of voting pattern, commit to a root of 102 or a descendant of 102.
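
The following sketch works through the example above in code. The 100-slot epoch and the function names are assumptions made only for illustration.

```rust
const SLOTS_PER_EPOCH: u64 = 100; // example value from the text

fn epoch_of(slot: u64) -> u64 {
    slot / SLOTS_PER_EPOCH
}

/// Returns the epoch whose schedule should be (re)generated when the root
/// moves from `old_root` to `new_root`, if the move crossed an epoch boundary.
fn schedule_update_on_root_change(old_root: u64, new_root: u64) -> Option<u64> {
    if epoch_of(new_root) > epoch_of(old_root) {
        // The bank at `new_root` (the first rooted block past the boundary)
        // is used to compute the schedule for the epoch after the one that
        // `new_root` lands in.
        Some(epoch_of(new_root) + 1)
    } else {
        None
    }
}

fn main() {
    // Root moves from slot 99 to slot 102 (slots 100 and 101 were skipped).
    // The fork at slot 102 is used to compute the schedule for epoch 2,
    // which becomes active at slot 200.
    assert_eq!(schedule_update_on_root_change(99, 102), Some(2));
    assert_eq!(2 * SLOTS_PER_EPOCH, 200); // first slot of epoch 2
}
```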

Leader schedule rotation with epoch-sized partitions

The duration of the leader schedule offset is directly related to the likelihood that the cluster has an inconsistent view of the correct leader schedule.

Consider the following scenario:

Two partitions, each generating half of the blocks. Neither comes to a definitive supermajority fork. Both cross epoch boundaries 100 and 200 without actually committing a root, and therefore without a cluster-wide commitment to a new leader schedule.

In this unstable scenario, multiple valid leader schedules exist.

  • A leader schedule is generated for every fork whose direct parent is in the previous epoch.
  • The schedule is valid for that fork's descendants after the start of the next epoch, until it is updated.

Each partition's schedule will diverge once the partition lasts longer than an epoch. For this reason, the epoch duration should be chosen to be much larger than the slot time and the expected length of a fork before it is committed to root.

After observing the cluster for a sufficient amount of time, the leader schedule offset can be selected based on the median partition duration and its standard deviation. For example, an offset longer than the median partition duration plus six standard deviations would reduce the likelihood of an inconsistent leader schedule in the cluster to roughly one in a million.
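
A back-of-the-envelope sketch of that selection rule follows; the partition durations used here are made-up sample inputs.

```rust
fn median(sorted: &[f64]) -> f64 {
    let n = sorted.len();
    if n % 2 == 1 {
        sorted[n / 2]
    } else {
        (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0
    }
}

fn std_dev(xs: &[f64]) -> f64 {
    let mean = xs.iter().sum::<f64>() / xs.len() as f64;
    let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / xs.len() as f64;
    var.sqrt()
}

fn main() {
    // Hypothetical observed partition durations, in slots.
    let mut durations = vec![12.0, 30.0, 8.0, 45.0, 20.0, 15.0, 25.0];
    durations.sort_by(|a, b| a.partial_cmp(b).unwrap());

    // Offset rule from the text: median plus six standard deviations.
    let offset = median(&durations) + 6.0 * std_dev(&durations);
    println!("choose a leader schedule offset of at least {offset:.0} slots");
}
```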

Leader schedule generation at genesis

The genesis configuration declares the first leader for the first epoch. This leader ends up scheduled for the first two epochs, because the leader schedule for the next epoch is also generated at slot 0. The length of the first two epochs can be specified in the genesis configuration as well. The minimum length of the first epochs must be greater than or equal to the maximum rollback depth defined in Tower BFT.
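
A sketch of that genesis-time constraint is below. The struct and field names are illustrative, not Solana's actual genesis format, and the rollback depth is left as a parameter since its value is defined by Tower BFT, not here.

```rust
/// Illustrative genesis configuration fields relevant to this section.
struct GenesisConfig {
    first_leader: [u8; 32],  // leader for the first (and effectively second) epoch
    first_epoch_slots: u64,  // length of the first epoch, in slots
    max_rollback_depth: u64, // maximum rollback depth defined by Tower BFT
}

/// The first epoch must be at least as long as the maximum rollback depth.
fn validate_genesis(config: &GenesisConfig) -> Result<(), String> {
    if config.first_epoch_slots < config.max_rollback_depth {
        return Err(format!(
            "first epoch ({} slots) must be >= max rollback depth ({})",
            config.first_epoch_slots, config.max_rollback_depth
        ));
    }
    Ok(())
}
```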

Leader schedule generation algorithm

The leader schedule is generated using a predefined seed. The process is as follows (a sketch in code follows the list):

  1. Periodically use the PoH tick height (a monotonically increasing counter) to seed a stable pseudo-random algorithm.
  2. At that height, sample the bank for all staked accounts with leader identities that have voted within a cluster-configured number of ticks. This sample is called the active set.
  3. Sort the active set by stake weight.
  4. Use the random seed to select nodes weighted by stake, producing a stake-weighted ordering.
  5. This ordering becomes valid after a cluster-configured number of ticks.
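
The sketch below walks through these five steps under simplifying assumptions: the stake data layout, the splitmix64 generator, and the function signature are stand-ins chosen for illustration, not Solana's actual implementation.

```rust
#[derive(Clone)]
struct StakedNode {
    identity: [u8; 32],
    stake: u64,
    last_vote_tick: u64,
}

/// Step 1: a stable pseudo-random generator seeded from the PoH tick height.
/// splitmix64 is used here only as a simple deterministic stand-in.
struct Prng(u64);
impl Prng {
    fn next(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }
}

fn generate_schedule(
    nodes: &[StakedNode],
    poh_tick_height: u64,     // seed (step 1)
    active_window_ticks: u64, // cluster-configured vote recency window (step 2)
    slots: usize,             // number of slots to fill
) -> Vec<[u8; 32]> {
    // Step 2: the active set = staked nodes that voted recently enough.
    let mut active: Vec<StakedNode> = nodes
        .iter()
        .filter(|n| poh_tick_height.saturating_sub(n.last_vote_tick) <= active_window_ticks)
        .cloned()
        .collect();
    assert!(!active.is_empty(), "no active staked nodes");

    // Step 3: sort by stake weight, with identity as a tie-breaker so every
    // validator produces the identical ordering.
    active.sort_by(|a, b| b.stake.cmp(&a.stake).then(a.identity.cmp(&b.identity)));

    // Step 4: stake-weighted selection of a leader for each slot.
    let total_stake: u64 = active.iter().map(|n| n.stake).sum();
    let mut rng = Prng(poh_tick_height);
    let mut schedule = Vec::with_capacity(slots);
    for _ in 0..slots {
        let mut target = rng.next() % total_stake.max(1);
        let mut chosen = active[0].identity;
        for n in &active {
            if target < n.stake {
                chosen = n.identity;
                break;
            }
            target -= n.stake;
        }
        schedule.push(chosen);
    }
    // Step 5: callers would treat this schedule as valid only after the
    // cluster-configured number of ticks has elapsed.
    schedule
}
```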

Schedule attack vectors

Seed

The seed that is selected is predictable but unbiasable; there is no grinding attack that can influence its outcome.

Active set

A leader can bias the active set by censoring validator votes. There are two possible ways for a leader to censor the active set:

  • Ignore votes from validators
  • Refuse to vote for blocks that contain votes from validators

To reduce the likelihood of censorship, the active set is calculated over an active set sampling duration at the leader schedule offset boundary. The active set sampling duration is long enough that votes will have been collected by multiple leaders.

Stake

A leader can censor new staking transactions or refuse to validate blocks with new stakes. This attack is similar to censorship of validator votes.

Validator operational key loss

Leaders and validators should operate using ephemeral keys, and stake owners authorize validators to use their stake through delegation.

The cluster should be able to recover from the loss of all ephemeral keys used by the leader and validators, which may occur through a common software vulnerability shared by all nodes. Even if stake is currently delegated to a validator, stake owners should be able to vote directly by co-signing the validator vote.

Appending entries

The lifetime of a leader schedule is called an epoch. The epoch is split into slots, where each slot has a duration of T PoH ticks.

A leader transmits entries during its slot. After T ticks, all validators switch to the next scheduled leader. Validators must ignore entries sent outside a leader's assigned slot.
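
A minimal sketch of the slot-boundary rule, assuming a fixed T ticks per slot; the constant value and the names are illustrative.

```rust
const TICKS_PER_SLOT: u64 = 64; // "T" in the text; value chosen for illustration

fn slot_for_tick(tick_height: u64) -> u64 {
    tick_height / TICKS_PER_SLOT
}

/// Accept an entry only if its sender is the scheduled leader for the slot
/// that the entry's tick height falls into; otherwise ignore it.
fn accept_entry(
    schedule: &[[u8; 32]], // leader identity per slot
    sender: [u8; 32],
    entry_tick_height: u64,
) -> bool {
    let slot = slot_for_tick(entry_tick_height) as usize;
    slot < schedule.len() && schedule[slot] == sender
}
```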

All of the previous leader's ticks must be observed by the next leader in order for it to build its own entries on them. If those entries are not observed (the leader is down) or are invalid (the leader is buggy or malicious), the next leader must produce ticks to fill the previous leader's portion of the slot. Note that the next leader should perform repair requests in parallel and postpone sending its ticks until it is confident that other validators also failed to observe the previous leader's entries. If a leader incorrectly builds on its own ticks, the leader following it must replace all of those ticks.
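
Below is a sketch of the tick-filling step: if the previous leader's entries never arrive, the next leader produces empty ticks covering that slot before appending its own entries (after first attempting repair, as noted above). The Entry representation and names are illustrative.

```rust
const TICKS_PER_SLOT: u64 = 64; // illustrative value for "T"

enum Entry {
    Tick { tick_height: u64 },
    Transactions { tick_height: u64 /* , payload ... */ },
}

/// Produce empty ticks from `observed_tick_height` (the last tick actually
/// seen on the ledger) up to the start of this leader's own slot, filling the
/// previous leader's portion of the tick stream.
fn fill_missing_ticks(observed_tick_height: u64, my_slot: u64) -> Vec<Entry> {
    let my_slot_start = my_slot * TICKS_PER_SLOT;
    (observed_tick_height + 1..my_slot_start)
        .map(|tick_height| Entry::Tick { tick_height })
        .collect()
}
```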
