Introduction to FastCFS binlog mechanism

FastCFS adopts the classic Master/Slave structure and data synchronous replication. If the slave is online, the master calls the slave synchronously; otherwise, the slave will enter the data recovery phase. After catching up with the latest progress of the master, the slave will switch to the online state, and then the master will synchronously replicate the data to the slave.

FastCFS uses binlog to record data change operations. Binlog does not record the contents of files that have been changed (such as writing). Binlog is equivalent to a data index, which is very concise. There are two main purposes of binlog in FastCFS: 1. To realize persistent storage of data index, load data index by replaying binlog when the program starts; 2. Pull missing binlog from master during slave data recovery, and then perform data recovery based on the binlog. Let’s intuitively experience the binlog fragment shown below:

The first column is the change time (unix timestamp), and the second column is the data version number. To ensure strong data consistency in a distributed system with multiple data copies, the order of change operations must be guaranteed. Using a monotonically increasing data version number is an effective way to strictly guarantee the order of changes, which is the answer to the suspense left at the end of the previous article.

FastCFS supports automatic failover when the master fails. In extreme cases, after the master switch occurs, the binlog data on the original master and the new master may be inconsistent, and the inconsistent binlog data will only be at the end of the binlog file. We only need to Verify the last N (such as 3) binlogs. If the last N binlogs of the slave fail to reconcile with the master, the slave will quit running, and the slave needs to be manually repaired (usually only the last binlog needs to be deleted) before the slave can start normally.

The FastStore module has two sets of binlogs: replica binlog for slave data recovery and slice binlog for data indexing . For an update operation, write the slice binlog first , and then write the replica binlog . One update operation corresponds to two binlog writes. How to ensure the transactionality of the two writes? FastCFS does not allow failure to write binlog, only when the program terminates abnormally ( such as power failure, program hangs), the two may be inconsistent. Therefore, no verification is performed when writing to the binlog. When the program starts, the two binlogs are reconciled, and the redundant binlog records at the tail can be removed.

In FastCFS , the master synchronously calls the slave before writing the binlog . Why not choose to write the binlog before calling the slave ? The timing of master writing to binlog is particular, and this problem is left to everyone.

Finally, to sum up, binlog in FastCFS is an important mechanism to ensure data consistency. It is very important for binlog to achieve consistency. FastCFS uses the binlog reconciliation mechanism.

This article is shared from the WeChat public account - FastDFS Sharing and Exchange (fastdfs100).
If there is any infringement, please contact [email protected] to delete it.
This article participates in the " OSC Yuanchuang Project ", you are welcome to join and share with us.

Introduction to FastCFS binlog mechanism

Guess you like