The basic structure of the FastCFS data consistency model

Recently I have been busy with the research and development of FastCFS v1.2.0, in which I mainly improved the data recovery and master appointment mechanisms and fixed five stability bugs; the reliability and stability of FastCFS have reached a new level. Since the improvements in v1.2.0 are closely related to data consistency, this article introduces the data consistency model and infrastructure adopted by FastCFS.


When it comes to data consistency, most people think of the CAP theorem: a distributed system cannot fully satisfy consistency (C), availability (A), and partition tolerance (P) at the same time; it can guarantee at most two of them, namely CA, AP, or CP.


The goal of FastCFS is to support running databases, so ensuring data consistency is a basic requirement; ensuring availability is likewise a basic requirement of any distributed system. FastCFS therefore chooses to fully implement C and A while weakening, but not giving up, P. A severe network partition can destroy both data consistency and system availability, and handling the resulting abnormal data often requires human intervention. FastCFS provides data checksum and split-brain repair mechanisms, and it can completely self-heal as long as a short-lived network partition does not cause data inconsistency. The leader and master in FastCFS do not depend on a central node; instead, each server group is self-consistent on its own. In other words, FastCFS adopts a localized, divide-and-conquer approach in its architecture and implementation, which minimizes the risk posed by network partitions.


FastCFS adopts a master/slave structure for data groups and a leader/follower structure for server groups. Careful readers may wonder: FastCFS has both a leader role and a master role; wouldn't one of them be enough? Both are well-known concepts, and combining a leader with masters surely exists elsewhere in the industry, but having the two roles coexist within one set of servers and implementing them natively may well be original to FastCFS.
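To make the two role layers concrete, here is a minimal C sketch of how the two groupings can be modeled. This is an illustration only, not FastCFS source code; all names such as `server_group_t` and `data_group_t` are hypothetical.

```c
/* A compile-and-run sketch (hypothetical names, not FastCFS source) of the
 * two role layers: leader/follower at the server-group level, and
 * master/slave at the data-group level. */
#include <stdio.h>

#define MAX_SERVERS_PER_GROUP 8

typedef enum { ROLE_FOLLOWER, ROLE_LEADER } server_role_t; /* per server      */
typedef enum { ROLE_SLAVE, ROLE_MASTER } data_role_t;      /* per group member */

typedef struct {
    int           server_id;
    server_role_t role;   /* decided by election within the server group */
} server_t;

typedef struct {
    int data_group_id;
    int master_server_id; /* decided by appointment from the group's leader */
} data_group_t;

typedef struct {
    server_t      servers[MAX_SERVERS_PER_GROUP];
    int           server_count;
    data_group_t *data_groups;   /* many data groups share one server group */
    int           data_group_count;
} server_group_t;

int main(void)
{
    data_group_t groups[2] = { {0, 101}, {1, 102} };
    server_group_t sg = {
        .servers = { {101, ROLE_LEADER}, {102, ROLE_FOLLOWER} },
        .server_count = 2,
        .data_groups = groups,
        .data_group_count = 2
    };
    printf("server group has %d servers and %d data groups\n",
           sg.server_count, sg.data_group_count);
    return 0;
}
```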


FastCFS adopts a strong data consistency model. A client's update operations can be performed only on the master, which then synchronizes each update to the slaves in the ACTIVE state (only slaves in this state provide online service) through RPC calls. If a slave disconnects because of an abnormality such as a service restart or severe network jitter, it enters the data recovery phase and can switch back to the ACTIVE state only after catching up with the master's data.
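The following is a minimal, compilable sketch of this write path under a simplified slave state machine; `rpc_replicate`, `slave_t`, and the version counter are hypothetical stand-ins for FastCFS's actual RPC and replication machinery.

```c
/* A minimal sketch (assumptions, not FastCFS source) of the write path:
 * the master applies a client update, then replicates it to every slave
 * that is currently ACTIVE. Slaves that are offline or recovering are
 * skipped; they must catch up through data recovery before re-entering
 * the ACTIVE state. */
#include <stdio.h>
#include <stdbool.h>

typedef enum {
    SLAVE_OFFLINE,    /* disconnected (restart, severe network jitter) */
    SLAVE_RECOVERING, /* catching up on missed updates from the master */
    SLAVE_ACTIVE      /* fully caught up; provides online service */
} slave_state_t;

typedef struct {
    int           id;
    slave_state_t state;
    long          applied_version; /* last update version applied */
} slave_t;

/* Hypothetical RPC stub: in a real system this would send the update
 * over the network and wait for the slave's acknowledgement. */
static bool rpc_replicate(slave_t *slave, long version)
{
    slave->applied_version = version;
    printf("replicated version %ld to slave %d\n", version, slave->id);
    return true;
}

/* The master is the only node that accepts client updates. */
static void master_apply_update(slave_t *slaves, int n, long version)
{
    for (int i = 0; i < n; i++) {
        if (slaves[i].state == SLAVE_ACTIVE) {
            rpc_replicate(&slaves[i], version);
        }
        /* OFFLINE / RECOVERING slaves catch up later via data recovery */
    }
}

int main(void)
{
    slave_t slaves[] = {
        { .id = 1, .state = SLAVE_ACTIVE,     .applied_version = 41 },
        { .id = 2, .state = SLAVE_RECOVERING, .applied_version = 17 },
    };
    master_apply_update(slaves, 2, 42); /* only slave 1 receives it */
    return 0;
}
```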


So what is the purpose of introducing leader/follower? A group of servers usually hosts many data groups (a larger number of data groups, such as 1024, is pre-allocated to facilitate cluster expansion; configuring no fewer than 64 data groups per server group is recommended). If each data group ran its own master election, the cost would be too high. FastCFS therefore innovatively introduces the leader role, and the leader directly appoints the master of each data group under its jurisdiction. To summarize the generation mechanism in one sentence: the leader is elected by the servers in the group, and the masters are appointed directly by the leader, as the sketch below illustrates.
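Here is a short sketch of the appointment step. The round-robin placement is only an assumption for illustration (FastCFS's actual placement policy may differ), and all names are hypothetical.

```c
/* A minimal sketch (hypothetical, not FastCFS source) of the appointment
 * step: once a leader has been elected for the server group, it assigns
 * a master to each data group it governs, with no per-group election. */
#include <stdio.h>

#define SERVER_COUNT     3
#define DATA_GROUP_COUNT 8   /* real deployments pre-allocate many more */

int main(void)
{
    int servers[SERVER_COUNT] = { 101, 102, 103 };
    int leader = servers[0];  /* assume this server won the group election */
    int master_of[DATA_GROUP_COUNT];

    /* Round-robin here only illustrates spreading masters across the
     * group's servers so no single server carries every master role. */
    for (int dg = 0; dg < DATA_GROUP_COUNT; dg++) {
        master_of[dg] = servers[dg % SERVER_COUNT];
        printf("leader %d appoints server %d as master of data group %d\n",
               leader, master_of[dg], dg);
    }
    return 0;
}
```

This is also why a single election per server group scales: one election decides the leader, and the leader's appointments cover all of the group's data groups in one pass.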


Finally, to summarize: this article introduced the data consistency model and infrastructure used by FastCFS, including the leader/follower and master/slave structures. Subsequent articles will describe the key points and core solutions of how FastCFS ensures data consistency, so stay tuned.

This article is shared from the WeChat public account - FastDFS Sharing and Exchange (fastdfs100).