FastDFS: working mechanism and analysis of advantages and disadvantages

1. Introduction to FastDFS

FastDFS is an open-source, lightweight distributed file system written by Yu Qing, an architect at Yidao Yongche (易到用车), who started the project while at Alibaba. It is implemented in pure C and runs on Unix-like systems such as Linux, FreeBSD, and AIX. FastDFS is tailor-made for Internet applications and pursues high performance and high scalability, making it well suited as an application-level distributed file storage service. Files can only be accessed through its proprietary API; it does not support the POSIX interface, so it is less general-purpose.

In terms of language support, APIs are currently provided for C, Java, PHP, and .NET.

It is currently used by companies such as JD.com, Taobao, 58.com, UC, and 51CTO.

2. Uses of FastDFS

1) FastDFS mainly solves the problems of large-capacity file storage and high-concurrency access, and achieves load balancing during file access.

2) FastDFS implements a software RAID scheme: files can be stored on inexpensive IDE hard drives, and storage servers can be added online to expand capacity.

3) FastDFS is especially suitable for large and medium-sized websites to store resource files (such as pictures, documents, audio, video, etc.).

3. Working mechanism of FastDFS

1. Cluster architecture diagram

[Figure: FastDFS cluster architecture diagram]

A FastDFS cluster consists of three parts: tracker servers, storage servers, and clients.

Tracker server: mainly performs scheduling and plays a load-balancing role during access. It keeps the state of every group and storage server in the cluster in memory and is the hub connecting clients and storage servers. Because all of this information is held in memory, a tracker server performs very well and carries very little load itself; three trackers are enough even for a large cluster (e.g., hundreds of groups).

Storage server: stores files and file attributes (metadata). Its services include file storage, file synchronization, a file access interface, and key-value management of file metadata.

Interpretation of FastDFS architecture

  • Neither of the two roles, tracker server and storage server, needs to store file index information.
  • All servers in each role are peers; there is no master-slave relationship.
  • Storage servers are organized into groups; the files on all storage servers in the same group are identical (similar to RAID 1).
  • In the storage cluster, groups (also called volumes) are independent of each other, and each storage server actively reports its status to the tracker. The total capacity of the storage system is the sum of the capacities of all groups. A group consists of one or more storage servers that all hold the same files, so the servers within a group provide redundant backup and load balancing. When a server is added to a group, the system automatically synchronizes the existing files to it; once synchronization completes, the new server is switched online to serve requests. When storage space runs low or is about to be exhausted, a group can be added dynamically: simply add one or more servers and configure them as a new group, which expands the capacity of the storage system.
  • To expand the capacity of the entire FastDFS cluster, simply add a new group of storage servers.

2. Upload and download mechanism

2.1 Upload mechanism

[Figure: FastDFS upload process]

1. Select the tracker server

When there is more than one tracker server in the cluster, the trackers are completely equal, so the client can choose any tracker when uploading a file.

Select the storage group

When the tracker receives the upload request, it assigns a group that has room to store the file. The following group-selection rules are supported:

  • Round robin: poll among all groups
  • Specified group: always use a designated group
  • Load balance: the group with more free storage takes priority
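As a toy illustration, the three rules above might be sketched like this in Python (the function and parameter names are hypothetical; the real tracker implements this logic in C):

```python
import itertools

def make_group_selector(groups, rule="round_robin", fixed_group=None):
    """Return a function that picks a group per the tracker's upload rules.
    groups: dict mapping group name -> free space in bytes (toy model)."""
    rr = itertools.cycle(sorted(groups))          # round-robin iterator
    def select():
        if rule == "round_robin":                 # 1. poll among all groups
            return next(rr)
        if rule == "specified":                   # 2. always use a designated group
            return fixed_group
        if rule == "load_balance":                # 3. most free space first
            return max(groups, key=groups.get)
        raise ValueError(rule)
    return select

groups = {"group1": 500, "group2": 800, "group3": 300}
pick = make_group_selector(groups, rule="load_balance")
print(pick())  # group2 has the most free space
```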

2. Select storage server

After the group is selected, the tracker selects a storage server in the group for the client. The following storage-selection rules are supported:

  • Round robin: poll among all storage servers in the group
  • First server ordered by IP: sort the servers by IP address and pick the first
  • First server ordered by priority: sort by priority (the priority is configured on each storage server)

3. Choose storage path

When a storage server has been allocated, the client sends a file-write request to it, and the storage server allocates a data storage directory for the file. The following rules are supported:

  • Round robin: poll among multiple storage directories
  • Most free space first: the directory with the most remaining space takes priority

4. Generate the file name

After the storage directory is selected, the storage server generates a file name from the storage server's IP address, the file creation time, the file size, a file checksum, and a random number.

Two-level directory selection: each storage directory contains 256 × 256 subdirectories. The storage server hashes the file name twice, routes the file to one of the subdirectories, and stores it there.
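The two-level routing can be illustrated with a stand-in hash function (FastDFS uses its own hash implementation in C; `zlib.crc32` here is only an assumption for demonstration):

```python
import zlib

def route_subdir(file_name: str) -> str:
    """Map a generated file name to one of the 256*256 two-level subdirectories."""
    data = file_name.encode()
    h1 = zlib.crc32(data)                       # first hash -> level-1 directory
    h2 = zlib.crc32(h1.to_bytes(4, "big"))      # second hash -> level-2 directory
    return f"{h1 % 256:02X}/{h2 % 256:02X}"     # e.g. "00/0C"

print(route_subdir("wKjRbExx2K0AAAAAAAANiSQUgyg37275"))
```

The same file name always routes to the same subdirectory, so no lookup table is needed to find a file later.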

5. Generate file index

Once the file is written into a subdirectory, it is considered successfully stored, and a file index (file ID) is generated for it. The file ID is composed of the group name, the storage directory, the two-level subdirectory, the file name, and the file suffix (specified by the client, mainly used to distinguish file types).

2.2 Download mechanism

[Figure: FastDFS download process]

To download, the client sends the file name to the tracker server to obtain the IP address and port of a storage server, then requests the file from that address and port. The storage server returns the file to the client.

3. File index analysis

Example of file index:

group1/M00/00/0C/wKjRbExx2K0AAAAAAAANiSQUgyg37275.h

"group1" is the group name, "M00" is the virtual disk path, "00/0C" is the two-level directory, and "wKjRbExx2K0AAAAAAAANiSQUgyg37275.h" is the file name with its suffix.

The file name is generated from the source storage server's IP address, the creation timestamp, the file size, the file checksum, and a random number after a hash calculation, and the result is base64-encoded into printable characters.
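Given the layout described above, a file ID can be split back into its components (an illustrative helper, not part of the FastDFS API):

```python
def parse_file_id(file_id: str) -> dict:
    """Split a FastDFS file ID into group, store path, subdirs, and file name."""
    group, store_path, d1, d2, name = file_id.split("/")
    return {
        "group": group,             # e.g. group1
        "store_path": store_path,   # virtual disk path, e.g. M00
        "subdir": f"{d1}/{d2}",     # two-level directory
        "file_name": name,          # encoded name + client-supplied suffix
    }

fid = "group1/M00/00/0C/wKjRbExx2K0AAAAAAAANiSQUgyg37275.h"
print(parse_file_id(fid)["subdir"])  # 00/0C
```

Because every component of the file's location is encoded in the ID itself, neither the tracker nor the storage server needs a separate file index.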

4. File synchronization mechanism

The storage server uses binlog files to record update operations such as file upload and deletion. binlog does not record file contents, only metadata information such as file names.

Within a storage server, dedicated threads synchronize files according to the binlog. To minimize mutual interference and keep the system simple, the storage server starts one thread for each other server in its group to synchronize files to it. Synchronization is incremental: the system records the position already synchronized (the binlog file offset) in a mark file named {dest storage IP}_{port}.mark.
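The incremental-sync bookkeeping might be sketched as follows (a simplified model in which the binlog is a plain line-per-operation file and the mark file stores just a byte offset; the real storage daemon records more fields per entry):

```python
import os

def read_mark(mark_path: str) -> int:
    """Return the binlog offset already synced to the peer (0 if no mark yet)."""
    if not os.path.exists(mark_path):
        return 0
    with open(mark_path) as f:
        return int(f.read().strip() or 0)

def sync_increment(binlog_path: str, mark_path: str, push):
    """Push binlog entries past the recorded offset, then advance the mark."""
    offset = read_mark(mark_path)
    with open(binlog_path) as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line:
                break
            push(line.rstrip("\n"))      # replicate this operation to the peer
        new_offset = f.tell()
    with open(mark_path, "w") as f:      # persist progress, e.g. 192.168.1.5_23000.mark
        f.write(str(new_offset))
```

Persisting the offset lets the sync thread resume exactly where it left off after a restart, which is what makes the synchronization incremental.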

File synchronization happens only between storage servers in the same group and uses a push model: the source server pushes files to the target servers. Only source data needs to be synchronized; backup data must not be synchronized again, otherwise a loop would form.

When a new storage server is added, an existing storage server synchronizes all existing data (including source data and backup data) to the newly added server.

After a client uploads a file to one storage server, the upload is complete from the client's perspective. That storage server then synchronizes the file to the other servers in the group according to the upload records in the binlog. Because this synchronization is asynchronous, there is a synchronization delay: immediately after a new file is uploaded, it cannot be found on storage servers that have not yet received it.

4. Pros and cons analysis

1. Advantages

  1. The system does not need to support POSIX (Portable Operating System Interface), which reduces complexity and yields higher processing efficiency
  2. Supports online expansion, enhancing the system's scalability
  3. Implements software RAID, enhancing concurrent processing capability and data fault tolerance and recovery
  4. Supports master/slave files and custom extensions
  5. Supports multiple standby trackers, enhancing system availability
  6. Supports Nginx and Apache modules, making HTTP download available

What is a master-slave file?

Master and slave files are files whose file IDs are related; one master file can correspond to multiple slave files.
Master file ID = master file name + master file extension
Slave file ID = master file name + slave file suffix + slave file extension
A typical use of master/slave files: with images, the master file is the original image and the slave files are one or more thumbnails of it.
In FastDFS, master and slave files are related only through their file IDs. The FastDFS server does not record the master/slave relationship, so deleting the master file does not automatically delete its slave files.
Cascading deletion of slave files after the master file is deleted must be implemented by the application.
A master file and its slave files are stored in the same group.
Generation order of master/slave files:
1) Upload the master file first (e.g., the original image) to obtain the master file ID.
2) Then upload the slave file (e.g., a thumbnail), specifying the master file ID and the slave file suffix (optionally also the slave file extension), to obtain the slave file ID.
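The naming rule above can be written out directly (an illustrative helper only; FastDFS client libraries expose their own slave-upload calls):

```python
def slave_file_id(master_id: str, slave_suffix: str, slave_ext: str = None) -> str:
    """Derive a slave file ID from the master ID, a suffix, and an optional extension.
    Slave ID = master file name + slave suffix + extension."""
    base, _, master_ext = master_id.rpartition(".")
    ext = slave_ext if slave_ext is not None else master_ext  # default: reuse master's
    return f"{base}{slave_suffix}.{ext}"

master = "group1/M00/00/0C/wKjRbExx2K0AAAAAAAANiSQUgyg37275.jpg"
print(slave_file_id(master, "_150x150"))  # thumbnail ID derived from the master ID
```

Because the slave ID is computed from the master ID, an application can locate all thumbnails of an image without the server storing any relationship.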

2. Disadvantages

  1. No support for resumable transfers (breakpoint resume), which is a nightmare for large files; FastDFS is not suitable for large-file storage
  2. No POSIX interface, so it is less general-purpose
  3. The synchronization mechanism does not verify file correctness, which reduces availability; synchronization across the public network has large delays, so the application needs a corresponding fault-tolerance strategy
  4. Downloads go through the API, creating a potential single-point performance bottleneck

5. Actual problems and current solutions

1. File synchronization delay

File synchronization is asynchronous, so after a new file is uploaded, it cannot be found on storage servers that have not yet been synchronized.

Solution:

Note that the storage servers in a group are not listed in a configuration file; the tracker server learns them dynamically, because clients and storage servers actively connect to it. Each storage server connects to every tracker in the cluster and reports its status, including remaining disk space, file synchronization status, and upload/download statistics; a separate thread handles the connection and periodic reporting to each tracker.

In addition, each storage server periodically reports to the tracker the timestamp up to which it has synchronized files with each of the other storage servers in its group. When the tracker receives such a report, it computes, for each storage server in the group, the minimum synchronized file timestamp, and records it in memory as an attribute of that storage server. Based on this, FastDFS provides the following simple solutions:

1. As with file updates, prefer the source storage server (the one the file was uploaded to) for downloads. This can be set in the tracker server's configuration file via the download_server parameter.

2. When storage selection uses round robin: when a client asks the tracker which storage servers it can download a given file from, the tracker returns a storage server that meets any one of the following four conditions:

  • It is the source storage server, i.e., the server the file was uploaded to directly;
  • File creation timestamp < the timestamp up to which this storage server has been synchronized, meaning the file has already been synchronized to it;
  • File creation timestamp = the synchronized timestamp, and (current time − file creation timestamp) > the maximum time a file synchronization takes to complete (e.g., 5 minutes);
  • (Current time − file creation timestamp) > the file synchronization delay threshold; for example, with a threshold of one day, file synchronization is assumed to certainly complete within one day.
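The four conditions can be combined into a single predicate (timestamps in seconds; the threshold defaults and all names here are illustrative, not FastDFS configuration values):

```python
def can_serve_download(storage_ip, source_ip, file_create_ts, synced_ts,
                       now, max_sync_time=300, sync_delay_threshold=86400):
    """Decide whether a storage server may serve a download request.
    True if any of the tracker's four eligibility rules holds."""
    if storage_ip == source_ip:                  # 1. the server the file was uploaded to
        return True
    if file_create_ts < synced_ts:               # 2. file already synced to this server
        return True
    if (file_create_ts == synced_ts
            and now - file_create_ts > max_sync_time):   # 3. one sync period has passed
        return True
    if now - file_create_ts > sync_delay_threshold:      # 4. past the delay threshold
        return True
    return False
```

Note that rules 3 and 4 are heuristics: they assume synchronization has finished after enough time has passed, rather than verifying it.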

2. Data security

  • Single-copy write success: in the window between the source storage server finishing a file write and the file being synchronized to the other servers in the group, a failure of the source server can lose user data, which is usually unacceptable for a storage system.
  • Lack of automatic recovery: when a storage disk fails, the only option is to replace the disk and restore the data manually. Because redundancy is machine-level, automatic recovery is hard unless a hot-spare disk has been prepared in advance; the lack of an automated recovery mechanism increases operations and maintenance work.
  • Low data-recovery efficiency: data can only be recovered by reading from the other storage servers in the group, and because small-file access is inefficient, file-by-file recovery is slow. Low recovery efficiency means data stays in an unsafe state longer.
  • Lack of multi-datacenter disaster recovery: currently, disaster recovery across data centers requires external tools to synchronize data to a backup cluster; there is no automated mechanism.

3. Storage space utilization

The number of files stored on a single machine is limited by the number of inodes.

Each stored file corresponds to a file in the storage server's local file system, so on average each file wastes block_size / 2 bytes of space (internal fragmentation).
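A back-of-the-envelope calculation makes both limits concrete (the block size and file count are assumptions for illustration, not FastDFS defaults):

```python
# Assumed ext4-like parameters: 4 KiB blocks, and suppose 100 million small files.
block_size = 4096
n_files = 100_000_000

# Average internal fragmentation: half a block wasted per file.
wasted_bytes = n_files * block_size // 2
print(f"wasted: {wasted_bytes / 2**30:.0f} GiB")   # ~191 GiB

# Each file also consumes one inode, so n_files must stay below the
# file system's inode count, regardless of how much disk space remains.
```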

Merged (combined) file storage can effectively solve both problems, but because merged storage has no space-reclamation mechanism, the space of deleted files cannot be guaranteed to be reused, so some space is still wasted.

4. Load balancing

The group mechanism itself provides load balancing, but only statically: the access characteristics of the application must be known in advance. The group mechanism also prevents migrating data between groups for dynamic load balancing.



Origin blog.csdn.net/yym373872996/article/details/105651008