1. Introduction to FastDFS

FastDFS is an open-source distributed file system written in C. It was written and open-sourced by Yu Qing, a senior architect at Taobao. FastDFS is tailored for Internet applications: it takes redundancy, load balancing, and linear scaling into account, and it focuses on high availability and high performance. With FastDFS it is easy to build a high-performance file server cluster that provides file upload and download services.

Why use FastDFS?

NFS and GFS, introduced above, are general-purpose distributed file systems. General-purpose distributed file systems offer a good development experience, but they are complex and their performance is only average. Special-purpose distributed file systems offer a poorer development experience, but they are simpler and faster. FastDFS is well suited to storing small files such as images. It does not split files into blocks, so there is no block-merging overhead, and it communicates over sockets, which is fast.

2. How FastDFS works

2.1 FastDFS architecture

The FastDFS architecture consists of Tracker servers and Storage servers. Clients send file upload and download requests to a Tracker server, which schedules a Storage server to carry out the actual upload or download.

As shown below:

1) Tracker

The Tracker server is responsible for load balancing and scheduling. When a file is uploaded, the Tracker server uses a configured policy to find a Storage server that can provide the upload service. The tracker can therefore also be called a tracking server or a scheduling server.

A FastDFS cluster can have multiple Tracker servers. They are peers that provide service at the same time, so the Tracker tier has no single point of failure. Clients send requests to the Tracker servers in round-robin fashion; if the requested tracker cannot provide service, the client switches to another tracker.
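The round-robin and failover behaviour described above can be sketched roughly as follows. This is only an illustration, not the FastDFS client's code: the class name `TrackerPool`, the `connect` callable, and the addresses used with it are hypothetical.

```python
from typing import Callable, List, Tuple

Address = Tuple[str, int]


class TrackerPool:
    """Hypothetical helper: pick trackers in round-robin order with failover."""

    def __init__(self, trackers: List[Address]):
        if not trackers:
            raise ValueError("at least one tracker is required")
        self._trackers = list(trackers)
        self._next = 0  # index of the tracker to try first on the next request

    def request(self, connect: Callable[[Address], object]) -> object:
        """Try each tracker once, starting at the round-robin position."""
        n = len(self._trackers)
        start = self._next
        self._next = (self._next + 1) % n  # advance for the next request
        last_error = None
        for offset in range(n):
            addr = self._trackers[(start + offset) % n]
            try:
                return connect(addr)  # the first tracker that answers wins
            except OSError as exc:
                last_error = exc  # this tracker is unavailable, try the next
        raise ConnectionError("no tracker available") from last_error
```

In a real client, `connect` would open a socket to the tracker; keeping it as a parameter makes the selection logic easy to exercise without a running cluster.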

2) Storage

The Storage server is responsible for storing files: files uploaded by clients are ultimately stored on a Storage server. The Storage server does not implement its own file system but uses the operating system's file system to manage files. It can simply be called a storage server.

The storage cluster uses grouped storage. A storage cluster consists of one or more groups, and the total capacity of the cluster is the sum of the capacities of all groups. A group consists of one or more storage servers, which are peers. Storage servers in different groups do not communicate with each other. Storage servers within the same group connect to each other to synchronize files, so that every storage server in the group holds the same files. The capacity of a group equals the capacity of the smallest server in it, so the storage servers within a group should preferably have the same hardware and software configuration.
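The capacity rule above (a group is limited by its smallest server, and the cluster is the sum of its groups) can be written as a small calculation. The group names and disk sizes below are made-up illustration values.

```python
from typing import Dict, List


def group_capacity(server_capacities_gb: List[int]) -> int:
    """A group can hold only as much as its smallest server."""
    return min(server_capacities_gb)


def cluster_capacity(groups: Dict[str, List[int]]) -> int:
    """Cluster capacity is the sum of the capacities of all groups."""
    return sum(group_capacity(servers) for servers in groups.values())


# Hypothetical cluster: values are per-server disk sizes in GB.
groups = {
    "group1": [500, 500, 400],  # limited to 400 by the smallest server
    "group2": [1000, 1000],
}
print(cluster_capacity(groups))  # 400 + 1000 = 1400
```

The 100 GB of extra space on each of the two larger servers in `group1` is unusable, which is why identical configurations within a group are preferable.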

The advantage of grouped storage is that it is flexible and highly controllable. For example, when uploading a file, the client can either specify the target group directly or let the tracker choose one. When the access load on the storage servers of a group is high, storage servers can be added to that group to increase its service capacity (vertical scaling). When the system's capacity is insufficient, groups can be added to increase storage capacity (horizontal scaling).

3) Storage status collection

Each Storage server connects to all Tracker servers in the cluster and periodically reports its own status, including free disk space, file synchronization status, and file upload and download counts.

2.2 File upload process

After a client uploads a file, the storage server returns a file ID to the client. This file ID is the index information used to access the file later. The file index information consists of: group name, virtual disk path, two-level data directory, and file name.

Group name: the name of the storage group the file was uploaded to. The storage server returns it after a successful upload, and the client must save it itself.

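Virtual disk path: the virtual path configured on the storage server, corresponding to the store_path* disk options. If store_path0 is configured, the path is M00; if store_path1 is configured, it is M01; and so on.

Two-level data directory: the two-level directory that the storage server creates under each virtual disk path to hold data files.

File name: differs from the name the file had when it was uploaded. It is generated by the storage server from specific information, including the source storage server's IP address, the file creation timestamp, the file size, a random number, and the file extension.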
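To make the four parts of a file ID concrete, here is a minimal parsing sketch. It assumes the parts are joined with "/" in the order listed above; the `FileId` type, the `parse_file_id` function, and the sample ID are hypothetical and only serve as illustration.

```python
from typing import NamedTuple


class FileId(NamedTuple):
    group_name: str
    virtual_disk_path: str  # e.g. "M00" for store_path0
    data_dir: str           # the two-level data directory, e.g. "00/00"
    file_name: str          # name generated by the storage server


def parse_file_id(file_id: str) -> FileId:
    """Split a file ID into its four parts (assumes '/'-separated fields)."""
    parts = file_id.split("/")
    if len(parts) != 5:
        raise ValueError(f"unexpected file ID layout: {file_id!r}")
    group, disk, dir1, dir2, name = parts
    return FileId(group, disk, f"{dir1}/{dir2}", name)


# Made-up ID; a real file name is generated by the storage server.
fid = parse_file_id("group1/M00/00/00/example_file.jpg")
print(fid.group_name, fid.virtual_disk_path, fid.data_dir, fid.file_name)
```

Because the client has to keep the whole ID to reach the file again, storing it as a single string and parsing it on demand is usually enough.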

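2.3 File download process

The tracker uses the requested file path, that is, the file ID, to locate the file quickly.

For example, consider a request for the file below:

1. From the group name, the tracker can quickly determine that the storage server group the client needs to access is group1, and it selects a suitable storage server in that group to serve the client.

2. Using the file's virtual disk path and two-level data directory, the storage server can quickly locate the directory containing the file, and then finds the file the client wants by its file name.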
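Step 2 can be sketched as a lookup from file ID to a path on the storage server's local disk. This is an assumption-laden illustration, not FastDFS code: the `STORE_PATHS` mapping, the `locate_file` function, and the sample file ID are hypothetical, and the sketch assumes files sit under the `data` subdirectory of the configured store path.

```python
from pathlib import PurePosixPath
from typing import Dict

# Hypothetical mapping from virtual disk path to the configured store_path*.
STORE_PATHS: Dict[str, str] = {
    "M00": "/home/fastdfs/fdfs_storage",  # store_path0
}


def locate_file(file_id: str, store_paths: Dict[str, str]) -> PurePosixPath:
    """Resolve a file ID to a path on the storage server's local disk."""
    # The group name was already used by the tracker to pick the group.
    _group, disk, dir1, dir2, name = file_id.split("/")
    base = PurePosixPath(store_paths[disk])
    # Assumed layout: <store_path>/data/<dir1>/<dir2>/<file name>
    return base / "data" / dir1 / dir2 / name


print(locate_file("group1/M00/00/00/example_file.jpg", STORE_PATHS))
# /home/fastdfs/fdfs_storage/data/00/00/example_file.jpg
```

No index or database is consulted here: everything needed to find the file is encoded in the ID itself, which is what makes the lookup fast.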

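3. Installing FastDFS

If you want to try installing FastDFS yourself, you can follow the documentation step by step.

The tracker and storage use the same installation package. FastDFS can be downloaded from: https://github.com/happyfish100/FastDFS

This tutorial uses the installation package FastDFS_v5.05.tar.gz

FastDFS is written in C and is best run on Linux. This tutorial uses CentOS 7 as the installation environment.

For installation details, see "fastDFS安装教程.doc".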

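3.1 Tracker configuration

This subsection describes the contents of the Tracker configuration file.

FastDFS configuration directory: /etc/fdfs

Main configuration files: /etc/fdfs/tracker.conf (tracker configuration file); storage.conf (storage configuration file)

The contents of tracker.conf are as follows:

Port: port=22122

Storage policy: store_lookup=

Possible values: 0 (store files to storage in round-robin order), 1 (use a specific group), 2 (load balancing, choosing a storage with free capacity)

Specific group: store_group= If store_lookup is set to 1, a specific group must be given here.

Tracker base directory: base_path=/home/fastdfs. At runtime the tracker stores its storage management data in this directory.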
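The three store_lookup policies can be pictured with the following sketch. It is not the tracker's implementation: the `GroupSelector` class and the free-space figures are hypothetical, and the sketch assumes that policy 2 means "pick the group with the most free space".

```python
from typing import Dict, Optional


class GroupSelector:
    """Hypothetical sketch of the three store_lookup policies."""

    def __init__(self, store_lookup: int, store_group: Optional[str] = None):
        if store_lookup not in (0, 1, 2):
            raise ValueError("store_lookup must be 0, 1 or 2")
        if store_lookup == 1 and not store_group:
            raise ValueError("store_group is required when store_lookup=1")
        self.store_lookup = store_lookup
        self.store_group = store_group
        self._rr = 0  # round-robin cursor

    def select(self, free_space_mb: Dict[str, int]) -> str:
        """Pick a group name; free_space_mb maps group name -> free MB."""
        names = sorted(free_space_mb)
        if not names:
            raise ValueError("no groups available")
        if self.store_lookup == 0:  # round robin over the groups
            name = names[self._rr % len(names)]
            self._rr += 1
            return name
        if self.store_lookup == 1:  # always the configured group
            return self.store_group
        # store_lookup == 2: assumed to mean the group with the most free space
        return max(names, key=lambda n: free_space_mb[n])
```

For example, with `{"group1": 10, "group2": 50}` as the reported free space, `GroupSelector(0)` alternates between the two groups, `GroupSelector(1, "group2")` always returns `group2`, and `GroupSelector(2)` returns `group2` because it has more room.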

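3.2 Storage configuration

This subsection describes the contents of the storage configuration file.

The contents of storage.conf are as follows:

Group name: group_name=group1

Port: port=23000

Heartbeat interval to the tracker (seconds): heart_beat_interval=30

Storage base directory: base_path=/home/fastdfs

Disk storage directories; multiple store_path entries can be defined:

store_path0=/home/fastdfs/fdfs_storage Uploaded files are stored under this directory, in /home/fastdfs/fdfs_storage/data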

store_path1=...

...

Tracker address to report to: tracker_server=192.168.101.64:22122

If there are multiple trackers, configure one entry per tracker, for example:

tracker_server=192.168.101.64:22122

tracker_server=192.168.101.65:22122

....
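Because tracker_server may appear more than once, a tool that reads storage.conf needs to collect repeated keys rather than overwrite them. The sketch below shows one way to do that; `parse_conf` is a hypothetical helper, not part of FastDFS, and it assumes plain key=value lines with "#" comments.

```python
from collections import defaultdict
from typing import Dict, List


def parse_conf(text: str) -> Dict[str, List[str]]:
    """Parse key=value lines; repeated keys accumulate into a list."""
    result: Dict[str, List[str]] = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and lines without '='
        key, value = line.split("=", 1)
        result[key.strip()].append(value.strip())
    return dict(result)


# Values taken from the settings listed in this section.
sample = """
group_name=group1
port=23000
heart_beat_interval=30
base_path=/home/fastdfs
store_path0=/home/fastdfs/fdfs_storage
tracker_server=192.168.101.64:22122
tracker_server=192.168.101.65:22122
"""

conf = parse_conf(sample)
print(conf["tracker_server"])
```

Every value is kept as a list, so single-valued options such as `port` are read as `conf["port"][0]`.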

3.3 Starting and stopping

FastDFS start/stop script directory:

fdfs_trackerd: the tracker script, used to start and stop the tracker

/usr/bin/fdfs_trackerd /etc/fdfs/tracker.conf restart

fdfs_storaged: the storage script, used to start and stop the storage

/usr/bin/fdfs_storaged /etc/fdfs/storage.conf restart

FastDFS principles and environment setup