Distributed file storage system: FastDFS

        Foreword: FastDFS is a distributed file storage system used mainly to store and manage large volumes of file data such as pictures, videos, and documents. It was written in C by a former Taobao architect, originally to store pictures.


        The server side has two components, Tracker Server and Storage Server, corresponding to two roles:

Tracker: manages and schedules the cluster and collects the status of the Storage cluster. Trackers can themselves be clustered, and every tracker node has equal status.
Storage: actually saves the files. Storage is divided into multiple groups, and different groups hold different files. Each group can have multiple members; all members of a group hold the same content and have equal status, with no master/slave distinction.

Working principle:

        The client asks the Tracker server to upload or download a file, the Tracker server schedules the request, and the Storage server finally carries out the upload or download.
        In detail, Storage regularly reports its status to Tracker. When Tracker receives an upload or download request from the Client, it looks up an available Storage and returns that server's IP and port. The Client then uploads or downloads the file directly with that Storage. After an upload, some file information (file name, file ID) is returned to the Client.
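
The flow above can be sketched as a small in-memory simulation. This is only a toy model, not the FastDFS protocol: the class names, method names, the address 10.0.0.1:23000 and the file ID format are all made up for illustration, a dict lookup stands in for the network connection, and replication inside a group is left out.

```python
import uuid


class StorageNode:
    """Toy Storage server: keeps uploaded files in a dict."""

    def __init__(self, group, ip, port):
        self.group, self.ip, self.port = group, ip, port
        self.files = {}

    def upload(self, data, ext):
        # Simplified file ID: group name plus a random name (hypothetical format).
        file_id = f"{self.group}/{uuid.uuid4().hex}.{ext}"
        self.files[file_id] = data
        return file_id

    def download(self, file_id):
        return self.files[file_id]


class Tracker:
    """Toy Tracker: remembers which Storage nodes have reported in."""

    def __init__(self):
        self.nodes = {}  # (ip, port) -> StorageNode

    def report(self, node):
        # In the real system each Storage node reports periodically.
        self.nodes[(node.ip, node.port)] = node

    def query_upload(self):
        # Return the address of an available node (simply the first one here).
        if not self.nodes:
            raise RuntimeError("no Storage node available")
        return next(iter(self.nodes))

    def query_download(self, file_id):
        # Find a node in the group named by the file ID's prefix.
        group = file_id.split("/", 1)[0]
        for addr, node in self.nodes.items():
            if node.group == group:
                return addr
        raise LookupError(f"no Storage node for {file_id!r}")


def client_upload(tracker, data, ext):
    ip, port = tracker.query_upload()  # step 1: ask the Tracker for an address
    node = tracker.nodes[(ip, port)]   # stands in for connecting to ip:port
    return node.upload(data, ext)      # step 2: send the file to Storage


def client_download(tracker, file_id):
    ip, port = tracker.query_download(file_id)
    return tracker.nodes[(ip, port)].download(file_id)
```

A Storage node first calls tracker.report(node); after that, client_upload(tracker, b"hello", "txt") returns a file ID that client_download can later resolve. Note that the Tracker only hands out addresses and never touches the file data itself.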

Design features:

(1) Intra-group backup: Storage servers in the same group back each other up (real-time data synchronization).
(2) Horizontal expansion (linear expansion): suppose too many clients are downloading from server Storage1-1, so that also writing files to it would degrade performance. Files can then be stored on Storage1-2 instead (similar in feel to read/write separation); because data within the group is synchronized in real time, both servers end up with the same files. If all storage in the group is full, scale out by adding a new group, starting with Storage2-1.
(3) Load balancing capability.
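
The three features can be illustrated with a toy model. It is a sketch under simplifying assumptions, not how FastDFS is implemented: the Group and Cluster classes are hypothetical, a write is copied to every member immediately, reads are rotated round-robin, and "fewest files" stands in for whatever capacity metric a real cluster would use.

```python
import itertools


class Group:
    """Toy Storage group whose members all hold the same files."""

    def __init__(self, name, member_names):
        member_names = list(member_names)
        self.name = name
        self.members = {m: {} for m in member_names}
        self._next_reader = itertools.cycle(member_names)

    def write(self, file_name, data):
        # Intra-group backup: every member ends up with the same file.
        for store in self.members.values():
            store[file_name] = data

    def read(self, file_name):
        # Load balancing: rotate reads across the members (round-robin).
        member = next(self._next_reader)
        return member, self.members[member][file_name]

    def file_count(self):
        # All members hold the same files, so any member can be counted.
        return len(next(iter(self.members.values())))


class Cluster:
    """Toy cluster made of independent groups."""

    def __init__(self):
        self.groups = []

    def add_group(self, group):
        # Horizontal expansion: a new group adds capacity
        # without touching the existing groups.
        self.groups.append(group)

    def upload(self, file_name, data):
        # Pick the group holding the fewest files (stand-in for "most free space").
        target = min(self.groups, key=lambda g: g.file_count())
        target.write(file_name, data)
        return f"{target.name}/{file_name}"
```

With a group of two members, two consecutive read calls are answered by different members but return the same data, and once a second group is added, new uploads go to whichever group is emptier.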

Query mode:

        FastDFS locates and accesses files by file name or file ID. It does not support complex query operations.
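
Since lookup is by file ID only, it helps to see what an ID carries. A FastDFS file ID usually has the form group name, then the path of the file on the Storage server, for example group1/M00/00/00/example.jpg (this particular ID is made up). The helper below is a hypothetical sketch, not part of any FastDFS client library; it only splits an ID into those two parts.

```python
from typing import NamedTuple


class FileId(NamedTuple):
    group: str        # which Storage group holds the file
    remote_name: str  # path of the file inside that group


def parse_file_id(file_id: str) -> FileId:
    """Split a file ID at the first slash into group and remote file name."""
    group, sep, remote_name = file_id.partition("/")
    if not sep or not group or not remote_name:
        raise ValueError(f"not a valid file ID: {file_id!r}")
    return FileId(group, remote_name)
```

For instance, parse_file_id("group1/M00/00/00/example.jpg") yields the group "group1" and the remote name "M00/00/00/example.jpg". Because the ID already names the group, locating a file needs no search or index, which is also why richer queries are not available.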

Characteristics:

  1. High performance: FastDFS uses techniques such as multi-level caching and distributed storage, and delivers high I/O performance.
  2. High availability: FastDFS keeps multiple copies of each file and can fail over automatically to ensure data reliability.
  3. High concurrency: FastDFS can handle a large number of file upload and download requests at the same time, which makes it suitable for high-concurrency scenarios.
  4. Easy to expand: FastDFS can grow its storage capacity by adding storage servers, and supports online expansion.

Applicable scenarios:
        FastDFS is suitable for scenarios such as big data processing, cloud storage, and content distribution, especially those that require storing large numbers of files under highly concurrent access.

An example Django configuration is given below:

(1) Install FastDFS's Python client library (such as fdfs_client-py)

pip install fdfs_client-py

(2) In the Django settings, configure the FastDFS-related variables and the default file storage backend:

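# Django file storage
DEFAULT_FILE_STORAGE = 'xxx.FDFSStorage'  # replace xxx with the actual module path of the storage backend class
# FastDFS
FDFS_BASE_URL = 'http://<server IP>:<open port>/'  # FastDFS access address (note the trailing slash)
FDFS_CLIENT_CONF = 'xxx/client.conf'  # path to the FastDFS client configuration file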

(3) Create a Django storage backend class that uploads files to FastDFS, for example:

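from django.conf import settings
from django.core.files.storage import Storage
from fdfs_client.client import Fdfs_client


class FDFSStorage(Storage):
    """Django storage backend that saves uploaded files to FastDFS."""

    def __init__(self, client_conf=None, base_url=None):
        # Fall back to the values configured in the Django settings.
        if client_conf is None:
            client_conf = settings.FDFS_CLIENT_CONF
        if base_url is None:
            base_url = settings.FDFS_BASE_URL

        self.client_conf = client_conf
        self.base_url = base_url

    def _open(self, name, mode='rb'):
        # Not implemented: files are fetched over HTTP through url()
        # rather than opened through Django.
        pass

    def _save(self, name, content):
        # Create the FastDFS client
        client = Fdfs_client(self.client_conf)

        # Upload the file content
        result = client.upload_by_buffer(content.read())

        # Check the upload result (the status string is compared verbatim)
        if result.get('Status') != 'Upload successed.':
            raise Exception('Upload file to FastDFS failed')

        # Return the file ID, which Django stores as the file name
        return result.get('Remote file_id')

    def exists(self, name):
        # Always report that the name is free, so Django does not rename the
        # upload; the name actually stored is the file ID returned by _save().
        return False

    def url(self, name):
        # base_url must end with a slash, because name is appended directly
        return self.base_url + name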
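
The url() method above simply concatenates the base URL and the file ID, which is why the base URL in the settings has to end with a slash. If that requirement seems fragile, a more forgiving join could look like the sketch below. The helper name build_file_url is hypothetical, and the address and file ID in the usage note are made-up examples.

```python
def build_file_url(base_url: str, file_id: str) -> str:
    """Join base URL and file ID with exactly one slash between them."""
    # Strip any slash on either side of the join, then add a single one back,
    # so the result is the same with or without a trailing slash on base_url.
    return base_url.rstrip("/") + "/" + file_id.lstrip("/")
```

With this helper, build_file_url("http://192.0.2.10:8888", "group1/M00/00/00/example.jpg") and the same call with a trailing slash on the base URL both give http://192.0.2.10:8888/group1/M00/00/00/example.jpg.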

Origin blog.csdn.net/lxd_max/article/details/132259793