DFS: Distributed File System Comparison

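Requirements, in order of priority:

1) Store more than 3 TB of small and medium-sized files, mostly images, averaging 500-700 KB and generally under 1 MB.

2) Must be clustered, with load balancing, high availability, and high performance. Endorsement through use at large companies is a plus.

3) Provide a way to upload files from a Java program. The Java code must be debuggable on Windows.

4) Must be open source, with an author who keeps it updated.

5) Provide operations and monitoring tools that can quickly pinpoint a failing server.

6) (Bonus) Adding a storage server should not require changing the Nginx load-balancing configuration or the Java program's configuration.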
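As a rough sanity check on requirement 1, the file count can be estimated from the total size and the average file size. This is only a back-of-the-envelope sketch: the decimal units (1 TB = 10**12 bytes) and the function name are my own assumptions, not part of the requirements.

```python
# Rough sizing sketch for requirement 1.
# Assumption: decimal units, 1 TB = 10**12 bytes and 1 KB = 10**3 bytes.
TOTAL_BYTES = 3 * 10**12


def estimated_file_count(total_bytes: int, avg_file_bytes: int) -> int:
    """Return how many files of the given average size fit in total_bytes."""
    return total_bytes // avg_file_bytes


# Larger average files mean fewer files, so 700 KB gives the lower bound.
low = estimated_file_count(TOTAL_BYTES, 700 * 10**3)
high = estimated_file_count(TOTAL_BYTES, 500 * 10**3)
print(f"roughly {low:,} to {high:,} files")
```

So the candidates have to cope with several million small files, which is why per-file metadata overhead matters more here than raw throughput.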

After reading a lot of material, I found no perfect candidate; only the following three come reasonably close.

In summary, I think seaweedfs is simple and approachable, its ecosystem is relatively complete, and the author keeps updating it. FastDFS is of course also a good choice.

Each framework is compared on: brief introduction, file storage method, high availability, capacity expansion, gzip support for browsers, browser cache, and program access.

FastDFS

I know that many start-ups use it. It has been around for a while and is relatively stable. Many Chinese users have written about it, but there is almost no official documentation. The recommended setup serves files through fastdfs-nginx-module rather than reverse-proxying directly to a storage server.

Operations and maintenance: there is no single place to view logs. Troubleshooting means checking several locations (nginx > nginx fastdfs module > storage > tracker), which is hard for anyone unfamiliar with the system.

Home: https://github.com/happyfish100/fastdfs

Deployment Description: http://blog.csdn.net/xifeijian/article/details/38567839

Docker: https://hub.docker.com/r/hhland/fastdfs/, https://hub.docker.com/r/season/fastdfs/

A good reference scenario: https://github.com/daniellitoc/xultimate-resource

Storage method: key-value.

There is no concept of an upload directory, so development and test environments each need their own separately deployed file server.

Does not support file listing or FUSE.

A cluster can have multiple tracker servers and multiple storage servers.

Trackers are not clustered with each other; failover between them is handled by the client.

Storage servers are organized into groups. The storage servers within one group hold identical files, mainly for load balancing and fault tolerance, similar to RAID 10 for hard disks.

It does not support multiple data centers.
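Since the trackers do not form a cluster, the client has to walk a list of tracker addresses itself. The following is a minimal sketch of that idea only; it does not use the FastDFS SDK, and the host names, the port, and the `connect_with_failover` helper are hypothetical.

```python
import socket
from typing import Callable, Sequence, Tuple

Address = Tuple[str, int]


def connect_with_failover(
    trackers: Sequence[Address],
    connect: Callable = socket.create_connection,
    timeout: float = 2.0,
):
    """Try each tracker in order and return (address, connection) for the first that answers."""
    last_error = None
    for address in trackers:
        try:
            return address, connect(address, timeout)
        except OSError as exc:  # refused, timed out, unreachable, ...
            last_error = exc
    raise ConnectionError(f"no tracker reachable: {last_error}")


# Demo with a fake connector so the sketch runs without a real cluster.
def fake_connect(address, timeout):
    if address[0] == "tracker1.example":  # hypothetical host that is down
        raise OSError("connection refused")
    return "connection to " + address[0]


# Hypothetical tracker addresses; the port is an assumption for illustration.
trackers = [("tracker1.example", 22122), ("tracker2.example", 22122)]
print(connect_with_failover(trackers, connect=fake_connect))
```

The connector is passed in as a parameter so the failover logic can be tested without any network access.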

TB-scale storage solution.

1. A single storage server can be given multiple store_path entries in its configuration to add hard disks.

2. Capacity can be expanded by adding servers, one group at a time.

3. The total capacity is the sum of all groups.
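The capacity rules above can be sketched in a few lines. Because every server in a group stores the same files, I assume the usable capacity of a group is bounded by its smallest member; the group names and sizes are made up for illustration.

```python
from typing import Dict, List


def group_capacity(server_capacities_tb: List[float]) -> float:
    """Servers in one group mirror each other, so the smallest one is the limit."""
    return min(server_capacities_tb)


def cluster_capacity(groups: Dict[str, List[float]]) -> float:
    """Total capacity is the sum over all groups."""
    return sum(group_capacity(servers) for servers in groups.values())


# Hypothetical layout: two groups with two mirrored servers each (sizes in TB).
groups = {"group1": [4, 4], "group2": [8, 6]}
print(cluster_capacity(groups))  # 4 + 6 = 10
```

Adding a third server to group1 would add redundancy and read capacity but no storage; only a new group adds storage.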

gzip support: compressed on the fly by the reverse proxy.

Browser cache: supports If-Modified-Since.
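For readers unfamiliar with If-Modified-Since, here is a minimal sketch of the decision a server makes for a conditional GET. It illustrates the HTTP mechanism in general, not the actual FastDFS or nginx code, and the `status_for` helper is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def status_for(if_modified_since: Optional[str], last_modified: datetime) -> int:
    """Return 304 if the client's cached copy is still current, else 200."""
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: ignore it and send the full file
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


stored = datetime(2020, 3, 10, 12, 0, 0, tzinfo=timezone.utc)
header = format_datetime(stored, usegmt=True)  # what the browser sends back
print(status_for(header, stored))                         # file unchanged
print(status_for(header, stored + timedelta(seconds=5)))  # file rewritten later
```

For images that rarely change, most repeat visits end in the 304 branch, which saves the bandwidth of resending the file.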

Language SDK: a dedicated SDK is provided, with Java support.

REST Interface: No

HTTP file reads: either through the storage server's own HTTP service, or through nginx with fastdfs-nginx-module installed. The latter is recommended.

Baidu BFS
(to be studied)

Powerful, but there is very little documentation online, and a Baidu search turned up no articles on practical experience. Baidu says the whole company uses it.

Home: https://github.com/baidu/bfs

Docker: a Dockerfile is provided, but no image has been published to Docker Hub.

Storage method: directory.

Supports file listing and FUSE.

A cluster consists of NameServer, MetaServer, and ChunkServer.

NameServer either uses the Raft algorithm or relies on Nexus or ZooKeeper to elect a leader; after a failure a new leader is elected automatically.

ChunkServer high availability: to be analyzed.

Has the best support for multiple data centers and multiple servers.

PB-scale storage solution.

 

gzip support: to be analyzed.

Browser cache: to be analyzed.

Language SDK: an SDK is provided, but it does not support Java. Access could be bridged through FUSE, which on Windows would probably require Cygwin.

REST Interface: No

HTTP file reads: provided by NameServer.

seaweedfs

Powerful and seemingly very promising, but there is little Chinese material. The documentation says ZTO Express (中通快递) is using it.

Since I have not used it, it is hard to say whether operations and maintenance are easy.

Home: https://github.com/chrislusf/seaweedfs

Deployment and usage instructions: http://blog.chinaunix.net/uid-25057421-id-5676348.html

Official Docker: https://hub.docker.com/r/chrislusf/seaweedfs/

Storage method: key-value.

The filer can upload to a specified directory. However, because the Java SDKs connect directly to the master and volume servers, development and test environments cannot share the same file server.

The filer supports file listing, but FUSE is not supported.

A cluster consists of multiple master servers and multiple volume servers. Copying between volume servers is governed by the replication strategy.

Supports multiple data centers and multiple replication strategies.

PB-scale storage solution.

Capacity is expanded by adding volume servers, but how much effective capacity that adds depends on the replication strategy.
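To make the link between replication and effective capacity concrete, here is a small sketch. As far as I understand the seaweedfs documentation, the replication setting is a three-digit code whose digits count the extra copies in other data centers, on other racks, and on other servers in the same rack, so the total number of copies is one plus the digit sum. Treat that reading as an assumption; the helper names are my own.

```python
def copies_for(replication: str) -> int:
    """Total copies for a three-digit replication code such as '001'."""
    if len(replication) != 3 or not replication.isdigit():
        raise ValueError(f"expected three digits, got {replication!r}")
    return 1 + sum(int(digit) for digit in replication)


def effective_capacity_tb(raw_tb: float, replication: str) -> float:
    """Usable capacity once every file is stored copies_for(replication) times."""
    return raw_tb / copies_for(replication)


# 12 TB of raw disk across the volume servers (made-up figure).
for code in ("000", "001", "110"):
    print(code, copies_for(code), effective_capacity_tb(12, code))
```

With one extra copy ("001"), 12 TB of raw disk holds about 6 TB of files, so the 3 TB requirement needs at least 6 TB of raw disk before any headroom.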

gzip support: files can be stored pre-compressed in gzip format and are then served directly as gzip streams.

Browser cache: supports ETag, If-Modified-Since, etc.

Language SDK: the SDKs are all thin wrappers over the REST interface. Java versions exist.

REST Interface: the volume server and the filer server provide interfaces at different levels. The volume server is key-value, while the filer works like a directory tree.

HTTP file reads: filer server.
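The two access levels can be sketched as URL construction. As I understand the seaweedfs README, a client first asks the master to assign a file id and then posts the file to the returned volume server, whereas the filer takes an ordinary path. The sample reply, hosts, ports, and helper names below are illustrative assumptions, and the sketch makes no network calls.

```python
import json


def upload_url_from_assign(assign_body: str) -> str:
    """Build the volume-server upload URL from the master's assign reply (key-value level)."""
    reply = json.loads(assign_body)
    return f"http://{reply['url']}/{reply['fid']}"


def filer_url(filer_host: str, directory: str, filename: str) -> str:
    """Build a filer URL from a directory and a file name (directory level)."""
    return f"http://{filer_host}/{directory.strip('/')}/{filename}"


# Sample reply in the shape a master returns for an assign request (values are made up).
sample = '{"count":1,"fid":"3,01637037d6","url":"127.0.0.1:8080","publicUrl":"localhost:8080"}'
print(upload_url_from_assign(sample))
print(filer_url("localhost:8888", "/images/2020/", "cat.jpg"))
```

With the volume interface the application has to store the returned file id itself; with the filer it can keep using human-readable paths.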

 


Origin blog.csdn.net/tianshan2010/article/details/104764720