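The requirements, in order of priority, are:
1) Store 3 TB or more of small-to-medium files, mostly images, averaging 500 to 700 KB and generally under 1 MB.
2) Must be clustered, with load balancing, high availability, and high performance. Endorsement through use at large companies is a plus.
3) Must provide a way to upload files from Java programs. The Java code must be debuggable on Windows.
4) Must be open source, with an author who keeps it updated.
5) Must have operations and monitoring tools that can quickly pinpoint a faulty server.
6) (Bonus) Adding a storage server should not require changing the Nginx load-balancing or Java program configuration.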
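To get a feel for requirement 1, here is a rough sizing sketch: how many files 3 TB holds at the stated average sizes. It assumes binary units (1 KB = 1024 bytes) and ignores per-file metadata overhead; the class and method names are made up for illustration.

```java
// Rough sizing for requirement 1. Assumes binary units and no per-file overhead.
final class FileCountEstimate {
    static final long KB = 1024L;
    static final long TB = 1024L * 1024L * 1024L * 1024L;

    /** Number of whole files of avgFileBytes that fit into totalBytes. */
    static long fileCount(long totalBytes, long avgFileBytes) {
        if (avgFileBytes <= 0) {
            throw new IllegalArgumentException("avgFileBytes must be positive");
        }
        return totalBytes / avgFileBytes;
    }

    public static void main(String[] args) {
        long total = 3 * TB;
        System.out.println("files at 700 KB average: " + fileCount(total, 700 * KB));
        System.out.println("files at 500 KB average: " + fileCount(total, 500 * KB));
    }
}
```

Under these assumptions the store has to handle roughly 4.6 to 6.4 million files, which is why small-file handling matters more here than raw capacity.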
After reading a lot of material, I found no perfect candidate, only three that roughly fit the requirements; they are compared in the table below.
In summary, I think seaweedfs is simple and approachable, its ecosystem is relatively complete, and the author keeps it updated. Of course, FastDFS is also a good choice.
Framework | Brief introduction | File storage model | High availability | Capacity expansion | gzip for browsers | Browser cache | Programmatic access |
--- | --- | --- | --- | --- | --- | --- | --- |
FastDFS | Many start-ups I know of use it; it has been around for a while and is relatively stable. Chinese users have written a lot about it, but there is almost no official documentation. It is recommended to serve files through fastdfs-nginx-module rather than reverse-proxying directly to a storage server. Operations: when a file cannot be accessed, the logs to check are spread across several places (nginx > fastdfs-nginx-module > storage > tracker), so someone unfamiliar with the system will struggle to find the cause. Home: https://github.com/happyfish100/fastdfs Deployment description: http://blog.csdn.net/xifeijian/article/details/38567839 Docker: https://hub.docker.com/r/hhland/fastdfs/ , https://hub.docker.com/r/season/fastdfs/ A good reference solution: https://github.com/daniellitoc/xultimate-resource | Key-value storage. There is no concept of an upload directory, so development and test environments each need their own separately deployed file server. No file listing, no FUSE. | A cluster can have multiple tracker servers and multiple storage servers. Trackers do not form a cluster among themselves; failover is handled by the client. Storage servers are organized in groups, and all storage servers in the same group hold identical files, mainly for load balancing and fault tolerance, similar to RAID 10 for disks. Multiple data centers are not supported. | TB-scale storage. 1. A single storage server can configure multiple store_path entries to add disks. 2. Capacity can be expanded by adding servers one group at a time. 3. Total capacity is the sum of all groups. | Compressed on the fly by the reverse proxy | Supports If-Modified-Since | Language SDK: a dedicated SDK, with Java support. REST interface: none. HTTP file reads: through the storage server's own HTTP service, or through nginx with fastdfs-nginx-module installed; the latter is recommended. |
Baidu BFS (to be studied) | Powerful, but there is very little documentation online, and a Baidu search turned up no hands-on write-ups. Baidu says the entire company uses it. Home: https://github.com/baidu/bfs Docker: a Dockerfile is provided, but no image has been published to Docker Hub. | Directory-based storage. Supports file listing and FUSE. | A cluster consists of NameServer, MetaServer, and ChunkServer. NameServer uses the Raft algorithm, or relies on Nexus/ZooKeeper, to elect a leader, and re-elects a leader automatically on failure. ChunkServer high availability: to be analyzed. Has the best support for multiple data centers and multiple servers. | PB-scale storage | To be analyzed | To be analyzed | Language SDK: an SDK is provided but does not support Java; access can be bridged through FUSE, which on Windows would probably require Cygwin. REST interface: none. HTTP file reads: provided by the NameServer. |
seaweedfs | Powerful and looks very promising, but there is little Chinese-language material. The docs say ZTO Express (中通快递) uses it. I have not used it, so it is hard to say whether operations and maintenance are easy. Home: https://github.com/chrislusf/seaweedfs Deployment and usage instructions: http://blog.chinaunix.net/uid-25057421-id-5676348.html | Key-value storage. The filer allows uploading to a specified directory. However, because the Java SDKs connect directly to the master and volume servers, development and test environments cannot share the same file server. The filer supports file listing, but FUSE is not supported. | A cluster consists of multiple master servers and multiple volume servers. Copying between volumes is determined by the replication strategy. Supports multiple data centers and multiple replication strategies. | PB-scale storage. Capacity is increased by adding volume servers, but how much effective capacity is gained depends on the replication strategy. | Files can be pre-compressed to gzip and served directly as a gzip stream | Supports ETag, If-Modified-Since, etc. | Language SDK: the SDKs are all in fact built on the REST interface; a Java version exists. REST interface: the volume and filer servers provide interfaces at different levels; volume is key-value, filer is directory-like. HTTP file reads: filer server. |
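The capacity column can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions, not either project's own formula: for FastDFS it assumes that, because every server in a group holds identical files, a group's usable capacity is that of its smallest server and the total is the sum over groups; for a replicated store such as seaweedfs it assumes usable capacity is raw capacity divided by the number of copies kept of each file. All names and figures are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class UsableCapacity {
    /**
     * Mirrored groups (FastDFS-style assumption): each group contributes the
     * capacity of its smallest server; the total is the sum over all groups.
     */
    static long mirroredGroupsUsable(List<List<Long>> groups) {
        long total = 0;
        for (List<Long> group : groups) {
            if (group.isEmpty()) {
                continue; // an empty group stores nothing
            }
            long smallest = Long.MAX_VALUE;
            for (long capacity : group) {
                smallest = Math.min(smallest, capacity);
            }
            total += smallest;
        }
        return total;
    }

    /** Replicated store (assumption): raw capacity divided by total copies per file. */
    static long replicatedUsable(long rawCapacity, int copies) {
        if (copies < 1) {
            throw new IllegalArgumentException("copies must be at least 1");
        }
        return rawCapacity / copies;
    }

    public static void main(String[] args) {
        // Capacities in GB, made up for the example.
        List<List<Long>> groups = Arrays.asList(
                Arrays.asList(2000L, 2000L),
                Arrays.asList(1000L, 1500L));
        System.out.println("mirrored groups usable GB: " + mirroredGroupsUsable(groups));
        System.out.println("replicated usable GB: " + replicatedUsable(6000L, 2));
    }
}
```

In both models, meeting the 3 TB requirement with two copies of every file means provisioning at least 6 TB of raw disk.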
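The FastDFS row notes that trackers do not form a cluster and that failover is left to the client. A minimal sketch of that idea follows: try each tracker in order and use the first one that answers. This is a generic pattern, not the FastDFS Java SDK's actual code; the class name, the tracker addresses, and the `call` function are placeholders for whatever client call is really used.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class ClientFailover {
    /**
     * Tries each endpoint in order and returns the first successful result.
     * Throws IllegalStateException if every endpoint fails.
     */
    static <T> T firstAvailable(List<String> endpoints, Function<String, T> call) {
        RuntimeException last = null;
        for (String endpoint : endpoints) {
            try {
                return call.apply(endpoint);
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next endpoint
            }
        }
        throw new IllegalStateException("no endpoint available", last);
    }

    public static void main(String[] args) {
        // Hypothetical tracker addresses; the first one is simulated as down.
        List<String> trackers = Arrays.asList("tracker1:22122", "tracker2:22122");
        String used = firstAvailable(trackers, tracker -> {
            if (tracker.startsWith("tracker1")) {
                throw new RuntimeException("connection refused (simulated)");
            }
            return tracker;
        });
        System.out.println("connected via " + used);
    }
}
```

The trade-off of this design is that every client must carry the full tracker list, which works against bonus requirement 6 whenever trackers are added or moved.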