Software version:
software | version | archive name |
---|---|---|
seaweedfs | seaweedfs-1.11 | linux_amd64.tar.gz |
GitHub:
https://github.com/chrislusf/seaweedfs
Related definitions:
Name | Explanation |
---|---|
master | Provides the file id assignment service and the volume => location lookup. |
Node | Abstract node in the system; DataCenter and Rack are built on this abstraction. |
DataNode | Storage node; manages and stores logical volumes. |
DataCenter | Data center; corresponds to a physical data center. |
Rack | Rack; corresponds to a physical cabinet. A rack belongs to one data center, and a data center may contain multiple racks. |
Volume | Logical volume, the unit of storage. A volume stores Needles; a VolumeServer contains one Store. |
Needle | An object in a logical volume, corresponding to a stored file. Needle size is currently limited to 4GB. |
Collection | A set of files that can be spread across multiple logical volumes. If no collection is specified when a file is stored, the default collection "" is used. |
Filer | File manager. The Filer uploads data to the Volume Servers, splits large files into chunks, and stores the chunk metadata in its own metadata store. |
Mount | FUSE mount in user space. When used with a Filer, mount only retrieves file metadata from the Filer; actual file content is transferred directly between the mount client and the volume servers, so multiple Filers are not required. |
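A file id such as 3,01637037d6 is the volume id, a comma, then a hex string holding the needle key and cookie (the exact key/cookie split is internal to SeaweedFS). As a sketch, the volume id can be separated out in plain shell to decide which volume server to contact:

```shell
# Split a SeaweedFS file id into its volume id and key+cookie parts.
# The part before the comma is the volume id; the hex string after it
# encodes the needle key plus a random cookie (internal layout).
fid="3,01637037d6"
volume_id="${fid%%,*}"   # part before the comma
key_cookie="${fid#*,}"   # part after the comma
echo "volume=${volume_id} key_cookie=${key_cookie}"
```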
Use $ ./weed -h to list the available commands.
Use $ ./weed [command] -h to view a command's flags and their descriptions.
Deployment planning:
node | master | volume | filer |
---|---|---|---|
cdh1 | √ | √ | √ |
cdh2 | √ | √ | √ |
cdh3 | √ | √ | √ |
Extract the archive:
$ tar -zxvf ./linux_amd64.tar.gz
This yields the weed binary.
Startup commands:
Create the directories:
$ mkdir seaweedfd_master
$ mkdir seaweedfd_data
Start the masters (one per node):
$ ./weed master -ip cdh1 -maxCpu 1 -mdir ./seaweedfd_master -peers cdh1:9333,cdh2:9333,cdh3:9333 -port 9333 -pulseSeconds 5 -defaultReplication 001
$ ./weed master -ip cdh2 -maxCpu 1 -mdir ./seaweedfd_master -peers cdh1:9333,cdh2:9333,cdh3:9333 -port 9333 -pulseSeconds 5 -defaultReplication 001
$ ./weed master -ip cdh3 -maxCpu 1 -mdir ./seaweedfd_master -peers cdh1:9333,cdh2:9333,cdh3:9333 -port 9333 -pulseSeconds 5 -defaultReplication 001
To avoid split brain, only an odd number of masters is supported.
Run in background: $ nohup ./weed master -ip cdh3 -maxCpu 1 -mdir ./seaweedfd_master -peers cdh1:9333,cdh2:9333,cdh3:9333 -port 9333 -pulseSeconds 5 -defaultReplication 001 > weed_master.out &
With three masters, at least two must be alive for the cluster to serve requests.
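To confirm that the cluster has elected a leader, any master's /cluster/status endpoint can be queried and the Leader field inspected. A sketch, assuming a master reachable at cdh1:9333 and python3 available for JSON parsing:

```shell
# Ask any master for cluster status and print the current Raft leader.
# cdh1:9333 is this deployment's master address; adjust to yours.
status=$(curl -s "http://cdh1:9333/cluster/status")
echo "$status" | python3 -c 'import sys,json; print(json.load(sys.stdin).get("Leader","no leader"))'
```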
Start the volume servers:
$ ./weed volume -dataCenter dc1 -dir ./seaweedfd_data -ip cdh1 -ip.bind cdh1 -maxCpu 1 -mserver cdh1:9333,cdh2:9333,cdh3:9333 -port 9222 -port.public 9222 -publicUrl cdh1 -rack rack1
$ ./weed volume -dataCenter dc1 -dir ./seaweedfd_data -ip cdh2 -ip.bind cdh2 -maxCpu 1 -mserver cdh1:9333,cdh2:9333,cdh3:9333 -port 9222 -port.public 9222 -publicUrl cdh2 -rack rack1
$ ./weed volume -dataCenter dc1 -dir ./seaweedfd_data -ip cdh3 -ip.bind cdh3 -maxCpu 1 -mserver cdh1:9333,cdh2:9333,cdh3:9333 -port 9222 -port.public 9222 -publicUrl cdh3 -rack rack1
dataCenter: name of the data center
rack: name of the rack
Run in background: $ nohup ./weed volume -dataCenter dc1 -dir ./seaweedfd_data -ip cdh1 -ip.bind cdh1 -maxCpu 1 -max 200 -mserver cdh1:9333,cdh2:9333,cdh3:9333 -port 9222 -port.public 9222 -publicUrl cdh1 -rack rack1 > weed_volume.out &
Access master webUI:
http://cdh3:9333/
Upload a directory of files:
$ ./weed upload -dataCenter dc1 -master=cdh3:9333 -dir="./dir/"
Assign a file key:
# Basic usage:
$ curl http://cdh1:9333/dir/assign
# Specify a replication type:
$ curl "http://cdh1:9333/dir/assign?replication=001"
# Reserve multiple file ids:
$ curl "http://cdh1:9333/dir/assign?count=5"
# Specify a data center:
$ curl "http://cdh1:9333/dir/assign?dataCenter=dc1"
Upload file example:
# Get a file key
$ curl "http://cdh1:9333/dir/assign?dataCenter=dc1"
# Returned JSON
{"fid":"2,016beb339d","url":"cdh2:9222","publicUrl":"cdh2","count":1}
# Upload a file to the assigned fid
$ curl -F file=@./file http://cdh2:9222/2,016beb339d
# Returned JSON
{"name":"file","size":41629428}
Get file:
$ curl http://cdh2:9222/2,016beb339d
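The assign/upload/read steps above can be chained into one small script. A sketch assuming the cluster from this deployment (master at cdh1:9333) and python3 for JSON parsing:

```shell
# End-to-end: ask the master for a fid, upload to the returned volume
# server, then read the file back. Requires a running cluster; adjust
# the master host to your deployment.
assign=$(curl -s "http://cdh1:9333/dir/assign?dataCenter=dc1")
fid=$(echo "$assign" | python3 -c 'import sys,json; print(json.load(sys.stdin)["fid"])')
url=$(echo "$assign" | python3 -c 'import sys,json; print(json.load(sys.stdin)["url"])')
curl -s -F file=@./file "http://$url/$fid"        # upload
curl -s -o ./file.downloaded "http://$url/$fid"   # read back
```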
Configure and start the filer:
# View the filer.toml configuration template
$ ./weed scaffold filer
LevelDB is the default metadata store.
# Generate the configuration file
$ ./weed scaffold -config filer -output="."
# Example: use PostgreSQL as the metadata store
# Create the table
=========================================
CREATE TABLE IF NOT EXISTS filemeta (
dirhash BIGINT,
name VARCHAR(1000),
directory VARCHAR(4096),
meta bytea,
PRIMARY KEY (dirhash, name)
);
=========================================
# Configure the [postgres] section in filer.toml
$ vi filer.toml
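The [postgres] section of filer.toml looks roughly like the sketch below. The field names are assumptions that may differ between versions, so verify them against the template printed by `weed scaffold filer`; the host, credentials, and database name are placeholders for your environment:

```toml
[postgres]
# Assumed field names; confirm with `weed scaffold filer` for your version.
enabled = true
hostname = "localhost"     # placeholder: your PostgreSQL host
port = 5432
username = "postgres"      # placeholder credentials
password = ""
database = "seaweedfs"     # the database containing the filemeta table
connection_max_idle = 100
connection_max_open = 100
```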
Start:
$ ./weed filer -master cdh1:9333,cdh2:9333,cdh3:9333 -port 8888 -port.public 8889
Run in background: $ nohup ./weed filer -master cdh1:9333,cdh2:9333,cdh3:9333 -port 8888 > weed_filer.out &
It is recommended to run more than one filer; multiple filers can share one metadata database.
Upload a file:
$ curl -F "[email protected]" "http://cdh1:8888/path/to/sources/"
Access webUI page:
http://cdh1:8888/
Compatible with Hadoop:
# Download the latest version from Maven Central
https://mvnrepository.com/artifact/com.github.chrislusf/seaweedfs-hadoop-client
# Make sure the mapred-site.xml file exists
# Test ls
==================================================================================
../../bin/hdfs dfs -Dfs.defaultFS=seaweedfs://cdh1:8888 \
-Dfs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem \
-libjars ./seaweedfs-hadoop-client-1.0.2.jar \
-ls /
# Output
Found 2 items
drwxrwx--- - 0 2018-12-13 10:29 /path
drwxrwx--- - 0 2018-12-13 14:17 /weed
# Test uploading a file
==================================================================================
../../bin/hdfs dfs -Dfs.defaultFS=seaweedfs://cdh1:8888 \
-Dfs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem \
-libjars ./seaweedfs-hadoop-client-1.0.2.jar \
-put ./slaves /
# Test downloading a directory
==================================================================================
../../bin/hdfs dfs -Dfs.defaultFS=seaweedfs://cdh1:8888 \
-Dfs.seaweedfs.impl=seaweed.hdfs.SeaweedFileSystem \
-libjars ./seaweedfs-hadoop-client-1.0.2.jar \
-get /path
Configure Hadoop:
$ vi core-site.xml
<property>
<name>fs.seaweedfs.impl</name>
<value>seaweed.hdfs.SeaweedFileSystem</value>
</property>
<!-- Optional: set the default FS to the seaweedfs filer address -->
<property>
<name>fs.defaultFS</name>
<value>seaweedfs://cdh1:8888</value>
</property>
# Install the SeaweedFS HDFS client jar on the Hadoop classpath
$ bin/hadoop classpath
$ cp ./seaweedfs-hadoop-client-1.0.2.jar /hadoop/share/hadoop/common/lib/
$ scp ./seaweedfs-hadoop-client-1.0.2.jar cdh2:/hadoop/share/hadoop/common/lib
$ scp ./seaweedfs-hadoop-client-1.0.2.jar cdh3:/hadoop/share/hadoop/common/lib
$ scp ./core-site.xml cdh2:/hadoop/etc/hadoop/
$ scp ./core-site.xml cdh3:/hadoop/etc/hadoop/
# Verify
$ ../../bin/hdfs dfs -ls seaweedfs://cdh3:8888/
# Output
Found 3 items
drwxrwx--- - 0 2018-12-13 10:29 seaweedfs://cdh3:8888/path
-rw-r--r-- 1 dpnice dpnice 15 2018-12-13 14:41 seaweedfs://cdh3:8888/slaves
drwxrwx--- - 0 2018-12-13 14:17 seaweedfs://cdh3:8888/weed
API:
Master Server API:
Assign a file key:
# Basic Usage:
curl http://localhost:9333/dir/assign
# To assign with a specific replication type:
curl "http://localhost:9333/dir/assign?replication=001"
# To specify how many file ids to reserve
curl "http://localhost:9333/dir/assign?count=5"
# To assign a specific data center
curl "http://localhost:9333/dir/assign?dataCenter=dc1"
Look up a volume's address:
curl "http://localhost:9333/dir/lookup?volumeId=3&pretty=y"
{
"locations": [
{
"publicUrl": "localhost:8080",
"url": "localhost:8080"
}
]
}
# Other usages:
# You can actually use the file id to lookup, if you are lazy to parse the file id.
curl "http://localhost:9333/dir/lookup?volumeId=3,01637037d6"
# If you know the collection, specify it since it will be a little faster
curl "http://localhost:9333/dir/lookup?volumeId=3&collection=turbo"
Garbage collection:
curl "http://localhost:9333/vol/vacuum"
curl "http://localhost:9333/vol/vacuum?garbageThreshold=0.4"
Garbage collection creates new copies of the .dat and .idx files, skipping deleted entries, and then swaps them in for the originals.
garbageThreshold is optional; it sets the ratio of deleted space that triggers compaction.
Pre-allocate volumes:
# specify a specific replication
curl "http://localhost:9333/vol/grow?replication=000&count=4"
{"count":4}
# specify a collection
curl "http://localhost:9333/vol/grow?collection=turbo&count=4"
# specify data center
curl "http://localhost:9333/vol/grow?dataCenter=dc1&count=4"
# specify ttl
curl "http://localhost:9333/vol/grow?ttl=5d&count=4"
count specifies how many empty volumes to create.
To delete a collection:
# delete a collection
curl "http://localhost:9333/col/delete?collection=benchmark&pretty=y"
Check the system status:
# Cluster status
curl "http://10.0.2.15:9333/cluster/status?pretty=y"
{
"IsLeader": true,
"Leader": "10.0.2.15:9333",
"Peers": [
"10.0.2.15:9334",
"10.0.2.15:9335"
]
}
# Topology status
curl "http://localhost:9333/dir/status?pretty=y"
{
"Topology": {
"DataCenters": [
{
"Free": 567,
"Id": "dc1",
"Max": 600,
"Racks": [
{
"DataNodes": [
{
"Free": 190,
"Max": 200,
"PublicUrl": "cdh2",
"Url": "cdh2:9222",
"Volumes": 10
},
{
"Free": 190,
"Max": 200,
"PublicUrl": "cdh1",
"Url": "cdh1:9222",
"Volumes": 10
},
{
"Free": 187,
"Max": 200,
"PublicUrl": "cdh3",
"Url": "cdh3:9222",
"Volumes": 13
}
],
"Free": 567,
"Id": "rack1",
"Max": 600
}
]
}
],
"Free": 567,
"Max": 600,
"layouts": [
{
"collection": "",
"replication": "001",
"ttl": "5d",
"writables": [
15,
16,
17,
18
]
},
{
"collection": "",
"replication": "000",
"ttl": "",
"writables": [
13,
14,
10,
11,
12,
19,
20,
21,
22
]
},
{
"collection": "",
"replication": "001",
"ttl": "",
"writables": [
6,
3,
7,
2,
4,
5
]
},
{
"collection": "turbo",
"replication": "001",
"ttl": "",
"writables": [
8,
9
]
}
]
},
"Version": "1.11"
}
Volume Server API:
# Upload a file
curl -F file=@/home/chris/myphoto.jpg http://127.0.0.1:8080/3,01637037d6
The file key must first be assigned by the master.
# Upload directly; the master assigns the key (note: this is the master's port)
curl -F file=@/home/chris/myphoto.jpg http://localhost:9333/submit
{"fid":"3,01fbe0dc6f1f38","fileName":"myphoto.jpg","fileUrl":"localhost:8080/3,01fbe0dc6f1f38","size":68231}
# Delete a file
curl -X DELETE http://127.0.0.1:8080/3,01637037d6
# View the chunk manifest of a large chunked file
curl http://127.0.0.1:8080/3,01637037d6?cm=false
# Check the Volume Server status
curl "http://localhost:8080/status?pretty=y"
{
"Version": "0.34",
"Volumes": [
{
"Id": 1,
"Size": 1319688,
"RepType": "000",
"Version": 2,
"FileCount": 276,
"DeleteCount": 0,
"DeletedByteCount": 0,
"ReadOnly": false
},
{
"Id": 2,
"Size": 1040962,
"RepType": "000",
"Version": 2,
"FileCount": 291,
"DeleteCount": 0,
"DeletedByteCount": 0,
"ReadOnly": false
},
{
"Id": 3,
"Size": 1486334,
"RepType": "000",
"Version": 2,
"FileCount": 301,
"DeleteCount": 2,
"DeletedByteCount": 0,
"ReadOnly": false
},
{
"Id": 4,
"Size": 8953592,
"RepType": "000",
"Version": 2,
"FileCount": 320,
"DeleteCount": 2,
"DeletedByteCount": 0,
"ReadOnly": false
},
{
"Id": 5,
"Size": 70815851,
"RepType": "000",
"Version": 2,
"FileCount": 309,
"DeleteCount": 1,
"DeletedByteCount": 0,
"ReadOnly": false
},
{
"Id": 6,
"Size": 1483131,
"RepType": "000",
"Version": 2,
"FileCount": 301,
"DeleteCount": 1,
"DeletedByteCount": 0,
"ReadOnly": false
},
{
"Id": 7,
"Size": 46797832,
"RepType": "000",
"Version": 2,
"FileCount": 292,
"DeleteCount": 0,
"DeletedByteCount": 0,
"ReadOnly": false
}
]
}
Filer Server API:
# Basic Usage:
# create or overwrite the file, the directories /path/to will be automatically created
curl -F [email protected] "http://localhost:8888/path/to"
{"name":"report.js","size":866,"fid":"7,0254f1f3fd","url":"http://localhost:8081/7,0254f1f3fd"}
# get the file content
curl "http://localhost:8888/javascript/report.js"
# upload the file with a different name
curl -F [email protected] "http://localhost:8888/javascript/new_name.js"
{"name":"report.js","size":866,"fid":"3,034389657e","url":"http://localhost:8081/3,034389657e"}
# list all files under /javascript/
curl -H "Accept: application/json" "http://localhost:8888/javascript/?pretty=y"
{
"Directory": "/javascript/",
"Files": [
{
"name": "new_name.js",
"fid": "3,034389657e"
},
{
"name": "report.js",
"fid": "7,0254f1f3fd"
}
],
"Subdirectories": null
}
# Paginated file listing
curl "http://localhost:8888/javascript/?pretty=y&lastFileName=new_name.js&limit=2"
{
"Directory": "/javascript/",
"Files": [
{
"name": "report.js",
"fid": "7,0254f1f3fd"
}
]
}
# Delete a file
curl -X DELETE "http://localhost:8888/javascript/report.js"