One, GlusterFS overview
1.1 Introduction to GlusterFS
Open source distributed file system
Composed of storage servers, clients, and an NFS/Samba storage gateway
No metadata server
1.2 Features of GlusterFS
Scalability and high performance
High availability
Global unified namespace
Flexible volume management
Based on standard protocol
1.3 Commonly used terms
Brick: the basic storage unit in GFS, i.e. an export directory on a server in the trusted storage pool. It is identified by host name and directory name, such as 'SERVER:EXPORT'
Volume: a logical collection of Bricks, and the basic unit of management operations
FUSE (Filesystem in Userspace): a loadable kernel module that lets non-privileged users create their own file systems without modifying kernel code. The file system code runs in user space, and the FUSE module bridges it to the kernel.
VFS: Virtual File System, the kernel interface through which applications access file systems
Glusterd: the Gluster management daemon, which must run on all servers in the trusted storage pool.
1.4 GFS architecture and the elastic HASH algorithm
■ Modular, stacked architecture
Complex functions are realized by combining modules.
■ Elastic HASH algorithm
A 32-bit integer is obtained with the HASH algorithm, and its range is divided into N continuous subspaces; each subspace corresponds to one Brick.
Advantages of the elastic HASH algorithm:
It ensures that data is evenly distributed across the Bricks.
It removes the dependence on a metadata server, eliminating the single point of failure and the access bottleneck.
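The idea can be sketched with ordinary shell tools. This is illustrative only: real GlusterFS uses its own 32-bit hash with contiguous ranges assigned to bricks via directory extended attributes, while here `cksum` stands in for the hash and a modulo stands in for range partitioning:

```shell
#!/bin/bash
# Sketch of hash-based file placement; cksum substitutes for the real
# GlusterFS hash, and modulo substitutes for range partitioning.
BRICKS=4                                            # N bricks -> N hash subspaces
for f in 1.log 2.log 3.log 4.log 5.log; do
    h=$(printf '%s' "$f" | cksum | cut -d' ' -f1)   # 32-bit hash of the file name
    echo "$f -> Brick$(( h % BRICKS ))"             # subspace the hash falls into
done
```

Because placement depends only on the name's hash, any client can locate a file by computing the same hash, with no metadata server lookup.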
Two, GlusterFS working principle
1. A client or application accesses data through the GlusterFS mount point
2. The Linux kernel receives the request and processes it through the VFS API
3. The VFS hands the data to the FUSE kernel module, which passes it to the GlusterFS client process through the /dev/fuse device file
4. After receiving the data, the GlusterFS client processes it according to its configuration file
5. The data is transferred over the network to the remote GlusterFS server and written to the server's storage device
Three, GlusterFS volume types
Distributed volume
Striped volume
Replicated volume
Distributed striped volume
Distributed replicated volume
Striped replicated volume
Distributed striped replicated volume
3.1 Distributed volume
Files are not divided into blocks.
The HASH value is saved in extended file attributes.
Supported underlying file systems include EXT3, EXT4, ZFS, XFS, etc.
Characteristics of distributed volumes:
Files are distributed across different servers, with no redundancy.
It is easier and cheaper to expand the size of the volume.
A single point of failure causes data loss.
It relies on the underlying layer for data protection.
Command to create a distributed volume:
gluster volume create dis-volume server1:/dir1 server2:/dir2
3.2 Striped volume
A file is divided into N blocks by offset (N = number of stripe nodes) and stored round-robin across the Brick Server nodes.
Performance is particularly outstanding when storing large files.
No redundancy; similar to RAID 0.
Characteristics of striped volumes:
Data is divided into smaller pieces and distributed across the Bricks in the server group.
Distribution reduces the load, and the smaller pieces speed up access.
No data redundancy.
Command to create a striped volume:
gluster volume create stripe-volume stripe 2 transport tcp server1:/dir1 server2:/dir2
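The effect of 2-way striping can be simulated locally (this is not the real GlusterFS data path, just an illustration of a file being cut into equal chunks, one per "brick"):

```shell
# Simulate 2-way striping: cut a file into two equal chunks.
dd if=/dev/zero of=big.log bs=1024 count=40 2>/dev/null   # 40 KB stands in for 40 MB
split -n 2 big.log chunk_       # GNU split: chunk_aa and chunk_ab, one per stripe node
ls -l chunk_aa chunk_ab         # each chunk is half of the original file
```

This mirrors what the test in section 5 shows later: each brick of the striped volume holds 20M halves of each 40M file.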
3.3 Replicated volume
Keeps one or more copies of the same file.
Because full copies must be stored, disk utilization is low.
If the storage capacities of the nodes differ, the smallest node's capacity becomes the volume's total capacity (barrel effect).
Characteristics of replicated volumes:
Every server in the volume stores a complete copy.
The number of replicas is chosen when the volume is created.
At least two Brick servers are required.
Provides redundancy.
Command to create a replicated volume:
gluster volume create rep-volume replica 2 transport tcp server1:/dir1 server2:/dir2
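A minimal local sketch of what replica 2 means (again not the real data path: two directories stand in for two bricks), plus the barrel-effect capacity rule with hypothetical brick sizes:

```shell
# Sketch of replica 2: both "bricks" hold a full, identical copy.
mkdir -p brick1 brick2
printf 'same data\n' > brick1/a.txt
cp brick1/a.txt brick2/a.txt           # the client writes to every replica
cmp -s brick1/a.txt brick2/a.txt && echo "replicas identical"
# Barrel effect: usable capacity = smallest brick in the replica set.
b1=20; b2=10                           # hypothetical brick sizes in GB
echo "volume capacity: $(( b1 < b2 ? b1 : b2 )) GB"   # prints 10 GB
```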
3.4 Distributed striped volume
Combines the functions of distributed volumes and striped volumes.
Mainly used for processing access to large files.
At least 4 servers are required.
Command to create a distributed striped volume:
gluster volume create dis-stripe stripe 2 transport tcp server1:/dir1 server2:/dir2 server3:/dir3 server4:/dir4
3.5 Distributed replicated volume
Combines the functions of distributed volumes and replicated volumes.
Used when redundancy is required. Command to create a distributed replicated volume:
gluster volume create dis-rep replica 2 transport tcp server1:/dir1 server2:/dir2 server3:/dir3 server4:/dir4
3.6 Command supplement
When deleting a volume, no host in the trusted pool may be down, otherwise the deletion will not succeed.
gluster volume stop dis-vol     # answer "y" at the confirmation prompt
gluster volume delete dis-vol
####### Access control ##########
# reject only this address
gluster volume set dis-vol auth.reject 20.0.0.25
# allow only this address
gluster volume set dis-vol auth.allow 20.0.0.25
Four, GlusterFS platform deployment
4.1 Experimental environment
VMware software
A CentOS 7.4 virtual machine, IP address 20.0.0.21, host name node1, with 4 added disks of 20 GB each
A CentOS 7.4 virtual machine, IP address 20.0.0.22, host name node2, with 4 added disks of 20 GB each
A CentOS 7.4 virtual machine, IP address 20.0.0.23, host name node3, with 4 added disks of 20 GB each
A CentOS 7.4 virtual machine, IP address 20.0.0.24, host name node4, with 4 added disks of 20 GB each
A CentOS 7.4 virtual machine, IP address 20.0.0.25, host name client, used as the test machine
The firewall and SELinux (core protection) are turned off
4.2 Deployment configuration
Perform the following steps on every node; node1 (20.0.0.21) is used as the example here
[root@node1 ~]# vi /etc/hosts
20.0.0.21 node1
20.0.0.22 node2
20.0.0.23 node3
20.0.0.24 node4
Change the host name (run the corresponding command on each node)
hostnamectl set-hostname node1
hostnamectl set-hostname node2
hostnamectl set-hostname node3
hostnamectl set-hostname node4
Create a script that automatically partitions, formats, and permanently mounts the new disks
[root@node1 ~]# vi gsf.sh
#!/bin/bash
# Partition, format, and permanently mount every added /dev/sd[b-z] disk
for V in $(ls /dev/sd[b-z])
do
echo -e "n\np\n\n\n\nw\n" | fdisk $V             # create one primary partition
mkfs.xfs -i size=512 ${V}1 &>/dev/null           # format the new partition as XFS
sleep 1
M=$(echo "$V" | awk -F "/" '{print $3}')         # device name, e.g. sdb
mkdir -p /data/${M}1 &>/dev/null                 # mount point, e.g. /data/sdb1
echo -e "${V}1 /data/${M}1 xfs defaults 0 0\n" >> /etc/fstab   # mount at boot
mount -a &>/dev/null
done
[root@node1 ~]# chmod +x gsf.sh
[root@node1 ~]# ./gsf.sh
[root@node1 ~]# scp gsf.sh 20.0.0.22:/
[root@node1 ~]# scp gsf.sh 20.0.0.23:/
[root@node1 ~]# scp gsf.sh 20.0.0.24:/
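If the script ran on a node with the four added disks (sdb through sde), /etc/fstab should gain one entry per disk, along the lines of:

```shell
# Expected /etc/fstab additions after gsf.sh runs (one line per new disk)
/dev/sdb1 /data/sdb1 xfs defaults 0 0
/dev/sdc1 /data/sdc1 xfs defaults 0 0
/dev/sdd1 /data/sdd1 xfs defaults 0 0
/dev/sde1 /data/sde1 xfs defaults 0 0
```

`df -h /data/sd*1` can then confirm that all four partitions are mounted.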
Set up a network yum repository
[root@node1 abc]# cd /etc/yum.repos.d/
[root@node1 yum.repos.d]# mkdir bak
[root@node1 yum.repos.d]# mv C* bak/
[root@node1 yum.repos.d]# vim local.repo
[centos]
name=CentOS
baseurl=http://mirror.centos.org/centos/$releasever/storage/$basearch/gluster-3.12/
gpgcheck=0
enabled=1
[root@node1 yum.repos.d]# yum clean all
[root@node1 yum.repos.d]# yum list
[root@node1 yum.repos.d]# yum -y install glusterfs glusterfs-server glusterfs-fuse glusterfs-rdma
[root@node1 yum.repos.d]# ntpdate ntp1.aliyun.com
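One step the transcript does not show explicitly: the glusterd service must be running on every node before peers can be probed. On CentOS 7 with the glusterfs-server package installed, that would be:

```shell
# Start glusterd now and enable it at boot (run on every node)
systemctl start glusterd
systemctl enable glusterd
systemctl status glusterd    # should report active (running)
```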
4.3 Add the other nodes to the trusted pool (run from any one node)
[root@node1 ~]# gluster peer probe node2   ## add the node to the trusted pool: peer probe + node name
peer probe: success.
[root@node1 ~]# gluster peer probe node3
peer probe: success.
[root@node1 ~]# gluster peer probe node4
peer probe: success.
View node pool
[root@node1 ~]# gluster peer status
Number of Peers: 3
Hostname: node2
Uuid: 542a7be9-1a0c-43be-89db-57fe4db5a56f
State: Peer in Cluster (Connected)
Hostname: node3
Uuid: 2ca567f1-e92e-4215-9b09-e6c5e1f08f35
State: Peer in Cluster (Connected)
Hostname: node4
Uuid: 9110ff49-ab25-45d0-85fb-ad67fc266d7c
State: Peer in Cluster (Connected)
Running gluster peer status on any other node shows the same three peers.
4.4 Create a distributed volume
[root@node2 ~]# gluster volume create dis-vol node1:/data/sdb1 node2:/data/sdb1 force
volume create: dis-vol: success: please start the volume to access data
View the details of the distributed volume
[root@node2 ~]# gluster volume info dis-vol
Volume Name: dis-vol
Type: Distribute
Volume ID: 028c2554-a6d6-48cd-a3ad-8778998c42da
Status: Created
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: node1:/data/sdb1
Brick2: node2:/data/sdb1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
Start the distributed volume
[root@node2 ~]# gluster volume start dis-vol
volume start: dis-vol: success
Check that it is now in the started state
[root@node2 ~]# gluster volume info dis-vol
Volume Name: dis-vol
Type: Distribute
Volume ID: 028c2554-a6d6-48cd-a3ad-8778998c42da
Status: Started
Snapshot Count: 0
4.5 Create a striped volume
[root@node2 ~]# gluster volume create stripe-vol stripe 2 node1:/data/sdc1 node2:/data/sdc1 force
volume create: stripe-vol: success: please start the volume to access data
[root@node2 ~]# gluster volume info stripe-vol
Volume Name: stripe-vol
Type: Stripe
Volume ID: 4b9fe354-a14c-4cfa-9dbc-b887cf101d7c
Status: Created
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: node1:/data/sdc1
Brick2: node2:/data/sdc1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
[root@node2 ~]# gluster volume start stripe-vol
volume start: stripe-vol: success
[root@node2 ~]# gluster volume info stripe-vol
Volume Name: stripe-vol
Type: Stripe
Volume ID: 4b9fe354-a14c-4cfa-9dbc-b887cf101d7c
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
4.6 Create a replicated volume
[root@node2 ~]# gluster volume create rep-vol replica 2 node3:/data/sdb1 node4:/data/sdb1 force
volume create: rep-vol: success: please start the volume to access data
[root@node2 ~]# gluster volume info rep-vol
Volume Name: rep-vol
Type: Replicate
Volume ID: bb87f9dc-8260-44b8-8ba3-53aab9ae10be
Status: Created
Snapshot Count: 0
Xlator 1: BD
Capability 1: thin
Capability 2: offload_copy
Capability 3: offload_snapshot
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: node3:/data/sdb1
Brick1 VG:
Brick2: node4:/data/sdb1
Brick2 VG:
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
[root@node2 ~]# gluster volume start rep-vol
volume start: rep-vol: success
4.7 Create a distributed striped volume
[root@node2 ~]# gluster volume create dis-stripe stripe 2 node1:/data/sdd1 node2:/data/sdd1 node3:/data/sdd1 node4:/data/sdd1 force
volume create: dis-stripe: success: please start the volume to access data
[root@node2 ~]# gluster volume info dis-stripe
Volume Name: dis-stripe
Type: Distributed-Stripe
Volume ID: 37af4c7c-4dcc-47a6-89b7-91443343b0a0
Status: Created
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: node1:/data/sdd1
Brick2: node2:/data/sdd1
Brick3: node3:/data/sdd1
Brick4: node4:/data/sdd1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
[root@node2 ~]# gluster volume start dis-stripe
volume start: dis-stripe: success
[root@node2 ~]# gluster volume info dis-stripe
Volume Name: dis-stripe
Type: Distributed-Stripe
Volume ID: 37af4c7c-4dcc-47a6-89b7-91443343b0a0
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: node1:/data/sdd1
Brick2: node2:/data/sdd1
Brick3: node3:/data/sdd1
Brick4: node4:/data/sdd1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
4.8 Create a distributed replicated volume
[root@node2 ~]# gluster volume create dis-rep replica 2 node1:/data/sde1 node2:/data/sde1 node3:/data/sde1 node4:/data/sde1 force
volume create: dis-rep: success: please start the volume to access data
[root@node2 ~]# gluster volume info dis-rep
Volume Name: dis-rep
Type: Distributed-Replicate
Volume ID: 12e0a204-b09d-427e-a43d-743fd709a096
Status: Created
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: node1:/data/sde1
Brick2: node2:/data/sde1
Brick3: node3:/data/sde1
Brick4: node4:/data/sde1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
[root@node2 ~]# gluster volume start dis-rep
volume start: dis-rep: success
[root@node2 ~]# gluster volume info dis-rep
Volume Name: dis-rep
Type: Distributed-Replicate
Volume ID: 12e0a204-b09d-427e-a43d-743fd709a096
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: node1:/data/sde1
Brick2: node2:/data/sde1
Brick3: node3:/data/sde1
Brick4: node4:/data/sde1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
Five, Test
The test machine (20.0.0.25) only needs these two packages
[root@localhost ~]# yum -y install glusterfs glusterfs-fuse
[root@localhost ~]# vi /etc/hosts
20.0.0.21 node1
20.0.0.22 node2
20.0.0.23 node3
20.0.0.24 node4
[root@localhost ~]# mkdir -p /test/dis
[root@localhost ~]# mkdir -p /test/strip
[root@localhost ~]# mkdir -p /test/rep
[root@localhost ~]# mkdir -p /test/dis_stripe
[root@localhost ~]# mkdir -p /test/dis_rep
[root@localhost ~]# mount.glusterfs node1:dis-vol /test/dis
[root@localhost ~]# mount.glusterfs node2:stripe-vol /test/strip
[root@localhost ~]# mount.glusterfs node3:rep-vol /test/rep
[root@localhost ~]# mount.glusterfs node4:dis-stripe /test/dis_stripe
[root@localhost ~]# mount.glusterfs node1:dis-rep /test/dis_rep
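These mount.glusterfs mounts do not survive a reboot. To make them permanent, /etc/fstab entries like the following could be added on the client (a sketch; `_netdev` delays mounting until the network is up):

```shell
# Optional /etc/fstab entries for permanent GlusterFS client mounts
node1:dis-vol     /test/dis        glusterfs  defaults,_netdev  0 0
node2:stripe-vol  /test/strip      glusterfs  defaults,_netdev  0 0
node3:rep-vol     /test/rep        glusterfs  defaults,_netdev  0 0
node4:dis-stripe  /test/dis_stripe glusterfs  defaults,_netdev  0 0
node1:dis-rep     /test/dis_rep    glusterfs  defaults,_netdev  0 0
```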
Create five 40 MB files
[root@localhost opt]# dd if=/dev/zero of=/opt/1.log bs=1M count=40
[root@localhost opt]# dd if=/dev/zero of=/opt/2.log bs=1M count=40
[root@localhost opt]# dd if=/dev/zero of=/opt/3.log bs=1M count=40
[root@localhost opt]# dd if=/dev/zero of=/opt/4.log bs=1M count=40
[root@localhost opt]# dd if=/dev/zero of=/opt/5.log bs=1M count=40
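To confirm each file really is 40 MB (40 × 1,048,576 = 41,943,040 bytes), a quick check can be run the same way (using a temporary path here rather than /opt):

```shell
# Verify the size dd produces with bs=1M count=40
dd if=/dev/zero of=/tmp/check.log bs=1M count=40 2>/dev/null
stat -c%s /tmp/check.log    # 41943040 bytes = 40 MiB
```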
Copy the five files into each of the mount points just created
[root@localhost opt]# cp * /test/dis
[root@localhost opt]# cp * /test/strip/
[root@localhost opt]# cp * /test/rep/
[root@localhost opt]# cp * /test/dis_stripe/
[root@localhost opt]# cp * /test/dis_rep/
View the distributed volume
[root@node1 ~]# cd /data/sdb1/
[root@node1 sdb1]# ll -h
total 160M
-rw-r--r-- 2 root root 40M Oct 27 13:28 1.log
-rw-r--r-- 2 root root 40M Oct 27 13:28 3.log
-rw-r--r-- 2 root root 40M Oct 27 13:28 4.log
-rw-r--r-- 2 root root 40M Oct 27 13:28 5.log
[root@node2 ~]# cd /data/sdb1/
[root@node2 sdb1]# ll -h
total 40M
-rw-r--r-- 2 root root 40M Oct 28 01:28 2.log
View the striped volume
[root@node1 sdb1]# cd /data/
[root@node1 data]# ll -h sdc1/
total 100M
-rw-r--r-- 2 root root 20M Oct 27 13:28 1.log
-rw-r--r-- 2 root root 20M Oct 27 13:28 2.log
-rw-r--r-- 2 root root 20M Oct 27 13:28 3.log
-rw-r--r-- 2 root root 20M Oct 27 13:28 4.log
-rw-r--r-- 2 root root 20M Oct 27 13:28 5.log
[root@node2 sdb1]# cd /data/
[root@node2 data]# ll -h sdc1/
total 100M
-rw-r--r-- 2 root root 20M Oct 28 01:28 1.log
-rw-r--r-- 2 root root 20M Oct 28 01:28 2.log
-rw-r--r-- 2 root root 20M Oct 28 01:28 3.log
-rw-r--r-- 2 root root 20M Oct 28 01:28 4.log
-rw-r--r-- 2 root root 20M Oct 28 01:28 5.log
Start a destructive test to verify the volumes
Run init 0 on node1 to shut it down, then view the results from the test machine 20.0.0.25
View the distributed volume
[root@localhost test]# ls -lh dis/
total 40M
-rw-r--r-- 1 root root 40M Oct 27 13:28 2.log
View the striped volume
[root@localhost test]# ll
ls: cannot access strip: Transport endpoint is not connected
total 16
drwxr-xr-x 3 root root 4096 Oct 27 13:28 dis
drwxr-xr-x 3 root root 4096 Oct 27 13:28 dis_rep
drwxr-xr-x 3 root root 4096 Oct 27 13:28 dis_stripe
drwxr-xr-x 3 root root 4096 Oct 27 13:28 rep
d????????? ? ? ? ? ? strip
The test behaves as expected: with node1 down, the distributed volume still serves the file stored on node2, while the striped volume becomes entirely inaccessible.