GFS distributed file system deployment

1. Introduction to GlusterFS

GlusterFS is an open source distributed file system. It is composed of storage servers, clients (the default GFS mount service; the RDMA protocol is also supported), and an NFS/Samba storage gateway (the corresponding protocol must be enabled before use).
There is no metadata server (metadata attribute information is not stored centrally).

1.1 GlusterFS workflow

1. The client or application accesses the data through a GlusterFS mount point (VFS).
2. The Linux kernel receives the request and processes it through the VFS API.
3. The VFS hands the data to the FUSE kernel module, and FUSE passes it to the GlusterFS client process through the /dev/fuse device file.
4. After receiving the data, the GlusterFS client processes it according to its configuration file.
5. The client transfers the data over the network to the remote GlusterFS server, which writes it to the server's storage device.

1.2 Elastic HASH algorithm

1. The HASH algorithm produces a 32-bit integer for each file.
2. That 32-bit range is divided into N contiguous subspaces, and each subspace corresponds to one Brick.
3. Advantages of the elastic HASH algorithm:
it ensures that data is evenly distributed across the Bricks, and it removes the dependence on a metadata server, eliminating that single point of failure and access bottleneck.

A GlusterFS volume with four Brick nodes divides the range of 2 to the 32nd power evenly among the four Bricks.
When a file is accessed, its HASH value (key) is calculated to determine which Brick's subrange, and therefore which storage space, it falls into.
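
As a rough illustration (this is not GlusterFS's real hash function, just a sketch of the idea), the mapping from file name to Brick can be mimicked in the shell: hash the name into the 32-bit range and see which of the four equal subranges it falls into.

file="test1.log"
hash=$(( 0x$(echo -n "$file" | md5sum | cut -c1-8) ))   # stand-in 32-bit hash value (0 .. 2^32-1)
range=$(( 4294967296 / 4 ))                             # 2^32 divided into 4 Brick subranges
echo "$file -> Brick $(( hash / range + 1 ))"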

2. Environment deployment

IP addresses of the devices:

Device   IP address
node1    20.0.0.31
node2    20.0.0.32
node3    20.0.0.33
node4    20.0.0.34
client   20.0.0.35

2.1 Disk deployment

Format the disks and mount them locally (this must be done on all node machines)

mkfs.ext4 /dev/sdx      # format the disk
mkdir {/b,/c,/d,/e}     # create the mount directories
mount /dev/sdx /x       # mount the disk
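
For example, on node1, assuming the four data disks appear as /dev/sdb through /dev/sde (the device names are an assumption; adjust them to your environment), the three steps above expand to:

mkfs.ext4 /dev/sdb && mkdir -p /b && mount /dev/sdb /b
mkfs.ext4 /dev/sdc && mkdir -p /c && mount /dev/sdc /c
mkfs.ext4 /dev/sdd && mkdir -p /d && mount /dev/sdd /d
mkfs.ext4 /dev/sde && mkdir -p /e && mount /dev/sde /e

Note that mounts made this way do not survive a reboot; add matching entries to /etc/fstab if they should be permanent.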


2.2 Address mapping

Bind host names to IP addresses by editing the hosts file (this must be done on all nodes, including the client)

vi /etc/hosts
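
Based on the address table above, the entries added to /etc/hosts on every machine would look like this:

20.0.0.31 node1
20.0.0.32 node2
20.0.0.33 node3
20.0.0.34 node4
20.0.0.35 client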


2.3 Time synchronization

First, make sure all nodes can reach the public network and resolve the time server's domain name (this must be done on all nodes)

Synchronize with the Alibaba Cloud time server

ntpdate ntp1.aliyun.com

Create a scheduled task (optional)

crontab -e

Synchronize with the Alibaba Cloud time server every 30 minutes

*/30 * * * * /usr/sbin/ntpdate ntp1.aliyun.com

2.4 SSH password-free login

Set up key-based login so that node1 can log in to the other nodes without a password

ssh-keygen -t rsa     # generate the key pair
ssh-copy-id 20.0.0.32
ssh-copy-id 20.0.0.33
ssh-copy-id 20.0.0.34
ssh-copy-id 20.0.0.35
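
A quick check (optional): running a command on another node from node1 should now complete without a password prompt, for example:

ssh node2 'hostname'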

3. Install the software

(This must be installed on all 4 node machines)

3.1 yum source configuration

1. Build a local yum repository: prepare the GlusterFS rpm packages, upload the archive, and unzip it

unzip gfsrepo.zip


2. Create a new file ending in ".repo" in the /etc/yum.repos.d directory

/etc/yum.repos.d/gfs.repo
[gfs]
name=gfs
baseurl=file:///gfsrepo
gpgcheck=0
enabled=1
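
If you prefer to create the file from the command line instead of an editor, a heredoc is one way to write the same content (this assumes gfsrepo was unzipped to /gfsrepo, as the baseurl implies):

cat > /etc/yum.repos.d/gfs.repo <<'EOF'
[gfs]
name=gfs
baseurl=file:///gfsrepo
gpgcheck=0
enabled=1
EOF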

3. Rebuild the yum cache

yum clean all
yum makecache


3.2 yum installation

yum -y install glusterfs glusterfs-server glusterfs-fuse glusterfs-rdma

glusterfs: the GlusterFS software itself
glusterfs-server: the server-side service
glusterfs-fuse: FUSE file system support (used for client mounts)
glusterfs-rdma: RDMA protocol support

Start the service

systemctl start glusterd.service    # start the service
systemctl enable glusterd.service   # enable the service at boot
systemctl status glusterd.service   # check the service status
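
The steps above go straight from starting glusterd to creating volumes; before bricks on different nodes can be combined into one volume, the nodes must belong to the same trusted storage pool. If that has not been done yet, it can be set up from node1 (these commands are not shown in the original screenshots):

gluster peer probe node2
gluster peer probe node3
gluster peer probe node4
gluster peer status     # the other three nodes should show "State: Peer in Cluster (Connected)"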


4. Create volumes

Common commands (run on node1)

gluster volume info <volume name>      # view detailed information about a volume
gluster volume start <volume name>     # start a volume
gluster volume status <volume name>    # view the status of a volume

4.1 Create a distributed volume

gluster volume create dis-vol node1:/b node2:/b force
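
A newly created volume is in the "Created" state and must be started before clients can mount it; the same applies to every volume created in the following subsections:

gluster volume start dis-vol
gluster volume info dis-vol     # Status should now read "Started"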


4.2 Create a striped volume

Use /c on node1 and /c on node2 to create a striped volume

gluster volume create stripe-vol stripe 2 node1:/c node2:/c force


4.3 Create a replicated volume

gluster volume create rep-vol replica 2 node3:/b node4:/b force


4.4 Create a distributed striped volume

gluster volume create dis-stripe stripe 2 node1:/d node2:/d node3:/d node4:/d force


4.5 Create a distributed replicated volume

gluster volume create dis-replica replica 2 node1:/e node2:/e node3:/e node4:/e force


4.6 View the list of created volumes

gluster volume list


5. Client testing

5.1 Client mount

Install the GlusterFS client packages (command-line tools and FUSE file system support) with yum

yum -y install glusterfs glusterfs-fuse

Create the mount points

mkdir {/dis,/stripe,/replica,/dis-stripe,/dis-replica}

Mount

mount.glusterfs node1:dis-vol /dis
mount.glusterfs node1:rep-vol /replica
mount.glusterfs node1:stripe-vol /stripe
mount.glusterfs node1:dis-stripe /dis-stripe
mount.glusterfs node1:dis-replica /dis-replica

Check that all five volumes are mounted.
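
One way to do this (the original shows a screenshot instead) is with df:

df -hT     # the five GlusterFS mounts should appear with type fuse.glusterfs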

5.2 Testing

Create 5 files on the client for testing

dd if=/dev/zero of=/test1.log bs=40M count=1
dd if=/dev/zero of=/test2.log bs=40M count=1
dd if=/dev/zero of=/test3.log bs=40M count=1
dd if=/dev/zero of=/test4.log bs=40M count=1
dd if=/dev/zero of=/test5.log bs=40M count=1

Copy files to several directories that were previously mounted

[root@client /]# cp test* /dis
[root@client /]# cp test* /dis-replica/
[root@client /]# cp test* /dis-stripe/
[root@client /]# cp test* /stripe/
[root@client /]# cp test* /replica/


5.3 View each node

Cluster environment: each node has 4 disks of 20G each (for example, node1's /b has a capacity of 20G)

Volume name    Volume type                     Space size    Bricks (mount point in brackets)
dis-vol        Distributed volume              40G           node1(/b), node2(/b)
stripe-vol     Striped volume                  40G           node1(/c), node2(/c)
rep-vol        Replicated volume               20G           node3(/b), node4(/b)
dis-stripe     Distributed striped volume      80G           node1(/d), node2(/d), node3(/d), node4(/d)
dis-replica    Distributed replicated volume   40G           node1(/e), node2(/e), node3(/e), node4(/e)


5.4 Failure test
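
The original does not show how the failures were produced; one simple way to simulate a failed node (an assumption, not part of the original steps) is to stop its Gluster processes, or simply power the node off:

systemctl stop glusterd.service    # on the node being "failed", e.g. node2
pkill glusterfsd                   # also stop the brick processes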

1. Distributed volume failure test

When one of the nodes (node2) of the distributed volume fails, some files are lost (only part of the files can still be displayed).

2. Striped volume failure test

When one of the nodes (node2) of the striped volume fails, all files are lost (no files can be displayed, because every file is striped across both nodes).

3. Replicated volume failure test

When one of the nodes (node3) of the replicated volume fails, no files are lost (the surviving node holds a full copy of every file).

4. Distributed striped volume failure test

When one of the nodes (node2) of the distributed striped volume fails, some files are lost (only part of the files can still be displayed).
Explanation: node1 and node2 together hold 3 of the 5 files, and each of those 3 files is striped across node1 and node2, so they become unreadable when node2 fails.

5. Distributed replicated volume failure test

① Fail any one of the disks

When one of the nodes (node2) of the distributed replicated volume fails, no files are lost (all files can still be displayed).
Explanation: node1 and node2 hold 3 of the 5 files, and node1 and node2 each store a full copy of those 3 files, so node1 still serves them after node2 fails.

② Fail two specific disks

When two nodes (node2 and node3) of the distributed replicated volume fail, no files are lost (all files can still be displayed), because node2 and node3 belong to different replica pairs and their partners (node1 and node4) still hold complete copies.

Summary:

Volume type                      Space size                            Fault                                      Result
Distributed volume               same as the total physical capacity   one of the two nodes goes down             some files are lost
Striped volume                   same as the total physical capacity   one of the two nodes goes down             all files are lost
Replicated volume                only half of the physical capacity    one of the two nodes goes down             no files are lost
Distributed striped volume       same as the total physical capacity   one of the four nodes goes down            some files are lost
Distributed replicated volume    only half of the physical capacity    one of the four nodes goes down            no files are lost
Distributed replicated volume    only half of the physical capacity    two specific nodes of the four go down     no files are lost


Origin blog.csdn.net/weixin_50345511/article/details/112218384