Replacing a Failed Brick in a GlusterFS Distributed Storage System: Operation Notes (1)

An earlier post covered deploying a GlusterFS distributed storage cluster. This post simulates replacing a failed brick:

1) The GlusterFS cluster has four nodes. The cluster information is as follows:

On every node, configure /etc/hosts, synchronize the system time, and disable the firewall and SELinux.
[root@GlusterFS-slave data]# cat /etc/hosts
192.168.10.239 GlusterFS-master
192.168.10.212 GlusterFS-slave
192.168.10.204 GlusterFS-slave2
192.168.10.220 GlusterFS-slave3

Create the storage directory on every node.

First create a new partition:
# fdisk /dev/sdb        //enter in order: p -> n -> 1 -> Enter -> Enter -> w

Detect and verify the partition:
# partx /dev/sdb
# ls /dev/sdb*

Create the filesystem:
# mkfs.xfs -i size=1024 /dev/sdb1

Configure the mount:
# mkdir -p /data
# echo '/dev/sdb1 /data xfs defaults 1 2' >> /etc/fstab
# mount -a && mount

Create the storage location:
# mkdir /data/gluster

The intermediate steps of deploying the GlusterFS cluster are omitted here. For details see: http://www.cnblogs.com/kevingrace/p/8743812.html

[root@GlusterFS-master ~]# gluster peer status
Number of Peers: 3

Hostname: 192.168.10.212
Uuid: f8e69297-4690-488e-b765-c1c404810d6a
State: Peer in Cluster (Connected)

Hostname: 192.168.10.204
Uuid: a989394c-f64a-40c3-8bc5-820f623952c4
State: Peer in Cluster (Connected)

Hostname: 192.168.10.220
Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965
State: Peer in Cluster (Connected)

[root@GlusterFS-master ~]# gluster volume info

Volume Name: models
Type: Distributed-Replicate
Volume ID: f1945b0b-67d6-4202-9198-639244ab0a6a
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 192.168.10.239:/data/gluster
Brick2: 192.168.10.212:/data/gluster
Brick3: 192.168.10.204:/data/gluster
Brick4: 192.168.10.220:/data/gluster
Options Reconfigured:
auth.allow: 192.168.*
performance.write-behind: on
performance.io-thread-count: 32
performance.flush-behind: on
performance.cache-size: 128MB
features.quota: on

Mount the GlusterFS volume on the client:
[root@Client ~]# mount -t glusterfs 192.168.10.239:models /opt/gfsmount/
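The peer check in the `gluster peer status` output above is easy to miss by eye on a larger cluster. Below is a minimal sketch of how it could be scripted. The function name `unhealthy_peers` is hypothetical, and the sketch assumes the `Hostname:` / `State:` line layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of GlusterFS): reads `gluster peer status`
# output on stdin and prints the Hostname of every peer whose State line is
# anything other than "Peer in Cluster (Connected)".
# Assumes each peer is reported as a "Hostname:" line followed by a "State:" line.
unhealthy_peers() {
    awk '
        /^Hostname:/ { host = $2 }
        /^State:/ {
            state = $0
            sub(/^State:[ \t]*/, "", state)   # drop the "State:" label
            sub(/[ \t\r]+$/, "", state)       # drop trailing whitespace
            if (state != "Peer in Cluster (Connected)") print host
        }
    '
}

# Intended use on any cluster node (not executed here):
#   gluster peer status | unhealthy_peers
```

An empty result means every peer reported itself as connected. Any hostname printed is worth investigating before touching bricks.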

2) Test the Gluster volume

Write test data:
[root@Client ~]# for i in `seq -w 1 100`; do cp -rp /var/log/messages /opt/gfsmount/copy-test-$i; done

Confirm the write (100 files plus the "total" line printed by ls -l):
[root@Client ~]# ls -lA /opt/gfsmount|wc -l
101

Check on each node as well. The distribute layer, which places files by hashing their names, has split the 100 files into two sets of 50. One set is replicated on nodes 1 and 2, the other on nodes 3 and 4.

[root@GlusterFS-master ~]# ll /data/gluster/|wc -l
51
[root@GlusterFS-master ~]# ls /data/gluster/
copy-test-001  copy-test-016  copy-test-028  copy-test-038  copy-test-054  copy-test-078  copy-test-088  copy-test-100
copy-test-004  copy-test-017  copy-test-029  copy-test-039  copy-test-057  copy-test-079  copy-test-090
copy-test-006  copy-test-019  copy-test-030  copy-test-041  copy-test-060  copy-test-081  copy-test-093
copy-test-008  copy-test-021  copy-test-031  copy-test-046  copy-test-063  copy-test-082  copy-test-094
copy-test-011  copy-test-022  copy-test-032  copy-test-048  copy-test-065  copy-test-083  copy-test-095
copy-test-012  copy-test-023  copy-test-033  copy-test-051  copy-test-073  copy-test-086  copy-test-098
copy-test-015  copy-test-024  copy-test-034  copy-test-052  copy-test-077  copy-test-087  copy-test-099

[root@GlusterFS-slave ~]# ll /data/gluster/|wc -l
51
[root@GlusterFS-slave ~]# ls /data/gluster/
copy-test-001  copy-test-016  copy-test-028  copy-test-038  copy-test-054  copy-test-078  copy-test-088  copy-test-100
copy-test-004  copy-test-017  copy-test-029  copy-test-039  copy-test-057  copy-test-079  copy-test-090
copy-test-006  copy-test-019  copy-test-030  copy-test-041  copy-test-060  copy-test-081  copy-test-093
copy-test-008  copy-test-021  copy-test-031  copy-test-046  copy-test-063  copy-test-082  copy-test-094
copy-test-011  copy-test-022  copy-test-032  copy-test-048  copy-test-065  copy-test-083  copy-test-095
copy-test-012  copy-test-023  copy-test-033  copy-test-051  copy-test-073  copy-test-086  copy-test-098
copy-test-015  copy-test-024  copy-test-034  copy-test-052  copy-test-077  copy-test-087  copy-test-099

[root@GlusterFS-slave2 ~]# ll /data/gluster/|wc -l
51
[root@GlusterFS-slave2 ~]# ls /data/gluster/
copy-test-002  copy-test-014  copy-test-036  copy-test-047  copy-test-059  copy-test-069  copy-test-080  copy-test-097
copy-test-003  copy-test-018  copy-test-037  copy-test-049  copy-test-061  copy-test-070  copy-test-084
copy-test-005  copy-test-020  copy-test-040  copy-test-050  copy-test-062  copy-test-071  copy-test-085
copy-test-007  copy-test-025  copy-test-042  copy-test-053  copy-test-064  copy-test-072  copy-test-089
copy-test-009  copy-test-026  copy-test-043  copy-test-055  copy-test-066  copy-test-074  copy-test-091
copy-test-010  copy-test-027  copy-test-044  copy-test-056  copy-test-067  copy-test-075  copy-test-092
copy-test-013  copy-test-035  copy-test-045  copy-test-058  copy-test-068  copy-test-076  copy-test-096

[root@GlusterFS-slave3 ~]# ll /data/gluster/|wc -l
51
[root@GlusterFS-slave3 ~]# ls /data/gluster/
copy-test-002  copy-test-014  copy-test-036  copy-test-047  copy-test-059  copy-test-069  copy-test-080  copy-test-097
copy-test-003  copy-test-018  copy-test-037  copy-test-049  copy-test-061  copy-test-070  copy-test-084
copy-test-005  copy-test-020  copy-test-040  copy-test-050  copy-test-062  copy-test-071  copy-test-085
copy-test-007  copy-test-025  copy-test-042  copy-test-053  copy-test-064  copy-test-072  copy-test-089
copy-test-009  copy-test-026  copy-test-043  copy-test-055  copy-test-066  copy-test-074  copy-test-091
copy-test-010  copy-test-027  copy-test-044  copy-test-056  copy-test-067  copy-test-075  copy-test-092
copy-test-013  copy-test-035  copy-test-045  copy-test-058  copy-test-068  copy-test-076  copy-test-096
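Comparing four directory listings by eye is error-prone, so the same check can be done mechanically once each brick's listing has been saved to a text file, one file name per line. This is only a sketch: the helper names `same_listing` and `common_names` and the listing file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for comparing saved brick listings (one name per line).

# Succeeds (exit status 0) when both listing files contain the same set of names.
# Expected to hold for the two bricks of one replica pair.
same_listing() {
    [ "$(sort -u "$1")" = "$(sort -u "$2")" ]
}

# Prints every name that appears in both listing files.
# Expected to print nothing for bricks that belong to different replica pairs.
common_names() {
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -d
}

# Intended use (not executed here), after collecting listings with e.g.
#   ssh root@GlusterFS-master ls /data/gluster > master.txt
#   same_listing master.txt slave.txt  && echo "replica pair 1 in sync"
#   common_names master.txt slave2.txt    # should be empty
```

Plain `ls` skips the hidden `.glusterfs` directory that Gluster keeps inside each brick, so the listings contain only user files.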

3) Simulate a brick failure

3.1) Check the current storage status
Run on the GlusterFS-slave3 node:
[root@GlusterFS-slave3 ~]# gluster volume status
Status of volume: models
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.10.239:/data/gluster              49152   Y       6016
Brick 192.168.10.212:/data/gluster              49152   Y       2910
Brick 192.168.10.204:/data/gluster              49153   Y       9030
Brick 192.168.10.220:/data/gluster              49153   Y       12363
NFS Server on localhost                         N/A     N       N/A
Self-heal Daemon on localhost                   N/A     Y       12382
Quota Daemon on localhost                       N/A     Y       12389
NFS Server on 192.168.10.204                    N/A     N       N/A
Self-heal Daemon on 192.168.10.204              N/A     Y       9049
Quota Daemon on 192.168.10.204                  N/A     Y       9056
NFS Server on GlusterFS-master                  N/A     N       N/A
Self-heal Daemon on GlusterFS-master            N/A     Y       6037
Quota Daemon on GlusterFS-master                N/A     Y       6042
NFS Server on 192.168.10.212                    N/A     N       N/A
Self-heal Daemon on 192.168.10.212              N/A     Y       2930
Quota Daemon on 192.168.10.212                  N/A     Y       2936

Task Status of Volume models
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : f7bc799f-d8a8-488e-9c38-dc1f2c685a99
Status               : completed

Note: the Online column is "Y" for all four bricks.

3.2) Create the fault (this simulates a filesystem failure; the physical disk is assumed to be healthy, or the failed disk in the array is assumed to have been replaced already)
Run on the GlusterFS-slave3 node:
[root@GlusterFS-slave3 ~]# vim /etc/fstab     //comment out the following line
......
#/dev/sdb1 /data xfs defaults 1 2

Reboot the server:
[root@GlusterFS-slave3 ~]# reboot

3.3) Check the current storage status
[root@GlusterFS-slave3 ~]# gluster volume status
Status of volume: models
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.10.239:/data/gluster              49152   Y       6016
Brick 192.168.10.212:/data/gluster              49152   Y       2910
Brick 192.168.10.204:/data/gluster              49153   Y       9030
Brick 192.168.10.220:/data/gluster              N/A     N       N/A
NFS Server on localhost                         N/A     N       N/A
Self-heal Daemon on localhost                   N/A     Y       12382
Quota Daemon on localhost                       N/A     Y       12389
NFS Server on GlusterFS-master                  N/A     N       N/A
Self-heal Daemon on GlusterFS-master            N/A     Y       6037
Quota Daemon on GlusterFS-master                N/A     Y       6042
NFS Server on 192.168.10.204                    N/A     N       N/A
Self-heal Daemon on 192.168.10.204              N/A     Y       9049
Quota Daemon on 192.168.10.204                  N/A     Y       9056
NFS Server on 192.168.10.212                    N/A     N       N/A
Self-heal Daemon on 192.168.10.212              N/A     Y       2930
Quota Daemon on 192.168.10.212                  N/A     Y       2936

Task Status of Volume models
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : f7bc799f-d8a8-488e-9c38-dc1f2c685a99
Status               : completed

Note: the Online column for the brick on GlusterFS-slave3 (192.168.10.220) is now "N".

4) Recovering the failed brick

4.1) Stop the failed brick's process
If "gluster volume status" shows the GlusterFS-slave3 brick with Online "N" but still lists a PID (instead of N/A), terminate that process with "kill -15 pid". Usually no PID is shown once Online is "N".

4.2) Create a new data directory (it must not have the same path as the old one)
[root@GlusterFS-slave3 ~]# mkfs.xfs -i size=1024 /dev/sdb1
If the partition still carries the old XFS filesystem, as it does in this simulation, mkfs.xfs refuses to overwrite it unless the -f option is added.

[root@GlusterFS-slave3 ~]# vim /etc/fstab     //uncomment the following line
......
/dev/sdb1 /data xfs defaults 1 2

Remount the filesystem:
[root@GlusterFS-slave3 ~]# mount -a

Create the new data directory (it must not have the same path as the old one):
[root@GlusterFS-slave3 ~]# mkdir -p /data/gluster1

4.3) Query the extended attributes of the brick directory on the failed node's replica partner (GlusterFS-slave2)
The getfattr tool comes from the attr package; "yum search getfattr" can be used to find how to install it.
[root@GlusterFS-slave2 ~]# yum install -y attr.x86_64
[root@GlusterFS-slave2 ~]# getfattr -d -m. -e hex /data/gluster
getfattr: Removing leading '/' from absolute path names
# file: data/gluster
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x00000001000000007fff7f58ffffffff
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size=0x0000000003c19000
trusted.glusterfs.volume-id=0xf1945b0b67d642029198639244ab0a6a

4.4) Mount the volume and trigger self-heal
On the client, first unmount the existing mount:
[root@Client ~]# umount /opt/gfsmount

Then mount again through GlusterFS-slave3 (any node would work):
[root@Client ~]# mount -t glusterfs 192.168.10.220:models /opt/gfsmount/

Create a directory that does not yet exist in the volume, then delete it:
[root@Client gfsmount]# mkdir testDir001
[root@Client gfsmount]# rm -rf testDir001

Set and then remove an extended attribute to trigger self-heal:
[root@Client gfsmount]# setfattr -n trusted.non-existent-key -v abc /opt/gfsmount
[root@Client gfsmount]# setfattr -x trusted.non-existent-key /opt/gfsmount

4.5) Check whether pending xattrs have been recorded on the healthy node
Query the extended attributes of the brick directory on the replica partner (GlusterFS-slave2) again:
[root@GlusterFS-slave2 ~]# getfattr -d -m. -e hex /data/gluster
getfattr: Removing leading '/' from absolute path names
# file: data/gluster
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.models-client-2=0x000000000000000000000000
trusted.afr.models-client-3=0x000000000000000200000002
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x00000001000000007fff7f58ffffffff
trusted.glusterfs.quota.dirty=0x3000
trusted.glusterfs.quota.size=0x0000000003c19000
trusted.glusterfs.volume-id=0xf1945b0b67d642029198639244ab0a6a

Note: look at the trusted.afr.models-client-3 line (the fifth line of the output). Its non-zero value means the healthy brick on GlusterFS-slave2 has recorded pending changes against the brick 192.168.10.220:/data/gluster on GlusterFS-slave3, so GlusterFS-slave2 will act as the source for the heal.

4.6) Check whether the volume's heal status shows that a replacement is needed
[root@GlusterFS-slave3 data]# gluster volume heal models info
Brick GlusterFS-master:/data/gluster/
Number of entries: 0

Brick GlusterFS-slave:/data/gluster/
Number of entries: 0

Brick GlusterFS-slave2:/data/gluster/
/
Number of entries: 1

Brick 192.168.10.220:/data/gluster
Status: Transport endpoint is not connected

Note: the last line reports that the transport endpoint is not connected.

4.7) Complete the operation with a forced commit
[root@GlusterFS-slave3 data]# gluster volume replace-brick models 192.168.10.220:/data/gluster 192.168.10.220:/data/gluster1 commit force

The following message indicates success:
volume replace-brick: success: replace-brick commit force operation successful

Note: the data can also be restored onto a different server (optional). In the commands below, 192.168.10.230 is a newly added GlusterFS node:
# gluster peer probe 192.168.10.230
# gluster volume replace-brick models 192.168.10.220:/data/gluster 192.168.10.230:/data/gluster commit force

4.8) Check that the storage is online
[root@GlusterFS-slave3 ~]# gluster volume status
Status of volume: models
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.10.239:/data/gluster              49152   Y       6016
Brick 192.168.10.212:/data/gluster              49152   Y       2910
Brick 192.168.10.204:/data/gluster              49153   Y       9030
Brick 192.168.10.220:/data/gluster1             49153   Y       12363
NFS Server on localhost                         N/A     N       N/A
Self-heal Daemon on localhost                   N/A     Y       12382
Quota Daemon on localhost                       N/A     Y       12389
NFS Server on 192.168.10.204                    N/A     N       N/A
Self-heal Daemon on 192.168.10.204              N/A     Y       9049
Quota Daemon on 192.168.10.204                  N/A     Y       9056
NFS Server on GlusterFS-master                  N/A     N       N/A
Self-heal Daemon on GlusterFS-master            N/A     Y       6037
Quota Daemon on GlusterFS-master                N/A     Y       6042
NFS Server on 192.168.10.212                    N/A     N       N/A
Self-heal Daemon on 192.168.10.212              N/A     Y       2930
Quota Daemon on 192.168.10.212                  N/A     Y       2936

Task Status of Volume models
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : f7bc799f-d8a8-488e-9c38-dc1f2c685a99
Status               : completed

If the brick was instead moved to another server, the fourth brick is listed under the new node's address:
[root@GlusterFS-slave ~]# gluster volume status
Status of volume: models
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.10.239:/data/gluster              49152   Y       6016
Brick 192.168.10.212:/data/gluster              49152   Y       2910
Brick 192.168.10.204:/data/gluster              49153   Y       9030
Brick 192.168.10.230:/data/gluster              49153   Y       12363
NFS Server on localhost                         N/A     N       N/A
Self-heal Daemon on localhost                   N/A     Y       12382
Quota Daemon on localhost                       N/A     Y       12389
NFS Server on 192.168.10.204                    N/A     N       N/A
Self-heal Daemon on 192.168.10.204              N/A     Y       9049
Quota Daemon on 192.168.10.204                  N/A     Y       9056
NFS Server on GlusterFS-master                  N/A     N       N/A
Self-heal Daemon on GlusterFS-master            N/A     Y       6037
Quota Daemon on GlusterFS-master                N/A     Y       6042
NFS Server on 192.168.10.212                    N/A     N       N/A
Self-heal Daemon on 192.168.10.212              N/A     Y       2930
Quota Daemon on 192.168.10.212                  N/A     Y       2936

Task Status of Volume models
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : f7bc799f-d8a8-488e-9c38-dc1f2c685a99
Status               : completed

======================================================================
Note: the procedure above uses a newly created, dedicated partition that holds the storage directory. The alternative is to skip the dedicated partition and create the storage directory directly on the / partition, as in http://www.cnblogs.com/kevingrace/p/8743812.html, where the storage directory on all four nodes is /opt/gluster/data. Suppose the /opt/gluster/data directory on GlusterFS-slave3 is then deleted by accident.

The simplest, most direct fix is:
1) As above, recreate the deleted /opt/gluster/data directory with mkdir.
2) Stop the replicated volume: gluster volume stop models
3) Delete the replicated volume: gluster volume delete models
4) Recreate the replicated volume under the same name, models, this time with 4 replicas:
# gluster volume create models replica 4 192.168.10.239:/opt/gluster/data 192.168.10.212:/opt/gluster/data 192.168.10.204:/opt/gluster/data 192.168.10.220:/opt/gluster/data force
5) After recreating the volume, run gluster volume info and confirm that the bricks of all four nodes are listed.
6) Remount the GlusterFS volume on the client. A newly created volume has to be started with gluster volume start models before it can be mounted.

This changes the layout from a 2 x 2 distributed-replicated volume to a single 4-way replica set, so usable capacity is halved. With this approach, the storage directory on the failed node GlusterFS-slave3 ends up with the same data as the other replica pair (GlusterFS-master and GlusterFS-slave). Because GlusterFS-slave2 was the replica partner of GlusterFS-slave3, its storage directory ends up holding the union of the data from all nodes.

Alternatively, in step 4 above, recreate the volume with 2 replicas as before:
# gluster volume create models replica 2 192.168.10.239:/opt/gluster/data 192.168.10.212:/opt/gluster/data force

Then add the other two nodes to the replicated volume:
# gluster volume stop models
# gluster volume status models
# gluster volume add-brick models 192.168.10.204:/opt/gluster/data 192.168.10.220:/opt/gluster/data force
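Two of the manual checks in the recovery procedure can be scripted: spotting offline bricks in `gluster volume status` (steps 3.3 and 4.8) and reading the `trusted.afr.*` value from step 4.5. The sketch below is not part of GlusterFS, and both function names are hypothetical. It makes two assumptions: each brick occupies a single status line whose second-to-last column is the Online flag, and an AFR changelog value consists of three 32-bit big-endian counters for pending data, metadata, and entry operations.

```shell
#!/bin/sh
# Hypothetical helpers; neither is part of GlusterFS.

# Reads `gluster volume status` output on stdin and prints host:/path for
# every brick whose Online column is "N".
# Assumption: one brick per line, Online flag in the second-to-last column.
offline_bricks() {
    awk '$1 == "Brick" && NF >= 5 && $(NF-1) == "N" { print $2 }'
}

# Splits a trusted.afr.* value (as printed by getfattr -e hex) into its three
# pending-operation counters.
# Assumption: 12 bytes = data, metadata, entry counters, 4 bytes each.
decode_afr() {
    hex=${1#0x}
    if [ "${#hex}" -ne 24 ]; then
        echo "decode_afr: expected 24 hex digits" >&2
        return 1
    fi
    data=$(printf '%s' "$hex" | cut -c1-8)
    meta=$(printf '%s' "$hex" | cut -c9-16)
    entry=$(printf '%s' "$hex" | cut -c17-24)
    echo "data=$((0x$data)) metadata=$((0x$meta)) entry=$((0x$entry))"
}

# Intended use (not executed here):
#   gluster volume status models | offline_bricks
#   decode_afr 0x000000000000000200000002
```

Under those assumptions, the value recorded for `trusted.afr.models-client-3` in step 4.5 decodes to `data=0 metadata=2 entry=2`: GlusterFS-slave2 is holding two pending metadata operations and two pending entry operations for the failed brick.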
