Overview of the VSAN Storage Structure and a Data Recovery Method

[What is VSAN?]

VSAN is a scalable distributed storage architecture developed on top of the vSphere kernel. It builds a VSAN storage layer from the hard disks and flash devices installed in the hosts of a vSphere cluster. This layer is controlled and managed by VSAN and serves as a unified shared storage layer for the vSphere cluster.
The vSphere storage infrastructure is changing. The traditional LUN-based storage management mechanism places application data in capacity provisioned at the storage level, and it is aware of neither the virtualization running on top of the underlying array nor the file system. VMware's next-generation, policy-driven storage is no longer based on traditional VMFS volumes, but on an object-based storage model built on virtual or distributed datastores.
[Analysis of VSAN data storage]

VSAN data is stored in an object store that is presented to the vSphere hosts in the form of a file system. The object storage service mounts volumes from every host in the VSAN-enabled cluster and presents them as a single distributed shared datastore that is visible on all nodes. VSAN simplifies storage configuration for virtual machines: there is only one datastore, and its capacity comes from the storage space of each vSphere host in the VSAN cluster.
Disk groups are configured so that all virtual machine files are kept in separate storage entities, which makes this way of storing data relatively safe. However, when a flash disk or a capacity disk in a disk group fails, the data is migrated to other nodes, and further failures can occur while the migration is in progress. The case below describes how North Asia data recovery engineers resolved a VSAN storage crash that left the virtual machines inaccessible.
[Case study: VSAN storage crash]

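The failed system was a VSAN cluster built from four Dell servers. Each server had two disk groups, and each disk group consisted of one SSD and five SAS hard disks, with the SSD acting as the flash cache and the SAS disks as capacity disks. A capacity disk in one disk group on one of the nodes failed and went offline, and VSAN started rebuilding and migrating the data.
A power outage then interrupted the data migration before it had finished. When power was restored, two capacity disks in another disk group had also failed and gone offline, which brought down the entire datastore. The VSAN management console could still be logged in to, but none of the virtual machines could be accessed.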
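
As a rough illustration of the scale of this failure, the cluster layout can be tallied in a few lines of Python. The class and field names are made up for this sketch, and the placement of the three failed disks on specific nodes is an assumption; the article only says that one disk group lost one capacity disk and another disk group lost two.

```python
from dataclasses import dataclass, field


@dataclass
class DiskGroup:
    cache_ssds: int = 1        # one SSD used as flash cache (from the article)
    capacity_disks: int = 5    # five SAS capacity disks (from the article)
    offline_capacity: int = 0  # capacity disks that have failed and gone offline

    @property
    def total_disks(self) -> int:
        return self.cache_ssds + self.capacity_disks


@dataclass
class Node:
    name: str
    # Two disk groups per server, as described in the article.
    disk_groups: list = field(default_factory=lambda: [DiskGroup(), DiskGroup()])


def build_cluster(node_count: int = 4) -> list:
    """Build the four-node cluster described in the case study."""
    return [Node(name=f"node{i + 1}") for i in range(node_count)]


cluster = build_cluster()

# ASSUMPTION: which node and disk group each failure hit is not stated;
# the indices below are chosen only for illustration.
cluster[0].disk_groups[0].offline_capacity = 1  # first failure, rebuild starts
cluster[0].disk_groups[1].offline_capacity = 2  # failures seen after the outage

groups = [dg for node in cluster for dg in node.disk_groups]
total_disks = sum(dg.total_disks for dg in groups)
total_capacity = sum(dg.capacity_disks for dg in groups)
offline = sum(dg.offline_capacity for dg in groups)
degraded_groups = sum(1 for dg in groups if dg.offline_capacity > 0)

print(f"disks to image: {total_disks}")
print(f"capacity disks: {total_capacity}, offline: {offline}")
print(f"disk groups with failures: {degraded_groups} of {len(groups)}")
```

Under these assumptions the cluster holds 48 disks in 8 disk groups, which is the number of images the recovery has to work with.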
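[Analysis of the VSAN storage data recovery process]
First, read-only images were made of every hard disk on all four nodes, including the SSD flash disks, the SAS capacity disks, and the three disks that had gone offline because of failures. Once the backup was complete, all of the original disks were put back into the servers, and the analysis of the underlying data storage structure was carried out on the image files in order to determine how the virtual machines were distributed across the disks.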
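
The article does not say which tool was used for imaging. As a minimal sketch of the idea, assuming each source disk can be read as an ordinary file or block device path, the following Python copies a source to an image file without ever opening the source for writing, and records a SHA-256 checksum so the image can be verified later. The function names `image_disk` and `file_sha256` are hypothetical.

```python
import hashlib
import os
import tempfile


def image_disk(src_path: str, dst_path: str, chunk_size: int = 1024 * 1024) -> str:
    """Copy src_path to dst_path in chunks; return the SHA-256 of the bytes read.

    The source is opened with mode "rb" only, so it is never written to.
    Mode "xb" refuses to overwrite an existing image file.
    """
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "xb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def file_sha256(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Recompute the SHA-256 of a file, e.g. to verify an image after copying."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Small demonstration with a stand-in file instead of a real disk.
with tempfile.TemporaryDirectory() as tmp:
    fake_disk = os.path.join(tmp, "fake_disk.bin")
    image = os.path.join(tmp, "fake_disk.img")
    with open(fake_disk, "wb") as f:
        f.write(os.urandom(3 * 1024 * 1024 + 123))  # not a multiple of chunk size

    checksum = image_disk(fake_disk, image)
    print("image verified:", checksum == file_sha256(image))
```

A disk with bad sectors needs a tool that can skip and retry unreadable regions; this sketch simply raises an error on the first failed read.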

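Because no existing virtualization recovery software could recover virtual machines from the VSAN architecture, the North Asia engineers developed their own program while they analyzed the underlying data storage structure, and used it to test the accuracy of the data distribution information.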

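The two disk groups on each node were analyzed separately to work out the correspondence between the flash disk and the capacity disks within each disk group. Every disk carries a unique identifier that links the disks together, so the membership of each disk group was determined from the disk ID information.
1. Obtain the disk UUID and the disk group UUID from each hard disk.

2. For each capacity disk in each disk group, obtain the component information recorded on that disk.

3. Extract the component bitmap from the MAP location recorded in the component information.

4. Extract the component data and the cached data according to the component bitmap.

5. From the component description information, determine which object each component belongs to and the order of the components, then merge the components into objects.

6. Extract the data from the objects.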
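
The steps above can be sketched in Python as follows. This is only an illustration of the data flow, not the engineers' actual program: the article does not describe the VSAN on-disk format, so the record types, the list-of-booleans bitmap, the fixed block size, and the plain concatenation of components in order are all assumptions made for this example. In a real cluster, components may be mirrored or striped rather than simply concatenated.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskRecord:           # hypothetical result of step 1
    disk_uuid: str
    disk_group_uuid: str
    is_cache: bool


@dataclass(frozen=True)
class ComponentRecord:      # hypothetical result of steps 2 to 4
    object_uuid: str
    order: int              # position of the component inside its object
    data: bytes


def group_disks(disks):
    """Step 1: group disks by the disk group UUID read from each disk."""
    groups = defaultdict(list)
    for disk in disks:
        groups[disk.disk_group_uuid].append(disk)
    return dict(groups)


def extract_by_bitmap(raw: bytes, bitmap, block_size: int) -> bytes:
    """Steps 3 and 4: keep only the blocks that the bitmap marks as used."""
    return b"".join(
        raw[i * block_size:(i + 1) * block_size]
        for i, used in enumerate(bitmap)
        if used
    )


def assemble_objects(components):
    """Step 5: merge components into objects, ordered by component order."""
    by_object = defaultdict(list)
    for comp in components:
        by_object[comp.object_uuid].append(comp)
    return {
        obj: b"".join(c.data for c in sorted(comps, key=lambda c: c.order))
        for obj, comps in by_object.items()
    }


# Toy data standing in for what would be parsed from the disk images.
disks = [
    DiskRecord("ssd-1", "dg-A", True),
    DiskRecord("sas-1", "dg-A", False),
    DiskRecord("ssd-2", "dg-B", True),
]
disk_groups = group_disks(disks)

payload = extract_by_bitmap(b"AAAABBBBCCCCDDDD", [True, False, True, False], 4)

components = [
    ComponentRecord("obj-1", 1, b"world"),
    ComponentRecord("obj-1", 0, b"hello "),
    ComponentRecord("obj-2", 0, payload),
]
objects = assemble_objects(components)  # step 6 would read files from these

print(sorted(disk_groups))
print(objects)
```

Keeping each step as a separate function mirrors the way the article describes the work: the output of each stage can be checked against the images before the next stage is trusted.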

[Analysis of the VSAN storage data recovery results]

An object can also be regarded as a volume, that is, a logical volume. Every object stored on VSAN is made up of multiple components, and these components reside in the disk groups configured on the cluster hosts. Extracting the component information is the critical step in the recovery, because components are the building blocks of every object. In this case few components were damaged by the failure, so the recovered virtual machines could start normally. Working out the correspondence between the components and the disk bitmaps took a long time, but all of the technical problems were eventually solved, the recovered virtual machines all started normally, and the data lost in the VSAN failure was successfully recovered.

Origin blog.51cto.com/zhangyu/2429692