A successful storage data recovery case at a Beijing company: how the data was recovered after a storage crash

1. Storage fault description

Our center received an EMC storage array from a company for data recovery. According to the initial assessment, the storage holds 12 hard disks in total, each with a capacity of 1 TB, configured as a RAID 5 disk array with 2 of the disks serving as hot spares. The storage crashed because of hard disk failure.

2. Hard disk physical inspection and storage data backup

The engineers physically inspected all hard disks from the customer's storage and found no physical faults such as bad sectors. They then used WinHex to create an image of every disk in the storage. Once the backup was complete, the original storage was returned to the customer, and all data recovery operations were carried out on the image files.
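The imaging step can be illustrated with a minimal sketch. The engineers used WinHex; the Python function below is only a stand-in for the idea, and the name `image_disk` and its parameters are hypothetical. It copies a source device or file to an image file in fixed-size chunks and returns a SHA-256 digest, so the image can later be checked for accidental modification. It assumes every read succeeds, which matches this case because no bad sectors were found.

```python
import hashlib


def image_disk(src_path: str, dst_path: str, chunk_size: int = 1024 * 1024) -> str:
    """Copy src_path to dst_path chunk by chunk.

    Returns the SHA-256 hex digest of the bytes that were copied.
    Hypothetical helper for illustration; it does not retry or skip
    unreadable sectors.
    """
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # end of the source reached
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()
```

Working only on the images, as described above, means a mistake during analysis can never damage the customer's original disks.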

3. Storage data recovery process

The customer's storage is built on a RAID 5 disk array. Under normal circumstances, a RAID 5 array crashes only when at least 2 of its disks have dropped offline. Because the physical inspection found no physical disk failures, this recovery only required analyzing the underlying RAID structure and virtually reassembling the array.
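Why RAID 5 survives one missing disk but not two follows from its parity being a byte-wise XOR of the data blocks in each stripe. The sketch below shows that property in general terms; the function name `xor_blocks` is hypothetical and the code is not taken from the engineers' tools.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Parity for one stripe is the XOR of its data blocks.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# With one block missing, XOR of the survivors and the parity restores it.
rebuilt = xor_blocks([data[0], data[2], parity])
```

With two blocks missing from the same stripe, a single XOR equation has two unknowns, which is why the array goes offline at that point.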
1. Analyze the RAID structure.
By analyzing the image files, the recovery center's engineers determined the disk order, stripe size, and parity rotation rules of the RAID array. The analysis also showed that no data had been written to the two hot spare disks in the original storage array.
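One simple way to check whether a disk image carries any data is to test whether it contains only zero bytes. The sketch below assumes that an unused hot spare reads back as zeros; that is an assumption, since a real spare may hold vendor metadata or stale data, in which case a closer inspection is needed. The name `is_blank_image` is hypothetical.

```python
def is_blank_image(path: str, chunk_size: int = 1024 * 1024) -> bool:
    """Return True if the file at path contains only zero bytes.

    Assumption: "no data written" shows up as an all-zero image.
    An empty file is also reported as blank.
    """
    with open(path, "rb") as image:
        while True:
            chunk = image.read(chunk_size)
            if not chunk:  # reached the end without seeing non-zero data
                return True
            if chunk.count(0) != len(chunk):
                return False
```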
2. Analyze the dropped disks in the RAID array.
Based on the basic RAID 5 array information obtained from the analysis, the engineers used a self-developed RAID 5 reassembly tool to virtually reassemble the array. They then analyzed the LUN allocation information within the RAID group and the data block map allocated to the LUN. In this case there was one LUN in the upper storage layer; the data recovery engineers parsed the LUN information and exported the LUN data.
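Virtual reassembly means reading the member images in the right order and skipping parity, without writing anything back to the disks. The sketch below assumes a left-symmetric parity rotation, which is one common RAID 5 layout; the article does not state which layout this array used, and the engineers' own tool is not public. The names `raid5_locate` and `reassemble_raid5` are hypothetical, and the images are held in memory for brevity where a real tool would seek within image files.

```python
from typing import Sequence, Tuple


def raid5_locate(unit: int, n_disks: int) -> Tuple[int, int]:
    """Map a logical stripe unit to (row, disk index).

    Assumed layout: left-symmetric. Parity starts on the last disk,
    moves one disk to the left on each row, and data starts on the
    disk right after the parity disk, wrapping around.
    """
    data_per_row = n_disks - 1
    row = unit // data_per_row
    position = unit % data_per_row
    parity_disk = (n_disks - 1) - (row % n_disks)
    data_disk = (parity_disk + 1 + position) % n_disks
    return row, data_disk


def reassemble_raid5(images: Sequence[bytes], stripe_size: int) -> bytes:
    """Concatenate the data stripe units of ordered member images."""
    n_disks = len(images)
    if n_disks < 3:
        raise ValueError("RAID 5 needs at least 3 member disks")
    rows = min(len(image) for image in images) // stripe_size
    output = bytearray()
    for unit in range(rows * (n_disks - 1)):
        row, disk = raid5_locate(unit, n_disks)
        start = row * stripe_size
        output += images[disk][start:start + stripe_size]
    return bytes(output)
```

If the assumed disk order, stripe size, or rotation is wrong, the output shows recognizable structures breaking at stripe boundaries, which is how a candidate layout is usually rejected.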
3. Analyze the ZFS file system.
The engineers used a self-developed ZFS analysis tool to parse the file system on the LUN. The analysis showed that some file system metadata files had been damaged by the storage crash. The engineers manually repaired the damaged files, after which the ZFS file system parsed normally.
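A typical first step when parsing ZFS by hand is locating the active uberblock, the entry point to the pool's metadata. The sketch below is not the engineers' tool. It assumes the standard on-disk layout in which each 256 KiB vdev label holds an uberblock array in its second 128 KiB, that each slot is 1 KiB (larger sector sizes use larger slots), and that the caller passes the bytes of one label. It recognizes slots by the uberblock magic `0x00bab10c` and picks the highest transaction group number; it does not verify checksums, which a real tool must do. The name `find_active_uberblock` is hypothetical.

```python
import struct
from typing import Optional, Tuple

UBERBLOCK_MAGIC = 0x00BAB10C
LABEL_SIZE = 256 * 1024       # size of one vdev label
UB_ARRAY_OFFSET = 128 * 1024  # uberblock array starts mid-label
UB_SLOT_SIZE = 1024           # assumption: 1 KiB uberblock slots


def find_active_uberblock(label: bytes) -> Optional[Tuple[int, int]]:
    """Return (slot index, txg) of the highest-txg uberblock, or None."""
    best: Optional[Tuple[int, int]] = None
    slots = (LABEL_SIZE - UB_ARRAY_OFFSET) // UB_SLOT_SIZE
    for slot in range(slots):
        offset = UB_ARRAY_OFFSET + slot * UB_SLOT_SIZE
        header = label[offset:offset + 24]
        if len(header) < 24:  # label buffer is shorter than expected
            break
        # The pool may have been written on either endianness.
        for endian in ("<", ">"):
            magic, _version, txg = struct.unpack(endian + "QQQ", header)
            if magic == UBERBLOCK_MAGIC:
                if best is None or txg > best[1]:
                    best = (slot, txg)
                break
    return best
```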
4. Export all data in the storage.
After the ZFS file system was parsed successfully, the engineers went on to analyze and export the file nodes, directory structure, and other contents of the storage. They then verified the exported data and confirmed that it was intact, with no errors reported.

4. The storage data is restored successfully

After the customer personally verified the recovery results, it was confirmed that all data in the customer's storage had been successfully recovered.
After the storage failed, the customer preserved the on-site environment and performed no further operations on it. This avoided a great deal of unnecessary trouble in the later recovery work and, to some extent, increased the likelihood of a successful recovery.

Origin blog.51cto.com/sun510/2540781