EMC Isilon (OneFS) Storage: Recovering Accidentally Deleted Data [Data Recovery Example]

[Overview of Isilon's storage structure]
Isilon uses the distributed file system OneFS internally. In an Isilon storage cluster, all nodes belong to a single OneFS file system, so Isilon can be expanded horizontally without affecting normal use of the data. While the cluster is working, all nodes provide the same functions, and there is no master-slave distinction between nodes. When a user stores a file in the cluster, the OneFS layer divides the file into 128K segments and stores them on different nodes, while the node layer divides each 128K segment into 8K blocks and stores them on different hard disks of the node. The Inode information, directory entries and data MAP of user files are stored on all nodes, which ensures that users can access all data no matter which node they connect to. During initialization Isilon lets the user select a storage redundancy mode, and different redundancy modes provide different levels of data security (the default for 3 nodes is N+2:1).
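
The two-level split described above can be pictured with a small sketch. The 128K and 8K sizes come from the text; the round-robin placement, the function name `locate` and its parameters are assumptions made purely for illustration. Real OneFS placement also depends on the protection level and is not documented here.

```python
STRIPE_UNIT = 128 * 1024  # 128K segment placed on one node (from the text)
BLOCK = 8 * 1024          # 8K block placed on one disk (from the text)


def locate(offset, nodes=3, disks_per_node=12):
    """Map a file byte offset to (node, disk, offset_in_block).

    ASSUMPTION: plain round-robin over nodes and disks. This only
    illustrates the 128K/8K split, not the real OneFS layout.
    """
    stripe_index = offset // STRIPE_UNIT          # which 128K segment
    node = stripe_index % nodes                   # assumed node choice
    block_in_unit = (offset % STRIPE_UNIT) // BLOCK  # which 8K block in it
    disk = block_in_unit % disks_per_node         # assumed disk choice
    return node, disk, offset % BLOCK


print(locate(0))               # (0, 0, 0)
print(locate(8 * 1024 + 5))    # (0, 1, 5)
print(locate(128 * 1024))      # (1, 0, 0)
```

Under this assumption, one 128K segment holds 16 blocks of 8K, so a single segment already touches more than one disk of a 12-disk node.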

[Failure description]

Through a misoperation, a company's administrator deleted important data from the server, including an MSSQL database and a large number of MP4, ASF and TS video files. The storage behind the server is an EMC high-end NAS (Isilon S200) with 3 nodes; each node has 12 3TB SATA hard disks and no SSD. The data to be recovered consists of VMware virtual machines (WEB server) and video files. The virtual machines are shared to the ESX host over NFS, and the video files are shared to the virtual machine (WEB server) over CIFS. All data shared over NFS (that is, all virtual machines) was deleted, while the data shared over CIFS was not deleted.

[Backing up the server data]

To keep the data safe and avoid secondary damage, all hard disks would normally be backed up first. However, with so many disks (12 per node, 36 across 3 nodes) and such large disks (3TB each, 108TB in total), a full backup would take a long time. In the end the customer decided to back up only the data still present in storage: the data recovery company made one backup, and the customer made a second one, to ensure the safety of the existing data.

[Server data analysis]

After the backup is complete, Isilon is shut down normally from the Isilon web management interface. All hard disks on all nodes are then labeled, removed one by one, and attached to the data recovery platform so that the data on every disk can be analyzed.
Because the customer's data was deleted, the storage redundancy level does not need much attention. The key point is to analyze whether the file Inode and data MAP change after a file is deleted. The deleted virtual disk files are all 64G or larger, and there are no other kinds of large files in storage, so a program was written to scan all file Inodes and pick out those whose file size is 64G or above. Careful analysis of these Inodes showed that the data MAP location recorded in the Inode no longer points to normal data, and the Inodes on all nodes are in the same state. Further analysis showed that the data MAP of a large file has multiple layers (a tree structure) and that the file's unique ID is recorded in the data MAP, so it is worth trying to find the bottom-level data MAPs of each file directly. Traversing and tracking the bottom-level data MAPs showed that, fortunately, the lowest level was still there.
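
A minimal sketch of the kind of Inode scan described above. The article does not give the on-disk Inode format, so the layout used here (a 4-byte magic, an 8-byte file ID and an 8-byte size at the start of an 8K block) is entirely hypothetical, as are the names `INODE_MAGIC`, `INODE_HEADER` and `scan_large_inodes`. Only the 64G threshold and the 8K block size come from the text.

```python
import struct

BLOCK = 8 * 1024          # 8K block size (from the text)
MIN_SIZE = 64 * 1024**3   # 64G threshold (from the text)

# HYPOTHETICAL layout, not the real OneFS format:
# magic (4 bytes), file ID (uint64), file size (uint64), little-endian.
INODE_MAGIC = b"INOD"
INODE_HEADER = struct.Struct("<4sQQ")


def scan_large_inodes(image, min_size=MIN_SIZE):
    """Return (offset, file_id, size) for every block that looks like an
    Inode (under the hypothetical layout) with size >= min_size."""
    hits = []
    for off in range(0, len(image) - INODE_HEADER.size + 1, BLOCK):
        magic, file_id, size = INODE_HEADER.unpack_from(image, off)
        if magic == INODE_MAGIC and size >= min_size:
            hits.append((off, file_id, size))
    return hits
```

On real disk images the buffer would be read in chunks (or memory-mapped) rather than loaded whole, and the candidate test would follow whatever signature the reverse-engineered Inode structure actually offers.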

[Data recovery process]

Take the unique ID of each file from its Inode, then gather all data MAPs that carry that ID. Sorting them by the VCN number in the data MAP shows that the first 17088 data MAPs of each file no longer exist.
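
The grouping and gap check can be sketched as follows. It assumes the scanner has already produced `(file_id, vcn, location)` tuples and that VCN numbering starts at 0; the tuple shape and the function names `group_maps` and `missing_ranges` are illustrative, not taken from the original program.

```python
from collections import defaultdict


def group_maps(map_records):
    """Group scanned data MAP records by file ID.

    map_records: iterable of (file_id, vcn, location) tuples
    (hypothetical shape). Returns {file_id: {vcn: location}}.
    """
    by_file = defaultdict(dict)
    for file_id, vcn, location in map_records:
        by_file[file_id][vcn] = location
    return by_file


def missing_ranges(vcns):
    """Return inclusive (first, last) VCN ranges absent below the highest
    VCN seen, assuming numbering starts at 0."""
    ranges, expected = [], 0
    for v in sorted(vcns):
        if v > expected:
            ranges.append((expected, v - 1))
        expected = v + 1
    return ranges


records = [(1, 3, "a"), (1, 4, "b"), (2, 0, "c"), (1, 6, "d")]
grouped = group_maps(records)
print(missing_ranges(grouped[1]))  # [(0, 2), (5, 5)]
```

For the case in this article, a file whose lowest surviving VCN is 17088 would report a first range of `(0, 17087)`.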

A careful calculation shows that the missing data MAP entries cover less than 1G of data in total. The deleted files are all virtual machine vmdk files containing NTFS file systems, and the MFT of these NTFS file systems sits at roughly the 3G position. In other words, it should be enough to manually forge an MBR and DBR at the head of each vmdk file to be able to interpret the data inside the vmdk (whether that is luck or coincidence, I don't know!). The scanned data MAPs are then interpreted and the data exported in VCN order; wherever a MAP is missing, the area is left as zeros.
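
One way to export in VCN order with zero fill is to write each piece at its own offset instead of appending. The sketch below assumes each VCN addresses a fixed-size unit of `unit` bytes and that the data behind each MAP has already been read into memory; `export_by_vcn` and its parameters are hypothetical names, not the original tool.

```python
import io


def export_by_vcn(extents, out, unit, total_size=None):
    """Write each extent at offset vcn * unit of a seekable binary stream.

    extents: {vcn: bytes} (assumed shape). Areas with no MAP are never
    written, so they read back as zeros. If total_size is given (for
    example the size recorded in the Inode), the output is padded to it.
    """
    for vcn in sorted(extents):
        out.seek(vcn * unit)
        out.write(extents[vcn])
    if total_size is not None:
        end = out.seek(0, io.SEEK_END)
        if total_size > end:
            out.seek(total_size - 1)
            out.write(b"\0")  # extend the output up to total_size


buf = io.BytesIO()
export_by_vcn({2: b"ABCD", 4: b"WXYZ"}, buf, unit=4, total_size=24)
print(len(buf.getvalue()))  # 24
```

Placing data by offset keeps every structure inside the image, such as the MFT, at its original position even when earlier regions are missing; appending pieces back to back would shift everything after the first gap.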
After repeated testing, one vmdk file was exported first as a trial. The result was a surprise: the exported vmdk was smaller than its real size, and the MFT was not at the position described in the vmdk's file system. Manually checking several randomly chosen MAPs confirmed that they point to valid data areas, so the way the program interprets the MAP was not the problem. The likely explanation was that the files are sparse!
After some adjustments to the code, the same vmdk was exported again. This time the size of the vmdk matched the actual size and the MFT was at the expected position. An MBR, partition table and DBR were forged manually, after which a file system interpretation tool (an in-house tool) successfully parsed the file system, and the database and video files inside the vmdk were exported.
After verifying that the database and video files in this vmdk were intact, all important vmdk files were exported in batches, and each vmdk file was then manually repaired one by one.
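
For readers unfamiliar with forging an MBR, the sketch below builds a minimal 512-byte sector with a single NTFS partition entry, following the standard MBR layout (partition table at offset 446, `55 AA` signature at offset 510, partition type `0x07` for NTFS). The function name `build_mbr` and the example start sector and size are assumptions; the article does not say which values were actually used, and the DBR (NTFS boot sector) is not shown because it needs volume-specific values the text does not give.

```python
import struct


def build_mbr(start_lba, sector_count, part_type=0x07, bootable=True):
    """Return a 512-byte MBR with one primary partition entry.

    CHS fields are set to FE FF FF, the usual placeholder when only
    the LBA fields are meaningful. No boot code is included.
    """
    mbr = bytearray(512)
    chs = b"\xfe\xff\xff"
    entry = struct.pack(
        "<B3sB3sII",
        0x80 if bootable else 0x00,  # status flag
        chs,                         # CHS of first sector (placeholder)
        part_type,                   # 0x07 = NTFS
        chs,                         # CHS of last sector (placeholder)
        start_lba,                   # first sector of the partition
        sector_count,                # partition length in sectors
    )
    mbr[446:462] = entry             # first slot of the partition table
    mbr[510:512] = b"\x55\xaa"       # boot signature
    return bytes(mbr)


# Example values are illustrative only.
mbr = build_mbr(start_lba=2048, sector_count=64 * 1024**3 // 512 - 2048)
print(len(mbr))  # 512
```

Both LBA fields are 32-bit, so this layout only describes partitions up to 2TiB with 512-byte sectors, which is sufficient for virtual disks in the size range mentioned here.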

[Data acceptance]

After all of the customer's important data had been restored, the customer arranged for an engineer to test the integrity and accuracy of the restored data. No problems were found, and the data recovery was successful.

