RAID disk array data recovery - database repair process

One of the customer's IBM DS5020 Fibre Channel storage systems failed, resulting in data loss. The storage used 16 hard disks to form a RAID disk array. Disks No. 10 and No. 13 had dropped offline, disk No. 6 showed a SMART warning, and data recovery was required.

RAID disk array failure situation:

The engineers first backed up the complete current log state through IBM Storage Manager and analyzed the backed-up logs to obtain information about the logical volume structure. All the disks in the client's server were labeled in a fixed order and removed from their slots for testing. The tests showed that all hard disks in the array were readable; the only anomaly was that the SMART status of disk No. 6 was "Warning".

Disk array data recovery process:

The engineer first marked the intact hard disks of the RAID array as offline in the Windows environment (so that nothing would be written to them), and then performed a full sector-by-sector backup of all disks. During the backup, disk No. 6 was found to be abnormally slow, presumably because it contained many unstable sectors and bad sectors, so dedicated equipment for imaging hard disks with bad sectors was used to mirror disk No. 6 instead, with the handling of bad-sector data adjusted accordingly.
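For illustration only (the actual work used dedicated imaging hardware, not a script), a minimal Python sketch of bad-sector-tolerant imaging could look like the following; the function name, device path, and chunk size are all hypothetical:

import os

SECTOR = 512
CHUNK = 2048 * SECTOR  # 1 MiB fast-path reads

def image_disk(src_dev, dst_img, bad_log):
    # Mirror src_dev into dst_img, zero-filling and logging unreadable sectors.
    src = os.open(src_dev, os.O_RDONLY)
    size = os.lseek(src, 0, os.SEEK_END)
    with open(dst_img, "wb") as dst, open(bad_log, "w") as log:
        pos = 0
        while pos < size:
            n = min(CHUNK, size - pos)
            os.lseek(src, pos, os.SEEK_SET)
            try:
                dst.write(os.read(src, n))                # fast path: large reads
            except OSError:
                for s in range(pos, pos + n, SECTOR):     # slow path: retry per sector
                    os.lseek(src, s, os.SEEK_SET)
                    try:
                        dst.write(os.read(src, SECTOR))
                    except OSError:                       # unreadable sector:
                        dst.write(b"\x00" * SECTOR)       # fill with zeros
                        log.write("%d\n" % s)
            pos += n
    os.close(src)

Only a chunk that fails is retried sector by sector, so a mostly-healthy disk images quickly while bad regions are zero-filled and their offsets logged for the later stripe repair.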

After the mirroring operation, the remaining disks were imaged with WinHex under Windows. The logs generated by WinHex showed bad sectors on disk No. 1, even though that disk had reported no error in IBM Storage Manager and its SMART status was clean, while disks No. 10 and No. 13 both contained a large number of irregular bad sectors. Using the bad-sector lists, the affected regions of the image files were located and analyzed, and some key file system metadata of the disk array turned out to fall within bad sectors; those blocks were repaired manually by XOR-ing the corresponding blocks of the same stripe on the other disks, together with the image of disk No. 6. We then loaded all of the backed-up RAID images into data recovery software and, by reverse-analyzing the ext3 file system and its journal, determined the disk order of the RAID array, the RAID block (stripe) size, the parity rotation direction, the parity algorithm, and the other information needed for reconstruction.
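The manual repair relies on the XOR property of parity RAID: the blocks of one stripe XOR to zero across all members, so a block lost to bad sectors can be rebuilt from the corresponding blocks of the remaining member images. A minimal Python sketch of that step, in which all file names, offsets, and sizes are hypothetical:

from functools import reduce

def rebuild_block(images, offset, block_size, missing):
    # XOR the same block from every member image except the missing one.
    blocks = []
    for i, path in enumerate(images):
        if i == missing:
            continue
        with open(path, "rb") as f:
            f.seek(offset)
            blocks.append(f.read(block_size))
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def patch_image(image_path, offset, data):
    # Write the rebuilt block back into the damaged image.
    with open(image_path, "r+b") as f:
        f.seek(offset)
        f.write(data)

For one damaged 64 KB block this would be, e.g., patch_image(images[bad], off, rebuild_block(images, off, 65536, bad)), assuming the block sits at the same offset on every member.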
Based on the analysis of this RAID information, the RAID disk array was virtually reassembled and the database files were extracted by parsing the ext3 file system. An error occurred with the extracted database files: the import reported an IMP-00008 error. The data recovery engineer re-analyzed the RAID structure and extracted the dmp file and the original dbf data files again; this time all files were normal and no errors were reported.
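Virtual reassembly is in essence a mapping from every logical block of the array to a (disk, offset) pair in the member images, driven by the parameters recovered above. Below is a Python sketch for one common layout, RAID-5 with left-asymmetric parity rotation; that layout is assumed purely for illustration, since the real DS5020 geometry comes from the analysis, not from this code:

def raid5_map(logical_block, n_disks):
    # Left-asymmetric RAID-5: parity starts on the last disk and moves left
    # by one member each stripe; data fills left to right, skipping parity.
    data_disks = n_disks - 1
    stripe = logical_block // data_disks
    idx = logical_block % data_disks
    parity = (n_disks - 1) - (stripe % n_disks)
    disk = idx if idx < parity else idx + 1
    return disk, stripe

def read_logical(images, block_size, logical_block, n_disks):
    # Fetch one logical block of the virtual array from the member images.
    disk, stripe = raid5_map(logical_block, n_disks)
    with open(images[disk], "rb") as f:
        f.seek(stripe * block_size)
        return f.read(block_size)

Concatenating read_logical over all logical blocks yields a flat image of the array, on which an ext3 parser can locate and extract the database files.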

Database data recovery process:

1. Copy the recovered database files to the original database server under /home/oracle/tmp/syntong as a backup. Create an oradata folder in the root directory and copy the entire backed-up syntong folder into the oradata directory. Then change the owning group and the permissions of the oradata folder and all files under it.
2. Back up the original database environment, including the related files in the product folder under ORACLE_HOME. Configure the listener and use sqlplus on the original machine to connect to the database. Start the database to the nomount state; basic status queries confirmed that the environment and the parameter file were fine. Bring the database to the mount state; the status queries were again clean (the queries used are sketched after the error listing below). Starting the database to the open state produced an error:

ORA-01122: database file 1 failed verification check
ORA-01110: data file 1: '/oradata/syntong/system01.dbf'
ORA-01207: file is more recent than control file - old control file
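For reference, the status checks at each stage can be issued from the same sqlplus session using standard dictionary views, along these lines:

SQL> select status from v$instance;           -- STARTED in nomount, MOUNTED after mount
SQL> select name, open_mode from v$database;  -- readable once the database is mounted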

3. Further inspection and analysis determined that the fault was an inconsistency between the control file and the data files, a common failure mode after a power loss or sudden shutdown.
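The mismatch can be confirmed in the mount state by comparing the checkpoint SCN recorded in the control file with the SCNs in the data file headers, for example:

SQL> select checkpoint_change# from v$database;
SQL> select file#, checkpoint_change#, fuzzy from v$datafile_header;

Data file headers carrying a newer checkpoint than the control file is exactly the condition behind the ORA-01207 "file is more recent than control file" message above.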
4. Check the database files one by one; none of the data files showed physical damage.
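One way to run such a per-file check is Oracle's DBVERIFY utility, for example (the 8 KB block size is an assumption):

dbv file=/oradata/syntong/system01.dbf blocksize=8192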
5. In the mount state, back up the control file: alter database backup controlfile to trace as '/backup/controlfile'; then view and edit the backup trace to obtain the control file rebuild commands, and copy those commands into a new script file, controlfile.sql.
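After editing, controlfile.sql is essentially a single CREATE CONTROLFILE statement. The skeleton below is a reconstruction for illustration only; the database name, limits, sizes, and file lists are placeholders inferred from the paths in this article, not the customer's actual script:

CREATE CONTROLFILE REUSE DATABASE "SYNTONG" RESETLOGS NOARCHIVELOG
    MAXLOGFILES 16
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 8
LOGFILE
  GROUP 1 '/oradata/syntong/redo01.log' SIZE 50M,
  GROUP 2 '/oradata/syntong/redo02.log' SIZE 50M,
  GROUP 3 '/oradata/syntong/redo03.log' SIZE 50M
DATAFILE
  '/oradata/syntong/system01.dbf',
  '/oradata/syntong/undotbs01.dbf',
  '/oradata/syntong/users01.dbf'
CHARACTER SET ZHS16GBK;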
6. Shut down the database and delete the 3 control files under /oradata/syntong/. Start the database to the nomount state and execute the controlfile.sql script:

SQL> startup nomount
SQL> @controlfile.sql

7. After the control file had been rebuilt, opening the database directly reported an error, so further processing was needed.

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/free/oracle/oradata/orcl/system01.dbf'

Then execute the recovery command:

SQL> recover database using backup controlfile until cancel;
Recovery of Online Redo Log: Thread 1 Group 1 Seq 22 Reading mem 0
Mem# 0 errs 0: /free/oracle/oradata/orcl/redo01.log

Continue the media recovery until it reports that recovery is complete.
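At each prompt of cancel-based recovery, the needed log can be supplied by path, or the recovery ended by typing CANCEL. A reconstructed exchange for illustration, with the change numbers elided:

ORA-00279: change ... generated at ... needed for thread 1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/free/oracle/oradata/orcl/redo01.log
Log applied.
Media recovery complete.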
8. Try to open the database.
SQL> alter database open resetlogs;
9. The database started successfully. Because temp files are not recorded in a rebuilt control file, re-add the data files of the original temp tablespace to the corresponding temp tablespace.
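A statement along these lines restores the temporary tablespace storage; the file path and size are placeholders:

SQL> alter tablespace temp add tempfile '/oradata/syntong/temp01.dbf' size 2048m reuse;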
10. Perform the various routine checks on the database; no errors were found.
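Typical checks at this point include, for example:

SQL> select count(*) from dba_objects where status = 'INVALID';
SQL> select * from v$recover_file;
SQL> select name, status from v$datafile;

An empty v$recover_file and no unexpected invalid objects indicate that the recovered files are consistent.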
11. Make an exp (export) backup; the full database export completed without errors. Finally, connect the application to the database for application-level data validation.
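A representative full-export invocation with the classic exp utility; credentials and paths are placeholders:

exp system/password full=y file=/backup/syntong_full.dmp log=/backup/syntong_full.log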
