National Computer Rank Examination Level 3 Database Technology (11)

Chapter 11_Fault Management

Test Analysis

◆This chapter generally appears in multiple-choice and fill-in-the-blank questions on the exam.
◆Common test knowledge points include:
1. Master the types of faults and corresponding solutions
2. Master the relevant content of data dump and log files
3. Master RAID redundancy technology and server fault tolerance technology
4. Be familiar with database mirroring and database disaster recovery

11.1 Overview of Fault Management

1. Fault types and their solutions
1. Faults in a database system fall roughly into four types: transaction-internal faults, system faults, media faults, and computer virus faults.
01. Transaction-internal faults: divided into expected and unexpected; most are unexpected.
02. System faults: also called soft faults; they affect all running transactions.
03. Media faults: also called hard faults; they may damage the physical storage device.
04. Computer virus faults: malicious computer programs that can damage the database system.

Although the four types of faults differ, their impact on the database is of two kinds: damage to the database itself, or damage to the data in the database.
The basic principle of recovery can be summed up in one word: redundancy. All data in the database can be reconstructed from redundant data stored elsewhere.
2. Transaction-internal faults
3. System faults
4. Media faults
Description: Media failure, also called hard failure, mainly refers to failures in which part or all of the data in the database is lost during operation due to head crashes, disk damage, strong magnetic interference, or natural and man-made disasters. There are two fault-tolerance countermeasures for media failure: software fault tolerance and hardware fault tolerance.
A. Hardware fault tolerance
a. Hardware fault-tolerance methods can ensure that the database is completely restored after a media failure.
b. The most common hardware fault-tolerance method today is to use dual physical storage devices, such as dual hard-disk mirroring.
c. Higher-level hardware fault-tolerance schemes require dedicated storage devices.
d. Another hardware fault-tolerance method is to build two identical database systems at some distance from each other and keep their data synchronized through database software mechanisms.
5. Computer virus faults
Description: A computer virus is a malicious computer program that can replicate and spread the way a biological virus does. While damaging the computer system, it may also damage the database system (mainly by destroying database files). The ways to guard against computer virus damage are as follows:
a. Use firewall software to prevent virus intrusion.
b. Use anti-virus software to disinfect virus-infected database files.
c. Use database backup files to restore database files in a software fault-tolerant manner.

2. Overview of database recovery technology
The concept of database recovery: when a failure occurs, redundant data stored elsewhere in the system can be used to reconstruct the damaged or incorrect data in the database, restoring the database from an erroneous state to a known correct state and thereby rebuilding a complete database.
The two issues involved in the recovery mechanism are:
① How to establish redundant data;
② How to use these redundant data to implement database recovery.

Techniques for creating redundant data:
data backup, logging (log files), database replication, database mirroring, setting savepoints for segments, and using backup segments plus the current page table to support segment preservation.

11.2 Data Dump

1. The basic concept of data dump
1. The concept of data dump: the process in which the database administrator (DBA) or the database management system periodically copies the database and stores the copies on other media. For this reason, data dump is also called data backup.
2. After a database system failure, the database administrator can use these copies to restore the database, but only to the state at the time of the dump. To restore to the state just before the failure, the log files must also be used.

3. Introducing log files on top of dynamic dumps makes it possible to restore the database to the correct state at a given moment.

2. Static dump and dynamic dump
Data dump is divided into static dump and dynamic dump, the details are as follows:
1. Static dump
(1) During a static dump, no access to or modification of the database is allowed; that is, the system must be in a consistent state before and after the dump.
(2) Static dump is simple, but the dump operation must wait for ongoing transactions to finish, and new transactions must wait for the dump to complete. The dump and transactions are mutually exclusive: in any given period, either the dump runs or transactions run, so the availability of the database is reduced.
2. Dynamic dump
(1) Dynamic dump allows the dump operation and user transactions to execute concurrently; that is, access to and modification of the database are allowed while the dump is in progress.
(2) During a dynamic dump, transactions may be modifying data in the database, so the consistency of the dumped data cannot be guaranteed: the dump file only saves data as it stood at the moment it was copied, and changes a transaction makes afterwards are not reflected in the dump file.
(3) Therefore the data in the dump file is not a snapshot of a single point in time; it may mix data from several points in time.
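This inconsistency can be shown with a toy example. The pages ("a", "b") and values here are hypothetical, purely for illustration:

```python
# A toy illustration of why a dynamic dump may mix data from several
# points in time: page "a" is copied before a transaction updates it,
# page "b" is copied after.
db = {"a": 1, "b": 1}
dump = {}

dump["a"] = db["a"]          # dump copies page "a" (value as of time t0)
db["a"], db["b"] = 2, 2      # a transaction updates both pages at time t1
dump["b"] = db["b"]          # dump copies page "b" (value as of time t1)

# The dump is internally inconsistent: at no single moment was a=1 and b=2.
assert dump == {"a": 1, "b": 2}
```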
3. Data dump mechanism
1. Data dump mechanism
The data dump mechanism provides three dump methods:
(1) Full dump
A full dump dumps all the data in the database. This method takes more time and space, but recovery after a system failure is fast.
(2) Incremental dump
An incremental dump copies only the files or data blocks that have changed since the last dump. It needs relatively little time and space, but its data can only be restored together with the full dump (and the intervening incremental dumps), so recovery takes longer than with a full dump alone.
(3) Differential dump
A differential dump copies the data changes made since the last full dump. Compared with a full dump it is faster and takes less space; compared with an incremental dump it is slower and takes more space, but recovery is faster than with incremental dumps.
2. Comparison of the three dump mechanisms
The characteristics of the three data dump mechanisms (full dump, incremental dump, and differential dump) are compared in the table below.

                      Full dump   Incremental dump   Differential dump
Time and space used   most        least              less than full dump
Dump speed            slowest     fastest            faster than full dump
Recovery speed        fastest     slowest            faster than incremental dump

3. Combined use of multiple dump methods
01: Only use full dump
Using only full dumps transfers a large amount of data each time, takes much time and space, and may noticeably affect database performance; this method is costly.
02: Full dump plus incremental dump
Full dump plus incremental dump performs a full dump at intervals and several incremental dumps between two full dumps, avoiding the large data movement that results from using full dumps alone.
03: Full dump plus differential dump
Because data recovery is cumbersome with the full-plus-incremental method, the full-dump-plus-differential-dump method is used instead. Although a differential dump moves and stores more data than an incremental dump, the restore operation is simpler and the recovery time is shorter.
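The difference between the two combined strategies can be sketched with a small helper that picks which backups recovery must apply. The `Dump` structure, its fields, and the `restore_chain` function are hypothetical, for illustration only:

```python
from dataclasses import dataclass

@dataclass
class Dump:
    kind: str   # "full", "incremental", or "differential"
    time: int   # time at which the dump was taken

def restore_chain(dumps, fail_time):
    """Return the dumps to apply, in order, to restore the state of the
    last dump taken before fail_time. Assumes at least one full dump exists."""
    usable = [d for d in dumps if d.time <= fail_time]
    # Restoration always starts from the most recent full dump.
    base = max((d for d in usable if d.kind == "full"), key=lambda d: d.time)
    after = [d for d in usable if d.time > base.time]
    differentials = [d for d in after if d.kind == "differential"]
    if differentials:
        # A differential dump holds all changes since the full dump,
        # so only the latest one is needed.
        return [base, max(differentials, key=lambda d: d.time)]
    # Incremental strategy: every incremental since the full dump
    # must be replayed, in order.
    incrementals = sorted((d for d in after if d.kind == "incremental"),
                          key=lambda d: d.time)
    return [base] + incrementals
```

With a full dump at t=0 and incrementals at t=1 and t=2, recovery must apply all three files in order; with differentials at t=1 and t=2 instead, recovery applies only the full dump and the latest differential, which is why full-plus-differential restores faster.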

11.3 Log files

1. The concept of the log file
The log file records each transaction's modification operations on the database. While the database system runs, the modification operations of all transactions are registered in the log file.
The specific function of the log file is as follows.
1. Log files must be used for transaction failure recovery and system failure recovery
2. Log files must be created in dynamic dump mode
3. Log files can also be used in static dump mode

1. Transaction failure recovery and system failure recovery must use log files
(1) Failure recovery using log files relies on two basic operations: UNDO(T), which undoes the updates of transaction T, and REDO(T), which re-applies them.
Two basic operations for failure recovery

(2) Transaction failure recovery
A transaction is a complete unit of work: its operations are either all done or none done, otherwise the database would be left in an inconsistent state. So for transaction failure recovery, it suffices to undo the corresponding transaction's operations with UNDO(Ti).
2. System failure recovery
When the system fails, multiple running transactions may be affected. Recovery handles the following cases:
(1) Undo
Transactions that started but did not commit, i.e. the log file contains a BEGIN TRANSACTION record but no COMMIT or ROLLBACK record.
(2) Redo
Transactions that completed all of their operations and committed, i.e. the log file contains both the BEGIN TRANSACTION record and the COMMIT record.
The recovery steps are:
① Scan the log file forward to find all transactions that ran before the failure. If a transaction did not complete, add its identifier to the undo queue; if it completed, add its identifier to the redo queue.
② Perform the UNDO operation on every transaction in the undo queue.
③ Perform the REDO operation on every transaction in the redo queue.
2. A log file must be created in dynamic dump mode
01. A dump file alone can only restore the database to some state during the dump process, and the data in it may be inconsistent.
02. Only by using dynamic dumps together with log files can the database be restored to a consistent state, or to the state just before the failure, so that recovery is effective.
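The system-failure recovery procedure described above (scan the log, build the undo and redo queues, then apply UNDO and REDO) can be sketched as follows. The tuple-based log record format is a hypothetical simplification:

```python
# A minimal sketch of system-failure recovery from a log file.
# Log records are hypothetical tuples: ("BEGIN", tid),
# ("UPDATE", tid, key, old_value, new_value), ("COMMIT", tid).
def recover(log, db):
    """Restore db (a dict, modified in place) to a consistent state."""
    started = {r[1] for r in log if r[0] == "BEGIN"}
    committed = {r[1] for r in log if r[0] == "COMMIT"}
    redo_queue = committed            # completed transactions
    undo_queue = started - committed  # began but never committed
    # REDO: re-apply the new values of committed transactions, in log order.
    for rec in log:
        if rec[0] == "UPDATE" and rec[1] in redo_queue:
            _, _, key, _old, new = rec
            db[key] = new
    # UNDO: restore the old values of uncommitted transactions, in reverse order.
    for rec in reversed(log):
        if rec[0] == "UPDATE" and rec[1] in undo_queue:
            _, _, key, old, _new = rec
            db[key] = old
    return db
```

For example, if T1 updated x from 0 to 1 and committed, while T2 updated y from 0 to 5 but never committed, recovery leaves x at 1 (redone) and y back at 0 (undone).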
3. Log files can also be used in the static dump mode
(1) In static dump mode, when the database is damaged, the dump file can first be used to restore the database to the state at the end of the dump; the log files are then used to redo the transactions completed after the dump and to undo the transactions unfinished when the fault occurred.
(2) The process of using log file recovery in static dump mode is shown in the figure.
Recovery from log files

2. The format and content of log files
1. Different database systems use log file formats that are not exactly the same; they can be summarized into two types: log files with records as units and log files with data blocks as units.
01. Log files with records as units
Each log file includes: the start mark of each transaction (BEGIN TRANSACTION), the end mark of each transaction (a commit record or an abort record), and all modification operations of each transaction (located between the start mark and the end mark).
02. Log files with data blocks as units
The entire block before the update and the entire block after the update are both written into the log file. The log record therefore only needs to contain the transaction identifier and the updated data blocks; it need not record the operation type, the operation object, or similar information.
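The two log granularities might be modeled with the following hypothetical record shapes (field names are illustrative, not any particular DBMS's format):

```python
from dataclasses import dataclass

@dataclass
class RecordLogEntry:
    """Record-unit logging: the operation and its object must be recorded."""
    tid: str          # transaction identifier
    op: str           # "BEGIN", "UPDATE", "COMMIT", or "ABORT"
    obj: str = ""     # operation object (e.g. a record id), for updates
    old: bytes = b""  # before-image of the record
    new: bytes = b""  # after-image of the record

@dataclass
class BlockLogEntry:
    """Block-unit logging: only the transaction id and whole blocks."""
    tid: str
    before_block: bytes  # entire block before the update
    after_block: bytes   # entire block after the update
```

Note the trade-off visible in the shapes: the block-unit entry is simpler (no operation type or object) but stores two whole blocks per update.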
2. Composition of log files
Composition of log files
3. Principles of registering log files
To guarantee that the database is recoverable, registering log files must follow two principles:
① Log records must be registered strictly in the time order in which the concurrent transactions execute.
② The log record must be written first, and the database written afterwards (write-ahead logging).

4. Checkpoint
1. The role of checkpoint
When log files are used to restore database data, the recovery subsystem must search the log and check all log records, undoing the transactions uncommitted at the time of failure and redoing the committed ones. Checkpoints minimize the portion of the log that must be scanned during recovery, improving recovery efficiency.
2. The introduction of checkpoints
(1) Introducing checkpoints adds a new type of record to the log file, the checkpoint record, plus a "restart file" (which records the address of each checkpoint record in the log file), and requires the recovery subsystem to maintain the log dynamically while writing log records. A checkpoint record contains:
01. A list of all transactions being executed when the checkpoint is established.
02. The address of the last log record of these transactions.

(2) Steps for dynamically maintaining log files
Dynamically maintaining log files means periodically establishing checkpoints and saving the database state. The specific steps are as follows:
A. Write all the log records in the current log buffer to the log file on the disk.
B. Write a checkpoint record in the log file.
C. Write all the data records of the current data cache into the database on the disk.
D. Write the address of the checkpoint recorded in the log file to a "restart file".
The recovery subsystem can establish checkpoints periodically or irregularly to save the state of the database; for example, checkpoints can be established at predetermined time intervals.
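Steps A through D can be sketched as follows, with hypothetical in-memory lists and dicts standing in for the real caches and disk files:

```python
# A minimal sketch of the checkpointing steps A-D above.
# Buffers, files, and record formats are hypothetical simplifications.
log_buffer, data_buffer = [], {}
log_file, database, restart_file = [], {}, []

def checkpoint(active_transactions):
    # A. Write all log records in the current log buffer to the log file on disk.
    log_file.extend(log_buffer)
    log_buffer.clear()
    # B. Write a checkpoint record into the log file, listing the
    #    transactions executing when the checkpoint is established.
    log_file.append(("CHECKPOINT", sorted(active_transactions)))
    # C. Write all data records in the current data cache to the database on disk.
    database.update(data_buffer)
    data_buffer.clear()
    # D. Record the address of the checkpoint record in the "restart file".
    restart_file.append(len(log_file) - 1)
```

After `checkpoint(...)` runs, the restart file holds the position of the latest checkpoint record, so recovery can start scanning the log from there instead of from the beginning.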
3. Checkpoint-based recovery steps
Checkpoint-based recovery steps

11.4 Hardware Fault Tolerance Scheme

1. Overview of hardware fault tolerance scheme
1. To guarantee continuous operation of the database system, database system software alone is not enough; protection is also needed at the hardware level.
2. A hardware fault-tolerance scheme must start from the various environments the database system depends on and analyze every link that supports its operation: machine-room power, machine-room air conditioning, the network, storage, and the servers. All of these must be considered together, because a failure in any one link may make the database system inoperable.
2. RAID system
1. Description: A Redundant Array of Inexpensive Disks (RAID) combines multiple disks into a single unit. This is not merely the sum of the disks' capacities: compared with other storage devices, RAID further improves capacity, management, performance, reliability, and availability.
2. Features: When one of the disks fails or is removed, the information on it can be recovered from the information on the other disks.
The RAID system can be connected to the host system as a medium for storing data, and has the ability of device virtualization. The RAID subsystem diagram is shown in the figure.
RAID subsystem diagram

3. Two redundancy technologies of RAID
(1) Mirror redundancy
Mirror redundancy is to copy all data to other devices or other places.
Features: very simple to implement, but the extra overhead is high, requiring more disks, controllers, and cables.
(2) Check redundancy
Check redundancy obtains a check value by performing an exclusive OR (XOR) over the data on the member disks, and stores it on a separate check disk.
Features: more complicated to implement, but it uses less redundant disk space than mirroring.
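A minimal sketch of check redundancy: the check disk stores the XOR of the member disks, so any single lost disk can be rebuilt by XORing the survivors with the check disk (the disk contents below are illustrative):

```python
# Check (parity) redundancy via XOR, on equal-length byte blocks.
def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

member_disks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
check_disk = xor_blocks(member_disks)   # parity of all member disks

# Disk 1 fails: rebuild it from the surviving disks plus the check disk.
rebuilt = xor_blocks([member_disks[0], member_disks[2], check_disk])
assert rebuilt == member_disks[1]
```

This is why check redundancy needs only one extra disk per group, while mirror redundancy doubles the disk count; the cost is the XOR computation on every write.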
4. Common RAID levels and features are shown in the figure
Common RAID levels and features
5. Soft RAID and Hard RAID

3. Server fault tolerance technology
1. Reasons for introducing server fault tolerance
(1) RAID technology prevents disk damage from bringing the database system to an abnormal stop.
(2) Server fault tolerance prevents server hardware failures or operating system failures from disrupting the database system and leaving it unable to provide service.
(3) Server fault-tolerance technology is a solution for server hardware failures.
2. Introduction to server fault-tolerance technology
Server fault tolerance generally uses two identical servers that share storage devices; one of them runs the database system, and the database data is stored on the shared storage.
3. Introduction to other server fault-tolerant technologies
(1) Hardware
To provide high stability at the hardware level, some minicomputers adopt proprietary hardware and software frameworks of their manufacturers' own design.
For example, when a CPU or a memory module fails, if the operating system kernel is not running on that CPU or memory, the kernel can assign tasks to the healthy CPUs or memory and then take the faulty components offline, keeping the system running stably.
(2) Software
Some large-scale database software provides dedicated server-level fault-tolerance technology.
For example, the Oracle database provides the RAC architecture. In Oracle RAC, the database runs on multiple servers at the same time, and these servers share one storage system.

4. Database mirroring and database disaster recovery
1. Reasons for introducing database mirroring
As enterprise informatization deepens, daily work depends ever more heavily on information services, so enterprises demand higher reliability and stability from database servers. To avoid media failures affecting database availability, many database management systems provide database mirroring capabilities.
2. Introduction to database mirroring
(1) Database mirroring is a solution for improving database availability. According to the DBA's requirements, it automatically copies the entire database, or key data in it, to another disk. Different disks serve different database servers, which back up each other's applications and data through corresponding software and hardware means.
(2) Advantages of database mirroring
01. In the event of a disaster, database mirroring can quickly bring the standby copy of the database into service, so data is not lost and database availability improves.
02. Database mirroring provides complete or nearly complete data redundancy, enhancing data protection.
03. Database mirroring improves the availability of the database during upgrades.
3. Classification of database mirroring
1. The basic structure of database mirroring is divided into: dual-machine mutual backup mode and dual-machine hot backup mode.
(1) Dual-machine mutual backup mode
Dual-machine mutual backup means both hosts are working machines. Under normal circumstances, both machines
support the information system and monitor each other's operating condition. When one host fails, the other takes over its work, keeping the information system running without interruption; however, the load on the surviving host increases.
(2) Two-machine hot backup mode
Two-machine hot backup means that one host is a working machine and the other is a backup machine. In the normal operation of the system, the working machine provides support for the information system, and the backup machine monitors the operation of the working machine.
2. The switching timing of dual-machine mutual backup mode and dual-machine hot backup mode is the same, and the content is as follows.
01. The system software or application software causes the server to go down.
02. The server is not down, but the system software or application software is not working properly.
03. The SCSI card is damaged, so the server cannot access data on the disk array.
04. Hardware inside the server is damaged, causing the server to go down.
05. The server shuts down abnormally.

4. How database mirroring works

  1. Introduction to SQL Server database mirroring
    (1) SQL Server database mirroring is to move database transactions from a SQL Server database to another SQL Server database in a different SQL Server environment.
    (2) The mirror copy is a backup copy; it cannot be accessed directly and is used only for error recovery.
    (3) Database mirroring session mode
    01. Asynchronous operation: Transactions can be submitted without waiting for the mirror server to write the log to disk, maximizing performance.
    02. Synchronous operation: The transaction will be submitted at both partners, but the transaction lag time will be extended.
    (4) Database mirroring has two operating modes: high-safety mode and high-performance mode.
    High-safety mode:
    ●Supports synchronous operation.
    ●When the session starts, the mirror server synchronizes the mirror database with the principal database as soon as possible. Once the databases are synchronized, a transaction is committed at both partners, which lengthens transaction latency.
    (5) Database mirroring provides three implementation methods: high availability, high protection, and high performance.
    01. High availability: synchronous transaction writes on two servers, and supports automatic error recovery.
    02. High protection: Synchronous transaction writing on two servers, but error recovery is manual.
    03. High performance: writes on the two servers can be asynchronous, which improves performance, but only manual error recovery is possible.

Origin blog.csdn.net/weixin_47288291/article/details/123519588