"MySQL Practical Combat 45 Lectures" - Study Notes 31 "Solutions for accidentally deleting data (deleting rows/deleting tables/deleting databases/deleting instances)"

This article covers several ways data can be accidentally deleted in MySQL, and how to deal with each:

  • Accidentally deleting rows with the delete statement;
  • Accidentally deleting a table with drop table or truncate table;
  • Accidentally deleting a database with drop database;
  • Accidentally deleting the entire MySQL instance with the rm command.

Accidentally deleting rows with the delete statement

If rows were deleted by mistake with a delete statement, the Flashback tool can restore them by "flashing back" the changes;

Flashback restores data by rewriting the contents of the binlog and replaying it against the original database. The prerequisites for this approach are binlog_format=row and binlog_row_image=FULL;

binlog_format

  • With binlog_format=statement, the binlog records the original text of the SQL statement, including comments. In this format, a statement containing limit may be unsafe: a delete ... limit can delete different rows on the master and the standby if they happen to use different indexes. So although the statement format keeps the log small and saves IO, it can cause master/standby inconsistency;
  • With binlog_format=row, the binlog records the primary key ids of the rows actually deleted, so the master and standby can never delete different rows; the trade-off is a much larger log volume and higher IO cost;
  • MySQL also provides a compromise: the mixed format. MySQL judges whether a given statement could cause master/standby inconsistency; if it could, the row format is used, otherwise the statement format. In other words, mixed keeps the small-log advantage of statement while avoiding the inconsistency risk.

Even though MySQL provides the mixed format, it is still recommended to set binlog_format=row. One important benefit is data recovery: the row format records exactly which rows were modified and what they contained, so they can be restored precisely;

binlog_row_image

  • before image: the contents of the table row before the modification;
  • after image: the contents of the table row after the modification;
  • binlog_row_image accepts three legal values: FULL, MINIMAL, and NOBLOB. With FULL, all columns are logged in both the before image and the after image, so the binlog preserves the complete row contents before and after every change, which makes recovery possible.
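As a quick check, the two prerequisites can be verified (and, if needed, set) from a MySQL client session with sufficient privileges; this is a fragment to run against a live server:

```sql
-- Verify the prerequisites for Flashback-style recovery
SHOW GLOBAL VARIABLES LIKE 'binlog_format';     -- should be ROW
SHOW GLOBAL VARIABLES LIKE 'binlog_row_image';  -- should be FULL

-- Set them if needed (affects new sessions; also persist in my.cnf)
SET GLOBAL binlog_format = 'ROW';
SET GLOBAL binlog_row_image = 'FULL';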

When restoring data, Flashback does the following within a single transaction:

  • For an insert statement, the corresponding binlog event type is Write_rows event; change it to a Delete_rows event;
  • For a delete statement, do the reverse: change the Delete_rows event to a Write_rows event;
  • For an Update_rows event, the binlog records the row values both before and after the modification; simply swap the two images.

For example, consider the following three transactions:

(A) delete...
(B) insert...
(C) update...

To restore the data, after parsing the binlog with the Flashback tool, the commands written back to the primary database are:

(reverseC) update...
(reverseB) delete...
(reverseA) insert...

That is, if the accidental deletion spans multiple transactions, the transactions must be replayed in reverse order;
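As a concrete sketch of the rules above, suppose the three transactions touched a hypothetical table t(id INT PRIMARY KEY, v INT). The statements a flashback tool would effectively write back look like this (illustrative only, not the tool's actual output):

```sql
-- Original operations, in execution order:
--   (A) DELETE FROM t WHERE id = 1;        -- deleted row image: (1, 10)
--   (B) INSERT INTO t VALUES (2, 20);
--   (C) UPDATE t SET v = 31 WHERE id = 3;  -- before image: (3, 30)

-- Flashback replays the inverse of each, in reverse order:
UPDATE t SET v = 30 WHERE id = 3;  -- reverse C: swap before/after images
DELETE FROM t WHERE id = 2;        -- reverse B: Write_rows -> Delete_rows
INSERT INTO t VALUES (1, 10);      -- reverse A: Delete_rows -> Write_rows
```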

Suggestion: do not execute the recovery SQL directly on the primary database based on the binlog

This is because, on a primary database serving live traffic, data changes are often related to one another. The misoperation may be discovered some time after it happened, during which business logic has continued to run on top of the bad data and modified other rows;

Restoring just the mistakenly deleted rows at this point, without confirming these knock-on effects, can therefore cause secondary damage to the data;

Recommended solution for restoring data in a production environment

  • For complex data recovery, business staff need to decide the recovery plan. Beyond the misoperated rows themselves, the plan must fully account for associated data, and even caches built from the affected data may need to be restored accordingly; then write a recovery script to perform the restore;
  • For simple data recovery, first verify the SQL in a test environment to confirm it has no side effects on the business, then run it in production. Try restoring a single row first, verify it, and only then restore the rest. In short, operating directly on production MySQL data demands caution.

Preventing accidental row deletion with the delete command

  • Set the sql_safe_updates parameter to on. Then, if a delete or update statement is missing its where clause, or the where condition contains no indexed column, the statement fails with an error;
  • Before code goes live, it must pass SQL review.
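A minimal sketch of the first safeguard, assuming a hypothetical table t with primary key id and a non-indexed column v (fragment; run in a MySQL session):

```sql
SET sql_safe_updates = ON;

DELETE FROM t;                 -- rejected: no WHERE clause
DELETE FROM t WHERE v = 10;    -- rejected: v is not indexed (and no LIMIT given)
DELETE FROM t WHERE id = 5;    -- allowed: WHERE uses an indexed column
```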

Use the drop table or truncate table statement to delete the data table by mistake

How should table data be deleted?

The most direct way is to delete rows one by one with the SQL delete command; if you are sure a full-table delete is intended, you can make that explicit by adding the condition where id >= 0 to the statement;

However, a full-table delete is very slow: it generates rollback (undo) logs and writes redo and binlog for every row. From a performance perspective, truncate table or drop table should be preferred;
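The options compare roughly as follows (fragment; t is a hypothetical table):

```sql
DELETE FROM t WHERE id >= 0;  -- row by row: undo, redo, row-format binlog; slow but Flashback-recoverable
TRUNCATE TABLE t;             -- fast; binlog records only the statement itself
DROP TABLE t;                 -- fast; also removes the table definition
```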

Data deleted with the delete command can be restored with Flashback, but data removed with truncate table, drop table, or drop database cannot. Why?

Because even with binlog_format=row, these three commands are still logged in statement format: the binlog contains only the single truncate/drop statement, with no row data from which to recover;

How to recover after drop/truncate table? — Full data backup + incremental binlog backup

To restore data removed with drop/truncate, you need a full data backup plus the incremental binlog. This solution requires regular full backups online plus real-time backup of the binlog;

When both are in place, if someone accidentally drops a database at 12:00 noon, the recovery process is as follows:

  1. Take the most recent full backup. Suppose the database is backed up once a day, and the last backup was at 00:00 that day;
  2. Restore the backup into a temporary instance;
  3. From the binlog backup, take the logs written after 00:00;
  4. Apply those logs to the temporary instance, except for the statement that mistakenly deleted the data.
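Steps 1–3 might look like this on the command line; all host names, file names, and times here are hypothetical placeholders:

```shell
# 1-2. Restore the 00:00 full backup into a temporary instance
mysql -h tmp_host -P 3306 -u root -p < full_backup_20230423_0000.sql

# 3. Inspect the binlog written after 00:00 to locate the misoperation event
mysqlbinlog --start-datetime="2023-04-23 00:00:00" --verbose binlog.000042 | less
```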

How to skip the binlog of the misoperation statement?

  • If the instance does not use GTID mode, this can only be done by position: when applying the binlog file that contains the 12:00 event, use --stop-position to replay up to just before the misoperation, then use --start-position to continue from just after it;
  • If the instance uses GTID mode, it is much easier. Suppose the GTID of the misoperation is gtid1; execute set gtid_next=gtid1; begin; commit; to add gtid1 to the temporary instance's GTID set via an empty transaction, then replay the binlog in order — the misoperation statement is skipped automatically.
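Both skipping techniques can be sketched as follows; file names, positions, and the GTID below are placeholders, not values from the source:

```shell
# Without GTID: replay up to the bad event, then continue after it
mysqlbinlog --stop-position=1234  binlog.000042 | mysql -h tmp_host -u root -p
mysqlbinlog --start-position=1300 binlog.000042 | mysql -h tmp_host -u root -p
```

```sql
-- With GTID: occupy the misoperation's GTID with an empty transaction first
SET gtid_next = 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1234';
BEGIN; COMMIT;
SET gtid_next = 'AUTOMATIC';
-- then replay the whole binlog; the bad transaction is skipped automatically
```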

Use the drop database statement to delete the database by mistake

Because the binlog covers changes across the whole instance, the same "full backup + binlog" solution also applies when a database is dropped by mistake;

Even if parallel replication is used to speed up binlog replay, this solution still suffers from an "uncontrollable recovery time". If the backup is particularly large, or the misoperation happened long after the last full backup (for example, with weekly backups, a misoperation on day 6 means replaying six days of logs), recovery can take days;

So, how can the time required to restore data be shortened?

A standby database with delayed replication

If a core business cannot tolerate a long recovery time, consider building a standby database with delayed replication. This feature was introduced in MySQL 5.6;

The problem with an ordinary master-slave replication setup is that if a table on the master is dropped by mistake, the command quickly propagates to every slave, and the table is lost everywhere;

A delayed-replication standby is a special kind of standby: via the CHANGE MASTER TO MASTER_DELAY=N command, it is kept deliberately N seconds behind the primary;

For example, with N set to 3600, if a misoperation on the primary is discovered within an hour, the offending command has not yet been applied on the delayed standby. Execute stop slave on the standby, then use the "backup + incremental log" method described above to skip the misoperation in the binlog, and the required data can be recovered;
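Setting up, and later freezing, such a delayed standby might look like this (fragment; MySQL 5.6+ syntax, run on the standby):

```sql
-- Configure the standby to lag the primary by one hour
STOP SLAVE;
CHANGE MASTER TO MASTER_DELAY = 3600;
START SLAVE;

-- When a misoperation is discovered within that hour, freeze the standby:
STOP SLAVE;
-- then replay the remaining log up to just before the bad statement
```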

With such a delayed standby (N=3600), a temporary instance holding the needed data is at most one hour of catch-up replay away, which greatly shortens the total recovery time;

Methods to prevent accidental deletion of databases/tables

  • The first suggestion is to separate account permissions;

The purpose is to make it hard to run a destructive command by accident. For example, give business developers only DML privileges, not truncate/drop; their DDL needs can be served through a management system. Even DBA team members should use read-only accounts day to day, switching to an account with write privileges only when necessary;
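A minimal sketch of this privilege separation, with hypothetical account, schema, and password names:

```sql
-- Business account: DML only, no DROP/TRUNCATE/ALTER
CREATE USER 'app_rw'@'%' IDENTIFIED BY 'change_me';
GRANT SELECT, INSERT, UPDATE, DELETE ON appdb.* TO 'app_rw'@'%';

-- DBA day-to-day account: read-only
CREATE USER 'dba_ro'@'%' IDENTIFIED BY 'change_me';
GRANT SELECT ON appdb.* TO 'dba_ro'@'%';
```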

  • The second suggestion is to develop a code of practice;

The purpose is to avoid dropping the wrong table. For example, before deleting a table, first rename it; observe for a period to confirm the business is unaffected, and only then drop it. When renaming, require a fixed suffix to be appended to the table name (for example, _to_be_deleted), and require that drops go through the management system; the management system, in turn, is only allowed to drop tables carrying that fixed suffix;
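The rename-first workflow, with a hypothetical table name:

```sql
-- Step 1: rename with the agreed suffix and observe for a while
RENAME TABLE orders_2019 TO orders_2019_to_be_deleted;

-- Step 2: only after confirming no business impact, drop it
-- (in practice, via a management system that refuses to drop
--  any table lacking the _to_be_deleted suffix)
DROP TABLE orders_2019_to_be_deleted;
```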

Use the rm command to delete the entire MySQL instance by mistake

In fact, for a MySQL cluster with a high-availability mechanism, the scariest scenario is data removed by rm. As long as only one node's data is deleted, rather than the whole cluster being wiped maliciously, the HA system kicks in, elects a new primary, and the cluster keeps working;

All you then need to do is restore the data on that node and rejoin it to the cluster;

Of course, nowadays not only DBAs but also SAs (system administrators) have automated systems, so a single batch machine-decommissioning operation could wipe out every node of your MySQL cluster at once;

For all of the accidental-deletion problems above, whatever recovery scheme is used, the core prerequisite is good data backups. Against a cluster-wide rm, the only real advice is to keep backups across data centers, or better, across regions;

Reference for this article: Lecture 31, "What else can I do besides run away after accidentally deleting data?"

Origin blog.csdn.net/minghao0508/article/details/130314603