A discussion of MySQL database data security issues

In previous job interviews, questions about the MySQL database came up often, and the most typical ones concerned the data security of the MySQL database.

For example: How can we ensure the data security of a MySQL database? Where can data loss occur in a MySQL database? How can lost data in a MySQL database be recovered?

The main purpose of these questions is to test a job seeker's production experience; but in my interviewing experience, very few job seekers can answer them completely.

This is mainly because most job seekers do not clearly understand how the underlying data storage of the MySQL database works, so today we will discuss that mechanism in detail with the questions above in mind.

Under what circumstances may data loss occur in a MySQL database

Before looking at the circumstances under which a MySQL database may lose data, let us first review the modules that a piece of data passes through when it is written to MySQL, as follows.

  • The first module: the logic of the modification is saved in the change buffer.
  • The second module: the modified data is saved in the binlog cache.
  • The third module: the modified data is saved in the redo log.

Of these three modules, the change buffer saves the logic of the modification, which is later written to disk by means of a merge. For details, see [Article 7] and [Article 13].

Is there any possibility of data loss in this module? In practice, it is very small, mainly because MySQL adds the redo log module, whose main function is to prevent data loss. So, let's look at how the redo log saves data.

The redo log is mainly divided into two parts: the redo log itself and the redo log buffer. The roles of the redo log and the redo log buffer have been mentioned many times before, so I won't repeat them here. Below we mainly introduce how the redo log saves data.

Before a piece of data is written into the database, in order to prevent data loss, it is first saved in the redo log buffer and then saved in the redo log, so that it can be used for data recovery if the database goes down.

So at this point we have to ask: can the redo log itself lose data during this process?

It can. If the MySQL database crashes suddenly while a transaction is executing, all data in the redo log buffer is lost. Generally, however, this only affects transactions that have not been committed, so losing that data has little impact.

At this point, I believe you will ask, if the MySQL database crashes just when the transaction is committed, will the data be lost?

Obviously, this is also possible. Next, we analyze the possibility of losing data based on where the data in the redo log buffer is stored.

In the MySQL database, data in the redo log buffer can be in one of three states, namely:

Note: The operating system also keeps a cache in front of the disk, usually called the page cache.

  • The data is held in the redo log buffer, that is, in the memory of the MySQL process. (The redo log buffer is an InnoDB memory area of its own, separate from the buffer pool.) This matches the situation above: what is held here is uncommitted data, so if MySQL crashes at this point, the uncommitted transaction information is lost, which has no major impact on the MySQL database as a whole.
  • The data has been written to the page cache, that is, the operating system's cache in front of the disk. At this point the transaction in MySQL has been committed. If the server running the MySQL database goes down at this moment, the committed transaction data is lost, and data lost under such circumstances is irrecoverable.
  • The data has been saved to the disk. In this case, as long as the disk itself is not faulty, the data will not be lost.
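
The three states can be illustrated with ordinary file I/O. The following is a minimal Python sketch, not InnoDB's actual code: a bytearray stands in for the redo log buffer, os.write hands the bytes to the operating system (page cache), and os.fsync asks the operating system to flush them to the storage device. The function name, the temporary file, and the record contents are made up for illustration.

```python
import os
import tempfile


def persist_in_three_steps(record: bytes) -> bytes:
    """Walk one record through the three states and return what the file holds."""
    # State 1: the record exists only in process memory.
    # It is gone if the process crashes.
    log_buffer = bytearray()
    log_buffer += record

    fd, path = tempfile.mkstemp(suffix=".redo-demo")  # hypothetical log file
    try:
        # State 2: write() copies the bytes into the OS page cache.
        # They survive a process crash, but not a power loss or OS crash.
        os.write(fd, bytes(log_buffer))

        # State 3: fsync() asks the OS to flush the file to the storage device.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Read the file back to show that the record arrived, then clean up.
    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    return data


print(persist_in_three_steps(b"trx-1: UPDATE t SET c = c + 1"))
```

The gap between the write and the fsync is exactly the window in which a committed transaction sits only in the page cache.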

For these three states of the redo log buffer, MySQL provides a parameter called innodb_flush_log_at_trx_commit. This parameter has three values, and its main function is to tell MySQL where to save the data in the redo log buffer when a transaction commits, as follows:

  • When the parameter is set to 0, a commit leaves the data in the redo log buffer, that is, in memory, and InnoDB writes and flushes it to disk in the background about once per second. Performance is best, but if the database crashes or restarts abnormally, the redo log data still in the buffer (up to about one second of transactions) is lost.
  • When the parameter is set to 1, the data in the redo log buffer is written and flushed to disk at every commit. The data is safest, but performance is worst.
  • When the parameter is set to 2, the data in the redo log buffer is written to the page cache at every commit and flushed to disk about once per second. Performance is close to that of 0, but if the server on which the MySQL database is deployed goes down, the data still in the page cache is lost.
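
To make the difference between the three values concrete, here is a toy Python model. It is a sketch under simplifying assumptions, not how InnoDB is implemented: three lists stand in for the redo log buffer, the page cache, and the disk, and the class and method names are invented for this example. The background_tick method models the roughly once-per-second background flush described in the MySQL documentation for this parameter.

```python
class RedoLogModel:
    """Toy model of where committed redo records sit for each parameter value."""

    def __init__(self, flush_log_at_trx_commit: int):
        if flush_log_at_trx_commit not in (0, 1, 2):
            raise ValueError("value must be 0, 1 or 2")
        self.policy = flush_log_at_trx_commit
        self.log_buffer = []   # MySQL process memory
        self.page_cache = []   # operating system file cache
        self.disk = []         # durable storage

    def commit(self, trx: str) -> None:
        self.log_buffer.append(trx)
        if self.policy >= 1:
            self._write()      # 1 and 2: hand the log to the OS at commit
        if self.policy == 1:
            self._fsync()      # 1 only: also flush to disk at commit

    def background_tick(self) -> None:
        """Roughly once per second: write and flush whatever is pending."""
        self._write()
        self._fsync()

    def _write(self) -> None:
        self.page_cache += self.log_buffer
        self.log_buffer = []

    def _fsync(self) -> None:
        self.disk += self.page_cache
        self.page_cache = []

    def survivors(self, failure: str) -> list:
        """Committed transactions still recoverable after a failure."""
        if failure == "mysqld_crash":   # process memory is lost
            return self.disk + self.page_cache
        if failure == "host_crash":     # process memory and page cache are lost
            return list(self.disk)
        raise ValueError(failure)


for value in (0, 1, 2):
    model = RedoLogModel(value)
    model.commit("t1")
    print(value, model.survivors("mysqld_crash"), model.survivors("host_crash"))
# prints:
# 0 [] []
# 1 ['t1'] ['t1']
# 2 ['t1'] []
```

In this model, a transaction committed under value 0 becomes durable only after the next background_tick, which is why that setting can lose about a second of work.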

Note: In practice, a transaction commit is not the only moment at which data is saved to disk. There are two other situations.

  • The first case: when several transactions run in parallel and one of them commits, all the data that the other transactions have saved in the redo log buffer so far is persisted along with it.
  • The second case: when about half of the memory space of the InnoDB redo log buffer is occupied, MySQL also persists the data.
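
The two extra triggers can be sketched as follows. This is a simplified Python model, not InnoDB code: the class, its method names, and the byte sizes are hypothetical, the half-full threshold follows the description above, and a commit is assumed to persist the buffer (as it would with innodb_flush_log_at_trx_commit set to 1).

```python
class RedoLogBuffer:
    """Toy buffer that records why its contents were persisted."""

    def __init__(self, capacity_bytes: int):
        self.capacity = capacity_bytes
        self.pending = []        # (trx_id, size) records not yet persisted
        self.persisted = []      # records that reached the log file
        self.flush_reasons = []

    def used(self) -> int:
        return sum(size for _, size in self.pending)

    def append(self, trx_id: str, size: int) -> None:
        self.pending.append((trx_id, size))
        # Second case: the buffer is about half full, so persist early.
        if self.used() * 2 >= self.capacity:
            self._flush("half_full")

    def commit(self, trx_id: str) -> None:
        # First case: a commit persists the whole buffer, including records
        # that belong to other transactions that are still running.
        self._flush(f"commit:{trx_id}")

    def _flush(self, reason: str) -> None:
        if self.pending:
            self.persisted += self.pending
            self.pending = []
            self.flush_reasons.append(reason)


buf = RedoLogBuffer(capacity_bytes=100)
buf.append("A", 10)      # transaction A is still running
buf.append("B", 10)
buf.commit("B")          # A's record is persisted together with B's
buf.append("A", 60)      # 60 bytes is more than half of 100
print(buf.persisted)      # [('A', 10), ('B', 10), ('A', 60)]
print(buf.flush_reasons)  # ['commit:B', 'half_full']
```

Either way, records of a transaction that has not committed can already be on disk, which is harmless because they are only applied if the transaction turns out to have committed.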

We have covered the scenarios in which the redo log may lose data above; now let's look at the situations in which the binlog may lose data.

Compared with the redo log, writing data to the binlog is relatively simple.

The first thing to explain is that each binlog write puts an entire transaction into the binlog file at once. This is mainly because transactions in the MySQL database are atomic, so before a transaction's log reaches the file, MySQL first writes it to the binlog cache.

The binlog cache is a block of memory in MySQL, which raises a new problem: if the binlog cache is not large enough to hold all the data of a transaction, MySQL temporarily stores the transaction's data on disk (as a last resort this incurs disk I/O, which in turn causes some performance problems).

To address this, the MySQL database provides the binlog_cache_size parameter, which sets the size of the binlog cache. If the data in the binlog cache exceeds the size set by binlog_cache_size, MySQL temporarily saves the transaction's data to disk.
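
A rough way to reason about this setting is to count how many transactions would overflow the cache. The Python sketch below does that for a list of transaction sizes; the function name and the sample sizes are hypothetical, and 32768 bytes is used only as an example value. The two counters are named after the MySQL status variables Binlog_cache_use and Binlog_cache_disk_use, which report the same idea on a real server.

```python
def count_cache_usage(trx_sizes, binlog_cache_size):
    """Return (cache_use, cache_disk_use) for transaction sizes given in bytes.

    cache_use      -- transactions that used the binlog cache
    cache_disk_use -- transactions too large for it, which spill to a temp file
    """
    cache_use = len(trx_sizes)
    cache_disk_use = sum(1 for size in trx_sizes if size > binlog_cache_size)
    return cache_use, cache_disk_use


use, disk_use = count_cache_usage([4096, 20000, 40000, 1000], 32768)
print(use, disk_use)  # 4 1
```

If the second number is a noticeable fraction of the first, the cache is probably too small for the workload.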

As with the redo log, there are three states for saving the data in the binlog cache, and MySQL provides the sync_binlog parameter to control them. The three states are:

  • When the value is 0, each committed transaction is only written to the page cache, and the operating system decides when to flush it to disk;
  • When the value is 1, each committed transaction is flushed to disk;
  • When the value is N (N > 1), each committed transaction is written to the page cache, and the accumulated data is flushed to disk after N transactions.

It is therefore not hard to see that the larger N is, the better the performance. The trade-off is that if the server goes down, the binlog of the most recent transactions, which was written to the page cache but not yet flushed to disk (up to N transactions), is lost.
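
The exposure for a given setting can be estimated with a few lines of Python. This is a simplified sketch that counts individual transactions, as the description above does; the function name is hypothetical, and for a value of 0 it returns the worst case, since the operating system's own flush timing is unknown.

```python
def binlog_lost_on_host_crash(committed: int, sync_binlog: int) -> int:
    """Committed transactions whose binlog was still only in the page cache."""
    if committed < 0 or sync_binlog < 0:
        raise ValueError("arguments must be non-negative")
    if sync_binlog == 0:
        # MySQL never asks for a flush; assume the OS has not flushed anything.
        return committed
    # A flush happens after every sync_binlog transactions,
    # so only the remainder since the last flush is at risk.
    return committed % sync_binlog


print(binlog_lost_on_host_crash(250, 1))    # 0
print(binlog_lost_on_host_crash(250, 100))  # 50
print(binlog_lost_on_host_crash(250, 0))    # 250
```

With a value of N, the number at risk never exceeds N - 1 in this model, which is the price paid for fewer disk flushes.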

Summary

Today, we mainly introduced several situations in which data loss may occur during the operation of MySQL database.

First of all, for the redo log, the situation in which data is most likely to be lost is when the data in the redo log buffer is stored only in MySQL memory, that is, when innodb_flush_log_at_trx_commit is set to 0; therefore, to balance safety and performance, it is recommended to set it to 2, provided that the server where MySQL is deployed is not expected to restart.

Secondly, the binlog also has three states for saving data, similar to the redo log. For the sake of both safety and performance, I suggest you set sync_binlog to N.

In a daily production environment, sync_binlog is generally set between 100 and 1000, depending on the performance of the server. If the server has free memory, it can be increased as needed.


Origin blog.csdn.net/Java_LingFeng/article/details/128692182