What? Still using delete to remove data? "Deadly Kick MySQL Series 9"

Articles in this series

5. How to choose between an ordinary index and a unique index "Deadly Kick MySQL Series 5"

6. Five minutes to understand how MySQL chooses indexes "Deadly Kick MySQL Series 6"

7. Strings can be indexed like this, did you know? "Deadly Kick MySQL Series 7"

8. "Slow" SQL that cannot be reproduced "Deadly Kick MySQL Series 8"


I have taken part in the development of several projects. As the business volume of each project grew, the MySQL data grew dramatically; the user footprint table in one of them, for example, grew at a crazy pace.

In this article I will discuss the performance impact of delete and how to delete data the right way.

A table in MySQL's InnoDB storage engine has two parts: the table structure and the table data.

Before MySQL 8.0, each table has a .frm file under /var/lib/mysql that holds its structure definition. From MySQL 8.0 on that file no longer exists, because the table structure definition is stored in the data dictionary. Where the table data is stored is determined by the parameter innodb_file_per_table.

1. Tablespace

There are several types of tablespaces: system tablespace, user tablespace, and undo space.

System tablespace: holds MySQL's internal data dictionary, such as the data exposed under the information_schema database.

User tablespace: holds the tables you create yourself.

Undo space: stores undo information for fast rollback.

Before MySQL 8.0, the table structure is recorded in the system tablespace. Where the table data goes is controlled by the parameter innodb_file_per_table, which has defaulted to on since MySQL 5.6.6.

When it is set to off, the table data is placed in the system tablespace, together with the MySQL data dictionary.

When it is set to on, the data of each InnoDB table is stored in its own .ibd file.

Do you know where the table definitions are stored?

Switch to kaka, the database dedicated to the Deadly Kick MySQL series, and create a new table evt_sms.
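The original post does not show the definition of evt_sms, so the following is only a minimal sketch of what such a table might look like. Every column name and type here is an assumption made for illustration; only the database name kaka and the table name evt_sms come from the article.

```sql
-- Hypothetical schema: the article does not list the columns of evt_sms.
CREATE DATABASE IF NOT EXISTS kaka;
USE kaka;

CREATE TABLE evt_sms (
    id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,   -- assumed primary key
    phone       VARCHAR(20)     NOT NULL,                  -- assumed column
    content     VARCHAR(255)    NOT NULL,                  -- assumed column
    create_time DATETIME        NOT NULL DEFAULT CURRENT_TIMESTAMP,
    update_time DATETIME        NOT NULL DEFAULT CURRENT_TIMESTAMP
                                ON UPDATE CURRENT_TIMESTAMP,
    delete_time DATETIME        NULL DEFAULT NULL,         -- NULL means "not deleted"
    PRIMARY KEY (id)
) ENGINE = InnoDB COMMENT = 'SMS event table (demo)';
```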

Can you guess where the structure definition of the newly created evt_sms table is stored?

It can be found in the TABLES table of the information_schema database. Execute the query SELECT TABLE_NAME,TABLE_COMMENT FROM TABLES WHERE TABLE_TYPE='BASE TABLE';

Tables that we create ourselves have a TABLE_TYPE of BASE TABLE.
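The same lookup can be written so that it works from any current database and only lists the tables of this series' database. This is a sketch that assumes the database is named kaka, as in the article; the extra TABLE_TYPE and ENGINE columns are added only for readability.

```sql
-- Qualify the table with its schema so the query does not depend on
-- having run "USE information_schema" first.
SELECT TABLE_NAME, TABLE_TYPE, ENGINE, TABLE_COMMENT
FROM information_schema.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
  AND TABLE_SCHEMA = 'kaka';
```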

The reason for mentioning this is that if innodb_file_per_table is set to off, the table data is also stored in the system tablespace.

Question: If the data lives in the shared (system) tablespace and the table is dropped, will the space be reclaimed?

The answer is no.

Where is the data stored when the parameter innodb_file_per_table is set to on?

Normally, under /var/lib/mysql you will see a directory for each database you created; enter it and you will see one .ibd file per table.

Data is stored here.

Conclusion

Remember to set innodb_file_per_table to on at the start of a project; this is the right thing to do.
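To check the current setting and switch it on, something along these lines can be used. It is a sketch that assumes an account with the privilege to change global variables; a change made with SET GLOBAL only affects tables created afterwards and is lost on restart unless it is also written to the server configuration file.

```sql
-- Show the current value (ON or OFF).
SHOW VARIABLES LIKE 'innodb_file_per_table';

-- Turn it on for tables created from now on.
-- Existing tables in the system tablespace are not moved by this.
SET GLOBAL innodb_file_per_table = ON;
```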

2. Data deletion process

By now you should know that the InnoDB storage engine organizes its data in a B+ tree.

If the record with primary key ID 4 is deleted, the InnoDB engine only marks that record as deleted. If a record with ID 4 is inserted later, this location may be reused, but the size of the disk file will not shrink.

Implicit fields

This touches on a point from MVCC. MVCC is implemented with two implicit fields, the undo log, and the read view.

The delete mark mentioned above is the delete flag among the implicit fields, which records whether the row has been updated or deleted. Deletion here does not mean physical deletion; the delete flag of the record is simply set to true.

The earlier article "MVCC: I heard that some people are curious about my underlying implementation" already foreshadowed this question: does deleting from the database really delete the data?

Question: What happens when all the data on a data page is deleted?

Just like a single piece of data, the entire data page can be reused.

Reuse of a record's slot is limited to data that falls within the matching range. In the example above, after the record with ID 4 is deleted, its slot can be reused if a record with ID 4 is inserted.

This brings us to a new concept: page merging. If the utilization of two adjacent data pages is very low, the system merges them into one page, and the other data page is marked as reusable.

Question: What happens if all the data in the table is deleted using delete?

The answer is that all data pages will be marked as reusable, but the disk file size will not change.

3. Practice: the table file size does not change after a full-table delete

After adding data, the table holds nearly 1 million rows and the file size has reached 108 MB.
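The article does not say how the test rows were generated. One possible way, sketched here for MySQL 8.0, is a recursive CTE. It assumes evt_sms has an auto-increment id plus two hypothetical columns, phone and content, and that every other column has a default; adjust the column list to the real table.

```sql
-- The default recursion limit is 1000, so raise it for this session.
SET SESSION cte_max_recursion_depth = 2000000;

-- Generate the numbers 1..1000000 and turn each into one dummy row.
INSERT INTO evt_sms (phone, content)
WITH RECURSIVE seq (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 1000000
)
SELECT CONCAT('138', LPAD(n, 8, '0')),   -- fake phone number
       CONCAT('test message ', n)        -- fake message body
FROM seq;
```

The resulting file size depends on the row width, so it will not necessarily match the 108 MB reported above.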

Aside

You may have noticed the word stopped in the output: it appears after pressing ctrl + z. Having started the MySQL client, we want to check the size of the table file without exiting it, so we press ctrl + z to suspend the client and drop back to the shell.

After checking, run fg to return to the MySQL client.

Question: How do you make Linux display file sizes in MB?

If you only run the ll command, the size is shown in bytes and you have to convert it by hand, which is very inconvenient.

Run ll -h instead to see the file size in a human-readable unit.

Delete data to see if the disk file shrinks


To make the change in file size easy to see, Kaka deletes all the data in the table directly and then checks the file size, which is still 108 MB. The file size is unchanged.
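The reusable space can also be observed from inside MySQL, without suspending the client. The query below is a sketch that assumes the database is kaka; the figures in information_schema.TABLES are estimates and may be cached, which is why ANALYZE TABLE is run first.

```sql
-- Refresh the table statistics before reading them.
ANALYZE TABLE evt_sms;

-- DATA_FREE is the allocated but unused space, i.e. the space that
-- delete has made reusable without returning it to the operating system.
SELECT TABLE_NAME,
       ROUND(DATA_LENGTH  / 1024 / 1024, 2) AS data_mb,
       ROUND(INDEX_LENGTH / 1024 / 1024, 2) AS index_mb,
       ROUND(DATA_FREE    / 1024 / 1024, 2) AS free_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'kaka'
  AND TABLE_NAME = 'evt_sms';
```

After a full-table delete, free_mb would typically grow once the background purge has caught up, while the .ibd file on disk keeps its size.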

4. How to correctly shrink the disk file

In section 3 we demonstrated that the file size did not change after deleting 1 million rows. This is the effect of holes in the data file, and next we will solve this problem.

Question: How are holes created?

By now you should know that holes are caused by a large number of inserts, deletes, and updates.

Solutions

You can create a new table evt_sms_copy and then copy the data from evt_sms into evt_sms_copy in increasing order of the primary key ID.

In this way, the problem that the disk file cannot shrink because of holes is solved.

Question: Why does this solve it?

Because evt_sms_copy is a new table and the data is inserted in increasing primary key order, the index is compact and the data pages are filled as fully as possible, so the holes disappear and the new disk file is smaller.
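The manual approach described above might look like the following sketch. It assumes the primary key column is named id and that nothing writes to evt_sms while the copy runs, because rows changed during the copy are not carried over. The name evt_sms_old is made up for this example.

```sql
-- 1. Create an empty table with the same structure and indexes.
CREATE TABLE evt_sms_copy LIKE evt_sms;

-- 2. Copy the rows in primary key order so pages are filled compactly.
INSERT INTO evt_sms_copy
SELECT * FROM evt_sms ORDER BY id;

-- 3. Swap the two tables in one atomic rename.
RENAME TABLE evt_sms      TO evt_sms_old,
             evt_sms_copy TO evt_sms;

-- 4. Drop the old table; with innodb_file_per_table = on this removes
--    its .ibd file and returns the space to the operating system.
DROP TABLE evt_sms_old;
```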

The practical part

Simply execute the command alter table evt_sms engine = Innodb to shrink the disk file.

Here we need to talk to you about the different processing of different versions.

Before MySQL 5.5, this command did the same thing as our manual solution, except that you do not need to create evt_sms_copy yourself; MySQL creates the temporary table for you.

If new data is written while the command is running, that data is lost, because DDL before MySQL 5.5 is not online. Therefore no data changes may be made during execution.

MySQL has now reached version 8. If you are starting a new project, use version 8 directly and do not use versions older than 5.6. Kaka has been using MySQL 8.0 since 2018.

In the article on locks, I talked about how MySQL 5.6 optimized DDL operations and introduced Online DDL.

Optimized execution flow

  • Create a temporary file tmp_file and store the B+ tree of the table in it. Any operation on the table during this time is recorded in the row log file.
  • After all the data has been written from the original table to the temporary file, the operations in the row log are applied to it, so that the data in the temporary file is consistent with the data in the original table.
  • Finally, replace the data file of the original table with the temporary file.

Origin of Online DDL

As you can see, data changes are recorded in the row log while the disk file is being shrunk, which means the table can still be inserted into, deleted from, updated, and queried during the operation.
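On MySQL 5.6 and later, the online behaviour can be requested explicitly. This is a sketch rather than a required form: with the ALGORITHM and LOCK clauses the statement fails with an error instead of silently falling back to a blocking table copy when an online rebuild is not possible for the table.

```sql
-- Rebuild the table in place while still allowing concurrent reads and writes.
-- If the table cannot be rebuilt online, MySQL reports an error
-- rather than locking the table.
ALTER TABLE evt_sms ENGINE = InnoDB, ALGORITHM = INPLACE, LOCK = NONE;
```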

Important note

While the disk file is being shrunk, the original table is fully scanned and a new temporary file is built. If your table is very large, this consumes a lot of IO and CPU.

So if you want to do this safely on a large table, you can use the open-source tool gh-ost.

Conclusion

When a table's disk file has become very large because of a large number of inserts, deletes, and updates, you can execute the command alter table evt_sms engine=Innodb to shrink the tablespace.

5. Practice is the only criterion for testing the truth of knowledge

As the saying goes, practice is the sole criterion for testing whether knowledge is true, so let's put the conclusions of this article into practice.

  • First press ctrl + z to suspend the MySQL client
  • Run ll -h and check that the disk file of table evt_sms is 108 MB at this point
  • Run fg to return to the MySQL client
  • Execute the command alter table evt_sms engine=Innodb
  • Press ctrl + z again and run ll -h to check that the disk file has shrunk to 128 KB.

The above is the whole process of Kaka's operation. The conclusion is that executing the command alter table evt_sms engine=Innodb eliminates the holes caused by a large number of inserts, deletes, and updates on the table, and thereby shrinks the tablespace.

6. Development Suggestions

To delete data, do not use delete; use soft delete instead, that is, only mark the row as deleted.

This way no holes are created, and the data remains traceable.

Every table must have the three fields create_time, update_time, and delete_time.
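A soft delete with these fields could be sketched as follows. It assumes that evt_sms has a primary key column named id and that delete_time is a nullable DATETIME where NULL means "not deleted"; neither detail is spelled out in the article.

```sql
-- Soft delete: mark the row instead of removing it.
UPDATE evt_sms
SET delete_time = NOW()
WHERE id = 4
  AND delete_time IS NULL;

-- Normal reads must filter out the rows that are marked as deleted.
SELECT id, create_time, update_time
FROM evt_sms
WHERE delete_time IS NULL;

-- Restoring a row is just clearing the mark.
UPDATE evt_sms
SET delete_time = NULL
WHERE id = 4;
```

The trade-off is that every query has to remember the delete_time IS NULL filter, and the marked rows still occupy space until they are archived.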

7. Summary

From this article, we need to know the following points.

  • A table that goes through a large number of inserts, deletes, and updates will develop holes
  • To get rid of the holes, execute alter table evt_sms engine=Innodb
  • Using delete to delete data only marks the records as deleted; it does not actually free the disk space
  • All conclusions of this article are based on innodb_file_per_table = on

Persistence in learning, in writing, and in sharing are the beliefs Kaka has held since starting this career. I hope this article brings you a little help on the vast Internet. I am Kaka, see you in the next issue.


Origin blog.csdn.net/fangkang7/article/details/121153360