Reclaiming database table space

Editor's summary:

1. Setting innodb_file_per_table to ON is recommended. Each table is then stored as its own file, and the drop table command makes the system delete that file. If the table is in the shared tablespace, the space is not reclaimed even after the table is dropped.

2. To delete an entire table, use the drop table command, which reclaims the table space.

3. The delete command does not reclaim space; it only marks space as reusable.

4. The space of a deleted record and of an emptied data page can both be reused, but a record's position can only be reused by data in a matching range, while a data page can be reused anywhere.

5. Inserts, updates, and deletes can all leave holes in the data; rebuilding the table removes them. (In MySQL 5.5 and earlier the rebuild cannot be done Online. From 5.6 on it can, meaning the rebuild does not block writes. For safety on a live system, gh-ost is recommended.)

6. Relationship between Online and inplace: if a DDL process is Online, it must be inplace; the converse does not necessarily hold (for example, adding a full-text index or a spatial index).

7. Finally, the article covers the differences among optimize table, analyze table, and alter table as ways to rebuild a table.

 

Have you ever run into this problem: the database takes up too much space, so you delete half the data in a table, only to find that the size of the table file has not changed?

 

In this article we look at how database table space is reclaimed and how to solve this problem.

 

The discussion focuses on InnoDB, the most widely used MySQL storage engine.

An InnoDB table consists of two parts: the table structure definition and the data.

 

Before MySQL 8.0, the table structure was stored in a file with the .frm suffix. MySQL 8.0 allows the table structure definition to be kept in system data tables. Because the table structure definition takes up very little space, today we mainly discuss the table data.

 

Table data can live in the shared tablespace or in a separate file. This behavior is controlled by the parameter innodb_file_per_table:

1. OFF means the table data is stored in the system shared tablespace, that is, together with the data dictionary;

2. ON means each InnoDB table's data is stored in its own file with the .ibd extension.

 

Starting with MySQL 5.6.6, the default value is ON. (Whichever version of MySQL you use, setting this value to ON is recommended.)
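
As a minimal sketch, the setting can be inspected and changed from a client session like this. It assumes an account with the privilege to set global variables; a value set this way is not kept across a server restart unless it is also written to the configuration file.

```sql
-- Show the current value (ON or OFF).
SHOW VARIABLES LIKE 'innodb_file_per_table';

-- Enable it for tables created from now on.
-- Tables already in the shared tablespace stay there until they are rebuilt.
SET GLOBAL innodb_file_per_table = ON;
```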

 

The reason is that a table stored as a separate file is easier to manage: with the drop table command, the system deletes the file. If the table is in the shared tablespace, the space is not reclaimed even after the table is dropped.

 

The rest of this discussion assumes this setting.

 

As mentioned above, to delete an entire table you can use the drop table command to reclaim the table space.

More often, though, we delete only some of the rows, and then we hit the problem from the beginning of the article: the data in the table has been deleted, but the table space has not been reclaimed.

 

Data deletion process

 

 

Suppose we want to delete the record R4. The InnoDB engine only marks R4 as deleted. If a record with an ID between 300 and 600 is inserted later, it may reuse this position. The size of the disk file, however, does not shrink.

InnoDB stores data in pages. If we delete all the records on a data page, the whole data page can be reused.

 

However, reusing a data page is different from reusing a record's position.

A record's position can only be reused by data that falls in the matching range. In the example above, after R4 is deleted, a newly inserted row with ID 400 may reuse the position. A row with ID 800 cannot.

Once a whole page has been removed from the B+ tree, it can be reused at any position.

 

If two adjacent data pages both have very low utilization, the system merges the data from the two pages onto one of them and marks the other page as reusable.

Furthermore, if we delete all the data in a table with the delete command, all the data pages are marked as reusable. On disk, however, the file does not get smaller.

Clearly, the delete command only marks the position of a record, or a data page, as "reusable"; the size of the disk file does not change.

In other words, the delete command cannot reclaim table space. This space, which can be reused but is not currently in use, looks like a "hole."
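
One rough way to look for such holes is to compare the allocated and free byte counts that MySQL reports for a table. The query below is only a sketch: the schema name test and table name t are placeholders, the values are estimates that can lag behind the real state, and data_free does not account for every partially empty page.

```sql
SELECT table_name,
       data_length,   -- bytes allocated to the primary key (clustered) index
       index_length,  -- bytes allocated to secondary indexes
       data_free      -- bytes allocated to the table but currently unused
FROM information_schema.tables
WHERE table_schema = 'test'   -- placeholder schema name
  AND table_name   = 't';     -- placeholder table name
```

Running it before and after a large delete should show data_length staying roughly the same, which matches the observation that the file does not shrink.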

 

In fact, it is not only deleting data that creates holes; inserting data does too.

 

If data is inserted in ascending index order, the index stays compact. But if data is inserted in random order, index data pages may split.

 

Suppose page A in the figure above is already full and I now need to insert another row. What happens?

 

 

Because page A is full, inserting a row with ID 550 requires allocating a new page, page B, to hold the data.

After the page split completes, a hole is left at the end of page A (note: in practice, more than one record position may be empty).

 

In addition, updating a value in an index can be understood as deleting the old value and then inserting a new one. Understandably, this also creates holes.

So a table that has gone through a large number of inserts, deletes, and updates may contain holes.

 

Rebuilding the table

Suppose you have a table A and need to shrink its space by removing the holes in it. What can you do?

You can create a table B with the same structure as table A, then read the rows out of table A one by one in ascending primary key ID order and insert them into table B.

Because table B is newly created, the holes in table A's primary key index do not exist in table B. Clearly, table B's primary key index is more compact and its data pages are better utilized. If we treat table B as a temporary table, then once the data has been imported from table A into table B, replacing A with table B has the effect of shrinking table A's space.
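
The manual procedure just described could be sketched as follows. The names A, B, A_old, and id are placeholders, and the sketch assumes nothing writes to A while the copy runs; any row written during the copy would be missing from B. Note also that CREATE TABLE ... LIKE does not carry over foreign key definitions.

```sql
-- Create an empty table B with the same columns and indexes as A.
CREATE TABLE B LIKE A;

-- Copy the rows in ascending primary key order so that B's pages fill compactly.
INSERT INTO B SELECT * FROM A ORDER BY id;

-- Swap the names in a single statement, then drop the old, fragmented copy.
RENAME TABLE A TO A_old, B TO A;
DROP TABLE A_old;
```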

 

You can rebuild the table with the command alter table A engine=InnoDB. Before MySQL 5.5, this command runs much like the process just described. The only difference is that you do not need to create temporary table B yourself: MySQL automatically copies the data, swaps the table names, and drops the old table.

Figure 3

 

 

Clearly, the most time-consuming step is inserting the data into the temporary table. If new data is written to table A during this process, that data is lost.

Therefore, table A cannot be updated at any point during the DDL. In other words, this DDL is not Online.

 

Online DDL, introduced in MySQL 5.6, optimizes this process.

With Online DDL, the table rebuild process is:

1. Create a temporary file and scan all the data pages of table A's primary key;

2. Build a B+ tree from the records in table A's data pages and store it in the temporary file;

3. While the temporary file is being generated, record all operations on A in a log file (row log); this corresponds to state 2 in the figure;

4. Once the temporary file is generated, apply the operations in the log file to the temporary file, producing a data file that is logically identical to table A; this corresponds to state 3 in the figure;

5. Replace table A's data file with the temporary file.

Figure 4 

As you can see, the difference from the pre-5.5 process is that, because operations are recorded in a log file and replayed, table A can accept inserts, updates, and deletes while it is being rebuilt. This is where the name Online DDL comes from.

 

Some may ask: a DDL statement has to take the MDL write lock first, so how can this be called Online DDL?

 

Indeed, the alter statement needs to acquire the MDL write lock when it starts, but the write lock is downgraded to a read lock before the data is actually copied.

Why downgrade? To make the operation Online: an MDL read lock does not block insert, update, and delete operations.

Then why not simply release the lock? To protect itself: the lock prevents other threads from running DDL on the same table at the same time.
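
To observe the metadata locks on a table while an alter statement is running, something like the following could be run from a second session. This is a sketch under assumptions: it needs MySQL 5.7 or later with the metadata lock instrument enabled (it is enabled by default in 8.0), and A is a placeholder table name.

```sql
-- Lists the metadata locks currently held or requested on table A.
SELECT object_schema,
       object_name,
       lock_type,       -- kind of metadata lock
       lock_status,     -- for example GRANTED or PENDING
       owner_thread_id  -- thread holding or requesting the lock
FROM performance_schema.metadata_locks
WHERE object_type = 'TABLE'
  AND object_name = 'A';   -- placeholder table name
```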

 

For a large table, the most time-consuming part of Online DDL is copying the data into the temporary file, and the table can accept insert, update, and delete operations while this step runs.

So, relative to the whole DDL process, the time the lock is held is very short. From the business's point of view, the operation can be considered Online.

 

The rebuild methods described above all scan the original table's data and build a temporary file. For a large table, this consumes a lot of IO and CPU resources. So for a live service, you need to control the timing of the operation carefully. If you want a safer operation, I recommend gh-ost, which is open-sourced by GitHub.

 

 

Online and inplace

In Figure 3, the place where the data exported from table A is stored is called tmp_table. This is a temporary table created at the server layer.

In Figure 4, the data rebuilt from table A is placed in "tmp_file", a temporary file created internally by InnoDB.

The whole DDL process is completed inside InnoDB. From the server layer's point of view, no data is moved into a temporary table. It is an "in place" operation, which is where the name "inplace" comes from.

 

Q: If you have a 1 TB table and the disk is 1.2 TB, can you run an inplace DDL on it?

The answer is no, because tmp_file also takes up temporary space.

 

The statement we use to rebuild the table, alter table t engine=InnoDB, actually means:

alter table t engine=innodb,ALGORITHM=inplace;

The counterpart of inplace is the table-copy approach, which is used like this:

alter table t engine=innodb,ALGORITHM=copy;

ALGORITHM=copy forces a table copy; the corresponding process is the one shown in Figure 3.
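
The ALGORITHM clause can be combined with a LOCK clause to state explicitly how much concurrency the rebuild must allow. The statement below is a sketch using the same table name t as the examples above; the LOCK clause is an addition that the original statements do not use.

```sql
-- Request an in-place rebuild that permits concurrent reads and writes.
-- If the server cannot satisfy both clauses for this operation, the
-- statement fails with an error instead of falling back to a more
-- restrictive method.
ALTER TABLE t ENGINE=InnoDB, ALGORITHM=INPLACE, LOCK=NONE;
```

Stating both clauses is a way to make sure a rebuild on a busy table never silently turns into a blocking table copy.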

 

But you may still wonder: don't inplace and Online mean the same thing?

Not quite. They just happen to coincide in the table-rebuild logic.

For example, if I want to add a full-text index on a column of an InnoDB table, I would write:

alter table t add FULLTEXT(field_name);

This process is inplace, but it blocks insert, update, and delete operations, so it is not Online.

 

The logical relationship between the two can be summarized as:

1. If a DDL process is Online, it must be inplace;

2. The converse does not necessarily hold: a DDL that is inplace may not be Online. As of MySQL 8.0, adding a full-text index (FULLTEXT index) and adding a spatial index (SPATIAL index) are such cases.
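
The difference can be made visible by adding a LOCK clause to the full-text example. This is a sketch, reusing the placeholder names t and field_name from above; run one statement or the other, not both.

```sql
-- Expected to be rejected with an error: adding a FULLTEXT index is
-- inplace but cannot run with concurrent writes, so LOCK=NONE cannot
-- be honored.
ALTER TABLE t ADD FULLTEXT(field_name), ALGORITHM=INPLACE, LOCK=NONE;

-- The same in-place change with a shared lock: reads are allowed,
-- writes are blocked until the statement finishes.
ALTER TABLE t ADD FULLTEXT(field_name), ALGORITHM=INPLACE, LOCK=SHARED;
```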

 

 

Differences among optimize table, analyze table, and alter table as ways to rebuild a table

1. Starting with MySQL 5.6, alter table t engine=InnoDB (that is, recreate) follows the process in Figure 4 by default;

2. analyze table t does not actually rebuild the table. It only recomputes the table's index statistics, does not modify data, and takes an MDL read lock while it runs;

3. optimize table t is equivalent to recreate + analyze.
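
Side by side, the three statements look like this, using the same placeholder table name t:

```sql
ALTER TABLE t ENGINE=InnoDB;  -- recreate: rebuilds the table and removes holes
ANALYZE TABLE t;              -- recomputes index statistics only, no rebuild
OPTIMIZE TABLE t;             -- for InnoDB: recreate followed by analyze
```

On an InnoDB table, optimize table typically returns a note saying that the table does not support optimize and that recreate + analyze is done instead, which matches item 3.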


Origin www.cnblogs.com/roggeyue/p/12451585.html