Releasing disk space in MongoDB

Today ops reported that the MongoDB disk was full, and the following solutions came up.

I recommend the second one.


Deleting data in MongoDB does not directly release disk space; it leaves behind a lot of free fragments. These fragments continue to be used by MongoDB: when new data is inserted, they are reused without requesting new disk space.

 

The problem this causes is that disk usage may already sit at a high level, which is a time bomb for operations. Fragments can only be reused by the database they belong to, but we often create many new databases, so disk usage keeps growing.
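Under WiredTiger you can see this reclaimable-but-unreleased space directly from the shell (a quick sketch; "mycoll" is a placeholder collection name):

var s = db.getCollection("mycoll").stats()
// bytes inside the data file that are free for reuse but not yet returned to the OS
print(s.wiredTiger["block-manager"]["file bytes available for reuse"])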

 

So far I have found three methods.

 

1. Wipe the secondary node and resynchronize.

 

During the resynchronization, the secondary performs a full initial sync of data and indexes, which defragments the previously fragmented space.

 

Then perform a primary/secondary switchover, making the resynced secondary the new primary.

Then wipe the old primary's data and resynchronize it once as well.

In the end, both nodes have released their fragmented space.

 

Advantages: the service remains available throughout.

 

Disadvantages: many manual steps. Make sure each initial sync finishes within the oplog window; it is recommended to make the oplog as large as possible. A rough sketch of the whole procedure follows.
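A minimal sketch of the procedure from the mongo shell, assuming a standard three-member replica set (host names and the data directory are assumptions, not from my environment):

// Step 1 (OS shell, on the secondary to rebuild): stop mongod, wipe its
// configured dbPath, then restart mongod; it will run a full initial sync
// from the primary.

// Step 2 (mongo shell, on the primary): watch member states until the
// rebuilt node returns to SECONDARY
rs.status().members.forEach(function (m) { print(m.name, m.stateStr); })

// Step 3: confirm the oplog window is comfortably longer than the sync took
rs.printReplicationInfo()

// Step 4: step the primary down so it becomes a secondary, then repeat
// step 1 on it
rs.stepDown()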

 

2. Use the compact command

 

compact is MongoDB's compaction command; it defragments the space left behind by deleted data.

 

Under the WiredTiger storage engine, this command releases the defragmented space back to the operating system.

 

Under the MMAPv1 engine, compact defragments and rebuilds indexes, but does not release unused space back to the system; newly inserted data can still use that space.

 

Command:

Switch to the database to be compacted, then run:

db.runCommand({ compact: <collection name>, force: <boolean> })

 

Run this command once on the primary and once on each secondary.

 

Some notes for replica set mode:

 

The command blocks the database it runs on, so choose the execution window carefully.

 

The compact command is not replicated to secondaries; it runs independently on each member. When running it on the primary, the force parameter is required.

 

When compact runs on a secondary, that node enters the RECOVERING state and cannot serve reads.

 

There is no need to run compact on a capped collection, because it occupies a fixed amount of space.
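A hedged sketch of a rolling compact across a replica set, run from a mongo shell connected to each member in turn ("mydb" and "fs.files" are example names):

// On each secondary (no force needed; the node goes RECOVERING while it runs):
use mydb
db.runCommand({ compact: "fs.files" })

// On the primary, force:true is required; alternatively, rs.stepDown() first
// and compact it once it is a secondary:
db.runCommand({ compact: "fs.files", force: true })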

 

Experiment:

 

The current database occupies about 5.8 GB of disk.
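(One way to check this figure from the shell, as a sketch: db.stats() with a scale argument.)

db.stats(1024 * 1024 * 1024) // sizes in GB; see dataSize, storageSize and indexSize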

 

 

Then I deleted some data and ran compact:

 

 

shard1:PRIMARY> db.runCommand({ compact: "fs.files" })
{
    "ok" : 0,
    "errmsg" : "will not run compact on an active replica set primary as this is a slow blocking operation. use force:true to force"
}
shard1:PRIMARY> db.runCommand({ compact: "fs.files", force: true })

 

The shell hangs here for a long time, and database operations are locked in the meantime.

 

 

Practice proved that the data files can be compacted, but disk usage did not shrink all the way down to the actual size of the remaining data.
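To quantify what was reclaimed, one can compare a collection's physical footprint with its logical size before and after (a sketch; fs.files is the collection from the experiment above):

var s = db.getCollection("fs.files").stats(1024 * 1024) // sizes in MB
print("storageSize(MB):", s.storageSize, " size(MB):", s.size)
// storageSize staying well above size means space is still held but unused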

 

3. Run repairDatabase

 

I did not try this command; use it with caution.

 

repairDatabase requires enough free disk space; the documentation says free space equal to the size of the current data set plus 2 GB.

 

But by the time my disk alarm fires, there is no longer enough free space to run repairDatabase.

 

db.runCommand({ repairDatabase: 1 })

 

Note: this command must be used with great caution; avoid it if at all possible, because it costs a lot of time and performance.

 

The official documentation says:

 

The repairDatabase command compacts all collections in the database. It is identical to running the compact command on each collection individually.

 

It is not recommended to run repairDatabase directly; it is better to compact just the collections that need compressing, as sketched below.
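A hedged sketch of that selective approach under WiredTiger (the 100 MB threshold is arbitrary, and force:true assumes the shell is connected to a primary):

db.getCollectionNames().forEach(function (name) {
    var s = db.getCollection(name).stats();
    var reusable = s.wiredTiger
        ? s.wiredTiger["block-manager"]["file bytes available for reuse"]
        : 0;
    if (reusable > 100 * 1024 * 1024) { // more than ~100 MB reclaimable
        printjson(db.runCommand({ compact: name, force: true }));
    }
});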


 


1 Background description

After loading data into MongoDB, I found there was not enough space left to build indexes, so I used db.collections.remove() to delete some of the data.

When the deletion finished, the document count had gone down, but the disk space was not freed.

2 Solving the problem

Use the compact command to compress the collection and its indexes:

> use test
switched to db test
> show collections
people
system.profile
> db.runCommand({compact:"people"}) # without force: errors out
{
    "ok" : 0,
    "errmsg" : "will not run compact on an active replica set primary as this is a slow blocking operation. use force:true to force"
}
> db.runCommand({compact:"people", force:true}) # with force: succeeds
{ "ok" : 1 }

Advantages of compact:
1. compact compresses only the target collection, so the temporary files generated during compression are relatively small and little free disk space is needed;
2. compact defragments the files backing the collection, which also lowers the cost of rebuilding indexes and the memory required.

Things to note about compact:
1. The compact operation does not necessarily release disk space to the OS (under MMAPv1 it does not); the reclaimed space remains available for MongoDB to reuse;
2. compact takes a lock while it runs, which can be observed with mongostat, so it is best performed when business traffic is at its lowest;
3. Capped collections do not need compact.
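Besides mongostat, you can look for a running compact from the shell with currentOp (a sketch; the "command.compact" filter assumes the in-progress operation reports its command document this way):

db.currentOp({ "command.compact": { $exists: true } })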
