Resolving a full disk on a Linux instance

This article describes how to troubleshoot and resolve insufficient disk space on Linux instances.

Problem Description

If creating a file on a Linux cloud server ECS instance fails, or an application reports the following error, your disk space is insufficient: No space left on device. If the disk is full as a result of your expected usage, you can solve the problem by adding a new cloud disk or expanding the capacity of the cloud disk. For details, see Creating a Cloud Disk, Mounting a Data Disk, and Cloud Disk Expansion Guidelines. This article covers the case where the disk fills up beyond your expected usage, and helps you determine the cause and apply the corresponding solution.

Problem causes

Insufficient disk space problems are usually caused by the following four categories:

  1. Disk partition space usage reaches 100%.

  2. Disk partition inode usage reaches 100%.

  3. Deleted files that have not been released (zombie files) exist on the disk.

Note:

A file may still be held open by a process when it is deleted. In that case, the space it occupies is not released until the file handle is closed.

  4. The mount point is overridden.

Note:

A large number of files already exist in a directory of the original file system, and that directory is then used as the mount point for a new disk, which hides the existing files. However, the applications in your system may still continue to read and write the original file system space. Your application may then report insufficient space, yet you cannot find the files with the df or du command, because df and du count the usage of the partition that corresponds to the current mount point.
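As a quick first check for causes 1 and 2, a filter like the following can flag file systems at or above a usage threshold. This is only a sketch: the function name over_threshold and the default limit of 90 are assumptions for illustration, and it assumes the POSIX-style column layout produced by df -P, where the fifth column is the usage percentage and the sixth is the mount point (mount points containing spaces are not handled).

```shell
# Hypothetical helper: reads `df -P` style output on stdin and prints the
# mount point and usage of every file system at or above the given limit.
over_threshold() {
    awk -v limit="${1:-90}" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        # Skip rows whose usage column is not a number (for example "-").
        if (pct ~ /^[0-9]+$/ && pct + 0 >= limit) print $6, $5
    }'
}

# Cause 1: block usage. Cause 2: inode usage.
df -P  | over_threshold 90
df -Pi | over_threshold 90
```

If neither command prints anything, causes 3 and 4 are the more likely candidates.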

Solution

Please handle it in the following ways according to different causes of the problem.

1. Disk partition space usage reaches 100%

You can solve the problem that the disk partition space usage reaches 100% by cleaning up files or directories that take up a lot of disk space, expanding the disk capacity, or buying a new disk. The specific operation steps are as follows:

Clean up files or directories that take up a lot of space

  1. Remotely connect to the ECS instance.

    For details, see  Logging In to a Linux Instance Using Password or Key Authentication  .

  2. Execute the following command to view the disk usage.

df -h

The system displays information similar to the following. For example, partition /dev/xvda1 is 15% used.

3. Execute the following command to enter the root directory and check which directory takes up more disk space.

cd /
du -sh *

The system displays information similar to the following. In this example, the /usr directory occupies the most space, so you need to continue checking which file or directory under /usr occupies the most space. Please operate according to the actual environment.

4. Execute the following commands to drill down step by step into the directory that takes up the most disk space. For example, enter the larger /usr directory and check which file or directory under /usr is the largest.

cd /usr
du -sh *

The system displays information similar to the following. In this example, the local directory occupies the most space, so you need to check which file or directory under local occupies the most space, and so on.

5. Based on the judgment of the business situation, delete the files or directories that are no longer used.
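The drill-down in steps 3 and 4 can be shortened by sorting the du output so that the largest entries appear first. The following is a minimal sketch, not part of the original procedure: the function name largest_entries is hypothetical, and it assumes a du that supports -x (stay on one file system), -k (sizes in KB), and -d 1 (one directory level).

```shell
# Hypothetical helper: list the largest entries one level below a
# directory, biggest first. Sizes are in KB. The first line is the total
# for the directory itself.
largest_entries() {
    du -xk -d 1 -- "${1:-.}" 2>/dev/null | sort -rn | head -n "${2:-10}"
}

# Example calls (run as root to avoid permission gaps):
#   largest_entries /
#   largest_entries /usr 5
```

Repeat the call on whichever directory tops the list until you reach the files that account for the usage.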

Expand the disk or purchase a new disk

If you cannot free up enough space by cleaning up files, consider expanding the disk or purchasing a new disk. For details, see Creating a Cloud Disk, Mounting a Data Disk, and Cloud Disk Expansion Guidelines.

2. Disk partition inode usage reaches 100%

If the inode usage of a disk partition reaches 100%, your applications cannot create new directories or files. In this situation the disk space itself is usually not full, which is why a full inode table is often overlooked. You can solve the problem by clearing the files or directories with high inode usage, or by increasing the number of inodes.

Note:

In Linux, an inode records important information about a file, such as its type, size, permissions, owner, number of links, creation and update times, and pointers to its data blocks. In general, there is no need to modify the inode configuration. It only needs to be modified when so many files are stored that the inode capacity is exhausted.

Query Inode usage

  1. Remotely connect to the ECS instance.

    For details, see  Logging In to a Linux Instance Using Password or Key Authentication  .

  2. Run the following command to query the Inode usage.

df -i

3. If the Inode utilization rate reaches or approaches 100%, it can be handled in the following two ways:

Clean up files or directories with high Inode usage

If it is inconvenient to format the disk to increase the number of Inodes, you can refer to the following steps to clean up files or directories with high Inode usage.

1. Execute the following command to count the files in each top-level directory under the root directory.

for i in /*; do echo "$i"; find "$i" | wc -l; done

The system displays information similar to the following. In this example, the /usr directory contains the most files, so you need to continue checking which directory under /usr contains the most files. The more files, the higher the inode usage. Please operate according to the actual environment.

2. Enter the directory with the highest inode usage layer by layer and run the above command again, gradually locate the files or directories that consume too many inodes, and finally clean them up accordingly.
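The counting loop in step 1 can also be written so that the directories are listed with the highest count first, which makes the layer-by-layer search quicker. This is a sketch under assumptions: the function name count_entries is hypothetical, and find -xdev is added so that the count stays on one file system, since inodes are allocated per file system.

```shell
# Hypothetical helper: count the entries under each subdirectory of the
# given directory (default: /) and print "count<TAB>directory", highest
# count first.
count_entries() {
    dir=${1-/}
    dir=${dir%/}
    for d in "$dir"/*/; do
        [ -d "$d" ] || continue
        # Each entry found (files and directories alike) uses one inode.
        n=$(find "$d" -xdev 2>/dev/null | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$n" "$d"
    done | sort -rn
}

# Example calls:
#   count_entries /
#   count_entries /usr
```

The count includes each subdirectory itself, so an empty directory is reported as 1.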

Increase the number of Inodes

If you are not allowed to clean up the files on the disk, or the inode usage is still high after you have cleaned up everything that can be removed, you need to back up the data, reformat the disk with a larger number of inodes, and then copy the data back. This keeps your data while increasing the number of inodes in the file system.

Warning

  • Adjusting the number of inodes requires reformatting the disk, which deletes the data on the disk. Make sure the data has been backed up properly before performing the following operations. You can copy the files yourself or back up the data with a snapshot. For details about creating snapshots, see Creating a Cloud Disk Snapshot.

  • Adjusting the number of inodes requires unmounting the file system, which may interrupt your application services. Please choose a suitable time for your business.

1. Execute the following command to unmount the file system.

This example unmounts /home. Please replace it according to your actual environment.

umount /home

2. Execute the following command to re-create the file system with a larger number of inodes.

In this example, the disk partition is /dev/xvdb, the file system type is ext3, and the number of inodes is 1,638,400. Please modify it according to your actual environment.

mkfs.ext3 /dev/xvdb -N 1638400

Note

The number of inodes in Linux is usually derived from the disk capacity, generally at a ratio of one inode per 16 KB. For a 40 GB cloud disk, for example, the number of inodes is usually 2,621,440, and the maximum supported value is 2^32 (about 4.3 billion). You can choose an appropriate inode count for your business by scaling the value for your actual cloud disk capacity by a factor (for example, 1.2).

3. Execute the following command to remount the directory.

This example remounts the unmounted directory according to the configuration in /etc/fstab. Please operate according to the actual situation.

mount -a

4. (Optional) Run the following command to view and confirm the modified number of inodes.

dumpe2fs -h /dev/xvdb | grep node

The system displays information similar to the following, indicating that the number of Inodes has been adjusted successfully. You can then copy back the backup data and restore related applications.
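The arithmetic behind the note in step 2 (one inode per 16 KB, optionally scaled by a factor such as 1.2) can be worked through as follows. The function names default_inodes and scale_inodes are hypothetical helpers for illustration only, and the factor is expressed as a percentage so that the calculation stays in integer arithmetic.

```shell
# Hypothetical helper: default inode count for a disk of the given size in
# GB, assuming one inode per 16 KB (override with the second argument).
default_inodes() {
    size_gb=$1
    kb_per_inode=${2:-16}
    echo $(( size_gb * 1024 * 1024 / kb_per_inode ))
}

# Hypothetical helper: scale an inode count by a percentage (120 = 1.2x).
scale_inodes() {
    echo $(( $1 * $2 / 100 ))
}

default_inodes 40                           # prints 2621440
scale_inodes "$(default_inodes 40)" 120     # prints 3145728
```

The resulting number is the kind of value you would pass to the -N option of mkfs.ext3 in step 2.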

3. Zombie files exist

If neither the disk partition capacity nor the inode capacity is the problem, a large number of files may have been deleted (shown as deleted) while still being held open by processes in the system. The system cannot release their disk space, and because these files are already marked as deleted, they cannot be found with the df or du command. If there are too many zombie files, they take up a lot of disk space. You can refer to the following steps to view and delete zombie files.

  1. Remotely connect to the ECS instance.

    For details, see  Logging In to a Linux Instance Using Password or Key Authentication  .

  2. If the system does not have lsof pre-installed, choose the following appropriate command to install lsof.

  • Alibaba Cloud Linux, CentOS and other systems
yum install -y lsof
  • Debian, Ubuntu and other systems
apt-get install -y lsof

3. Execute the following command to view the space occupied by zombie files.

lsof |grep delete | sort -k7 -rn | more

The system displays information similar to the following, where the seventh column is the size of the corresponding file (in bytes). Add up the values in the seventh column and compare the total with your unexplained disk usage. If the two are close, zombie files are taking up space on your disk.

4. Use either of the following two methods to release the handles, clear the zombie files, and free up disk space.

  • Restart the server

When the server restarts, the system exits all existing processes and releases the handles of the deleted files.

Important

Restarting the server may affect your business. Please choose an appropriate time to restart.

  • Use the kill command

Based on the PID (usually the second column) listed by the lsof command, use the kill command to end the service processes that are holding these files.

i. Execute the following command to list the PIDs.

lsof |grep delete 

ii. After confirming that the corresponding process can be stopped or restarted in your business situation, execute the following command to stop the service process that is holding these files.

kill <PID>

Important

If the server is running business workloads, stopping the process may affect them. Please operate with caution.
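The manual addition described in step 3 (adding up the seventh column of the lsof output) can be sketched as a small filter. The function name sum_deleted_bytes is hypothetical, and the sketch assumes the column layout described above, where the seventh column is the file size in bytes. If your lsof version prints extra columns, adjust the column number.

```shell
# Hypothetical helper: reads lsof output on stdin and prints the total
# size, in bytes, of all entries marked as deleted.
sum_deleted_bytes() {
    awk '/deleted/ && $7 ~ /^[0-9]+$/ { total += $7 }
         END { printf "%.0f\n", total }'
}

# Example call:
#   lsof 2>/dev/null | sum_deleted_bytes
```

A total close to your unexplained disk usage points to zombie files as the cause.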

4. Mount point override

If you have ruled out the above three problems and still cannot account for the disk space usage, the likely cause is a mount point override. You can confirm it as follows.

In the case shown in the figure below, the usage of the 30 GB system disk /dev/vda1 has reached 95%. Using du, you can see that this is mainly because the /home directory occupies 24 GB of space.

However, after /dev/vdb1 is mounted on the /home directory, as shown in the figure below, the usage of the system disk /dev/vda1 is still 95%, while the largest directory under the entire root partition, /usr, occupies only a little more than 1 GB. You can no longer find which directory takes up the most space: the /home directory now reports only 20 KB of usage instead of the 24 GB seen earlier. This phenomenon is a mount point override.

To resolve a mount point override, you usually unmount the disk partition first and then check the space usage under the original mount directory.

Warning

Unmounting a partition may interrupt your application services. Please choose a suitable time for your business.
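Before unmounting anything, it can help to know which directories are mount points at all, because only those can hide files written to the original file system. The following is a sketch that is not part of the original procedure: the function name is_mount_point is hypothetical, and it relies on GNU stat (the -c option), which is an assumption about your environment. It compares device numbers with the parent directory, so bind mounts within the same file system are not detected.

```shell
# Hypothetical helper: succeeds if the directory is a mount point, judged
# by a device number that differs from its parent (or, for /, by being its
# own parent). Requires GNU stat.
is_mount_point() {
    dir=$1
    [ -d "$dir" ] || return 1
    dev=$(stat -c %d -- "$dir") || return 1
    parent_dev=$(stat -c %d -- "$dir/..") || return 1
    ino=$(stat -c %i -- "$dir")
    parent_ino=$(stat -c %i -- "$dir/..")
    [ "$dev" != "$parent_dev" ] || [ "$ino" = "$parent_ino" ]
}

# List the top-level directories that are mount points.
for d in /*/; do
    if is_mount_point "$d"; then
        echo "$d"
    fi
done
```

Each directory listed that sits on the root partition's path, such as /home in the example above, is a candidate to unmount and inspect.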

Origin blog.csdn.net/qq_25231683/article/details/129778709