Completely removing obsolete directories from an SVN repository by dumping, filtering, and re-importing

Foreword

The disk space an SVN repository occupies grows as the project iterates. Because every revision in the history is kept, deleting an obsolete directory locally and committing the deletion does not release any space. If anything, the repository gets slightly larger, because the deletion itself adds a new revision.

How can the repository be made smaller? The simplest and crudest way is to discard the history: take the latest version, with the obsolete directories already deleted, and upload it directly into a brand-new repository. After that, everyone has to check out again. If you want to keep the history, you need the dump, filter, and re-import method described here.

Brief steps

Suppose SVN repository A lives on the server at /data/svndata/repos/A, and the directory to delete is /arts/tmp/pictures. Note that the path to filter starts with /, which refers to the root of repository A, not the root of the file system.

The operation steps are as follows:

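# 1. Back up (dump) the old repository
svnadmin dump /data/svndata/repos/A > A.dump

# 2. Filter out the obsolete directory
cat A.dump | svndumpfilter exclude /arts/tmp/pictures > B.dump

# 3. Create a new repository
svnadmin create /data/svndata/repos/B

# 4. Load the filtered dump into the new repository
svnadmin load /data/svndata/repos/B < B.dump

# 5. Rename the old repository (run inside /data/svndata/repos)
mv A A_backup

# 6. Rename the new repository so it takes the old one's place
mv B A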
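
Steps 1 to 4 can also be chained into a single pipeline, so that A.dump and B.dump never have to be written to disk. The following is only a sketch, assuming svnadmin and svndumpfilter are on the PATH; filter_repo is a hypothetical helper name, and the -q flags merely silence the progress output. The trade-off is that no dump file is kept as a backup, and a failure halfway through means starting over.

```shell
#!/usr/bin/env bash
# Sketch: dump, filter, and load in one stream, with no intermediate dump files.
# filter_repo is a hypothetical helper; the paths in the usage line are the article's.

# Make a pipeline fail if any stage fails, where the shell supports it.
(set -o pipefail) 2>/dev/null && set -o pipefail

filter_repo() {
    src_repo=$1   # existing repository, e.g. /data/svndata/repos/A
    dst_repo=$2   # new repository to create, e.g. /data/svndata/repos/B
    prefix=$3     # directory to drop, e.g. /arts/tmp/pictures

    # Never load into something that already exists.
    if [ -e "$dst_repo" ]; then
        echo "refusing to overwrite existing path: $dst_repo" >&2
        return 1
    fi

    svnadmin create "$dst_repo" || return 1

    # -q suppresses the per-revision progress messages.
    svnadmin dump -q "$src_repo" \
        | svndumpfilter exclude "$prefix" \
        | svnadmin load -q "$dst_repo"
}

# Usage (not run automatically):
#   filter_repo /data/svndata/repos/A /data/svndata/repos/B /arts/tmp/pictures
```

The rename in steps 5 and 6 is deliberately left out of the helper, so the old repository is only replaced after the new one has been checked by hand.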

Operation example

This walkthrough uses a real SVN repository. I simply ran through the steps above once, mainly to see whether the process is as time-consuming as people say. The repository is named R, holds 115G of data, and has 10843 revisions in total.

  • Export the backup file

    # svnadmin dump /data/svndata/repos/R > r.dump
    * Dumped revision 0.
    * Dumped revision 1.
    * Dumped revision 2.
    ...
    

    The export took 2 hours and 40 minutes. The dump file is 514G, roughly 4.5 times the size of the 115G repository.

  • Filter out the specified directory

    # cat r.dump | svndumpfilter exclude /arts/tmp/pictures > r-exclude.dump
    Excluding prefixes:
       '/arts/tmp/pictures'
    
    Revision 0 committed as 0.
    Revision 1 committed as 1.
    Revision 2 committed as 2.
    Revision 3 committed as 3.
    Revision 4 committed as 4.
    ...
    

    Filtering out the /arts/tmp/pictures directory took 58 minutes in total, and the filtered dump file is 442G.

  • Create a new staging repository

    # svnadmin create r-new
    
  • Import the filtered backup file into the new repository

    # svnadmin load ./r-new < r-exclude.dump 
    <<< Started new transaction, based on original revision 1
         * editing path : arts ... done.
         * editing path : develop ... done.
    
    ------- Committed revision 1 >>>
    
    <<< Started new transaction, based on original revision 2
         * editing path : develop/client ... done.
         * editing path : develop/server ... done.
    
    ------- Committed revision 2 >>>
    ...
    

    The import replays the revisions in order, starting from the first one, and took 4 hours in total.
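
Since the load is the slowest step, it can be worth checking the filtered dump first. The sketch below counts the node records that still sit under the excluded directory. It assumes the usual dump layout, in which every node record carries a "Node-path:" header with the path written without a leading slash. count_paths_under is a hypothetical helper name, and file content that happens to contain a line starting with "Node-path: " would be miscounted. On a dump of several hundred gigabytes the scan itself takes a while.

```shell
# Sketch: count node records under a directory in an SVN dump file.
# count_paths_under is a hypothetical helper, not part of Subversion.
count_paths_under() {
    dump_file=$1
    prefix=${2#/}        # Node-path headers have no leading slash
    prefix=${prefix%/}   # tolerate a trailing slash in the argument

    # LC_ALL=C makes awk treat the (partly binary) dump as plain bytes.
    LC_ALL=C awk -v p="$prefix" '
        index($0, "Node-path: ") == 1 {
            path = substr($0, 12)          # text after "Node-path: "
            if (path == p || index(path, p "/") == 1) n++
        }
        END { print n + 0 }
    ' "$dump_file"
}

# Usage (not run automatically):
#   count_paths_under r-exclude.dump /arts/tmp/pictures
```

A result of 0 for r-exclude.dump, together with a non-zero result for r.dump, would suggest that the filter removed what it was supposed to.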

This process is really slow: it took nearly 8 hours end to end and needs a lot of disk space. I also have another repository with nearly 10,000 revisions and a size of 800G. On a 24-core machine the dump alone took 41 hours and produced a 4.5T file. I did not go any further with that one, because the disk was almost full.
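
The figures for repository R can be checked with a little shell arithmetic. The sketch below uses only the numbers quoted above (115G, 514G, 442G, and the three durations); growth_pct and total_hm are hypothetical helper names. It also shows that the two dump files alone occupy 956G while they coexist, on top of the old and new repositories.

```shell
# Sketch: reproduce the article's figures with integer shell arithmetic.
# growth_pct and total_hm are hypothetical helper names.

# Print size $2 as a percentage of size $1, rounded to the nearest integer.
growth_pct() {
    echo $(( ($2 * 100 + $1 / 2) / $1 ))
}

# Sum any number of durations given in minutes and print them as XhYm.
total_hm() {
    total=0
    for m in "$@"; do
        total=$(( total + m ))
    done
    echo "$(( total / 60 ))h$(( total % 60 ))m"
}

growth_pct 115 514      # dump vs. repository R        → 447
growth_pct 514 442      # filtered dump vs. full dump  → 86
echo $(( 514 + 442 ))   # both dump files on disk, in G → 956
total_hm 160 58 240     # dump + filter + load          → 7h38m
```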

Summary

  • Back up: svnadmin dump /data/svndata/repos/A > A.dump
  • Filter: cat A.dump | svndumpfilter exclude /arts/tmp/pictures > B.dump
  • Create the new repository: svnadmin create /data/svndata/repos/B
  • Import: svnadmin load /data/svndata/repos/B < B.dump

The energetic child is really cute, but I still want to grow up, so that I have the opportunity to control the rhythm. Although I look like a marionette now, I am trying to break through the shackles~

Origin blog.csdn.net/shihengzhen101/article/details/130277612