Using git BFG (removing sensitive information from commit history, deleting files, etc.)

Preface

While using Git, we may mistakenly push sensitive information (keys, personal data) or useless files to the remote repository. The obvious cleanup is to delete the sensitive information from the file and commit again. After that, the current contents of the repository no longer contain the sensitive information, but it can still be seen in the commit history.
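The point about history can be checked in a throwaway repository. The sketch below is only an illustration: the temporary directory, the file name test.txt, and the value hunter2 are made up for the demo, and it assumes git and mktemp are available.

```shell
#!/bin/sh
# Hypothetical demo: directory, file name, and secret value are invented for illustration.
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q .

# Pass an identity inline so the demo does not depend on global git config.
git_commit() {
    git -c user.name=demo -c user.email=demo@example.com commit -q "$@"
}

# Commit 1: the secret is added by mistake.
echo "Password1:hunter2" > test.txt
git add test.txt
git_commit -m "add"

# Commit 2: the "fix" empties the file and commits again.
: > test.txt
git add test.txt
git_commit -m "delete"

# The working tree is clean now, but the previous commit still holds the secret.
old_content=$(git show HEAD~1:test.txt)
echo "working tree: $(wc -c < test.txt | tr -d ' ') bytes"
echo "previous commit: $old_content"
```

Anyone who can fetch the repository can run the same git show against the older commit, which is why the history itself has to be rewritten.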

When we need to remove this sensitive information from the commit history without losing the history of the entire repository, we can use the official git-filter-branch tool, but it is cumbersome and slow.

The BFG Repo-Cleaner tool is recommended here. It is written in Scala and is purpose-built for removing data from git commit history. It is an alternative to git-filter-branch, and the official introduction says it is 10 to 720 times faster than git-filter-branch.

Using BFG Repo-Cleaner

Official website process introduction:

git clone --mirror git://example.com/some-big-repo.git
java -jar bfg.jar --strip-blobs-bigger-than 100M some-big-repo.git
cd some-big-repo.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push

Details below.

1. Installation

  1. A local Java environment is required. This article does not cover installing it.
  2. Download bfg.jar. The download link provided here is for version 1.14.0; you can also download it yourself from the official website.
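Before going further it can help to confirm that the required tools are on the PATH. This is a small sketch, not part of the original post; the helper name have_cmd is hypothetical.

```shell
# have_cmd NAME: succeed if NAME resolves to something runnable in this shell.
# (have_cmd is a made-up helper name for this sketch.)
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# bfg.jar needs java to run, and the cleanup steps need git.
for tool in java git; do
    if have_cmd "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing, install it before running bfg"
    fi
done
```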

2. Basic usage

Clone my-repo.git to your local machine with --mirror. After running the commands, the directory looks like this:

[Image: directory listing after running the commands]
The my-repo.git.bfg-report directory records the data that was modified.

  1. The following command removes all files larger than 100M from the commit history.
java -jar bfg.jar --strip-blobs-bigger-than 100M my-repo.git
  2. Delete specified files
java -jar bfg.jar --delete-files id_dsa my-repo.git   # delete files named id_dsa

java -jar bfg.jar --delete-files 'id_{dsa,rsa}' my-repo.git   # delete files named id_dsa or id_rsa (quoted so the shell passes the pattern to bfg unchanged)
  3. Delete a directory
java -jar bfg.jar --delete-folders pwd my-repo.git   # delete directories named pwd
  4. Remove sensitive information
java -jar bfg.jar --replace-text replace_pwd.txt my-repo.git

Here replace_pwd.txt defines the text to be removed. You can look up the full syntax rules yourself. An example:

PASSWORD1                       # replace PASSWORD1 with ***REMOVED*** (the default)
PASSWORD2==>examplePass         # replace PASSWORD2 with examplePass
PASSWORD3==>                    # replace PASSWORD3 with the empty string
regex:password=\w+==>password=  # regex replacement: password=xxx becomes password=
regex:\r(\n)==>$1               # replace Windows line endings with Unix line endings
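One way to create the rules file from the shell is a quoted here-document, sketched below. The rules are the same five as above; the trailing # notes are left out because I am not certain BFG treats # as a comment marker, so treat that omission as a cautious assumption rather than a documented requirement.

```shell
# Write the rules file. The quoted 'EOF' stops the shell from
# expanding $1 or interpreting the backslashes in the regex rules.
cat > replace_pwd.txt <<'EOF'
PASSWORD1
PASSWORD2==>examplePass
PASSWORD3==>
regex:password=\w+==>password=
regex:\r(\n)==>$1
EOF

# Quick sanity check: one rule per line.
rule_count=$(wc -l < replace_pwd.txt | tr -d ' ')
echo "wrote $rule_count rules to replace_pwd.txt"
```

With an unquoted EOF the shell would substitute $1 before the file is written, silently changing the last rule.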

3. Example

bfg script

# First clone the repository with --mirror. Look up what git clone --mirror does if you are curious.
git clone --mirror xxx/my-repo.git

# Replace Password1:xxx with an empty string
java -jar bfg.jar --replace-text "replace_pwd.txt" --no-blob-protection my-repo.git

# Delete the pwd.txt file
java -jar bfg.jar --delete-files "pwd.txt" --no-blob-protection my-repo.git

cd my-repo.git

# Clean up the leftover data
git reflog expire --expire=now --all && git gc --prune=now --aggressive

# Push to the remote
git push

Contents of replace_pwd.txt

regex:Password1:[\s\S]+==>

To try this out, I simulated mistakenly pushing sensitive information to a remote repository.

  1. The add commit adds the files test.txt and pwd.txt.
  2. The delete commit deletes pwd.txt and removes the value of Password1 from test.txt.
  3. After running the bfg script, check the two commits again: all sensitive information has been removed.

In the add commit, the Password1 assignment is gone and the change to pwd.txt is no longer visible.
[Image: add_after]
The delete commit no longer contains any changes.
[Image: delete_after]

4. Problems encountered during use

  1. You need to clone the repository with --mirror first. All operations are performed on that clone.
  2. bfg will not delete files that still exist in the repository's latest commit, even with --delete-files. The correct approach is to delete the file manually, commit and push that change to the remote repository, and let bfg remove only the related data in the commit history.
  3. At the end, git push kept reporting insufficient permissions (on a repository I created myself), so I added the --no-blob-protection parameter. (This parameter lets bfg also rewrite the contents of the latest commit; it does not change push permissions by itself.) Some of the permission controls here are still not clear to me. If the push still fails, you can temporarily remove the protection on the branch or tag, then restore it after the update is complete. (Screenshot of the branch protection settings omitted.)
    The example here is a branch that does not allow force pushes. If the switch in the screenshot is turned off, pushing again reports an error.
  4. The --replace-text parameter takes a file of replacement rules, one rule per line. The replacement is applied to the data of the whole repository, so the entire repository is scanned.
  5. What if there are dev and main branches, both containing commit history that needs sensitive information stripped? Because the clone command used the --mirror flag, the push updates all references on the remote server, so the sensitive data is removed from the history of the dev and main branches and of all tags.
  6. The commits themselves are not deleted. Only the data changes inside them are removed.
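Before running git push, it may be worth checking that the string really is gone from everything the push will send. The helper below is a rough sketch under a few assumptions: history_contains is a made-up name, the pattern is treated as a fixed string, and all revisions are passed on one command line, which can exceed the argument limit on very large repositories.

```shell
# history_contains REPO PATTERN
# Succeed if any commit reachable from any ref in REPO contains PATTERN
# (matched as a fixed string). history_contains is a hypothetical helper.
history_contains() {
    repo=$1
    pattern=$2
    revs=$(git -C "$repo" rev-list --all)
    # A repository with no commits cannot contain the pattern.
    [ -n "$revs" ] || return 1
    # git grep accepts commits as search targets; -q stops at the first hit.
    # $revs is intentionally unquoted so each commit id becomes one argument.
    git -C "$repo" grep -qF -e "$pattern" $revs
}

# Example usage against the mirror clone from this article:
#   if history_contains my-repo.git 'Password1'; then
#       echo "still present, do not push yet"
#   fi
```

Because the mirror push updates every branch and tag, searching all refs with rev-list --all matches what will actually reach the remote.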

Summary

The main purpose of bfg: when you want to keep the commit history but need to delete information in it that others should not see (keys, personal data, etc.), or remove large files.
Once the environment is set up, it is convenient to use and runs very fast. I recommend it when you run into similar needs.

Finally, I hope nobody actually has to use this tool. Build up the relevant knowledge (how to use git, how to filter out sensitive information, etc.) in advance to avoid unnecessary trouble.


Origin blog.csdn.net/DisMisPres/article/details/127442307