Hadoop snapshot backup and recovery

Back up files on HDFS using snapshots

For the official documentation, see http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.5.0-cdh5.2.0/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html

==========================================================================================

1. Allow snapshot creation

First, run the following command on the folder you want to back up, so that snapshots can be created for it:

hdfs dfsadmin -allowSnapshot <path>

For example: hdfs dfsadmin -allowSnapshot /Workspace/linlin

 

If the command prints a success message, snapshot creation has been allowed for this folder.
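
To make this step safe to repeat, the check and the command can be wrapped in a small shell function. The following is a minimal sketch, not part of the original procedure: the function names is_snapshottable and ensure_snapshottable are hypothetical, and it assumes that each line printed by hdfs lsSnapshottableDir ends with the directory path.

```shell
# Sketch: allow snapshots on a directory only if it is not already listed.
# Assumption: the path is the last whitespace-separated field of each
# line printed by `hdfs lsSnapshottableDir`.
is_snapshottable() {
    # $1 = absolute HDFS directory path
    hdfs lsSnapshottableDir |
        awk -v dir="$1" '$NF == dir { found = 1 } END { exit !found }'
}

ensure_snapshottable() {
    # $1 = absolute HDFS directory path
    if is_snapshottable "$1"; then
        echo "already snapshottable: $1"
    else
        hdfs dfsadmin -allowSnapshot "$1"
    fi
}
```

For example, ensure_snapshottable /Workspace/linlin runs the allowSnapshot command only when the folder is not yet listed.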

=================================================================================================

2. Create a snapshot

Next, back up this folder by creating a snapshot.

hdfs dfs -createSnapshot <path> [<snapshotName>]

For example: hdfs dfs -createSnapshot /Workspace/linlin bak1

If the command reports the path of the new snapshot, the snapshot has been created successfully.

Now consider whether a snapshot can be created on a subdirectory of linlin.

At this point the linlin folder on HDFS contains a subfolder named snaptest.

Try to create a snapshot on snaptest:

hdfs dfs -createSnapshot /Workspace/linlin/snaptest bak2

An error is reported: snapshots can only be created on directories for which snapshot creation has been allowed.

The snaptest folder did not exist when the first snapshot, bak1, was taken. Now that it exists, create another snapshot.

If you reuse the same snapshot name:

hdfs dfs -createSnapshot /Workspace/linlin bak1

an error is reported saying that the snapshot name already exists.

Run the command with a new name instead:

hdfs dfs -createSnapshot /Workspace/linlin bak2

The snapshot is created successfully.
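
Since a snapshot name can only be used once per directory, one way to avoid the conflict is to put a timestamp in the name. A minimal sketch, assuming a hypothetical helper create_dated_snapshot and a default prefix bak chosen here only for illustration:

```shell
# Sketch: create a snapshot whose name carries the current timestamp,
# so repeated runs do not collide on the snapshot name.
create_dated_snapshot() {
    # $1 = snapshottable directory, $2 = optional name prefix (default: bak)
    snap_name="${2:-bak}-$(date +%Y%m%d-%H%M%S)"
    hdfs dfs -createSnapshot "$1" "$snap_name"
}
```

For example, create_dated_snapshot /Workspace/linlin creates a snapshot with a name of the form bak-YYYYMMDD-HHMMSS. HDFS also generates a timestamp-based name on its own when the snapshot name is omitted.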

==============================================================================================================

3. View snapshots

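View all snapshottable directories:

hdfs lsSnapshottableDir

This lists every directory for which snapshot creation has been allowed.

View the files in a snapshot:

When Hadoop creates a snapshot, it is stored under the .snapshot folder of the snapshottable directory, so you must include .snapshot in the path to see the backed-up content. Because .snapshot was only introduced in later Hadoop releases, a folder that was already named .snapshot conflicts with this reserved name and prevents snapshots from being created.

Run the command:

hdfs dfs -ls /Workspace/linlin/.snapshot/

You can see that there are three snapshots under .snapshot, namely bak1, bak2, and linlin.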
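
When only the snapshot names are needed, for example in a script, they can be extracted from the listing. This is a rough sketch: list_snapshots is a hypothetical helper, and it assumes that the path is the last field of each line printed by hdfs dfs -ls and that paths contain no spaces.

```shell
# Sketch: print only the snapshot names of a snapshottable directory.
list_snapshots() {
    # $1 = snapshottable directory, without a trailing slash
    hdfs dfs -ls "$1/.snapshot" |
        awk -v base="$1/.snapshot/" '
            # Keep only lines whose last field is a path under .snapshot,
            # which skips the "Found N items" header line.
            index($NF, base) == 1 { print substr($NF, length(base) + 1) }
        '
}
```

Calling list_snapshots /Workspace/linlin would then print one snapshot name per line.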

===========================================================================================================

4. Compare snapshots

Compare two snapshots to see how the backed-up files differ between them.

Run the command:

hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>

For example: hdfs snapshotDiff /Workspace/linlin bak1 bak2

Each line of the result starts with one of the following markers:


+ The file/directory has been created.
- The file/directory has been deleted.
M The file/directory has been modified.
R The file/directory has been renamed.
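
For a long report it can help to count how many entries carry each of these four markers. The following is a minimal sketch under the assumption that every change line starts with its marker as a separate field; summarize_diff is a hypothetical helper name.

```shell
# Sketch: tally the +, -, M and R lines of a snapshotDiff report.
summarize_diff() {
    # $1 = snapshottable directory, $2 = from snapshot, $3 = to snapshot
    hdfs snapshotDiff "$1" "$2" "$3" |
        awk '
            $1 == "+" { created++ }
            $1 == "-" { deleted++ }
            $1 == "M" { modified++ }
            $1 == "R" { renamed++ }
            END {
                printf "created=%d deleted=%d modified=%d renamed=%d\n",
                       created, deleted, modified, renamed
            }
        '
}
```

Lines that do not start with one of the four markers, such as a header line, are ignored by the counters.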
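Here, M means that the linlin folder was modified, and + means that a new folder, snaptest, was added.

============================================================================================================================

5. Restore a snapshot

To restore from a snapshot, copy the files out of it:

hdfs dfs -cp <source> <destination>

For example: hdfs dfs -cp /Workspace/linlin/.snapshot/bak2/snaptest /Workspace

View the HDFS directory: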

The snaptest folder has been successfully restored to /Workspace.
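
The restore step can be wrapped so that the snapshot path is built from its parts and an existing target is not overwritten by accident. This is only a sketch: restore_from_snapshot is a hypothetical helper, and the existence check with hdfs dfs -test -e is an extra precaution that the original steps do not include.

```shell
# Sketch: copy one entry out of a snapshot into a destination directory.
restore_from_snapshot() {
    # $1 = snapshottable directory, $2 = snapshot name,
    # $3 = path relative to the snapshot root, $4 = destination directory
    src="$1/.snapshot/$2/$3"
    target="$4/$(basename "$3")"
    # Added precaution (not in the original steps): do not copy over
    # something that already exists at the destination.
    if hdfs dfs -test -e "$target"; then
        echo "refusing to overwrite existing path: $target" >&2
        return 1
    fi
    hdfs dfs -cp "$src" "$4"
}
```

With the values used above, restore_from_snapshot /Workspace/linlin bak2 snaptest /Workspace runs the same copy command as the example.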
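==================================================================================================================================

Aside:

If you try to delete a folder on which snapshots have been created, the deletion fails with a message saying that the folder cannot be deleted because snapshots exist. This shows that once snapshots have been created on a folder, the folder cannot be deleted or moved.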
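
If such a folder really has to be removed, its snapshots can be deleted first with hdfs dfs -deleteSnapshot, after which snapshot creation can be switched off again with hdfs dfsadmin -disallowSnapshot. A minimal sketch, assuming the snapshot names are already known; remove_snapshots is a hypothetical helper name.

```shell
# Sketch: delete the named snapshots of a directory, then disallow snapshots.
remove_snapshots() {
    # $1 = snapshottable directory; remaining arguments = snapshot names
    snap_dir="$1"
    shift
    for snap_name in "$@"; do
        # Stop at the first failure so nothing is left half-done silently.
        hdfs dfs -deleteSnapshot "$snap_dir" "$snap_name" || return 1
    done
    # All snapshots must be gone before snapshots can be disallowed.
    hdfs dfsadmin -disallowSnapshot "$snap_dir"
}
```

For the folder used in this article, that would be remove_snapshots /Workspace/linlin bak1 bak2 linlin. Deleting snapshots discards the backups, so this should only be done when they are no longer needed.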

Origin blog.csdn.net/qq_16504067/article/details/132806528