By default, HDFS trash is disabled

HDFS is a hierarchical file system, so the old-fashioned 'rm deathstar' (DON'T RUN THIS! 'rm -rf /') is the greatest fear of people who worry about stuff for a living (system admins). Hadoop has this really awesome feature called the trash can: if you accidentally delete your entire filesystem, the files just get moved into a .Trash directory under your HDFS home folder instead of being destroyed. This feature is really awesome. This feature is incredible. Also, this feature is OFF by default. Check the configuration in core-default.xml: fs.trash.interval = 0. The value of fs.trash.interval is a number of minutes, and it controls how long deleted files stay in the trash before they are removed for good. If it is 0, the trash feature is disabled.

To enable this feature, set fs.trash.interval in your core-site.xml to a value greater than 0, large enough that you have time to notice the mistake and recover your data! You will also need to restart your NameNode for this change to take effect.

In my case I set it as:

<property>
	<name>fs.trash.interval</name>
	<value>2880</value>
</property>
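
The value is in minutes, so two days works out to 2 x 24 x 60 = 2880. Here is a small shell sketch for working out the number for other retention periods. Note that days_to_minutes is a hypothetical helper of my own naming, not something Hadoop ships, and the commented-out check at the end assumes a Hadoop 2 or later client is installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): convert a retention period
# in days into the minutes that fs.trash.interval expects.
days_to_minutes() {
    echo $(( $1 * 24 * 60 ))
}

days_to_minutes 2   # prints 2880

# On a machine with a Hadoop 2+ client, the value the client sees
# can be checked with:
#   hdfs getconf -confKey fs.trash.interval
```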

This configuration will do you a BIG favor if you accidentally run 'hadoop fs -rmr /' on HDFS: it leaves you two days (2880 minutes) to recover the deleted contents.
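
Recovering is just a move back out of the trash. The sketch below assumes the default trash layout, /user/<username>/.Trash/Current/<original path>. Both trash_path and restore_from_trash are hypothetical helpers, and the user name and path in the example are made up. The restore call is left commented out because it needs a Hadoop client on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: build the trash location of a deleted HDFS path.
# Assumes the default layout /user/<username>/.Trash/Current/<original path>.
trash_path() {
    user="$1"
    original="$2"
    printf '/user/%s/.Trash/Current%s\n' "$user" "$original"
}

# Hypothetical helper: move a deleted path back to where it was.
# Needs a Hadoop client on PATH; the parent directory must still exist.
restore_from_trash() {
    hadoop fs -mv "$(trash_path "$1" "$2")" "$2"
}

trash_path alice /data/logs   # prints /user/alice/.Trash/Current/data/logs
# restore_from_trash alice /data/logs
```

Keep in mind that anything deleted with the -skipTrash option bypasses the trash entirely, so it cannot be recovered this way.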

Reposted from puffsun.iteye.com/blog/1896099