It was found that although wget follow robots.txt rules, but that can still go around, now 56 cloud shield small series to share my own method of use to you:
- Shield download any file
.htaccess
SetEnvIfNoCase User-Agent "^wget" bad_bot
<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>
- Shielded part of the file downloads
.htaccess
SetEnvIfNoCase User-Agent "^Wget" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.5.3" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.6" bad_bot
<Files ~ ".(html|pdf|mp3|zip|rar|exe|gif|jpe?g|png|php|jsp) $">
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</files>