Using an Apache .htaccess file to block wget from downloading web content

Although wget follows robots.txt rules by default, they are easy to bypass (for example, with `wget -e robots=off`). Here the 56 Cloud Shield editor shares the method I use myself:

  1. Block downloads of any file

.htaccess

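# Flag any request whose User-Agent starts with "wget" (case-insensitive)
SetEnvIfNoCase User-Agent "^wget" bad_bot

# Deny flagged clients for GET and POST requests
<Limit GET POST>
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot
</Limit>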
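The `Order`/`Allow`/`Deny` directives above are the Apache 2.2 style of access control; on Apache 2.4 they work only while mod_access_compat is loaded. As a hedged sketch, assuming Apache 2.4 with mod_setenvif and mod_authz_core available and `AllowOverride` permitting `FileInfo` and `AuthConfig` for the directory, the same rule could be written like this:

```apache
# Sketch for Apache 2.4 (assumption: mod_setenvif and mod_authz_core are loaded)
# Flag any request whose User-Agent starts with "wget" (case-insensitive)
SetEnvIfNoCase User-Agent "^wget" bad_bot

# Allow everyone except clients carrying the bad_bot flag
<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>
```

Keep in mind that this only filters on the User-Agent header, which a client can change (wget itself has a `--user-agent` option), so it deters casual mirroring rather than determined downloading.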

  2. Block downloads of selected file types

.htaccess

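# Flag Wget clients; the first pattern already matches every version,
# so the two version-specific lines are redundant but harmless
SetEnvIfNoCase User-Agent "^Wget" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1\.5\.3" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1\.6" bad_bot

# Deny flagged clients only for files with the listed extensions
<Files ~ "\.(html|pdf|mp3|zip|rar|exe|gif|jpe?g|png|php|jsp)$">
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot
</Files>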
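For Apache 2.4, the per-extension rule can be sketched with `<FilesMatch>` and `Require` instead. This is an assumption-laden variant rather than the original author's configuration: it presumes mod_authz_core is loaded and that `AllowOverride` permits `FileInfo` and `AuthConfig`. Note the escaped dot and the `$` anchor placed directly after the closing parenthesis, so the pattern matches only real file extensions at the end of the name:

```apache
# Sketch for Apache 2.4 (assumption: mod_setenvif and mod_authz_core are loaded)
SetEnvIfNoCase User-Agent "^Wget" bad_bot

# Apply the restriction only to files ending in one of these extensions
<FilesMatch "\.(html|pdf|mp3|zip|rar|exe|gif|jpe?g|png|php|jsp)$">
    <RequireAll>
        Require all granted
        Require not env bad_bot
    </RequireAll>
</FilesMatch>
```

Files whose names do not match the pattern stay reachable by any client, including wget.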

Origin blog.51cto.com/14540004/2455578