Continued From Page 4
"Some handy variations of RewriteCond..."
RewriteCond %{HTTP_USER_AGENT} ^WebReaper.*$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider.*$
RewriteRule .* - [F]
If you notice strange browsers (HTTP_USER_AGENT) in your server logs, they could be site grabbers or offline browsers. The two RewriteCond lines above check whether the browser's name starts with "WebReaper" OR "JOC Web Spider" and, if so, send back a 403 [Forbidden] response. When I tested WebReaper and JOC, this code caused both applications to stop before they downloaded any HTML or images. The applications preserve the original URL they requested, and the user would have to click it to visit your site. You could redirect these applications to another URL instead; however, in testing, JOC wouldn't allow itself to be redirected. It appeared to re-request the original URL, so perhaps it uses an internal failure count. It did obey the 403 response! There are many of these offline browsers available, so check your logs occasionally for them. As an alternative to explicitly coding each one, you can use the fact that they shouldn't pass anything in the HTTP_REFERER variable; the following should protect your images from them:
Note: this could also block friendly search engine spiders, which usually send no referer either!
RewriteCond %{HTTP_REFERER} ^$
RewriteRule .*\.(jpg|gif)$ - [F]
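If the empty-referer rule turns out to block crawlers you want, one possible approach is to add a second condition that skips the rule for them. This is only a sketch: the user-agent names "Googlebot" and "Slurp" are illustrative assumptions, not tested values, so check your own logs for the names your friendly spiders actually send.

```apache
# Assumes mod_rewrite is available and allowed in this directory.
RewriteEngine On
# Match requests that carry no referer at all...
RewriteCond %{HTTP_REFERER} ^$
# ...but only when the user agent does NOT look like a spider we want.
# "Googlebot" and "Slurp" are example names only (assumption).
RewriteCond %{HTTP_USER_AGENT} !(Googlebot|Slurp) [NC]
# Image requests that satisfy both conditions get a 403.
RewriteRule \.(jpg|gif)$ - [F]
```

Conditions without [OR] are ANDed, so both must hold before the rule fires. Keep in mind that a user-agent string is easy to fake, so a site grabber could pass itself off as one of the exempted names.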
"Some other handy things...."
RewriteCond %{HTTP_REFERER} ^http://www\.nastydomains\.com/.*$ [NC]
RewriteRule .* http://www.nasty-domain-hell.com/ [R,L]
The above rule could be used to send anyone coming from a particularly nasty domain, with a request for anything (.* matches a string, e.g. a URL, of any length), to your special place.
RewriteCond %{HTTP_REFERER} !^http://www\.newbie\.com/.*$ [NC]
RewriteRule .*page1\.html$ http://www.newbie.com/index.html [R,L]
If someone was deep-linking into a specific page on your domain, "page1.html", the above rule could be used to redirect them to your index page.
You can create RewriteCond statements that use almost any of your server's environment variables, including cookies, DNS names, and IP addresses. Check the Apache documentation links included in the article if you need something special. :)
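As a rough illustration of conditions on other variables, the sketch below tests the visitor's IP address (REMOTE_ADDR) and the cookie header (HTTP_COOKIE). The address block 192.0.2.* is a documentation-only range used as a placeholder, and the cookie "visited=yes" is hypothetical; substitute your own values.

```apache
# Assumes mod_rewrite is available and allowed in this directory.
RewriteEngine On

# Refuse every request from one address block.
# 192.0.2.* is a placeholder range, not a real offender.
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.
RewriteRule .* - [F]

# Send visitors who lack a "visited=yes" cookie to the index page.
# The cookie name and value are made up for this example.
RewriteCond %{HTTP_COOKIE} !visited=yes
RewriteRule page1\.html$ http://www.newbie.com/index.html [R,L]
```

Each RewriteCond applies only to the RewriteRule that follows it, which is why the two checks above do not interfere with each other.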
"I want them to hotlink me...."
If a search engine caches your page, the surfer clicking on the cached link won't see any of your images or banners. This is a scenario where you might want the search engine to be able to hotlink at least your banners, so the surfer would still see those.
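One way to allow this might look like the sketch below. It assumes a basic hotlink-protection rule that refuses images to foreign referers, and it assumes your banners live in a directory called /banners/ (a made-up name for illustration). The domain www.newbie.com stands in for your own site.

```apache
# Assumes mod_rewrite is available and allowed in this directory.
RewriteEngine On
# Leave requests with an empty referer alone.
RewriteCond %{HTTP_REFERER} !^$
# Leave requests coming from your own pages alone.
RewriteCond %{HTTP_REFERER} !^http://www\.newbie\.com/ [NC]
# Leave anything under /banners/ alone so it can be hotlinked.
# The directory name is an assumption for this example.
RewriteCond %{REQUEST_URI} !^/banners/
# Any other image requested from a foreign page gets a 403.
RewriteRule \.(jpg|gif)$ - [F]
```

Exempting a directory lets every outside page hotlink the banners, not only the search engine's cache. If that is too generous, a condition on the cache's referer could be used instead, once you have seen in your logs what that referer looks like.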
Funny Symbols
! - negation
^ - start of string
$ - end of string
. - any character
? - repetition [0,1]
* - repetition [0,..,n]
+ - repetition [1,..,n]
\ - escape character
[NC] - ignore case
[R,L] - redirect, last
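To see several of these symbols working together, here is the deep-linking rule from this page again with each symbol spelled out in comments. The optional final "l" in the RewriteRule pattern is an addition for illustration and is not part of the original rule.

```apache
# !     negate: the condition holds when the pattern does NOT match
# ^     anchor the match at the start of the referer
# \.    a literal dot (an unescaped . would match any character)
# .*    any character, repeated zero or more times
# $     anchor the match at the end of the referer
# [NC]  compare without regard to case
RewriteCond %{HTTP_REFERER} !^http://www\.newbie\.com/.*$ [NC]
# l?    the ? makes the final "l" optional, so page1.htm and
#       page1.html both match
# [R,L] send a redirect, then stop processing further rules
RewriteRule .*page1\.html?$ http://www.newbie.com/index.html [R,L]
```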
Regular Expressions
simple examples...
Advanced Students
URL Rewriting Guide
URL Rewriting Engine
Apache HTTP Server
.htaccess testing!
www.hotlinking.com
It doesn't work?
Common problems...
Things that crawl
Friendly Spiders
billy
Buddha



