lighty's life

lighty developer blog

Delay Request Handling for Stupid Crawlers

I’m sure you know what “Crawl-Delay” is, but you may not know that not all search engine crawlers support this nice feature.

What can you do about the crawlers that don’t obey the instruction? They’ll eat all your Mbits/month or slow your web server down. OK, you can ban them with url.access-deny. Until now, that was the only option you had. But you don’t want to remove your pages from the stupid search engine’s index, do you?
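
For reference, such a ban might look roughly like the sketch below. It assumes mod_access is loaded, and "stupid-crawler" is only a placeholder for whatever user-agent string you want to match. An empty string in url.access-deny matches every URL, so the matched crawler is denied everything.

```
# a sketch, assuming mod_access is available
server.modules += ( "mod_access" )

# "stupid-crawler" is a placeholder user-agent pattern
$HTTP["user-agent"] =~ "stupid-crawler" {
    # the empty suffix matches all URLs, so every request is denied
    url.access-deny = ( "" )
}
```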

Here comes another option for you: with this patch, you can delay the handling of specified requests by a number of seconds. Example configuration:

$HTTP["user-agent"] =~ "stupid-crawler" { connection.delay-seconds = 2 }

OK, here’s the link to lighttpd-2296-request-handle-delay.patch, which applies to branches/lighttpd-1.4.x@2296.

Be aware that this patch still has to be reviewed before it is committed to the repository.