lighty's life

lighty developer blog

Hash Balancing With Mod_proxy_core

mod_proxy and mod_proxy_core support 3 balancers to spread the load over multiple backends. One of them is hash balancing, which works very well for balancing the load of caching proxies like Squid. Compared to classic round-robin (RR) balancing you should see an increase in performance, because the backends can use their caches a lot better. With RR each backend has to handle the full URL namespace; with hash balancing each backend handles only a part of it. This increases cache locality and overall performance. I've taken "wikipedia":http://wikipedia.org/ as a testbed for hash balancing:
$SERVER["socket"] == ":1445" {
  proxy-core.balancer = "hash"
  proxy-core.protocol = "http"
  proxy-core.backends = ( "wikipedia.org" )
  proxy-core.rewrite-response = (
    "Location" => ( "^http://en.wikipedia.org/(.*)" => "http://127.0.0.1:1445/$1" ),
  )
  proxy-core.rewrite-request = (
    "Host" => ( ".*" => "en.wikipedia.org" ),
  )
}
The domain wikipedia.org resolves to several IP-addresses:
(trace) resolving wikipedia.org on port 80
(trace) adding 207.142.131.204:80 to the address-pool
(trace) adding 207.142.131.205:80 to the address-pool
(trace) adding 207.142.131.206:80 to the address-pool
(trace) adding 207.142.131.210:80 to the address-pool
(trace) adding 207.142.131.213:80 to the address-pool
(trace) adding 207.142.131.214:80 to the address-pool
(trace) adding 207.142.131.235:80 to the address-pool
(trace) adding 207.142.131.236:80 to the address-pool
(trace) adding 207.142.131.245:80 to the address-pool
(trace) adding 207.142.131.246:80 to the address-pool
(trace) adding 207.142.131.247:80 to the address-pool
(trace) adding 207.142.131.248:80 to the address-pool
(trace) adding 207.142.131.202:80 to the address-pool
(trace) adding 207.142.131.203:80 to the address-pool
When I request http://127.0.0.1:1445/, the load balancer takes the URL, hashes it, and sends the request to one of the backends:
(trace) using hash-balancing: /wiki/Main_Page -> 207.142.131.204:80
(trace) using hash-balancing: /skins-1.5/monobook/main.css -> 207.142.131.204:80
(trace) using hash-balancing: /skins-1.5/common/commonPrint.css -> 207.142.131.213:80
(trace) using hash-balancing: /skins-1.5/common/wikibits.js -> 207.142.131.245:80
(trace) using hash-balancing: /w/index.php -> 207.142.131.206:80
(trace) using hash-balancing: /w/index.php -> 207.142.131.206:80
(trace) using hash-balancing: /w/index.php -> 207.142.131.206:80
(trace) using hash-balancing: /w/index.php -> 207.142.131.206:80
(trace) using hash-balancing: /skins-1.5/monobook/headbg.jpg -> 207.142.131.202:80
(trace) using hash-balancing: /skins-1.5/monobook/bullet.gif -> 207.142.131.245:80
(trace) using hash-balancing: /skins-1.5/common/images/poweredby_mediawiki_88x31.png -> 207.142.131.236:80
(trace) using hash-balancing: /images/wikimedia-button.png -> 207.142.131.203:80
(trace) using hash-balancing: /skins-1.5/monobook/bullet.gif -> 207.142.131.245:80
(trace) using hash-balancing: /skins-1.5/monobook/user.gif -> 207.142.131.202:80
(trace) using hash-balancing: /images/wiki-en.png -> 207.142.131.204:80
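The mapping in the trace can be illustrated with a few lines of Python. This is only a minimal sketch and not the code mod_proxy_core uses: the hash function (SHA-256), the modulo selection and the `pick_backend` name are assumptions made for the example, so it will not reproduce the exact URL-to-address pairs shown above. The backend addresses are the first four from the address pool.

```python
import hashlib

# A few backend addresses copied from the address pool in the trace.
BACKENDS = [
    "207.142.131.204:80",
    "207.142.131.205:80",
    "207.142.131.206:80",
    "207.142.131.210:80",
]

def pick_backend(url_path, backends):
    """Map a URL path to one backend (hypothetical helper).

    The hash depends only on the path, so the same path always
    selects the same backend as long as the backend list is unchanged.
    """
    digest = hashlib.sha256(url_path.encode("utf-8")).digest()
    # Assumption: take 64 bits of the digest and reduce them modulo the pool size.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

for path in ["/wiki/Main_Page", "/w/index.php", "/w/index.php"]:
    print(f"using hash-balancing: {path} -> {pick_backend(path, BACKENDS)}")
```

Repeated requests for `/w/index.php` print the same backend each time, which is the property that lets each cache keep its own slice of the URL namespace warm.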
You can see that the same URL always ends up at the same backend address. If one of the backends goes down, all requests that were meant for that backend are spread over the other backends. If you had 10 backends and 1 goes down, each of the remaining 9 backends has to serve 1/9 of the dead backend's URLs:

* you have 100 URLs and 10 backends
* each backend handles (100/10) = 10 URLs
* one backend goes down and its 10 URLs are spread over the other 9 backends
* each remaining backend now handles its 10 URLs as before plus (10/9), roughly 1.1, URLs of the dead backend

Hash balancing follows the ideas from "http://icp.ircache.net/carp.txt":http://icp.ircache.net/carp.txt
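The failover behaviour described above can be sketched with highest-random-weight hashing, the general idea behind CARP: every (backend, URL) pair gets a score and the URL goes to the live backend with the highest score. This is an illustration under stated assumptions, not the mod_proxy_core implementation: the `10.0.0.x` addresses, the `/page/N` URLs, the SHA-256 scoring and the function names are all made up for the example, and CARP's load factors are left out. A plain `hash % n` scheme would behave differently here, because changing `n` reshuffles most URLs.

```python
import hashlib

def score(url_path, backend):
    """Combined hash of backend and URL (hypothetical scoring function)."""
    data = f"{backend}|{url_path}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def pick_backend(url_path, alive_backends):
    """Send the URL to the live backend with the highest score."""
    return max(alive_backends, key=lambda backend: score(url_path, backend))

# Assumed test data: 10 backends and 100 URLs, as in the list above.
backends = [f"10.0.0.{i}:80" for i in range(1, 11)]
urls = [f"/page/{i}" for i in range(100)]

before = {url: pick_backend(url, backends) for url in urls}

# Take the first backend out of the pool and map every URL again.
dead = backends[0]
survivors = backends[1:]
after = {url: pick_backend(url, survivors) for url in urls}

orphaned = [url for url in urls if before[url] == dead]
moved = [url for url in urls if before[url] != after[url]]

print(f"URLs on the dead backend: {len(orphaned)}")
print(f"URLs that changed backend: {len(moved)}")
print(f"survivors that took over URLs: {len({after[url] for url in orphaned})}")
```

Only the URLs that belonged to the dead backend change their target; every other URL keeps its backend and therefore its warm cache. With a real hash the split is only approximately even, so the 10 URLs per backend in the list above are an average rather than an exact count.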