lighty's life

lighty developer blog

Mod-proxy-core and SQF

mod-proxy-core has three balancers for different needs: Round Robin (RR), Shortest Queue First (SQF) and CARP.

We can sort the balancers into two groups:

  • load balancing by distribution (RR, SQF)
  • load balancing by separation (CARP)

Round Robin

Round Robin (RR) is simple and straightforward.

If you have 3 hosts

  • A1
  • A2
  • A3

the first request goes to A1, the second to A2 and the third to A3. The fourth request starts at A1 again.
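The textbook rotation described above can be sketched in a few lines of Python. This is only an illustration of strict round robin, not code from mod-proxy-core; the function name `round_robin` is made up for the example.

```python
from itertools import cycle

def round_robin(hosts):
    """Return an endless iterator that walks the hosts in order (illustrative helper)."""
    return cycle(hosts)

hosts = ["A1", "A2", "A3"]
rr = round_robin(hosts)

# The first four requests: A1, A2, A3, then back to A1.
first_four = [next(rr) for _ in range(4)]
print(first_four)
```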

We use a slightly different implementation. Instead of really going from A1 to A2 to A3, we take all active backends and pick one randomly. On average each host gets the same number of requests.
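A minimal Python sketch of the random variant, assuming each backend carries a hypothetical `active` flag (the real data structures in mod-proxy-core are C and look different). It only shows why the distribution evens out over many requests.

```python
import random
from collections import Counter

def pick_random(backends, rng=random):
    """Pick one backend uniformly at random among the active ones.

    `backends` is a list of dicts with hypothetical keys "name" and "active".
    Returns the backend name, or None if no backend is active.
    """
    active = [b for b in backends if b["active"]]
    if not active:
        return None
    return rng.choice(active)["name"]

backends = [
    {"name": "A1", "active": True},
    {"name": "A2", "active": True},
    {"name": "A3", "active": True},
]

# Over many requests every active host ends up with roughly a third.
counts = Counter(pick_random(backends) for _ in range(30000))
print(counts)
```

Unlike a strict rotation, this needs no shared "next host" position, and a backend that drops out of the active set is skipped without any extra bookkeeping.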

Shortest Queue First

RR has a little problem: if A1 is slower than A2 and A3, the fast backends still get the same number of requests as A1.

SQF tries to take that into account and uses the queue length of each backend as the basis for balancing.

The first request goes to A1 and takes 10s to complete. Meanwhile four more requests arrive, which A2 and A3 each execute in 2s.

After two seconds it looks like this:

  • A1 needs 8 more seconds [q-len: 1]
  • A2 is free [q-len: 0]
  • A3 is free [q-len: 0]

Request 3 goes to A2:

  • A1 needs 8 more seconds [q-len: 1]
  • A2 needs 2 seconds [q-len: 1]
  • A3 is free [q-len: 0]

and Request 4 goes to A3:

  • A1 needs 8 more seconds [q-len: 1]
  • A2 needs 2 seconds [q-len: 1]
  • A3 needs 2 seconds [q-len: 1]

If another request comes in now, we put it into the backlog.
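The walk-through above can be replayed with a small Python sketch. It is a simplification under stated assumptions, not the mod-proxy-core implementation: the `Backend` class, `sqf_dispatch` and the per-backend limit `max_queue=1` (chosen so that the fifth request lands in the backlog, as in the example) are all hypothetical.

```python
from collections import deque

class Backend:
    """Hypothetical backend record: a name and its current queue length."""
    def __init__(self, name, max_queue=1):
        self.name = name
        self.queue_len = 0
        self.max_queue = max_queue  # assumed limit, picked to match the example

def sqf_dispatch(backends, request, backlog):
    """Send the request to the backend with the shortest queue.

    If every backend is at its limit, the request goes to the backlog
    and None is returned.
    """
    candidates = [b for b in backends if b.queue_len < b.max_queue]
    if not candidates:
        backlog.append(request)
        return None
    # min() returns the first of equally short queues, so ties go to the
    # earlier backend in the list.
    target = min(candidates, key=lambda b: b.queue_len)
    target.queue_len += 1
    return target.name

a1, a2, a3 = Backend("A1"), Backend("A2"), Backend("A3")
a1.queue_len = 1            # A1 is still busy with the 10s request
backlog = deque()

print(sqf_dispatch([a1, a2, a3], "request 3", backlog))  # goes to A2
print(sqf_dispatch([a1, a2, a3], "request 4", backlog))  # goes to A3
print(sqf_dispatch([a1, a2, a3], "request 5", backlog))  # backlog, prints None
```

A real balancer would also decrement `queue_len` when a response completes and then drain the backlog; that part is left out to keep the sketch short.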

Benchmarks

This had to be benchmarked. Luckily I have enough hardware at home, so we have 4 boxes joining the ring:

  • client (.23): a Mac Mini, 1.2GHz, 100Mbit
  • proxy (.27): AMD64 3000+, Linux 2.6.x, 1Gbit
  • backend-1 (.22): Intel P4 1.2GHz, WinXP 32-bit
  • backend-2 (.25): AMD64 X2, 3500+, Win2003 64-bit

The backends are running Apache 2.2.x taken from the MSI, mod_status enabled.

The proxy is lighty 1.5.0-r1435 with mod-proxy-core:

$SERVER["socket"] == ":1445" {
  proxy-core.protocol = "http"
#  proxy-core.balancer = "round-robin"
  proxy-core.balancer = "sqf"
  proxy-core.backends = ( "192.168.178.25:80", "192.168.178.22:8080" )
#  proxy-core.backends = ( "192.168.178.22:8080" )
#  proxy-core.backends = ( "192.168.178.25:80" )
  proxy-core.max-pool-size = 32
}

The backends are serving the 44-byte index.html which is in the htdocs/ folder by default.

The client is always running the same command:


$ ab -k -n 100000 -c 16 http://192.168.178.27:1445/

As a first test we only activate .25, the dual-core box:


Requests per second: 2833.60 [#/sec] (mean)

Only .22 [my 3-year-old Centrino notebook] gives:


Requests per second: 1249.45 [#/sec] (mean)

RR balances the requests equally over both hosts, so we expect at most double the request rate of the slowest backend.

balancer   req/s   req .22   req .25   %idle .27
RR          2122     50122     49896         50%
SQF         3213     30970     69048         25%
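To put the table next to the single-backend runs, here is a small Python calculation. All numbers come from the measurements above; the comparison against a "capacity share" is only my way of reading them, not something the balancer computes.

```python
# Standalone throughput from the single-backend runs (req/s).
solo = {".22": 1249.45, ".25": 2833.60}

# RR hands every backend the same number of requests, so the slowest
# backend caps the total at (number of backends) x (its own rate).
rr_ceiling = len(solo) * min(solo.values())

# Requests each backend received under SQF, taken from the table.
sqf_requests = {".22": 30970, ".25": 69048}
total = sum(sqf_requests.values())
sqf_share = {host: n / total for host, n in sqf_requests.items()}

# Share each backend would get if load were split by standalone rate.
capacity_share = {host: rate / sum(solo.values()) for host, rate in solo.items()}

print(f"RR ceiling: {rr_ceiling:.0f} req/s")
for host in solo:
    print(f"{host}: SQF share {sqf_share[host]:.1%}, "
          f"capacity share {capacity_share[host]:.1%}")
```

The RR ceiling comes out at roughly 2499 req/s, and the measured 2122 req/s stays below it. Under SQF the requests split about 31% to 69%, close to the ratio of the two standalone rates.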

You can see how SQF adjusts to the capabilities of each backend and balances nicely, while RR just does its thing and ends up with a lot less throughput.

BTW: If keep-alive is disabled, throughput drops from 3213 to 2678 req/s with SQF and from 2122 to 1835 req/s with RR.

The proxy server (our lighty) is using its CPU very well, and I have already found ways to optimize the proxy code to use less CPU. That won’t affect the results of this benchmark a lot, as the backends are at 100% already.