Threaded stat()
Just as a proof of concept I implemented a threaded stat() call. It is a bit of a hack currently, but it looks promising when I look at the performance data:
avg-cpu: %user %nice %system %iowait %steal %idle
5.00 0.00 26.60 68.40 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.60 66.90 1.60 13019.20 22.40 6.36 0.01 190.39 6.10 88.20 14.49 99.28
sdb 0.00 0.60 66.60 1.60 13061.60 22.40 6.38 0.01 191.85 14.09 208.82 14.67 100.04
In http://blog.lighttpd.net/articles/2007/01/27/accelerating-small-file-transfers we tried the same without an async stat(), using fcgi-stat-accel instead. With the threaded stat() I moved the code into lighttpd itself, which removes the external communication and keeps everything inside lighttpd.
name               Throughput    util%  iowait%
-----------------  ------------  -----  -------
no stat-accel      12.07MByte/s  81%
stat-accel (tcp)   13.64MByte/s  99%    45.00%
stat-accel (unix)  13.86MByte/s  99%    53.25%
threaded-stat      14.32MByte/s  99%    68.40%
(larger is better)
Implementation
In stat_cache.c I start separate threads for handling the stat() call, 4 threads to be exact.
stat_cache_get_entry() first checks its cache to see whether the file is already known. If not, it pushes the filename into the stat_cache_queue and returns HANDLER_WAIT_FOR_EVENT. On the other end of the stat_cache_queue is one of the 4 stat() threads, which runs the stat() and pushes the connection back into the joblist_queue. In the main loop, right where the poll() call is started, there is now a handler for this queue which simply activates all connections that are in it.
This way we made the stat() call itself async and can leave the rest of the code as is. Up to now we only get the inode into the fs-buffers, as in the other examples; we are not handling the full stat-cache updates in the thread.
gpointer stat_cache_thread(gpointer _srv) {
    server *srv = (server *)_srv;
    stat_job *sj = NULL;

    /* take the stat-job-queue */
    GAsyncQueue *inq = g_async_queue_ref(srv->stat_queue);
    GAsyncQueue *outq = g_async_queue_ref(srv->joblist_queue);

    /* get the jobs from the queue */
    while ((sj = g_async_queue_pop(inq))) {
        /* let's see what we have to stat */
        struct stat st;
        /* remember the connection, sj must not be touched after the free */
        gpointer con = sj->con;

        /* don't care about the return code for now */
        stat(sj->name->ptr, &st);

        stat_job_free(sj);

        /* hand the connection back to the main loop */
        g_async_queue_push(outq, con);
    }

    return NULL;
}
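The same hand-off can be sketched without GLib. What follows is a minimal, self-contained sketch using plain pthreads, not the code from stat_cache.c: the names job, queue, stat_pool, stat_worker and run_demo are made up for illustration, and a job with a negative con_id is an assumed shutdown signal that the original code does not have.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical job: a path to stat() plus an opaque connection id. */
typedef struct job {
    char path[256];
    int con_id;
    struct job *next;
} job;

/* Minimal blocking FIFO, standing in for GAsyncQueue. */
typedef struct {
    job *head, *tail;
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
} queue;

typedef struct { queue in, out; } stat_pool;

static void queue_init(queue *q) {
    q->head = q->tail = NULL;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
}

static void queue_push(queue *q, job *j) {
    j->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = j; else q->head = j;
    q->tail = j;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Blocks until a job is available. */
static job *queue_pop(queue *q) {
    pthread_mutex_lock(&q->lock);
    while (!q->head) pthread_cond_wait(&q->nonempty, &q->lock);
    job *j = q->head;
    q->head = j->next;
    if (!q->head) q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return j;
}

static job *job_new(const char *path, int con_id) {
    job *j = calloc(1, sizeof(*j));
    if (!j) abort();
    strncpy(j->path, path, sizeof(j->path) - 1);
    j->con_id = con_id;
    return j;
}

/* Worker: stat() warms the fs-buffers, then the job is handed back. */
static void *stat_worker(void *arg) {
    stat_pool *p = arg;
    for (;;) {
        job *j = queue_pop(&p->in);
        if (j->con_id < 0) { free(j); return NULL; } /* shutdown signal */
        struct stat st;
        (void)stat(j->path, &st); /* return code ignored on purpose */
        queue_push(&p->out, j);
    }
}

/* Push n jobs through the workers and count how many come back. */
static int run_demo(int workers, int n) {
    stat_pool p;
    pthread_t tids[16];
    int done = 0;
    if (workers < 1) workers = 1;
    if (workers > 16) workers = 16;
    queue_init(&p.in);
    queue_init(&p.out);
    for (int i = 0; i < workers; i++)
        pthread_create(&tids[i], NULL, stat_worker, &p);
    for (int i = 0; i < n; i++) queue_push(&p.in, job_new("/", i));
    for (int i = 0; i < n; i++) { free(queue_pop(&p.out)); done++; }
    for (int i = 0; i < workers; i++) queue_push(&p.in, job_new("", -1));
    for (int i = 0; i < workers; i++) pthread_join(tids[i], NULL);
    pthread_mutex_destroy(&p.in.lock);  pthread_cond_destroy(&p.in.nonempty);
    pthread_mutex_destroy(&p.out.lock); pthread_cond_destroy(&p.out.nonempty);
    return done;
}
```

Calling run_demo(4, 100) pushes 100 jobs through 4 workers. In a real server the consumer of the out queue would be the main loop, which re-activates the waiting connection instead of freeing the job.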
Accelerating Small File-Transfers
Thanks to some help from an IRC channel (#lighttpd at irc.freenode.net) we solved another long-standing problem:
As lighttpd is an event-based web-server, we have problems when it comes to blocking operations. In 1.5.0 we added async sendfile() operations, which helps a lot for large files. For small files most of the time is spent on the initial stat() call, which has no async interface.
Fobax submitted a nice solution for this problem: move the stat() to a fastcgi app which returns with X-LIGHTTPD-send-file: and hands the request back to lighttpd. The fastcgi can block and spend some time while lighttpd moves on with the other requests. When the fastcgi returns, the information for the stat() call is in the fs-buffers and lighttpd doesn’t block on the stat() anymore.
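The core of such a helper is small. Below is a minimal sketch of the idea, not the actual fcgi-stat-accel source: stat_accel_response is a hypothetical function name, the FastCGI protocol handling is left out completely, and answering a failed stat() with a 404 status is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: run the (possibly blocking) stat() and write
 * CGI-style response headers into buf.
 * Returns the number of bytes written, or -1 if buf is too small. */
static int stat_accel_response(const char *path, char *buf, size_t len) {
    struct stat st;
    int n;

    if (stat(path, &st) == 0) {
        /* the inode is in the fs-buffers now, hand the request back */
        n = snprintf(buf, len, "X-LIGHTTPD-send-file: %s\r\n\r\n", path);
    } else {
        /* assumption: report a missing file as 404 */
        n = snprintf(buf, len, "Status: 404 Not Found\r\n\r\n");
    }

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The backend process blocks inside stat() while lighttpd keeps serving other connections. By the time the header arrives, lighttpd's own stat() on the same path should be answered from the cache.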
All this is documented by darix in the wiki at HowtoSpeedUpStatWithFastcgi
This works with mod_fastcgi in 1.4.0 or with mod-proxy-core in 1.5.0 + aio.
For 1.5.0 I added fcgi-stat-accel to svn and to the cmake build.
I start it on port 1029 as a first test round. The -C 1 is to start only one thread in the back to see the impact later.
$ ./build/spawn-fcgi -f ./build/fcgi-stat-accel -p 1029 -C 1
As config on lighttpd side we have to enable X-Sendfile and keep a few connections open in the pool.
$SERVER["socket"] == ":1025" {
  $HTTP["url"] =~ "^/seek-bound/" {
    proxy-core.protocol = "fastcgi"
    proxy-core.backends = ( "127.0.0.1:1029" )
    proxy-core.allow-x-sendfile = "enable"
    proxy-core.max-pool-size = 20
  }
}
As test environment I used 100k files, as in the other tests (10G of data overall).
$ http_load -parallel 200 -seconds 60 urls.100k
iostat said:
$ iostat -xm 5
avg-cpu: %user %nice %system %iowait %steal %idle
9.20 0.00 45.80 45.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 73.00 0.00 13278.40 0.00 6.48 0.00 181.90 7.09 98.30 13.71 100.08
sdb 0.00 0.00 69.20 0.00 12625.60 0.00 6.16 0.00 182.45 13.63 194.71 14.46 100.08
We are limited by the disks now. Perhaps we can reduce the CPU usage a bit more by using unix domain sockets instead of TCP:
avg-cpu: %user %nice %system %iowait %steal %idle
8.19 0.00 38.56 53.25 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 1.00 67.63 4.30 12533.07 47.95 6.12 0.02 174.91 10.28 144.44 13.89 99.90
sdb 0.00 1.00 66.13 4.30 12442.76 47.95 6.08 0.02 177.35 11.92 168.46 14.18 99.90
The system time drops by about 7 percentage points (from 45.80% to 38.56%), good enough.
Summary
Thanks to Fobax’s great idea I can finally max out my two disks. If you have more disks the impact will be a lot larger. Give it a try.
name               Throughput    util%
-----------------  ------------  -----
no stat-accel      12.07MByte/s  81%
stat-accel (tcp)   13.64MByte/s  99%
stat-accel (unix)  13.86MByte/s  99%
Compression of dynamic content
It looks like a few changes won’t make it into trunk/ before I leave for vacation. But you should know what is in the pipeline and what you want to wait for:
- HTTP Response filtering is implemented
- HTTP/1.1 chunking becomes a module
- compression of dynamic content
This will add compression not only for mod_proxy_core and its backends (FastCGI, SCGI, HTTP, AJP13) but also to internally generated content like the directory listings.
With these changes we will become more and more stream-based. Or, as JDD called it: The Web is a Pipe
Lighttpd powers 6 Alexa Top 250 sites
According to the latest statistics from Netcraft’s Web Server Survey, lighttpd is #12 among the most used web-server software packages.
But who is running lighttpd and for what purpose ?
The Ranking
The Alexa Global Top 500 lists a few sites which are already known from PoweredByLighttpd in our wiki
#6 is youtube, which uses lighttpd for sending out the static content (the images, the videos, ...). They use a patched version which handles their load better.
#12 is wikipedia. If you open en.wikipedia.org with firebug and check the “All->Net” tab, you will see upload.wikimedia.org. All the image work is handled there: thumbnail generation, delivery.
#49 is imageshack.us
SourceForge.net is at #80. The whole site runs on lighttpd: dynamic content via PHP and the usual static content.
#132 is sendspace.com, a huge file-sharing site. They care a lot about concurrent access to a large set of large files. You don’t want to hear their disk-backends seeking. :)
At #221 we have our first brother from the family of torrent sites: mininova.org. They are a long-running lighttpd user and are leading the group:
... and those were just the ones I could find right away. :)
The Numbers
The December Survey from Netcraft talks about 178,619 detected servers. I know that quite a few of them are domain-hosters which are using a single IP to park a few thousand domain-names.
The Others
But who is behind those numbers ? I am in contact with some of the owners of the sites mentioned above and want to write an article about the different users of lighttpd: how they are using it, what they would like to see, what they love, ... If you are interested and are listed in Alexa before my personal site, drop me a mail at jan@kneschke.de or just add a comment with your mail address below.
And who is on place #15,276 ?
1.5.0 goes cmake
It is a tradition now to change the build-system of lighttpd on each major release. For now we have the autotools as the user-visible build-system and scons as the system for the developers.
Currently we are testing cmake as a replacement for the scons part.
Build Systems
Before you can build a C program you first have to find out which functions the system you compile on supports. Not only does UNIX come in various flavours, there is also Windows.
The natural way on UNIX is using autotools (autoconf, automake, ...), which create a shell script (configure) which generates the Makefiles, which are processed by make to build the application.
Using automake reduced the pain of autoconf when it was released, but it is nonetheless still enough pain to look for something else. Not to forget that win32 and shell scripts aren’t real friends.
scons
SCons goes another route. It replaces make and the autotools with a python-based build-system. You can do everything with just a few lines of python code. Very neat.
The bad side: its development is more or less sleeping. The 0.96.9x branch, which fixes various nasty bugs, has been marked unstable since 2004-08-22, the release of 0.96.1.
cmake
cmake is more relaxed and doesn’t want to solve the whole problem. It does the configure checks and leaves the building to the native build-system. On Unix it is either make or kdevelop3, on MacOSX it is XCode and on windows it is nmake.
As an extra, cmake supports basic packaging and has very nice integration with test-tools like Dart.
Cross-Compiling
As a proof-of-concept I’ve added cmake support to trunk and used the openwrt SDK to build lighttpd with openwrt.
$ cmake .
$ rm CMakeCache.txt
$ OPENWRT=.../OpenWrt-SDK-Linux-x86_64-1/staging_dir_mipsel/ \
  CC=$OPENWRT/bin/mipsel-linux-gcc \
  LD=$OPENWRT/bin/mipsel-linux-uclibc-ld make
Command Line Options
$ cmake -L .
...
CMAKE_INSTALL_PREFIX:PATH= ...
WITH_BZIP:BOOL=OFF
WITH_MYSQL:BOOL=OFF
WITH_OPENSSL:BOOL=OFF
WITH_PCRE:BOOL=ON
WITH_SQLITE3:BOOL=OFF
WITH_WEBPAV_PROPS:BOOL=OFF
WITH_XATTR:BOOL=OFF
WITH_ZLIB:BOOL=ON
$ cmake -DWITH_BZIP:BOOL=ON \
  -DCMAKE_INSTALL_PREFIX:PATH=/home/jan/l-1.5.0-cmake/
Static Linking
To make it easier for embedded systems which don’t have support for dlopen() I added the option:
$ cmake -DBUILD_STATIC:BOOL=ON .
It will build all the modules as static libraries which are linked into the server at build-time. server.modules is used to initialize the modules as before.
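Without dlopen() the server needs another way to get from a module name in server.modules to its init function. One common approach is a compiled-in lookup table. The sketch below only illustrates that general technique and is not how lighttpd's static build is actually wired up: plugin_init_fn, static_plugins, find_static_plugin and load_modules are hypothetical names, and the two init functions are empty stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical init signature, the real plugin API differs. */
typedef int (*plugin_init_fn)(void);

static int inits_run = 0;

/* Stand-ins for module init functions that were linked in statically. */
static int mod_indexfile_init(void)  { inits_run++; return 0; }
static int mod_staticfile_init(void) { inits_run++; return 0; }

typedef struct {
    const char *name;
    plugin_init_fn init;
} static_plugin;

/* The table replaces the dlopen()/dlsym() lookup. */
static const static_plugin static_plugins[] = {
    { "mod_indexfile",  mod_indexfile_init },
    { "mod_staticfile", mod_staticfile_init },
};

static plugin_init_fn find_static_plugin(const char *name) {
    size_t count = sizeof(static_plugins) / sizeof(static_plugins[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(static_plugins[i].name, name) == 0)
            return static_plugins[i].init;
    }
    return NULL;
}

/* Initialise the configured modules in order.
 * Returns 0 on success, -1 if a module is not linked in or its init fails. */
static int load_modules(const char *const *names, size_t n) {
    for (size_t i = 0; i < n; i++) {
        plugin_init_fn init = find_static_plugin(names[i]);
        if (!init || init() != 0) return -1;
    }
    return 0;
}
```

The configuration keeps its meaning: the list of names still decides which modules get initialized and in which order, only the lookup happens in the table instead of the dynamic linker.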
1.5.0 works on win32 natively - again
Half a year ago I was traveling a bit and tried to get lighty to compile natively on win32.
Some time has passed and I concentrated on the other stuff in the 1.5.0 tree, leaving the nasty win32 code in place for someone to pick up. Ben Harper aka rogojin has picked it up and released a win32 installer for the latest pre-release.
A simple test shows that static files are working nicely and that http-proxying with mod-proxy-core works too. Nice work, Ben.
Faster Web 2.0
In Faster FastCGI I talked about using temp-files in /dev/shm to reduce the overhead of large FastCGI requests. Robert implemented it right away and it is available in the latest pre-release.
Having woken up far too early and having the first coffee, I shared some ideas on how this could be useful to accelerate AJAX-based applications.
Large Response content
Robert already did some benchmarking and it looks like X-LIGHTTPD-send-temp-file helps if you have to send more than 16kbyte of content. Otherwise it fits into one FastCGI packet and into the TCP recv-buffers. Boring, right ? Let’s take the next step and move the idea a bit further.
Pre-generating content
At the railsconf in June Zed Shaw and I discussed some ways to accelerate rails apps and funky Web2.0 apps in particular.
Usually you have dependencies between components. Let’s say you are writing a mail app and you are polling the “you got mail” service and want to show a fade-in for it and at the same time refresh the folder-list to show the new mail-count.
We want to play nice and only load the components which really have changed, so you have 1 to 3 XMLHTTPRequests:
- the “you got mail” fade-in
- if there is new mail, the folder list gets updated
- if the new mail is in the current folder, the folder view gets updated too
In the AJAX world you have smaller requests, but all of them have to load the session first before they can generate their content. What if you turn it around and move the above logic back into the server, so that the “you got mail” request loads the session once and generates the content for all 3 parts, as it knows that the app will ask for it anyway in the next 1-2 seconds ?
It would make the folder-list and the folder-view requests instant, as they just have to check if the content-file exists and can stream it out with X-LIGHTTPD-send-temp-file right away.
Now you just have to mix in some cleanup based on mtime in case the temp-file is not fetched [perhaps memcached ?], and find a way to guess the filename without reading the session first [that’s the expensive part we want to get rid of], ...
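The fast path of the folder-list and folder-view requests could look roughly like this. It is a minimal sketch under assumptions: pregenerated_header is a hypothetical helper, the caller already knows the path of the pre-generated file (how to guess it cheaply is exactly the open question above), and max_age is an assumed staleness limit based on mtime.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Returns 1 and fills `header` if a fresh pre-generated file exists,
 * 0 if it is missing, not a regular file or older than max_age seconds,
 * and -1 if `header` is too small. */
static int pregenerated_header(const char *path, time_t now, time_t max_age,
                               char *header, size_t len) {
    struct stat st;
    int n;

    if (stat(path, &st) != 0 || !S_ISREG(st.st_mode)) return 0;
    if (now - st.st_mtime > max_age) return 0; /* stale, generate it again */

    n = snprintf(header, len, "X-LIGHTTPD-send-temp-file: %s\r\n\r\n", path);
    return (n < 0 || (size_t)n >= len) ? -1 : 1;
}
```

On a return value of 1 the backend sends only the header and is done. On 0 it falls back to the slow path: load the session and generate the content as before.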
Read Ahead
In a way this is like read-ahead on hard disks, prefetch in CPUs or even in browsers.
Hot or not ?
Ok, what do you think about this idea ? Do you have problems that could be solved this way ?
PRE-RELEASE: lighttpd-1.5.0-r1477.tar.gz
mod-proxy-core
Robert Jakabosky fixed and improved mod-proxy-core a lot since the last pre-release:
- fixed unix-socket support
- added AJP13 and SCGI support
- fixed some nasty bugs
- added documentation
- added X-LIGHTTPD-send-temp-file
POSIX Async IO
I added native support for POSIX AIO which might bring async io for more platforms. While Linux AIO is pretty stable the POSIX aio support is pretty experimental. Perhaps it compiles for you.
I tried to compile it on Linux and FreeBSD.
server.network-backend = "posix-aio"
Try it
Check if it compiles and works for you.
http://www.lighttpd.net/download/lighttpd-1.5.0-r1477.tar.gz
Faster FastCGI
While I was throwing away some bogus data-copy operations from the mod-proxy-core code I stumbled over a simple question:
Why do we copy the HTTP response data from the backends at all ?
We are just forwarding them in most cases without touching them.
How about:
HTTP/1.1 200 OK
Content-Type: text/html
X-LIGHTTPD-send-tempfile: /dev/shm/fcgi-output/j37f467d
... and lighty removing the file when it has sent it.
/dev/shm is memory and the application is writing to it and only passes a reference to the webserver where it can take the content from.
Especially for large content which is generated on the fly this might help a lot.
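From the application's side the idea could look like the following minimal sketch. The function emit_tempfile_response and the fcgi-output-XXXXXX file name pattern are assumptions made for illustration. The target directory is a parameter, so /dev/shm/fcgi-output from the example above is just one possible value.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Write `body` to a fresh temp file under `dir` and build the response
 * header that tells the web server where to pick the content up.
 * `path` receives the name of the created file.
 * Returns 0 on success, -1 on error (no file is left behind). */
static int emit_tempfile_response(const char *dir, const char *body,
                                  char *path, size_t path_len,
                                  char *header, size_t header_len) {
    size_t left = strlen(body);
    int fd;
    int n = snprintf(path, path_len, "%s/fcgi-output-XXXXXX", dir);

    if (n < 0 || (size_t)n >= path_len) return -1;
    fd = mkstemp(path); /* replaces the XXXXXX with a unique suffix */
    if (fd < 0) return -1;

    while (left > 0) {
        ssize_t w = write(fd, body, left);
        if (w <= 0) { close(fd); unlink(path); return -1; }
        body += w;
        left -= (size_t)w;
    }
    if (close(fd) != 0) { unlink(path); return -1; }

    n = snprintf(header, header_len,
                 "X-LIGHTTPD-send-tempfile: %s\r\n\r\n", path);
    if (n < 0 || (size_t)n >= header_len) { unlink(path); return -1; }
    return 0;
}
```

The application only sends the short header through the FastCGI connection. Removing the file after it has been sent is the job of the web server in this proposal, so the sketch does not unlink it on success.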
Crazy or Cool
Now it takes someone to implement and benchmark it. Adding it is easy: just add the header filter to mod_fastcgi and set is_temp on the file-chunk.
COMET meets mod_mailbox
Some time ago we got a request on how to implement COMET with lighttpd. I responded with an idea about a mod_multiplex which would let the client open a COMET-channel and give the backend the possibility to feed multiple channels at once, without the client having to poll for new data.
Basically it would separate the HTTP Request-Response cycle from the underlying connection. HTTP would be used to open the connection and reopen it in case it went away, but otherwise it would be just a data-channel for the JavaScript/AJAX content we want to send to the client when WE (the content-provider) want.
Let’s quote my understanding of COMET again:
My idea on this is to decouple the request from the COMET stream. AFAI understand COMET it is a ‘one-receiver-multiple-senders’ concept. The channel (a HTTP-response) is kept-open while the server/app can send multiple responses even without browser interaction to the client.
This is what you see in a web-chat application or what is used to send out stock-quotes in real-time to a large group of users.
In the classic HTTP world you either have to poll every few seconds for new data or you keep the connection open and let it stream the data to you. Polling is not instant and generates load even if no data is available. Streaming binds a FastCGI (or similar) backend to the connection, which limits the number of parallel users.
Decoupling the Connection from the content
We want
- a persistent connection between server and client to minimize the setup costs (keep-alive) and have immediate responses
- a way to send data from a backend to multiple connections
- to run a backend only to generate content
As this doesn’t fit into the classic model, we have to break it a bit. Classic means:
1. server reads HTTP Request
2. server forwards the request to the backend and waits for its response
3. server sends HTTP Response to the client
As soon as the backend closes its connection to the server, the server will assume that there is no more data to transfer and wait for the next client-request.
That’s what we want to change. We want to decouple the backend from the client-connection. To take it one step further we want to implement something we already know very well: a mailbox.
Queuing messages in a mailbox
If you are at home when the postman rings twice ... hmm, let’s start again.
If you are at home when the postman delivers a letter you can read it right away. He rings the bell, you say hello, take the letter, read it.
If you are working hard in the office, the postman will deliver the letter too, just to your mailbox. Actually several companies deliver letters, packages, ... all to your mailbox. You’ll pick them up when you come home.
mod_mailbox
First we need a name for the mailbox. The client opens a mailbox on the server, the server ties the client connection to this mailbox and sends all the data it gets for this mailbox right away to the client.
If the client drops the connection, it either re-establishes it or the server will remove the mailbox after some time.
A backend is started as usual and can send its response to multiple mailboxes and doesn’t have to care about connections being up or down. It only delivers to mailboxes (queues) managed by the mod_mailbox.
In case the backend delivered new content into a mailbox while the connection was temporarily closed, it will be sent as soon as the client re-opens the mailbox.
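The queueing behaviour described above can be sketched in a few lines of C. Since mod_mailbox does not exist yet, everything here is hypothetical: the mailbox struct, the function names and send_to_client, which only counts messages instead of writing to a socket. Removing a mailbox after a timeout is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct msg {
    char *data;
    struct msg *next;
} msg;

/* Hypothetical mailbox: a named queue plus an optional attached client. */
typedef struct {
    const char *name;
    msg *head, *tail; /* messages waiting for the client */
    int connected;    /* is a client connection attached ? */
    size_t sent;      /* messages handed to the client so far */
} mailbox;

/* Stand-in for writing to the client connection. */
static void send_to_client(mailbox *mb, const char *data) {
    (void)data;
    mb->sent++;
}

/* Send everything that is queued, as long as a client is attached. */
static void mailbox_flush(mailbox *mb) {
    while (mb->connected && mb->head) {
        msg *m = mb->head;
        mb->head = m->next;
        if (!mb->head) mb->tail = NULL;
        send_to_client(mb, m->data);
        free(m->data);
        free(m);
    }
}

/* Backend side: deliver without caring whether the client is connected. */
static int mailbox_deliver(mailbox *mb, const char *data) {
    size_t n = strlen(data) + 1;
    msg *m = malloc(sizeof(*m));
    if (!m) return -1;
    m->data = malloc(n);
    if (!m->data) { free(m); return -1; }
    memcpy(m->data, data, n);
    m->next = NULL;
    if (mb->tail) mb->tail->next = m; else mb->head = m;
    mb->tail = m;
    mailbox_flush(mb);
    return 0;
}

/* Client side: (re-)opening the mailbox drains what has piled up. */
static void mailbox_open(mailbox *mb)  { mb->connected = 1; mailbox_flush(mb); }
static void mailbox_close(mailbox *mb) { mb->connected = 0; }

static size_t mailbox_pending(const mailbox *mb) {
    size_t n = 0;
    for (const msg *m = mb->head; m; m = m->next) n++;
    return n;
}
```

With an open mailbox a delivered message goes out immediately. After mailbox_close() the messages pile up in the queue, and the next mailbox_open() sends them in the order they were delivered.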
Can I use it ?
Not yet. This is an idea for how to implement COMET in a nice, well-performing fashion. It will be implemented on the way to 1.5.0. Meanwhile I’m looking for comments on the idea and on whether it matches your needs for COMET and AJAX.