<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>lighty's life: Async IO on Linux </title>
    <link>http://blog.lighttpd.net/articles/2006/11/09/async-io-on-linux</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description></description>
    <item>
      <title>Async IO on Linux </title>
      <description>&lt;p&gt;&lt;em&gt;trunk/&lt;/em&gt; just got support for Linux native &lt;span class="caps"&gt;AIO&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;I implemented async IO on top of libaio, which is a minimal wrapper around the aio syscalls of the 2.6.x kernels.&lt;/p&gt;


&lt;h3&gt;Implementation&lt;/h3&gt;

	&lt;p&gt;It was a bit tricky to get it working as libaio is basically undocumented, but hey &amp;#8230; that&amp;#8217;s why we are hackers :)&lt;/p&gt;


	&lt;p&gt;The async file IO support is part of Linux 2.6.9 and later and should be available on every recent Linux box. A separate library called libaio provides very simple wrappers and is used as the base for the new network backend.&lt;/p&gt;


	&lt;p&gt;The idea is:&lt;/p&gt;


	&lt;ol&gt;
	&lt;li&gt;create a buffer in /dev/shm and mmap() it&lt;/li&gt;
		&lt;li&gt;start an async read() from the source file to the mmap() buffer&lt;/li&gt;
		&lt;li&gt;wait until the data is ready &lt;/li&gt;
		&lt;li&gt;use sendfile() to send the data from /dev/shm to the network socket&lt;/li&gt;
	&lt;/ol&gt;


	&lt;p&gt;Important for the performance: the data is never copied into user space. We only move it from one side of the kernel to the other side.&lt;/p&gt;
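
	&lt;p&gt;A rough Python sketch of the four steps above, for readers who want to see the data flow. It is not the lighttpd code: a worker thread calling preadv() stands in for the libaio calls (the Python standard library has no libaio binding), and the names send_chunk, CHUNK and shm_dir are made up for this example. It assumes Linux and Python 3.7 or later.&lt;/p&gt;

&lt;pre&gt;
```python
import mmap
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 512 * 1024  # hypothetical buffer size, same as the "aio [512k]" runs


def send_chunk(src_path, offset, sock, shm_dir="/dev/shm"):
    """Stage one chunk of src_path in a tmpfs file, then sendfile() it to sock."""
    with tempfile.TemporaryFile(dir=shm_dir) as shm, open(src_path, "rb") as src:
        # 1. create a buffer in /dev/shm and mmap() it
        shm.truncate(CHUNK)
        buf = mmap.mmap(shm.fileno(), CHUNK)
        try:
            # 2. start the read in the background
            #    (assumption: a thread replaces the real async read)
            with ThreadPoolExecutor(max_workers=1) as pool:
                pending = pool.submit(os.preadv, src.fileno(), [buf], offset)
                # 3. wait until the data is ready
                nbytes = pending.result()
        finally:
            buf.close()
        # 4. sendfile() the staged bytes from the /dev/shm file to the socket
        sent = 0
        while sent != nbytes:
            sent += os.sendfile(sock.fileno(), shm.fileno(), sent, nbytes - sent)
        return sent
```
&lt;/pre&gt;

	&lt;p&gt;Called in a loop with a growing offset, this would stream a whole file chunk by chunk. The sketch only shows the order of the operations; the thread-based read says nothing about the performance of real kernel AIO.&lt;/p&gt;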


&lt;h4&gt;Hack ahead&lt;/h4&gt;

	&lt;p&gt;Sadly I had to add pthread to the dependencies. Having threads in a single-threaded server is a bit strange, but it is necessary.&lt;/p&gt;


	&lt;p&gt;fdevent_poll() waits for fd-events for up to 1s, and while it waits the whole server waits. Waiting for the async notifications is also a blocking call, and we can&amp;#8217;t make either call return as soon as the other one is done.&lt;/p&gt;


	&lt;p&gt;If necessary we start an io-getevent thread which runs in parallel to the fdevent_poll() call. Whichever call returns first interrupts the other one by sending a &lt;span class="caps"&gt;SIGUSR1&lt;/span&gt; to the process. This makes the waiting call (poll() or io_getevents()) return with an &lt;span class="caps"&gt;EINTR&lt;/span&gt;, and we can continue handling the result of one of the two calls.&lt;/p&gt;


&lt;h3&gt;Benchmarks&lt;/h3&gt;

	&lt;p&gt;The testbed is a &lt;span class="caps"&gt;RAID1&lt;/span&gt; (Linux md) over two disks in the following system:&lt;/p&gt;


	&lt;ul&gt;
	&lt;li&gt;&lt;span class="caps"&gt;ST3160827AS&lt;/span&gt; (SATA, 120Mb each)&lt;/li&gt;
		&lt;li&gt;nVidia Corporation &lt;span class="caps"&gt;CK8S&lt;/span&gt; as &lt;span class="caps"&gt;SATA&lt;/span&gt; controller&lt;/li&gt;
		&lt;li&gt;&lt;span class="caps"&gt;AMD&lt;/span&gt; Athlon&amp;#8482; 64 Processor 3000+&lt;/li&gt;
		&lt;li&gt;Linux 2.6.16.21-0.25-xen (SuSE 10.1)&lt;/li&gt;
	&lt;/ul&gt;


&lt;h4&gt;siege, 700MB&lt;/h4&gt;

	&lt;p&gt;I&amp;#8217;ll compare linux-sendfile vs. linux-aio-sendfile.&lt;/p&gt;


&lt;code&gt;$ siege --reps=1 -c 1 --benchmark http://127.0.0.1:1025/file-700M
&lt;/code&gt;

	&lt;table&gt;
		&lt;tr&gt;
			&lt;td&gt;conc&lt;/td&gt;
			&lt;td&gt;non-aio&lt;/td&gt;
			&lt;td&gt;aio [512k]&lt;/td&gt;
			&lt;td&gt;aio [1M]&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;1&lt;/td&gt;
			&lt;td&gt;52.38 MB/sec [9% idle]&lt;/td&gt;
			&lt;td&gt;89.85 MB/sec [70% idle]&lt;/td&gt;
			&lt;td&gt;107.50 MB/sec [67% idle] &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;2&lt;/td&gt;
			&lt;td&gt;39.94 MB/sec [8% idle]&lt;/td&gt;
			&lt;td&gt;94.52 MB/sec [70% idle]&lt;/td&gt;
			&lt;td&gt;92.74 MB/sec [70% idle]&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;5&lt;/td&gt;
			&lt;td&gt;35.45 MB/sec [7% idle]&lt;/td&gt;
			&lt;td&gt;31.81 MB/sec [86% idle]&lt;/td&gt;
			&lt;td&gt;72.84 MB/sec [70% idle]&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;10&lt;/td&gt;
			&lt;td&gt;.. &lt;/td&gt;
			&lt;td&gt;25.22 MB/sec [82% idle]&lt;/td&gt;
			&lt;td&gt;32.87 MB/sec [90% idle]&lt;/td&gt;
		&lt;/tr&gt;
	&lt;/table&gt;




	&lt;p&gt;More important than the throughput is the &lt;span class="caps"&gt;CPU&lt;/span&gt; time that can be spent with other tasks now.&lt;/p&gt;


&lt;h3&gt;What&amp;#8217;s next?&lt;/h3&gt;

	&lt;p&gt;Next is bug fixing, load testing (more parallel connections), random load, ...&lt;/p&gt;</description>
      <pubDate>Thu, 09 Nov 2006 02:59:00 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:6dd787d3-77a8-4e8f-b7d7-8edd148b39ab</guid>
      <author>jan</author>
      <link>http://blog.lighttpd.net/articles/2006/11/09/async-io-on-linux</link>
      <category>lighttpd</category>
      <category>aio</category>
      <category>linux</category>
      <trackback:ping>http://blog.lighttpd.net/articles/trackback/2199</trackback:ping>
    </item>
    <item>
      <title>"Async IO on Linux " by pulczynski</title>
      <description>We are serving a large number of thumbnail images (5-7kb) from a squid-like directory structure.

Avg file size: ~7kb
Total data: ~1Terabyte
Access: totally random
Disks: 4 sata disks, 16mb cache, linux software raid 0 (striping)
Fs: reiserfs
Deadline scheduler.
Core duo processor.

According to siege  and /proc: 

= 1.4 branch =
cpu waiting 95%
100rq/s (7MB)
64 workers (!!!)
load ~equal to number of workers 

= trunk branch (libaio) =
1000rq/s (70MB)
64 workers 
load: 5

10x FASTER (!!!).

 Aio really kicks ass in these conditions. 

These are only FIRST PRE-TESTS, done 5 times for 30s periods, run from another machine in a GB network. Tomorrow I will do some more tests and send full results. (I had to put the server back online with the old version due to the lack of fastcgi support.)

Too good to be true (and siege sometimes lies)

</description>
      <pubDate>Wed, 22 Nov 2006 19:51:56 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:05782133-cb33-4411-b624-f8a20a175560</guid>
      <link>http://blog.lighttpd.net/articles/2006/11/09/async-io-on-linux#comment-2274</link>
    </item>
    <item>
      <title>"Async IO on Linux " by origo</title>
      <description>Great work Jan, really! I don't rely heavily on file I/O for my site, but any efficiency and speed improvements are good, right? I would also like to urge you to look into the fcgi code as well. I run lighty with fcgi-php over the network and I also have problems with 500 errors. I am more than willing to help test/develop any changes you might consider for this.</description>
      <pubDate>Tue, 21 Nov 2006 10:12:45 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:663249c1-8e10-465b-a819-be636c09044d</guid>
      <link>http://blog.lighttpd.net/articles/2006/11/09/async-io-on-linux#comment-2272</link>
    </item>
    <item>
      <title>"Async IO on Linux " by neu</title>
      <description>I have the same problem as namosys sometimes. I wish lighty would wait until it can get a backend before spitting out 500 right away... there isn't a timeout setting for this, right?</description>
      <pubDate>Mon, 13 Nov 2006 18:40:18 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:a808c9c3-a190-454a-9f6f-7683b910d3e1</guid>
      <link>http://blog.lighttpd.net/articles/2006/11/09/async-io-on-linux#comment-2214</link>
    </item>
    <item>
      <title>"Async IO on Linux " by namosys</title>
      <description>Jan, please take a look at the fcgi code. When the backend (php) runs slow, e.g., blocked by a slow MySQL query, lighty will practically refuse all dynamic requests (returns 500). This problem keeps me from replacing Apache with lighty.</description>
      <pubDate>Sat, 11 Nov 2006 11:34:28 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:64f8d055-58b8-49ce-8133-c25d769792c1</guid>
      <link>http://blog.lighttpd.net/articles/2006/11/09/async-io-on-linux#comment-2204</link>
    </item>
    <item>
      <title>"Async IO on Linux " by Jan Kneschke</title>
      <description>Alberto: Perhaps I'm missing something, but I don't see how epoll() can help us here. poll() vs. epoll() doesn't matter for less than 10 connections (what I was testing here).

Eric, stay on IRC and let us fix it there.</description>
      <pubDate>Thu, 09 Nov 2006 13:20:07 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:6549f0f0-c65a-496e-ae68-c26e885817fb</guid>
      <link>http://blog.lighttpd.net/articles/2006/11/09/async-io-on-linux#comment-2203</link>
    </item>
    <item>
      <title>"Async IO on Linux " by eric</title>
      <description>I tested it out, but lighttpd 1.5 crashes after serving 1 request, and I cannot fetch a file from it with wget or a browser: the resulting file size is 0.

strace gave me this
14:03:25.028334 write(5, "2006-11-09 14:03:25: (src/network_linux_aio.c.166) sendfile failed: Bad file descriptor 6 \n", 91) = 91</description>
      <pubDate>Thu, 09 Nov 2006 06:39:29 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:4a17cc6d-452e-46fe-9473-c36d59f00fbd</guid>
      <link>http://blog.lighttpd.net/articles/2006/11/09/async-io-on-linux#comment-2202</link>
    </item>
    <item>
      <title>"Async IO on Linux " by Alberto</title>
      <description>Great, thanks a lot!

BTW, how does it compare against epoll?</description>
      <pubDate>Thu, 09 Nov 2006 05:32:16 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:664bb5f7-8bdf-491a-b748-2fe50e6edddd</guid>
      <link>http://blog.lighttpd.net/articles/2006/11/09/async-io-on-linux#comment-2201</link>
    </item>
    <item>
      <title>"Async IO on Linux " by eric</title>
      <description>Great job, Jan.
I have no idea how to enable aio.
I set
server.network-backend = "linux-aio-sendfile"
but lighttpd does not know it:
"2006-11-09 13:00:58: (network.c.535) server.network-backend has a unknown value: linux-aio-sendfile"
</description>
      <pubDate>Thu, 09 Nov 2006 05:07:00 +0100</pubDate>
      <guid isPermaLink="false">urn:uuid:200a8a14-634f-44c0-9259-3280f3a0e47b</guid>
      <link>http://blog.lighttpd.net/articles/2006/11/09/async-io-on-linux#comment-2200</link>
    </item>
  </channel>
</rss>
