<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>lighty's life: Tag linux</title>
    <link>http://blog.lighttpd.net/articles/tag/linux</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description></description>
    <item>
      <title>Async IO on Linux </title>
      <description>&lt;p&gt;&lt;em&gt;trunk/&lt;/em&gt; just got support for Linux native &lt;span class="caps"&gt;AIO&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;I implemented async IO on top of libaio, a minimal wrapper around the AIO syscalls of the 2.6.x kernels.&lt;/p&gt;


&lt;h3&gt;Implementation&lt;/h3&gt;

	&lt;p&gt;It was a bit tricky to get it working as libaio is basically undocumented, but hey &amp;#8230; that&amp;#8217;s why we are hackers :)&lt;/p&gt;


	&lt;p&gt;Async file IO support is part of Linux 2.6.9 and later, so it should be available on every recent Linux box. A separate library called libaio provides very simple wrappers around it and is used as the base for the new network backend.&lt;/p&gt;


	&lt;p&gt;The idea is:&lt;/p&gt;


	&lt;ol&gt;
	&lt;li&gt;create a buffer in /dev/shm and mmap() it&lt;/li&gt;
		&lt;li&gt;start an async read() from the source file to the mmap() buffer&lt;/li&gt;
		&lt;li&gt;wait until the data is ready &lt;/li&gt;
		&lt;li&gt;use sendfile() to send the data from /dev/shm to the network socket&lt;/li&gt;
	&lt;/ol&gt;


	&lt;p&gt;Important for performance: the data is never copied into user space. We only move it from one side of the kernel to the other.&lt;/p&gt;
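
	&lt;p&gt;A minimal sketch of the four steps above, in Python. This is not the lighttpd implementation: the Python standard library has no binding for Linux native AIO, so a worker thread stands in for the async read(). The names send_chunk and CHUNK, and the fallback to the default temp directory when /dev/shm is missing, are assumptions made for illustration.&lt;/p&gt;

```python
import mmap
import os
import socket
import tempfile
import threading

CHUNK = 512 * 1024  # hypothetical default; 512k is one of the buffer sizes benchmarked


def send_chunk(src_path, offset, out_sock, chunk=CHUNK):
    """Move up to `chunk` bytes of src_path, starting at offset, to out_sock."""
    # Assumption: use the default temp directory when /dev/shm does not exist.
    shm_dir = "/dev/shm" if os.path.isdir("/dev/shm") else None
    with tempfile.TemporaryFile(dir=shm_dir) as shm:
        # 1. create a buffer in /dev/shm and mmap() it
        shm.truncate(chunk)
        buf = mmap.mmap(shm.fileno(), chunk)
        result = {}

        def reader():
            # 2. stand-in for the async read(): fill the mapped buffer
            with open(src_path, "rb", buffering=0) as src:
                src.seek(offset)
                result["n"] = src.readinto(buf)

        worker = threading.Thread(target=reader)
        worker.start()
        # 3. wait until the data is ready
        worker.join()
        n = result["n"]
        # 4. sendfile() from the /dev/shm file to the network socket
        sent = 0
        while sent != n:
            sent += os.sendfile(out_sock.fileno(), shm.fileno(), sent, n - sent)
        buf.close()
        return n
```

	&lt;p&gt;Calling send_chunk() in a loop with increasing offsets would stream a whole file chunk by chunk. A server could also keep one buffer per connection instead of recreating it for every chunk.&lt;/p&gt;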


&lt;h4&gt;Hack ahead&lt;/h4&gt;

	&lt;p&gt;Sadly I had to add pthread to the dependencies. Having threads in a single-threaded server is a bit strange, but it is necessary.&lt;/p&gt;


	&lt;p&gt;fdevent_poll() waits up to 1s for fd events, and while it waits the whole server waits. Fetching the async notifications is also a blocking call, and there is no way to make both calls return as soon as one of them is done.&lt;/p&gt;


	&lt;p&gt;If necessary we start an io-getevent thread which runs in parallel to the fdevent_poll() call. Whichever call returns first interrupts the other one by sending a &lt;span class="caps"&gt;SIGUSR1&lt;/span&gt; to the process. This makes the waiting calls (poll() and io_getevents()) return with an &lt;span class="caps"&gt;EINTR&lt;/span&gt;, and we can continue handling the result of one of the two calls.&lt;/p&gt;


&lt;h3&gt;Benchmarks&lt;/h3&gt;

	&lt;p&gt;The testbed is a &lt;span class="caps"&gt;RAID1&lt;/span&gt; (Linux md) across two disks. The setup:&lt;/p&gt;


	&lt;ul&gt;
	&lt;li&gt;&lt;span class="caps"&gt;ST3160827AS&lt;/span&gt; (SATA, 120Mb each)&lt;/li&gt;
		&lt;li&gt;nVidia Corporation &lt;span class="caps"&gt;CK8S&lt;/span&gt; as &lt;span class="caps"&gt;SATA&lt;/span&gt; controller&lt;/li&gt;
		&lt;li&gt;&lt;span class="caps"&gt;AMD&lt;/span&gt; Athlon&amp;#8482; 64 Processor 3000+&lt;/li&gt;
		&lt;li&gt;Linux 2.6.16.21-0.25-xen (SuSE 10.1)&lt;/li&gt;
	&lt;/ul&gt;


&lt;h4&gt;siege, 700MB&lt;/h4&gt;

	&lt;p&gt;I&amp;#8217;ll compare linux-sendfile vs. linux-aio-sendfile.&lt;/p&gt;


&lt;code&gt;$ siege --reps=1 -c 1 --benchmark http://127.0.0.1:1025/file-700M
&lt;/code&gt;

	&lt;table&gt;
		&lt;tr&gt;
			&lt;td&gt;concurrency&lt;/td&gt;
			&lt;td&gt;non-aio&lt;/td&gt;
			&lt;td&gt;aio [512k]&lt;/td&gt;
			&lt;td&gt;aio [1M]&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;1&lt;/td&gt;
			&lt;td&gt;52.38 MB/sec [9% idle]&lt;/td&gt;
			&lt;td&gt;89.85 MB/sec [70% idle]&lt;/td&gt;
			&lt;td&gt;107.50 MB/sec [67% idle] &lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;2&lt;/td&gt;
			&lt;td&gt;39.94 MB/sec [8% idle]&lt;/td&gt;
			&lt;td&gt;94.52 MB/sec [70% idle]&lt;/td&gt;
			&lt;td&gt;92.74 MB/sec [70% idle]&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;5&lt;/td&gt;
			&lt;td&gt;35.45 MB/sec [7% idle]&lt;/td&gt;
			&lt;td&gt;31.81 MB/sec [86% idle]&lt;/td&gt;
			&lt;td&gt;72.84 MB/sec [70% idle]&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;10&lt;/td&gt;
			&lt;td&gt;.. &lt;/td&gt;
			&lt;td&gt;25.22 MB/sec [82% idle]&lt;/td&gt;
			&lt;td&gt;32.87 MB/sec [90% idle]&lt;/td&gt;
		&lt;/tr&gt;
	&lt;/table&gt;




	&lt;p&gt;More important than the throughput is the &lt;span class="caps"&gt;CPU&lt;/span&gt; time that can now be spent on other tasks.&lt;/p&gt;


&lt;h3&gt;What&amp;#8217;s next?&lt;/h3&gt;
Next is bug fixing, load testing (more parallel connections), random load, ...</description>
      <pubDate>Thu, 09 Nov 2006 02:59:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:6dd787d3-77a8-4e8f-b7d7-8edd148b39ab</guid>
      <author>jan</author>
      <link>http://blog.lighttpd.net/articles/2006/11/09/async-io-on-linux</link>
      <category>lighttpd</category>
      <category>aio</category>
      <category>linux</category>
      <trackback:ping>http://blog.lighttpd.net/articles/trackback/2199</trackback:ping>
    </item>
  </channel>
</rss>
