<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>lighty's life: Tag concurrent,</title>
    <link>http://blog.lighttpd.net/articles/tag/concurrent%2C</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description></description>
    <item>
      <title>Optimizing Lighty for high-concurrent, large-file downloads</title>
      <description>&lt;p&gt;In lighttpd 1.4.6 we added some modifications for sites that have to serve around 100 files in parallel, each larger than 100 MB.&lt;/p&gt;


	&lt;p&gt;The problem in earlier releases was that, for every request, lighttpd had to wait until the disk had seeked to the right place and read a few hundred kilobytes before it could send them out. With many such requests in parallel, this access pattern was completely thrashing the disk buffering. The IO-wait went sky-high and we were completely bound by disk I/O.&lt;/p&gt;
&lt;p&gt;You could see this by running vmstat:&lt;/p&gt;


&lt;pre&gt;
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  7 306852  51396    152 237004  632  492 34952  7531 6983 11677  3 25  0 72
&lt;/pre&gt;

	&lt;p&gt;As you can see, 72% of the CPU time is IO-wait and 25% is spent in the kernel. Only 3% is spent in userland (lighty).&lt;/p&gt;


	&lt;p&gt;When we investigated the problem, we laid out a plan:&lt;/p&gt;


&lt;pre&gt;
/* Optimizations for the future:
 *
 * adaptive mem-mapping
 *   the problem:
 *     we mmap() the whole file. If someone has a lot of large files and a 32bit
 *     machine, the virtual address space will run out and we will have a failing
 *     mmap() call.
 *   solution:
 *     only mmap 16M in one chunk and move the window as soon as we have finished
 *     the first 8M
 *
 * read-ahead buffering
 *   the problem:
 *     sending out several large files in parallel trashes the read-ahead of the
 *     kernel leading to long wait-for-seek times.
 *   solutions: (increasing complexity)
 *     1. use madvise
 *     2. use an internal read-ahead buffer in the chunk-structure
 *     3. use non-blocking IO for file-transfers
 *   */
&lt;/pre&gt;

	&lt;p&gt;And that&amp;#8217;s what we did step-by-step:&lt;/p&gt;


	&lt;ul&gt;
	&lt;li&gt;instead of mmap()ing the whole file, we use a moving window of 512 KB&lt;/li&gt;
		&lt;li&gt;we use madvise() to tell the kernel to load this area into memory &lt;span class="caps"&gt;NOW&lt;/span&gt;&lt;/li&gt;
		&lt;li&gt;we have a switch to use local buffering instead of asking the kernel to do the buffering&lt;/li&gt;
	&lt;/ul&gt;
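
	&lt;p&gt;The first two steps can be sketched roughly as follows. This is a minimal illustration and not the actual lighttpd code: the names WINDOW_SIZE and send_file_windowed() are made up for this example, error handling is reduced to returning -1, and out_fd is assumed to be a blocking descriptor.&lt;/p&gt;

&lt;pre&gt;
#include &amp;lt;sys/types.h&amp;gt;
#include &amp;lt;sys/mman.h&amp;gt;
#include &amp;lt;sys/stat.h&amp;gt;
#include &amp;lt;unistd.h&amp;gt;

/* hypothetical name; 512 KB is a multiple of the usual page sizes,
 * so every window offset stays page-aligned */
#define WINDOW_SIZE (512 * 1024)

/* Copy the file behind in_fd to out_fd through a moving mmap() window.
 * Returns 0 on success, -1 on error. */
static int send_file_windowed(int in_fd, int out_fd) {
    struct stat st;
    off_t offset = 0;

    if (fstat(in_fd, &amp;amp;st) != 0) return -1;

    while (offset &amp;lt; st.st_size) {
        size_t len = WINDOW_SIZE;
        size_t done = 0;
        char *win;

        /* the last window may be shorter than WINDOW_SIZE */
        if ((off_t)len &amp;gt; st.st_size - offset) len = (size_t)(st.st_size - offset);

        /* map only the current window, not the whole file */
        win = mmap(NULL, len, PROT_READ, MAP_SHARED, in_fd, offset);
        if (win == MAP_FAILED) return -1;

        /* ask the kernel to start reading the window into memory now */
        madvise(win, len, MADV_WILLNEED);

        while (done &amp;lt; len) {
            ssize_t n = write(out_fd, win + done, len - done);
            if (n &amp;lt; 0) { munmap(win, len); return -1; }
            done += (size_t)n;
        }

        munmap(win, len);
        offset += (off_t)len;
    }
    return 0;
}
&lt;/pre&gt;

	&lt;p&gt;A caller would open the file, pass its descriptor together with the client socket and check the return value. A server using non-blocking sockets would additionally have to remember the window position and resume when the socket becomes writable again.&lt;/p&gt;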


If you want to use it, set:
&lt;pre&gt;
server.network-backend = "writev" 
&lt;/pre&gt;

	&lt;p&gt;If you want to try out the local buffering, define LOCAL_BUFFERING in network_writev.c.&lt;/p&gt;
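
	&lt;p&gt;The idea of local buffering can be sketched like this. It is only an illustration under assumptions and not the code behind the switch in network_writev.c: LOCAL_BUF_SIZE and send_file_buffered() are hypothetical names, and out_fd is again assumed to be a blocking descriptor.&lt;/p&gt;

&lt;pre&gt;
#include &amp;lt;sys/types.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;unistd.h&amp;gt;

#define LOCAL_BUF_SIZE (512 * 1024) /* hypothetical buffer size */

/* Copy count bytes starting at offset from in_fd to out_fd through a
 * user-space buffer. Returns 0 on success, -1 on error or short file. */
static int send_file_buffered(int in_fd, int out_fd, off_t offset, off_t count) {
    char *buf = malloc(LOCAL_BUF_SIZE);
    int rc = 0;

    if (buf == NULL) return -1;

    while (count &amp;gt; 0) {
        size_t want = count &amp;gt; LOCAL_BUF_SIZE ? LOCAL_BUF_SIZE : (size_t)count;
        /* one large read per chunk instead of many small ones */
        ssize_t got = pread(in_fd, buf, want, offset);
        ssize_t done = 0;

        if (got &amp;lt;= 0) { rc = -1; break; }

        while (done &amp;lt; got) {
            ssize_t n = write(out_fd, buf + done, (size_t)(got - done));
            if (n &amp;lt; 0) { rc = -1; break; }
            done += n;
        }
        if (rc != 0) break;

        offset += got;
        count -= got;
    }

    free(buf);
    return rc;
}
&lt;/pre&gt;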


	&lt;p&gt;So far we have not put any effort into async file I/O, as the coding impact is quite large and we don&amp;#8217;t expect a big win over the current behaviour.&lt;/p&gt;


	&lt;p&gt;Both approaches result in better caching of the data we need, which reduces the number of seeks and, in the end, the IO-wait.&lt;/p&gt;


	&lt;p&gt;Problem solved, next one :)&lt;/p&gt;</description>
      <pubDate>Fri, 11 Nov 2005 14:28:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:94ecbe62-bee5-4490-abda-d1d0c139cbbd</guid>
      <author>jan</author>
      <link>http://blog.lighttpd.net/articles/2005/11/11/optimizing-lighty-for-high-concurrent-large-file-downloads</link>
      <category>high</category>
      <category>concurrent,</category>
      <category>large-file</category>
      <trackback:ping>http://blog.lighttpd.net/articles/trackback/26</trackback:ping>
    </item>
  </channel>
</rss>
