<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Russ Garrett &#187; tuning</title>
	<atom:link href="http://russ.garrett.co.uk/tag/tuning/feed/" rel="self" type="application/rss+xml" />
	<link>http://russ.garrett.co.uk</link>
	<description></description>
	<lastBuildDate>Wed, 02 Jun 2010 21:20:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Linux Kernel Tuning</title>
		<link>http://russ.garrett.co.uk/2009/01/01/linux-kernel-tuning/</link>
		<comments>http://russ.garrett.co.uk/2009/01/01/linux-kernel-tuning/#comments</comments>
		<pubDate>Thu, 01 Jan 2009 15:44:42 +0000</pubDate>
		<dc:creator>Russ</dc:creator>
				<category><![CDATA[Systems Admin]]></category>
		<category><![CDATA[kernel]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[sysctl]]></category>
		<category><![CDATA[tuning]]></category>

		<guid isPermaLink="false">http://russ.garrett.co.uk/?p=32</guid>
		<description><![CDATA[As promised in my netbooting post, here&#8217;s an annotated walkthrough of the Linux kernel tuning parameters that we use fairly constantly at Last.fm. Many of these parameters are documented in the files under Documentation/ in a Linux source tree; however, it&#8217;s generally a pain to find parameters in that mess, so I will distill some [...]]]></description>
			<content:encoded><![CDATA[<p>As promised in my <a href="/2008/12/03/diskless-web-serving-for-fun-and-profit/">netbooting post</a>, here&#8217;s an annotated walkthrough of the Linux kernel tuning parameters that we use fairly constantly at Last.fm.</p>
<p>Many of these parameters are documented in the files under <a href="http://lxr.linux.no/linux+v2.6.28/Documentation">Documentation/</a> in a Linux source tree; however, it&#8217;s generally a pain to find parameters in that mess, so I will distill some of that here. I&#8217;ll update this as I learn more.</p>
<h1>Networking Tuning</h1>
<p>These are the most important settings, especially if you&#8217;re using Gigabit networking (which everyone should be!). Although these are fairly aggressive, there shouldn&#8217;t be any penalty to applying them to every server (we tend to). They are all sysctl settings.</p>
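<p>Each sysctl name maps onto a file under <code>/proc/sys</code>, with the dots turned into directory separators, so you can check what a machine is currently running with before changing anything. Here is a minimal sketch of that mapping; the function names are made up for illustration, and it does not handle the edge case of network interface names that themselves contain dots.</p>

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a list of whitespace-separated fields."""
    return sysctl_path(name, root).read_text().split()


# On a Linux machine, read_sysctl("net.core.rmem_max") returns a
# one-element list holding the current limit as a string.
print(sysctl_path("net.core.rmem_max").as_posix())
```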

<div class="wp_syntax"><div class="code"><pre class="ini" style="font-family:monospace;">net.core.rmem_max <span style="color: #000066; font-weight:bold;">=</span><span style="color: #660066;"> 16777216</span>
net.core.wmem_max <span style="color: #000066; font-weight:bold;">=</span><span style="color: #660066;"> 16777216</span></pre></div></div>

<p>These are the hard limits on socket buffer space, in bytes, for receive (<code>rmem</code>) and send (<code>wmem</code>) buffers. Of course 16MB <em>per socket</em> sounds like a lot, but most sockets won&#8217;t use anywhere near this much, and it&#8217;s nice to be able to expand if necessary.</p>
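<p>One common way to reason about how much buffer a single connection could usefully fill is the bandwidth-delay product: link speed multiplied by round-trip time. That reasoning is not spelled out above, so treat the following as a rough sketch. The round-trip times are illustrative assumptions, not measurements.</p>

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_seconds):
    """Bandwidth-delay product: bytes that can be in flight on the path."""
    return round(bandwidth_bits_per_s * rtt_seconds / 8)


RMEM_MAX = 16777216  # the limit set above

# Hypothetical paths on a gigabit link (RTT values are assumptions).
for label, rtt in [("same datacentre", 0.001), ("long-haul", 0.1)]:
    need = bdp_bytes(1_000_000_000, rtt)
    fits = need <= RMEM_MAX
    print(f"{label}: {need} bytes in flight, fits under rmem_max: {fits}")
```

<p>Under these assumptions even the long-haul case stays below the 16MB ceiling, while a local connection needs only a small fraction of it.</p>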

<div class="wp_syntax"><div class="code"><pre class="ini" style="font-family:monospace;">net.ipv4.tcp_rmem <span style="color: #000066; font-weight:bold;">=</span><span style="color: #660066;"> <span style="">4096</span> <span style="">87380</span> 16777216</span>
net.ipv4.tcp_wmem <span style="color: #000066; font-weight:bold;">=</span><span style="color: #660066;"> <span style="">4096</span> <span style="">65536</span> 16777216</span></pre></div></div>

<p>These are the corresponding settings for TCP, in the format <code>(min, default, max)</code> bytes. The max value can&#8217;t be larger than the equivalent <code>net.core.{r,w}mem_max</code>.</p>
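<p>If you generate these values from a config management system, it can be worth sanity-checking the triple before applying it. This is a small sketch of such a check, following the rule described above; the class and function names are hypothetical.</p>

```python
from typing import NamedTuple


class TcpBuf(NamedTuple):
    min_bytes: int
    default_bytes: int
    max_bytes: int


def parse_tcp_buf(value):
    """Parse a tcp_rmem/tcp_wmem value such as '4096 87380 16777216'."""
    fields = [int(f) for f in value.split()]
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    return TcpBuf(*fields)


def check_against_core(buf, core_max):
    """Return a list of problems; an empty list means the triple looks sane."""
    problems = []
    if not buf.min_bytes <= buf.default_bytes <= buf.max_bytes:
        problems.append("min, default and max are not in ascending order")
    if buf.max_bytes > core_max:
        problems.append(f"max {buf.max_bytes} exceeds core limit {core_max}")
    return problems


rmem = parse_tcp_buf("4096 87380 16777216")
print(check_against_core(rmem, core_max=16777216))
```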

<div class="wp_syntax"><div class="code"><pre class="ini" style="font-family:monospace;">net.ipv4.tcp_mem</pre></div></div>

<p><strong>Don&#8217;t touch <code>tcp_mem</code></strong> for two reasons: Firstly, unlike tcp_rmem and tcp_wmem it&#8217;s in pages, not bytes, so it&#8217;s likely to confuse the hell out of you. Secondly, it&#8217;s already auto-tuned very well by Linux based on the amount of RAM.</p>
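<p>If you just want to see what the kernel picked, converting the pages to bytes makes the numbers easier to read. A minimal sketch follows: the sample triple is invented for illustration, and the page size is read from the system unless you pass one explicitly.</p>

```python
import os


def tcp_mem_to_bytes(value, page_size=None):
    """Convert a tcp_mem triple (expressed in pages) to bytes."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")  # commonly 4096 on x86
    return tuple(int(pages) * page_size for pages in value.split())


# Hypothetical tcp_mem contents; real values depend on the amount of RAM.
sample = "196608 262144 393216"
for pages, size in zip(sample.split(), tcp_mem_to_bytes(sample, page_size=4096)):
    print(f"{pages} pages = {size // (1024 * 1024)} MiB")
```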

<div class="wp_syntax"><div class="code"><pre class="ini" style="font-family:monospace;">net.ipv4.tcp_max_syn_backlog <span style="color: #000066; font-weight:bold;">=</span><span style="color: #660066;"> 4096</span></pre></div></div>

<p>Increase the number of outstanding SYN requests allowed.<br />
Note: some people (including myself) have used tcp_syncookies to handle the problem of too many legitimate outstanding SYNs, but the Linux documentation warns against this:</p>
<blockquote><p>Note, that syncookies is fallback facility.<br />
It MUST NOT be used to help highly loaded servers to stand<br />
against legal connection rate. If you see synflood warnings<br />
in your logs, but investigation shows that they occur<br />
because of overload with legal connections, you should tune<br />
another parameters until this warning disappear.</p></blockquote>

<div class="wp_syntax"><div class="code"><pre class="ini" style="font-family:monospace;">net.core.netdev_max_backlog <span style="color: #000066; font-weight:bold;">=</span><span style="color: #660066;"> 2500</span></pre></div></div>

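<p>This is the maximum number of packets that can be queued on the input side when an interface receives them faster than the kernel can process them. Raising it helps avoid dropped packets on gigabit Ethernet connections.</p>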
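<p>One way to tell whether the backlog is actually overflowing is to look at <code>/proc/net/softnet_stat</code>, which has one line of hexadecimal counters per CPU. The sketch below assumes the second column is the count of packets dropped because the backlog queue was full; the sample input is invented for illustration.</p>

```python
def softnet_drops(text):
    """Return per-CPU drop counts from the contents of /proc/net/softnet_stat.

    Assumes one line per CPU with hexadecimal columns, where the second
    column counts packets dropped because the backlog queue was full.
    """
    drops = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            drops.append(int(fields[1], 16))
    return drops


# Hypothetical two-CPU sample; on a real machine, pass in the file contents.
sample = "0001a2b3 0000000f 00000002\n0000ffff 00000000 00000001\n"
print(softnet_drops(sample))
```

<p>If the counts keep climbing under normal load, that suggests the backlog limit is too low for the traffic the machine is seeing.</p>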
<h1>VM</h1>

<div class="wp_syntax"><div class="code"><pre class="ini" style="font-family:monospace;">vm.min_free_kbytes <span style="color: #000066; font-weight:bold;">=</span><span style="color: #660066;"> 65536</span></pre></div></div>

<p>This tells the kernel to try to keep 64MB of RAM free at all times. It&#8217;s useful in two main cases:</p>
<ul>
<li>Swap-less machines, where you don&#8217;t want incoming network traffic to overwhelm the kernel and force an OOM before it has time to flush any buffers.</li>
<li>32-bit x86 machines, for the same reason: the kernel can only directly address roughly the first 900MB of RAM (low memory), and many kernel allocations, including network buffers, have to come from there. So you can end up with the bizarre situation of an OOM error with tons of RAM free.</li>
</ul>

<div class="wp_syntax"><div class="code"><pre class="ini" style="font-family:monospace;">vm.swappiness <span style="color: #000066; font-weight:bold;">=</span><span style="color: #660066;"> 0</span></pre></div></div>

<p>It&#8217;s said that altering swappiness can help you when you&#8217;re running under high memory pressure with software that tries to do its own memory management (e.g. MySQL). We&#8217;ve had limited success with this and I&#8217;d much prefer to use software which doesn&#8217;t pretend to know more about your hardware than the OS (e.g. PostgreSQL). Not that I&#8217;m bitter.</p>

<div class="wp_syntax"><div class="code"><pre class="ini" style="font-family:monospace;">vm.overcommit_memory <span style="color: #000066; font-weight:bold;">=</span><span style="color: #660066;"> 1</span></pre></div></div>

<p>The overcommit_memory sysctl isn&#8217;t something you&#8217;ll usually have to change if your software isn&#8217;t insane, but our netboot setup uses it so I thought I&#8217;d mention it. From the documentation:</p>
<ul>
<li><strong>0</strong> &#8211; Heuristic overcommit handling. Obvious overcommits of<br />
address space are refused. Used for a typical system. It<br />
ensures a seriously wild allocation fails while allowing<br />
overcommit to reduce swap usage.  root is allowed to<br />
allocate slightly more memory in this mode. This is the<br />
default.</li>
<li><strong>1</strong> &#8211;   Always overcommit. Appropriate for some scientific<br />
applications.</li>
<li><strong>2</strong> &#8211;   Don&#8217;t overcommit. The total address space commit<br />
for the system is not permitted to exceed swap + a<br />
configurable percentage (default is 50) of physical RAM.<br />
Depending on the percentage you use, in most situations<br />
this means a process will not be killed while accessing<br />
pages but will receive errors on memory allocation as<br />
appropriate.</li>
</ul>
<p>For more info on this, see the <a href="http://lxr.linux.no/linux+v2.6.28/Documentation/vm/overcommit-accounting">overcommit accounting</a> documentation.</p>
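<p>To make mode 2 concrete, here is a rough sketch of the limit described in the quoted documentation: swap plus a percentage of physical RAM. The function name is made up, the machine sizes are hypothetical, and the real kernel makes further adjustments (for huge pages, for instance) that this ignores.</p>

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50):
    """Approximate mode 2 commit limit: swap plus a percentage of RAM."""
    return swap_kb + ram_kb * overcommit_ratio // 100


GIB_KB = 1024 * 1024  # one GiB expressed in kB

# Hypothetical machine with 8 GiB of RAM, with and without 2 GiB of swap.
print(commit_limit_kb(8 * GIB_KB, 2 * GIB_KB))
print(commit_limit_kb(8 * GIB_KB, 0))
```

<p>Note how a swap-less machine at the default percentage can only commit half of its RAM in this mode, which is one reason mode 2 needs care on diskless setups.</p>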
<h1>Disk Tuning</h1>
<p>Really the only thing to note here is to set <code>elevator=deadline</code> in your kernel command line if you&#8217;re using RAID. This changes the IO scheduler to <code>deadline</code>, which has empirically been found to be best for almost all server workloads.</p>
<p>You can probably get a percent or two more out by tuning other settings, but we&#8217;ve found it&#8217;s not worth it.</p>
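<p>To confirm which scheduler a disk ended up with, you can read <code>/sys/block/sda/queue/scheduler</code> (where <code>sda</code> is a placeholder for your device), which lists the available schedulers with the active one in square brackets. A minimal sketch of parsing that line, using a hypothetical function name:</p>

```python
def active_scheduler(text):
    """Return the scheduler name shown in square brackets, or None."""
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None


# Sample line in the format the kernel uses; pass in the file contents
# from a real machine instead.
print(active_scheduler("noop [deadline] cfq\n"))
```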
]]></content:encoded>
			<wfw:commentRss>http://russ.garrett.co.uk/2009/01/01/linux-kernel-tuning/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>
