<?xml version="1.0" encoding="utf-8" ?>

<rss version="2.0" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   >
<channel>
    <title>Esoteric Curio</title>
    <link>http://www.lethargy.org/~jesus/</link>
    <description>Theo's Contributions to Technological Surreality</description>
    <dc:language>en</dc:language>
    <generator>Serendipity 1.1 - http://www.s9y.org/</generator>
    <pubDate>Thu, 25 Sep 2008 13:22:22 GMT</pubDate>

    <image>
        <url>http://www.lethargy.org/~jesus/templates/default/img/s9y_banner_small.png</url>
        <title>RSS: Esoteric Curio - Theo's Contributions to Technological Surreality</title>
        <link>http://www.lethargy.org/~jesus/</link>
        <width>100</width>
        <height>21</height>
    </image>

<item>
    <title>Irony: The website is what?</title>
    <link>http://www.lethargy.org/~jesus/archives/132-Irony-The-website-is-what.html</link>
            <category>Damaged Bits</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/132-Irony-The-website-is-what.html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=132</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=132</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    &lt;p&gt;The Internet is not without a sense of irony:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://thewebsiteisdown.com/&quot;&gt;The website is down.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The website is up.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&lt;span style=&quot;text-decoration: line-through;&quot;&gt;In case they fix it&lt;/span&gt; They fixed it... thewebsiteisup.com just spewed PHP errors at the time I wrote this.&lt;/p&gt;&lt;/blockquote&gt; 
    </content:encoded>

    <pubDate>Sat, 20 Sep 2008 10:40:49 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/132-guid.html</guid>
    
</item>
<item>
    <title>Zetaback. Respect.</title>
    <link>http://www.lethargy.org/~jesus/archives/131-Zetaback.-Respect..html</link>
            <category>OpenSolaris</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/131-Zetaback.-Respect..html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=131</wfw:comment>

    <slash:comments>5</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=131</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    &lt;p&gt;With the release of &lt;a href=&quot;http://opensolaris.org/os/community/zfs/&quot;&gt;ZFS&lt;/a&gt; on Solaris 10, I sat down and marveled at the opportunities for off-site backups.  I have already &lt;a href=&quot;http://lethargy.org/~jesus/archives/114-ZFS.-Respect..html&quot;&gt;written a bit about ZFS detailing why I think it kicks so much ass&lt;/a&gt;.  With zfs send and zfs receive, one can manage block-level incremental backups and restores.  What&#039;s missing?  An elegant hack leveraging that to provide a simple and reliable backup infrastructure for a network of ZFS capable machines (including Mac OS X and FreeBSD now, BTW).&lt;/p&gt;

&lt;p&gt;So, I sat down and wrote &lt;a href=&quot;https://labs.omniti.com/trac/zetaback&quot;&gt;Zetaback&lt;/a&gt; -- which is currently 1032 lines of perl code (including complete documentation) plus a thin agent on remote machines that is 290 lines of perl code (including complete documentation).  I&#039;d like to note that the only reason there is documentation, let alone complete documentation, is because of &lt;a href=&quot;http://omniti.com/is/eric-sproul&quot;&gt;Eric Sproul&lt;/a&gt;.  This really demonstrates to me that &quot;Keep It Simple Stupid&quot; still works for important tasks.&lt;/p&gt;

&lt;p&gt;Zetaback is a rather full-featured backup and restore system.  It can manage multiple hosts, multiple ZFS filesystems per host, and both frequency and retention policies on full and incremental backups.  It can report policy violators (things that haven&#039;t been backed up within the policy).  It can manage the archiving of backups.  It provides both non-interactive and interactive restores.  It has an excellent command line syntax.  And most importantly, it has saved my ass more times than I can count.&lt;/p&gt;

&lt;p&gt;I&#039;m not usually big on awards... I find the single unexpected email from someone saying: &quot;damn that was useful, thanks!&quot; to be more gratifying most of the time.  However, Zetaback was one of the first projects &lt;a href=&quot;http://omniti.com/&quot;&gt;we&lt;/a&gt; put up on &lt;a href=&quot;http://labs.omniti.com/&quot;&gt;labs&lt;/a&gt;, so being a 3rd place winner in the &lt;a href=&quot;http://www.opensolaris.org/os/project/awards/awards_land/Entries/&quot;&gt;OpenSolaris Community Innovation Awards&lt;/a&gt; is pretty exciting.&lt;/p&gt; 
    </content:encoded>

    <pubDate>Fri, 19 Sep 2008 10:02:15 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/131-guid.html</guid>
    
</item>
<item>
    <title>Last second scaling hack</title>
    <link>http://www.lethargy.org/~jesus/archives/130-Last-second-scaling-hack.html</link>
            <category>Damaged Bits</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/130-Last-second-scaling-hack.html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=130</wfw:comment>

    <slash:comments>3</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=130</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    &lt;p&gt;So, you have an app.  You can&#039;t change the code.  Now this isn&#039;t the common case when I try to scale things. I usually roll up my sleeves and ignore application stack boundaries.  This is a unique case where for political reasons, I can&#039;t touch the app.  So.. the app was a tiny little site, then it got popular on facebook and collegehumor and instead of pushing 5-10 megabits, it was falling apart at around 105 megabits due to resource saturation (one box wasn&#039;t enough) and ended up needing to push 200 megabits.&lt;/p&gt;

&lt;p&gt;200 megabits isn&#039;t all that much traffic anymore, but when the application wasn&#039;t written to scale horizontally, you are at the mercy of its raw performance and must scale vertically.  If the application hasn&#039;t had a lot of focus on profiling and performance tuning, it means you are going to hit that extremely painful price point of vertical scaling.  In this case, the architecture went live with an expectation of a 20Mbit/s peak and BOOM.  Because it needed to be fixed quickly, purchasing new hardware is now a problem for scheduling reasons more than financial ones: we have plenty of similar hardware available, just nothing with twice the RAM and twice the cores and twice the disks.&lt;/p&gt;

&lt;p&gt;This app couldn&#039;t scale because it not only used a shared DB (which is very, very common) but also required filesystem use and thus needed a shared filesystem.  So, how do you fix that without modifying the app?  You study the app and look for patterns of use that can be exploited.&lt;/p&gt;

&lt;p&gt;First we looked at the database.  In this case, it was not being pushed very hard.  We could easily handle a tenfold increase in traffic without exhausting database resources...  That was a relief, because scaling a database &quot;behind the scenes&quot; without any application access can be more than a few-hour exercise.  Next we found that the app itself (PHP) was taxing memory, CPUs and disk I/O pretty heavily.  The most important were memory and CPU, but disk I/O was a close second.  This meant that if we just installed the app on another machine and NFS exported the first machine&#039;s mounts, it would &quot;work&quot; but not achieve our performance requirements because of I/O saturation.  Quick testing in this arena showed about a 15% increase in capacity -- just not enough.&lt;/p&gt;

&lt;p&gt;So, this app needs a shared FS.  Why?  Well, the user uploads assets, and then through the life of their session, the app serves them back to that user.  EASY, session sticky load balancing (by source IP or by introduced cookie on the load balancer).  Because of the nature of this app, session sticky load balancing produced extremely inequitable load distribution and we would have had to bump up to three servers.  Not ideal, but acceptable -- this is triage.  One step forward, flat on our face:  it appears that under certain circumstances, the images I upload are served to another user.&lt;/p&gt;
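
&lt;p&gt;To make the stickiness idea concrete, here is a minimal Python sketch of source-IP sticky balancing.  It is not taken from any real load balancer; the hashing scheme, the function names (pick_backend, load_by_backend) and the backend names are assumptions made up for illustration.  It shows why one busy client can skew the distribution: every request from a given IP lands on the same server.&lt;/p&gt;

&lt;pre&gt;
```python
import hashlib
from collections import Counter
from typing import Mapping, Sequence


def pick_backend(source_ip: str, backends: Sequence[str]) -> str:
    """Map a client IP to one backend; the same IP always gets the same one."""
    digest = hashlib.sha256(source_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:4], "big") % len(backends)
    return backends[index]


def load_by_backend(
    requests_per_ip: Mapping[str, int], backends: Sequence[str]
) -> Counter:
    """Total the requests each backend would receive under sticky balancing."""
    load = Counter()
    for ip, count in requests_per_ip.items():
        # All of this client's traffic is pinned to a single backend.
        load[pick_backend(ip, backends)] += count
    return load


if __name__ == "__main__":
    # Hypothetical traffic: one heavy client and two light ones.
    traffic = {"10.0.0.1": 1000, "10.0.0.2": 5, "10.0.0.3": 5}
    print(load_by_backend(traffic, ["web1", "web2"]))
```
&lt;/pre&gt;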

&lt;p&gt;So, basically, all I need is to glue the static assets (uploaded by users) together under a common URL (and push 200Mbs or so).  Some assets are on one server, some on another, and I have no way of knowing which server owns the asset without looking in the FS... or asking over HTTP and getting a 404 back.&lt;/p&gt;

&lt;p&gt;I just happen to have a &lt;a href=&quot;http://varnish.projects.linpro.no/&quot;&gt;Varnish&lt;/a&gt; instance to provide content acceleration for other bits of infrastructure.  And Varnish has (as its major selling point, IMO) the VCL language that allows me to script how it handles requests and satisfies them.&lt;/p&gt;

&lt;p&gt;If I get a request, I want to try server one, if I get a 404, I&#039;d like to retry the request against server two.  As the number of servers goes up, this solution completely falls apart as the 404 isn&#039;t that cheap.  I want it fast, efficient, and it&#039;d be great to cache it.  If it isn&#039;t fast and efficient, I&#039;ve simply moved my problem instead of addressing it.  This works well because serving a 404 on server one is cheap.  Remember, triage.&lt;/p&gt;

&lt;pre&gt;
backend obscuredserver1 {
  .host = &quot;10.225.209.89&quot;;
  .port = &quot;80&quot;;
}
backend obscuredserver2 {
  .host = &quot;10.225.209.90&quot;;
  .port = &quot;80&quot;;
}

sub vcl_recv {
  # First attempt goes to server one; after a restart, try server two.
  if (req.http.host ~ &quot;^fqdn\.of\.caching\.server$&quot;) {
    if (req.restarts == 0) {
      set req.backend = obscuredserver1;
    } else {
      set req.backend = obscuredserver2;
    }
  }
  # Anything other than GET or HEAD bypasses the cache.
  if (req.request != &quot;GET&quot; &amp;amp;&amp;amp; req.request != &quot;HEAD&quot;) {
    pipe;
  }
  lookup;
}

sub vcl_fetch {
  # A 404 on the first attempt triggers a restart, which sends the
  # request back through vcl_recv and on to the second backend.
  if (req.http.host ~ &quot;^fqdn\.of\.caching\.server$&quot; &amp;amp;&amp;amp;
      req.restarts == 0 &amp;amp;&amp;amp; obj.status == 404) {
    restart;
  }
  if (!obj.cacheable) {
    pass;
  }
  if (obj.http.Set-Cookie) {
    pass;
  }
  set obj.prefetch = -30s;
  deliver;
}
&lt;/pre&gt;
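
&lt;p&gt;For readers who don&#039;t speak VCL, the retry logic can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: fetch_with_fallback and fake_fetch are hypothetical names, the fetch callable stands in for a real HTTP request, and unlike the VCL (which restarts once) the sketch walks any number of backends.&lt;/p&gt;

&lt;pre&gt;
```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical backend names mirroring the VCL; not real hosts.
BACKENDS = ("obscuredserver1", "obscuredserver2")


def fetch_with_fallback(
    path: str,
    fetch: Callable[[str, str], Tuple[int, bytes]],
    backends: Sequence[str] = BACKENDS,
) -> Tuple[int, bytes, Optional[str]]:
    """Try each backend in order; move on only when the answer is a 404."""
    status, body = 404, b""
    for backend in backends:
        status, body = fetch(backend, path)
        if status != 404:
            # Anything other than "not found" is the final answer.
            return status, body, backend
    # Every backend said 404, so the asset really is missing.
    return status, body, None


def fake_fetch(backend: str, path: str) -> Tuple[int, bytes]:
    """Stand-in for an HTTP GET: each backend owns a different upload."""
    owned = {
        "obscuredserver1": {"/uploads/a.png": b"A"},
        "obscuredserver2": {"/uploads/b.png": b"B"},
    }
    if path in owned[backend]:
        return 200, owned[backend][path]
    return 404, b""


if __name__ == "__main__":
    print(fetch_with_fallback("/uploads/b.png", fake_fetch))
```
&lt;/pre&gt;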

&lt;p&gt;Now, the VCL above is an excerpt; my varnishes here have some other logic for other services that I can&#039;t share... However, they are rather lightly used.  That particular instance went from serving an average of 6 Mbits/second to peaking at 200 Mbits/second.  And the system load jumped from 0.01 to 0.06.  It&#039;s nice when a triage exercise results in a quick hack that doesn&#039;t bust at the seams -- we&#039;ve got plenty of headroom.&lt;/p&gt;

&lt;p&gt;While I in no way consider this successful scaling, I do consider it successful triage by creative engineering (a.k.a. a hack).  And for those that like pretty pictures, these demonstrate that when you encounter capacity issues, it isn&#039;t always pretty and graceful.  &lt;a href=&quot;http://en.wikipedia.org/wiki/Queueing_theory&quot;&gt;Queueing theory&lt;/a&gt; is complicated and sometimes results in everyone getting screwed.  Here&#039;s a visualization of queueing theory making trouble.&lt;/p&gt;

&lt;br/&gt;
&lt;div style=&quot;text-align: center; border: 1px solid #666; padding: 1em;&quot;&gt;
&lt;img src=&quot;http://images.omniti.net/www.lethargy.org/%7Ejesus/misc/bad%20days.png&quot; height=&quot;196&quot; width=&quot;450&quot; /&gt;&lt;br /&gt;
Queueing theory rears its ugly head
&lt;/div&gt;
&lt;br/&gt;&lt;br/&gt;
&lt;div style=&quot;text-align: center; border: 1px solid #666; padding: 1em;&quot;&gt;
&lt;img src=&quot;http://images.omniti.net/www.lethargy.org/%7Ejesus/misc/good%20days.png&quot; height=&quot;199&quot; width=&quot;450&quot; /&gt;&lt;br /&gt;
What it looks like with some headroom.
&lt;/div&gt; 
    </content:encoded>

    <pubDate>Tue, 16 Sep 2008 00:17:20 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/130-guid.html</guid>
    
</item>
<item>
    <title>OpenSSH and SecurID, still a good choice.</title>
    <link>http://www.lethargy.org/~jesus/archives/129-OpenSSH-and-SecurID,-still-a-good-choice..html</link>
            <category>Damaged Bits</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/129-OpenSSH-and-SecurID,-still-a-good-choice..html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=129</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=129</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    &lt;p&gt;A long time ago, I wrote integration into the portable version of OpenSSH to allow direct authentication against an RSA ACE (SecurID) server. I&#039;ve received many thanks over time for the work and I&#039;m aware that it is used at some (very large) organizations. However, as with most security related things, people tend not to talk about what they do. As it is open source and no registration is required to download the patch, I think I might have underestimated the deployments.&lt;/p&gt;

&lt;p&gt;Quite some time ago, Jim Matthews over at NASA took over maintenance of the patch. This sort of seamless transition of ownership is why I really love open source. Jim does a great job.&lt;/p&gt;

&lt;p&gt;Since that patch&#039;s inception, it has been hosted on my &lt;a href=&quot;http://lethargy.org/%7Ejesus/projects/&quot;&gt;old static projects page&lt;/a&gt;. That meant that Jim had to send me a copy to post every time a new version of the patch came out. How 1998. Anyway, since &lt;a href=&quot;http://omniti.com/&quot;&gt;we&lt;/a&gt; went through all the effort of setting up &lt;a href=&quot;https://labs.omniti.com/&quot;&gt;open source hosting&lt;/a&gt;, how about I use it!  The &lt;a href=&quot;https://labs.omniti.com/trac/openssh-securid&quot;&gt;OpenSSH+SecurID&lt;/a&gt; integration effort has moved to labs!  Get your one-time-password, two-factor security while it&#039;s hot!&lt;/p&gt; 
    </content:encoded>

    <pubDate>Tue, 02 Sep 2008 11:01:47 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/129-guid.html</guid>
    
</item>
<item>
    <title>XML/XSLT and DocBook for docs</title>
    <link>http://www.lethargy.org/~jesus/archives/128-XMLXSLT-and-DocBook-for-docs.html</link>
            <category>Damaged Bits</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/128-XMLXSLT-and-DocBook-for-docs.html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=128</wfw:comment>

    <slash:comments>2</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=128</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    I&#039;ve been writing docs for &lt;a href=&quot;https://labs.omniti.com/trac/reconnoiter&quot;&gt;Reconnoiter&lt;/a&gt;. I selected &lt;a href=&quot;http://www.docbook.org/&quot;&gt;DocBook&lt;/a&gt; for two reasons. First, I hoped that the number of polished documents I&#039;ve seen written in DocBook would mean that if this manual grows in size and usefulness we might be able to achieve some polish &quot;on the cheap.&quot; Second, our &lt;a href=&quot;https://labs.omniti.com/&quot;&gt;open-source site&lt;/a&gt; has a really nice automated system for auto-publishing project documentation... if it is in DocBook. That said, DocBook is a complete pain in the ass. It isn&#039;t broken or bad, but it really gets in the way of writing the documentation. There are so many marks, and their use is specific and contextual. I understand that there is a reason for all the marks (provide semantic meaning to what you write), but you must be fluent with the entire specification and practiced to achieve anything useful with DocBook.&lt;br /&gt;&lt;br /&gt;The fact that DocBook uses &lt;a href=&quot;http://www.w3.org/XML/&quot;&gt;XML&lt;/a&gt; is tiresome. XML itself isn&#039;t bad. Despite my extreme ineptitude at writing XML that will &lt;a href=&quot;http://en.wikipedia.org/wiki/Lint_programming_tool&quot;&gt;lint&lt;/a&gt;, I&#039;m rather fond of XML. It just so happens that because I&#039;m fond of XML, the software products I&#039;m using leverage XML in configuration files and support files. This leaves me in the painful position of documenting XML in XML and now I have all sorts of escaping requirements that I&#039;d like to not use CDATA for (cause I want to lint them).&lt;br /&gt;&lt;br /&gt;Now, enter the next phase of inconvenience (read: torture). Because Reconnoiter is modular, the documentation for the modules needs to be programmatically accessible to assist with online configuration validation. 
To achieve this, we chose to put the module docs right next to the module itself and from there we produce DocBook snippets for inclusion in the reference manual.  The module documentation is in XML, the configuration of the module is in XML, the configuration of Reconnoiter is in XML, DocBook is in XML... We just need to take them all and compose documentation (with all the right escaping). Enter &lt;a href=&quot;http://www.w3.org/XML/&quot;&gt;XSLT&lt;/a&gt;: a tool of torture. Now, I&#039;m intimately familiar with the inner workings of XSLT transforms from the C side, having worked extensively on &lt;a href=&quot;https://labs.omniti.com/trac/fastxsl&quot;&gt;FastXSL for PHP&lt;/a&gt;, but actually writing XSL documents is something I strive to avoid.  This conflicts with one of my core principles: &quot;use the right tool for the job.&quot; The entire purpose of XSL is to translate XML documents into other documents (XML being a first-class target). XSLT it is.&lt;br /&gt;&lt;br /&gt;I&#039;m left in a position of documenting a component of the system in XML (easy enough) and then writing XML to run on the first XML (which contains both raw DocBook XML snippets that cannot be escaped as well as XML configuration snippets which must be escaped) to produce a second XML to be included by a larger XML DocBook document. This now conflicts with another of my core principles: &quot;&lt;a href=&quot;http://simple.wikipedia.org/wiki/K.I.S.S.&quot;&gt;K.I.S.S.&lt;/a&gt;&quot; (I&#039;ll note the Wikipedia entry says it&#039;s used on the Internet in a way that implies it started there.  This term far, far predates the Internet. Oh, and the last S is definitely &quot;Stupid&quot;).&lt;br /&gt;&lt;br /&gt;Now, I don&#039;t much like DocBook because of my lack of fluency and its interruptive influence on the process of writing documentation. This is mostly my shortcoming, not DocBook&#039;s. I never can seem to write an XML document that lints the first time around... 
Again, my shortcoming, not XML&#039;s.  XSLT is an &quot;ornate, complexly syntaxed, functionally limited shortcoming.&quot; I really hate XSLT.  &lt;a href=&quot;http://wiki.theory.org/YourLanguageSucks#XSLT.2FXPath_sucks_because:&quot;&gt;Here are someone else&#039;s reasons&lt;/a&gt;.  They are good, but my reason is that my tasks are usually simple and there is no simple variant of XSLT that can do the job; i.e. the jobs I do are simple and I think XSLT&#039;s value rarely outweighs its cost (learning).&lt;br /&gt;&lt;br /&gt;This makes me think that most documentation should be written by a tag-team.  A documentation writer that writes in something like &lt;a href=&quot;http://perldoc.perl.org/perlpod.html&quot;&gt;POD&lt;/a&gt; (the ultimate in K.I.S.S.) and a documentation compiler that consumes that and produces all the required DocBook stuff.  Within a company, it&#039;s easy to put this process in place, but it&#039;s hard to get this sort of collaboration on open-source projects.&lt;br /&gt;&lt;br /&gt;It&#039;s hard enough to get an engineer to write good documentation.  Painful, persistent and minor technical obstacles really don&#039;t help.  There should be a really kick-ass open-source DocBook editor for technical manuals that integrates with Subversion and multi-file layouts.  This won&#039;t ever make XSLT better, but it would sure allow engineers to concentrate on content instead of markup.  Anyone know of one?&lt;br /&gt; 
    </content:encoded>

    <pubDate>Sun, 31 Aug 2008 12:01:28 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/128-guid.html</guid>
    
</item>
<item>
    <title>My favorite cookbook</title>
    <link>http://www.lethargy.org/~jesus/archives/127-My-favorite-cookbook.html</link>
            <category>Food</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/127-My-favorite-cookbook.html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=127</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=127</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    &lt;p&gt;Many people say that their favorite cookbook is &lt;a href=&quot;http://www.amazon.com/gp/product/0026045702?ie=UTF8&amp;tag=lethargy-20&amp;amp;linkCode=as2&amp;camp=1789&amp;amp;creative=9325&amp;amp;creativeASIN=0026045702&quot;&gt;Joy of Cooking&lt;/a&gt;&lt;img src=&quot;http://www.assoc-amazon.com/e/ir?t=lethargy-20&amp;l=as2&amp;amp;o=1&amp;amp;a=0026045702&quot; alt=&quot;&quot; style=&quot;border: medium none  ! important; margin: 0px ! important;&quot; border=&quot;0&quot; height=&quot;1&quot; width=&quot;1&quot; /&gt;.  I love that cookbook, it is excellent.  However, when it comes to just getting the job done, I find the ultimate reference manual to be &lt;a href=&quot;http://www.amazon.com/gp/product/0824102878?ie=UTF8&amp;tag=lethargy-20&amp;amp;linkCode=as2&amp;camp=1789&amp;amp;creative=9325&amp;amp;creativeASIN=0824102878&quot;&gt;The Original Fannie Farmer 1896 Cookbook: The Boston Cooking School&lt;/a&gt;&lt;img src=&quot;http://www.assoc-amazon.com/e/ir?t=lethargy-20&amp;l=as2&amp;amp;o=1&amp;amp;a=0824102878&quot; alt=&quot;&quot; style=&quot;border: medium none  ! important; margin: 0px ! important;&quot; border=&quot;0&quot; height=&quot;1&quot; width=&quot;1&quot; /&gt;&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;Just thought I&#039;d share.&lt;/p&gt; 
    </content:encoded>

    <pubDate>Wed, 27 Aug 2008 17:43:57 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/127-guid.html</guid>
    
</item>
<item>
    <title>Sweet Shrimp Goulash</title>
    <link>http://www.lethargy.org/~jesus/archives/126-Sweet-Shrimp-Goulash.html</link>
            <category>Food</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/126-Sweet-Shrimp-Goulash.html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=126</wfw:comment>

    <slash:comments>1</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=126</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    &lt;p&gt;Okay, it&#039;s not quite Goulash, but I&#039;ll call it what I want.  If you want it red, add some paprika.&lt;/p&gt;

&lt;p&gt;I made this the other night and it worked well for me.&lt;/p&gt;

&lt;h3&gt;Glaze:&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;1/2 cup of rice wine vinegar&lt;/li&gt;
&lt;li&gt;2 tbsp of honey&lt;/li&gt;
&lt;li&gt;1/3 tbsp of lemon juice&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Bring to boil and reduce to coat the back of a spoon.&lt;/p&gt;

&lt;h3&gt;Goulash&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;1 red onion chopped large&lt;/li&gt;
&lt;li&gt;6 cloves of garlic shredded or pressed&lt;/li&gt;
&lt;li&gt;3 jalapenos sliced (seeds in)&lt;/li&gt;
&lt;li&gt;2 Thai chilies sliced (seeds in)&lt;/li&gt;
&lt;li&gt;3 tablespoons of vegetable oil&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sweat the above until onions lose their sharpness (not yet translucent).  Immediately add:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;4 cups of assorted exotic mushrooms (shiitake and others to liking)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Cook until mushrooms take on juices and onions are translucent.  Immediately add:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1 lb of 26-30 count shrimp, peeled, tails off&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once shrimp are pink evenly add glaze. Cook until shrimp are done (30 seconds or so past pink).&lt;/p&gt;

&lt;p&gt;Enjoy!  It has some bite.&lt;/p&gt; 
    </content:encoded>

    <pubDate>Wed, 27 Aug 2008 15:11:25 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/126-guid.html</guid>
    
</item>
<item>
    <title>Varnish, get your patch on.</title>
    <link>http://www.lethargy.org/~jesus/archives/125-Varnish,-get-your-patch-on..html</link>
            <category>OpenSolaris</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/125-Varnish,-get-your-patch-on..html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=125</wfw:comment>

    <slash:comments>1</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=125</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    &lt;p&gt;&lt;a href=&quot;http://varnish.projects.linpro.no/&quot;&gt;Varnish&lt;/a&gt; is a &quot;bad ass&quot; new HTTP caching accelerator.  It&#039;s developed by &lt;a href=&quot;http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/misc.html#BIKESHED-PAINTING&quot;&gt;some crufty old BSD hacker&lt;/a&gt; and has a lot of Linux users.  By and large, it has ignored Solaris.  This sort of neglect isn&#039;t malicious, it is just neglect... you know: &quot;out of sight, out of mind.&quot;&lt;/p&gt;

&lt;p&gt;Well, check out &lt;a href=&quot;http://varnish.projects.linpro.no/svn/trunk/varnish-cache/&quot;&gt;Varnish trunk&lt;/a&gt; and give &lt;a href=&quot;http://lethargy.org/%7Ejesus/misc/varnish-solaris-trunk-3071.diff&quot;&gt;this patch&lt;/a&gt; a spin.  Let me know what you think.&lt;/p&gt;

&lt;p&gt;Perhaps one day, the Solaris networking team (or someone else) will satisfy this pretty abysmal shortcoming: &lt;a href=&quot;http://bugs.opensolaris.org/view_bug.do?bug_id=4641715&quot;&gt;BugID 4641715&lt;/a&gt;.&lt;/p&gt; 
    </content:encoded>

    <pubDate>Sun, 10 Aug 2008 21:50:00 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/125-guid.html</guid>
    
</item>
<item>
    <title>BWPUG: The essential PostgreSQL.conf</title>
    <link>http://www.lethargy.org/~jesus/archives/124-BWPUG-The-essential-PostgreSQL.conf.html</link>
            <category>BWPUG</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/124-BWPUG-The-essential-PostgreSQL.conf.html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=124</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=124</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    &lt;p&gt;Howdy folks,&lt;/p&gt;

&lt;p&gt;This is a reminder that our monthly meetup is scheduled to take place this coming MONDAY, August 11th.  As requested, we&#039;ve moved the meetings from Wednesday to Monday to facilitate some of the would-be-attendees that have contacted me out of band.&lt;/p&gt;

&lt;p&gt;This month&#039;s presentation is titled &quot;The essential PostgreSQL.conf&quot;. With almost 200 configuration parameters, some people might think the postgresql.conf is a bit heady, but the truth is there are only about 2 dozen that you really need for everyday use. This talk will discuss the different types of configuration settings, and give an overview of the ones you&#039;ll want to know when running PostgreSQL. Speakers for the talk are Greg Smith, Software Engineer at Truviso, and Robert Treat, Database Architect at OmniTI.&lt;/p&gt;

&lt;p&gt;Look forward to seeing you all there!&lt;/p&gt;

&lt;pre&gt;
2008-08-11 @ 6:30pm
OmniTI
7070 Samuel Morse Dr. Ste 150
Columbia, MD 21046
&lt;/pre&gt;

&lt;p&gt;Best regards,&lt;/p&gt;

&lt;p&gt;Theo&lt;/p&gt;
 
    </content:encoded>

    <pubDate>Fri, 08 Aug 2008 13:55:44 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/124-guid.html</guid>
    
</item>
<item>
    <title>OSCON2008 Presentation</title>
    <link>http://www.lethargy.org/~jesus/archives/123-OSCON2008-Presentation.html</link>
            <category>BWPUG</category>
            <category>Damaged Bits</category>
            <category>OpenSolaris</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/123-OSCON2008-Presentation.html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=123</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=123</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    &lt;p&gt;Hello from OSCON.  I gave my full-stack introspection crash course talk today.  It has been quite a while since I&#039;ve presented anything in a 40 minute format, but I think the talk went quite well.  I got a lot of positive feedback.&lt;/p&gt;

&lt;p&gt;I decided to take a risky approach inspired by &lt;a href=&quot;http://wikis.sun.com/display/DTrace/dtrace.conf&quot;&gt;dtrace.conf(08)&lt;/a&gt; by demonstrating dtrace on a live, mission-critical system we run at &lt;a href=&quot;http://omniti.com/&quot;&gt;OmniTI&lt;/a&gt;.  The risks of this are: network connections flake out, dtrace doesn&#039;t work correctly or I do something stupid and cause some service unavailability.  Well, as I use dtrace just about every day, I wasn&#039;t worried about breaking things.  And while my network connection winked out for about one minute and dtrace has some &lt;a href=&quot;http://forums.sun.com/thread.jspa?messageID=10099871&quot;&gt;annoying &lt;b&gt;sub-second&lt;/b&gt; aborts&lt;/a&gt;, I think the demonstration was quite effective.&lt;/p&gt;

&lt;p&gt;Many people gave positive commentary at the end and afterward.  People asked for the slides to be put online... and while they have no real content of value (as the demo was everything), I put them here anyway:&lt;/p&gt;

&lt;div style=&quot;width:425px;text-align:left&quot; id=&quot;__ss_526241&quot;&gt;&lt;a style=&quot;font:14px Helvetica,Arial,Sans-serif;display:block;margin:12px 0 3px 0;text-decoration:underline;&quot; href=&quot;http://www.slideshare.net/guestaeae3b/oscon2008-fullstack-introspection-crash-course?src=embed&quot; title=&quot;OSCON2008 Full-stack Introspection Crash Course&quot;&gt;OSCON2008 Full-stack Introspection Crash Course&lt;/a&gt;&lt;div class=&quot;youtube-video&quot;&gt;&lt;object style=&quot;margin:0px&quot; width=&quot;425&quot; height=&quot;355&quot;&gt;&lt;param name=&quot;movie&quot; value=&quot;http://static.slideshare.net/swf/ssplayer2.swf?doc=oscon2008-1216885292078669-8&quot;&gt; &lt;/param&gt;&lt;param name=&quot;allowFullScreen&quot; value=&quot;true&quot;&gt; &lt;/param&gt;&lt;param name=&quot;allowScriptAccess&quot; value=&quot;always&quot;&gt; &lt;/param&gt;&lt;embed src=&quot;http://static.slideshare.net/swf/ssplayer2.swf?doc=oscon2008-1216885292078669-8&quot; type=&quot;application/x-shockwave-flash&quot; allowscriptaccess=&quot;always&quot; allowfullscreen=&quot;true&quot; width=&quot;425&quot; height=&quot;355&quot;&gt; &lt;/embed&gt;&lt;/object&gt;&lt;/div&gt;&lt;div style=&quot;font-size:11px;font-family:tahoma,arial;height:26px;padding-top:2px;&quot;&gt;view &lt;a href=&quot;http://www.slideshare.net/guestaeae3b/oscon2008-fullstack-introspection-crash-course?src=embed&quot; title=&quot;View OSCON2008 Full-stack Introspection Crash Course on SlideShare&quot;&gt;presentation&lt;/a&gt; (tags: &lt;a style=&quot;text-decoration:underline;&quot; href=&quot;http://slideshare.net/tag/oscon&quot;&gt;oscon&lt;/a&gt; &lt;a style=&quot;text-decoration:underline;&quot; href=&quot;http://slideshare.net/tag/dtrace&quot;&gt;dtrace&lt;/a&gt;)&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In addition to the slide deck, I&#039;ve included the simple scripts that I used during the demonstration.  I ran
&lt;a href=&quot;http://www.lethargy.org/~jesus/misc/oscon2008/qps.d&quot;&gt;qps.d&lt;/a&gt;,
&lt;a href=&quot;http://www.lethargy.org/~jesus/misc/oscon2008/query_speed.d&quot;&gt;query_speed.d&lt;/a&gt;, and
&lt;a href=&quot;http://www.lethargy.org/~jesus/misc/oscon2008/query_speed2.d&quot;&gt;query_speed2.d&lt;/a&gt; on the database server and
&lt;a href=&quot;http://www.lethargy.org/~jesus/misc/oscon2008/r3.sh&quot;&gt;r3.sh&lt;/a&gt;,
&lt;a href=&quot;http://www.lethargy.org/~jesus/misc/oscon2008/perl.d&quot;&gt;perl.d&lt;/a&gt;,
&lt;a href=&quot;http://www.lethargy.org/~jesus/misc/oscon2008/pcpu.d&quot;&gt;pcpu.d&lt;/a&gt;,
&lt;a href=&quot;http://www.lethargy.org/~jesus/misc/oscon2008/papcpu.d&quot;&gt;papcpu.d&lt;/a&gt;, and
&lt;a href=&quot;http://www.lethargy.org/~jesus/misc/oscon2008/papcpu2.d&quot;&gt;papcpu2.d&lt;/a&gt; on the web server.  These require the Devel::DTrace Perl module, &lt;a href=&quot;http://labs.omniti.com/trac/project-dtrace/browser/trunk/postgresql&quot;&gt;PostgreSQL patches&lt;/a&gt;, and &lt;a href=&quot;http://labs.omniti.com/trac/project-dtrace/browser/trunk/apache22&quot;&gt;Apache 2.2.8 patches&lt;/a&gt;.  Some of them (the Perl ones) are approximations of correctness, so weigh the output appropriately.  Enjoy!&lt;/p&gt; 
    </content:encoded>

    <pubDate>Thu, 24 Jul 2008 03:58:09 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/123-guid.html</guid>
    
</item>
<item>
    <title>Scalability and concessions</title>
    <link>http://www.lethargy.org/~jesus/archives/122-Scalability-and-concessions.html</link>
            <category>Damaged Bits</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/122-Scalability-and-concessions.html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=122</wfw:comment>

    <slash:comments>3</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=122</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    &lt;p&gt;Oren Hurvitz has a great post about &lt;a href=&quot;http://hurvitz.org/blog/2008/06/linkedin-architecture&quot;&gt;LinkedIn&#039;s architecture&lt;/a&gt;. It&#039;s well-written and well thought out. Their architecture has evolved on what appears to be a steady and safe path of improvement. It is well worth a read.&lt;/p&gt;

&lt;p&gt;I would like to comment on something I see repeated again and again that is likely misinterpreted by young scalability architects: the statement of what you should expect to lose when you scale up/out. Oren writes:&lt;/p&gt;

&lt;blockquote&gt;The presentation ends with some tips about scaling. These are oldies but goodies:&lt;br/&gt;&lt;ul&gt;&lt;li&gt;Can’t use just one database. Use many databases, partitioned horizontally and vertically.&lt;/li&gt;&lt;li&gt;Because of partitioning, forget about referential integrity or cross-domain JOINs.&lt;/li&gt;&lt;li&gt;Forget about 100% data integrity.&lt;/li&gt;&lt;li&gt;At large scale, cost is a problem: hardware, databases, licenses, storage, power.&lt;/li&gt;&lt;li&gt;Once you’re large, spammers and data-scrapers come a-knocking.&lt;/li&gt;&lt;li&gt;Cache!&lt;/li&gt;&lt;li&gt;Use asynchronous flows.&lt;/li&gt;&lt;li&gt;Reporting and analytics are challenging; consider them up-front when designing the system.&lt;/li&gt;&lt;li&gt;Expect the system to fail.&lt;/li&gt;&lt;li&gt;Don’t underestimate your growth trajectory.&lt;/li&gt;&lt;/ul&gt;&lt;/blockquote&gt;

&lt;p&gt;Now, I agree with much of that. The spammers comment should be revised to &quot;&lt;em&gt;Fraud happens and the bigger you are, the bigger the bullseye.&lt;/em&gt;&quot; Be aware and protect your assets. Everything from Cache! on down: hard and fast rules. The cost argument is odd.  While it is completely correct, it&#039;s also rather obvious.  If your business model ties audience size and site use to revenue (which it should), then the cost should simply scale sub-linearly w.r.t. revenues (i.e. no big deal).  However, there are a few that remain on that list that should be cherished and the loss of them should pain you.&lt;/p&gt;

&lt;p&gt;&quot;&lt;em&gt;[You] Can&#039;t use just one database&lt;/em&gt;&quot; -- this is a conclusion you should arrive at after analysis. We have one client that supports 10 million users on a cluster of partitioned databases. We have another that supports 35 million users on one database without issue and with room for growth.&lt;/p&gt;

&lt;p&gt;&quot;&lt;em&gt;Because of partitioning, forget about referential integrity or cross-domain JOINs.&lt;/em&gt;&quot; Think. Think hard. Think harder. Sometimes it is possible to partition in a fashion that allows for integrity. While I&#039;m sure (or at least hope) that the LinkedIn guys had some sleepless nights making the decision to break foreign constraints, it isn&#039;t conveyed. You should absolutely have some sleepless nights over a decision like that. My bank supports many more users and transactions than LinkedIn -- and it damn well better have FKs and 100% integrity. So, while you still may partition in such a fashion that requires a loss of enforced integrity, the decision should be a heavy one.&lt;/p&gt;
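
&lt;p&gt;As a minimal sketch of what &quot;partition in a fashion that allows for integrity&quot; can look like, assume every child row is routed by the key of the parent it references, so that both always land on the same partition and a foreign key can still be enforced locally. The table names (users, orders), the partition count and the helper functions are hypothetical, chosen only for illustration; this is not how LinkedIn or any particular client does it.&lt;/p&gt;

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(user_id: int) -> int:
    """Map a user to a partition with a hash that is stable across runs."""
    return zlib.crc32(str(user_id).encode("utf-8")) % NUM_PARTITIONS


def place_rows(users, orders):
    """Group user rows, and the order rows that reference them, by partition."""
    partitions = {p: {"users": [], "orders": []} for p in range(NUM_PARTITIONS)}
    for user in users:
        partitions[partition_for(user["id"])]["users"].append(user)
    for order in orders:
        # Route by the referenced user, not by the order id, so the
        # parent row and the child row always share a partition.
        partitions[partition_for(order["user_id"])]["orders"].append(order)
    return partitions


def dangling_orders(partition):
    """Return orders whose user is missing from the same partition."""
    local_ids = {u["id"] for u in partition["users"]}
    return [o for o in partition["orders"] if o["user_id"] not in local_ids]
```

&lt;p&gt;This only keeps integrity within one family of rows that share a routing key. Relationships that cross routing keys still need a heavier decision, which is the point of the paragraph above.&lt;/p&gt;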

&lt;p&gt;&quot;&lt;em&gt;Forget about 100% data integrity.&lt;/em&gt;&quot; WTF? While I&#039;m sure it was the end of the post and he was being smart, someone somewhere might actually take the advice to forget about data integrity. You never, ever, ever forget about it. We have some &quot;one big database&quot; architectures where data integrity has been an issue due to memory bit-flips (corrupt data on disk) -- it&#039;s a BFP (big f@#$ing problem) and we treat it that way. Sometimes you make an architectural decision that will make the loss of integrity much more probable (partitioning and losing FK constraints is a ripe example). It&#039;s still something that should be attended to with great care and diligence. You should never forget about data integrity and always put forth the effort required to reach as close to 100% as possible. When you lose data integrity you end up with a big pile of shit in your database. I&#039;ll leave you with a rather crass metaphor:&lt;/p&gt;

&lt;blockquote&gt;There&#039;s an expectation that there is no shit on your living room floor. Don&#039;t shit in your living room. Don&#039;t let your dog shit in your living room. If you&#039;re a dog owner, you know your dog could have an accident. You bought the dog. You chose to increase the probability of finding shit in your living room. Don&#039;t ignore it or forget it. Clean up the shit when it happens. If you get suddenly ill while playing your Wii naked and shit on your living room floor (be it probable or improbable)... respect yourself -- clean it up. Never forget the goal: a 100% shit-free living room.
&lt;/blockquote&gt; 
    </content:encoded>

    <pubDate>Wed, 02 Jul 2008 11:52:03 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/122-guid.html</guid>
    
</item>
<item>
    <title>Reconnoiter and another platform</title>
    <link>http://www.lethargy.org/~jesus/archives/121-Reconnoiter-and-another-platform.html</link>
            <category>OpenSolaris</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/121-Reconnoiter-and-another-platform.html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=121</wfw:comment>

    <slash:comments>3</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=121</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    &lt;a href=&quot;http://labs.omniti.com/trac/reconnoiter&quot;&gt;Reconnoiter&lt;/a&gt; is coming along.  Unlike most open source projects, I tend not to talk about mine until they are really useful to people.  Over the last year, I&#039;ve adopted the unhealthy attitude that useful means &quot;shiny front-end.&quot;  So, I&#039;m blogging to break that attitude and talk a bit about a project that doesn&#039;t have a shiny front-end... yet.&lt;br /&gt;&lt;br /&gt;Reconnoiter is built out of years of frustration using tools like RRDTOOL, Munin, Cacti, ZenOSS, Nagios, etc. etc.  I have a lot of problems with these tools.  First, they are not efficient.  I need a powerful machine to monitor a mere 10k services.  And it actually gets to be an engineering challenge to monitor 100k services with these tools.  Also, the graphs are about 10 years old with respect to design and usability.  I want something new, something fresh, and something that doesn&#039;t need a damn web UI to configure.  Several people have asked, why are you reinventing the wheel?  Why don&#039;t you just improve an existing product?  My answer is that I want a well-thought-out product foundation so that I can trust all the bits.  I want responsibilities decoupled at the right spots.  I want data in a form that the world can query and run reports the likes of which I have not conceived.  I don&#039;t want the load on my monitoring machines to be 8.  I want my monitoring system to check services and metrics when it planned to, not several minutes (or even 2 seconds) after it told me it would.  Simply put, I expect it to work well, all the time.  And, of course, I want it to work how I would expect it to work.&lt;br /&gt;&lt;br /&gt;Reconnoiter was born out of the need to monitor the internals of many disconnected data centers with between 10 and 1000 machines in each facility.  Monitoring can mean a lot of things; here I consider it to be the collection of metrics and awareness of their availability. 
 In and of itself, monitoring is pretty useless, but it is the foundation for two critical pursuits in Internet infrastructure and business management: fault detection and trending.&lt;br /&gt;&lt;br /&gt;Fault detection is as simple as understanding when something has faulted.  However, knowing something is broken is easier than knowing something is about to break.  Is it better to know that your machine just crashed because the chip slagged to the motherboard, or that the temperatures in rack 043 are rising unexpectedly?  Answer: both, but I hope I only learn the latter and not the former.  Truly, there are too many things to monitor... hundreds or thousands of metrics on each piece of equipment.  I can&#039;t reasonably go in and configure good/bad thresholds on each one.  I want anomaly detection.  I want a system to which I can say: &quot;this looks right, tell me when it stops looking right.&quot;  That, to me, is a much-needed companion to traditional fault detection.&lt;br /&gt;&lt;br /&gt;To me, trending is much more than drawing graphs... it is about intelligent data correlation, regression analysis/curve fitting and looking into the past to see how much you fucked up getting where you are now -- in the vain hope that you learn from your mistakes and plan better next time.&lt;br /&gt;&lt;br /&gt;Reconnoiter is an attempt to build these things.  Building a system requires starting with pain (need), solid structure and plumbing (good engineering).  So, reconnoiter is underway.  And this post is in mid-step:&lt;br /&gt;&lt;br /&gt;It started on OpenBSD, and we added support for FreeBSD, Mac OS X, and Linux.&lt;br /&gt;&lt;br /&gt;As of changeset [292], we have Solaris/OpenSolaris support.&lt;br /&gt;&lt;br /&gt;We have a pretty nice front-end for trending under construction, but it isn&#039;t there yet.  We&#039;ll have numeric data combined with textual &quot;event&quot; data on the same graphs.  All that convenient stuff.  
Here&#039;s the rather plain-Jane graph you get now (because some people won&#039;t even read a post if it doesn&#039;t have a pretty graph):&lt;br /&gt;&lt;br /&gt;&lt;div align=&quot;center&quot;&gt;&lt;img style=&quot;max-width: 800px;&quot; src=&quot;http://www.lethargy.org/%7Ejesus/uploads/noit_bw_graph.png&quot; /&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;Honestly, I don&#039;t know what the value of this post is, but people around here keep telling me that people should be aware of an open-source tool like this, even if it isn&#039;t finished (read: usable) yet.  I say it isn&#039;t usable yet, but on our development instances here, we monitor 2892 production metrics across two data centers and the load never peaks past 0.10.  I&#039;m pretty excited about where this is going.  Honestly, my favorite part right now is that I can configure and control the noitd checking nodes via a telnet console and it acts as if it is a piece of network equipment rather than an &quot;application&quot; -- as it should be IMHO.&lt;br /&gt; 
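&lt;br /&gt;To make the &quot;this looks right, tell me when it stops looking right&quot; idea above concrete, here is a minimal sketch.  It is not Reconnoiter code and says nothing about how Reconnoiter does (or will do) anomaly detection; the function names, the three-standard-deviation tolerance and the sample readings are all assumptions picked for illustration.  The operator blesses a window of samples instead of hand-configuring a threshold per metric, and later values are compared against that learned baseline.&lt;br /&gt;&lt;br /&gt;

```python
from statistics import mean, pstdev


def learn_baseline(samples):
    """Summarize a window of samples that an operator said looks right."""
    return {"mean": mean(samples), "stdev": pstdev(samples)}


def is_anomalous(baseline, value, tolerance=3.0, floor=1e-9):
    """True when value is more than `tolerance` standard deviations away.

    `floor` keeps a perfectly flat baseline (stdev of zero) from making
    the comparison degenerate; any change at all is then flagged.
    """
    spread = max(baseline["stdev"], floor)
    return abs(value - baseline["mean"]) > tolerance * spread


# Hypothetical rack temperature readings that "look right".
rack_temps = [40, 41, 39, 40, 42, 40, 41, 39]
baseline = learn_baseline(rack_temps)
print(is_anomalous(baseline, 41))  # within the usual spread
print(is_anomalous(baseline, 55))  # well outside it
```

A mean and standard deviation is about the crudest baseline there is; it ignores daily cycles and trends, which is exactly why this is a sketch and not a recommendation.&lt;br /&gt;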
    </content:encoded>

    <pubDate>Fri, 27 Jun 2008 17:27:27 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/121-guid.html</guid>
    
</item>
<item>
    <title>Dissecting today's Internet traffic spikes</title>
    <link>http://www.lethargy.org/~jesus/archives/118-Dissecting-todays-Internet-traffic-spikes.html</link>
            <category>Damaged Bits</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/118-Dissecting-todays-Internet-traffic-spikes.html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=118</wfw:comment>

    <slash:comments>13</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=118</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    Today&#039;s Internet has changed quite a bit from the Internet I used to know.  The Internet has always been successful because of net neutrality.  What&#039;s net neutrality?  It&#039;s complicated, but essentially it means that anyone anywhere can publish with equal rights.  These aren&#039;t the kind of rights people usually talk about... I&#039;m not speaking of freedom of speech.  Instead, I&#039;m talking about content being simply bits.  It doesn&#039;t matter if it comes from &lt;a href=&quot;http://cnn.com/&quot;&gt;CNN&lt;/a&gt; or &lt;a href=&quot;http://lethargy.org/&quot;&gt;this blog&lt;/a&gt;, you as a reader can download the bits that make up the pages you see without bias or preferential treatment.  This makes it darn easy to be a publisher and leads to a fabulous ecosystem with an overwhelming amount of varied content.  However, with more content it is easy to recognize that much of it is utter trash.  Yes. Yes. I know that one man&#039;s trash is another man&#039;s treasure.  However, it presents opportunities for sites that help you navigate the wasteland.&lt;br /&gt;&lt;br /&gt;Many popular sites today are popular because they link to articles and news items and photographs and movies all over the Internet; they are &quot;interest aggregation services.&quot;  And while the Internet has (for now) a decent preservation of net neutrality when it comes to simple web content, not all publishers are on equal footing.  Not long ago, anyone could run a server anywhere (their basement) with DSL or cable or (gasp) dial-up -- now, the challenge is coping with unexpected attention.&lt;br /&gt;&lt;br /&gt;Years ago, the site &lt;a href=&quot;http://slashdot.org/&quot;&gt;slashdot&lt;/a&gt; coined a term &quot;slashdotted&quot; which meant that a site received so much sudden traffic that service degraded beyond an acceptable point and the site was effectively unavailable.  This often happened to sites that were at the end of small pipes (DSL, T1, etc.) 
and occasionally (though rarely) due to bad engineering.  While slashdot might have coined the term, they simply don&#039;t have the viewership numbers that other large sites today have.&lt;br /&gt;&lt;br /&gt;At the &lt;a href=&quot;http://omniti.com/&quot;&gt;$DAYJOB&lt;/a&gt;, I work on sites that aren&#039;t on the end of T1 lines.  Sites with gigabits or tens of gigabits of connectivity.  Sites with 50 million users.  Sites powered by thousands of machines. I also work on sites that service millions of people from just a handful of machines (efficiency certainly has its advantages sometimes).  I find it particularly interesting that already popular sites (with significant baseline bandwidth) are seeing these unexpected surges.  For a long time, my blog has been on this same machine which is a vhost for several other web sites.  I&#039;ve had traffic spikes from places like slashdot, reddit, digg, etc.  And, no surprise, I couldn&#039;t actually see the bandwidth jump on the graphs... 10Mbits to 11Mbits?  That&#039;s not a spike.&lt;br /&gt;&lt;br /&gt;Things are changing.  Sites like &lt;a href=&quot;http://digg.com/&quot;&gt;Digg&lt;/a&gt; are becoming ever more popular and people are drawn to them as a means of sifting the waste of the Internet.   This means as more people rely on &lt;a href=&quot;http://digg.com/&quot;&gt;Digg&lt;/a&gt; and &lt;a href=&quot;http://reddit.com/&quot;&gt;Reddit&lt;/a&gt; and other similar sites, the number of unexpected viewers of your content can rise more sharply.&lt;br /&gt;&lt;br /&gt;What does all of this mean?  It means that the old rule of thumb that your infrastructure should see 70% resource utilization at peak is starting to falter.  
The typical trends used to look like this (this is last week&#039;s graph from a retail client with a user base of 3 million):&lt;br /&gt;&lt;br /&gt;&lt;div align=&quot;center&quot;&gt;&lt;img style=&quot;border: 1px solid rgb(200, 200, 200); padding: 4px; max-width: 800px;&quot; src=&quot;http://www.lethargy.org/%7Ejesus/uploads/Picture%201.png&quot; /&gt;&lt;br /&gt;&lt;div align=&quot;left&quot;&gt;&lt;br /&gt;We see a nice peak, a nice valley.  Thursday afternoon, we see a nice traffic spike.  Well, this used to be what I called a traffic spike.  Now, different services have different spike signatures.  It resembles the traffic model of classic Internet advertising, except that there is genuine interest and thus dramatically higher conversion rates.  It&#039;s a simple combination of placement, frequency and exposure.  Because content, unlike ad banners, exists for an extended period of time (sometimes forever), the frequency is very high.  Digg and Reddit have excellent placement with very little exposure (things move out quickly).  A site like CNN or NYTimes usually provides mediocre placement (unless you are on the front page) and excellent exposure.&lt;br /&gt;&lt;br /&gt;Lately, I see more sudden eyeballs and what used to be an established trend seems to fall into a more chaotic pattern that is the aggregate of different spike signatures around a smooth curve.  
This graph is from two consecutive days where we have a beautiful comparison of a relatively uneventful day followed by a long-exposure spike (nytimes.com) compounded by a short-exposure spike (digg.com):&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div align=&quot;center&quot;&gt;&lt;img style=&quot;border: 1px solid rgb(200, 200, 200); padding: 4px; max-width: 800px;&quot; src=&quot;http://www.lethargy.org/%7Ejesus/uploads/graph.png&quot; /&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;The disturbing part is that this occurs even on larger sites now due to the sheer magnitude of eyeballs looking at today&#039;s already popular sites.  Long story short, this makes planning a real bitch.&lt;br /&gt;&lt;br /&gt;And the interesting thing is perspective on what is large...  People think Digg is popular -- it is.  The &lt;a href=&quot;http://nytimes.com/&quot;&gt;New York Times&lt;/a&gt; is too, as are CNN and most other major news networks -- if they link to your site, you can expect to see a dramatic and very sudden increase in traffic. And this is just in the United States (and some other English speaking countries)... there are others... and they&#039;re kinda big.&lt;br /&gt;&lt;br /&gt;What isn&#039;t entirely obvious in the above graphs?  These spikes happen inside 60 seconds.  The idea of provisioning more servers (virtual or not) is unrealistic.  Even in a cloud computing system, getting new system images up and integrated in 60 seconds is pushing the envelope and that would assume a zero second response time.  This means it is about time to adjust what our systems architecture should support.  The old rule of 70% utilization accommodating an unexpected 40% increase in traffic is unraveling.  At least eight times in the past month, we&#039;ve experienced from 100% to 1000% sudden increases in traffic across many of our clients.&lt;br /&gt;&lt;br /&gt;I talk about scalability a lot.  It&#039;s my job.  It&#039;s my passion.  
I regularly emphasize that scalability and performance are truly different beasts.  One key to scalability is that a &quot;systems design&quot; scales.  Architectures are built to be able to scale, they are not built &quot;at scale.&quot;  It&#039;s just too expensive to build a system to serve a billion people (until you have a billion people).  It&#039;s cheap to &lt;em&gt;design&lt;/em&gt; a system to serve a billion people.  Once you have a billion people accessing your site, you can likely justify executing on your design.  Google is successful for this reason: their ideas scale and they can build into them as demand rises.  On the flip side, traffic anomalies in the form of spikes are unexpected (by their definition) and scaling a system out to meet the &lt;em&gt;unexpected&lt;/em&gt; demand is almost unreasonable.  I would even argue that it is more of a performance-centric issue.  I want every asset I serve to be as cheap to serve as possible allowing me to handle larger and larger spikes.&lt;br /&gt;&lt;br /&gt;The reason I find all of this stuff interesting is that understanding &lt;a href=&quot;http://omniti.com/does/scalability-and-performance&quot;&gt;performance and scalability&lt;/a&gt;, understanding the &lt;a href=&quot;http://scalableinternetarchitectures.com/blog/pages/about&quot;&gt;principles of scalable systems design&lt;/a&gt; and having &lt;a href=&quot;http://omniti.com/does/scalability-and-performance/process&quot;&gt;sound and efficient processes for handling performance issues&lt;/a&gt; is becoming crucial for sites regardless of their size.  This takes insight and practice and it reminds me of Knuth&#039;s famous saying:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;That&#039;s all well and good, but which 97% of the time?  
My response to Knuth&#039;s statement (with which I completely agree) is:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Understanding what is and isn&#039;t &quot;premature&quot; is what separates senior engineers from junior engineers.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;Let&#039;s add perspective on the word &quot;sudden.&quot;  Most network monitoring systems poll SNMP devices (like switches, load-balancers, and hosts) once every five minutes (we do this every 30 seconds in some environments).  Some people say, &quot;my site scales! bring it on.&quot; We see these spikes happen inside 60 seconds and they occasionally induce a ten-fold increase over trended peaks.  Oftentimes, this spike can be well underway for several minutes before your graphing tools even pick up on it.  Then, before you have time to analyze, diagnose and remediate... poof... it&#039;s gone.  Be careful what you wish for.&lt;br /&gt;&lt;br /&gt;This, in many ways, is like a tornado.  Our ability to predict them sucks.  Our responses are crude and they are quite damaging.  However, predicting these Internet traffic events isn&#039;t even possible -- there are no building weather patterns or early warning signs.  Instead we are forced to focus on different techniques for stability and safety.  The idea of a DoS, a DDoS or the sometimes similar signature of a sudden popularity spike doesn&#039;t increase my heart rate anymore -- it&#039;s just another day on the job.  However, I thought I&#039;d share the four guidelines that I believe are key to my sanity in these situations:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;em&gt;Be Alert&lt;/em&gt;: build automated systems to detect and pinpoint the cause of these issues quickly (in less than 60 seconds).&lt;/li&gt;&lt;li&gt;&lt;em&gt;Be Prepared&lt;/em&gt;: understand the bottlenecks of your service systemically.  Understand your site inside and out.  
Contemplate how you would respond if a specific feature or set of features on your site were to get &quot;suddenly popular.&quot;&lt;/li&gt;&lt;li&gt;&lt;em&gt;Perform Triage&lt;/em&gt;: understand the importance of the various services that make up your site.  If you find yourself in a position to sacrifice one part to ensure continued service of another, you should already know their relative importance and not hesitate in the decision.&lt;/li&gt;&lt;li&gt;&lt;em&gt;Be Calm&lt;/em&gt;: any action that is not analytically driven is a waste of time and energy.  Be quick, not rash.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;Back to those other countries... Enter China and their recently lessened censorship and we have a looming tidal wave for smaller sites that achieve sudden popularity.  Spikes of several hundred megabits per second are difficult to account for when your normal trend is around twenty megabits per second.    The following graph is traffic induced from a link from a popular foreign news site (that I can&#039;t read).  I call it: &quot;ouch:&quot;&lt;br /&gt;&lt;br /&gt;&lt;div align=&quot;center&quot;&gt;&lt;img style=&quot;border: 1px solid rgb(200, 200, 200); padding: 4px; max-width: 800px;&quot; src=&quot;http://www.lethargy.org/%7Ejesus/uploads/graph_image.php.png&quot; /&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt; 
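&lt;br /&gt;The arithmetic behind the 70% rule of thumb discussed above is worth spelling out.  The sketch below assumes, purely for illustration, that resource utilization grows linearly with traffic (real systems rarely degrade that politely); the function names are hypothetical.  It shows why 70% at peak absorbs a 40% jump with almost nothing to spare, and how little baseline utilization you could afford if you wanted to ride out a 1000% increase on capacity alone.&lt;br /&gt;&lt;br /&gt;

```python
def utilization_after_spike(baseline_utilization, increase_pct):
    """Utilization once traffic grows by increase_pct percent over baseline.

    Assumes utilization scales linearly with traffic (an assumption).
    """
    return baseline_utilization * (1 + increase_pct / 100.0)


def max_baseline_utilization(increase_pct):
    """Highest baseline utilization that still absorbs the given increase."""
    return 1.0 / (1 + increase_pct / 100.0)


# 70% at peak plus an unexpected 40% increase lands at about 98%.
print(round(utilization_after_spike(0.70, 40), 2))

# The same 70% baseline with a 100% or 1000% increase is far past 100%.
print(round(utilization_after_spike(0.70, 100), 2))
print(round(utilization_after_spike(0.70, 1000), 2))

# To absorb a 1000% increase, peak utilization would have to sit near 9%.
print(round(max_baseline_utilization(1000), 4))
```

Running at roughly 9% of capacity all year is not a plan most budgets survive, which is why the emphasis above is on making each asset cheap to serve and on triage rather than on raw headroom.&lt;br /&gt;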
    </content:encoded>

    <pubDate>Tue, 20 May 2008 13:56:59 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/118-guid.html</guid>
    
</item>
<item>
    <title>BWPUG Meetup Reminder</title>
    <link>http://www.lethargy.org/~jesus/archives/116-BWPUG-Meetup-Reminder.html</link>
            <category>BWPUG</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/116-BWPUG-Meetup-Reminder.html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=116</wfw:comment>

    <slash:comments>3</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=116</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    &lt;p&gt;Hi all!&lt;/p&gt;

&lt;p&gt;Just a friendly reminder that we&#039;ll be having our first meetup tomorrow as planned.  I thought as a good kick-off we could all collaboratively share what we do with PostgreSQL.  We&#039;ll start off with a whirlwind tour of how OmniTI uses PostgreSQL, taking a brief look at ZFS, DTrace and large datasets.  After that I think it would be good to get to know each other -- maybe we&#039;ll hit a local pub afterwards!&lt;/p&gt;

&lt;p&gt;I look forward to seeing you there!&lt;/p&gt;

&lt;blockquote&gt;
Meetup starts at 6:30pm&lt;br /&gt;
7070 Samuel Morse Dr. Ste 150&lt;br /&gt;
Columbia, MD 21046&lt;br /&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you have issues getting in the building, ring me on my cell -- it will be posted on the doors.&lt;/p&gt;

&lt;p&gt;Best regards,&lt;/p&gt;

&lt;p&gt;Theo&lt;/p&gt;
 
    </content:encoded>

    <pubDate>Tue, 13 May 2008 20:49:56 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/116-guid.html</guid>
    
</item>
<item>
    <title>OSCON 2008: And now for something completely different.</title>
    <link>http://www.lethargy.org/~jesus/archives/115-OSCON-2008-And-now-for-something-completely-different..html</link>
            <category>Damaged Bits</category>
    
    <comments>http://www.lethargy.org/~jesus/archives/115-OSCON-2008-And-now-for-something-completely-different..html#comments</comments>
    <wfw:comment>http://www.lethargy.org/~jesus/wfwcomment.php?cid=115</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://www.lethargy.org/~jesus/rss.php?version=2.0&amp;type=comments&amp;cid=115</wfw:commentRss>
    

    <author>nospam@example.com (Theo Schlossnagle)</author>
    <content:encoded>
    &lt;p&gt;I just registered for &lt;a href=&quot;http://en.oreilly.com/oscon2008/public/content/home&quot;&gt;OSCON&lt;/a&gt;.  They say I should advertise that I am a speaker.  Here goes.&lt;/p&gt;

&lt;p&gt;For &lt;a href=&quot;http://blogs.oreilly.com/digitalmedia/2005/08/oscon-day-0-scalable-internet.html&quot;&gt;the&lt;/a&gt; &lt;a href=&quot;http://conferences.oreillynet.com/cs/os2005/view/e_sess/6412&quot;&gt;last&lt;/a&gt; &lt;a href=&quot;http://conferences.oreillynet.com/cs/os2006/view/e_spkr/1788&quot;&gt;several&lt;/a&gt; &lt;a href=&quot;http://conferences.oreillynet.com/cs/os2007/view/e_sess/12458&quot;&gt;years&lt;/a&gt;, I&#039;ve presented multiple talks at the O&#039;Reilly Open Source Conference.  My Scalable Internet Architectures talk has been quite popular and drawn large crowds.  It is an interesting talk as it doesn&#039;t really change with time.  As I say, &quot;if principles of good engineering changed frequently, I&#039;d never drive on bridges.&quot;  The talk is about sound engineering approaches to building really large consumer-facing websites.  Almost all of it is open-source centric, which is why it fits so well at OSCON.  While my Scalable talk was not accepted this year, I&#039;ve got another talk lined up that will rock your world.&lt;/p&gt;

&lt;p style=&quot;text-align: center&quot;&gt;&lt;a href=&quot;http://en.oreilly.com/oscon2008/public/content/home&quot;&gt;&lt;img src=&quot;http://conferences.oreillynet.com/banners/oscon/speaker/oscon2008_banner_speaker_210x60.gif&quot; style=&quot;padding: 3px; border: 1px solid #999;&quot; border=0&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;I am quite excited that my other proposal was accepted.  This year I will be giving  a session about &lt;a href=&quot;http://en.oreilly.com/oscon2008/public/schedule/detail/2903&quot;&gt;using DTrace to perform &quot;full-stack&quot; introspection&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Using DTrace we will deep dive into the amazingly cool questions one can ask. Is my application really hitting disk? If so, what line of code is causing it? My process is being descheduled by the kernel, why? I have 100 Apache process and some randomly segfault, how do I get a stack trace when that happens? The app I am running doesn’t have the right debugging output, I need to know more!&lt;/p&gt;
&lt;p style=&quot;margin-top: 1em&quot;&gt;DTrace is an oracle. The value of the answers depends on the quality of the questions. Learn to ask good question and prepare to be amazed at the possibilities.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I&#039;ve given a variation on this presentation at a few places now (both internal to OmniTI and external) and had really positive feedback.  I&#039;ll be taking these prior presentations and polishing them up for a 45 minute escapade that will open your eyes to new possibilities.  &lt;a href=&quot;http://opensolaris.org/os/community/dtrace/&quot;&gt;DTrace&lt;/a&gt; is an amazing tool and once you get used to it, you can really take it for granted.  I do.  When people watch the presentation and say &quot;&lt;a href=&quot;http://www.imdb.com/title/tt0425112/&quot;&gt;by the power of Greyskull&lt;/a&gt;,&quot; I know I&#039;ve made my point.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://en.oreilly.com/oscon2008/public/content/home&quot;&gt;Come to OSCON&lt;/a&gt;.  Immerse yourself in technology.&lt;/p&gt; 
    </content:encoded>

    <pubDate>Sun, 27 Apr 2008 22:02:28 -0400</pubDate>
    <guid isPermaLink="false">http://www.lethargy.org/~jesus/archives/115-guid.html</guid>
    
</item>

</channel>
</rss>