Esoteric Curio
With the release of ZFS on Solaris 10, I sat down and marveled at the opportunities for off-site backups. I have already written a bit about ZFS detailing why I think it kicks so much ass. With zfs send and zfs receive, one can manage block-level incremental backups and restores. What's missing? An elegant hack leveraging that to provide a simple and reliable backup infrastructure for a network of ZFS-capable machines (including Mac OS X and FreeBSD now, BTW).
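To make the send/receive idea concrete, here is a minimal sketch in Python that only builds the command lines for a full or incremental transfer; it does not run anything. The dataset, snapshot, and host names are made up for illustration, and this is not how Zetaback itself is implemented.

```python
import shlex


def send_cmd(dataset, new_snap, prev_snap=None):
    """Build a `zfs send` command; incremental when prev_snap is given."""
    cmd = ["zfs", "send"]
    if prev_snap is not None:
        # -i sends only the blocks that changed since prev_snap
        cmd += ["-i", f"{dataset}@{prev_snap}"]
    cmd.append(f"{dataset}@{new_snap}")
    return cmd


def receive_cmd(target):
    """Build the matching `zfs receive` command for the backup side."""
    return ["zfs", "receive", target]


def backup_pipeline(dataset, new_snap, target, prev_snap=None, backup_host=None):
    """Return a shell pipeline string: zfs send ... | [ssh host] zfs receive ..."""
    send = shlex.join(send_cmd(dataset, new_snap, prev_snap))
    recv = shlex.join(receive_cmd(target))
    if backup_host is not None:
        # Hypothetical transport: run the receive on a remote host over ssh
        recv = f"ssh {shlex.quote(backup_host)} {shlex.quote(recv)}"
    return f"{send} | {recv}"


if __name__ == "__main__":
    # Hypothetical names: tank/home is the source, backup/home the destination
    print(backup_pipeline("tank/home", "monday", "backup/home"))
    print(backup_pipeline("tank/home", "tuesday", "backup/home",
                          prev_snap="monday", backup_host="backuphost"))
```

The first call describes a full backup and the second an incremental one on top of it, which is the basic pattern a backup tool would schedule.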
So, I sat down and wrote Zetaback -- which is currently 1032 lines of perl code (including complete documentation) plus a thin agent on remote machines that is 290 lines of perl code (including complete documentation). I'd like to note that the only reason there is documentation, let alone complete documentation, is because of Eric Sproul. This really demonstrates to me that "Keep It Simple Stupid" still works for important tasks.
Zetaback is a rather full-featured backup and restore system. It can manage multiple hosts, multiple ZFS filesystems per host, and both frequency and retention policies on full and incremental backups. It can report policy violators (things that haven't been backed up within the policy). It can manage the archiving of backups. It provides both non-interactive and interactive restores. It has an excellent command line syntax. And most importantly, it has saved my ass more times than I can count.
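The "policy violators" idea is easy to picture in code. Below is a rough sketch, assuming a policy is just a maximum age for the most recent backup; the function name, the data layout, and the dataset names are all hypothetical and are not taken from Zetaback.

```python
from datetime import datetime, timedelta


def policy_violators(last_backup, max_age, now):
    """Return datasets whose latest backup is missing or older than max_age.

    last_backup maps a dataset name to the datetime of its most recent
    backup, or to None if it has never been backed up.
    """
    violators = []
    for dataset in sorted(last_backup):
        taken = last_backup[dataset]
        if taken is None or now - taken > max_age:
            violators.append(dataset)
    return violators


if __name__ == "__main__":
    now = datetime(2008, 7, 24, 12, 0)
    history = {
        "tank/home": now - timedelta(hours=2),  # backed up recently
        "tank/db": now - timedelta(days=3),     # stale
        "tank/new": None,                       # never backed up
    }
    for name in policy_violators(history, timedelta(days=1), now):
        print("policy violation:", name)
```

A real tool would track full and incremental backups separately and apply a retention policy as well; this only shows the frequency check.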
I'm not usually big on awards... I find the single unexpected email from someone saying: "damn that was useful, thanks!" to be more gratifying most of the time. However, Zetaback was one of the first projects we put up on labs, so being a 3rd place winner in the OpenSolaris Community Innovation Awards is pretty exciting.
Varnish is a "bad ass" new HTTP caching accelerator. It's developed by some crufty old BSD hacker and has a lot of Linux users. By and large, it has ignored Solaris. This sort of neglect isn't malicious, it is just neglect... you know: "out of sight, out of mind."
Well, check out Varnish trunk and give this patch a spin. Let me know what you think.
Perhaps one day, the Solaris networking team (or someone else) will satisfy this pretty abysmal shortcoming: BugID 4641715.
Hello from OSCON. I gave my full-stack introspection crash course talk today. It has been quite a while since I've presented anything in a 40 minute format, but I think the talk went quite well. I got a lot of positive feedback.
I decided to take a risky approach inspired by dtrace.conf(08) by demonstrating dtrace on a live, mission-critical system we run at OmniTI. The risks of this are: network connections flake out, dtrace doesn't work correctly, or I do something stupid and cause some service unavailability. Well, as I use dtrace just about every day, I wasn't worried about breaking things. And while my network connection winked out for about one minute and dtrace has some annoying sub-second aborts, I think the demonstration was quite effective.
Many people gave positive commentary at the end and afterward. People asked for the slides to be put online... and while they have no real content of value (as the demo was everything), I put them here anyway:
In addition to the slide stack, I've included the simple scripts that I used during the demonstration. I ran qps.d, query_speed.d, and query_speed2.d on the database server and r3.sh, perl.d, pcpu.d, papcpu.d, and papcpu2.d on the web server. These require the Devel::DTrace perl module, PostgreSQL patches, and Apache 2.2.8 patches. Some of them are approximations of correctness, so weigh the output appropriately (the perl ones). Enjoy!

Today someone asked me: "You speak about ZFS a lot. I know other people that talk about the latest filesystems with praise, but generally speaking they just don't have much to offer. Is ZFS that different?"
My answer is "yes." But, of course, I can't leave it at that. I'm not going to make a performance argument -- ZFS is fast in some cases and slow in others -- just like everything else. I think one of the things we've seen in the last 10 years is that everyone felt the need to come out with their own filesystem -- at least on Linux. So, you have to ask yourself why. My personal opinion is that filesystems on Linux suck.
Most filesystems on the market support snapshots. No open source filesystems on Linux (that I'm aware of) support snapshots. Of course, you can use LVM to do block-level snapshots. First off, that's a pain in the ass w.r.t. storage provisioning. Other systems make the process of allocating and managing snapshots "not my problem." (simple and easy). Let's be frank, ext2 and ext3 are nothing to write home about. reiserfs, xfs, jfs, the list goes on and on.
There are a few closed-source filesystems that are really nice. Specifically Veritas Filesystem (VxFS) and its excellent layered volume manager VxVM which appears to have heavily inspired geom on FreeBSD. DEC thought it was so cool that they pulled it white-label into Tru64. Respect.
So, what makes ZFS so different? ZFS is a disruptive technology as it abolishes the sacred line in the sand between block devices, volume management and filesystems. This means it just makes storage management easy. When I say easy... I mean easy.
So you want more space? Add more disks. Want to move from failing disks to replacements? Tell zfs to add the new ones and tell it to remove the old ones. Read that report by Google about disk errors? ZFS checksums all data. My personal experience says checksums are good. Snapshots? Sure, snapshot to your heart's content. We snapshot some systems hourly and never ever delete the old ones. Snapshots are really cool, but what if you could roll back to a snapshot? zfs rollback. What if you wanted to make a read/write copy of the filesystem or an old snapshot? zfs clone. You want to store a lot of raw data? zfs has built-in compression. Oh, and it is open-source.
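To show how little ceremony is involved, here is a small Python sketch that builds (but does not run) the zfs commands for the snapshot, rollback, clone, and compression operations mentioned above. The pool and filesystem names are invented for the example.

```python
def snapshot(dataset, name):
    """Command to take a snapshot of a filesystem."""
    return ["zfs", "snapshot", f"{dataset}@{name}"]


def rollback(dataset, name):
    """Command to roll a filesystem back to one of its snapshots."""
    return ["zfs", "rollback", f"{dataset}@{name}"]


def clone(dataset, name, new_dataset):
    """Command to create a read/write copy of a snapshot."""
    return ["zfs", "clone", f"{dataset}@{name}", new_dataset]


def enable_compression(dataset):
    """Command to turn on built-in compression for a filesystem."""
    return ["zfs", "set", "compression=on", dataset]


if __name__ == "__main__":
    # Hypothetical dataset names; print each command as it would be typed
    for cmd in (
        snapshot("tank/www", "hourly-01"),
        rollback("tank/www", "hourly-01"),
        clone("tank/www", "hourly-01", "tank/www-test"),
        enable_compression("tank/logs"),
    ):
        print(" ".join(cmd))
```

Each operation is a single short command against a dataset name, with no separate volume manager step in between.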
Simply put. ZFS. Respect.

