Entries from September 2007
Ten years may seem like a long time to some and just a passing moment to others. It seems to me that the content of those years defines the perceived duration. I have three daughters, ages five, three and four months, and it seems like just yesterday that my eldest was born. The stress of raising them and the adjustments that Lisa and I have made to accommodate children have, of course, lasted the same five years -- but feel like ten. OmniTI is my "other child." And the stresses and accommodations that I and everyone close to me have shouldered for OmniTI have been both heavy and relentless.
I love OmniTI like a child. I made it, and it is what it is today because I (along with the other owners and its growing staff) raised it to be wonderful; we made it in our image as much as we could. Like children, OmniTI frustrates me: we have setbacks, it occasionally does things I am not proud of, and sometimes I have to wake up at 3am and care for it when it struggles. Also, like my other children, it inspires me and makes me want to do better and be better.
On September 4th 1997, Sherry (my mother) and I incorporated OmniTI, Inc. and started doing business. So, naturally, on September 4th this year, clan OmniTI went to a local bar and celebrated being in business for ten years. I didn't prepare a speech, I didn't give a pep-talk; instead, I decided that day was for me. I sat back and looked on with tremendous pride at OmniTI. What is OmniTI? OmniTI is its people and their sacrifices.
In 1999, we decided to conquer the world of Internet messaging and developed the Ecelerity MTA which is now the flagship product of Message Systems. Conquering the world we are! On the other side, our consulting practice is sought out by some of the largest, highest profile web sites on the planet for expert security, performance and scalability advice. In 2002 we moved out of our respective basements into real offices. In 2006 we brought on Chris Shiflett to lead our web application security practice. In 2006, we launched OmniTI Labs, which will provide us the opportunity to give back without red tape.
We started with two people and now have almost fifty and I honestly believe that OmniTI's success is due to "we." We have a team attitude. A commitment to excellence, passion for technology, and customer focus mean very little coming from one person. However, when everyone in an organization has these character traits you have what we have at OmniTI. It means it is a child I can be proud of; it makes me smile, laugh, cry, occasionally sick with worry but most often I beam with pride.
I would like to thank all of our staff, our Board, my brother and his family, my mother and father, my wife and my children for their tremendous sacrifice that resulted in OmniTI. We are OmniTI.
Sometimes I can be a jackass about semantics. I don't always use the right words, but I should be corrected when I choose poorly. The reason we have so many of these word things is that most have outright different meanings, and those that are synonymous have nuance that makes one more appropriate than another in a certain context.
I deal with a lot of large systems and many large systems are complicated. The more complicated things get, the more clearly they must be described and documented or you're left completely bewildered and confused. This brings me to a topic that annoys me to no end: database lingo.
Partitioning and Federation... they are similar, but different.
A partition is a structure that divides a space into two parts. Multiple partitions can break up that space into an arbitrary number of parts. In computer operating systems, the term has an even more specific definition, referring to the division of resources into portions. As a verb, it means to divide something (typically a space) into smaller pieces.
A federation is a set of things (usually states or regions) that together compose a centralized unit, but each individually maintains some aspect of autonomy. In computer systems this is often applied to security, where several autonomously operating systems, each providing security to a certain set of users or over a certain set of facilities, together provide a consistent and complete security infrastructure. In databases, it means that several databases hold information, but certain instances are completely responsible for different portions of the data, commonly based on characteristics of the data itself.
So, how are these different? It's subtle on one level, as they both describe methods of dividing datasets into smaller parts. Federation is typically across machines; federating data on a single machine is an inappropriate use of the term. Federation more often applies to schemes that divide on logical boundaries, such as the geographic sense (states or regions) in the definition above. The Internet is more global, so let's think of countries instead. If we were to take each country and design our systems such that all data related to each country existed on a different server, we would have a geographically federated system. Another common (and practical) example is federating based on quality of service (paying users vs. free users). The motivation behind this is clear: it makes the task of ensuring service levels on the database easier because the data set is smaller, and the logical separation allows one to prioritize investment in one part of the system (e.g. more immediacy and money can be applied to ensuring availability of the servers that service paying users).
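To make the quality-of-service example concrete, here is a minimal sketch of federating on a logical attribute of the data. The server names, the `tier` field, and the `server_for` helper are all hypothetical, invented for illustration; a real system would hold connection details rather than strings.

```python
# Hypothetical mapping from a logical attribute (service tier) to the
# server that is completely responsible for that portion of the data.
FEDERATION = {
    "paying": "db-paying.example.internal",
    "free": "db-free.example.internal",
}


def server_for(user):
    """Choose a server from a characteristic of the data itself."""
    return FEDERATION[user["tier"]]


users = [
    {"id": 1, "tier": "paying"},
    {"id": 2, "tier": "free"},
    {"id": 3, "tier": "paying"},
]

# Every user lands on exactly one autonomous server, decided by tier.
placement = {u["id"]: server_for(u) for u in users}
```

The point of the sketch is that the routing decision is a logical one: swapping `tier` for a country code would give the geographic variant without changing the shape of the code.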
Partitioning is a more general concept and federation is a means of partitioning. Partitioning can be applied to databases at many levels. One common use is taking a single large table and splitting it into parts in order to place those parts that are accessed more frequently on faster (more expensive) storage. That partitioning scheme exists to allow use of more than one disk spindle (even of a different type or cost). However, partitioning isn't limited to a single machine; it can also be applied across multiple database instances. It is a loose term. Partitioning also does not imply a logical separation. It is often used simply to split our data up so that more hardware can be leveraged to process it. Google's information, for example, is partitioned all over the place, and they then ask all the system components (servers) to participate in answering questions via their "map and reduce" system. Some partitioning schemes require mapping questions across many nodes, while others provide a priori knowledge about which components hold what data, allowing more targeted questioning.
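The two styles of questioning described above can be sketched in a few lines. This is an illustrative toy, not how any particular database or Google's system works: the partition count, the CRC32-based placement, and the in-memory dictionaries standing in for servers are all assumptions.

```python
import zlib

NUM_PARTITIONS = 4  # assumed; stands in for four servers or disks

# Each dict plays the role of one component holding part of the data.
partitions = [dict() for _ in range(NUM_PARTITIONS)]


def partition_for(key):
    """Placement with no logical meaning: a hash of the key picks the part."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def put(key, value):
    partitions[partition_for(key)][key] = value


def get(key):
    """Targeted question: we know a priori which single partition to ask."""
    return partitions[partition_for(key)].get(key)


def count_where(predicate):
    """Scatter-gather question: map over every partition, then reduce."""
    partial_counts = [
        sum(1 for value in part.values() if predicate(value))
        for part in partitions
    ]
    return sum(partial_counts)


for i in range(100):
    put(f"user{i}", i)
```

A lookup by key touches one partition, while a question about the values (say, `count_where(lambda v: v % 2 == 0)`) has to visit all of them. Which of the two dominates your workload is what should drive the choice of scheme.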
The techniques for choosing which component stores a particular piece of data vary wildly, each with its own advantages and disadvantages. Understanding how you will be storing data and, more importantly, what questions you will be asking over the data set dictates the partitioning scheme that is most appropriate. Sometimes federating is right; other times a more generalized partitioning scheme is more suitable.
This brings me to my last point, and the motivation for this post.
A shard is a piece of broken ceramic, glass, rock (or some other hard material) and is often sharp and dangerous. Sharding is the act of creating shards. Somehow, somewhere, somebody decided that what they were doing was so cool that they had to make up a new term for what people have been doing for many, many years. It is partitioning... sometimes that partitioning is proper federation. You don't need a cool name to effectively accomplish what's been around for a long time. More so, you don't need a name that implies you broke something irreparably.

