Hawaiʻi's Technology Community

Six Sigma Web Servers

Many of you have heard of Six Sigma as a methodology (primarily for large organizations) for quality and process improvement. One early success story was that of Motorola, which developed the methodology in the 1980s and by the 2000s attributed $17 billion in savings to applying Six Sigma across its business units. Since then it has been used in various industries across the world for everything from lowering manufacturing defects to improving the consistency of paint. Six Sigma relies heavily on statistics in its repertoire of tools. One small application I have found useful over the years in web application development is the Poisson ("pwah-sone") distribution, used for server capacity planning.

Appallingly, a lot of web projects don't think about server capacity planning and just "wing it". When they do think about it, the thought process goes something like: "well, this size server worked for that application, so it should work for this kinda similar application too". Developers more serious about their craft may actually do some load forecasting and empirical testing: "I think I will have 1,000 visitors every day and each visitor will click on average 10 times per visit, so I should procure hardware that can handle 10,000 clicks a day". After that, more experienced pros may double, triple or even quadruple the server size their numbers suggest, as padding for random bursts of traffic. That can work, but what if you don't want to pay for all that excess capacity (and you don't have on-demand scalability)? Or what if you just want a more precise discussion around your server planning? Enter the Poisson distribution!

The Poisson distribution models discrete (individually countable) events that are essentially random (events that occur independently of one another) over a fixed interval. It can be used to think about things like:

number of customers in a bakery each day
number of cars passing through an intersection
number of late deliveries per 100 deliveries
number of planes arriving every hour
and of course, number of requests to your server(s) every minute

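Mathematically, the Poisson distribution is defined as:

P(x) = (e^(-lambda) * lambda^x) / x!

x = 0, 1, 2, 3, 4...
P(x) = probability of exactly x events occurring in the interval
e = the base of the natural logarithm (2.71828...)
lambda = the long-run average number of events per interval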
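As a minimal sketch, the standard Poisson probability mass function, P(x) = e^(-lambda) * lambda^x / x!, fits in a few lines of Python using only the standard library. The function name `poisson_pmf` and the bakery rate of 3 customers per hour are assumptions made up for illustration, not figures from this post.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x events in an interval whose long-run average is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


# Hypothetical example: a bakery that averages 3 customers per hour.
bakery_rate = 3.0
for customers in range(7):
    print(f"P({customers} customers in an hour) = {poisson_pmf(customers, bakery_rate):.4f}")
```

Because the probabilities over every possible x add up to 1, summing the function over a generous range of x is a quick sanity check on any implementation.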

Given lambda, the average expected rate of something, the formula tells you the probability of seeing any other value during a time interval. With our previous example, we have a lambda of 10K/day. More precisely, we know the 10K clicks will be confined to a 12-hour period when users across the country are awake. So my lambda is really 10K per 12 hours, or about 14/minute (10,000 clicks / 720 minutes is roughly 13.9). From there, we do some plug-and-chug into the formula above. This tells us that the probability in any given minute that we will be handling exactly 14 requests is roughly 10.6%. The probability we will see exactly twice that (28) is roughly 0.03%.

Now since we are talking about server capacity, we are probably more interested in the cumulative probability, or the probability of x being equal to or less than a value. We see that there is a ~57% chance we will see 14 or fewer clicks/minute, and a roughly 99.97% chance that any given minute stays at or below twice our average (28). How big of a server to build is suddenly a more quantifiable and precise discussion. If your average will be 14 clicks/minute you certainly don't want something that can only handle 14 clicks/minute, because in about 43% of minutes demand would exceed what your server(s) can handle. You probably want something that can handle at least 20 clicks/minute at the bare minimum for a 95% success rate. From there it gets noticeably and increasingly more expensive to accommodate the tiniest of probabilities that your server(s) will see more than 25 clicks/minute.
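The plug-and-chug can be sketched in Python as well. This is one possible way to do it, not the only one: the helper names `poisson_cdf` and `min_capacity` are invented for this example, and the rate is derived from the post's own numbers (10,000 clicks over a 12-hour window, which is about 14 per minute).

```python
import math


def poisson_cdf(x: int, lam: float) -> float:
    """Probability of seeing x or fewer events when the long-run average is lam."""
    term = math.exp(-lam)  # P(0)
    total = term
    for k in range(1, x + 1):
        term *= lam / k  # P(k) = P(k - 1) * lam / k
        total += term
    return total


def min_capacity(lam: float, target: float) -> int:
    """Smallest x whose cumulative probability reaches the target success rate."""
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    x = 0
    while poisson_cdf(x, lam) < target:
        x += 1
    return x


clicks_per_day = 10_000
active_minutes = 12 * 60                      # clicks confined to a 12-hour window
lam = round(clicks_per_day / active_minutes)  # 13.9 rounds to 14 clicks per minute

for capacity in (14, 20, 25, 28):
    print(f"P({capacity} or fewer clicks in a minute) = {poisson_cdf(capacity, lam):.4%}")

print("Capacity needed for a 95% success rate:", min_capacity(lam, 0.95))
```

Changing the target passed to `min_capacity` shows how quickly the required capacity grows as you chase the last fraction of a percent.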

Calculating the Poisson distribution is very easy with tools like Microsoft Excel (the POISSON.DIST function) or one of the many Poisson calculators on the web, yet it can quickly give you a lot of precision when planning and procuring your infrastructure capacity. Just remember that Poisson only applies to random, independent events. If you know your traffic spikes every Friday or users only come during lunch, then you'll need a separate Poisson curve with its own lambda for those special usage times. Hope you found this simple but powerful formula useful. Good luck keeping your servers from burning to the ground!
