Math is not my best skill, so it was somewhat shocking to find out how much I liked statistics. My grad school degree required one year of it, so it wasn’t a choice. But by truly understanding statistics (and using a calculator to keep the math demons at bay), I found I could make sense of the numbers in my work and personal life.
Subsequently, I moved into performance, where the mean (or average) was the most commonly used statistic. And by commonly, I mean for everything. Being new to the field, I went along with the program…for a while. But then we started getting user complaints about performance even when we were meeting response time SLAs. This was my wake-up call.
I had been ignoring my statistics grounding. Mean doesn’t mean much if you don’t know what the highs and lows look like. You need to know how the data is distributed. After all, when you have one hand in ice water and the other in boiling water, the average says you should be comfortable. But you’re not. The statistic you need here is standard deviation (SD), which measures how widely the individual values spread around the mean. It tells you how well the mean represents what the customer actually experiences. If SD is small, mean may be a reasonable approximation of the truth. If it is large, mean is meaningless. Large standard deviations in response time mean the customer will not be happy, even if you are meeting SLAs. (Large SDs in resource utilization are another matter: can you spell trouble?) If your monitoring solutions don’t compute SD for you, use Excel plug-ins for statistics.
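If you would rather see the ice-water-and-boiling-water problem in numbers, here is a minimal Python sketch. The two sets of response times are made up for illustration, and the `summarize` helper is my own name, not something from a particular monitoring product. Both sets have the same mean, but the SD tells them apart.

```python
from statistics import mean, pstdev

# Hypothetical response times in seconds (invented for illustration).
# Both sets average 2.0 s, but they describe very different experiences.
steady = [1.9, 2.0, 2.1, 2.0, 1.9, 2.1, 2.0, 2.0]
erratic = [0.4, 0.5, 0.6, 0.5, 0.4, 0.6, 6.5, 6.5]


def summarize(samples):
    """Return (mean, population standard deviation) for a list of numbers."""
    return mean(samples), pstdev(samples)


for label, samples in (("steady", steady), ("erratic", erratic)):
    avg, sd = summarize(samples)
    print(f"{label}: mean={avg:.2f}s sd={sd:.2f}s")
```

This uses the population SD (`pstdev`), which assumes the list holds every measurement in the period. If your numbers are only a sample of the traffic, `statistics.stdev` is the usual alternative.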
So what can you do with this information, now that you have it? If SD is small and you meet SLAs, you are probably in good shape, but you should also break the data out by hour to be sure a daily average isn’t hiding a bad hour. If SD is large, you need to tune. But this is reactive management, and who needs that? To be proactive, you need to know this information before it becomes a problem.
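As a rough sketch of that hourly check, the Python below groups response times by hour and flags any hour whose mean exceeds the SLA. The sample data, the 3-second SLA, and the `mean_by_hour` helper are all assumptions for illustration, so substitute whatever your own tools export.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (hour_of_day, response_time_seconds) samples.
samples = [
    (9, 1.0), (9, 1.2),
    (10, 1.1), (10, 0.9),
    (14, 4.8), (14, 5.2),
    (15, 1.0), (15, 0.8),
]
SLA_SECONDS = 3.0  # assumed SLA threshold, not taken from any real contract


def mean_by_hour(rows):
    """Group response times by hour and return {hour: mean_seconds}."""
    buckets = defaultdict(list)
    for hour, seconds in rows:
        buckets[hour].append(seconds)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


hourly = mean_by_hour(samples)
overall = mean(seconds for _, seconds in samples)
breaches = [hour for hour, avg in hourly.items() if avg > SLA_SECONDS]

print(f"overall mean: {overall:.2f}s (SLA {SLA_SECONDS:.1f}s)")
print(f"hours over SLA: {breaches}")
```

With these made-up numbers the overall mean sits comfortably under the SLA while one afternoon hour is well over it, which is exactly the kind of averaging error worth checking for.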
Consider using the 90th or 95th percentile, which means that in 90% (or 95%) of the cases, the response time (or other metric) is at or below that value. This will more accurately reflect user experience and resource demand. Some applications have regular “spikes,” and these can make even a percentile misleading, so know your business and your application to interpret the data further. And best of all, find yourself a way to establish what is normal for your system, so you can quickly detect any deviation from normal. Software solutions can help.
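To make the percentile idea concrete, here is a small Python sketch using the nearest-rank definition, which is only one of several ways to calculate a percentile (spreadsheets and monitoring tools often interpolate instead, so their answers can differ slightly). The response times and the `percentile` helper are hypothetical.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical response times in seconds: mostly fast, with a slow tail.
response_times = [0.8, 0.9, 1.0, 1.1, 1.2] * 3 + [1.3, 1.5, 2.0, 6.0, 9.0]

print(f"mean: {mean(response_times):.2f}s")
print(f"90th percentile: {percentile(response_times, 90)}s")
print(f"95th percentile: {percentile(response_times, 95)}s")
```

In this invented data set the mean looks healthy, while the 95th percentile shows that some users wait several times longer than the average suggests.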
Get more meaning from your statistics by being skeptical about mean.