I just got asked this question for the hundredth time:

 

"I can get CPU and memory utilization graphs from performance monitors.  BMC Service Assurance also includes monitoring and alerting products.  So what's the difference between Capacity Management products and those?"

 

This question really goes to the heart of what capacity management and its close cousin, performance analysis, are all about.  The analogy I use to explain this is the sprinter with quick reflexes versus the long-distance runner with stamina.

 

Service Assurance is certainly about assuring service levels, but there are at least two aspects to it: reactive and preventive.  First, consider the reactive part.

 

A sysadmin will tell you that his greatest joy in life is a pager that does not go off at night.  Monitoring products provide the "twitchy muscles" for the sysadmin, because he is a sprinter.  They react quickly with blinking lights and beeps whenever a CPU goes over its threshold or a disk runs out of space.  These twitchy muscles have to be quick, because that's what a sprinter needs.

 

Capacity planners, on the other hand, are the long-distance runners of the IT world.  They are not primarily about solving misconfiguration problems, although they do often find them in the course of their work.  Instead, their task is preventive: they want to figure out how much CPU, memory, disk, and network capacity is needed to meet or exceed service levels, and how to maintain those service levels reliably as new services are rolled out and infrastructure is acquired or replaced.

 

The CM products help you find out how much capacity is available to meet application service levels, given the demand today, and to predict application performance in the future, and to do this on an ongoing basis: daily, weekly, and over months and years.  For the most part, the CM products assume that your network, arrays, and servers are installed correctly, and that your application is properly tuned.  The really interesting questions for the CM products come after this.

 

The questions these products answer have to do with throughput, response time, and utilization.  These questions require a deep understanding of how applications, operating systems, hypervisors, firmware, and distributed hardware behave.  The CM products give you enough insight into this behavior to be able to model it.  Generally speaking, this understanding requires effort, time, and a lot of data.  These products are the slow-twitch muscle fibers that a distance runner needs.
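
As a rough illustration of how throughput, response time, and utilization tie together, the sketch below uses two textbook queueing results: the Utilization Law (utilization = throughput x service time) and the response time of a single M/M/1 queue (service time / (1 - utilization)).  Treating a server as one M/M/1 queue is an assumption made purely for the example, the function names are hypothetical, and real capacity models are considerably richer than this.

```python
def utilization(throughput_per_sec, service_time_sec):
    """Utilization Law: fraction of time the resource is busy."""
    return throughput_per_sec * service_time_sec


def mm1_response_time(throughput_per_sec, service_time_sec):
    """Mean response time of a single M/M/1 queue (assumed model)."""
    u = utilization(throughput_per_sec, service_time_sec)
    if u >= 1.0:
        raise ValueError("demand exceeds capacity; the queue grows without bound")
    return service_time_sec / (1.0 - u)


# Hypothetical workload: 10 ms of service per request at increasing load.
for rate in (50, 80, 95):
    u = utilization(rate, 0.010)
    r = mm1_response_time(rate, 0.010)
    print(f"{rate} req/s: utilization {u:.0%}, response time {r * 1000:.0f} ms")
```

The useful lesson is the shape of the curve: response time climbs slowly at moderate utilization and then sharply as utilization approaches 100%, which is why a utilization graph alone does not tell you whether service levels are safe.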