
Capacity Management

5 Posts authored by: Sudheer Apte


In reporting memory usage for VMware VMs, a perennial question we get is: should we be looking at "active" memory, or "consumed" memory for a VM?

 

And the answer, as it so often is, is "it depends"--- it depends on what exactly you are trying to measure.

 

If you want to measure how efficiently you are using your VMware environment's memory resources to run workloads, then you want to compare the total memory that your workloads have been given (given) against the total available memory resources on your hosts (avail).  If you are just starting to virtualize your workloads, you would expect this efficiency number given/avail to be much less than 1.  As you pack in more and more workloads, you would expect it to rise close to 1.  And as you become an expert at managing your environment and leveraging the capabilities of VMware ESX, it could exceed 1.  (I will explain how it can exceed 1 later.)

 

In the above calculations, for the quantity given, i.e., the amount of memory given to the workloads, VMware publishes two counters through its API that can be used:

  • memory granted per VM
  • memory consumed per VM

 

Either of these could be summed over all the VMs to compute given; the only difference is that the "consumed" counter subtracts some kinds of memory savings achieved by ESX, so if you use "consumed" instead of "granted", you get a somewhat lower efficiency number. Just pick one of these counters and use it consistently.  For the quantity avail, we can use the total host memory sizes of all the hosts.

 

You can compute this efficiency measure on a per host basis, too, using the same counters.
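
As a concrete illustration, here is a minimal Python sketch of the calculation. It assumes the per-VM "granted" (or "consumed") values and the per-host memory sizes have already been pulled out of the vSphere API into plain dictionaries; the names and numbers below are invented for the example.

    # Per-VM memory given (MB), taken from the "granted" (or "consumed") counter,
    # and the host each VM runs on; sample values only.
    vm_given_mb = {"vm01": 4096, "vm02": 8192, "vm03": 16384}
    vm_host = {"vm01": "esx1", "vm02": "esx1", "vm03": "esx2"}

    # Total physical memory (MB) of each host.
    host_memory_mb = {"esx1": 16384, "esx2": 16384}

    # Environment-wide efficiency: given / avail.
    given = sum(vm_given_mb.values())
    avail = sum(host_memory_mb.values())
    print("overall efficiency: %.2f" % (given / avail))

    # The same measure, per host.
    for host, size in host_memory_mb.items():
        given_on_host = sum(mb for vm, mb in vm_given_mb.items()
                            if vm_host[vm] == host)
        print("%s efficiency: %.2f" % (host, given_on_host / size))

Whether you feed "granted" or "consumed" into vm_given_mb only shifts the result somewhat lower, as noted above; the important thing is to pick one counter and stick with it.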

 

On the other hand, if you want to measure whether a workload is actually using the memory it has been given, then you want to compare the memory the VM is actively using (the "active" memory counter reported by VMware) to the guest memory size or to memory granted.  This is a much narrower question about a single VM's behavior during a short period of time.
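
For that narrower question, the calculation is just a per-VM ratio. A tiny sketch in the same spirit, again with invented values:

    # Per-VM "active" and "granted" memory (MB) over the same sample period.
    vm_active_mb = {"vm01": 512, "vm02": 6144, "vm03": 1024}
    vm_granted_mb = {"vm01": 4096, "vm02": 8192, "vm03": 16384}

    for vm in sorted(vm_granted_mb):
        ratio = vm_active_mb[vm] / vm_granted_mb[vm]
        print("%s is actively using %.0f%% of its granted memory" % (vm, 100 * ratio))

A VM that consistently shows a low ratio is a candidate for right-sizing, but since "active" is a short-term estimate, look at it over many periods before acting on it.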

 

So, to summarize:

 

  • For measuring overall efficiency, look at memory consumed (or granted).
  • For analyzing memory usage of individual VMs during a short period, look at memory active.

 

As for how efficiency can exceed 1: VMware ESX includes several memory management features that optimize the use of the host's memory resources to get higher VM densities. Three of these features (transparent page sharing, ballooning, and swapping) are explained in detail in the VMware paper Understanding Memory Resource Management in VMware ESX Server (PDF). These features make it possible to run VMs on less real memory than they have been granted in total, making the efficiency measure greater than 1.



So, what's a good value to use for disk I/O thresholds?


First, the concept of I/O operations: each OS-level I/O is issued as a SCSI command, and the SCSI command, whether it is a read, a write, or a control command, is the unit of work in the I/O stack.  Throughput is expressed as the number of I/O operations per second, or IOPS, so I/O thresholds are easiest to express in terms of IOPS.  For our purposes, we can ignore variable-size I/Os and the tricks that can be played to combine multiple I/Os and speed up I/O processing, and assume that you can get hold of an operating system metric that gives you average IOPS over a period.


Now, the operating system can also monitor the total time taken for the operation through the I/O stack.  Different operating systems, of course, call it by different names, but many do report average response time over a period if asked to.  For a given setup, the average response time of an I/O operation can give an indication of whether there is congestion in the I/O stack or not.


Similarly, there can be some measure of the queue length for I/O operations, i.e., the average number of operations waiting to be executed---anything over 1 indicates some congestion.  Note that there may be other reasons for congestion--- there may be something wrong with the I/O stack, or there may be other I/O being done on the stack, e.g., an array may be shared with some other large workload.  But barring these, it's possible to put together a model of what the maximum I/O threshold should be.
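
As a rough illustration of what such a model can look like, here is a small Python sketch that uses a single-queue (M/M/1-style) approximation: given an average per-operation service time and a response-time target, it estimates the IOPS level beyond which the target would be exceeded. The service time and target below are assumptions chosen for the example, not vendor figures.

    # A deliberately simple single-queue (M/M/1-style) model:
    #   utilization     rho = iops * service_time
    #   response time   R   = service_time / (1 - rho)
    # Solving R <= target for iops gives a rough per-queue IOPS threshold.

    def response_time(iops, service_time_s):
        """Average response time (seconds) at a given IOPS rate."""
        rho = iops * service_time_s
        if rho >= 1.0:
            return float("inf")   # the queue is saturated
        return service_time_s / (1.0 - rho)

    def max_iops(service_time_s, target_response_s):
        """Highest IOPS rate that keeps response time under the target."""
        return (1.0 - service_time_s / target_response_s) / service_time_s

    # Example: a disk with ~5 ms average service time and a 20 ms target.
    svc = 0.005
    print(max_iops(svc, 0.020))      # -> 150 IOPS for a single queue
    print(response_time(100, svc))   # -> 0.01 s at 100 IOPS

Real stacks have multiple spindles, caches, and mixed I/O sizes, so treat this as a back-of-the-envelope starting point rather than a definitive threshold.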


Nowadays, with the virtualization of servers, there is renewed focus on disk I/O as a shared, scarce resource.  The hypervisor vendors have responded with pretty good instrumentation in this area.  IBM has always had excellent instrumentation, and VMware has improved its coverage with vSphere 4.1. VMware calls the response time "disk latency".  As for what latency values to use, we can give some rough guidelines here (your mileage may vary).

 

In practice, if the hosts are not pegged, command latencies in our experimental setups are generally governed by the disk speed and the I/O transport.  One lab setup uses a shared Fibre Channel SAN with LUNs shared between only a few (< 10) ESX hosts at a time.  We have a lightly loaded CLARiiON CX family array (a CX4-120) situated close to the servers in terms of hops, with large striped RAID groups and a 15-way DAE with fast disks.


Under these circumstances, total read or write command latencies should be well below 10 ms, showing up as zero or close to zero in the VMware infrastructure.  They will increase if you add a significant amount of sustained, random I/O load to the same RAID group, either from one of the same servers or from a different server, but ordinarily this setup is too powerful to break a sweat.


A cheaper way to see higher command latencies is to use a directly-attached disk.  The ESX host is probably booting off a local SCSI disk.  If you can find such a disk in vCenter and create a VM hosted on a datastore on that disk, you can get nice numbers in the 20-30 millisecond range even with a light load, depending upon the type of disk.  You can push it past 50-60 ms by pounding away at it with Iometer.



Capacity reports are often trying to answer this simple question: how much of my capacity is being used, and how much is available?


It's intuitive to report that memory or CPU is 50% utilized, because that single number implies we know both the used and the total capacity.  But the same level of information is not available for disk I/O or network I/O.  What does it mean that a server is using 30 kB/second of disk I/O?  How much more I/O capacity is available before it runs out?


There are ways to model the maximum I/O capacity, but they require a lot more knowledge about the disk subsystem than is usually available.  You need to know what kind of transport (SCSI, iSCSI, Fibre Channel, NFS, etc.) is being used, across how many elements, and how the disk subsystem is organized at the other end.  In addition, if it's a shared disk subsystem like an array, then you also have to know what other I/O traffic is incident upon it.


If you used the above information to create a model, you could theoretically come up with a maximum capacity.  Then you could use that as the threshold against which to compare the actual I/O for reporting.  But since that avenue is closed for all practical purposes, capacity planners (or the software they use) struggle to find a maximum value to report against.


There is a "fake" method of reporting, which looks at a large collection of servers and simply uses the maximum I/O number as the threshold.  This is mere fudging of the numbers. It may make the report look nice, and many products do indeed do this, but it's of limited use.


There's the obvious, thorough way of finding the maximum capacity: use I/O generation programs like Iometer to generate large amounts of I/O, and find out at what point the response time for a transaction exceeds a usable threshold like 10 seconds.  But this method, while it would work, is impractical in production environments.  Besides, it still doesn't account for all the possible external factors (like other servers doing I/O to a shared disk subsystem) that could affect the capacity of your server.
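
If you did run such a load test, extracting the threshold from the results would be the easy part. Here is a minimal sketch, assuming you have a table of (offered IOPS, measured response time) pairs from a sweep with a tool like Iometer; the numbers are invented.

    # (offered IOPS, average response time in seconds) pairs from a load sweep.
    sweep = [(100, 0.004), (500, 0.006), (1000, 0.012), (2000, 0.080), (2500, 12.5)]

    RESPONSE_LIMIT_S = 10.0   # the "usable" response-time threshold from the text

    def max_usable_iops(sweep, limit_s):
        """Largest offered IOPS whose measured response time stays under the limit."""
        usable = [iops for iops, resp in sweep if resp < limit_s]
        return max(usable) if usable else 0

    print(max_usable_iops(sweep, RESPONSE_LIMIT_S))   # -> 2000

The hard part, as noted above, is that the answer is only valid for the load mix you generated and for whatever else the shared disk subsystem was doing at the time.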


There are other techniques that are commonly used, which I will explain in a subsequent blog post.




With VMware resource pools, you can delegate capacity management to other admins.  Resource pools let you think about your ESX cluster as a single unit of resources, without worrying about host boundaries too much.  But they do have their peculiarities, which are a rich source of FAQs.

 

A frequent set of questions is about the four parameters you can set on resource pools.

 

The principle is actually straightforward:

  • Limit and Shares: control the growth of VMs, i.e., how much capacity VMs can use.
  • Reservation and Expandable: control admission, i.e., whether new VMs can be powered on or not.

 

Let's see how this principle works, with some example questions from customers:

 

Question 1.  I set Expandable = false, Limit = 150GB, Reservation = 150GB.  I already have 2 guests in the pool with a total allocated size of 120GB.  So now, can I power on a new VM of allocated size 50GB?

 

Answer: No, because the pool has only 30GB of reserved capacity left, and it cannot borrow from its parent.

 

Question 2. Same as Question 1, but I increase the Limit to 250GB.  Now can I power on a new VM of allocated size 50GB?

 

Answer: No, because it still cannot borrow from its parent.  The Limit setting has nothing to do with it; Limit controls how much the existing two VMs can grow, provided their host has enough memory.

 

Question 3. Same as Question 1, but I set Expandable = true. Now can I turn on a new 50GB VM?

 

Answer: If the parent pool has at least 20GB of reserved capacity available, then the pool can borrow that on top of its own remaining 30GB reserved capacity, and let the new VM power on.
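
The admission rule behind these three answers can be written down almost directly. Here is a minimal Python sketch; the Pool class and the numbers are invented for illustration, and real vSphere admission control also accounts for virtualization overhead and per-VM reservation settings.

    class Pool:
        def __init__(self, reservation_gb, used_gb, expandable, parent=None):
            self.reservation_gb = reservation_gb   # the pool's Reservation setting
            self.used_gb = used_gb                 # reservation already claimed by running VMs
            self.expandable = expandable           # the Expandable flag
            self.parent = parent

        def unreserved_gb(self):
            return self.reservation_gb - self.used_gb

        def can_admit(self, needed_gb):
            """Can a VM needing needed_gb of reservation power on in this pool?"""
            if needed_gb <= self.unreserved_gb():
                return True
            if self.expandable and self.parent is not None:
                # Borrow the shortfall from the parent (which may in turn borrow).
                return self.parent.can_admit(needed_gb - self.unreserved_gb())
            return False

    # Question 1: Reservation = 150GB, 120GB already reserved, Expandable = false.
    parent = Pool(reservation_gb=500, used_gb=400, expandable=False)
    q1 = Pool(reservation_gb=150, used_gb=120, expandable=False, parent=parent)
    print(q1.can_admit(50))   # -> False: only 30GB left, cannot borrow

    # Question 3: the same pool with Expandable = true; the parent has 100GB free.
    q3 = Pool(reservation_gb=150, used_gb=120, expandable=True, parent=parent)
    print(q3.can_admit(50))   # -> True: 30GB locally plus 20GB borrowed from the parent

Note that Limit appears nowhere in the admission check, which is exactly the point of the answer to Question 2.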

 

If you're not doing anything fancy, then here's some advice: just leave Expandable and Limit alone.  Their defaults are true and unlimited, respectively.  This lets the pool borrow from its parent if there is capacity, and lets VMs grow if the resources are available.  If you want to make sure a certain business application gets preferential treatment, then set Reservation on its pool to a high number.



 

I just got asked this question for the hundredth time:

 

"I can get CPU and memory utilization graphs from performance monitors.  BMC Service Assurance also includes monitoring and alerting products.  So what's the difference between Capacity Management products and those?"

 

This question really goes to the heart of what capacity management and its close cousin, performance analysis, are all about.  The analogy I use to explain this is the sprinter with quick reflexes versus the long-distance runner with stamina.

 

Service Assurance is certainly about assuring service levels, but there are at least two aspects to it: reactive and preventive.  First, consider the reactive part.

 

A sysadmin will tell you that his greatest joy in life is not having his pager go off at night.  Monitoring products provide the "twitchy muscles" for the sysadmin, because he is a sprinter.  They react quickly, with blinking lights and beeps, whenever a CPU goes over threshold or a disk runs out of space.  These twitchy muscles have to be quick, because that's what a sprinter needs.

 

Capacity planners, on the other hand, are the long-distance runners of the IT world.  They are not primarily about solving misconfiguration problems, although they do often find them in the course of their work.  Instead, their task is preventive: they want to figure out how much CPU, memory, disk, network is needed to meet or exceed service levels, and how to maintain those service levels reliably as new services are rolled out and infrastructure is acquired or replaced.

 

The CM products help you find out how much capacity is available to meet application service levels, given the demand today, and to predict application performance in the future---and to do this on an ongoing basis, daily, weekly, and for months and years.  For the most part, the CM products assume that your network, array, and server are installed correctly, and that your application is properly tuned.  The really interesting questions for the CM products come after this.

 

The questions these products answer have to do with throughput, response time, and utilization.  These questions require a deep understanding of how applications, operating systems, hypervisors, firmware, and distributed hardware behave.  The CM products give you enough insight into this behavior to be able to model it.  Generally speaking, this understanding requires effort, time, and a lot of data.  These products are the slow-twitch muscle fibers that a distance runner needs.
