In TSCO, VMware Cluster CPU and Memory utilization is reporting over 100% periodically

Version 1

    This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.


    PRODUCT:

    TrueSight Capacity Optimization


    COMPONENT:

    Capacity Optimization


    APPLIES TO:

    TSCO 10.3, 10.5, 10.7, 11.0, 11.3.01, 11.5, 11.5.01



    DETAILS:

     

    CPU

    VMware cluster-level CPU utilization is periodically reported over 100%. The periods where cluster CPU is over 100% in TrueSight Capacity Optimization (TSCO) appear to correspond to times when vCenter's cluster CPU charts also report over 100% utilization.

     

      

    [Figure: TSCO Cluster details]

    [Figure: corresponding vSphere Web Client performance charts]

     

      

    Memory

      
    Similar behavior can be seen in the Memory Utilization reported for VMware clusters. For example, the chart below shows unexpected peaks over 100% in Memory Utilization for a VMware cluster in TSCO.

    [Figure: example TSCO Memory Utilization chart for a VMware cluster]

    There are two possible reasons why the values do not match between TrueSight Capacity Optimization (TSCO) and vCenter (CPU utilization > 100%):
       
    1. There are minor inaccuracies in the measurement of each VM (this is a known problem, acknowledged by VMware). In a very busy system, these inaccuracies add up to more than 10% of CPU at the cluster level. vCenter caps utilization at 100%, so utilization over 100% is never seen in vCenter. TSCO does not cap at 100%, since capping would mask any problem that exists in data collection.
    2. TSCO and VMware use different formulas for calculating cluster CPU utilization. TSCO computes cluster CPU utilization by summing the VMs' CPU usage in MHz and dividing by the cluster's effective CPU MHz:
       SUM(VM.cpumhz) / cluster.effective_cpu_mhz
       This formula is consistent with VMware documentation, but there is a small possibility that VMware changed the formula in the latest vCenter update.
       The same consideration applies to memory. The formula is:
       SUM(VM.memconsumed + VM.memoverhead) / cluster.effective_mem
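
    The two formulas above can be sketched in Python as follows. This is only an illustration: the class, the function names, and the sample values are hypothetical and are not taken from TSCO or vCenter. It shows how an uncapped ratio can exceed 100% when the summed VM values are larger than the cluster's effective capacity, and how a cap at 100% would hide that.

```python
from dataclasses import dataclass


@dataclass
class VM:
    cpu_mhz: float       # stands in for VM.cpumhz
    mem_consumed: float  # stands in for VM.memconsumed
    mem_overhead: float  # stands in for VM.memoverhead


def cluster_cpu_util(vms, effective_cpu_mhz):
    """SUM(VM.cpumhz) / cluster.effective_cpu_mhz, as an uncapped percentage."""
    return 100.0 * sum(vm.cpu_mhz for vm in vms) / effective_cpu_mhz


def cluster_mem_util(vms, effective_mem):
    """SUM(VM.memconsumed + VM.memoverhead) / cluster.effective_mem, uncapped."""
    total = sum(vm.mem_consumed + vm.mem_overhead for vm in vms)
    return 100.0 * total / effective_mem


def capped(util_pct):
    """What a display capped at 100% would show for the same value."""
    return min(util_pct, 100.0)


# Hypothetical sample values, chosen so the sums exceed the effective capacity.
vms = [
    VM(cpu_mhz=4200.0, mem_consumed=8192.0, mem_overhead=128.0),
    VM(cpu_mhz=3900.0, mem_consumed=6144.0, mem_overhead=96.0),
    VM(cpu_mhz=2300.0, mem_consumed=4096.0, mem_overhead=64.0),
]

cpu = cluster_cpu_util(vms, effective_cpu_mhz=10000.0)
mem = cluster_mem_util(vms, effective_mem=18000.0)

print(f"CPU util: {cpu:.1f}% (capped: {capped(cpu):.1f}%)")
print(f"Memory util: {mem:.1f}% (capped: {capped(mem):.1f}%)")
```

    With these made-up inputs both ratios come out above 100%, while the capped values stay at 100%.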

    Additionally, cluster CPU utilization values over 100% are commonly seen in the vSphere Web Client, though typically not to the extent shown in the example charts above. For example, in one in-house environment the cluster was saturated, so utilization hovered around 100%, usually in the 100% - 103% range with a spike up to 108%. So it does appear that the vCenter data is able to report over 100% utilization at the cluster level.

    Note that in the CPU chart above, the TSCO data correlates well with the vCenter data: in the two places where TSCO shows a spike over 100%, the values in the vCenter chart also exceed 100% (the chart is capped at 100%, but you can see that the values have exceeded it).
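
    One way to check this kind of correlation outside the charts is to list the intervals where each data source exceeds 100% and compare them. The sketch below is a minimal illustration that assumes both series have already been exported as (timestamp, utilization) pairs; the function name and all sample values are hypothetical.

```python
def intervals_over_100(series):
    """Return the timestamps whose utilization exceeds 100%."""
    return [ts for ts, util in series if util > 100.0]


# Hypothetical exported samples: (timestamp, cluster CPU utilization in %).
tsco = [("09:00", 96.0), ("10:00", 103.5), ("11:00", 98.2), ("12:00", 108.0)]
vcenter = [("09:00", 95.1), ("10:00", 101.2), ("11:00", 97.9), ("12:00", 104.6)]

tsco_spikes = intervals_over_100(tsco)
vcenter_spikes = intervals_over_100(vcenter)

# Timestamps where both sources report more than 100%.
shared = sorted(set(tsco_spikes) & set(vcenter_spikes))
print("Over 100% in both sources:", shared)
```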

    Q: What is CPU utilization in that chart?  Is it average, peak, 95th percentile?

    In the Views, when a single number is shown in a table for a metric such as CPU utilization, it is generally the 95th percentile. For example, in the Cluster Details chart, the "CPU Util" bar chart value of 92% is the 95th percentile CPU utilization for that period. A time series chart, however, reports the AVERAGE CPU utilization. That is essentially the measured data coming from vCenter, and you can create the same chart directly in TSCO by charting the CPU_UTIL metric for the cluster.

    So, the values in the Summary table are 95th percentiles, but clicking into a particular cluster opens a Data Explorer view, which reports the measured CPU_UTIL values obtained from vCenter (not a 95th percentile of the collected data).
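
    The difference between the two kinds of number can be illustrated with a small Python sketch. It assumes a nearest-rank percentile; the article does not state which percentile method TSCO uses, so that choice, the function name, and the sample values are all assumptions made for illustration.

```python
import math
from statistics import mean


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile (an assumed method, not necessarily TSCO's)."""
    ordered = sorted(samples)
    # Integer arithmetic on pct * n avoids floating-point surprises in the rank.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical CPU_UTIL samples (%) for one cluster over one period.
cpu_util_samples = [62.0, 70.0, 75.0, 81.0, 88.0, 90.0, 92.0, 95.0, 101.0, 104.0]

avg = mean(cpu_util_samples)                          # average of the period
p95 = percentile_nearest_rank(cpu_util_samples, 95)   # summary-table style value

print(f"Average: {avg:.1f}%")
print(f"95th percentile: {p95:.1f}%")
```

    For these samples the 95th percentile is well above the average, which is why a single summary number and a time series chart of the same metric can look different.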

      

     


    Article Number:

    000131049


    Article Type:

    Product/Service Description


