For the CO Gateway Server/Capacity Agent, what are the recommendations for high granularity data summarization?

Version 7

    This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.


    PRODUCT:

    BMC Performance Assurance for Servers


    APPLIES TO:

    TrueSight Capacity Optimization (Gateway Server/Capacity Agent), BMC Performance Assurance for Servers



    QUESTION:

     
       The TrueSight Capacity Optimization collector can create UDR data with 1 minute intervals, and Investigate History maintains data at a 10 second summarization interval for the first hour by default. What are the recommendations, limitations, and areas of concern when trying to collect and report highly granular data in Perform through Investigate, Visualizer, Perceiver, and PrintUDR?

    • TrueSight Capacity Optimization (Gateway Server/Capacity Agent) 11.x, 10.x
    • BMC Performance Assurance for Virtual Servers 9.0, 7.5.10, 7.5.00, 7.4.10, 7.4.00, 7.3.00
    • BMC Performance Assurance for Servers 9.0, 7.5.10, 7.5.00, 7.4.10, 7.4.00, 7.3.00

    • Unix
    • Windows



    ANSWER:

     

    Legacy ID: KA318071

      
    The minimum supported spill rate/summarization interval for data analysis is 1 minute, and for Visualizer file creation through Manager it is 2 minutes. Summarization intervals below these minimums are technically impossible within the current design of the product, which eliminates them from consideration as a real-world configuration.  
     
    The Perform product is functionally capable of carrying 2 minute summarization interval data from data collection through transfer, processing, population, and visualization in Perceiver and Visualizer. However, there are usability and performance implications of processing high granularity data through the Perform product as the regularly scheduled Manager run.  
     
    The primary areas of concern are:   
        
    • Visualizer does not handle high granularity data well. When there are too many data points on a graph, the colored lines become black areas in the charts rather than discernible lines.
    • Disk, network, and CPU resources required to collect, transfer, process, and store UDR data scale roughly linearly with the sampling frequency, that is, inversely with the summarization interval. A 5 minute summarization interval will require 3 times the resources of a 15 minute summarization interval, and a 1 minute interval will require 15 times the resources (see the sketch after this list). For a large environment, even a 5 minute interval has considerable implications for the scale of the hardware needed to process and maintain the daily UDR data (particularly within a typical 8 hour nightly processing window), and sufficient spare resources must be reserved to recover from any unexpected nightly data processing problems.
    • There will likely be increased management costs associated with a higher granularity summarization interval. The environment will require more careful management because there is less room for error: there generally are not enough spare processing resources on the console to recover from more than a short data processing outage, and UDR data left on a remote node by a failed data transfer must be dealt with quickly (or automatically deleted) to avoid filling up the file system on the remote node, which will frequently cause data collection to fail from that point forward until the Perform Agent is restarted.
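
    As a rough illustration of the scaling described in the second bullet, the sketch below computes the relative resource multiplier for a few summarization intervals against a 15 minute baseline. The baseline interval, the candidate intervals, and the assumption of strictly linear scaling are illustrative only and are not taken from the product documentation.

        # Rough sketch of the scaling described above: resource cost is assumed to
        # scale linearly with the number of samples collected (inversely with the
        # summarization interval). The 15 minute baseline is an illustrative assumption.

        BASELINE_MINUTES = 15

        def relative_cost(interval_minutes: float, baseline_minutes: float = BASELINE_MINUTES) -> float:
            """Approximate resource multiplier versus the baseline summarization interval."""
            return baseline_minutes / interval_minutes

        for interval in (15, 5, 2, 1):
            print(f"{interval:>2} minute interval: ~{relative_cost(interval):.1f}x the resources "
                  f"of a {BASELINE_MINUTES} minute interval")
        # Prints multipliers of 1.0x, 3.0x, 7.5x, and 15.0x respectively.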
       
    So, high granularity data is technically feasible, but it is significantly more costly in hardware and management resources and should not be pursued without careful consideration of the implications and costs. I believe that a 2 minute interval over a large number of machines would be prohibitively expensive to manage (in terms of both hardware and staffing resources). I know of one moderately large environment running 5 minute intervals, and they are currently experiencing considerable difficulty meeting the CPU and I/O subsystem requirements.  
     
    For high granularity and long period retention of Investigate History data, the primary limitation will be the 2 GB UDR file size limit. 45 days at 10 second samples is about 390,000 samples, so for some groups (such as Process Statistics) it would be impossible to maintain this much history. For basic groups (such as the System Statistics group) it would be possible and would probably work well. But for groups with the potential for many records per sample (Disk Statistics, File System Statistics, and so on) there is a significant risk of hitting the 2 GB file size limit. The problem is that once that limit is reached, Perform data collection for that group will fail until the Perform Agent is restarted (for both Investigate history and daily capacity planning data collection).  
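
    As a back-of-the-envelope check on the 2 GB limit, the sketch below estimates UDR file growth for 45 days of 10 second samples. The bytes-per-record and records-per-sample figures are hypothetical placeholders chosen only to show why single-record groups are safe while many-record groups are at risk; they are not measured or documented product values.

        # Back-of-the-envelope estimate of UDR file growth at 10 second samples.
        # The bytes-per-record and records-per-sample figures below are hypothetical
        # placeholders for illustration; they are not documented product values.

        SECONDS_PER_DAY = 24 * 60 * 60
        SAMPLE_INTERVAL_SECONDS = 10
        RETENTION_DAYS = 45
        UDR_FILE_SIZE_LIMIT_BYTES = 2 * 1024**3   # the 2 GB UDR file size limit

        samples = RETENTION_DAYS * SECONDS_PER_DAY // SAMPLE_INTERVAL_SECONDS
        print(f"Samples retained: {samples:,}")   # 388,800, i.e. roughly 390,000

        # (records per sample, bytes per record) -- assumed values for illustration only
        groups = {
            "System Statistics": (1, 200),
            "Disk Statistics": (50, 200),
            "Process Statistics": (500, 200),
        }

        for name, (records, record_bytes) in groups.items():
            estimated_bytes = samples * records * record_bytes
            verdict = "exceeds" if estimated_bytes >= UDR_FILE_SIZE_LIMIT_BYTES else "stays within"
            print(f"{name}: ~{estimated_bytes / 1024**3:.1f} GB, {verdict} the 2 GB limit")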
     
    For a select set of groups (such as System Statistics and CPU Statistics), I think 10 second samples for 45 days would probably work. But it is something you would certainly want to test in a lab environment first. That data would not be accessible to Analyze, Predict, or Visualizer, so it would generally be necessary to implement a custom reporting infrastructure using a tool like PrintUDR for data extraction.  
      

    PrintUDR

    When accessing high granularity UDR data files, the best practice recommendation is to run PrintUDR against daily capacity planning UDR data. The reason is that PrintUDR can access these UDR data files directly, which avoids the whole "UDR data file -> bgsagent -> network -> printudr" path and reduces it to the simpler "UDR data file -> printudr" path. However, this requires the submission of a special, separate collection request to create the daily capacity planning UDR data files at the desired summarization level and time period. That then allows the use of the '-r [repository]' flag to access them. 
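    A minimal sketch of this direct-file approach is shown below, assuming a separate collection request has already produced daily capacity planning UDR data in a dedicated repository directory. Only the '-r [repository]' flag is taken from this article; the executable path, repository location, and output handling are illustrative assumptions, and any further selection options would need to be taken from the PrintUDR documentation.

        # Sketch: drive PrintUDR against daily capacity planning UDR files directly,
        # avoiding the "UDR data file -> bgsagent -> network -> printudr" path.
        # The printudr location and repository path are illustrative assumptions;
        # only the '-r [repository]' flag is taken from this article.
        import subprocess
        from pathlib import Path

        PRINTUDR = Path("/opt/bmc/perform/bin/printudr")            # assumed install path
        REPOSITORY = Path("/data/perform/highres_udr_repository")   # assumed repository
        OUTPUT = Path("/tmp/highres_extract.txt")

        # Build the command; group and time-range selection options would be
        # appended here per the PrintUDR documentation.
        cmd = [str(PRINTUDR), "-r", str(REPOSITORY)]

        with OUTPUT.open("w") as out:
            subprocess.run(cmd, stdout=out, check=True)
        print(f"Extracted UDR data written to {OUTPUT}")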

    If that is not an option, the standard way to access the Investigate History data is through the bgsagent (that is, without the '-r [repository]' flag). But the agent has not been designed to transfer huge volumes of data over the network: it expects to receive and respond to the type of request that the Investigate GUI would make against it, not a large dump of a 24 hour period of data with 1 minute summarization. 

    Another non-standard way to access the Investigate History UDR data via PrintUDR is to run PrintUDR directly against the Investigate History repository (-r [repository]). Note that this is a non-standard and officially unsupported usage of PrintUDR, because it results in PrintUDR accessing UDR data files while they are open and being actively updated by the bgsagent. However, it is a simpler data extraction method than relying on the bgsagent process to provide the data to PrintUDR, so it may be a configuration worth investigating, particularly if bgsagent data access does not provide sufficient stability when extracting a large volume of data. 

    Subject Matter Experts who have reviewed the three PrintUDR data extraction options strongly prefer extraction from special capacity planning UDR data collection requests with PrintUDR configured to extract data from the UDR data files directly.  

    Note on Investigate History Repository Access by PrintUDR

    Note that when PrintUDR accesses an Investigate History repository, the data returned may be incomplete if there are multiple UDR data files at a single summarization level (for example, multiple L1, L2, or L3 summarization level UDR data files) left behind by an Investigate History data retention period change without a cleanup of the Investigate History directory. This occurs because PrintUDR picks one of the files to represent the summarization level in the extraction, and that file might not be the one that contains the data currently being collected. This is the same behavior often seen in the Investigate UI when attempting to access historical data in an Investigate History repository with stale UDR data files associated with old data summarization configurations. 

      

     


    Article Number:

    000031193


    Article Type:

    FAQ/Procedural


