TrueSight Capacity Optimization (TSCO) - How to interpret "Virtual Memory MB" on Visualizer "Process Memory Metrics" graphic

Version 2

    This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.


    PRODUCT:

    BMC Performance Assurance


    APPLIES TO:

    TrueSight Capacity Optimization TSCO



    QUESTION:

    (1) Why does the IBM/AIX API report less "Virtual" memory than "Real" memory, as seen through the BPA/Performance Assurance Agent?

    (2) What does "Virtual" memory represent for a Linux process?

     



    LP: BMC Performance Assurance 9.5.00 (and earlier)
    DR: BMC Performance Assurance 9.5.00

    Details: On the 'Graphics->Process->Process Memory Metrics' view in Visualizer the value of the 'virtual memory metric' is lower than the value of the 'real memory metric' on AIX systems for a given process, which is not possible.

    Issue Summary: Real / virtual memory metrics look wrong on 'process memory graphic' in Visualizer for AIX systems.

     


    ANSWER:

     

    Legacy ID:KA422138

      

    First, let's look at how Virtual and Real Memory are reported for a process in Visualizer: which metrics are used, and what they represent.

      


    In the Visualizer database, these metrics are from the CAXPROCD (PROCD) table:

    Real Memory Total MB:    TOT_MEM
    Virtual Memory MB:       VIRTU_SIZE

      

    The Visualizer Graphics -> Process -> Process Memory Metrics chart displays these values from the database.

      

    The Analyze component assigns the values for these database fields using the Process_Statistics udr metrics RSS and SIZE, respectively.  As always, the processes selected for reporting via the Visualizer database are the 30 processes that consumed the most CPU during the selected interval.

      

    To interpret the reported values, you need to know, for each operating system platform (and possibly version), which API metrics supply the values for RSS and SIZE and how the operating system vendor documents those metrics.

      

     

      

     

      

    AIX PLATFORM EXAMPLE

      



    The reported observation was that Virtual is smaller than Real Memory for an AIX process.  When results for another AIX server were checked, Virtual was always larger than Real Memory.  

      

    For the server in question, selecting one oracle process 10616916 on 31 March 2014 at 12:00 AM, the values from the Visualizer Process Memory Metrics graphic are:

      



    Real Memory Total MB     71.59
    Virtual Memory MB       161.68

      



    These correspond exactly with these Process_Statistics metrics (in KB instead of MB):

      



    RSS      73304
    SIZE   165560
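
    The correspondence can be checked with a few lines of Python. This is a minimal sketch that assumes the conversion is a plain division by 1024 (1 MB = 1024 KB); the helper name kb_to_mb is made up for illustration.

```python
def kb_to_mb(kb: int) -> float:
    """Convert a KB value to MB, rounded to two decimals as shown in the graphic."""
    return round(kb / 1024, 2)


# Process_Statistics values (KB) for oracle process 10616916 from this article.
rss_kb, size_kb = 73304, 165560

print(f"Real Memory Total MB  {kb_to_mb(rss_kb):.2f}")
print(f"Virtual Memory MB     {kb_to_mb(size_kb):.2f}")
```

    The same division reproduces the Linux figures later in this article: 537768 KB and 2028872 KB give 525.16 MB and 1981.32 MB.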

      



    Other memory metrics for this process are:

      



    Private RSS     3288
    DRSS            3288
    TRSS           70016
    Shared RSS     70016
    ShmRSS             0
    StkRSS             0

      

    For this process, RSS is smaller than SIZE.

      

      

    So where are the RSS and SIZE udr metrics gathered from on AIX, and how are they defined?

      

    =================================================================================================================

      

    The AIX header file describes the field behind the SIZE metric of Process_Statistics as:

      

    unsigned long   pi_size;        /* size of image (pages) */ 

      

    which is then converted from pages to KB by the collector. 
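
    A minimal sketch of such a pages-to-KB conversion is shown below. The collector's actual code is not shown in this article; the function name, the 4096-byte page size, and the page count of 227 are assumptions chosen only for illustration. The page size of a given system can be queried, for example with os.sysconf.

```python
import os


def pages_to_kb(pages: int, page_size_bytes: int = 4096) -> int:
    """Convert a page count (such as pi_size) to KB for a given page size."""
    return pages * page_size_bytes // 1024


# Hypothetical page count with an assumed 4 KB page size.
print(pages_to_kb(227))

# Page size of the machine running this sketch (Unix-like systems).
print(os.sysconf("SC_PAGE_SIZE"))
```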

      

     

      

    Using the standard UNIX commands ps v and ps -l, we can compare the values that AIX reports for RSS and SZ with those from the data collector:

      

    $ ps v
        PID    TTY STAT  TIME PGIN  SIZE   RSS   LIM  TSIZ   TRS %CPU %MEM COMMAND
     352404  pts/1 A     0:00    2   672   956 32768   240   284  0.0  0.0 /bin/ksh
     610392  pts/1 A     0:00    0   652   936 32768   240   284  0.0  0.0 -ksh
     651306  pts/1 A     0:00    0   740   844 32768    79   104  0.0  0.0 ps v

      

     

      

    $ ps -l
           F S UID    PID   PPID   C PRI NI ADDR    SZ    WCHAN    TTY  TIME CMD
      200001 A 357 352404 610392   1  60 20 7ba0b400   908           pts/1  0:00 ksh
      200001 A 357 512242 352404   6  63 20 7174a400   820           pts/1  0:00 ps
      240001 A 357 610392 569448   0  60 20 c309c400   888           pts/1  0:00 ksh

      

     

      

    Look was used to report the values from the data collector dynamically, at about the same time as ps was run.  For example, for the process with PID 352404:

      

     

      

        PID                            = 352404
        RSS                            = 956
        SIZE                           = 908

      

     

      

    The data collection results match those of ps v (RSS) and ps -l (SZ).
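
    The comparison can be reproduced with a short Python sketch that parses the sample output above, taking RSS from ps v and SZ from ps -l. The column positions follow the headers shown in this article; because the WCHAN column is empty in the ps -l sample, SZ is read by position from the left. The function names are illustrative only.

```python
PS_V = """\
352404  pts/1 A     0:00    2   672   956 32768   240   284  0.0  0.0 /bin/ksh
610392  pts/1 A     0:00    0   652   936 32768   240   284  0.0  0.0 -ksh
"""

PS_L = """\
200001 A 357 352404 610392   1  60 20 7ba0b400   908           pts/1  0:00 ksh
240001 A 357 610392 569448   0  60 20 c309c400   888           pts/1  0:00 ksh
"""


def rss_from_ps_v(text: str) -> dict:
    """Map PID -> RSS (KB); columns are PID TTY STAT TIME PGIN SIZE RSS ..."""
    rows = (line.split() for line in text.splitlines() if line.strip())
    return {int(f[0]): int(f[6]) for f in rows}


def sz_from_ps_l(text: str) -> dict:
    """Map PID -> SZ (KB); columns are F S UID PID PPID C PRI NI ADDR SZ ..."""
    rows = (line.split() for line in text.splitlines() if line.strip())
    return {int(f[3]): int(f[9]) for f in rows}


rss, sz = rss_from_ps_v(PS_V), sz_from_ps_l(PS_L)
collector = {"PID": 352404, "RSS": 956, "SIZE": 908}  # values reported by Look

pid = collector["PID"]
print(collector["RSS"] == rss[pid], collector["SIZE"] == sz[pid])
```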

      

    The documentation for these metrics (from man ps) is:

       RSS   (v flag) The real-memory (resident set) size of the process (in 1 KB units).

       SZ    (-l and l flags) The size in 1 KB units of the core image of the process.

       SIZE  (v flag) The virtual size of the data section of the process (in 1 KB units).

    ======================================================================================================================

      

     

      

    The conclusion is that the AIX data collector is reporting exactly what ps is reporting, and for some processes, SZ is less than RSS.  Logically, this is unexpected: SZ should be the maximum amount of memory that the process could ever use, so it should always be larger than the amount of RSS (real/resident) memory that's recorded.

      

    The recommended next step is to escalate this question to IBM, since BMC has verified that the collector gathers the expected values and obtains them through an IBM API.  In the sample udr data, about 25% of the processes report less "Virtual" memory than "Real" memory, which should be useful when documenting your question to IBM.
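
    When preparing such a question, the share of affected processes can be computed from the collected samples. The sketch below is only an illustration: the tuple layout and function name are hypothetical, and the three rows are the (PID, RSS, SIZE) values quoted elsewhere in this article, not the full udr sample behind the 25% figure.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[int, int, int]  # (pid, rss_kb, size_kb), a hypothetical layout


def virtual_below_real(samples: Iterable[Sample]) -> List[int]:
    """Return the PIDs whose SIZE (virtual) is smaller than RSS (real)."""
    return [pid for pid, rss_kb, size_kb in samples if size_kb < rss_kb]


# (pid, RSS, SIZE) values quoted in this article.
samples = [
    (352404, 956, 908),
    (610392, 936, 888),
    (10616916, 73304, 165560),
]

flagged = virtual_below_real(samples)
share = 100.0 * len(flagged) / len(samples)
print(flagged, f"{share:.1f}%")
```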

      

     

      

    LINUX PLATFORM EXAMPLE

      




    Checking the results for java process 44286 on 26 January 2015 at 12:05 AM, the values from the Visualizer Process Memory Metrics graphic are:

      



    Real Memory Total MB     525.16
    Virtual Memory MB       1981.32

      



    These correspond to these Process_Statistics metrics (in KB instead of MB):

      




    RSS        537768
    SIZE     2028872

      



    Other memory metrics for this process are:

      



    Private RSS    438784
    DRSS           537768
    TRSS                0
    Shared RSS      98984
    ShmRSS              0
    StkRSS              0

      



    So what is the specific definition of the SIZE udr metric for Linux?

      

    ==================================================================================================================================

      

    From man ps on Linux we find these definitions:

    SZ       size in physical pages of the core image of the process. This includes text, data, and stack space. Device mappings are currently excluded; this is subject to change.

    VSZ      virtual memory size of the process in KB (1024-byte units). Device mappings are currently excluded; this is subject to change.

      

     

      

    This command can be used to display both of these metrics:

      

    ps -o pid,rss,trss,sz,vsz,comm

      
        
      

    The Linux data collector uses the /proc/[pid] files to get the per-process memory metrics:

      

    (1) In /proc/[pid]/statm we find:

    /proc/[pid]/statm
        Provides information about memory usage, measured in pages. The columns are:

        size      (1) total program size (same as VmSize in /proc/[pid]/status)
        resident  (2) resident set size (same as VmRSS in /proc/[pid]/status)
        share     (3) shared pages (i.e., backed by a file)
        text      (4) text (code)
        lib       (5) library (unused in Linux 2.6)
        data      (6) data + stack
        dt        (7) dirty pages (unused in Linux 2.6)
      
        
      
    (2) And for /proc/[pid]/stat we find:

    rss %ld      Resident Set Size: number of pages the process has in real memory. This is just the pages which count toward text, data, or stack space. This does not include pages which have not been demand-loaded in, or which are swapped out.

    vsize %lu    Virtual memory size in bytes.
      
      
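    So the summary is that "size", "vsize", VSZ, SIZE, and VIRT (from the top command) all represent the same quantity, the virtual size of the process, although in different units: pages for "size", bytes for "vsize", and KB for VSZ.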
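
    The relationship between these fields can be checked on a Linux system with a short Python sketch. It assumes the field layout documented in proc(5): statm fields in pages, and vsize and rss as fields 23 and 24 of /proc/[pid]/stat. The function names are illustrative and are not part of the data collector.

```python
import os


def parse_statm(text: str) -> dict:
    """Parse /proc/[pid]/statm; all values are page counts."""
    names = ("size", "resident", "share", "text", "lib", "data", "dt")
    return dict(zip(names, map(int, text.split())))


def parse_stat(text: str) -> dict:
    """Extract vsize (bytes) and rss (pages) from /proc/[pid]/stat."""
    # The command name (field 2) may contain spaces, so split after the last ')'.
    rest = text.rsplit(")", 1)[1].split()
    return {"vsize": int(rest[20]), "rss": int(rest[21])}  # fields 23 and 24


def virtual_size_kb(pid: str = "self") -> dict:
    """Return the virtual size in KB as derived from statm and from stat."""
    page = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/statm") as f:
        statm = parse_statm(f.read())
    with open(f"/proc/{pid}/stat") as f:
        stat = parse_stat(f.read())
    return {
        "from_statm_kb": statm["size"] * page // 1024,
        "from_stat_kb": stat["vsize"] // 1024,
    }


if os.path.exists("/proc/self/statm"):  # Linux only
    print(virtual_size_kb())
```

    On an idle process the two values are normally identical; they are read at slightly different moments, so a process that is actively mapping memory may show a small difference.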
      
        
      
      =================================================================================================================================
      
        
      
    So, observing the virtual process size for Linux processes, it is clearly a measure of the maximum address space size.  It's not uncommon to see a single process whose virtual size exceeds both the real and the virtual memory size of the entire system.  This doesn't indicate an inconsistency in measurements or a potential performance problem, because pages that are never loaded into memory can never be swapped out.  If that process ever tried to use all of its pages, a real memory shortage would occur first, immediately followed by a system virtual memory shortage.  The process metric that would help show how much of the potential virtual exposure is actually realized is pages swapped out, which is currently not collected.
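
    The effect can be observed with a small Linux-only Python sketch: it reserves a large anonymous mapping without touching it and compares the size and resident fields of /proc/self/statm before and after. The 256 MB figure and the helper names are arbitrary choices for illustration.

```python
import mmap
import os

PAGE = os.sysconf("SC_PAGE_SIZE")


def statm_kb() -> tuple:
    """Return (virtual_kb, resident_kb) for this process from /proc/self/statm."""
    with open("/proc/self/statm") as f:
        size, resident = map(int, f.read().split()[:2])
    return size * PAGE // 1024, resident * PAGE // 1024


def reserve_untouched(n_bytes: int) -> tuple:
    """Map n_bytes of anonymous memory, never touch it, and report growth in KB."""
    virt0, res0 = statm_kb()
    region = mmap.mmap(-1, n_bytes, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    virt1, res1 = statm_kb()
    region.close()
    return virt1 - virt0, res1 - res0


if os.path.exists("/proc/self/statm"):  # Linux only
    grew_virtual, grew_resident = reserve_untouched(256 * 1024 * 1024)
    print(f"virtual +{grew_virtual} KB, resident +{grew_resident} KB")
```

    The virtual size is expected to grow by roughly the full mapping while the resident size barely changes, because none of the reserved pages have been loaded into memory.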
      
    Related Products:  
       
    1. BMC Performance Assurance

     


    Article Number:

    000313223


    Article Type:

    FAQ/Procedural


