Monthly Report for $group.getName()

Produced by: Capacity Planning
Run Time: $now.getDate()
Analysis Period: $start.getShort() to $end.getShort()

Introduction

This analysis checks whether the application is experiencing abnormal utilization or violating the thresholds defined by our standards. Further details are available in the Details & Analysis section.

We have created several rules for analyzing various aspects of UNIX, Linux, and MS-Windows servers.  These rules help us determine if further analysis is required, or if the server is performing within normal parameters and without capacity concerns.

· Results for rules within current parameters, which would normally be indicated in green, are omitted from this report.

· For each rule, the number of days in violation is highlighted yellow for 1-6 days and red for 7 or more days.

· Normal Business Hours are defined as Monday-Friday, 8:00am to 5:59pm EST.

· For each rule, this study uses the following color-coding:

Blank or Green    Within parameters.
Yellow            Exceeding parameters.
Red               Consistently exceeding parameters.
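
As an illustration only, the color-coding above can be expressed as a small lookup. This is a minimal sketch in Python; the function name `violation_color` and the string labels are assumptions made for this example and are not part of the reporting tool.

```python
def violation_color(days_in_violation: int) -> str:
    """Map a rule's count of violation days to the report's color code."""
    if days_in_violation < 0:
        raise ValueError("days_in_violation must be non-negative")
    if days_in_violation == 0:
        return "green"   # within parameters; omitted from the report
    if days_in_violation <= 6:
        return "yellow"  # exceeding parameters (1-6 days)
    return "red"         # consistently exceeding parameters (7+ days)
```

For example, a rule broken on 3 days of the month would be shown in yellow, and one broken on 7 days in red.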


Overview and Rating

During the analysis period, $start.getShort() to $end.getShort(), $group.getName()’s peak business hour occurred at %15%% of available CPU resources were consumed. The peak batch period started at 18:00 on %16%% of available CPU time was consumed during the batch’s busiest hour.



Overview

Rule 1 – Processing Capacity
This rule analyzes the processing capacity of each server and flags a possible lack of CPU resources during the analysis period. It examines the CPU Utilization and Run Queue Length metrics for each server in the group. This rule evaluates 24 hours per day, including batch windows. As such, some false positives will occur. Each row in the table shows the number of days the rule was broken during the month. The table contains two columns per server: CPU Utilization and Run Queue Length per Processor. For further details, refer to the CPU section below.

The rule: Count the number of days during the analysis period where CPU utilization is over 30% and Run Queue Length is greater than 0 per CPU, for 1 or more hours that day.
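
To make the counting concrete, here is a minimal Python sketch of the rule as stated above. The input layout (one `(timestamp, cpu_pct, run_queue_per_cpu)` tuple per hour) and the function name `count_rule1_days` are assumptions for illustration; the thresholds are taken from the rule text.

```python
from datetime import datetime

CPU_THRESHOLD_PCT = 30.0    # from the rule text
RUN_QUEUE_THRESHOLD = 0.0   # run queue length per CPU, from the rule text


def count_rule1_days(samples):
    """Count days with at least one hour breaking both thresholds.

    `samples` is an iterable of hourly
    (timestamp, cpu_pct, run_queue_per_cpu) tuples (hypothetical layout).
    """
    violating_days = set()
    for timestamp, cpu_pct, run_queue_per_cpu in samples:
        if cpu_pct > CPU_THRESHOLD_PCT and run_queue_per_cpu > RUN_QUEUE_THRESHOLD:
            violating_days.add(timestamp.date())
    return len(violating_days)


samples = [
    (datetime(2024, 1, 8, 10), 45.0, 0.5),  # both thresholds exceeded
    (datetime(2024, 1, 8, 11), 50.0, 0.8),  # same day, counted only once
    (datetime(2024, 1, 9, 10), 80.0, 0.0),  # run queue is not above 0
    (datetime(2024, 1, 10, 2), 35.0, 0.1),  # overnight hour still counts
]
print(count_rule1_days(samples))  # prints 2
```

Because every hour of the day is evaluated, the overnight sample on the third day counts as a violation, which is how batch windows produce false positives.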

%1%

Rule 2 – Memory Capacity (a)
Analyzing memory capacity is difficult and requires two different rules. The first memory rule examines the Virtual Memory utilization for each server in the group during the analysis period, for business hours only. Once all of a server's virtual memory has been allocated to specific applications, the server cannot start new programs, and running programs may fail if they try to grow. For further details, refer to the Memory section below.

The rule: Count the number of days during the analysis period where Swap Utilization is over 60%, for 1 or more hours that day.
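
The sketch below shows one way this rule could be evaluated in Python, combining the 60% threshold with the Normal Business Hours definition from the Introduction. The input layout (hourly `(timestamp, swap_pct)` tuples) and the names `is_business_hour` and `count_rule2_days` are assumptions for illustration, and timestamps are assumed to already be in EST.

```python
from datetime import datetime

SWAP_THRESHOLD_PCT = 60.0  # from the rule text


def is_business_hour(timestamp):
    """Monday-Friday, 8:00am through 5:59pm (timestamps assumed to be EST)."""
    return timestamp.weekday() < 5 and 8 <= timestamp.hour < 18


def count_rule2_days(samples):
    """Count days with at least one business hour above the swap threshold.

    `samples` holds hourly (timestamp, swap_pct) tuples (hypothetical layout).
    """
    return len({
        ts.date()
        for ts, swap_pct in samples
        if is_business_hour(ts) and swap_pct > SWAP_THRESHOLD_PCT
    })
```

A sample taken at 8:00pm on a weekday, or at any time on a Saturday, is ignored by this sketch no matter how high the utilization is.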

%2%

Rule 3 – Memory Capacity (b)
The second memory capacity rule analyzes each server's Memory Paging rate during the analysis period, for business hours only.  Paging activity usually indicates memory contention and is a capacity/performance trigger for a detailed analysis of the memory requirements for the server. For further details, refer to the Memory section below.

The rule: Count the number of days during the analysis period where the Memory Paging rate is above 0 pages/second for more than one hour that day.
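
Unlike the previous rules, this one requires more than one hour of paging in a day. A minimal Python sketch follows. It assumes hourly `(timestamp, pages_per_second)` tuples that have already been restricted to business hours, and it reads "more than one hour" as two or more hourly samples in the same day, not necessarily consecutive; both points are assumptions of this example, as is the name `count_rule3_days`.

```python
from collections import Counter
from datetime import datetime


def count_rule3_days(samples):
    """Count days with more than one hour of paging above 0 pages/second.

    `samples` holds hourly (timestamp, pages_per_second) tuples, assumed
    to be already limited to business hours (hypothetical layout).
    """
    paging_hours = Counter(
        ts.date() for ts, pages_per_second in samples if pages_per_second > 0
    )
    return sum(1 for hours in paging_hours.values() if hours > 1)
```

Under this reading, a day with a single hour of paging does not break the rule, while a day with two such hours does.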

%3%



Details & Analysis

Collection Status
This section reviews, by server, the data collections that supply the data for this analysis. The table lists the number of hours of collected data per day for each server. Normally, we collect data 24 hours a day.

%0%

 


A. CPU Utilization – the Application as a Whole
This section analyzes the CPU resources for the entire application. By combining the CPU resources and their usage across all servers, this section attempts to identify the key resource utilization periods. 

Table: Busiest Business Hour for the Application
%17%

%13%


The busiest batch day assumes overnight batch runs that start at 6:00pm on business days, excluding Fridays. For example, Wednesday’s batch starts at 6:00pm Wednesday and completes at 7:59am Thursday.
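
The batch window definition can be sketched in Python as a function that assigns an hourly timestamp to the batch day it belongs to. The function name `batch_day` is hypothetical, and business days are assumed to be Monday-Friday, so batches start Monday through Thursday.

```python
from datetime import datetime, timedelta


def batch_day(timestamp):
    """Return the date whose batch window contains `timestamp`, else None.

    A batch window runs from 6:00pm on its start date to 7:59am the next
    morning. Business days are assumed to be Monday-Friday, and Friday
    batches are excluded, so valid start dates are Monday-Thursday.
    """
    if timestamp.hour >= 18:
        start = timestamp.date()                      # evening of the start date
    elif timestamp.hour < 8:
        start = timestamp.date() - timedelta(days=1)  # morning after the start
    else:
        return None  # 8:00am-5:59pm is outside every batch window
    # weekday(): Monday is 0, Thursday is 3, Friday is 4
    return start if start.weekday() <= 3 else None
```

With this sketch, 7:59am on a Thursday is attributed to Wednesday's batch, while any hour on Friday evening or Saturday morning belongs to no batch day.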

%14%


B. CPU Utilization by Server
This section examines the CPU utilization for each server on the busiest day, the average business day, the busiest business day, and the busiest batch day during the analysis period. Please note that for the three busiest-day charts, the particular day may differ from server to server. Refer to the tables below for the particular dates for each server.

The Busiest Day is calculated by examining each server’s CPU utilization for every hour of the day during the analysis period and determining which 24-hour period used the most CPU time. The Average Business Day is calculated by averaging each hour of the day across all business days of the analysis period (all midnights, all 1ams, all 2ams, and so on). The Busiest Business Day is calculated by examining CPU utilization only during Normal Business Hours for each calendar day. The Busiest Batch Day is calculated by examining CPU utilization for the batch periods starting at 6:00pm on business days, excluding Fridays.
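
As an illustration of the Average Business Day calculation, the following Python sketch averages CPU utilization hour by hour across business days. The input layout (hourly `(timestamp, cpu_pct)` tuples), the function name `average_business_day`, and the treatment of Monday-Friday as business days are assumptions of this example.

```python
from collections import defaultdict
from datetime import datetime


def average_business_day(samples):
    """Return {hour_of_day: mean CPU %} across Monday-Friday samples.

    `samples` holds hourly (timestamp, cpu_pct) tuples (hypothetical layout).
    """
    by_hour = defaultdict(list)
    for ts, cpu_pct in samples:
        if ts.weekday() < 5:  # assume business days are Monday-Friday
            by_hour[ts.hour].append(cpu_pct)
    return {hour: sum(v) / len(v) for hour, v in sorted(by_hour.items())}
```

The result is a single 24-hour profile per server, which is why this chart, unlike the busiest-day charts, has no particular calendar date attached to it.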

Table: Busiest Day by Server, with CPU Utilization for the Busiest Hour, all Days
%8%
Table: Busiest Business Day by Server, with CPU Utilization for the Busiest Hour
%9%
Table: Busiest Batch Day by Server, with CPU Utilization for the Busiest Hour
%10%

The 3 tables above are used to draw the following 4 charts.

%4%

%5%

%7%

%6%


C. Memory
The following charts provide more diagnostic information on physical and virtual memory utilization and on possible memory contention. For UNIX and Linux servers, physical memory utilization is usually above 90%, which is considered normal; therefore, no physical memory chart is supplied for them.

Once all of a server's virtual memory has been allocated to specific applications, the server cannot start new programs and currently running programs cannot grow.  We recommend that virtual memory utilization remain below 80%. 

%12%


The Paging Rate by server acts as a cross-platform metric for detecting memory contention. High paging rates, above 100 pages per second, usually indicate memory contention. Backups can cause excessive paging without memory contention; therefore, excessive paging during backups is considered a false positive and should not be a concern.

For MS-Windows servers, paging activity can also reflect an application’s request for data from the disk subsystem (for example, reading a file), in which case it is a false positive. Nevertheless, a detailed analysis is needed to determine the root cause of the paging rate.

%11%