In BMC Performance Assurance (BPA), the General Manager Lite utility monitors the collection, transfer, processing, and population status of nightly Manager runs. It provides a time series view of that status, and therefore of the stability of the environment, over the last 30 days (by default). If BCO has been implemented, it can also track the progression of that data into the BCO data warehouse.
There are two components to General Manager Lite:
|Component||BPA 9.0 Requirements||BPA 7.5.10 Requirements|
|General Manager Lite Core||9.0.03 (9.0 SP3)||7.5.10 Cumulative Hot Fix #2 (#202020) and later (recommended)|
|BCO/BPA Status and Recovery script||9.0.03 Cumulative Hot Fix #4 (#30040)||7.5.10 Cumulative Hot Fix #2 (#202020)|
The BCO/BPA Status and Recovery (BCO_BPAStatusAndRecoveryManager.pl) script is optional, and must be run on a UNIX console. When used, the benefits are:
- The script prompts you for the information necessary to implement General Manager Lite (rather than requiring a manual setup, which is described below)
- The script will automate the execution of the General Manager scripts via the BPA pcron facility
Implementation with the BCO_BPAStatusAndRecoveryManager.pl script
The BCO_BPAStatusAndRecoveryManager.pl script must run on a UNIX console, and it prompts you for the information it needs. The prompts you will typically see are listed below, each followed by guidance on how to respond:
>> Enter BPA Console Name[s] (Multiple Consoles comma separated (Example console1, console2)) (Current Value=localhost)
The General Manager Lite (GMLite) component must run on a Linux system where the BPA console is installed, but it can communicate with all BPA Unix, Linux, and Windows consoles in your environment in order to build a centralized view of your BPA data processing. For this prompt, specify a list of BPA consoles for GMLite to contact to obtain BPA data processing information on a nightly basis.
>> Enter Daily Script Execution Time (HH:MM) (Current Value=20:00)
This prompt sets when the GMLite scripts are executed each day. By default the scripts execute at 8 PM (20:00). Choose a time (a) after your last Manager run has finished processing for the day, (b) after the data import into BCO is complete (if applicable), and (c) when recovery populates of data into BCO can be attempted (if applicable).
>> Enter BPA GeneralManager Port (Current Value=10129)
This is by default port 10129, and that port is not commonly customized.
>> Enter BPA Output Directory Where the Data will be put (Current Value=$BEST1_HOME/local/manager/status/GeneralManagerLite)
Specify where the GMLite output should be written on this console if you don't want to use the default location.
>> Enter gnuplot install directory (Do not specify anything if you wish use one in your path) (Current Value=undefined)
General Manager Lite can create a web page that includes charts reporting the number of computers configured, collected, transferred, processed, populated into the BPA database, and imported into BCO. This functionality requires that the 'gnuplot' utility be installed on your BPA console. If gnuplot is installed and you would like to enable this functionality, specify the gnuplot location here. On Linux the default installation path for gnuplot is /usr/bin.
>> Enter are you configuring BCO BPA ETL Status reporting [Y|N] (You will need ORACLE_HOME, DSN, user name and password) (Current Value=N)
If you are importing BPA data into BCO, General Manager Lite can be configured to monitor the success rate of the import of BPA data into BCO. Answer 'Y' if GMLite should be configured to monitor BCO population success.
>> Enter BCO Oracle DSN (must be configured via tnsnames.ora (see http://www.orafaq.com/wiki/Tnsnames.ora)) (Current Value=undefined)
When integrated with BCO, supply the TNS Name of your BCO Database (as defined in the $ORACLE_HOME/network/admin/tnsnames.ora file).
>> Enter BCO ORACLE_HOME (Current Value=undefined)
When integrated with BCO, supply the path to your Oracle Client installation on the BPA console. If you are using Unix Populate this can be the same path specified in the $BEST1_HOME/local/setup/MpopulateOracleHome.loc file.
>> Enter BCO Oracle Password (Displayed encrypted) (Current Value=undefined)
When integrated with BCO, supply the password for the BCO_OWN database user (schema owner).
>> Enter BCO Oracle User Name (Current Value=undefined)
When integrated with BCO, supply the BCO database account that owns the BCO installation (by default 'BCO_OWN'). In older BCO installations this may be CPIT_OWN. You can validate this by checking in the BCO web interface under Administration -> System -> Configuration -> General -> Database Username (Schema Owner).
>> Enter Number of Days to recover starting from today (Current Value=2)
When integrated with BCO, this is the number of days that General Manager Lite should look back for recovery of failed BPA data imports into BCO.
>> Enter Number per day of top BPA visualizer file errors to recover (Current Value=10)
When integrated with BCO, this is the number of Visualizer files that General Manager Lite should attempt to recover each day. The reason to specify a limit is to throttle the amount of recovery activity to prevent recovery populates from interfering with the nightly load of BPA data into BCO.
>> Enter BPA vis file directory (Current Value=undefined)
When integrated with BCO, this is the archive directory where the BPA Visualizer files are to be copied by General Manager Lite when they need to be recovered by the BPA Recovery ETLs configured in BCO.
Example Output (UNIX only) from the BCO_BPAStatusAndRecoveryManager.pl script
INFO: Using path /usr/adm/best1_9.0.00/bgs/scripts/BCO_BPAStatusAndRecoveryManager.pl
Info: Using BEST1_HOME=/usr/adm/best1_9.0.00
Info : reading input file /usr/adm/best1_9.0.00/local/setup/BCO_BPAStatusAndRecoveryManager.opt
Please answer the following questions regarding the operation of GeneralManagerLite
Enter BPA Console Name[s] (Multiple Consoles comma separated( Example console1,console2)) (Current Value=localhost)
[ Hit Return to Accept Current Value ]) ?vl-hou-cus-sp55.bmc.com
Enter Daily Script Execution Time (HH:MM) (Current Value=20:00)
[ Hit Return to Accept Current Value ]) ?08:00
Enter BPA GeneralManager Port (Current Value=10129)
[ Hit Return to Accept Current Value ]) ?
Current Value=10129 kept
Enter BPA Output Directory Where the Data will be put (Current Value=$BEST1_HOME/local/manager/status/GeneralManagerLite)
[ Hit Return to Accept Current Value ]) ?
Current Value=$BEST1_HOME/local/manager/status/GeneralManagerLite kept
Please answer the following questions regarding the operation of BCO_BPAtimeAnalysisWebPageCreate
Enter gnuplot install directory (Do not specify anything if you wish use one in your path) (Current Value=undefined)
[ Hit Return to Accept Current Value ]) ?/usr/bin
Enter are you configuring BCO BPA ETL Status reporting [Y|N] (You will need ORACLE_HOME, DSN, user name and password) (Current Value=N)
[ Hit Return to Accept Current Value ]) ?Y
Please answer the following questions regarding the operation of BCOStatus
Enter BCO Oracle DSN (must be configured via tnsnames.ora (see http://www.orafaq.com/wiki/Tnsnames.ora)) (Current Value=undefined)
[ Hit Return to Accept Current Value ]) ?ORA112DB_SP71
Enter BCO ORACLE_HOME (Current Value=undefined)
[ Hit Return to Accept Current Value ]) ?/data1/oracle/product/11.2.0/client_1
Enter BCO Oracle Password (Displayed encrypted) (Current Value=undefined)
[ Hit Return to Accept Current Value ]) ?BmcCapac1ty_OWN
Enter BCO Oracle User Name (Current Value=undefined)
[ Hit Return to Accept Current Value ]) ?BCO_OWN
Please answer the following questions regarding the operation of BCORecover
Enter Number of Days to recover starting from today (Current Value=2)
[ Hit Return to Accept Current Value ]) ?
Current Value=2 kept
Enter Number per day of top BPA visualizer file errors to recover (Current Value=10)
[ Hit Return to Accept Current Value ]) ?
Current Value=10 kept
Enter BPA vis file directory (Current Value=undefined)
[ Hit Return to Accept Current Value ]) ?/data1/best1data/bcovisrecover
Warning : The directory does not exist, attempting to create /usr/adm/best1_9.0.00/local/manager/status/GeneralManagerLite
Info : testing BPA console=vl-hou-cus-sp55.bmc.com
Info : Running /data1/oracle/product/11.2.0/client_1/bin/tnsping ORA112DB_SP71
TNS Ping Utility for Linux: Version 18.104.22.168.0 - Production on 17-SEP-2013 09:52:16
Copyright (c) 1997, 2009, Oracle. All rights reserved.
Used parameter files:
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = vl-sjc-cus-sp71.labs.bmc.com)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = ORA112DB)))
OK (650 msec)
Info : BCO ETL will be queried
pcrontab: can't find task ID in your pcrontab file.
Info : unable run pcrontab to unschedule : /usr/adm/best1_9.0.00/bgs/scripts/pcrontab.sh -unschedule 03 : return 1536
Info : no runs to unschedule
Info : Scheduling : /usr/adm/best1_9.0.00/bgs/scripts/pcrontab.sh -schedule 03 "00 08 * * * /usr/adm/best1_9.0.00/bgs/scripts/BCO_BPAStatusAndRecoveryManager.pl -r > /usr/adm/best1_9.0.00/local/manager/log/BCO_BPAStatusAndRecoveryManager.log 2>&1"
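In the example run above, the execution time entered was 08:00 and the resulting pcrontab entry begins with "00 08 * * *". The following minimal Python sketch shows that mapping from an HH:MM answer to the minute and hour fields. It is an illustration only: the function name cron_fields is hypothetical, and the assumption that pcrontab.sh takes the usual five-field cron layout is inferred from the output above, not from BPA documentation.

```python
def cron_fields(hhmm: str) -> str:
    """Convert an 'HH:MM' answer into the 'MM HH * * *' form seen in the schedule line."""
    hours_text, minutes_text = hhmm.strip().split(":")
    hours, minutes = int(hours_text), int(minutes_text)
    # The prompt uses a 24-hour clock, so 08:00 means 8 AM and 20:00 means 8 PM.
    if not (0 <= hours <= 23 and 0 <= minutes <= 59):
        raise ValueError(f"not a valid 24-hour time: {hhmm!r}")
    return f"{minutes:02d} {hours:02d} * * *"


print(cron_fields("08:00"))  # value entered in the example run: 00 08 * * *
print(cron_fields("20:00"))  # default value: 00 20 * * *
```

Note that the value entered in the example run (08:00) schedules the scripts for 8 AM rather than the 8 PM default.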
General Manager Lite web page output
If gnuplot (www.gnuplot.info) is available on the BPA Linux console where General Manager Lite is scheduled, then it will automatically create some web reports that summarize the data collection, transfer, processing, populate, and BCO import success of your BPA environment.
The reports are created by default in the $BEST1_HOME/local/manager/status/GeneralManagerLite/BCO_BPAWebReport directory, and can be viewed via a local web browser running on your Linux console or shared out via a web server running on the BPA console.
Below is an example of the chart output from three different BPA consoles. You likely will need to zoom in to see it better, but I wanted to show an image that had many nodes in it to give you a better idea of what this looks like.
The legend for the data is represented by:
- Red line -- The number of configured BPA computers. This is the number of computers included in domain/policy files in an active BPA Manager run
- Green line -- The number of computers that successfully collected data in the environment
- Blue line -- The number of computers that successfully transferred data back to the BPA console
- Purple line -- The number of computers that were successfully processed and included in a Visualizer file by the BPA console
- Cyan line -- The number of computers that were successfully imported into BCO by the BPA ETLs
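One way to read the chart is as a funnel: each line should sit at or just below the one before it, and a widening gap points to the stage that is losing computers. The sketch below expresses each stage as a percentage of the configured computers. The class and function names and the sample numbers are hypothetical and are not produced by General Manager Lite itself.

```python
from dataclasses import dataclass


@dataclass
class DailyCounts:
    """One day's values for the five chart lines (hypothetical container)."""
    configured: int   # red line
    collected: int    # green line
    transferred: int  # blue line
    processed: int    # purple line
    imported: int     # cyan line


def stage_rates(counts: DailyCounts) -> dict:
    """Return each stage's count as a percentage of configured computers."""
    if counts.configured <= 0:
        raise ValueError("configured must be greater than zero")
    stages = {
        "collected": counts.collected,
        "transferred": counts.transferred,
        "processed": counts.processed,
        "imported": counts.imported,
    }
    return {name: round(100.0 * value / counts.configured, 1)
            for name, value in stages.items()}


# Made-up numbers purely for illustration.
print(stage_rates(DailyCounts(1000, 950, 940, 930, 900)))
```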
Detailed Information about the underlying General Manager Lite scripts
In order to obtain this functionality, the following are required:
1. A BPA console
- Unix console, 7.5.10 or later
- Unix console, 9.0.00: requires 9.0.03 (9.0 SP3) or later
- Linux console, for 9.5 and later
2. A Perl script (GeneralManagerLite.pl), which must run on a UNIX console, and an updated GeneralManagerClient (this enhancement is recorded as QM001745812). These are available as part of the 7.5.10 UNIX console patches, beginning with June 2012.
Additional updates made since June 2012 (QM001764244) are included in the 7.5.10 SP2 console patch from December 2012. Further enhancements made since December 2012, including support for Windows consoles (QM001781969 and QM001779687), were included in 7.5.12 Cumulative Patch 2 (May 2013).
The script only needs to be run from one of your consoles (UNIX only). It will gather processing statistics from all of your BPA (UNIX and/or Windows) consoles.
What the tool does
1. Identifies all the nodes in your environment (this is output as allNodes.csv file)
2. Identifies all manager run/domain mappings (this is output as domain2ManagerMap.csv file)
3. Identifies all duplicate nodes in your environment (this is output as duplicateNodes.csv file)
4. Identifies all failed nodes and categorizes them into collect, transfer and processing errors (this is output as failedNodes.csv and failedNodesNoAgent.csv files)
5. Obtains the remote agent logs for collect, transfer, and processing errors (including proxy collection).
How to run the tool manually
$BEST1_HOME/bgs/scripts/GeneralManagerLite.pl -c <Console Name> [-o <Output Directory > -p <General Manager Port> -l -d -i <manager run pattern>]
Console Name BPA console with GeneralManagerServer running or a comma separated list of BPA consoles
Output Directory Output Directory where results will be deposited : default is the current directory
GeneralManagerPort General Manager Port : default 10129
-l Get the Remote Agent and Proxy Logs for detailed analysis
NOTE: this can take a lot of disk space and is the default
-d Save results in date-stamped directories for the last 30 days (recommended configuration)
-i Ignore/remove results for manager runs which match the pattern specified;
multiple patterns may be specified by using a comma
Note that the BCO_BPAStatusAndRecoveryManager implementation method described above is just a semi-automated method for running this script and supplying the necessary input parameters.
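If you drive the script from your own wrapper instead, the command line can be assembled from the options listed above. The following is a minimal sketch, not a supported tool: the function name and its parameters are hypothetical, and only the flags shown in the usage text (-c, -o, -p, -l, -d, -i) are used.

```python
import posixpath


def build_gmlite_command(best1_home, consoles, output_dir=None, port=None,
                         get_logs=False, date_stamped=True, ignore_patterns=()):
    """Assemble an argument list for GeneralManagerLite.pl from the documented flags."""
    if not consoles:
        raise ValueError("at least one console name is required for -c")
    script = posixpath.join(best1_home, "bgs", "scripts", "GeneralManagerLite.pl")
    command = [script, "-c", ",".join(consoles)]  # comma separated console list
    if output_dir:
        command += ["-o", output_dir]             # default is the current directory
    if port is not None:
        command += ["-p", str(port)]              # default is 10129
    if get_logs:
        command.append("-l")                      # remote agent and proxy logs
    if date_stamped:
        command.append("-d")                      # keep date-stamped results
    if ignore_patterns:
        command += ["-i", ",".join(ignore_patterns)]
    return command


command = build_gmlite_command("/usr/adm/best1_9.0.00", ["console1", "console2"],
                               output_dir="/tmp/gmlite", port=10129)
print(" ".join(command))
# To actually run it on a UNIX console: subprocess.run(command, check=True)
```

Returning a list rather than a single string avoids shell quoting problems if the list is later passed to subprocess.run.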
Instructions for using the manually run script
1. Find a location with a considerable amount of disk space if you are planning to acquire the optional log files (using -l) as they will consume a lot of disk space. If you have specified that you want the collect logs, logs are about 7 MB per node. So if you have 1000 nodes with collect failures, you will need at least 10 GB of free space.
$BEST1_HOME/bgs/scripts/GeneralManagerLite.pl -c <Console Name> [-o <Output Directory > -p <General Manager Port> -l -d ]
The script will generate a subdirectory for each BPA console which contains all relevant csv and log files. Then the files are zipped. These files can be sent to Customer Support as a summary of the entire BPA console environment. It can take a while for the script to run: at 10 seconds per node, 1000 failing nodes will take close to 3 hours. The more failures there are, the longer the processing takes.
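The sizing figures quoted in this section (about 7 MB of logs and about 10 seconds per failing node) can be turned into a quick estimate for your own node count. This is a back-of-the-envelope sketch using only those two figures; the function name is hypothetical and actual usage will vary.

```python
LOG_MB_PER_NODE = 7     # approximate collect log size per failing node
SECONDS_PER_NODE = 10   # approximate script run time per failing node


def estimate(failing_nodes: int):
    """Return (log size in GB, run time in hours) for a number of failing nodes."""
    log_gb = failing_nodes * LOG_MB_PER_NODE / 1000   # decimal gigabytes
    hours = failing_nodes * SECONDS_PER_NODE / 3600
    return log_gb, hours


log_gb, hours = estimate(1000)
print(f"{log_gb:.1f} GB of logs, {hours:.1f} hours")  # 7.0 GB of logs, 2.8 hours
```

The 10 GB recommendation for 1000 failing nodes therefore leaves some headroom above the roughly 7 GB of raw logs.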
Only "today" is captured when the script is run. If you want to keep track of "history", you can set this up by specifying the -d flag (automatically keeps the last 30 days of results in date-stamped directories and removes data older than 30 days). You should schedule the script to run every day, but note that it will overwrite the results from the previous day unless the -d option is used (or you manually archive the files).
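If you archive results manually rather than using -d, you need your own retention rule. The sketch below selects archive directories older than a cutoff. It assumes your own archive directories are named YYYY-MM-DD; that naming is a hypothetical choice for this example and is not necessarily the format the -d option uses.

```python
from datetime import date, datetime, timedelta


def expired_directories(names, today, keep_days=30):
    """Return the date-stamped directory names that are older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        try:
            stamp = datetime.strptime(name, "%Y-%m-%d").date()
        except ValueError:
            continue  # not a date-stamped directory, so leave it alone
        if stamp < cutoff:
            expired.append(name)
    return sorted(expired)


names = ["2013-09-17", "2013-08-01", "BCO_BPAWebReport"]
print(expired_directories(names, today=date(2013, 9, 17)))
```

The function only reports candidates; deleting or compressing them is left to the caller so that the rule can be checked before anything is removed.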
If you have "special" manager runs, such as ones with no data collection where data is simply being reprocessed, you should remove these from the output by using the -i option. Otherwise, they will produce incorrect results since they don't have the full complement of activities occurring. Note that this is implemented by using a pattern match so that you don't have to specify the full names of manager runs.
NOTE: If your installation has only Windows consoles, you can install a BPA console on a Linux VM in order to run the script. That console does not need to be running any Manager runs.
Interpreting the output
"Nodes" which didn't get successfully put into the database for a particular day are divided between failedNodes.csv and failedNodesNoAgent.csv because the type of follow-up required is likely to be different between the two groups of nodes. The error code associated with each node's status is provided in the .csv file: C means collect failure, T means data transfer failure, P means processing (no data created for input to the CDB) failure.
The error code numbers are attached to this article (CollectTransferErrorCodes.xls or KA373639), and are detailed in the associated logs for that node (if requested using -l). This enables a summary level understanding of how many failures there are for the date, and how many nodes have the same kind of failure. The purpose is to provide a convenient way to troubleshoot groups of nodes rather than doing them one at a time. The details for each node are available in the associated log (if requested via -l), so low level reporting is fully supported as well.
The attached failedNodesNoAgent.csv lists all nodes with Collect errors 91, 92, or 94:
91 Error SD_COMM_BAD_HOST Service daemon invalid host name provided (cannot find server or DNS error), the agent name is not known by the OS.
92 Error SD_COMM_BAD_PORT Service daemon not installed on the remote node (connection refused) The product is not installed on the agent computer or the service daemon is not running.
94 Error SD_COMM_CONNECT_TIMEOUT Service daemon connection timed out, node offline, or the agent node is off the network.
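To troubleshoot groups of nodes rather than one node at a time, the failures can be counted by category and error code. The sketch below works on (node, category, code) tuples. That tuple layout is an assumption made for the example, since the column layout of failedNodes.csv is not described here, and the sample rows (including transfer code 17) are made up.

```python
from collections import Counter

# Category letters and the three "no agent" collect codes as described in this article.
CATEGORY_NAMES = {"C": "collect", "T": "transfer", "P": "processing"}
NO_AGENT_COLLECT_CODES = {
    91: "SD_COMM_BAD_HOST",
    92: "SD_COMM_BAD_PORT",
    94: "SD_COMM_CONNECT_TIMEOUT",
}


def summarize(failures):
    """Count failures per 'category / error' label from (node, category, code) tuples."""
    counts = Counter()
    for _node, category, code in failures:
        label = CATEGORY_NAMES.get(category, "unknown")
        if category == "C" and code in NO_AGENT_COLLECT_CODES:
            label += " / " + NO_AGENT_COLLECT_CODES[code]
        else:
            label += f" / code {code}"
        counts[label] += 1
    return counts


# Made-up sample rows purely for illustration.
sample = [("web01", "C", 92), ("web02", "C", 92), ("db01", "T", 17), ("app01", "C", 94)]
for label, count in summarize(sample).most_common():
    print(count, label)
```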
Best Practices for Doing a Daily Health Check for BPA Consoles
(1) Review the GeneralManagerLite output as described above. This gives an overall summary of how many nodes are under management, and the status of each node. Also comparing results from day-to-day immediately highlights any change in the overall health as well as pinpoints the source of the changes.
An unsupported script (console_status_email_option.sh) is available to summarize this daily review and email you the results. The script is coded for a 9.0 console; it is attached to this article and described in the attached Word document (OptionalStatus_email.docx).
(2) Use the General Manager GUI (displayed in Perceiver or BCO 9.0) and open the Console Operations -> "Recover Runs" view.
Alternatively, you can use failedNodes.csv (output from GeneralManagerLite) or export the "Recover Runs" to csv if you prefer.
The idea here is to initiate any Recovery actions first, then work on the data collection problems which typically require more analysis to resolve.
(3) Sort by "Populate Status". For any Manager run which is not "OK", select the run, and then select "Recover".
(4) Sort by "Transfer Fail". For any Manager run which doesn't have a value of 0, select the run, and then select "Recover".
(5) Sort by "Collect Fail". Use the corresponding Console Reports -> "Node History" view to establish the precise problem (using the error code), how many nodes have the same problem, and if the problem is persistent (using 3 or 5 day history setting). Perform remediation as indicated by the error code and cause. Note that the results of successful remediation may not appear for up to 2 days depending on the problem fixed and how often the Manager run is scheduled for execution.
If you've specified the optional log gathering feature, the corresponding logs have already been retrieved from the remote nodes and zipped so that they can be sent to Customer Support.
(6) Rerun the GeneralManagerLite script after the recovery actions have been completed in order to assess the "recovered" overall health of the data flow for today.
When additional troubleshooting time is available, determining the root cause for repeating Population, Processing, or Transfer errors can avoid the need to "Recover" the run(s) each day.
The failedNodesNoAgent.csv lists all nodes which are listed as under management by BPA, but no collection agent is present. Typically this requires an internal ticket to get the agent software installed (either on a proxy or local agent). Note that this condition can occur when a node has its OS upgraded, but the corresponding BPA agent wasn't upgraded at the same time.
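The day-to-day comparison recommended in step (1) can be sketched as a set comparison between two days of failing node names. The function name is hypothetical, and the node names would come from however you read the two days' failedNodes.csv files.

```python
def compare_days(yesterday_failed, today_failed):
    """Split failing nodes into new, recovered, and persistent groups."""
    yesterday, today = set(yesterday_failed), set(today_failed)
    return {
        "new": sorted(today - yesterday),         # failing today but not yesterday
        "recovered": sorted(yesterday - today),   # failing yesterday but not today
        "persistent": sorted(today & yesterday),  # failing on both days
    }


# Made-up node names purely for illustration.
print(compare_days(["web01", "db01"], ["db01", "app01"]))
```

Persistent failures are the usual candidates for root cause analysis, while new failures show what changed since the previous run.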
Tool which summarizes the results for all BPA consoles (using General Manager), and gathers logs for nodes with collection errors https://kb.bmc.com/infocenter/index?page=content&id=KA366377
console_status_email_option.sh Script that can be used to summarize daily activity and to email you the results, script is coded for a 9.0 console
OptionalStatus_email.docx Example email action script sent
CollectTransferErrorCodes.xls describes various error codes received from implementation of this methodology
I hope you are able to take advantage of this methodology. I think you will find it most useful in a larger environment.
Let me know what you think, or if you have a topic that you think would be useful to delve into here.
We're pulsing for you at BMC TrueSight Support Blogs