
This script monitors the Grid Manager, Repository Manager, CDPs, and peers for critical attributes. For each peer it tracks Available Memory, Used Memory, Total Heap Memory, JVM Uptime, Management Grid Status, Main Grid Status, the status of the CDP/peer components (Health, Library Manager, Configuration, Context, Scheduler, Metrics, Job Manager, Adapter Manager, Activity Processor), and the status of every adapter active on any peer in the grid. If the script finds an issue with the adapters, the grid, or the peer components, it sends an email notification to the address specified in "Config.xml".

 

The script/tool is written in Python 3.6.6 and can run on a Windows or Linux machine, irrespective of your BAO environment. It uses the BAO REST API and Python web scraping techniques to collect data from the repository, Grid Manager, and peers.

 

To run the script/tool on Windows, you can either run the executable file directly, without installing Python 3.6.6 and its supporting libraries, or install Python 3.6.6 and add the required module (requests).

 

To run the script on Linux, download and install Python 3.6.6 or above and add the "requests" module via pip or manually.

 

Two versions of the script are attached: v1.0 was designed and tested primarily for BAO 7.9.x, and v2.0 was designed for BAO/TSO 8.x and above. All configurable parameters of the script/tool are controlled by "Config.xml", which must be kept in the same folder as the script.

 

Version v1.0 features :

  • Monitors Grid Status, Active Threads,  Available Memory, Uptime etc

  • Peer Components Status

  • Adapter Status on all peers

  • Send Email Notification

  • Silent mode, in which all script results are written to a log file.

 

Version v2.0 features (the v2 code is optimized: from BAO/TSO 8.x onward, peer component status is available through the API, so web scraping is not required to collect it):

  • Monitor Repository

  • Monitors Grid Status, Active Threads,  Available Memory, Uptime etc

  • Peer Components Status

  • Adapter Status on all peers

  • Send Email Notification

  • Silent mode, in which all script results are written to a log file.

 

Enhancement in development:

  • Create Incident ticket in ITSM tool to report the issue.
  • Basic Remediation to fix identified problem

 

Version v2.1

  • Password Encryption in Config.xml file
  • Password encryption tool

 

Steps to configure the script (note: ensure the script or executable and the Config.xml file are kept in the same folder):

1. Get the Grid ID for your Grid Manager: log in to the Grid Manager, go to Manage > Grids, select your grid name, and click the pencil icon to edit. Copy the ID and Grid Name and paste them into "Config.xml".

2. Provide your CDP server IP/FQDN and port, along with the BAO/TSO admin account, in "Config.xml". In "schedule-interval", provide the interval in seconds at which the script checks the grid status. The default is 90 seconds, but I recommend at least 300 seconds (5 minutes).

3. If you have only one peer (i.e. the CDP), set "monitor-peers" to "False". To monitor additional peers such as the HACDP, APs, and LAPs, set it to "True" and provide the peer details in the respective peer tag.

4. SMTP server details are crucial for this script, and you can customize the subject and body text of the email notification. The script supports both "secure" and "unsecure" connections. For my local mail server I have set the "Secure" field to "False"; however, most mail servers require secure connections, so set it to "True".

 

5. If you are using self-signed or CA-signed certificates in your grid and peers, set "AllowInsecureRequest" to "True"; for managed MPKI-signed certificates it can be set to "False". If you are running the script for an initial test and want to print the status/results on screen, set "print-onscreen" to "True". Once the configuration is complete, however, I suggest setting it to "False" so that the script can run in the background and you can review the results in "BAOMonitioring.log". (Note: Set the logging-level to "DEBUG" when "print-onscreen" is "False", and to "INFO" when "print-onscreen" is "True".)

 

6. The threshold for "Available Memory" is set via the memory tag. It is the limit below which the script starts sending email notifications to report that the free Java heap memory has dropped below the set limit (the default is 200 MB). The threshold for Active Threads is set via the threads tag; it triggers a notification if BAO/TSO is consuming more threads than the set limit.

7. The repository URL needs to be set only for the v2 script.
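
The threshold and notification behaviour described in steps 4 and 6 can be sketched roughly as follows. This is only an illustration and not the script's actual code: the function names find_breaches and build_notification are hypothetical, and the message is built but not sent.

```python
from email.message import EmailMessage


def find_breaches(available_memory_mb, active_threads,
                  memory_limit_mb=200, thread_limit=400):
    """Return a list of human-readable threshold breaches (empty if healthy)."""
    breaches = []
    # Alert when free heap drops below the configured <memory> limit.
    if available_memory_mb < memory_limit_mb:
        breaches.append(
            f"Available memory {available_memory_mb} MB is below {memory_limit_mb} MB"
        )
    # Alert when the thread count rises above the configured <threads> limit.
    if active_threads > thread_limit:
        breaches.append(f"Active threads {active_threads} exceed {thread_limit}")
    return breaches


def build_notification(breaches, sender, recipient, subject, intro):
    """Build (but do not send) the notification email for the given breaches."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(intro + "\n\n" + "\n".join(f"- {b}" for b in breaches))
    return msg
```

The defaults of 200 and 400 mirror the memory and threads values in the sample Config.xml; the subject and intro arguments would come from the Subject and Message tags.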

Sample Config.xml

<ConnectionDetail>
   <server>192.168.32.123</server>
   <port>38080</port>
   <schedule-interval>90</schedule-interval>
   <user>aoadmin</user>
   <password>admin123</password>
   <GridID>urn:jxta:uuid-44D91B15DA074253825D94A184EE308802</GridID>
   <GridName>MyGrid</GridName>
   <AllowInsecureRequest>True</AllowInsecureRequest>
   <print-onscreen>True</print-onscreen>
   <EmailSetting>
   <SMTPServer>127.0.0.1</SMTPServer>
   <SMTPPort>587</SMTPPort>
   <Secure>False</Secure>
   <EmailUsername>support@test.local</EmailUsername>
   <EmailPassword>admin123</EmailPassword>
   <NotificationEmailAddress>admin@test.local</NotificationEmailAddress>
   <Subject>BAO Monitoring script found some error in the Grid</Subject>
   <Message>BAO Monitoring script reported concerns for below components</Message>
   </EmailSetting>
   <monitor-peers>True</monitor-peers>
   <peers peerType="HACDP">
   <peer>https://hacdp.test.local:38080</peer>
   </peers>
   <peers peerType="AP">
   <peer>https://ap1.test.local:58080</peer>
   <peer>https://ap2.test.local:68080</peer>
   </peers>
   <peers peerType="LAP">
   <peer>http://lap.test.local:48080</peer>
   </peers>
   <Threshold-limit>
   <memory>200</memory>
   <threads>400</threads>
   </Threshold-limit>
   <repoUrl>https://bao.test.com:28080/baorepo</repoUrl>
   <logging-level>INFO</logging-level>
</ConnectionDetail>

 

Note: If you don't have an HACDP, AP, or LAP, comment out the corresponding tag in Config.xml using <!-- <tag> --> or remove the tag.
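
For readers who want to see how a file like this could be consumed, here is a minimal sketch using Python's standard xml.etree.ElementTree. It is an assumption about one reasonable way to read the file, not the script's real parser; load_config and the dictionary keys are hypothetical names, and treating AllowInsecureRequest as "skip TLS verification" is my interpretation of that tag.

```python
import xml.etree.ElementTree as ET


def load_config(xml_text):
    """Parse Config.xml text into a plain dict; missing peer tags yield no peers."""
    root = ET.fromstring(xml_text)

    def as_bool(tag, default="False"):
        return root.findtext(tag, default).strip().lower() == "true"

    peers = {}
    # Commented-out or removed <peers> blocks are simply not returned by findall().
    for group in root.findall("peers"):
        urls = [p.text.strip() for p in group.findall("peer")
                if p.text and p.text.strip()]
        if urls:
            peers[group.get("peerType")] = urls

    return {
        "server": root.findtext("server"),
        "port": int(root.findtext("port")),
        "interval": int(root.findtext("schedule-interval", "90")),
        "monitor_peers": as_bool("monitor-peers"),
        # Assumption: AllowInsecureRequest=True means TLS verification is skipped.
        "verify_tls": not as_bool("AllowInsecureRequest"),
        "memory_limit_mb": int(root.findtext("Threshold-limit/memory", "200")),
        "thread_limit": int(root.findtext("Threshold-limit/threads", "400")),
        "peers": peers,
    }
```

Because XML comments are ignored by the default parser, commenting out a peers block and deleting it have the same effect in this sketch.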

 

Sample Config.xml for 2.1 version.

<ConnectionDetail>
   <server>bao.test.local</server>
   <port>38080</port>
   <schedule-interval>90</schedule-interval>
   <enable-encryption>True</enable-encryption>
   <user>aoadmin</user>
   <encrypted-password>V1ZkU2lhOSGhOYWswOQ==</encrypted-password>
   <GridID>urn:jxta:uuid-E5D1AB3B3CB94AA48A93A7E2BE2814A702</GridID>
   <GridName>MyGrid</GridName>
   <AllowInsecureRequest>True</AllowInsecureRequest>
   <print-onscreen>True</print-onscreen>
   <EmailSetting>
   <SMTPServer>smtp.test.local</SMTPServer>
   <SMTPPort>465</SMTPPort>
   <Secure>True</Secure>
   <EmailUsername>admin@test.local</EmailUsername>
   <Email-encrypted-password>V1ZkU2RHRlhOSGhOYWswOQ==</Email-encrypted-password>
   <NotificationEmailAddress>support@test.local</NotificationEmailAddress>
   <Subject>BAO Monitoring script found some error in the Grid</Subject>
   <Message>BAO Monitoring script reported concerns for below components</Message>
   </EmailSetting>
   <monitor-peers>True</monitor-peers>
<!-- <peers peerType="HACDP">
  <peer></peer>
  </peers> -->
   <peers peerType="AP">
   <peer>https://ap1.test.local</peer>
   </peers>
<!-- <peers peerType="LAP">
  <peer></peer>
  </peers> -->
   <Threshold-limit>
   <memory>200</memory>
   <threads>400</threads>
   </Threshold-limit>
   <repoUrl>https://bao.test.local:28080/baorepo</repoUrl>
   <logging-level>INFO</logging-level>
</ConnectionDetail>

 

Note: For password encryption to work, set "enable-encryption" to "True" and provide the BAO password and SMTP password in encrypted form, generated with the encryption tool.

 

Steps to Run or Deploy the script or executable.

  • In Windows with .exe file.

        After configuring the "Config.xml" file, double-click the executable to run it directly, or run it from a command prompt.

  • In Windows from .py file

        After configuring the "Config.xml" file, open a command prompt and type python followed by a space and the script file name. To run it in the background, type pythonw followed by a space and the script file name.

  • In Linux from .py file

        After configuring the "Config.xml" file, open a terminal and type python3 followed by a space and the script file name. To run it in the background, append a space and "&" to the same command.

       Example for running in the background:

               python3 BAOMonitoring_v2.py &

 

 

Encrypting Password for version 2.1

  • To encrypt a password, run passwordtool.exe on Windows, passing the password string with the operation set to "encrypt" as shown below, and copy the encrypted output into the respective encrypted-password tag in Config.xml.

        passwordtool.exe -p P@ssw0rd -o encrypt

PLEASE NOTE: THIS IS NOT AN OFFICIAL BMC SCRIPT OR TOOL FOR MONITORING BAO/TSO, AND BMC SUPPORT WILL NOT BE ABLE TO PROVIDE ANY ASSISTANCE WITH IT.

I designed this script/tool only as a proof of concept, so please use it at your own discretion.

 

If anyone would like to contribute to the script, the source is available on GitHub:

GitHub - sandeep239/BAOMonitoringScript


At times you need to reset a test environment CDP or any other peer back to a freshly installed state. To do that, take a backup of the file system first, then follow these simple steps.

 

  • Stop the CDP or Peer Service.
  • Remove the files from KahaDB, located at AO_HOME/server/.jms/activemq-data/ao-grid-framework-embedded-broker-<guid>/KahaDB
  • Remove the log files located at AO_HOME/tomcat/logs

 

Reset Peer:

Remove the following:

  • "server" folder located in AO_HOME
  • "messages" folder located in AO_HOME
  • all XML files in "config" folder except authentication.xml, tuning-config.xml and installation_audit.xml
  • all files in tomcat/work
  • all files in tomcat/temp

Then start the peer service. It will re-create all those files and directories, just as it does the first time it starts up.

 

NOTE: These steps are only for non-production environments; do not try them in production.
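
Before deleting anything, it can help to list what the "Reset Peer" steps would touch. The sketch below is a dry run only and deletes nothing; reset_candidates is a hypothetical helper name, and the folder layout is taken from the list above.

```python
from pathlib import Path

# The three config files that must survive the reset.
KEEP = {"authentication.xml", "tuning-config.xml", "installation_audit.xml"}


def reset_candidates(ao_home):
    """List the paths the reset steps would remove, without deleting anything."""
    home = Path(ao_home)
    targets = [home / "server", home / "messages"]
    # Every XML file in config/ except the three that must be kept.
    targets += sorted(p for p in (home / "config").glob("*.xml")
                      if p.name not in KEEP)
    # Contents of tomcat/work and tomcat/temp (the folders themselves stay).
    for sub in ("work", "temp"):
        targets += sorted((home / "tomcat" / sub).glob("*"))
    return [p for p in targets if p.exists()]
```

Reviewing the returned list against your file system backup is a cheap safeguard before removing the paths by hand.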

 

 


I quite often see scenarios where a customer needs to deploy a peer in a different network or in a DMZ, while the CDPs communicate with that peer using a NAT IP address.

 

Below is a high-level overview of a deployment configuration that provides connectivity with an LAP located in a different network/DMZ using a NAT IP address. This deployment minimizes the impact on the DMZ firewall rules. The grid is deployed in a high-availability configuration to provide redundancy for the processing peers. Inside the DMZ or other network, a lightweight activity peer hosts the required adapters.

 

Let's look at some basic concepts of peer discovery and peer communication, which are tied to the JMS broker.

Each peer is associated with a JMS broker, which hosts a set of queues used to send messages targeted at that peer. A given broker may host queues for any number of peers. The combination of broker URI and topology queue is referred to as an "advertisement".

 

Each peer participates in the management grid "discovery" topology, with a subset of the peers designated to provide discovery services to the other peers. The advertisements associated with discovery peers are persisted with each peer in the environment. Upon startup, a peer registers the advertisements for its topology with the master in the discovery topology, and periodically requests advertisements matching the topologies in which the peer participates.

 

Non-discovery advertisements are never persisted and are maintained only in memory. When a peer learns about a new discovery server (this occurs when an HA CDP is added to a grid), the associated advertisement is persisted with that peer.

 

discover_Peer.png

 

Broker Config

Most JMS configuration files are located in $AO_HOME/server/.jms

 

The broker-config.xml file specifies the JMS broker configuration for an individual peer.

The XML document can contain the following nodes:

  • external - indicates whether the broker is hosted within this peer's VM or externally
  • uri - the broker URI. This specifies the protocol, IP (or host), port, and any ActiveMQ parameters associated with the broker.
  • port-upper-bound - if present, indicates that the broker port may range from the port specified in the uri up to the number provided in this element. Upon peer startup, the first available port in the range will be used
  • listen-addresses - if present, forces the server to listen only on the addresses specified
  • advertise-addresses - if present, forces the server to advertise only the addresses specified

 

activemq-data - a directory used by ActiveMQ

certificate - the certificate used for SSL JMS connections.

disco - a directory used to hold discovery advertisements (ideally you will find only the CDP and HACDP disco files here)

disco.png

 

In the example below, I have a data center containing a CDP, HACDP, repository, and authentication system (AM/ASSO/RSSO, depending on the product version in use).
CDP 1 has a host IP address of 10.0.0.4 and a NAT IP address of 192.168.2.103; CDP 2 has a host IP address of 10.0.0.5 and a NAT IP address of 192.168.2.104. The rest of my BAO components are in the same data center as the CDPs, except for one LAP, which is located in a different location, a different data center, or a DMZ. The LAP has a host IP address of 192.168.2.106.

NATIP.png

It is possible to specify an address to advertise, such as a NAT IP address, instead of the CDP's host IP address.

CDP Broker-config.xml (located in $AOCDPHOME\server\.jms)

<broker-config>
   <external>false</external>
   <uri>ssl://CDP-IP:PEER-COMMUNICATION-PORT?connectionTimeout=1000</uri>
   <advertise-addresses>
      <address>NAT IP Address of CDP</address>
      <address>CDP-1 FQDN</address>
   </advertise-addresses>
</broker-config>

 

Note: It is also possible to use hostnames in the advertise-addresses block; the hostname can then be mapped in the hosts file on each individual peer.
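
As an illustration of the structure above, the following sketch generates a broker-config.xml body with an advertise-addresses block from Python. The helper name build_broker_config is hypothetical, and the port in the usage line is a placeholder, not a documented default; substitute your own peer communication port.

```python
import xml.etree.ElementTree as ET


def build_broker_config(host, port, advertise_addresses, timeout_ms=1000):
    """Return broker-config.xml text that advertises the given addresses."""
    root = ET.Element("broker-config")
    ET.SubElement(root, "external").text = "false"
    ET.SubElement(root, "uri").text = (
        f"ssl://{host}:{port}?connectionTimeout={timeout_ms}"
    )
    block = ET.SubElement(root, "advertise-addresses")
    # One <address> element per NAT IP or hostname the peer should advertise.
    for address in advertise_addresses:
        ET.SubElement(block, "address").text = address
    return ET.tostring(root, encoding="unicode")


# Hypothetical values: CDP 1 from the example, advertising its NAT IP and a made-up FQDN.
print(build_broker_config("10.0.0.4", 61719, ["192.168.2.103", "cdp1.test.local"]))
```

Generating the file this way mainly guards against malformed XML; always compare the result with the existing broker-config.xml before replacing it.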


One of the biggest challenges customers face when upgrading from BAO 7.6.x to BAO 7.7 or 7.8 is that ASSO doesn't allow multiple WebApp URLs. If you try to use a CNAME, alias, VIP, or load balancer (to avoid a single point of failure), it will always rewrite the URL back to the registered WebApp URL, which may eventually cause a request failure if the URL is not resolvable from the outside network or, in some cases, from a restricted internal network.

 

The example below applies to scenarios where a VIP or load balancer is used to access the ORCA web service to avoid a single point of failure.

Let's start with an example. Assume I have a BAO CDP server named "cdp.bao.local"; this is the name that was used during installation and in the ASSO WebApp URL. I also have an HACDP named "hacdp.bao.local", and a load balancer in front of these two servers to manage the ORCA requests and avoid a single point of failure. The load balancer has the alias/CNAME "xyz.bmc.com", which I use in my code or third-party applications to access the ORCA web service from outside the network. To access the ORCA web service through the external URL, I use "https://xyz.bmc.com:28080/baocdp/orca". However, when the request reaches the server, ASSO automatically rewrites it to the registered WebApp URL, which in this case is the internal URL, i.e. "https://cdp.bao.local:28080/baocdp/orca" or "https://hacdp.bao.local:28080/baocdp/orca" (depending on where the load balancer directs the request). That URL may not be accessible from the outside network, irrespective of whether ORCA is added to the excluded URLs in SSO. This is the default behavior of Atrium SSO and, to the best of my knowledge, cannot be changed.

To solve this, we can re-register the ASSO agent with the external name (CNAME or alias) or the load balancer name, and ORCA will start to work. However, depending on how your infrastructure is designed and the number of redirections in your network, you may find yourself stuck in a redirection loop in the browser whenever you try to access the grid console using the internal URL ("https://cdp.bao.local:28080/baocdp" or "https://hacdp.bao.local:28080/baocdp"), because the URL is rewritten to "https://xyz.bmc.com:28080/baocdp", or the browser may end with a message saying there are too many redirections. The same happens if a CNAME or VIP is used.

 

To resolve this problem, we have to tweak the system a bit with the trick below.
We need a name/FQDN that can be resolved internally and is also accessible externally (i.e. the name resolves to your load balancer IP address). Accessing the grid console requires direct access so that we don't get stuck in a redirect loop; ORCA usually works either way.

 

Below is a high-level overview of the steps we will perform.
1. Register the CDP ASSO agent with the load balancer or CNAME FQDN, let's say xyz.bmc.com.
2. In the hosts file of the machine from which Grid Manager will be accessed and administered, add a mapping from the CDP IP to the load balancer FQDN.

 

HAORCA.png

 

Apart from the above, there is an alternative option:

1. Continue to use the CDP and HACDP names in the ASSO agent, but make the calling system resolve the CDP and HACDP names to the IP address of the load balancer.

2. Add a hostname mapping from the CDP and HACDP names to the load balancer IP/VIP on the external systems that call the ORCA web service.
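
Both options depend on a name resolving to a particular IP address on a particular machine, so it is worth verifying the hosts file mapping after editing it. The sketch below uses the standard socket.gethostbyname call; resolves_to is a hypothetical helper, and the IP in the usage line is only an example value.

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.gethostbyname):
    """Return True if hostname resolves to expected_ip on this machine."""
    try:
        return resolver(hostname) == expected_ip
    except OSError:
        # socket.gaierror is a subclass of OSError: the name did not resolve.
        return False


# Example: on the admin machine, xyz.bmc.com should map to the CDP host IP.
print(resolves_to("xyz.bmc.com", "10.0.0.4"))
```

The resolver parameter exists only so the check can be exercised without real DNS; in normal use the default is enough.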

 

HAORCA2.jpg

 

Note :

1. This workaround offers a kind of high availability for the ORCA web service with a single endpoint (for consumption by third-party applications and custom code), provided that requests reach the respective CDP nodes without any failure at the network or OS level. The product was not originally intended to offer high availability for ORCA web services with a single endpoint; by design, an HA grid configuration with the CDP and HA-CDP manages all CDP failures by election. That is, the configuration provides redundancy for the job processing elements of the grid.

 

2. This problem is addressed in the BAO 7.9 release, which introduces RSSO as a new authentication system: in a typical new installation, RSSO uses the same Tomcat instance as BAO, so there is no URL rewrite.

 

 

New Workaround

-----------------------

Recent developments showed us that there is one more workaround, which can be implemented by modifying the web.xml of the CDPs to bypass the redirection for the ORCA, RESTful, and legacy web services.

 

Below are the changes we need to make in the CDP's web.xml (located at "<BAO_CDP_HOME>\tomcat\webapps\baocdp\WEB-INF\web.xml").

Change the SSO agent URL pattern to /gm/* instead of /* in web.xml, as shown below. This prevents the SSO JEE filter from redirecting or rewriting the URL to the registered WebApp URL when the ORCA, legacy, or RESTful web services are called.

    <filter>
        <filter-name>Agent</filter-name>
        <filter-class>com.bmc.atrium.sso.agents.web.jee.JEEFilter</filter-class>
        <init-param>
            <!-- SSOPrincipal -->
            <param-name>principalAttrName</param-name>
            <param-value>com.bmc.ao.sso.principal</param-value>
        </init-param>
        <init-param>
            <!-- SSOToken -->
            <param-name>tokenAttrName</param-name>
            <param-value>com.bmc.ao.sso.token</param-value>
        </init-param>
        <init-param>
            <!-- Identifier from SSOToken -->
            <param-name>tokenIdAttrName</param-name>
            <param-value>com.bmc.ao.sso.tokenId</param-value>
        </init-param>
        <init-param>
            <!-- Name from SSOPrincipal -->
            <param-name>useridAttrName</param-name>
            <param-value>com.bmc.ao.sso.userid</param-value>
        </init-param>
    </filter>
    <filter-mapping>
        <filter-name>Agent</filter-name>
        <url-pattern>/gm/*</url-pattern>
        <dispatcher>REQUEST</dispatcher>
        <dispatcher>INCLUDE</dispatcher>
        <dispatcher>FORWARD</dispatcher>
        <dispatcher>ERROR</dispatcher>
    </filter-mapping>

With this new workaround you don't have to make any further changes to the hosts file mapping, and you can use an alias, VIP, or CNAME when calling the web services.
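
After editing web.xml on each CDP, a quick read-only check can confirm that the Agent filter is mapped to /gm/* and not /*. The sketch below is an assumption about how one might verify this with Python's standard XML parser; agent_url_patterns is a hypothetical helper. It ignores XML namespaces because web.xml files commonly declare one on the root element.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip an XML namespace prefix such as '{http://...}filter-mapping'."""
    return tag.rsplit("}", 1)[-1]


def agent_url_patterns(web_xml_text, filter_name="Agent"):
    """Return the url-pattern values mapped to the named filter."""
    root = ET.fromstring(web_xml_text)
    patterns = []
    for element in root.iter():
        if local(element.tag) != "filter-mapping":
            continue
        names = [c.text.strip() for c in element
                 if local(c.tag) == "filter-name" and c.text]
        if filter_name in names:
            patterns += [c.text.strip() for c in element
                         if local(c.tag) == "url-pattern" and c.text]
    return patterns
```

A result of ["/gm/*"] indicates the change is in place; the function only reads the file content and never rewrites it, so comments and formatting in web.xml are left untouched.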
