
TrueSight Server Automation


A couple of customers followed my old guide about hacking the VPC to work with other RPM-based Linux distributions, so I figured I'd document the steps specifically for Amazon Linux.  After these changes you will be able to use the VPC to run Patch Analysis for Amazon Linux and continue to use it for CentOS.  Standard disclaimer: this is not supported, your mileage may vary, etc.

 

We'll be editing a few files:

  • <VPC Install Directory>/patch/linuxpu/Scripts/Perl/linuxpc.pl
  • <VPC Install Directory>/patch/linuxpu/Scripts/Jython/linux-analysis.py
  • <VPC Install Directory>/patch/linuxpu/Work/linux-analyze.sh
  • <VPC Install Directory>/patch/linuxpu/Work/linux-deploy.sh

 

Then we will sync the repositories and generate metadata, configure the NSH Script Job for analysis with these repositories, run the job, and review the results.

 

File Modifications

linuxpc.pl

 

In the linuxpc.pl file around line 57 you see:

our $suse_relfile   = "/etc/SuSE-release";

our $redhat_relfile = "/etc/redhat-release";

add another line for the /etc/system-release file:

our $suse_relfile   = "/etc/SuSE-release";

our $redhat_relfile = "/etc/redhat-release";

our $amzn_relfile   = "/etc/system-release";

 

Around line 706 you see:

if ( !( system_cmd("nexec $host 'sh -c \"rpm -q --quiet centos-release\"'") ) )

{

       $os_vendor = "cent";

       $os_rel_file=$redhat_relfile;

}

after that add a section to test for the system-release rpm:

 

if ( !( system_cmd("nexec $host 'sh -c \"rpm -q --quiet centos-release\"'") ) )

{

       $os_vendor = "cent";

       $os_rel_file=$redhat_relfile;

}

 

if ( !( system_cmd("nexec $host 'sh -c \"rpm -q --quiet system-release\"'") ) )

{

        $os_vendor = "amzn";

        $os_rel_file=$amzn_relfile;

}

 

Around line 746 look for:

}elsif($os_vendor eq "cent") {

     $os_version = $a[2];

     if($os_version eq "release") {

           $os_version = $a[3];

     }

     $os_version = int $os_version;

}

and add:

}elsif($os_vendor eq "cent") {

       $os_version = $a[2];

       if($os_version eq "release") {

               $os_version = $a[3];

       }

       $os_version = int $os_version;

}elsif($os_vendor eq "amzn") {

       $os_version = $a[3];

}

 

Around line 792 look for:

}elsif($os_relstring=~/CentOS Linux release/gi){

   $os_relstring='COS';

   }else{

and add:

}elsif($os_relstring=~/CentOS Linux release/gi){

     $os_relstring='COS';

     }elsif($os_relstring=~/Amazon Linux release/gi){

     $os_relstring='AMZN';

     }else{

 

After making the changes above you must update the copy of this script on the file server, either by pasting the contents via the GUI into the Depot:/Patch Analysis Items/Linux Patch Analysis/Scripts/Linux Patch Analysis NSH Script object, or by copying the file directly onto the file server as <File Server Root>/scripts/xxxxx_linuxpc.pl

 

linux-analysis.py

Around line 484:

        if check_remote_file(host, '/etc/SuSE-release'):

            error, out = nexec(host, 'cat /etc/SuSE-release')

            if error:

                print_error('Cannot read /etc/SuSE-release from host %s' %host)

                print_error('Error is *%s*' %out)

                print_error('Skipping host %s' %host)

                print_error('($host) Cannot read /etc/SuSE-release.')

                print_error('($host) Skipping Host.')

                continue

            found = 1

add:

        if check_remote_file(host, '/etc/SuSE-release'):

            error, out = nexec(host, 'cat /etc/SuSE-release')

            if error:

                print_error('Cannot read /etc/SuSE-release from host %s' %host)

                print_error('Error is *%s*' %out)

                print_error('Skipping host %s' %host)

                print_error('($host) Cannot read /etc/SuSE-release.')

                print_error('($host) Skipping Host.')

                continue

            found = 1

        if check_remote_file(host, '/etc/system-release'):

            error, out = nexec(host, 'cat /etc/system-release')

            if error:

                print_error('Cannot read /etc/system-release from host %s' %host)

                print_error('Error is *%s*' %out)

                print_error('Skipping host %s' %host)

                print_error('($host) Cannot read /etc/system-release.')

                print_error('($host) Skipping Host.')

                continue

            found = 1

Around line 487:

        if os_relstr.count('CentOS Linux release'):

            os_release='COS';

            ver_indx = 3;

add:

        if os_relstr.count('CentOS Linux release'):

            os_release='COS';

            ver_indx = 3;

 

        if os_relstr.count('Amazon Linux release'):

            os_release='AMZN';

            ver_indx = 3;
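To see why ver_indx = 3 works for both distributions, here is a minimal standalone sketch of the same idea in plain Python. It is not the VPC's code: the function name classify_release and the two sample release strings are assumptions for illustration.

```python
def classify_release(os_relstr):
    """Return (tag, version) for a release string, or (None, None) if unknown."""
    # Each rule: (marker substring, release tag, index of the version token).
    rules = [
        ("CentOS Linux release", "COS", 3),
        ("Amazon Linux release", "AMZN", 3),
    ]
    tokens = os_relstr.split()
    for marker, tag, ver_indx in rules:
        if marker in os_relstr and len(tokens) > ver_indx:
            return tag, tokens[ver_indx]
    return None, None


# Sample strings (assumed); the fourth whitespace-separated token is the version.
print(classify_release("Amazon Linux release 2 (Karoo)"))
print(classify_release("CentOS Linux release 7.9.2009 (Core)"))
```

In both strings the first three tokens are the vendor name and the word "release", so the version is always the token at index 3.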

 

linux-analyze.sh

Around line 15:

osVer=$(rpm -q --queryformat "%{VERSION}\n" centos-release)

if [[ ${osVer} -eq 7 ]]

        then

        bl_yum=$(which yum)

else

       bl_yum="$(readlink /proc/$(ps -ef | grep rscw | grep -v grep | awk '{print $2}')/exe | sed "s/rscd_full/blyum/g")"

fi

Comment out the whole section, and add the bl_yum definition:

#osVer=$(rpm -q --queryformat "%{VERSION}\n" centos-release)

#if [[ ${osVer} -eq 7 ]]

#        then

#        bl_yum=$(which yum)

# else

#       bl_yum="$(readlink /proc/$(ps -ef | grep rscw | grep -v grep | awk '{print $2}')/exe | sed "s/rscd_full/blyum/g")"

#fi

bl_yum=$(which yum)

 

Around line 169:

        echo "$os_rel_str" | grep -qi "CentOS release"

        if [[ "$?" == "0" ]]; then

                release="COS"

                return 0

        fi

Add:

        echo "$os_rel_str" | grep -qi "CentOS release"

        if [[ "$?" == "0" ]]; then

                release="COS"

                return 0

        fi

 

        echo "$os_rel_str" | grep -qi "Amazon Linux release"

        if [[ "$?" == "0" ]]; then

                release="AMZN"

                return 0

        fi

 

Around line 259:

                if [[ ${osVer} -eq 7 ]] || [[ ${osVer} -eq 2 ]]

                         then

                         repo_err=`cat $result_file | grep "Is this ok \[y\/d\/N\]\:"`

                 else

                         repo_err=`cat $result_file | grep "Is this ok \[y\/N\]\:"`

                fi

Comment out the section and define repo_err:

#                if [[ ${osVer} -eq 7 ]] || [[ ${osVer} -eq 2 ]]

#                         then

#                         repo_err=`cat $result_file | grep "Is this ok \[y\/d\/N\]\:"`

#                 else

#                        repo_err=`cat $result_file | grep "Is this ok \[y\/N\]\:"`

#               fi

repo_err=`cat $result_file | grep "Is this ok \[y\/d\/N\]\:"`
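The original branch exists because yum prints one of two prompt forms depending on its version. As an illustration only (this is not what linux-analyze.sh does), a single pattern can accept both forms, which avoids hardcoding one of them. The names YUM_PROMPT and has_yum_prompt are hypothetical.

```python
import re

# Matches both prompt forms: "Is this ok [y/N]:" and "Is this ok [y/d/N]:".
YUM_PROMPT = re.compile(r"Is this ok \[y/(?:d/)?N\]:")


def has_yum_prompt(text):
    """Return True if the yum confirmation prompt appears anywhere in text."""
    return YUM_PROMPT.search(text) is not None


print(has_yum_prompt("Total download size: 12 M\nIs this ok [y/d/N]: "))
print(has_yum_prompt("Is this ok [y/N]: "))
```

The same optional group could be used with an extended regular expression in grep if you prefer to keep one repo_err line that covers both prompts.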

 

linux-deploy.sh

Around line 13:

osVer=$(rpm -q --queryformat "%{VERSION}\n" centos-release)

if [[ ${osVer} -eq 7 ]]

        then

        bl_yum=$(which yum)

else

        bl_yum="$(readlink /proc/$(ps -ef | grep rscw | grep -v grep | awk '{print $2}')/exe | sed "s/rscd_full/blyum/g")"

fi

Comment out the whole section, and add the bl_yum definition:

#osVer=$(rpm -q --queryformat "%{VERSION}\n" centos-release)

#if [[ ${osVer} -eq 7 ]]

#        then

#        bl_yum=$(which yum)

# else

#       bl_yum="$(readlink /proc/$(ps -ef | grep rscw | grep -v grep | awk '{print $2}')/exe | sed "s/rscd_full/blyum/g")"

#fi

bl_yum=$(which yum)

 

Sync the repository and generate metadata

On an Amazon Linux system, install the RSCD agent and allocate enough space for the repository.  In the example below, that's the /srv/patch/amazon-linux directory.  The amzn2-core and amzn2extra-docker repositories take up around 25 GB of space.  Ensure createrepo is installed on this system.  Run reposync and specify the directory with the space you allocated, e.g.:

reposync -p /srv/patch/amazon-linux --download-metadata

You will see one subdirectory under /srv/patch/amazon-linux for each channel you are subscribed to.  For each directory run:

createrepo --no-database --simple-md-filenames /srv/patch/amazon-linux/<channel_name>;cd /srv/patch/amazon-linux/<channel_name>;zip -r repodata.zip repodata

Each sub-directory of the repository root should now have its own repodata.zip, e.g.:

/srv/patch/amazon-linux/amzn2-core/repodata.zip

/srv/patch/amazon-linux/amzn2extra-docker/repodata.zip
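The per-channel step above can be looped over every subdirectory. The following is a rough sketch, not a tested NSH Script Job: the function names are made up for illustration, and it assumes createrepo is on the PATH and that reposync has already populated the repository root.

```python
import os
import subprocess
import zipfile


def channel_dirs(repo_root):
    """Return the channel subdirectories under repo_root, sorted by name."""
    return sorted(
        os.path.join(repo_root, name)
        for name in os.listdir(repo_root)
        if os.path.isdir(os.path.join(repo_root, name))
    )


def createrepo_command(channel_dir):
    """Build the createrepo command line with the options used in this post."""
    return ["createrepo", "--no-database", "--simple-md-filenames", channel_dir]


def zip_repodata(channel_dir):
    """Write <channel_dir>/repodata.zip containing the files under repodata/."""
    zip_path = os.path.join(channel_dir, "repodata.zip")
    repodata = os.path.join(channel_dir, "repodata")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for folder, _subdirs, files in os.walk(repodata):
            for name in files:
                full = os.path.join(folder, name)
                # Store entries relative to the channel directory (repodata/...).
                zf.write(full, os.path.relpath(full, channel_dir))
    return zip_path


def refresh_metadata(repo_root, run=subprocess.check_call):
    """Run createrepo, then rebuild repodata.zip, for every channel directory."""
    for channel_dir in channel_dirs(repo_root):
        run(createrepo_command(channel_dir))
        zip_repodata(channel_dir)
```

Calling refresh_metadata("/srv/patch/amazon-linux") would process each channel in turn. Unlike zip -r, this sketch stores only file entries in the archive, not separate directory entries.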

 

At some point I will write up a download and createrepo script that can be run as an NSH Script Job...

 

Configure the NSH Script Job for analysis

Edit the <VPC Install Directory>/linuxpu/Work/linuxrepo.conf and add a line for each repository:

amzn2-core=//reposerver/srv/patch/amazon-linux/amzn2-core,AMZN2x86_64

amzn2-docker=//reposerver/srv/patch/amazon-linux/amzn2extra-docker,AMZN2x86_64
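Each line has the shape tag=location,suffix. As a quick way to check entries before running the job, here is a small parsing sketch. The field names tag, location and os_arch are my own labels for the three parts, not names used by the VPC.

```python
def parse_repo_line(line):
    """Split a 'tag=location,os_arch' line into its three parts."""
    tag, _, rest = line.strip().partition("=")
    # The last comma separates the repository location from the OS/arch suffix.
    location, _, os_arch = rest.rpartition(",")
    if not (tag and location and os_arch):
        raise ValueError("malformed linuxrepo.conf line: %r" % line)
    return {"tag": tag, "location": location, "os_arch": os_arch}


entry = parse_repo_line(
    "amzn2-core=//reposerver/srv/patch/amazon-linux/amzn2-core,AMZN2x86_64"
)
print(entry["tag"], entry["location"], entry["os_arch"])
```

The tag on the left of the equals sign is the value you then pass in the job's Linux Patch Repository argument.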

 

In the Linux Patch Analysis Job, alter the Linux Patch Repository argument to contain the Amazon Linux repository tags you added to the linuxrepo.conf file.

 

Run the job and review results

The list of missing patches appears in the Linux Patch Analysis Results Extended Object in Live Browse of each target server.

 

 

 

Deploy Patches

By default, the NSH Script Job for analysis will generate a Deploy Job (in ap mode), and possibly a Batch Job if there are multiple targets needing different patches, in /Patch Analysis Jobs/Linux Patch Analysis Jobs/Patch Deploy Jobs.

Execute the Job, re-run analysis, and confirm there are no missing patches.

 

 

Conclusion

And that's it.

 

While I was working this out, I set up a couple of instances of Amazon Linux on-prem, which might be helpful if you are testing this out.


To write up the Getting the VPC to work with Amazon Linux article I needed a couple of Amazon Linux 2 systems to work with.  I was going to spin up some instances in AWS when Cody Dean pointed me to this article: Running Amazon Linux 2 as a virtual machine on premises - Amazon Elastic Compute Cloud.  That sounded like a fun project, and I would not have to worry about the cost of the AWS instances while I worked out the VPC changes.  The setup uses cloud-init to do the initial configuration.  This was good because I could have the configuration ready to go in case I needed to come back to this after I'd torn down the VMs.  Cloud-init would handle the network and other configuration, and I'd have a couple of systems set up: one repository server with reposync and createrepo installed and an NFS mount set up for the yum repository files, and the other a target to perform analysis on.  I also wanted to enable root ssh access for convenience.  The target shouldn't be updated, because of course I need to test analysis for outdated packages.

 

Per the linked article, the process to boot Amazon Linux 2 in vCenter involves getting an OVF and generating a seed.iso that contains the cloud-init configuration settings you want to apply.  After a little googling and trial and error, I worked out the settings I needed.  The resolv.conf cloud-init module didn't seem to work and the version of cloud-init had issues with NFS mounts, so I wrote the files I needed directly rather than use the respective modules.  The repo_upgrade: none setting stops cloud-init from running a yum upgrade during the first boot.  Normally you want that upgrade so your system is up to date, but in my case it would make it hard to test patching.

 

AL2 Repository Configuration

 

meta-data

local-hostname: blprov2002a.example.com

network-interfaces: |

  auto eth0

  iface eth0 inet static

  address 192.168.8.154

  network 192.168.8.128

  netmask 255.255.255.192

  broadcast 192.168.8.191

  gateway 192.168.8.129

bootcmd:

  - ifdown eth0

  - ifup eth0 

user-data

#cloud-config

# vim:syntax=yaml

users:

  - default

  - name: bladelogic

    groups: sudo

    sudo: ['ALL=(ALL) NOPASSWD:ALL']

    plain_text_passwd: mypasswd

    lock_passwd: false

chpasswd:

  list: |

    root:mypasswd

disable_root: false

ssh_pwauth: true

write_files:

  - path: /etc/cloud/cloud.cfg.d/80_disable_network_after_firstboot.cfg

    content: |

      # Disable network configuration after first boot

      network:

        config: disabled

  - path: /etc/resolv.conf

    content: |

      nameserver 192.168.8.130

      nameserver 192.168.8.131

      options rotate attempts:3 timeout:1

      search example.com

      domain example.com

  - path: /etc/systemd/system/srv-patch.mount

    content: |

      [Unit]

      After=network.target

      Before=remote-fs.target

      [Mount]

      What=share.example.com:/export/bladelogic/repo/blprov2002

      Where=/srv/patch

      Type=nfs

      Options=rw,sec=sys,proto=tcp,hard,_netdev

      [Install]

      WantedBy=multi-user.target

packages:

  - createrepo

runcmd:

  - systemctl daemon-reload

  - mkdir -p /srv/patch

  - systemctl --now enable srv-patch.mount

 

AL2 Target Configuration

meta-data

local-hostname: al2-2002.example.com

network-interfaces: |

  auto eth0

  iface eth0 inet static

  address 192.168.9.18

  network 192.168.9.0

  netmask 255.255.255.0

  broadcast 192.168.9.255

  gateway 192.168.9.1

bootcmd:

  - ifdown eth0

  - ifup eth0 

 

user-data

#cloud-config

# vim:syntax=yaml

users:

  - default

  - name: bladelogic

    groups: sudo

    sudo: ['ALL=(ALL) NOPASSWD:ALL']

    plain_text_passwd: mypasswd

    lock_passwd: false

chpasswd:

  list: |

    root:mypasswd

disable_root: false

ssh_pwauth: true

write_files:

  - path: /etc/cloud/cloud.cfg.d/80_disable_network_after_firstboot.cfg

    content: |

      # Disable network configuration after first boot

      network:

        config: disabled

  - path: /etc/resolv.conf

    content: |

      nameserver 192.168.9.2

      nameserver 192.168.9.3

      options rotate attempts:3 timeout:1

      search example.com

      domain example.com

repo_upgrade: none

 

To generate the seed.iso (the name doesn't matter), create a directory with the meta-data and user-data files and run the command (Linux, check the article for Windows):

genisoimage -output al2-2002.iso -volid cidata -joliet -rock user-data meta-data

Then, as noted in the article, after deploying the OVF, add a CDROM/DVD device to the Virtual Machine and specify the ISO you generated.  Boot the VM and wait for the configuration to be applied.
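A typo in the static network values in meta-data only shows up after the VM boots, so it can be worth checking them before building the ISO. This is a small sketch using Python's standard ipaddress module to derive the network and broadcast values from the address and netmask. The function name static_network_fields is just for illustration.

```python
import ipaddress


def static_network_fields(address, netmask):
    """Derive the network and broadcast values for a static interface stanza."""
    iface = ipaddress.ip_interface("%s/%s" % (address, netmask))
    net = iface.network
    return {
        "address": str(iface.ip),
        "network": str(net.network_address),
        "netmask": str(net.netmask),
        "broadcast": str(net.broadcast_address),
    }


# Values from the repository server's meta-data file above.
fields = static_network_fields("192.168.8.154", "255.255.255.192")
for key in ("address", "network", "netmask", "broadcast"):
    print("  %s %s" % (key, fields[key]))
```

For the repository server this gives network 192.168.8.128 and broadcast 192.168.8.191, which matches the meta-data file shown earlier.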

 

I now have a couple of systems that I can use to work out modifying the VPC for Amazon Linux.  The above could be used for any on-prem testing with Amazon Linux.


Most TrueSight Server Automation (TSSA) environments consist of thousands, or tens of thousands, of RSCD Agents installed on the Target Servers being managed by the TSSA Application Server.

 

With so many RSCD Agents enrolled in an environment, and every TSSA job that runs against a Target Server utilizing the Agent, it is not uncommon to encounter various errors when communicating with a subset of the RSCD Agents.

 

These errors can typically be grouped into a couple of categories:

 

1) Agent ACL and User Mapping Issues

 

Examples of such errors, as seen from the TSSA Job Run Logs and Application Server logs, include:

 

  • No authorization to access host
  • Login not allowed for user

 

On the Target Server side, examples of the corresponding, and more-detailed, errors which might be seen in the rscd.log include:

 

  • Failed to map user to local user
  • Host not granted access
  • command: "XXXXXX" not authorized
  • User Impersonation Failed for mapped user
  • No mapping between account names and security IDs was done
  • The user has not been granted the requested logon type at this computer
  • Account restrictions are preventing this user from signing in
  • The user name or password is incorrect

 

For Windows Targets, the possible Agent ACL and User Mapping errors will depend on whether User Privilege Mapping (UPM) or Windows User Mapping (Automation Principals) are being used.

 

 

2) Target Server/Agent Connectivity Issues

 

Examples of such errors seen from the TSSA Job Run Logs and Application Server logs include:

 

  • No Route to Host
  • Remote host is unknown
  • Connection timed out
  • Connection refused
  • Connection Reset or Broken Pipe

 

These errors are often caused by DNS issues, firewall rules, an RSCD Agent that is not running, intermittent network issues, idle timeouts, and so on.
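When triaging a long job run log, it can help to sort failures into these two categories first. Below is a minimal sketch that does this by matching the job-log messages listed above. The function name, category labels, and sample log lines are assumptions, and real messages may be worded differently across versions.

```python
# Messages taken from the job run log examples listed in this post.
ACL_MARKERS = (
    "No authorization to access host",
    "Login not allowed for user",
)
CONNECTIVITY_MARKERS = (
    "No Route to Host",
    "Remote host is unknown",
    "Connection timed out",
    "Connection refused",
    "Connection Reset",
    "Broken Pipe",
)


def classify_error(message):
    """Return 'acl_user_mapping', 'connectivity' or 'unknown' for a log message."""
    text = message.lower()
    if any(marker.lower() in text for marker in ACL_MARKERS):
        return "acl_user_mapping"
    if any(marker.lower() in text for marker in CONNECTIVITY_MARKERS):
        return "connectivity"
    return "unknown"


# Hypothetical log lines for illustration.
print(classify_error("server42: No authorization to access host"))
print(classify_error("server43: Connection refused"))
```

The matching is case-insensitive because the same message can appear with different capitalization in different logs.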

 

 

Troubleshooting Guide for RSCD Agent connectivity issues

 

The TrueSight Server Automation Customer Support and Engineering teams have produced a new Troubleshooting Guide which can be used to walk through the troubleshooting process for errors such as those listed above. The new Troubleshooting Guide can be found here.

 

 

Videos for RSCD Agent connectivity issues

 

To complement this new Troubleshooting Guide, we have also created a couple of new videos on troubleshooting some of the distinct error messages listed above. More videos will follow on this topic but the first two videos are located here:

 

 

Video 1 - RSCD Agent Connectivity Issues - "No Authorization to access host" (KA000266955)

 

 

 

 

Video 2 - RSCD Agent Connectivity Issues - "Login not allowed for user" (KA000321629)

 


One thing we've been having some conversations with customers about is how easy or hard it is to start or stop TSSA.  While you can use TSSA to start or stop any other application, how do you restart your favorite automation application in an orderly fashion?  How do you start it as fast as possible, while still catching all of the hosts?

 

So we wrote a script that you can use to cleanly shut down and cleanly start up your environment, in the same order, every time.  This also makes it easier to make the same appserver the leader (the one that does job and workitem distribution) every time.

 

That script is now on GitHub, in the Remediate section, as remediate/app-control.nsh at master · bmcsoftware/remediate · GitHub.  Feel free to comment there or here, or reach out via email, Telegram, or TikTok.

 

Here's typical usage:

 

# nsh app-control.nsh usage

Usage: app-control.nsh {start|stop|restart|status|restart-leader}

#

 

server1# nsh app-control.nsh restart

=======================

Stopping app servers

=======================

Stopping blappserv blprocserv on server3

Stopping TrueSight Server Automation AppServer ... All appserver processes have been terminated successfully.

Stopping TrueSight Server Automation ProcessSpawner ... OK

Stopping blappserv blprocserv on server2

Stopping TrueSight Server Automation AppServer ... All appserver processes have been terminated successfully.

Stopping TrueSight Server Automation ProcessSpawner ... OK

Stopping leader server server1

Stopping blappserv blprocserv on leader server server1

Stopping TrueSight Server Automation AppServer ... All appserver processes have been terminated successfully.

Stopping TrueSight Server Automation ProcessSpawner ... OK

Done stopping leader app server, another server will get elected leader

... and stop-leader will no longer apply...

=======================

Done stopping app servers

=======================

=======================

Starting up app servers

=======================

Starting blappserv blprocserv on leader server server1

Starting TrueSight Server Automation AppServer ... OK

Starting TrueSight Server Automation ProcessSpawner ... OK

Done starting leader app server

Waiting 60 sec for head start

Starting blappserv blprocserv on server2

Starting TrueSight Server Automation AppServer ... OK

Starting TrueSight Server Automation ProcessSpawner ... OK

Starting blappserv blprocserv on server3

Starting TrueSight Server Automation AppServer ... OK

Starting TrueSight Server Automation ProcessSpawner ... OK

=======================

Done starting app servers

=======================

server1#

 

 

 

 

 

 

 

 

server1# nsh app-control.nsh status

=======================

Checking for number of Appserver Processes on each appserver

=======================

Checking server1, count of appserver processes: should be 1 or more if running...

       1

Checking server1, count of launcher processes: should be 1 or more if running...

       1

All java processes (more than one is fine):

bladmin  18051 18041  7 16:57 ?        00:00:06 /opt/bmc/bladelogic/NSH/br/java/

bladmin  18103 18094 10 16:57 ?        00:00:09 /opt/bmc/bladelogic/NSH/br/java/

bladmin  18138 18051 79 16:57 ?        00:01:05 /opt/bmc/bladelogic/NSH/br/java/

bladmin  18774 18768 30 16:58 ?        00:00:08 /opt/bmc/bladelogic/NSH/br/java/

bladmin  18891 18883 37 16:58 ?        00:00:09 /opt/bmc/bladelogic/NSH/br/java/

Checking server2, count of appserver processes: should be 1 or more if running...

       1

Checking server2, count of launcher processes: should be 1 or more if running...

       1

All java processes (more than one is fine):

bladmin  18051 18041  7 16:57 ?        00:00:06 /opt/bmc/bladelogic/NSH/br/java/

bladmin  18103 18094 10 16:57 ?        00:00:09 /opt/bmc/bladelogic/NSH/br/java/

bladmin  18138 18051 79 16:57 ?        00:01:05 /opt/bmc/bladelogic/NSH/br/java/

bladmin  18774 18768 30 16:58 ?        00:00:08 /opt/bmc/bladelogic/NSH/br/java/

bladmin  18891 18883 37 16:58 ?        00:00:09 /opt/bmc/bladelogic/NSH/br/java/

Checking server3, count of appserver processes: should be 1 or more if running...

       1

Checking server3, count of launcher processes: should be 1 or more if running...

       1

All java processes (more than one is fine):

bladmin  18051 18041  7 16:57 ?        00:00:06 /opt/bmc/bladelogic/NSH/br/java/

bladmin  18103 18094 10 16:57 ?        00:00:09 /opt/bmc/bladelogic/NSH/br/java/

bladmin  18138 18051 79 16:57 ?        00:01:05 /opt/bmc/bladelogic/NSH/br/java/

bladmin  18774 18768 30 16:58 ?        00:00:08 /opt/bmc/bladelogic/NSH/br/java/

bladmin  18891 18883 37 16:58 ?        00:00:09 /opt/bmc/bladelogic/NSH/br/java/

server1#
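The ordering visible in the restart output above (non-leaders stopped first, leader stopped last, leader started first with a head start) can be sketched as a simple plan builder. This is an illustration of the ordering only, not the logic of app-control.nsh itself. The function name and step tuples are hypothetical, and the 60 second default is taken from the sample output.

```python
def restart_plan(servers, leader, head_start_sec=60):
    """Return ordered (action, value) steps for an orderly restart."""
    others = [s for s in servers if s != leader]
    steps = []
    # Stop the non-leader servers first, in reverse order, then the leader.
    for server in reversed(others):
        steps.append(("stop", server))
    steps.append(("stop", leader))
    # Start the leader first so it wins the election, then wait before the rest.
    steps.append(("start", leader))
    if others:
        steps.append(("wait", head_start_sec))
    for server in others:
        steps.append(("start", server))
    return steps


for step in restart_plan(["server1", "server2", "server3"], leader="server1"):
    print(step)
```

Giving the intended leader a head start is what makes the same appserver end up as leader after every restart.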


We are glad to announce the release of a new patch for TrueSight Server Automation, 20.02.01. This release brings enhancements to the core features, improves performance, and stabilizes functionality.

 

  • Job creation and execution governance is introduced through job policies, which can be managed globally against a set of roles in TSSA.

 

  • RSCD Agent Enhancement: On Linux systems with systemd support, the systemctl commands are now the default commands to start or stop the RSCD and Smart Agents.

 

  • Enhancements to the Smart Hub for managing public cloud servers: a new Smart Hub gateway simplifies some of the complexities around connectivity.

 

  • Patch Enhancement: You can now download and apply patches on CentOS systems using the out-of-the-box patching solution.

 

  • Compliance templates are available for CIS RHEL-8, CentOS-8 and OEL-8. There are also updates to the DISA templates for Windows 2016, Windows 2012 DC and RHEL-7, and CIS updates for Windows 2019.

 

  • The Compliance module now supports Security Content Automation Protocol (SCAP) 1.3.

 

  • A new operations dashboard shows the live status of TSSA environment usage at a given time.

 

  • A simplified unified product installer that has been decoupled from installing and upgrading Zipkits, the blcontent script, the script to create container compliance jobs, and the Quick start page.

 

  • Database enhancement: Added support for scheduling an automatic database cleanup.

 

  • New operating system and database support: CentOS-8, Oracle Enterprise Linux 8, Debian 9 and 10, SQL Server 2019.

 

 

For more details, please refer to the online documentation - 20.02.01: Patch 1 for TrueSight Server Automation 20.02 - Documentation for TrueSight Server Automation 20.02 - BMC Docu…


A problem statement customers have brought to us: a user in a role has broad access to the environment, a high level of privilege, and perhaps not the ideal level of training to be “safe” while operating with impunity in the environment.  As an automation solution owner, I want to help our users be safer and enable them to follow good practice within the automation solution.  How do I start, and how can TSSA help me?

 

This topic is sometimes called "separation of duties", and also uses the principle of least privilege: what is the least amount of access I need to successfully do my job?

 

Two common approaches here:

 

  1. Approved Content Promotion / Separation of Duties: Validating jobs in a non-prod environment before deploying to production: A number of customers build out content promotion workflows using RBAC in TSSA to support this process.  It’s straightforward to explain ways to solve this using TSSA, and it is common to use RBAC to enable a content promotion / validation workflow.
    1. While objects under the covers in TSSA are versioned, the previous versions of objects aren’t directly exposed to the user.  What customers are doing is baking a version number into those job names, and putting them into hardened repositories where many users can consume, but only specific, approved users can edit or promote content into those repositories / folders.  So, for the average user, or the mid-level engineer: “You can do anything you want, as long as it’s an approved automation”.
    2. Implementing separation of duties basically consists of having a “dev” type role that can create content, and typically has a small sandbox of servers for testing automations: packages and scripts, a promotion role that does some testing and then sends approved automations onward to a production role that doesn’t do content creation, but executes pre-approved automations or changes.
  2. Some customers have used an externalized workflow that would explicitly export jobs from TSSA, check them into source control, and re-import them under a different role.  This is more complicated, and requires custom scripting that the customer would own, and has some limitations, but there are customers that do this, including across a sort of “air gap” between a non-prod and prod environment.  The source control component isn't strictly necessary here, but this is one way to integrate source control not only for content, but also for the job definitions themselves.

 

I’m always happy to have a conversation about how other customers have solved these problems and how to address yours.  Feel free to reach out to me or Bill Robinson.


With the TrueSight Server Automation (TSSA) 20.02 release two new features were added, the Smart Agent and Smart Hub.  Here is a brief introduction to the features and highlights of some items around installation and configuration.

 

The Smart Agent is an add-on feature for the RSCD Agent which allows communication back to the TSSA Application Server, via the Smart Hub.  The Smart Hub is a new component which will relay information from Smart Agents to the Application Servers.  The Smart Agent will allow for an RSCD Agent to auto-enroll itself and proactively provide agent state information to the application server.

 

Overviews of these components can be found here:

https://docs.bmc.com/docs/tssa2002/overview-of-smart-agents-919901683.html

https://docs.bmc.com/docs/tssa2002/overview-of-the-smart-hub-941857309.html

 

 

Smart Agent and Smart Hub supported platforms

With the TSSA 20.02 release the Smart Agent & Hub are supported on Windows and RedHat Linux.

 

Smart Agent

Operating System                     Version
Windows                              2012, 2016 & 2019
RedHat Linux (RPM installer only)    6, 7 & 8

 

Smart Hub

Operating System    Version
Windows             2012, 2016 & 2019
RedHat Linux        7 & 8

 

Additional platform support will be added with later releases.  Currently supported platform information is also detailed here:

https://docs.bmc.com/docs/tssa2002/supported-platforms-910749868.html

 

 

Install and upgrade of RSCD and Smart Agent

With 20.02 the Smart Agent will be included and enabled for new RSCD Agent installs.

RSCD Agent installs via the Agent Installer Job will also add and enable the Smart Agent.

Installation of the RSCD Agent via a Provisioning Job, however, will not include the Smart Agent component, because this currently installs the agent via the RSCD Agent shell script.  In this case, install the RSCD Agent as a post-provisioning job instead to get the Smart Agent.

 

When upgrading RSCD Agents from a prior version to 20.02 the Smart Agent components will be added as part of the upgrade, but by default the Smart Agent services will be disabled.

 

 

Smart component silent install considerations

Silent installation of the Smart Agent is supported for both Windows and RedHat.  There are new silent install parameters available to support these actions.  The updated silent install procedure for Windows targets is covered here:

https://docs.bmc.com/docs/tssa2002/using-silent-mode-to-install-an-rscd-agent-windows-910750322.html

 

For RedHat the Smart Agent silent install and associated nsh-install-defaults file is outlined here:

https://docs.bmc.com/docs/tssa2002/using-rpm-to-install-nsh-or-the-rscd-agent-910750500.html

 

The Smart Hub can also be installed in silent mode.  Steps for this procedure are here:

https://docs.bmc.com/docs/tssa2002/installing-the-smart-hub-silently-919901905.html

 

 

How to disable Smart Agent services

 

With the Smart Agent component added with the installation, it may be desired to not use the Smart Agent features.  The Smart Agent service can be disabled.  During install the service will be disabled if the option SMARTAGENT_SERVICE is set to 0 in nsh-install-defaults for Linux installations or the Optional MSI Customization Properties for Windows installations.

 

  • For new installations, the default value is set to 1, which indicates that the Smart Agent service is enabled.
  • For upgrades, the default value is set to 0, which indicates that the Smart Agent service is disabled.  The Smart Agent Service is not started automatically after the upgrade.

 

After installation, on Windows the service can be stopped via the Windows Services panel (services.msc).  The service name is “TrueSight Server Automation Smart Agent”.

 

On RedHat stop the service via the command “smartagent stop”.

 

For more detail about the Smart Agent configuration and service see:

https://docs.bmc.com/docs/tssa2002/managing-the-smart-agent-919902012.html

 

 

Troubleshooting logs and files

There are new logs and configuration files for the Smart Agent & Smart Hub, which will be helpful to diagnose and troubleshoot issues.

 

Smart Agent:

File           Windows                           Linux
Log file       <rscd_home>\smartagent.log        <rscd_home>/log/smartagent.log
Config file    C:\Windows\rsc\smartagent.conf    /etc/rsc/smartagent.conf

 

Smart Hub:

File                  Location
Smart Hub log file    <smarthub_dir>\smarthub\logs\smarthub.log
Redis log file        <smarthub_dir>\smarthub\redis\redis.log
Config file           <smarthub_dir>\smarthub\config\config.json

 

Application server:

File                                Windows                                          Linux
Application server smart hub log    <appserver_dir>\NSH\br\appserver-smarthub.log    <appserver_dir>/NSH/br/appserver-smarthub.log


TrueSight Server Automation (TSSA) allows servers to be "enrolled" more than once. The only requirement is that the server name be unique. Servers enrolled under more than one name can cause issues when running jobs, especially when the same server is included multiple times in the same job run due to multiple enrollments in TSSA.

 

Common names under which servers are enrolled include:

  • FQDN - Typically this is the name most sites use normally
  • Short Host Name - With no domain name
  • IP Address
  • DNS Alias
  • Secondary End Point Names - DB network name, Storage network name, etc

 

If you search here on BMC Communities, you will find several postings on this issue with no real solution. The problem is that there is no single accurate server "ID" by which to correlate enrolled servers. We had some success here at our site comparing ??TARGET.NAME?? to ??TARGET.FQ_HOST??. However, not all of our servers use the same DNS name as the one in FQ_HOST, and I suspect that is true for most customer sites as well, so a different method was needed.

 

The solution we arrived at was to have each server create an empty file called "TSSA-??TARGET.NAME??" in an existing location on each server endpoint: /tmp on Linux/Unix and C:\temp on Windows. We chose ??TARGET.NAME?? because it is a unique name in TSSA. We can then create a normal TSSA compliance policy to verify that the count of these TSSA-* files is exactly 1 on each server endpoint. If the count is greater than 1, we have detected a server that is enrolled in TSSA more than once, and we can easily identify the enrollment names by looking at the TSSA-* files on the server.
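The counting logic behind that check can be sketched outside of TSSA. This is only an illustration of the idea, not the compliance rule itself: the function names are made up for this example, and it assumes the marker files follow the TSSA-<name> pattern described above.

```python
import glob
import os


def enrollment_names(directory, prefix="TSSA-"):
    """Return the enrollment names recorded by the marker files in `directory`."""
    pattern = os.path.join(directory, prefix + "*")
    # Strip the prefix so only the enrolled name (IP, short name, FQDN...) remains.
    return sorted(os.path.basename(path)[len(prefix):] for path in glob.glob(pattern))


def is_compliant(directory):
    """Compliant means exactly one marker file, i.e. exactly one enrollment."""
    return len(enrollment_names(directory)) == 1
```

Calling enrollment_names("/tmp") on a Linux endpoint would list every name that endpoint has been enrolled under, which is the same information the foreach rule surfaces in the job results.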

 

Here's how it looks when you review a compliance job run:

SS-TSSA-Enrollment-Results.png

 

It appears we have 6 servers in the group; however, 1 server has been enrolled 3 times: once by IP address, once by short host name, and once using FQDN. Our compliance run has detected them as non-compliant, and selecting the "count File:/tmp/TSSA-** =1 where" line in red for one of the non-compliant servers shows the "Left Value" count to be 3, confirming that this server has been enrolled 3 times in TSSA.

 

The compliance rule to do this is really simple:

SS-TSSA-Enrollment-Rule.png

 

For those familiar with TSSA compliance rules, you may notice there is both a check for count = 1 and a foreach loop to detect the compliant status. Using just the count would be sufficient to determine compliance. The foreach rule was added so you can "see" the names of the TSSA-* files in the job run results view for a server that is non-compliant. That way you don't have to go look for the TSSA-* files; they are listed in the compliance run results view for any non-compliant components.

 

The Component Template we used as a proof of concept is available as a ZipKit here: Blade ZipKit - TSSA Enrollment

 

This method of detecting "Multiple Server Enrollment" has proven to be very effective in helping us here at Customer Zero clean up our multiple enrolled servers in TSSA.


Please note the following Support Flash regarding the upcoming (May 25, 2020) expiration of the TSSA Live Reporting license file shipped with all versions of TSSA up to and including TSSA 20.02.

 

https://docs.bmc.com/docs/tssa2002/notification-of-a-critical-action-required-by-users-of-truesight-server-automation-live-reporting-910749158.html

 

If the license file is not applied, the following error will be encountered when attempting to launch TSSA Live Reporting after May 25, 2020:

 

The software license has been breached.

Please go to the License Management page for more information.

 

Please see Knowledge Article 000170998 for full steps on how to obtain the updated license file from BMC Customer Support.


TrueSight Server Automation (TSSA) Windows Patch Analysis results, and the success of subsequent Patch Remediations, can be affected by the target server's "Pending Reboot" status.

 

If the server is in the Pending Reboot state this can mean a patch is only partially installed when the Patch Analysis job is run in TSSA.  This can result in misleading Patch Analysis results and also cause Patch Remediation/Deployment jobs to fail.

 

If a server was in the Pending Reboot state when the Patch Analysis job was run, a warning will be reported in the job run results.

Expanding the job run results under the Server View, Failed and Successful targets are displayed as follows:

 

 

To see which Server(s) generated the warning, right-click on the run to display the job run log.  Here each target is displayed with its run status. The Server(s) which completed with a warning display the yellow triangle icon.  In the run messages for the target the following warning message will be present:

 

"Reboot is pending on this machine, analysis results may be in-correct."

 

See KA 000145107 for additional details on this scenario.

 

In order to avoid this scenario and the Patch Analysis job warnings the target servers can be rebooted prior to executing the analysis job.  A TSSA Compliance Job could be executed against the targets to verify none are in the Pending Reboot state.  The Windows Patch Readiness ZipKit Component Template is available for TSSA versions 8.9 and above in the BMC Communities here.  The Template contains rules to check several items, including the Pending Reboot status.


 

 

A server in the Pending Reboot state can affect the patch remediation process as well.  A patch may have been applied to a target server earlier, but that specific patch may require a reboot to be performed before it is considered fully installed. With the server in the Pending Reboot state, the patch may only be partially installed.  Attempting to apply the same patch via TSSA, because it was earlier flagged as Missing during Patch Analysis, could result in the deployment failing with exit code 2359302.  Additional detail of this scenario is covered in KA 000086137.

 

 

There are currently two enhancements under consideration to help avoid this scenario as well.

The first enhancement (DRLPJ-131) is to include a new node within the Patch Analysis Job result’s Server View to make it easier to notice servers which are in the Pending Reboot state.  A third node, along with Failed and Successful, would include Servers which were successful and pending a reboot.

The second enhancement (DRLPJ-142) is to add an option to Patch Analysis Jobs to reboot a server prior to analysis, if found to be in the pending reboot state.

These enhancements are being considered for a future release of TSSA.


“Sean, what the [heck] is this thing and how does it relate to these other, similar-sounding products”

 

For the more in-depth article, check out John's extensive and well-researched post here: Helix Support: Planning Your Deployment of TrueSight Smart Reporting for Server Automation (TSSR-SA) 19.2.02

 

Anthony’s BDSSA EOL announcement

 

 

What is TrueSight Smart Reporting (TSSR)?

 

  • TrueSight Smart Reporting for Server Automation (TSSR-SA solution) is the replacement for BMC Decision Support for Server Automation (BDSSA).  It’s a solution made up of two parts: the Data Warehouse component, which gets data from the core application database to a data warehouse database, and the Smart Reporting Platform, which includes the reporting engine, and the out of the box reports content. 
  • The Smart Reporting Platform is shared across solutions, including Helix ITSM, TSOM, TSCO, and TSNA, and will be able to be used and reused across multiple data sources (server, network, capacity).

 

Which product components do I need to install TSSR?

 

  • The latest version of the TSSR solution is 20.02 (As of April 2020).  It’s made up of two components:
    • TSSA-DW (Data Warehouse) 8.9.04.04 and
    • TSSR-SA (Reporting Platform) 19.2.02.
  • It is compatible with all 8.9 versions of Server Automation, including BMC Server Automation 8.9.0, 8.9.1, and TrueSight Server Automation 8.9.2, 8.9.3, and service packs.

 

Can I use TSSR without removing BDSSA?  I want to migrate my reports slowly, while continuing to use BDSSA.

 

 

How do I get a TSSR license?

 

  • If you already have/had a license for BDSSA, you should contact your Sales Representative who can help you get a permanent license for TSSR, and a trial license if necessary.  If you’ve never had a license for BDSSA, you may need to buy a license: contact your Sales Representative.

 

What’s Yellowfin?

 

  • Yellowfin is a reporting product that BMC is using for both Smart Reporting and Live Reporting.  They are installed separately, using different instructions.  Start here (TBD) for Live Reporting, and here (TBD) for Smart Reporting.

 

What’s the difference between Smart Reporting and Live Reporting?

 

  • Both use the Yellowfin engine, but Smart Reporting runs reports against the data warehouse database, while Live Reporting reports run against the core application database.
  • There are many out of the box reports in Smart Reporting, covering the main use cases of Server Automation, including Compliance, Patching, Job Utilization, RBAC, etc.   Smart Reporting reports can be customized, and “broadcast” via email or FTP.  You can also build custom reports in Smart Reporting. 
  • Live Reporting has near real-time reports for key use cases, including: Patching, Compliance.
  • To prevent performance impacts to the core application, Live Reporting doesn’t support customization of reports.  This doesn't mean you -can't- customize reports, but you should expect them to get wiped out on upgrade.  This is by design.

 

Feature | Live Reporting | Smart Reporting
Use Cases | Patching, Compliance | Patching, Compliance, Job Utilization, RBAC, others
Real Time Reports? | Yes | No: typically as soon as ETL completes
Customizable? | Not supported | Yes
Database Used | Core Application DB ("bladelogic") | Data Warehouse (TSSR_DW or BSARA_DW etc.)
Impacts Core App Performance? | Yes | No, run as many as you want!

I thought I would show my work for how I arrived at the steps in "Monitoring the TrueSight Server Automation Application Server with JMX".  This article will cover setting up a simple JMX monitor (jmxtrans) and monitoring of non-TSSA attributes and then configuring the application server to expose TSSA-specific attributes.  While I'm presenting this in a step by step fashion to show the process I went through, you can always read the other article to see the final configurations for the complete setup.  As always, the standard disclaimer: the below procedure is not supported, your mileage may vary, etc etc.

 

What you will need

Jmxtrans and a system to install it on, ideally separate from the TSSA application server

TSSA application server

JDK that matches the version included in the TSSA version you are using (to use the standalone jconsole for testing).

 

Setting up a JMX Monitor to monitor non-TSSA attributes

Jmxtrans is an open-source tool to pull data from a JMX source and send the data to a logging, graphing, or monitoring engine.  Setting up an entire monitoring, logging, and/or graphing infrastructure is beyond the scope of the article and I'm going to stick with some simple configurations to prove out the functionality.  Since we are using standard JMX configurations, you should be able to adapt these examples to your particular tool.

 

Initial configuration of the TSSA application server for JMX

 

To enable the standard JMX interface I configure the TSSA application server's JVM by starting it with these parameters: -Dcom.sun.management.jmxremote.port=<some port> -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false

 

This is accomplished by running blasadmin against the appserver instance I want to set up monitoring for.  If you have multiple instances on each system, you will need to allocate a unique port for each instance.

# blasadmin -s default set app jvmargs "-Dcom.sun.management.jmxremote.port=9099 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
# blasadmin -s default show app jvmargs
JVMArgs:-Dcom.sun.management.jmxremote.port=9099 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false

 

I need to restart the application server service to pick up the change, and then I can confirm my instance is now listening on 9099:

# netstat -anp | grep java

tcp        0      0 192.168.8.70:23034      192.168.8.70:4750       ESTABLISHED 20633/java        

tcp6       0      0 :::9700                 :::*                    LISTEN      20591/java        

tcp6       0      0 :::9701                 :::*                    LISTEN      20591/java        

tcp6       0      0 :::9702                 :::*                    LISTEN      20591/java        

tcp6       0      0 :::9099                 :::*                    LISTEN      20633/java        

tcp6       0      0 :::9840                 :::*                    LISTEN      20633/java        

tcp6       0      0 :::9841                 :::*                    LISTEN      20633/java        

tcp6       0      0 :::9842                 :::*                    LISTEN      20633/java        

tcp6       0      0 :::9843                 :::*                    LISTEN      20633/java        

PID 20633 is my appserver instance (listening on 984*) and I see the same process is bound to the jmxremote.port (9099) I provided.
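The same check can be made from the monitoring host, which also confirms that no firewall sits between it and the JMX port. This is a minimal sketch using only the Python standard library; the function name is made up for this example, and a successful TCP connect only shows that something is listening on the port, not that it is speaking JMX.

```python
import socket


def port_is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

For the setup above the call would be port_is_listening("blapp2002.example.com", 9099).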

 

Connect with jconsole to confirm what attributes you can see with this configuration.  I'm using the jconsole from the JDK I downloaded previously.  You will get a warning about an insecure connection which you can ignore.

 

I can see various attributes now on the MBeans tab:

As expected, I cannot see any of the TSSA-specific attributes:

You may have noticed that you are able to execute some operations under various mbean nodes.  We will talk about securing those later.

 

Configuring jmxtrans for logging

After provisioning a CentOS 7 system and installing the RPM for jmxtrans, I set wrapper.java.memory=512m in /etc/jmxtrans/wrapper.conf to ensure I have enough memory.

 

On the jmxtrans host I create a /var/lib/jmxtrans/blapp2002.json file with the following contents to pull some of the attributes I do have access to so I can confirm my monitoring is working.   While jmxtrans has several types of output plugins,  I will use the KeyOutWriter which simply writes values to a text file on disk.

{
  "servers" : [ {
    "port" : "9099",
    "host" : "blapp2002.example.com",
    "queries" : [ {
      "obj" : "java.lang:type=OperatingSystem",
      "attr" : ["OpenFileDescriptorCount","ProcessCpuLoad","SystemCpuLoad","SystemLoadAverage" ],
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.KeyOutWriter",
        "outputFile" : "/tmp/blapp2002.txt",
        "maxLogFileSize" : "10MB",
        "maxLogBackupFiles" : 200,
        "delimiter" : ",",
        "debug" : true,
        "typeNames" : ["name"]
      } ]
  }
 ]
 } ]
}

 

After starting the jmxtrans daemon and waiting for a couple minutes, I see some entries in the /tmp/blapp2002.txt file:

blapp2002_example_com_9099.sun_management_OperatingSystemImpl.OpenFileDescriptorCount,340,1585143272

blapp2002_example_com_9099.sun_management_OperatingSystemImpl.ProcessCpuLoad,0.045454545454545456,1585143272

blapp2002_example_com_9099.sun_management_OperatingSystemImpl.SystemCpuLoad,0.06060606060606061,1585143272

blapp2002_example_com_9099.sun_management_OperatingSystemImpl.SystemLoadAverage,0.46,1585143272

 

That shows the hostname, object name, attribute, value, and time in epoch.  The output of your JMX tool may differ.  Now that I have jmxtrans connected to my TSSA appserver pulling the exposed attributes and values, I can move on to the next step and expose the TSSA-specific attributes and monitor them.
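If you want to post-process that file rather than just eyeball it, each line splits cleanly on the delimiter. The sketch below assumes the comma delimiter configured above and an epoch expressed in seconds, as in the sample output; the Sample type and parse_line function are names invented for this example.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Sample(NamedTuple):
    key: str             # full key: host_port.object.attribute
    attribute: str       # last dotted component of the key
    value: float
    timestamp: datetime  # epoch converted to an aware UTC datetime


def parse_line(line, delimiter=","):
    """Parse one 'key,value,epoch' line as written by the KeyOutWriter setup above."""
    # Split from the right so only the last two fields are treated as value and epoch.
    key, value, epoch = line.strip().rsplit(delimiter, 2)
    return Sample(
        key=key,
        attribute=key.rsplit(".", 1)[-1],
        value=float(value),
        timestamp=datetime.fromtimestamp(int(epoch), tz=timezone.utc),
    )
```

float() accepts the exponent notation jmxtrans emits for very small values, so lines such as the ProcessCpuLoad samples parse without special handling.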

 

 

Exposing TSSA attributes via JMX

 

Configuring the TSSA application server

 

I want to expose the TSSA-specific attributes, in a read-only way.  After reviewing Oracle's JMX documentation about JMX configuration it looks like I can accomplish this with the jmxremote.access.file and jmxremote.password.file directives.  I create a NSH/br/jmxremote.access file that contains the line "jmxtrans readonly" and a NSH/br/jmxremote.password file with the line "jmxtrans bladelogic".  On Linux, these must be owned by bladmin:bladmin, and jmxremote.password must be readable and writable only by its owner:

-rwxr-xr-x. 1 bladmin bladmin  134 Feb 14 17:04 jmxremote.access

-rw-------. 1 bladmin bladmin   31 Feb 14 17:04 jmxremote.password
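A quick way to verify the password file mode before restarting is to look for any group or other permission bits. This is a minimal sketch for Linux; the function name is made up for this example, and it checks only the mode bits, not the bladmin ownership.

```python
import os
import stat


def password_file_mode_ok(path):
    """Return True if `path` grants no permissions to group or other."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Any group/other bit set means the file is too permissive for a password file.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A file with mode 600, like the jmxremote.password listing above, passes; a file with mode 644 does not.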

 

I will also need to disable the session support in the TSSA application server for the JMX interface.  Expose this setting as configurable by editing the NSH/br/deployments/<deploymentName>/BlAdmin.xml and removing the internal="yes" from this line:

<option name="DisableSessionSupport" description="Disable Session Support [true,false]" internal="yes" type="boolean"/>

becomes:

<option name="DisableSessionSupport" description="Disable Session Support [true,false]" type="boolean"/>
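If you have several deployments to change, the same edit can be scripted. The sketch below is an assumption-heavy illustration: it works on XML text passed in as a string, the function name is invented, and I have not verified how BlAdmin.xml is laid out beyond the option line shown above. Note that ElementTree drops comments and the XML declaration when it re-serializes, so a hand edit on a backed-up file may still be the safer route.

```python
import xml.etree.ElementTree as ET


def expose_option(xml_text, option_name="DisableSessionSupport"):
    """Remove the internal attribute from the named <option> element only."""
    root = ET.fromstring(xml_text)
    for option in root.iter("option"):
        if option.get("name") == option_name:
            # pop() with a default is a no-op if the attribute is already gone.
            option.attrib.pop("internal", None)
    return ET.tostring(root, encoding="unicode")
```

Other options that carry internal="yes" are left untouched.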

 

I add the new JMX configuration to my blasadmin settings:

#blasadmin -s default set appserver jvmargs "-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=9099 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=true -Djava.rmi.server.hostname=blapp2002.example.com -Dcom.sun.management.jmxremote.access.file=/opt/bmc/bladelogic/NSH/br/jmxremote.access -Dcom.sun.management.jmxremote.password.file=/opt/bmc/bladelogic/NSH/br/jmxremote.password"
#blasadmin -s default show appserver jvmargs
JVMArgs:-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=9099 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=true -Djava.rmi.server.hostname=blapp2002.example.com -Dcom.sun.management.jmxremote.access.file=/opt/bmc/bladelogic/NSH/br/jmxremote.access -Dcom.sun.management.jmxremote.password.file=/opt/bmc/bladelogic/NSH/br/jmxremote.password

 

Disable session support for the JMX interface:

blasadmin -s default set management disablesessionsupport true

 

After restarting the application server service, I connect with the jconsole, using the username and password I specified in the jmxremote.password file:

 

I can now see the previously protected TSSA attributes:

And trying to run any of the various Operations results in a permission error, which is desired as this user will only be used for read access to monitor various attributes.

 

Configuring jmxtrans with additional attributes

 

I update my jmxtrans json file (/var/lib/jmxtrans/blapp2002.json) with the username and password and a few TSSA-specific attributes:

{
  "servers" : [ {
    "port" : "9099",
    "host" : "blapp2002.example.com",
     "username" : "jmxtrans",
     "password" : "bladelogic",
    "queries" : [ {
      "obj" : "java.lang:type=OperatingSystem",
      "attr" : ["OpenFileDescriptorCount","ProcessCpuLoad","SystemCpuLoad","SystemLoadAverage" ],
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.KeyOutWriter",
        "outputFile" : "/tmp/blapp2002.txt",
        "maxLogFileSize" : "10MB",
        "maxLogBackupFiles" : 200,
        "delimiter" : ",",
        "debug" : true,
        "typeNames" : ["name"]
      } ]
  },
        {
      "obj" : "Bladelogic:type=ApplicationServer,Job Manager=Job Manager,name=Job Manager",
      "attr" : ["NumIdleThreads", "NumberOfRunningJobs" ],
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.KeyOutWriter",
        "outputFile" : "/tmp/blapp2002.txt",
        "maxLogFileSize" : "10MB",
        "maxLogBackupFiles" : 200,
        "delimiter" : ",",
        "debug" : true,
        "typeNames" : ["name"]
      } ]
  },
  {
    "obj" : "Bladelogic:type=ApplicationServer,Connections=Connections,name=Client Connection Service",
    "attr" : [ "ActiveConnections", "IdleConnections", "OpenConnections", "NumIdleClientWorkerThreads" ],
    "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.KeyOutWriter",
        "outputFile" : "/tmp/blapp2002.txt",
        "maxLogFileSize" : "10MB",
        "maxLogBackupFiles" : 200,
        "delimiter" : ",",
        "debug" : true,
        "typeNames" : ["name"]
      } ]
  }
 ]
 } ]
}
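The outputWriters section is repeated verbatim for every query, so for longer attribute lists it can be easier to generate the file. This is a minimal sketch that produces the same structure as the JSON above; the function names and the QUERIES list layout are invented for this example, and the writer settings are simply copied from my configuration.

```python
import json


def key_out_writer(output_file):
    """The KeyOutWriter settings used throughout this article."""
    return {
        "@class": "com.googlecode.jmxtrans.model.output.KeyOutWriter",
        "outputFile": output_file,
        "maxLogFileSize": "10MB",
        "maxLogBackupFiles": 200,
        "delimiter": ",",
        "debug": True,
        "typeNames": ["name"],
    }


def build_config(host, port, username, password, queries, output_file):
    """Build a jmxtrans config dict; `queries` is a list of (obj, attrs) pairs."""
    return {
        "servers": [{
            "port": str(port),
            "host": host,
            "username": username,
            "password": password,
            "queries": [
                {"obj": obj, "attr": list(attrs),
                 "outputWriters": [key_out_writer(output_file)]}
                for obj, attrs in queries
            ],
        }]
    }


QUERIES = [
    ("java.lang:type=OperatingSystem",
     ["OpenFileDescriptorCount", "ProcessCpuLoad", "SystemCpuLoad", "SystemLoadAverage"]),
    ("Bladelogic:type=ApplicationServer,Job Manager=Job Manager,name=Job Manager",
     ["NumIdleThreads", "NumberOfRunningJobs"]),
    ("Bladelogic:type=ApplicationServer,Connections=Connections,name=Client Connection Service",
     ["ActiveConnections", "IdleConnections", "OpenConnections", "NumIdleClientWorkerThreads"]),
]

if __name__ == "__main__":
    config = build_config("blapp2002.example.com", 9099, "jmxtrans", "bladelogic",
                          QUERIES, "/tmp/blapp2002.txt")
    print(json.dumps(config, indent=2))
```

Redirecting the output to /var/lib/jmxtrans/blapp2002.json gives a file equivalent to the hand-written one.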

 

After a few minutes, I see the additional attributes and values show up in my /tmp/blapp2002.txt file:

blapp2002_example_com_9099.com_bladelogic_om_infra_app_service_client_ClientConnectionManager.ClientConnectionService.ActiveConnections,0,1585162606

blapp2002_example_com_9099.com_bladelogic_om_infra_app_service_client_ClientConnectionManager.ClientConnectionService.IdleConnections,2,1585162606

blapp2002_example_com_9099.com_bladelogic_om_infra_app_service_client_ClientConnectionManager.ClientConnectionService.OpenConnections,2,1585162606

blapp2002_example_com_9099.com_bladelogic_om_infra_app_service_client_ClientConnectionManager.ClientConnectionService.NumIdleClientWorkerThreads,10,1585162606

blapp2002_example_com_9099.com_bladelogic_om_infra_app_service_job_JobManagerImpl.JobManager.NumIdleThreads,100,1585162606

blapp2002_example_com_9099.com_bladelogic_om_infra_app_service_job_JobManagerImpl.JobManager.NumberOfRunningJobs,0,1585162606

blapp2002_example_com_9099.sun_management_OperatingSystemImpl.OpenFileDescriptorCount,330,1585162606

blapp2002_example_com_9099.sun_management_OperatingSystemImpl.ProcessCpuLoad,9.714087321070273E-5,1585162606

blapp2002_example_com_9099.sun_management_OperatingSystemImpl.SystemCpuLoad,0.04471864054127741,1585162606

blapp2002_example_com_9099.sun_management_OperatingSystemImpl.SystemLoadAverage,0.02,1585162606

 

 

 

Tidying Up

To make the configuration a little easier to read and work with, we can move most of the configuration settings into a jmxremote.config file and pass only the -Dcom.sun.management.config.file setting in blasadmin:

 

blasadmin -s default set app jvmargs "-Dcom.sun.management.config.file=/opt/bmc/bladelogic/NSH/br/jmxremote.config"

 

jmxremote.config:

com.sun.management.jmxremote.port=9099

com.sun.management.jmxremote.ssl=false

com.sun.management.jmxremote.password.file=/opt/bmc/bladelogic/NSH/br/jmxremote.password

com.sun.management.jmxremote.access.file=/opt/bmc/bladelogic/NSH/br/jmxremote.access

com.sun.management.jmxremote.authenticate=true

java.rmi.server.hostname=blapp2002.example.com

 

Summary

The above steps can be used to configure a rudimentary JMX monitor (jmxtrans) with file-based logging, for the purposes of validating that TSSA-specific JMX attributes can be monitored with a standard JMX monitor.  You could use one of the other writers and send the JMX information into a graphing solution like Graphite or StatsD.


The TSSA application server exposes various attributes about itself and its operations via a customized JMX interface which is accessible via the bljconsole and jmxcli. Both of these interfaces require authentication via BLSSO before allowing retrieval of the various TSSA specific attributes and values.  This poses a challenge when trying to use JMX-based monitors as they cannot perform BLSSO authentication and therefore cannot retrieve the TSSA-specific attributes.  Fortunately we can configure the TSSA application server to provide JMX access to these attributes without needing BLSSO authentication.  As always, the standard disclaimer: the below procedure is not supported, your mileage may vary, etc etc.

 

Configuration changes to monitor TSSA-specific attributes via a standard JMX interface

Edit the NSH/br/deployments/<deploymentName>/BlAdmin.xml and remove the internal="yes" from this line:

<option name="DisableSessionSupport" description="Disable Session Support [true,false]" internal="yes" type="boolean"/>  

becomes:

<option name="DisableSessionSupport" description="Disable Session Support [true,false]" type="boolean"/>  

 

Create a NSH/br/jmxremote.password file with the following contents (use whatever username and password you like):

jmxuser password

Create a NSH/br/jmxremote.access file with the following contents (you must use the same username as in the password file):

jmxuser readonly

Create a NSH/br/jmxremote.config file with the following contents:

com.sun.management.jmxremote.port=<port>

com.sun.management.jmxremote.ssl=false

com.sun.management.jmxremote.password.file=/opt/bmc/bladelogic/NSH/br/jmxremote.password

com.sun.management.jmxremote.access.file=/opt/bmc/bladelogic/NSH/br/jmxremote.access

com.sun.management.jmxremote.authenticate=true

java.rmi.server.hostname=<appserver hostname>

Replace <port> with a distinct port for this appserver instance, and <appserver hostname> with the appserver's FQDN.

 

Ensure the file permissions on these files match the below:

-rw-r--r--. bladmin bladmin  jmxremote.access

-rw-r--r--. bladmin bladmin  jmxremote.config

-rw-------. bladmin bladmin  jmxremote.password

 

Run the below blasadmin commands:

blasadmin -s <instance name> set app JVMargs "-Dcom.sun.management.config.file=/opt/bmc/bladelogic/NSH/br/jmxremote.config"

blasadmin -s <instance name> set management disablesessionsupport true

If you have multiple instances of the application server on the same host, you should create multiple jmxremote.config files and specify a different port for each instance in each file, and then specify the per-instance file in each instance's JVMArgs value.
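For the multi-instance case, the per-instance files can be rendered from a small table of instance names and ports. This is a sketch only: the function names and the jmxremote-<instance>.config naming scheme are invented for this example, and the install path is the one used elsewhere in this article.

```python
def render_jmx_config(port, hostname, br_dir="/opt/bmc/bladelogic/NSH/br"):
    """Render the jmxremote.config contents for one appserver instance."""
    lines = [
        f"com.sun.management.jmxremote.port={port}",
        "com.sun.management.jmxremote.ssl=false",
        f"com.sun.management.jmxremote.password.file={br_dir}/jmxremote.password",
        f"com.sun.management.jmxremote.access.file={br_dir}/jmxremote.access",
        "com.sun.management.jmxremote.authenticate=true",
        f"java.rmi.server.hostname={hostname}",
    ]
    return "\n".join(lines) + "\n"


def render_all(instances, hostname):
    """Map a hypothetical per-instance file name to its contents.

    `instances` maps instance name -> JMX port; duplicate ports are rejected
    because each instance on the host needs its own port.
    """
    ports = list(instances.values())
    if len(set(ports)) != len(ports):
        raise ValueError("each instance needs a distinct JMX port")
    return {
        f"jmxremote-{name}.config": render_jmx_config(port, hostname)
        for name, port in instances.items()
    }
```

Each instance's JVMArgs value would then point -Dcom.sun.management.config.file at its own generated file.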

 

After making all of the above changes, restart the application server service.

 

Attributes to Monitor with JMX

There are several attributes available for monitoring in the JMX interface and below is a list of attributes that should provide a good view into what is happening in your TSSA application server:

Bladelogic:type=ApplicationServer,name=Application Server

FreeJvmMemory, MaximumJvmMemory, ResidentSetSize, TotalJvmMemory, UsedJvmMemory

 

Bladelogic:type=ApplicationServer,BlExec Service=BlExec Service,name=BlExec Service

NumPollingRequests, NumProcessingRequests, NumWaitingRequests

 

Bladelogic:type=ApplicationServer,Connections=Connections,name=Authentication Service

PendingAuthenticationRequests

 

Bladelogic:type=ApplicationServer,Connections=Connections,name=Client Connection Service

ActiveConnections, IdleConnections, OpenConnections

 

Bladelogic:type=ApplicationServer,Connections=Connections,name=Nsh Proxy Service

ActiveNshProxies, IdleNshProxies, OpenNshProxies

 

Bladelogic:type=ApplicationServer,Connections=Connections,name=SSL Connection Service

ActiveConnections, IdleConnections, OpenConnections

 

Bladelogic:type=ApplicationServer,Database Service=Database Service,Client-Connection-Pool=Client-Connection-Pool,name=Client-Connection-Pool

MinConn, NumAvailableConnections, NumConnections

 

Bladelogic:type=ApplicationServer,Database Service=Database Service,General-Connection-Pool=General-Connection-Pool,name=General-Connection-Pool

MinConn, NumAvailableConnections, NumConnections

 

Bladelogic:type=ApplicationServer,Database Service=Database Service,Job-Connection-Pool=Job-Connection-Pool,name=Job-Connection-Pool

MinConn, NumAvailableConnections, NumConnections

 

Bladelogic:type=ApplicationServer,Job Manager=Job Manager,name=Job Manager

NumberOfRunningJobs, NumIdleThreads

 

Summary

The above steps can be used to configure a TSSA application server for standard JMX monitoring.


One of the core competencies of TrueSight Server Automation (TSSA) is automating the patch update process for servers. TSSA makes this easy for the typical monthly cycle of patching servers in large groups, where the servers in a group are patched in no particular order. Many applications are hosted on multiple servers, often in an HA (High Availability) and/or failover cluster to limit service outages. Automation to patch these types of services needs to update the nodes in a specific order, usually one at a time, to prevent service outages during the patch update process. We call this Service Aware Patching.

 

Microsoft Exchange is one of these services, and I wanted to share how we have implemented Service Aware Patching entirely in TrueSight Server Automation here at Customer Zero (i.e., BMC-IT).

 

Key Concepts

 

When automating any cluster service, we need to be able to idle the current target node by moving services off to the other nodes. Experience has taught us that changing the cluster state requires a careful, methodical approach to be successful.

 

Key "milestones" in the process:

  1. Verify the target cluster node is in the "normal" state BEFORE attempting to change the cluster state
  2. Move services off the target node to idle the node (i.e., move the node into Maintenance Mode)
  3. Verify the services successfully moved to another node and the target node is actually idle
  4. Use normal TSSA Patch Analysis and Deployment to update the patch level of the target node
  5. Verify the node and services are still in the idle state, as patch deployment probably restarted the node
  6. Move the services back onto the target node (i.e., move the node out of Maintenance Mode)
  7. Verify the cluster node and the cluster are in the "normal" state BEFORE moving on to the next node in the sequence

 

Notice we use a careful, logical sequence to move the target node through the required phases; we call this the "chain" of steps. The important point here is that if ANY step in the chain fails, the automation procedure needs to stop and NOT PROCEED to the next step. If something unexpected happens on the cluster, ignoring a failed step and continuing on will most likely cause a service outage.
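The halt-on-first-failure behavior of the chain is easy to picture in code. The sketch below is not how TSSA implements it (we use batch jobs, described later); it is only an illustration of the control flow, with run_chain as an invented name and each step modeled as a command whose exit code signals success or failure.

```python
import subprocess


def run_chain(steps):
    """Run (name, argv) steps in order and stop at the first non-zero exit code.

    Returns (completed_step_names, failed_step_name); failed_step_name is None
    when every step in the chain succeeded.
    """
    completed = []
    for name, argv in steps:
        result = subprocess.run(argv)
        if result.returncode != 0:
            # Do NOT proceed: later steps assume this one left the node in a known state.
            return completed, name
        completed.append(name)
    return completed, None
```

If the "move to maintenance mode" step fails, the patch step never runs, and the name of the failed step tells the admin where to start looking.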

 

TSSA takes care of the actual patch updates; the question is how we implement the steps that handle the transitions of the cluster nodes.

 

Microsoft Exchange automation requires knowledge of how to verify the Exchange service and how to move an Exchange node in and out of "Maintenance Mode". Fortunately, Exchange experts have already created PowerShell scripts that do what we need for Service Aware Patching.

 

We used the powershell scripts available here;

 

The first issue we encountered is common when attempting to automate existing scripts of any type. If a script was designed for interactive use rather than to be called by automation, it typically does not do things like set the exit code on failure. Such scripts often rely on the admin running them to read the errors written to STDOUT. This was true of these PowerShell scripts, so one of our Exchange admins made a copy of the scripts we needed for the all-important "verify" steps in our Service Aware Patching chain and modified them to set a non-zero exit code if the script failed in any of the key operations or checks. Our TSSA deploy jobs automatically detect and report failure if the script exits with a non-zero exit code.

 

Once we had the steps in the chain defined and the PowerShell scripts to perform the steps needed to manipulate the Exchange cluster, all the hard work was done! Now we just needed to create the TSSA jobs required to execute the steps in the chain in the correct order!

 

Implementation in TrueSight Server Automation

 

We used TSSA batch jobs to implement the chain of steps, executed in a prescribed order.

 

Note: Important options to set in the TSSA Batch jobs:
  • "Continue executing batch when individual jobs return non-zero exit code" is NOT set
  • "Execute jobs sequentially" is set

 

The main TSSA batch job runs our Service Aware Patching and controls the node by node sequence in the cluster:

 

TSSA Batch Job: "Exchange 2013 - FULL Sequence Maintenance Mode Operation (PROD)"
  • Exchange 2013 - Maintenance Mode Operation (Node #1)
  • NSH - SleepDelay (5 minutes)
  • Exchange 2013 - Maintenance Mode Operation (Node #2)
  • NSH - SleepDelay (5 minutes)
  • Exchange 2013 - Maintenance Mode Operation (Node #3)
  • NSH - SleepDelay (5 minutes)
  • Exchange 2013 - Maintenance Mode Operation (Node #4)
  • NSH - SleepDelay (5 minutes)
  • Exchange 2013 - Maintenance Mode Operation (Node #5)
  • NSH - SleepDelay (5 minutes)
  • Exchange 2013 - Maintenance Mode Operation (Node #6)

 

This main TSSA batch job is the controller job that calls the child job for each node in the prescribed order, and any failure will cause the batch job to stop and not continue to the next node.

 

Note: In the case of any failures in the chain of steps
Manual intervention by the Exchange admins would be required to determine how to resolve without service impact.

 

Each child job executes our chain of steps against one specific node in the cluster:

 

TSSA Batch Job: "Exchange 2013 - Maintenance Mode Operation (Node #?)"
  • Deploy - Verify Exchange READY for Maintenance Mode
  • Deploy - Move Exchange to Maintenance Mode
  • Deploy - Verify Exchange Maintenance Mode
  • PAJ - Exchange 2013 Servers
  • Build Provisioning (Windows) - CHECKPOINT Wait for Server
  • Deploy - Move Exchange to Normal Mode
  • Deploy - Verify Exchange Normal Mode

 

Note: There is one child job per node.

 

Here's a screenshot showing a successful run of one of the child jobs:

 

SAP-Exchange-MMO-1.png

 

Overview of the TSSA low-level jobs and their associated PowerShell scripts

 

Deploy - Verify Exchange READY for Maintenance Mode
  • verify-server-is-NORMAL_MODE.ps1
Description: Script we created to check mailbox database (Get-MailboxDatabase) status is OK

 

Deploy - Move Exchange to Maintenance Mode
  • Start-ExchangeServerMaintenanceMode.ps1
Description: Script from VanHybrid site to move Exchange into "Maintenance Mode"

 

Deploy - Verify Exchange Maintenance Mode
  • verify-server-is-IN-maintenance-mode.ps1
Description: Script we created to check various Exchange components are in the Inactive state and the cluster node state is "Paused"

 

Deploy - Move Exchange to Normal Mode
  • Stop-ExchangeServerMaintenanceMode.ps1
Description: Script from VanHybrid site to move Exchange out of "Maintenance Mode"

 

Deploy - Verify Exchange Normal Mode
  • verify-server-NOT-in-maintenance.ps1
Description: Script we created to check various Exchange components are in Active state and the cluster node state is "Up"

 

Summary

 

Our Service Aware Patching use case for the Exchange service uses a single TSSA Batch job to run the entire sequence end to end: it moves one node at a time into maintenance mode, updates the patch level using normal TSSA patching jobs, and then returns the node to service. All nodes are sequenced through in turn, and the end result is an entire Exchange cluster patched with no service outage, all using automation.


TrueSight Smart Reporting for Server Automation (TSSR-SA) 19.2 Patch 2 released on February 27th, 2020.

 

The official product version is TSSR-SA 19.2.02. As of July 9 2020, 19.2 Patch 2 is still the latest version and this blog posting will be updated after the next release in August 2020.

 

TSSR-SA is the Smart Reporting solution for TrueSight Server Automation (TSSA) environments and is the replacement product for BDSSA, which reached its End-Of-Life in August 2019.

 

The TSSR-SA solution consists of two main products which currently use different versioning schemes. Although documented, this can sometimes result in confusion around which versions of the product components are compatible and should be downloaded and installed together for a valid TSSR-SA solution installation.

 

The goals of this month’s blog are to clarify details around:

 

  1. Which products make up the TSSR-SA 19.2.02 solution.
  2. Which versions of the constituent products are compatible.
  3. Which versions of TSSA are compatible with TSSR-SA 19.2.02
  4. Which files should be downloaded from the BMC EPD site in order to install or upgrade to TSSR-SA 19.2.02
  5. When to follow the documented install path vs the upgrade path

 

All of this information can be found in the official product documentation and documentation links will be provided when referenced.

 

Since some of this information exists in the TSSR-SA documentation space and some exists in the TSSR Platform documentation space, this blog aims to help avoid confusion by highlighting the most important information in one single place.

 

As a first step, it's a good idea to read Sean Berry's TrueSight Smart Reporting (TSSR, BDSSA) FAQ blog posting, which is a higher-level introduction to TrueSight Smart Reporting for Server Automation.

 

Thanks,


John O’Toole

Principal Technical Supp Analyst

BMC Software Customer Support

 

 

 

1. What products make up the TSSR-SA Smart Reporting Solution?

 

 

TSSR-SA consists of two main constituent products:

  • TrueSight Server Automation - Data Warehouse (TSSA-DW)
  • TrueSight Smart Reporting – Platform (TSSR Platform)

 

 

The product version information for TSSR-SA 19.2.02, as listed in the Release Notes, is as follows:

 

 

TrueSight Smart Reporting for Server Automation 19.2.02 (Solution)

Component                                      Version
TrueSight Server Automation - Data Warehouse   8.9.04.004
TrueSight Smart Reporting - Platform           20.02

 

 

For completeness, this can be compared with the product version information for TSSR-SA 19.2.01 (Patch 1), which was released in December 2019:

 

 

TrueSight Smart Reporting for Server Automation 19.2.01 (Solution)

Component                                      Version
TrueSight Server Automation - Data Warehouse   8.9.04.003
TrueSight Smart Reporting - Platform           19.3

 

And with TSSR-SA 19.2, which was released in July 2019:

 

TrueSight Smart Reporting for Server Automation 19.2 (Solution)

Component                                      Version
TrueSight Server Automation - Data Warehouse   8.9.04.002
TrueSight Smart Reporting - Platform           19.2

 

 

 

2. Which versions of TSSA-DW are compatible with which versions of TSSR Platform?

 

 

This information can be found in the TSSR-SA 19.2.02 Release Notes:

 

 

 

 

As the compatibility matrix in the Release Notes shows, in a valid TSSR-SA environment there is a tight relationship between the versions of TSSA-DW and TSSR Platform.

 

When downloading the TSSA-DW and TSSR Platform installers, it is important that compatible versions are downloaded.
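The pairing can be captured as a small lookup table. This Python sketch uses only the three version pairs listed in section 1. It assumes that only versions that shipped together in a TSSR-SA release are a valid pair, which is how the "tight relationship" is read here. The `COMPATIBLE_PAIRS` name and the `is_compatible_pair` helper are illustrative and not part of any BMC tooling, so treat the Release Notes as the authority.

```python
# TSSA-DW version -> TSSR Platform version, as listed for each TSSR-SA release.
COMPATIBLE_PAIRS = {
    "8.9.04.004": "20.02",  # TSSR-SA 19.2.02
    "8.9.04.003": "19.3",   # TSSR-SA 19.2.01
    "8.9.04.002": "19.2",   # TSSR-SA 19.2
}


def is_compatible_pair(tssa_dw: str, tssr_platform: str) -> bool:
    """Return True only if the two versions shipped together in a TSSR-SA release."""
    # Unknown TSSA-DW versions yield None, which never equals a platform version.
    return COMPATIBLE_PAIRS.get(tssa_dw) == tssr_platform
```

A check like this is handy before starting a download, since the two installers live in different areas of EPD and are easy to mismatch.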

 

 
3. Which versions of TrueSight Server Automation (TSSA) are compatible with which versions of TSSR-SA?

 

This information can also be found in the TSSR-SA 19.2.02 Release Notes:

 

Compatibility with TrueSight Server Automation

The following versions of TrueSight Server Automation are supported with TrueSight Server Automation - Data Warehouse:

  • BMC Server Automation 8.9.0.x and 8.9.02
  • TrueSight Server Automation 8.9.03.x, 8.9.04.x, and 20.02

 

So, every version of TSSR-SA 19.2.X is compatible with every version of BSA/TSSA 8.9.X and with TSSA 20.02.

 

The version numbers of TSSA and TSSA-DW are not tightly coupled; for example, if an environment is currently running TSSA 8.9.03, the version of TSSA-DW installed does not need to match.

 

What is important is that the version of TSSA-DW installed is compatible with the corresponding version of TSSR Platform (see the compatibility matrix in section 2).

 

For example, the following would be a valid combination:

 

TSSA 8.9.03

TSSA-DW 8.9.04 Patch 4          (part of TSSR-SA 19.2.02)

TSSR Platform 20.02                 (part of TSSR-SA 19.2.02)
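The support list from the Release Notes can also be expressed as a check. The sketch below is Python and purely illustrative: `SUPPORTED_TSSA` and `is_supported_tssa` are made-up names, and the matching rule (a listed release, plus further dotted patch levels where the list shows `.x`) is an assumption about how those entries should be read.

```python
# Release -> whether ".x" patch levels are listed alongside it in the Release Notes.
SUPPORTED_TSSA = {
    "8.9.0": True,    # BMC Server Automation 8.9.0.x
    "8.9.02": False,  # BMC Server Automation 8.9.02
    "8.9.03": True,   # TrueSight Server Automation 8.9.03.x
    "8.9.04": True,   # TrueSight Server Automation 8.9.04.x
    "20.02": False,   # TrueSight Server Automation 20.02
}


def is_supported_tssa(version: str) -> bool:
    """Return True if `version` matches one of the listed releases."""
    for base, allows_patch_levels in SUPPORTED_TSSA.items():
        if version == base:
            return True
        # Assumption: "8.9.03.x" also covers any dotted patch level of 8.9.03.
        if allows_patch_levels and version.startswith(base + "."):
            return True
    return False
```

With this rule, the TSSA 8.9.03 environment from the example above passes the check regardless of which TSSA-DW patch level is installed.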

 

 

4. Which files should I download in order to install a TSSR-SA 19.2.02 environment?

 

As mentioned previously, the TSSR-SA 19.2.02 solution consists of the following product versions:

 

TrueSight Smart Reporting for Server Automation 19.2.02 (Solution)

Component                                      Version
TrueSight Server Automation - Data Warehouse   8.9.04.004
TrueSight Smart Reporting - Platform           20.02

 

 

Details on which files to download for both TSSA-DW 8.9.04.004 and TSSR Platform 20.02 can be found in the “Downloading the Patch” section of the TSSR-SA 19.2.02 Release Notes.

 

 

 

a) Locating and downloading the installer files for TSSA-DW 8.9.04.004

 

Since TSSA-DW 8.9.04.004 is a patch, the installation files are found under the “Product Patches” tab on EPD as highlighted below:

 

 

 

From here, we navigate to the TSSA-DW 8.9.04 area:

 

 

Under here, the TSSA-DW 8.9.04.004 installation files are all dated 02/27/2020:

 

 

 

b) Locating and downloading the installer files for TSSR Platform 20.02

 

TSSR Platform 20.02 is a full product release so the installation files are found under the “Licensed Products” tab on EPD as highlighted below:

 

 

 

 

 

From here, we download either the Windows or the Linux installer depending on the OS of the TSSR-SA Server:

 

 

 

 

5. Once the installation files are downloaded, which documentation steps do I follow to install or upgrade TSSR-SA 19.2.02?

 

 

The steps to follow depend on whether this is the initial installation of TSSR-SA in the environment or whether a previous version of TSSR-SA (i.e. 19.2, 19.2.01) was previously installed and is now being upgraded to TSSR-SA 19.2.02.

 

 

Scenario A – Installing TSSR-SA for the first time in an environment:

 

This is the initial installation of TSSR-SA in this environment.

 

BDSSA is the TSSA reporting solution currently in use and we want TSSR-SA to use the same Warehouse DB that BDSSA has been using. This is sometimes referred to as a "migration" from BDSSA to TSSR-SA.

 

These are also the steps to follow if neither BDSSA nor TSSR-SA has previously been installed in this environment.

 

In this scenario, we want to follow the Installing section in the TSSR-SA documentation.

 

Since the installation process includes steps from both the TSSA-DW and TSSR Platform documentation spaces, they are listed here in order:

 

  1. Review the TSSR-SA 19.2.02 Release Notes
  2. Review the TSSR-SA 19.2.02 Orientation for BDSSA users
  3. Review the TSSR-SA 19.2.02 Getting Started section
  4. Complete the Planning activities for TSSA-DW. Note: If the TSSA-DW installation will be using the same databases as BDSSA, no new databases or schemas need to be created for TSSA-DW.
  5. Complete the Prepare to install activities for TSSA-DW
  6. Complete the TSSA-DW Installation steps
  7. Complete the TSSA-DW Post-Installation tasks. Note: Step 2 of the Post-Installation tasks is the step where we provide TSSA-DW with the database information for the Warehouse, two ETL databases and the TSSA database.

 

 

At this point, we should have a functioning TSSA-DW installation and should be able to perform an ETL run as mentioned in step 6 of the Post-Installation tasks.

 

If ETL is successful, we can continue to the TSSR 20.02 Platform installation which consists of the following steps:

 

  8. Complete the TSSR Platform 20.02 Planning activities.
  9. Complete the TSSR Platform 20.02 Preparing To Install activities. (See point 6B below.)
  10. Complete the TSSR Platform 20.02 Installation steps.
  11. Complete the TSSR Platform 20.02 Post-Installation tasks, which consist of two main steps:
      1. Adding the TSSA-DW component to TSSR Platform
      2. Configuring TSSR Platform settings

 

 

Scenario B – Upgrading an existing TSSR-SA environment:

 

A previous version of TSSR-SA (19.2 or 19.2.01) has already been installed in this environment and the goal is to upgrade it to TSSR-SA 19.2.02.

 

In this case, we want to follow the Upgrade steps in the TSSR-SA documentation.

 

  1. Review the TSSR-SA 19.2.02 Release Notes
  2. Review the TSSR-SA 19.2.02 Upgrading page
  3. Complete the TSSA-DW Preparing to upgrade activities
  4. Complete the TSSA-DW Upgrade steps.
  5. Complete the TSSR Platform 20.02 Preparing to upgrade activities
  6. Complete the TSSR Platform 20.02 Upgrade steps
  7. Complete the TSSR-SA 19.2.02 Post-upgrade tasks. These steps are very important in order to avoid post-upgrade performance, resource usage, data-refresh or authentication issues
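Taken together, the two scenarios reduce to one question: is an earlier TSSR-SA release already installed? A small Python sketch of that decision follows; `choose_doc_path`, `UPGRADABLE_FROM`, and the return strings are illustrative names only.

```python
from typing import Optional

# TSSR-SA releases named in Scenario B as upgrade starting points.
UPGRADABLE_FROM = {"19.2", "19.2.01"}


def choose_doc_path(installed_tssr_sa: Optional[str]) -> str:
    """Pick the documentation path for moving to TSSR-SA 19.2.02.

    `installed_tssr_sa` is None when TSSR-SA has never been installed, which
    includes environments that only have BDSSA (Scenario A).
    """
    if installed_tssr_sa is None:
        return "install"
    if installed_tssr_sa in UPGRADABLE_FROM:
        return "upgrade"
    # Anything else is outside the two scenarios covered in this post.
    raise ValueError(f"unexpected TSSR-SA version: {installed_tssr_sa!r}")
```

Note that an existing BDSSA installation does not count as an installed TSSR-SA version, so a BDSSA "migration" follows the install path.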

 

 

6. Important points and common questions/pitfalls

 

This section will be updated as common TSSR-SA 19.2.02 install/upgrade issues and questions are encountered.

 

A) Warehouse DB data condition to check before running TSSA-DW post-installation steps.

 

Make sure step 1 of the TSSA-DW Post-installation steps is followed if this is the initial migration of a BDSSA environment to TSSR-SA. Skipping this step may result in errors during the Warehouse DB upgrade.

 

"If you are performing the initial migration of BMC Decision Support for Server Automation warehouse to TrueSight Server Automation - Data Warehouse, you must run a diagnostic SQL query to verify that the version of BMC Decision Support for Server Automation warehouse is as expected. For more information, see this knowledge article"

 

 

 

B) Creating the TSSR Repository DB/Schema

 

When doing a fresh installation of TSSR-SA in an environment where BDSSA is currently installed, and the same Warehouse and ETL DBs are to be used by TSSR-SA, there can sometimes be confusion around the new TSSR Repository DB which TSSR Platform uses to store information such as metadata, users, and user permissions for reports.

 

Unlike the Warehouse and ETL DBs, this is not a DB which was present or used by BDSSA. This is a net-new DB required by TSSR Platform.

 

Oracle Environments:

For Oracle DB environments, the TSSR Repository user must be created in advance of running the TSSR Platform installer.

The installer is then provided with the DB connection information. See the "Setting up Oracle as the repository database" section of the Setting up the TSSR Repository Database page.

This page contains details on creating the Oracle TSSR Repository user, the tablespace (if a separate tablespace is preferred) and granting the required privileges.

 

 

SQL Server Environments:

For SQL Server environments, the user can choose whether to create the TSSR Repository DB and user in advance of running the TSSR Platform installer or to allow the installer to create the TSSR Repository DB and user.

 

If the option to have the installer create the DB and User is selected, then the installer requires Database Administrator credentials to be provided so that these tasks can be completed.

If the option to create the DB and User manually in advance is selected, then Database Administrator credentials will not be required by the installer.
