
Over time, as your TrueSight deployment grows, that growth can lead to performance issues and extra events. TrueSight sends self-monitoring events to users when their environment is growing too large for the existing configuration. The goal is to give the TrueSight Administrator details of how many devices, monitors, instances, and events are being monitored. When TrueSight reaches 50% of the upper limit on a monitored parameter, a warning event is sent; once it reaches 90% of the upper limit, a self-monitoring event is sent.

 

No one wants to see this self-monitoring event: "No. of instances is above 100% of max limit 250000 on Infrastructure Management Server." It means you have hit a limit within TrueSight and must clean up instances that may not be necessary. If they are all necessary, a second setup may be needed to manage the load. TrueSight uses the following thresholds to monitor the count of devices, instances, attributes, and events. Once a count reaches 50% of its upper limit, TrueSight notifies users via an event so that the TrueSight admin is aware of the issue.

 

These values are found in pronet.conf under the TSIM/pw/pronto/conf folder:

 

pronet.deployment.selfmonitor.device.threshold.min=50

pronet.deployment.selfmonitor.device.threshold.max=90

 

pronet.deployment.selfmonitor.instance.threshold.min=50

pronet.deployment.selfmonitor.instance.threshold.max=90

 

pronet.deployment.selfmonitor.attribute.threshold.min=50

pronet.deployment.selfmonitor.attribute.threshold.max=90

 

pronet.deployment.selfmonitor.event.threshold.min=50

pronet.deployment.selfmonitor.event.threshold.max=90
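As a rough illustration of how the min/max percentages relate to a count, the Python sketch below parses property lines like the ones above and classifies a current count against an upper limit. The function names, the "warning" and "limit_event" labels, and the sample count are assumptions made for this example; this is not how TrueSight itself is implemented.

```python
PREFIX = "pronet.deployment.selfmonitor."

def parse_thresholds(conf_text):
    """Collect threshold properties into {kind: {"min": int, "max": int}}."""
    thresholds = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line.startswith(PREFIX) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        parts = key[len(PREFIX):].split(".")
        # Expect "<kind>.threshold.<min|max>"; skip others such as feature.enable
        if len(parts) == 3 and parts[1] == "threshold" and parts[2] in ("min", "max"):
            thresholds.setdefault(parts[0], {})[parts[2]] = int(value)
    return thresholds

def classify(count, upper_limit, limits):
    """Compare a count with the min/max percentages of its upper limit."""
    pct = 100.0 * count / upper_limit
    if pct >= limits["max"]:
        return "limit_event"   # hypothetical label for the 90% event
    if pct >= limits["min"]:
        return "warning"       # hypothetical label for the 50% event
    return "ok"

SAMPLE = """
pronet.deployment.selfmonitor.instance.threshold.min=50
pronet.deployment.selfmonitor.instance.threshold.max=90
"""

limits = parse_thresholds(SAMPLE)["instance"]
# 130,000 of 250,000 instances is 52% of the limit
print(classify(130000, 250000, limits))  # prints "warning"
```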

 

A property is available to enable or disable this functionality; it takes a True or False value.

pronet.deployment.selfmonitor.feature.enable=True
BMC does not recommend disabling this property, as it is important to know when the application is reaching its limits. With self-monitoring enabled, an alarm is generated when the number of monitored instances or attributes approaches the upper limit for the environment. That limit depends on the size configuration setting in the TSIM Server, which has the following defaults:

Small environment

Size of jserver default (2 GB)

Size of rate default (2 GB)

Size of agent controller default (2 GB)


Medium environment
Size of jserver default (4 GB)
Size of rate default (4 GB)
Size of agent controller default (4 GB)

 

Large environment

Size of jserver default (10 GB)

Size of rate default (6 GB)

Size of agent controller default (7 GB)

 

TrueSight reads the configuration setting from the TSIM server and identifies a Small, Medium, or Large deployment. A scheduled task, which runs hourly (configurable), then validates the device, instance, and attribute counts and triggers the events once a count reaches 90% (configurable) of the max limit.
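To make the three size profiles easier to compare, here is a small Python sketch that stores the default process sizes listed above in a lookup table and matches a set of observed values against it. The function name, the dictionary keys, and the "custom" fallback are assumptions for illustration; the article does not describe how TrueSight actually identifies the deployment size.

```python
# Default process sizes in GB per deployment size, taken from the list above.
DEFAULT_SIZES_GB = {
    "small":  {"jserver": 2,  "rate": 2, "agent_controller": 2},
    "medium": {"jserver": 4,  "rate": 4, "agent_controller": 4},
    "large":  {"jserver": 10, "rate": 6, "agent_controller": 7},
}

def identify_deployment(jserver_gb, rate_gb, agent_controller_gb):
    """Return the profile whose defaults match exactly, else "custom"."""
    observed = {
        "jserver": jserver_gb,
        "rate": rate_gb,
        "agent_controller": agent_controller_gb,
    }
    for name, sizes in DEFAULT_SIZES_GB.items():
        if sizes == observed:
            return name
    # Values tuned away from the defaults do not match any profile.
    return "custom"

print(identify_deployment(4, 4, 4))   # prints "medium"
print(identify_deployment(8, 6, 7))   # prints "custom"
```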

 

Note: The values above are not absolute. Support has had to tune environments based on the components used, the number of items being monitored, and the hardware specifications. Support will often need to tune the values of the jserver, rate, or agent controller based on what the system is already running.

 

Several documentation pages cover sizing and limitations within TrueSight. The values are presented so that the TrueSight Administrator knows which size of environment is needed. A TrueSight rollout may start off as a small environment and grow over time, so users should plan for the size they think they may grow into. This is not always possible, so there are tuning options to take you from a small to a medium environment, or from a medium to a large one. Once you reach the maximum on the large environment, however, you will need to look at what is being monitored and whether some of the instances are not needed. Think of it as built-in housekeeping.

 

Many times, you want to monitor everything up front: you want to see the nice reports and the total number of instances and parameters that can be monitored. Over time, you learn more about your systems and their performance, and some parameters or instances may no longer be needed. When you run into the self-monitoring event, it is time to review the setup and decide which instances stay and which can be let go.

 

There are sizing guidelines for each TrueSight Component, so it can get complicated at times, but the best thing to do is to size the Presentation server, and then size the Infrastructure Management server, the Integration Service server(s) and then take a look at the data and event sizing guidelines as well. It really is enough to make your head spin!

 

Here are details the documentation provides to help make sizing a bit easier:

TSPSScaling.png

These details are in the online documentation under the Presentation Server sizing section

 

The hardest part of sizing is on the TSIM server. There are many more considerations in TSIM to account for. Here is the quick link to the start of that documentation - Infrastructure Management Server general sizing guidelines

 

A single large BMC TrueSight Infrastructure Management Server host computer scales to support the following maximum values. The values include all processes and the event management cell.

Data for 1.7 Million performance parameters:

  • 20,000 devices
  • 250,000 monitored instances
  • 3 days of raw data
  • 3 months of rate data
  • 40,000 intelligent events per day
  • 350,000 external events per day


These values are all independent: reaching any one of them, not all of them, is enough to trigger the self-monitoring event indicating a sizing issue. This sizing is for a 64-bit host with 8 CPUs and 32 GB of RAM. Do not try to exceed 250,000 monitor instances or 1,700,000 attributes on a single Infrastructure Management Server host. Additional CPU and memory do not help to scale beyond these values. You will see performance issues as well as odd, unexplained behavior, because when a system is overloaded, nothing really works correctly. A restart may help momentarily, but once the monitoring limits are breached, all bets are off and performance tanks quickly.
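Because each limit is evaluated on its own, a quick check can be sketched in Python as follows. The limit values come from the list above; the dictionary keys, the function name, and the sample counts are assumptions for illustration, and comparing daily event counts in the same way as object counts is a simplification rather than a description of TrueSight's internal check.

```python
# Maximum values for a single large Infrastructure Management Server (from the text).
LARGE_TSIM_LIMITS = {
    "devices": 20_000,
    "monitored_instances": 250_000,
    "attributes": 1_700_000,
    "intelligent_events_per_day": 40_000,
    "external_events_per_day": 350_000,
}

def breached_limits(counts, limits=LARGE_TSIM_LIMITS, pct=90):
    """Return the sorted names of limits whose count is at or above pct% of the maximum."""
    return sorted(
        name
        for name, limit in limits.items()
        # Integer arithmetic avoids floating-point rounding at the boundary.
        if counts.get(name, 0) * 100 >= limit * pct
    )

# Hypothetical environment: only the instance count is near its limit (96%).
counts = {"devices": 12_000, "monitored_instances": 240_000, "attributes": 900_000}
print(breached_limits(counts))  # prints ['monitored_instances']
```

A single entry in the result is enough to warrant a review, even when every other count is well below its limit.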

 

For 11.3.01, in some environments the Infrastructure Management database requires more storage space than in previous versions. The data collection rate has increased: data was previously collected every 5 minutes, but now the data collected by the BMC PATROL Agent is streamed to the BMC TrueSight Infrastructure Management Server and many parameters are collected every minute. This means you must give the database the space it needs for the increased data stream. You must allocate 600 GB of storage (100 GB for the Infrastructure Management Server + 500 GB for the database) in a large implementation.

 

We also need to consider the various components and their sizing:

Sizing charts and guidelines for event and impact management

Sizing charts and guidelines for data and event management environments

Sizing chart and guidelines for the Cloud Lifecycle Management integration

Sizing the VMware vCloud Director integration

 

Let’s not forget about the Integration Service, the overlooked component when it comes to sizing. It is important to have it properly sized.

A single large Integration Service host scales to support the following maximum values. These values include the Integration Service and the event management cell.

    Data for 1.7 million performance parameters

  • 900 PATROL Agents
  • 50,000 monitored instances
  • 25 events per second

 

Again, these values are independent: it does not take all of them to trigger a self-monitoring event, just one. With that in mind, the TrueSight documentation offers the considerations below for sizing the Integration Service:

· A 64-bit Integration Service host machine with 2 CPUs and 2 GB RAM can scale up to 500 PATROL Agents.

· Allocate a separate Integration Service for each remote network.

· A 64-bit Integration Service host with 4 CPUs and 8 GB RAM can scale up to 900 PATROL Agents.

· Minimize the number of Integration Services that you use to collect data from BMC PATROL Agents.

· Ensure that the Infrastructure Management Central and Child Servers have user access to the Integration Service computers.

· Deploy an Integration Service on each network where BMC PATROL Agents are located, and distribute BMC PATROL Agents on the Integration Services so that the load on the service does not exceed the suggested values.

· The load can be distributed across one or more Integration Services by keeping the total load on the server constant as per the deployment type. You can configure a maximum of 25 Integration Services.

· You can configure a maximum of 500 PATROL Agents per Integration Service in a small deployment and 900 in medium and large deployments.
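The per-service agent limits above lend themselves to a quick back-of-the-envelope estimate. The Python sketch below computes how many Integration Services a given number of PATROL Agents would need, using the 500/900 agents-per-service figures and the maximum of 25 Integration Services quoted above. The function name and the sample agent counts are assumptions for illustration, and the estimate deliberately ignores the other considerations in the list (one service per remote network, monitored instances, and events per second).

```python
import math

# Maximum PATROL Agents per Integration Service by deployment size (from the list above).
AGENTS_PER_SERVICE = {"small": 500, "medium": 900, "large": 900}
MAX_INTEGRATION_SERVICES = 25

def integration_services_needed(patrol_agents, deployment):
    """Estimate the number of Integration Services from the agent count alone."""
    per_service = AGENTS_PER_SERVICE[deployment]
    needed = math.ceil(patrol_agents / per_service)
    if needed > MAX_INTEGRATION_SERVICES:
        raise ValueError(
            f"{patrol_agents} agents would need {needed} Integration Services, "
            f"more than the maximum of {MAX_INTEGRATION_SERVICES}"
        )
    return needed

print(integration_services_needed(1200, "small"))  # prints 3
print(integration_services_needed(4000, "large"))  # prints 5
```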

 

 

Some general tips for all types of deployments:

· Install the BMC TrueSight Infrastructure Management Server and Integration Service on separate computers.

· Distribute event correlation, de-duplication, and normalization on Infrastructure Management remote cells.

· (SAP SQL Anywhere only) To improve I/O throughput, use separate disks for the operating systems on which the BMC TrueSight Infrastructure Management Server and the database are deployed. Ensure that each disk has its own disk controllers.

· Distribute data collection on Integration Services so that the server is dedicated to the primary server processes (analytical engine, object cache, and agent controller). The Self-Monitoring Service, which is automatically installed on the BMC TrueSight Infrastructure Management Server, collects metrics only about the performance of the BMC TrueSight Infrastructure Management Server.

· If you expect the total number of attributes for the environment to grow to 750,000 attributes, use a large-environment deployment so that the system can accommodate growth smoothly.

 

Overall, when you size the environment, Support recommends planning for growth. Support has seen many cases over the past year where the environment was originally configured for small or medium and, by the end of the year, the TrueSight implementation was already over the large capacity and performance was at a standstill. Please do review the sizing and hardware requirements, as they are important in an ever-growing monitoring environment.

 

If you have any sizing or tuning concerns, please log a case with Support for further assistance.

 

 

Did you miss one of our TrueSight Webinars this year?

Check out past Webinar recordings!!! - TrueSight Best Practices Webinar Calendar

 

Do you have questions about the new TrueSight Smart Reporting option? Take a look at this FAQ to see if it helps - TrueSight Smart Reporting details

 


 

 

 

TrueSight 11.3.01 is here!!!!!…

The BMC Assisted MIGration Offering, or AMIGO, is a program designed to assist our customers in planning and preparing for product upgrades from an older to a newer supported version. By engaging with BMC Technical Support Analysts, you will be provided with materials containing guidelines and best practices to aid in compiling your own upgrade plan. An upgrade expert will then review your plan and offer advice and suggestions to ensure success through proper planning and testing.

The AMIGO program consists of a Starter Phase and a Review Phase.  Each phase is initiated by opening a support case, and ends when the case is closed.

In the Starter Phase, an AMIGO Starter case is opened.  Reference material will be provided and a call with a Technical Support Analyst will take place to discuss the details of your upgrade, and address any questions you may have.  The AMIGO Starter case will be closed, and the next step will be for you to prepare a documented upgrade plan.

In the Review Phase, an AMIGO Review case is opened, preferably two weeks prior to a set upgrade date. A call will be scheduled with an upgrade expert to review your detailed plan and provide feedback and recommendations, along with answers to any outstanding questions. As needed, a follow-up discussion with a Technical Support Analyst may take place for feedback after the upgrade is performed.

The AMIGO program includes:

» A “Question and Answer” session before you upgrade

» A review of your upgrade plan with Customer Support

» An upgrade checklist

» Helpful tips and tricks for upgrade success from previous customer upgrades

» A follow-up session with Customer Support to let them know how it went. This will help BMC to enhance the process.

 

To get started, please review the details here:

https://docs.bmc.com/docs/TSOperations/113/amigo-checklist-for-truesight-operations-management-814553031.html

 

Then open a BMC Support issue containing your environment information (product, version, OS, etc.) and the planned date of the installation, if known. We will contact you promptly, and work with you to ensure a successful and timely outcome.

 

 

 


 

New Knowledge Added over the last month:

 

000162392 How to auto-close event in TrueSight when the corresponding incident is resolved in Remedy

 

000162132 Slow response in creating or editing TSPS infrastructure policies

 

000161854 When creating a package to upgrade PATROL Agent servers what happens if you leave the Agent TAG and Integration Service fields blank?

 

000161739 Is it recommended to install ASSO and RSSO on the same server in a TrueSight environment?

 

000161592 "Cannot validate argument on parameter 'Group'" seen when attempting to run GenericService_SDIG.ps1

 

000161584 Unable to remove an ISN under Manage Devices in TSPS

 

000162410 Policy is not being applied against all PATROL Agents that are defined in the Agent Selection Criteria

 

000162243 During fail over to Disaster Recovery site the TrueSight Presentation Server in the DR remains in standby mode

 

000162162 Cross Launch From TSPS to Truesight Smart Reporting is Throwing Session Timeout error

 

000162040 Propagation policy does not work after being edited

 

000161956 The publishing server is not starting properly. The psstat returns "Connection initialization timeout expired. Please verify if the Publishing Server and JBoss jms service are up  and running."