Discovery issues can have various causes, some of which are difficult to identify. The new Report monitor implemented in Hardware Sentry KM for TrueSight Operations Mgmt | TrueSight Infrastructure Mgmt provides an overview of the discovery progress, data collection, status, and results for a specific host. Each monitored attribute can be configured to trigger alerts when its value reaches an unacceptable or abnormal level, helping you quickly pinpoint a discovery or performance problem.
The Report monitor is displayed in the Monitor list of the Device Details in your TrueSight console.
The Report graph provides a real-time view of the data collected on a specific host. The graph can be customized to display several attributes at a time.
The collected data is distributed in 5 categories: component counts, protocol executions, protocol statuses, sizes, and other host-level attributes.
Note: The availability of attributes may vary according to the components discovered and monitored on your device.
These attributes report on the number of components discovered and monitored on a host. Although thresholds are not set by default, you can easily configure your own custom thresholds to get notified when a counter drops suspiciously low, which may suggest a discovery issue. A graph displaying some of the following counter-type attributes can help you focus on a specific type of data to analyze and isolate an issue:
|Count - All||Sum of all objects reported by the Count-x attributes, except for Count-ConnectedPorts and Count-Missing.|
|Count - Battery||Count of all batteries discovered on the monitored host.|
|Count - Blade||Count of all blades discovered on the monitored host.|
|Count - Connected Ports||Count of all connected ports discovered on the monitored host.|
|Count - CPU Cores||Count of all CPU cores discovered on the monitored host.|
|Count - CPU||Count of all CPUs discovered on the monitored host.|
|Count - Disk Controller||Count of all disk controllers discovered on the monitored host.|
|Count - Disk Enclosure||Count of all disk enclosures discovered on the monitored host.|
|Count - Enclosure||Count of all enclosures discovered on the monitored host.|
|Count - Fan||Count of all fans discovered on the monitored host.|
|Count - LED||Count of all LEDs discovered on the monitored host.|
|Count - Logical Disk||Count of all logical disks discovered on the monitored host.|
|Count - LUN||Count of all LUNs discovered on the monitored host.|
|Count - Memory||Count of all memory modules discovered on the monitored host.|
|Count - Missing||Count of all objects that are currently missing.|
|Count - Network||Count of all network interfaces discovered on the monitored host.|
|Count - Other||Count of all other devices discovered on the monitored host.|
|Count - Physical Disk||Count of all physical disks discovered on the monitored host.|
|Count - Power Supply||Count of all power supplies discovered on the monitored host.|
|Count - Robotics||Count of all robotics discovered on the monitored host.|
|Count - Tape Library||Count of all tape libraries discovered on the monitored host.|
|Count - Temperature||Count of all temperature sensors discovered on the monitored host.|
|Count - Voltage||Count of all voltage sensors discovered on the monitored host.|
For example, a Count - All attribute reporting fewer than 5 monitored components may indicate that the host monitoring is incomplete and requires investigation.
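The kind of check a custom low-count threshold performs can be sketched as follows. This is only an illustration in Python, not the way thresholds are configured in the TrueSight console: the `counts` snapshot, the `minimums` values, and the `low_counters` helper are all hypothetical.

```python
# Hypothetical snapshot of counter-type attributes for one host.
# The attribute names mirror the table above; the values are invented.
counts = {
    "Count - All": 4,
    "Count - CPU": 2,
    "Count - Memory": 0,
    "Count - Fan": 2,
}

# Hypothetical minimum values: no thresholds are set by default,
# so these stand in for the custom thresholds you would define.
minimums = {
    "Count - All": 5,
    "Count - Memory": 1,
}


def low_counters(counts, minimums):
    """Return the attributes whose value is below its configured minimum."""
    return sorted(
        name
        for name, minimum in minimums.items()
        if counts.get(name, 0) < minimum
    )


print(low_counters(counts, minimums))
# ['Count - All', 'Count - Memory']
```

A counter missing from the snapshot is treated as 0 here, so a component type that was never discovered is flagged as well.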
These attributes report on the number of successful executions performed on a host, by protocol (including OS Commands) based on data collected upon each discovery.
|Execution - Command||Count of successful command executions performed on the monitored host.|
|Execution - HTTP||Count of successful HTTP request executions performed on the monitored host.|
|Execution - IPMI||Count of successful IPMI command executions performed on the monitored host.|
|Execution - SNMP||Count of successful SNMP query executions performed on the monitored host.|
|Execution - UCS||Count of successful UCS command executions performed on the monitored host.|
|Execution - WBEM||Count of successful WBEM query executions performed on the monitored host.|
|Execution - WMI||Count of successful WMI query executions performed on the monitored host.|
Overloaded systems may prevent Hardware Sentry from operating properly. A sudden drop in the number of executions is likely due to a lack of resources available to the monitoring solution, which prevents it from monitoring the host in an optimal manner. Hardware Sentry 10 enables you to closely watch the number of successful command or query executions on a host to rapidly pinpoint any overload or unpredicted change in the system’s workload.
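As a rough illustration of what "a sudden drop" could mean in practice, the sketch below flags any sample that falls below half of the previous one. The `sudden_drops` helper, the 0.5 ratio, and the sample values are assumptions made for this example; they are not part of Hardware Sentry.

```python
def sudden_drops(samples, drop_ratio=0.5):
    """Return the indexes of samples lower than drop_ratio times the previous sample."""
    drops = []
    for i in range(1, len(samples)):
        previous, current = samples[i - 1], samples[i]
        # Skip comparisons against zero: there is nothing to drop from.
        if previous > 0 and current < previous * drop_ratio:
            drops.append(i)
    return drops


# Hypothetical Execution - SNMP values, one per discovery.
snmp_executions = [120, 118, 121, 40, 38, 119]
print(sudden_drops(snmp_executions))
# [3]
```

Only the first low value is reported, because the following sample (38) is close to its predecessor (40). A comparison against a longer baseline would keep flagging the degraded period.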
These attributes report on the status of the connections between Hardware Sentry 10 and the monitored host. When the monitored host is the localhost, connection issues may be due to invalid credentials or to a protocol interruption or failure (for example, a stopped SNMP service or SSH daemon). When a connection becomes degraded or fails, the connection-specific ProtocolStatus attribute triggers an alert to immediately notify you about the protocol that requires attention. Note that the attribute graph is also annotated to provide additional information regarding the protocol error.
|ProtocolStatus-Command||Status of the connection to the monitored device with the OS command protocol.|
|ProtocolStatus-SNMP||Status of the connection to the monitored device with the SNMP protocol.|
|ProtocolStatus-WBEM||Status of the connection to the monitored device with the WBEM protocol.|
|ProtocolStatus-WMI||Status of the connection to the monitored device with the WMI protocol.|
The size-type attributes report on the consolidated size, in gigabytes, of logical/physical disks and memory for the monitored host, giving you an overall view of the capacity potentially available on your device.
|Size - Logical Disk||Sum of the size of all discovered logical disks on the monitored host.|
|Size - Memory||Sum of the size of all discovered memory modules on the monitored host.|
|Size - Physical Disk||Sum of the size of all discovered physical disks on the monitored host.|
|Degrees Below Warning||Number of degrees before reaching the closest warning threshold.|
|Hardware Discovery Status||Status of the discovery of all hardware components on the monitored host.|
|Hardware Discovery Time||Time taken to discover all hardware components on the monitored host.|
|Platform Detection Time||Time taken to detect the connectors that match the system. If connectors are pre-selected, the attribute value is '0'.|
|Power Consumption||Wattage consumed by all the discovered components on the monitored host.|
Sum of the values of all LinkSpeed attributes (only for plugged-in network cards).
Some of these attributes can help you anticipate critical situations and optimize the monitoring of your IT assets. For example:
The Degrees Below Warning attribute reports on the remaining number of degrees (°C) before the temperature reaches the closest warning threshold set on the temperature sensors of the device. Hardware Sentry 10 automatically sets thresholds according to the manufacturers' recommendations and the location of the temperature sensor. Continuous, close control of the temperature of your servers will help you optimize the overall temperature of your entire IT environment to avoid hardware overheating and keep your energy bill within budget.
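One way to read "the closest warning threshold" is as the smallest margin between any sensor's current reading and its own warning threshold. The sketch below follows that interpretation, which is an assumption rather than a documented formula; the sensor names, readings, and thresholds are invented.

```python
def degrees_below_warning(sensors):
    """Smallest margin, in °C, between a reading and its warning threshold.

    sensors maps a sensor name to a (current_temperature, warning_threshold) pair.
    Returns None when no sensor is available.
    """
    margins = [warning - current for current, warning in sensors.values()]
    return min(margins) if margins else None


# Hypothetical sensors: (current °C, warning threshold °C).
sensors = {
    "CPU 1": (52.0, 70.0),
    "Ambient": (24.0, 35.0),
    "Disk bay": (38.0, 45.0),
}
print(degrees_below_warning(sensors))
# 7.0
```

Here the disk bay sensor is the closest to its warning threshold, so it determines the reported value even though the CPU runs hotter in absolute terms.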
The Hardware Discovery Status attribute can help you identify an unusual or suspicious workload on the agent monitoring a host.
A Hardware Discovery Status set to 1 (Waiting in Bottleneck) indicates that the discovery is waiting to be performed. A discovery that remains in the bottleneck state (value at 1) for more than about a second is likely to indicate a resource issue. Although Hardware Sentry 10 handles these situations, splitting the configuration across other PATROL agents might help optimize the monitoring performance.
A Hardware Discovery Status set to 2 (On) can be trickier to analyze, since it may last longer when the monitored host is slow or when the number of devices to discover is high. Typically, the graph should display a discovery status set to 1 when the host discovery starts, and then a value at 2 that lasts for the same duration from one discovery to another, meaning that Hardware Sentry 10 and the host are behaving the same way from one discovery to another.
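The pattern described above (a short wait at 1, then a run at 2 of stable duration) can be checked on a series of status samples. The following sketch assumes the status is sampled at a fixed interval and uses 0 as a placeholder for "no discovery running"; that value, the sampling interval, and both helper functions are assumptions made for illustration only.

```python
from itertools import groupby

WAITING, ON = 1, 2  # status values described above


def status_runs(samples, interval_s):
    """Collapse consecutive identical statuses into (status, duration_s) runs."""
    return [
        (status, len(list(group)) * interval_s)
        for status, group in groupby(samples)
    ]


def long_bottlenecks(samples, interval_s, max_wait_s=1):
    """Return the durations of waiting runs that last longer than max_wait_s."""
    return [
        duration
        for status, duration in status_runs(samples, interval_s)
        if status == WAITING and duration > max_wait_s
    ]


# Hypothetical samples taken every second (0 is a placeholder for "idle").
samples = [0, 1, 2, 2, 2, 0, 0, 1, 1, 1, 2, 2, 2, 0]
print(status_runs(samples, 1))
# [(0, 1), (1, 1), (2, 3), (0, 2), (1, 3), (2, 3), (0, 1)]
print(long_bottlenecks(samples, 1))
# [3]
```

In this invented series both discoveries spend 3 seconds in state 2, which matches the expected stable behavior, while the second discovery waited 3 seconds in the bottleneck and would deserve a closer look.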
The Hardware Discovery Time attribute collects the time taken to discover the hardware components on a monitored host. The Hardware Discovery Time attribute of a specific host can be customized to alert you when a sudden deviation from the baseline occurs, and therefore help you isolate the root cause of the problem faster.
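A baseline deviation can be defined in several ways. The sketch below uses one common choice, flagging a value that lies more than a few standard deviations away from the mean of past discoveries. This is not necessarily how TrueSight computes its baselines; the `deviates_from_baseline` helper, the tolerance of 3, and the sample durations are assumptions.

```python
from statistics import mean, stdev


def deviates_from_baseline(history, latest, tolerance=3.0):
    """True when latest is more than tolerance standard deviations from the mean of history."""
    if len(history) < 2:
        return False  # not enough data to establish a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly stable history: any different value is a deviation.
        return latest != baseline
    return abs(latest - baseline) > tolerance * spread


# Hypothetical Hardware Discovery Time values, in seconds.
history = [41.0, 43.5, 42.0, 40.5, 42.5]
print(deviates_from_baseline(history, 44.0))
# False
print(deviates_from_baseline(history, 95.0))
# True
```

A tolerance expressed in standard deviations adapts to each host: a host whose discovery time naturally fluctuates needs a larger absolute change to raise an alert than a very stable one.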
The Power Consumption attribute reports on the wattage consumed by all the discovered components on the monitored host. This measure makes it easier to determine which of your systems consume the most or the least energy. The Power Consumption attribute provides a solid, overall view of a host's power consumption and can help you better manage energy efficiency.
Monitoring the discovery and availability of critical components and processes, such as CPU, memory, battery, temperature, and connections, is essential to maintain system performance, anticipate critical hardware failures, and minimize downtime. The Report monitor can support proactive problem detection policies by constantly monitoring critical key performance indicators. It can also become a source of significant savings in time and money by providing an overall portrait of system health and performance and by quickly pinpointing the root cause of potentially critical issues.