We are using BPM 2.9.1 for agentless monitoring, and one of our AIX servers (AIX 6.1 TL4 SP3)
is facing this issue. In the element's system logs, the user observed the error below.
Did not receive identification string from XXXX (XXXX is the RSM IP address).
This error was observed only intermittently, and monitoring stopped for the OS application classes except Filesystem.
Thanks in advance.
It sounds like your command is timing out. That could be because you've asked it to retrieve a lot of data, the host is busy, or it's having problems getting information back across the network in time. You could try increasing the timeout limit in (on all hosts where you have an RSM installed)
and change the following line:
I believe the value is in seconds, so don't make it larger than your sampling time.
ACS error - "Timed out waiting for data to be sent or received by the SSH channel"
=> This problem can occur when the default timeout period for the Command Shell collector is not high enough for the element. Please adjust the timeout value by modifying a property in the rsm.properties file on the RSM computer.
Here is the location of the file and information:
- rsm.properties, available in the following location, contains timeout values that you can customize for application classes that use the SNMP or Command Shell collectors:
Property in the rsm.properties file on the RSM computer: patsdk-commandshell-solution.patsdk-commandshell.timeout
Number of seconds that the Command Shell collector waits for a value before timing out.
# rsm.properties configuration file.
# Command Shell collector default timeout in seconds.
# To configure a custom timeout for Command Shell collectors, uncomment the
# following line and replace the default value.
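Before restarting anything, it can help to confirm that the property is actually uncommented and that the value does not exceed the sampling interval. The following is only a rough sketch: the property name is the one quoted above, but the sample file content, the value 120, and the helper names `read_timeout` and `timeout_is_safe` are made up for illustration and are not part of the product.

```python
# Property name as quoted in this thread.
PROPERTY = "patsdk-commandshell-solution.patsdk-commandshell.timeout"


def read_timeout(properties_text):
    """Return (seconds, active) for the first matching line, or None.

    active is False when the line is still commented out with '#'.
    """
    for raw in properties_text.splitlines():
        line = raw.strip()
        active = not line.startswith("#")
        candidate = line.lstrip("#").strip()
        key, sep, value = candidate.partition("=")
        if sep and key.strip() == PROPERTY:
            return int(value.strip()), active
    return None


def timeout_is_safe(timeout_seconds, sampling_seconds):
    """The timeout should not be larger than the sampling interval."""
    return timeout_seconds <= sampling_seconds


# Hypothetical file content: 120 is a placeholder, not the product default.
sample = """\
# Command Shell collector default timeout in seconds.
#patsdk-commandshell-solution.patsdk-commandshell.timeout=120
"""

found = read_timeout(sample)
print(found)  # the line is still commented out, so active is False
```

If `active` comes back False, the custom value is not in effect yet because the line is still commented out.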
Please try the above solution and let me know the outcome.
Also please verify the following solution:
The application class AIX - Using Command Shell, and maybe others, contains some shell commands that include an nslookup. This is fine if DNS is being used and the nameservers are correctly configured. However, if DNS is not used or the nameservers are incorrectly configured, timeouts may occur. The nslookup command has default timeout and retry settings for when it fails to connect to a nameserver, but these default timeout periods can be very long. This is because the default retry setting is 4, and with each retry the timeout doubles.
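To get a feel for how long that can take, here is a small back-of-the-envelope calculation. It is only a sketch under two assumptions that are not stated in this thread: that the initial timeout is 5 seconds, and that "retry 4" means four attempts in total against a single nameserver. The function name `worst_case_wait` is made up for illustration.

```python
def worst_case_wait(initial_timeout, attempts):
    """Total seconds spent if every attempt times out and each timeout doubles."""
    return sum(initial_timeout * 2 ** i for i in range(attempts))


# Assumed values: 5 second initial timeout, 4 attempts.
# 5 + 10 + 20 + 40 = 75 seconds for a single failing lookup.
print(worst_case_wait(5, 4))
```

A wait of that order for one lookup can easily exceed a collector timeout, which would match the symptom described above.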
To resolve this, create a file called .nslookuprc in the $HOME directory of the user account used for the connection between the RSM and the AIX server. In the file $HOME/.nslookuprc, put the following lines:
The next time the application class runs, the nslookup will time out in 5 seconds and the data will be collected successfully.
Please let me know whether the above can be applied to your situation and whether it helps to solve the problem. Thanks. VM.
Just for your information, the latest released versions of BPM Express for Servers are:
BPM Express for Unix/Linux 2.7.64
BPM Express for Microsoft Windows 2.7.70
Release Notes of the latest Portal v2.10
I hope this helps, Thank You.