9 Replies Latest reply on Sep 17, 2018 11:03 AM by EDOARDO SPELTA

    Patrol agent intermittent disconnections with Integration Service

      Good morning,


      We're facing some issues with a Patrol Agent V9.5.00.2i. The configuration is the same as for the rest of the agents (retrieved from CMA policies). The log shows no additional information beyond this:


      Tue Nov 17 02:00:54 PM 2015: ID 1021fb: W: Connection with Integration Service INTEGRATION_SERVICE_HOSTNAME with port 3183 lost

      Tue Nov 17 02:01:24 PM 2015: ID 1021fa: I: Connection established with Integration Service INTEGRATION_SERVICE_HOSTNAME on port 3183

      Tue Nov 17 03:11:24 PM 2015: ID 1021fb: W: Connection with Integration Service INTEGRATION_SERVICE_HOSTNAME with port 3183 lost

      Tue Nov 17 03:11:54 PM 2015: ID 1021fa: I: Connection established with Integration Service INTEGRATION_SERVICE_HOSTNAME on port 3183

      ....

      The interval between disconnections varies, but mostly they occur about every 10 minutes.
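To put numbers on the pattern rather than eyeballing it, the error log can be parsed for the two message IDs shown above (1021fb for "lost", 1021fa for "established"). The following is only a sketch: the function names are made up for illustration, and it assumes every line starts with a timestamp in the format seen in this excerpt, followed by ": ID ".

```python
from datetime import datetime

# Sample lines copied from the agent log excerpt above.
LOG = """\
Tue Nov 17 02:00:54 PM 2015: ID 1021fb: W: Connection with Integration Service INTEGRATION_SERVICE_HOSTNAME with port 3183 lost
Tue Nov 17 02:01:24 PM 2015: ID 1021fa: I: Connection established with Integration Service INTEGRATION_SERVICE_HOSTNAME on port 3183
Tue Nov 17 03:11:24 PM 2015: ID 1021fb: W: Connection with Integration Service INTEGRATION_SERVICE_HOSTNAME with port 3183 lost
Tue Nov 17 03:11:54 PM 2015: ID 1021fa: I: Connection established with Integration Service INTEGRATION_SERVICE_HOSTNAME on port 3183
"""

LOST_ID = "1021fb"         # "Connection ... lost"
ESTABLISHED_ID = "1021fa"  # "Connection established ..."

# Assumed timestamp layouts: 12-hour with AM/PM, and a 24-hour variant.
STAMP_FORMATS = ("%a %b %d %I:%M:%S %p %Y", "%a %b %d %H:%M:%S %Y")


def parse_stamp(stamp):
    """Return a datetime for the log timestamp, or None if it does not parse."""
    for fmt in STAMP_FORMATS:
        try:
            return datetime.strptime(stamp, fmt)
        except ValueError:
            continue
    return None


def parse_events(text):
    """Return (timestamp, message_id) pairs for the two connection messages."""
    events = []
    for line in text.splitlines():
        stamp, sep, rest = line.strip().partition(": ID ")
        if not sep:
            continue  # not a line in the expected layout
        msg_id = rest.split(":", 1)[0]
        when = parse_stamp(stamp)
        if when is not None and msg_id in (LOST_ID, ESTABLISHED_ID):
            events.append((when, msg_id))
    return events


def outage_durations(events):
    """Seconds between each 'lost' message and the next 'established' one."""
    durations, lost_at = [], None
    for when, msg_id in events:
        if msg_id == LOST_ID:
            lost_at = when
        elif lost_at is not None:
            durations.append((when - lost_at).total_seconds())
            lost_at = None
    return durations


def gaps_between_losses(events):
    """Seconds between consecutive 'lost' messages."""
    lost = [when for when, msg_id in events if msg_id == LOST_ID]
    return [(b - a).total_seconds() for a, b in zip(lost, lost[1:])]


events = parse_events(LOG)
print(outage_durations(events))     # [30.0, 30.0]
print(gaps_between_losses(events))  # [4230.0]
```

On the four lines quoted here, each outage lasts 30 seconds. Running the same functions over the full log would show whether the interval between losses really clusters around 10 minutes.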

       

      We've checked that connectivity is OK, the IS is working fine, and the rest of the agents don't have this problem, so I enabled debug on the agent, but I can't interpret the output:

       

      COMM     1447841359 18-11-15 10:09  Wed Nov 18 11:09:19 2015 from agent to console (nonblocking) (567)   
                           IS_M  1-1-70 0:00    
                           {  1447841359    
                             1447841359     IS_M_T=DATA
                             1447841359     IS_M_ID=PA-0-10889-1447840889-1004
                             1447841359     IS_M_D=AC-0
                             1447841359     IS_An=Y
                             1447841359     NSDL=133
                             1447841359     MOH=\007\005AISH\010\001\006APL_N\003\006APL_V\003\010APL_REL\003\010APL_REV\003\007INST_S\003\007INST_O\004PAN\003\004PIS\003\006Linux\00516.5\0109.12.00\00300\014es2rxsms04v\006
                             1447841359     NSDL=321
                             1447841359     MOJ=\007\004AIS\004\004
                           }  1447841359    


      SESSION  1447841359 18-11-15 10:09  Compressing 567
      SESSION  1447841359 18-11-15 10:09  Session T:26.3183@10.2.90.120 Compress: 567 > 19
      IDENTITY 1447841359 18-11-15 10:09  Entering esi_Write
      TCP      1447841359 18-11-15 10:09  Fd = 26 sends 'D' (0x44) Size = 46 to: 3183@10.2.90.120
      IDENTITY 1447841359 18-11-15 10:09  Exiting esi_Write OK
      SELECT   1447841359 18-11-15 10:09  timewait 0 sec 391800 usec
      TCP      1447841359 18-11-15 10:09  Fd = 26  NumRead = -1  errno = 104
      RUNQ     1447841359 18-11-15 10:09  ExecuteProcesses: current time is 1447841359
      RUNQ     1447841359 18-11-15 10:09  ExecuteProcesses: next proc to exec is in 3 secs
      IDENTITY 1447841359 18-11-15 10:09  Entering esi_IsEstablished
      IDENTITY 1447841359 18-11-15 10:09  Exiting esi_IsEstablished YES
      COMM     1447841359 18-11-15 10:09  Connection with Integration Service 'INTEGRATION_SERVICE_HOST' '3183' is closed '-1'
      MAIN     1447841359 18-11-15 10:09  DEL AGS Timestamp / PV Timer
      GENERAL  1447841359 18-11-15 10:09  shortcut 'AgentSetup' expanded to target '//pcfg/AgentSetup'
      GENERAL  1447841359 18-11-15 10:09  shortcut 'AgentSetup' expanded to target '//pcfg/AgentSetup'
      PCFG     1447841359 18-11-15 10:09  get /AgentSetup/integration/sendAllConnectDisconnectEvent => (?,no)
      PEMTRACE 1447841359 18-11-15 10:09  BEM event processing with format 'BiiP3'
      PEMTRACE 1447841359 18-11-15 10:09  Best match on catalog '0' level (unmatched IS_DISCONNECT)
      MAIN     1447841359 18-11-15 10:09  Sending EVT_MSG APL=Linux,SID=es2rxsms04v, NAME=agenthostname,            PARAM=<NULL>, ORIG=agenthostname
      GDD      1447841359 18-11-15 10:09  EVENT Cache Log Written for MSGID 10923
      MAIN 1447841359 Restart timer '652' of the Integration Service is installed for hostname 'INTEGRATION_SERVICE_HOST' port '3183' protocol 'TCP'

       

      Could someone please shed some light on this? I don't know where that "Restart timer" (in the last line) comes from, but it seems to be restarting the connection?
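One line in the debug output that can be decoded without any product knowledge is the TCP read that returns NumRead = -1 with errno = 104. On Linux, errno 104 is ECONNRESET ("Connection reset by peer"), which means the read failed because a TCP reset arrived on that socket. It does not say whether the reset came from the Integration Service itself or from a device in the path. Below is a small sketch that pulls the errno out of such a line. The lookup table is hard-coded with Linux values (the agent reports APL=Linux), and the helper name is hypothetical.

```python
import re

# Linux errno values that commonly appear on TCP sockets.
# These numbers are Linux-specific; other operating systems use different ones.
LINUX_SOCKET_ERRNOS = {
    32: ("EPIPE", "Broken pipe"),
    104: ("ECONNRESET", "Connection reset by peer"),
    110: ("ETIMEDOUT", "Connection timed out"),
    111: ("ECONNREFUSED", "Connection refused"),
    113: ("EHOSTUNREACH", "No route to host"),
}

# Matches the read-result lines of the debug log, e.g.
# "Fd = 26  NumRead = -1  errno = 104"
TCP_READ_RE = re.compile(r"Fd = (\d+)\s+NumRead = (-?\d+)\s+errno = (\d+)")


def explain_tcp_line(line):
    """Return a dict describing a TCP read-result line, or None if no match."""
    match = TCP_READ_RE.search(line)
    if match is None:
        return None
    fd, num_read, err = (int(group) for group in match.groups())
    name, text = LINUX_SOCKET_ERRNOS.get(err, ("UNKNOWN", "not in this table"))
    return {"fd": fd, "num_read": num_read, "errno": err, "name": name, "text": text}


line = "TCP      1447841359 18-11-15 10:09  Fd = 26  NumRead = -1  errno = 104"
print(explain_tcp_line(line))
```

Filtering the whole debug file this way would show whether every disconnect is preceded by the same errno or whether timeouts (110) also occur.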

       

      thanks a lot and kind regards

      Sergio

        • 1. Re: Patrol agent intermittent disconnections with Integration Service

          We have solved this problem.

           

          In the IP Forwarding profile of one of our F5 load balancers, we had to enable the following options:

           

          Loose Initiation

          Loose Close

           

          After enabling them, the Patrol agents no longer disconnect.

           

          kind regards

          Sergio Martín

          • 2. Re: Patrol agent intermittent disconnections with Integration Service

            Glad that some of you managed to get the problem solved.

             

            My problem is slightly different: I do not have a load balancer, but I do have one Integration Service cluster pair.

             

            The recent case I opened with support did not resolve the issue. The observation is that once I start up the Integration Service, so much traffic comes in (about 600 for the one pair) that the Integration Service is unable to complete the TCP three-way handshake properly.

             

            Even if I telnet locally on the IS machine to port 3183, sometimes the connection goes through and sometimes it does not.

            We even adjusted some TCP settings at the Windows level, but it did not help.
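The hit-or-miss telnet result can be turned into a measurable success rate by repeating the TCP connect in a loop. This is a minimal sketch using only the Python standard library. The function names and default values are illustrative, and it only tests whether the three-way handshake completes, not whether the Integration Service answers at the application level.

```python
import socket
import time


def probe(host, port, attempts=20, timeout=3.0, pause=0.5):
    """Try to open a TCP connection `attempts` times.

    Returns a list of (succeeded, seconds_taken, error_text) tuples.
    """
    results = []
    for i in range(attempts):
        start = time.monotonic()
        try:
            # create_connection returns once the TCP handshake has completed.
            with socket.create_connection((host, port), timeout=timeout):
                results.append((True, time.monotonic() - start, ""))
        except OSError as exc:  # includes timeouts and refused connections
            results.append((False, time.monotonic() - start, str(exc)))
        if i + 1 < attempts:
            time.sleep(pause)
    return results


def success_rate(results):
    """Fraction of attempts that connected (0.0 for an empty list)."""
    if not results:
        return 0.0
    return sum(1 for ok, _, _ in results if ok) / len(results)
```

For example, `success_rate(probe("127.0.0.1", 3183))` run on the IS machine right after a restart, and again once the agents have settled, would show whether the failures follow the connection storm. The error texts of the failed attempts indicate whether they were timeouts or refusals.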

             

            Now I am considering moving the IS from Windows to Linux.

            • 3. Re: Patrol agent intermittent disconnections with Integration Service
              Michael Evans

              In our PNet 9.6 + Agent 9.6 environment we have had many examples of repeating disconnects (every 10 minutes). We also use a load balancer (Cisco ACE); however, we were able to show the disconnects occurring even when directly connected to our ISN. We also didn't get anywhere with R&D in resolving the problem, although we supplied plenty of logs, including many debugs from the Patrol agents (ticket 00026586).

               

              We found that the debug showed the text "Flushing Out Stream" at the same time as the disconnects (see below), but we can't tell whether that is a symptom or the cause.

               

              One thing that did appear to help was reducing the amount of monitoring on a host. For example, we had some Exchange servers with a large amount of monitoring that constantly disconnected. When we reduced the monitoring load, the agents no longer had the same number of disconnects.

               

              Overall this is still an outstanding issue for us. As we're deploying 10.5 shortly, we'll see whether any of the agents display the same behavior in our new environment and re-open the case if needed.

               

              From the Patrol ERRS file

              Mon Nov 16 15:45:35 2015: ID 1021fb: W: Connection with Integration Service 172.16.76.29 with port 3183 lost

              Mon Nov 16 15:46:05 2015: ID 1021fa: I: Connection established with Integration Service 172.16.76.29 on port 3183

              From the Debug File at the same timestamps

              SESSION |1447706735| Flushing Out Stream

              SESSION |1447706735| Flushing In Stream
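The debug file stamps its lines with Unix epoch seconds, while the ERRS file uses wall-clock time, so matching the two needs a conversion. A short sketch with the standard library follows. The epoch value 1447706735 is 20:45:35 UTC on Nov 16, 2015, and it lines up with the 15:45:35 ERRS entry if the host clock runs at UTC-5. That offset is inferred from the two excerpts and is not stated anywhere in the logs.

```python
from datetime import datetime, timedelta, timezone


def epoch_to_utc(epoch_seconds):
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def epoch_to_offset(epoch_seconds, utc_offset_hours):
    """Convert Unix epoch seconds to a datetime at a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch_seconds, tz=tz)


print(epoch_to_utc(1447706735).isoformat())         # 2015-11-16T20:45:35+00:00
print(epoch_to_offset(1447706735, -5).isoformat())  # 2015-11-16T15:45:35-05:00
```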

              • 4. Re: Patrol agent intermittent disconnections with Integration Service
                Deepak Saxena

                Hi ,

                 

                We have the same issue with 9.5: Integration Service disconnect alerts come in frequently. As BMC suggested, we are planning to upgrade to 9.6 to fix this, but now I am not sure whether 9.6 will fix it.

                 

                Thanks ,
                Deeps

                • 5. Re: Patrol agent intermittent disconnections with Integration Service
                  EDOARDO SPELTA

                  Hello,

                  This is old stuff, I know, but I'm facing the same problem with the same versions of BPPM and ISNs.

                  Did you solve this somehow?

                  It doesn't seem to be network-related, as it also happens for the PA on the ISN itself.

                  • 6. Re: Patrol agent intermittent disconnections with Integration Service
                    Michael Evans

                    There were many service packs and hot fixes to resolve the issues. I recommend getting up to whatever the latest 9.6 release is. From there look to move to the very latest from BMC as there are other data issues they have addressed such as dropping data on the data queueing side.

                    • 7. Re: Patrol agent intermittent disconnections with Integration Service
                      EDOARDO SPELTA

                      Hi,

                      Support made us apply the latest patches/fixes for the ISNs, but it's 9.6, so there are not many updates out there.

                      We are still investigating the cause though.

                       

                      Where is the data dropping you mention happening? On BPPM, or on the ISN?

                      We still need to determine whether the problem is due to the network (they deny it) or the application (BMC denies it).

                      • 8. Re: Patrol agent intermittent disconnections with Integration Service
                        Michael Evans

                        Hi Edoardo – all of the changes we did to stabilize the system were on the BMC side – we didn’t make any infrastructure changes.

                        The ISN environment patches made a huge difference to our system – along with the BPPM patches.

                         

                        The data dropping occurs on the BPPM side when the queues fill up with the data streamed from the agents. The queue sizes can be tuned, but only so much can be done. If your agents are pumping in more data than BPPM can process, the system will drop the data that cannot be stored (no recovery). The BPPM logs also show the dropped data, so you can alert yourself if it is occurring.
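The mechanism described above (a bounded queue that overflows when producers outpace the consumer) can be illustrated with a toy model. This is not BPPM's implementation; it is a generic sketch with made-up parameter names. It shows why tuning the queue size only delays the drops when the sustained arrival rate exceeds the processing rate.

```python
from collections import deque


def simulate(arrivals_per_tick, processed_per_tick, capacity, ticks):
    """Toy model of a bounded queue: items arriving while it is full are dropped."""
    queue = deque()
    stored = dropped = 0
    for tick in range(ticks):
        # Producers (agents) push data into the queue.
        for n in range(arrivals_per_tick):
            if len(queue) >= capacity:
                dropped += 1  # queue full: the data point is lost for good
            else:
                queue.append((tick, n))
        # The consumer (server) drains what it can this tick.
        for _ in range(min(processed_per_tick, len(queue))):
            queue.popleft()
            stored += 1
    return {"stored": stored, "dropped": dropped, "queued": len(queue)}


# Arrivals exceed processing by 2 per tick: the queue fills, then drops begin.
print(simulate(12, 10, capacity=50, ticks=100))
# {'stored': 1000, 'dropped': 160, 'queued': 40}

# A larger queue only postpones the first drop; the steady-state loss rate is the same.
print(simulate(12, 10, capacity=500, ticks=1000))
```

In this model the only lasting fixes are lowering the arrival rate (less monitoring data per agent) or raising the processing rate.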

                         

                        The result is that you may be missing important data that would otherwise trigger a server-side (BPPM) threshold/alarm. The graphs would be blank during that data gap, or show an abnormally long straight line from the last data point before the gap to the next data point. There are a number of posts in the forums about dropped data.

                         

                        The good news is that BMC re-architected the queuing system in TrueSight, and as of at least 10.7 we no longer see any dropped data.

                         

                        Hope that helps.

                        • 9. Re: Patrol agent intermittent disconnections with Integration Service
                          EDOARDO SPELTA

                          OK, I see what kind of dropped data you mean.

                          My case was Patrol Agent connections to the ISNs being dropped, so I'm trying to figure out whether it was a network problem or an application problem. I still think that BPPM (maybe the agent controller) could be involved in this as much as the network.