3 Replies Latest reply on Mar 20, 2017 9:30 AM by Kris Cichon

    How to troubleshoot hanging discovery run?

    Yanick Girouard

      We've had our nightly discovery run hang twice since last week. It normally completes within 5-6 hours, but this morning I came in and found it had been running for 2 days and 9 hours. If I cancel it, it doesn't cancel and stays at "Cancelling". If I stop discovery and start it again, it gets stuck at "Starting...". If I restart the appliance services, the run still shows as "Cancelling" when the appliance comes back online, but if I then stop and start discovery again, it clears.

       

      I had to do the same thing the last time it occurred, which was the night when a major network switch change happened (some switches were replaced on the network). It might very well be related, but I don't know what to look for to find where it's hanging exactly and why. I've tried looking at the discovery logs, but there's too much info and I can't spot any clue that's meaningful to me. The discovery run summary doesn't show anything regarding the endpoint it was trying to scan when it hung. All I noticed is that it was scanning a single endpoint when it was hung because it showed "1 scanning" under the % complete column of the discovery run view. Seems like it was getting hung on a single host.

       

      If that happens again, what should I be looking at to find which host it's in the process of scanning, and what should I be looking at to troubleshoot why it's hanging?

        • 1. Re: How to troubleshoot hanging discovery run?
          Andrew Waters

          There are a few ways to find out what Discovery is scanning.

           

          The "Unfinished Endpoints" report on the DiscoveryRun for the scan lists the DiscoveryAccesses which have started but not yet completed. If you click through to a DiscoveryAccess you can see which discovery requests have completed against the device.

           

          tw_reasoningstatus gives information about what Reasoning is doing. From that you should be able to tell whether the system is waiting for a discovery result. Exactly what to look for depends upon what the system is doing. Normally it will appear as a non-zero value on the 'Endpoints discovering (maximum)' line. For example, 1 (30) means it is actively performing a discovery request against 1 IP address, out of a maximum of 30.
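To make that check repeatable, here is a minimal Python sketch that reads saved tw_reasoningstatus text and reports which ECA engines have a non-zero "Endpoints discovering" count. It assumes the line layout shown in the output pasted later in this thread; the function names and the sample text are hypothetical, and the real output may differ between versions.

```python
import re

# Hypothetical sample in the layout seen in this thread (3 engines, 1 busy).
SAMPLE = """\
                 Queued requests: |   0 |   0 |   0
               Endpoints waiting: |   0 |   0 |   0
 Endpoints discovering (maximum): |  1 (30) |  0 (30) |  0 (30)
"""

LABEL = re.compile(r"\s*Endpoints? discovering \(maximum\)\s*:")
CELL = re.compile(r"(\d+)\s*\((\d+)\)")


def discovering_per_engine(status_text):
    """Return one (active, maximum) tuple per ECA engine column."""
    for line in status_text.splitlines():
        match = LABEL.match(line)
        if match:
            cells = line[match.end():]
            return [(int(a), int(m)) for a, m in CELL.findall(cells)]
    return []  # line not found: unknown layout or empty input


def busy_engines(status_text):
    """Return the indices of engines that are actively discovering an endpoint."""
    pairs = discovering_per_engine(status_text)
    return [i for i, (active, _maximum) in enumerate(pairs) if active > 0]


if __name__ == "__main__":
    print(busy_engines(SAMPLE))
```

Save the command's output to a file in whatever way you normally run it and pass the contents to busy_engines. An empty list while a run is still open suggests the hang is not a pending discovery request.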

           

          Determining more details can be tricky unless you are running Reasoning in debug. If you are running Discovery in debug you may be able to narrow it down by seeing what Discovery is doing.

          • 2. Re: How to troubleshoot hanging discovery run?
            Yanick Girouard

            Excellent, I'll give that a try. Thanks Andrew !

            • 3. Re: How to troubleshoot hanging discovery run?
              Kris Cichon

              Hello Yanick .. Greetings. Here we are again.

              I ran across your post and found out that we are battling the same enemy.

              Thanks to Andrew we now know how to use tw_reasoningstatus .. thanks Andrew...

              Does your report reflect DISCOVERY ON SCANNER ? (I presume you may have a big discovery out there with Consolidators)

               

              My SCANNERS complete discoveries just fine .. my CONSOLIDATORS are having the problem you just described. ADDM 10.2 (latest update).

               

              I just wonder if anyone has discovered a straightforward method of dumping these hanging runs and having the Reasoning Engine return to its full capacity. And YES .. so far I am using "Your" method of cancelling and restarting services ...

               

              All the Best - Kris Cichon

              BlueCross BlueShield of IL

               

              Just for reference .. this is what my output looks like .. (ALL Discoveries are COMPLETE and then the Engines sit there doing nothing when there is so much to do ;-( ..

               

                                                 |                             ECA Engine
                                                 |       0 |       1 |       2 |       3 |       4 |       5 |       6
                ------------------------------------------------------------------------------------------------------
                                    Event engine |         |         |         |         |         |         |
                                         Status: | running | running | running | running | running | running | running
                                  Queued events: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                    Events processing (maximum): |   0 (1) |   0 (1) |   0 (1) |   0 (1) |   0 (1) |   0 (1) |   0 (1)
                                 Actions loaded: |     187 |     187 |     187 |     187 |     187 |     187 |     187
                                   Rules loaded: |   24073 |   24073 |   24073 |   24073 |   24073 |   24073 |   24073
                                                 |         |         |         |         |         |         |
                                Discovery engine |         |         |         |         |         |         |
                                         Status: | running | running | running | running | running | running | running
                                Queued requests: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                                 Endpoint count: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                              Endpoints waiting: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                Endpoints discovering (maximum): |  0 (30) |  0 (30) |  0 (30) |  0 (30) |  0 (30) |  0 (30) |  0 (30)
                          Asynchronous requests: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                            Endpoint throttling: |   False |   False |   False |   False |   False |   False |   False
                           Providers throttling: |   False |   False |   False |   False |   False |   False |   False

               

                Modeling_Consolidator-02 [10.136.237.44]

               

                                                 |                             ECA Engine
                                                 |       0 |       1 |       2 |       3 |       4 |       5 |       6
                ------------------------------------------------------------------------------------------------------
                                    Event engine |         |         |         |         |         |         |
                                         Status: | running | running | running | running | running | running | running
                                  Queued events: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                    Events processing (maximum): |   0 (1) |   0 (1) |   0 (1) |   0 (1) |   0 (1) |   0 (1) |   0 (1)
                                 Actions loaded: |     187 |     187 |     187 |     187 |     187 |     187 |     187
                                   Rules loaded: |   24073 |   24073 |   24073 |   24073 |   24073 |   24073 |   24073
                                                 |         |         |         |         |         |         |
                                Discovery engine |         |         |         |         |         |         |
                                         Status: | running | running | running | running | running | running | running
                                Queued requests: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                                 Endpoint count: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                              Endpoints waiting: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                Endpoints discovering (maximum): |  0 (30) |  0 (30) |  0 (30) |  0 (30) |  0 (30) |  0 (30) |  0 (30)
                          Asynchronous requests: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                            Endpoint throttling: |   False |   False |   False |   False |   False |   False |   False
                           Providers throttling: |   False |   False |   False |   False |   False |   False |   False

               

                Modeling_Consolidator-03 [10.136.237.45]

               

                                                 |                             ECA Engine
                                                 |       0 |       1 |       2 |       3 |       4 |       5 |       6
                ------------------------------------------------------------------------------------------------------
                                    Event engine |         |         |         |         |         |         |
                                         Status: | running | running | running | running | running | running | running
                                  Queued events: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                    Events processing (maximum): |   0 (1) |   0 (1) |   0 (1) |   0 (1) |   0 (1) |   0 (1) |   0 (1)
                                 Actions loaded: |     187 |     187 |     187 |     187 |     187 |     187 |     187
                                   Rules loaded: |   24073 |   24073 |   24073 |   24073 |   24073 |   24073 |   24073
                                                 |         |         |         |         |         |         |
                                Discovery engine |         |         |         |         |         |         |
                                         Status: | running | running | running | running | running | running | running
                                Queued requests: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                                 Endpoint count: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                              Endpoints waiting: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                Endpoints discovering (maximum): |  0 (30) |  0 (30) |  0 (30) |  0 (30) |  0 (30) |  0 (30) |  0 (30)
                          Asynchronous requests: |       0 |       0 |       0 |       0 |       0 |       0 |       0
                            Endpoint throttling: |   False |   False |   False |   False |   False |   False |   False
                           Providers throttling: |   False |   False |   False |   False |   False |   False |   False

               

               

              Consolidation status:

               

              Run cf7c4b34ed47f3f9c0740a88ed27756d
              Discovery complete: Yes
              Last received data: 2017-03-17 21:14:09.85
              Endpoints consolidated: 112192 done, 2450536 waiting

               

              Run 6b04eb34eca8aed1d4c20a86ecb30bd5
              Discovery complete: Yes
              Last received data: 2017-03-16 20:28:05.60
              Endpoints consolidated: 9743 done, 379348 waiting

               

              Run cf7c4b34ec7eca4c5c420a88ed27756d
              Discovery complete: Yes
              Last received data: 2017-03-16 21:20:33.08
              Endpoints consolidated: 354859 done, 2207869 waiting

               

              Run 01d20134ecca373047380a0581240b85
              Discovery complete: Yes
              Last received data: 2017-03-16 18:48:38.11
              Endpoints consolidated: 1959 done, 194641 waiting
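
To put numbers on how far behind consolidation is, here is a rough Python sketch that parses "Consolidation status" text like the blocks above and reports the done and waiting counts per run. The function name is hypothetical and the parsing assumes exactly the layout pasted in this post (a "Run <id>" line followed by an "Endpoints consolidated: N done, M waiting" line); other versions may print it differently.

```python
import re

# Two of the runs pasted above, used as sample input.
SAMPLE = """\
Run cf7c4b34ed47f3f9c0740a88ed27756d
Discovery complete: Yes
Last received data: 2017-03-17 21:14:09.85
Endpoints consolidated: 112192 done, 2450536 waiting

Run 6b04eb34eca8aed1d4c20a86ecb30bd5
Discovery complete: Yes
Last received data: 2017-03-16 20:28:05.60
Endpoints consolidated: 9743 done, 379348 waiting
"""

RUN = re.compile(r"Run (\w+)$")
COUNTS = re.compile(r"Endpoints consolidated: (\d+) done, (\d+) waiting$")


def consolidation_backlog(text):
    """Return a list of (run_id, done, waiting) tuples in the order they appear."""
    runs = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        run_match = RUN.match(line)
        if run_match:
            current = run_match.group(1)
            continue
        count_match = COUNTS.match(line)
        if count_match and current is not None:
            runs.append((current, int(count_match.group(1)), int(count_match.group(2))))
            current = None
    return runs


if __name__ == "__main__":
    for run_id, done, waiting in consolidation_backlog(SAMPLE):
        share = 100.0 * done / (done + waiting)
        print(f"{run_id}: {done} done, {waiting} waiting ({share:.1f}% consolidated)")
```

Comparing the waiting counts from two snapshots taken some time apart shows whether the consolidators are making any progress at all or are fully stalled.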