15 Replies. Latest reply on Dec 7, 2018 4:13 AM by Mark Lemar

    Failed to start reasoning services

    Turshit Singh

      Hello Experts,

       

      We have built a new ADDM 11.3 appliance with no TKU or data loaded.

      We are unable to start the tideway services after rebooting the server.

      Startup gets stuck at:

      Starting Reasoning service: Loading rules and configuration

       

      We have rebooted, checked the NIC configuration, and killed the reasoning services, but we are still unable to start the services.

       

      Thu Dec  6 14:40:21 2018 : tw_svc_reasoning started.

      omniORB: (0) 2018-12-06 14:40:21.984745: Failed to bind to address :: port 25035. Address in use?

      omniORB: (0) 2018-12-06 14:40:21.984774: Error: Unable to create an endpoint of this description: giop:tcp::25035

      Traceback (most recent call last):

        File "./main.py", line 187, in <module>

        File "./main.py", line 163, in main

        File "./reasoninginterface.py", line 2241, in init

        File "/usr/tideway/lib/python2.7/site-packages/omniORB/CORBA.py", line 389, in resolve_initial_references

          return self._obj.resolve_initial_references(identifier)

      omniORB.CORBA.INITIALIZE: CORBA.INITIALIZE(omniORB.INITIALIZE_TransportError, CORBA.COMPLETED_NO)

       

      How can we start the services?

        • 1. Re: Failed to start reasoning services
          Brian Morris

          Are you running on a cluster? If so, you need to use tw_cluster_control when you start services.

           

          Stop all the services using this command and then start them again. A reboot should also help, but the root cause is probably the use of tw_service_control instead of tw_cluster_control.

          • 2. Re: Failed to start reasoning services
            Turshit Singh

            No, it is a standalone appliance

            • 3. Re: Failed to start reasoning services
              Brian Morris

              This page is old, but it describes a similar issue. It seems that something is leaving a reasoning service behind on the port, so it conflicts with the new one. Did you recently change the IP?

               

              ADDM: Reasoning service will not start: Failed to bind to address

              • 4. Re: Failed to start reasoning services
                Turshit Singh

                Thanks Brian,

                I already tried that one and am still unable to start the services.

                • 5. Re: Failed to start reasoning services
                  Brian Morris

                  OK, so a standalone system. What is the output of tw_service_control --status?

                   

                  What is the output of netstat -ltnp | grep -w ':25035'?

                   

                  Does the issue persist when you try tw_service_control --restart?

                   

                  If that doesn't work, then do:

                   

                  tw_service_control --stop

                   

                  Verify that the services are all stopped:  tw_service_control --status

                   

                  Reboot the appliance
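
A minimal sketch of the port check above, for anyone who wants the owning PID without reading the netstat columns by eye. The function name listener_on_port is made up for illustration; it only parses `netstat -ltnp` output on stdin and assumes the usual net-tools column layout shown later in this thread.

```shell
# Sketch only: listener_on_port is a hypothetical helper, not an ADDM tool.
# Reads `netstat -ltnp` output on stdin and prints "PID PROGRAM" for every
# TCP listener on the given port.
listener_on_port() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            # Field 4 is the local address; the port follows the last colon,
            # which also copes with IPv6 forms such as ":::25035".
            n = split($4, parts, ":")
            if (parts[n] == port) {
                # Field 7 looks like "3378/python"; it is "-" when netstat
                # cannot see the owner (run as root to see all processes).
                split($7, owner, "/")
                print owner[1], owner[2]
            }
        }'
}

# Usage on the appliance (not executed here):
#   netstat -ltnp 2>/dev/null | listener_on_port 25035
```

If this prints a PID while the services are supposed to be stopped, that process is the one holding port 25035.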

                  • 6. Re: Failed to start reasoning services
                    Turshit Singh

                    [tideway@*************** ~]$ tw_service_control --status

                    ADDM application services are starting

                        Security service:                        2561          [  OK  ]

                        Model service:                           2612          [  OK  ]

                        Vault service:                           2888          [  OK  ]

                        Discovery service:                       2936          [  OK  ]

                        Mainframe Provider service:              3006          [  OK  ]

                        SQL Provider service:                    3090          [  OK  ]

                        CMDB Sync (Exporter) service:            3194          [  OK  ]

                        CMDB Sync (Transformer) service:         3277          [  OK  ]

                        Reasoning service:                       3378          [  OK  ]

                        Tomcat service:                                        [ STOP ]

                        Reports service:                                       [ STOP ]

                        External API service:                                  [ STOP ]

                        Application Server service:                            [ STOP ]

                     

                    [tideway@************** ~]$ netstat -ltnp | grep -w ':25035'

                    (Not all processes could be identified, non-owned process info

                    will not be shown, you would have to be root to see it all.)

                    tcp        0      0 :::25035                    :::*                        LISTEN      3378/python

                     

                     

                    Yes, the issue persists after running tw_service_control --restart.

                    • 7. Re: Failed to start reasoning services
                      Brian Morris

                      What changes did you make to the appliance prior to this happening? Did it ever work? Did you configure the networking through the netadmin shell account?

                       

                      Try this too; it has worked for me before:

                      tw_service_control --stop

                       

                      Confirm that the services are all stopped:  tw_service_control --status

                       

                      Reboot the appliance

                      • 8. Re: Failed to start reasoning services
                        Brice-Emmanuel Loiseaux

                        "Starting Reasoning service: Loading rules and configuration" is a normal startup step that could take some time. When you see this, you can quickly check that reasoning is doing something by tailing the log (tail -f /usr/tideway/log/tw_svc_reasoning.log). You should be able to see the progress of rules loading.

                         

                        "omniORB: (0) 2018-12-06 14:40:21.984745: Failed to bind to address :: port 25035. Address in use?" usually means you try to start reasoning service but a process is still present that you need to manually kill to solve the problem. As suggested by Brian, a reboot also cleans this up.

                        • 9. Re: Failed to start reasoning services
                          Andrew Waters

                          This means that an old reasoning service has failed to shut down correctly. Kill all the old reasoning processes, e.g. run ps axw | grep reasoning to get the PIDs and kill them.

                           

                          If you still get it after a reboot, then something is dying as reasoning is starting, and it tries unsuccessfully to restart. In that case, what is reported in tw_svc_reasoning.out and tw_svc_reasoning.log?
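
The "get the PIDs and kill them" step can be sketched as below. The helper name reasoning_pids is hypothetical; it simply filters `ps axw` output for lines mentioning "reasoning" and drops the grep process itself, so check the list before killing anything.

```shell
# Sketch only: reasoning_pids is a made-up helper name.
# Reads `ps axw` output on stdin and prints the PID (first column) of every
# line that mentions "reasoning", skipping any grep process in the listing.
reasoning_pids() {
    awk '/reasoning/ && !/grep/ { print $1 }'
}

# Usage on the appliance (not executed here):
#   ps axw | reasoning_pids                  # review the list first
#   ps axw | reasoning_pids | xargs -r kill  # then send the default TERM signal
```

Running the first usage line again afterwards should print nothing; if a PID is still listed, that process did not exit on the default signal.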

                          • 10. Re: Failed to start reasoning services
                            Mark Lemar

                            I'm working with Turshit on this as he is now away.  I killed all reasoning processes & rebooted as advised by Andrew, but the issue remains.

                             

                            Not much in tw_svc_reasoning.out:

                             

                            Thu Dec  6 17:37:56 2018 : tw_svc_reasoning started.

                             

                            Lots of info in tw_svc_reasoning.log, particularly in relation to what I assume is a pattern for IBM Informix Dynamic Server?

                             

                            tail tw_svc_reasoning.log

                              File "./ami.py", line 162, in _callMethodException

                              File "/usr/tideway/lib/python2.7/site-packages/omniORB/ami.py", line 80, in raise_exception

                                self._poller.raise_exception()

                            ECAEngineError: ReasoningCORBA.ECAEngineError(reason='Error loading rule module "generated_code.IBM.InformixDynamicServer"')

                            139734996543232: 2018-12-06 17:46:43,826: reasoning.ami: ERROR: eeReloadRules failed

                            Traceback (most recent call last):

                              File "./ami.py", line 162, in _callMethodException

                              File "/usr/tideway/lib/python2.7/site-packages/omniORB/ami.py", line 80, in raise_exception

                                self._poller.raise_exception()

                            ECAEngineError: ReasoningCORBA.ECAEngineError(reason='Error loading rule module "generated_code.IBM.InformixDynamicServer"')
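
To see whether only one rule module is failing or several, the repeated errors in the log can be summarised. This is a rough sketch: failed_rule_modules is a hypothetical helper name, and it relies only on the 'Error loading rule module "..."' wording visible in the excerpt above.

```shell
# Sketch only: failed_rule_modules is a made-up helper name.
# Reads a reasoning log on stdin and prints "COUNT MODULE" for each distinct
# module named in an 'Error loading rule module "..."' message.
failed_rule_modules() {
    grep -o 'Error loading rule module "[^"]*"' \
        | sed 's/.*"\([^"]*\)"$/\1/' \
        | sort | uniq -c \
        | awk '{ print $1, $2 }'   # normalise the padding that uniq -c adds
}

# Usage on the appliance (not executed here):
#   failed_rule_modules < /usr/tideway/log/tw_svc_reasoning.log
```

A single module in the output points at one generated pattern; many different modules would suggest a wider problem with the generated code.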

                            • 11. Re: Failed to start reasoning services
                              Bob Anderson

                              I'll defer to Andrew, but if this is a brand new standalone appliance with no data, how about just running tw_model_wipe (with appropriate options) to clear all the data and reload the base taxonomy and TKU?

                              • 12. Re: Failed to start reasoning services
                                Andrew Waters

                                Okay - for some reason the system cannot load the code generated for the IBM.InformixDynamicServer pattern. The tw_svc_eca_engine.log may tell you why. Given the message, the appliance does have a TKU. Something else must have been done, because otherwise everybody would have problems with 11.3.

                                 

                                To get the system working again:

                                 

                                * On the command line run tw_svc_reasoning --deactivate-patterns

                                * Once the "tw_svc_reasoning started" message appears Ctrl+C out of it

                                * Start the services as normal.

                                 

                                This deactivates all patterns.

                                • 13. Re: Failed to start reasoning services
                                  Mark Lemar

                                  The tw_svc_eca_engine.log contains the following:

                                   

                                  E04-140049708971776: 2018-12-07 09:12:27,728: reasoning.ecaengine.rule_loader: ERROR: Error loading rule module "generated_code.IBM.InformixDynamicServer"

                                  Traceback (most recent call last):

                                    File "./rule_loader.py", line 65, in _loadModule

                                    File "./hooked_importer.py", line 221, in load_module

                                  IOError: [Errno 13] Permission denied: '/usr/tideway/var/code/generated_code/IBM/InformixDynamicServer/__init__.py'

                                  E04-140049708971776: 2018-12-07 09:12:27,728: reasoning.ecaengine.ecaengine: CRITICAL: Error loading rules: Error loading rule module "generated_code.IBM.InformixDynamicServer"
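
One way to follow up on the "Permission denied" line is to pull the affected paths out of the log and inspect their ownership. This is only a sketch: denied_paths is a hypothetical helper name, and the log location in the usage comment is assumed to be the same directory as tw_svc_reasoning.log mentioned earlier in the thread.

```shell
# Sketch only: denied_paths is a made-up helper name.
# Reads a log on stdin and prints each distinct path quoted in a
# "Permission denied: '...'" message.
denied_paths() {
    sed -n "s/.*Permission denied: '\([^']*\)'.*/\1/p" | sort -u
}

# Usage on the appliance (not executed here; the log path is an assumption):
#   denied_paths < /usr/tideway/log/tw_svc_eca_engine.log | xargs -r ls -l
#   id    # compare the file owner and mode with the user running the services
```

The thread does not say what caused the permission problem, so the listing is a starting point for the support case rather than a fix.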

                                   

                                  Running the tw_svc_reasoning --deactivate-patterns command produces the following, which doesn't look right, as I didn't need to Ctrl+C out of it.

                                   

                                  tw_svc_reasoning --deactivate-patterns

                                  Fri Dec  7 09:13:53 2018 : tw_svc_reasoning started.

                                  omniORB: (0) 2018-12-07 09:13:53.747141: Failed to bind to address :: port 25035. Address in use?

                                  omniORB: (0) 2018-12-07 09:13:53.747164: Error: Unable to create an endpoint of this description: giop:tcp::25035

                                  Traceback (most recent call last):

                                    File "./main.py", line 187, in <module>

                                    File "./main.py", line 163, in main

                                    File "./reasoninginterface.py", line 2241, in init

                                    File "/usr/tideway/lib/python2.7/site-packages/omniORB/CORBA.py", line 389, in resolve_initial_references

                                      return self._obj.resolve_initial_references(identifier)

                                  omniORB.CORBA.INITIALIZE: CORBA.INITIALIZE(omniORB.INITIALIZE_TransportError, CORBA.COMPLETED_NO)

                                   

                                  I see the "tw_svc_reasoning started" message, but trying to restart the services produces the following:

                                   

                                  sudo /sbin/service tideway start

                                  Starting local ADDM application services

                                  Waiting for other service operations to complete ...

                                   

                                  Before going on leave, Turshit had also raised support case #00625834 about this issue.  I'll reference this community topic on the support ticket.

                                  • 14. Re: Failed to start reasoning services
                                    Andrew Waters

                                    You have to kill all the reasoning processes first, before running tw_svc_reasoning --deactivate-patterns.
