16 Replies. Latest reply on Sep 27, 2018 6:28 PM by Edison Pioneer

    ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation

    Edison Pioneer

      Hi folks

       

      I am getting ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation whenever I try to access Data Management → Job Console on the fly out menu. However, when I click on Spreadsheet Management, it opens just fine. My first action was to check RAM usage, which was considerably high (about 96% on the primary server and about 88% on the secondary; I don't remember the figures for the 3rd and 4th servers). I rebooted all the servers in that environment, but the issue persists.

       

      Having gone through the existing literature on this particular error (ARERR 93), I realize there could be many resolutions for this, but we started by gathering API and SQL logs (which I did).

       

      The first question bugging me right now is - why should I get this error only when I am clicking on Data Management → Job Console on the fly out menu, and why not on Spreadsheet Management?

       

      I parsed the API+SQL logs through Log Analyzer. I had collected two different sets of logs, and this is what I got from the Log Analyzer.

       

      I can see from both the forms that DMT:Services is taking up a lot of time, almost 4 minutes.

       

      I Googled for DMT:Services but was surprised to see that there's next to nothing on DMT:Services.

       

      Please guide me here.

       

      Any suggestion would be highly appreciated,

       

      Thanks in advance

      Memento Mori.

        • 1. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
          Robert Poulos

          To debug this further you will need to see the log output for the long running SVE (ServiceEntry) API calls that are running against the DMT:Services form.  The most helpful log output will be a combined API/SQL/Filter log.  Execute the operation with the logging on and then examine all the SQL and Filter log lines for that API call, identified by the RPC ID for that specific call.  With that information you should be able to determine where the delay is that is causing the API to run for an unusually long time, leading to the timeout.  Maybe there is a plug-in call somewhere within the filter sequence that cannot be completed.  You can post the log file for that API call if you need further assistance.  Good luck -
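
As a rough sketch of the step described above (pulling out every log line that belongs to one API call), the snippet below groups the lines of a combined API/SQL/Filter log by their `<RPC ID: ...>` tag. The tag layout is assumed from the log excerpts posted further down in this thread; the function names and the shortened sample lines are illustrative only and are not part of any BMC tooling.

```python
import re
from collections import defaultdict

# Matches the "<RPC ID: 0000053193>" tag written on each log line
# (layout assumed from the excerpts posted in this thread).
RPC_ID = re.compile(r"<RPC ID:\s*(\d+)>")


def group_by_rpc(lines):
    """Return {rpc_id: [lines]}, keeping the lines in their original order."""
    groups = defaultdict(list)
    for line in lines:
        match = RPC_ID.search(line)
        if match:
            groups[match.group(1)].append(line.rstrip("\n"))
    return dict(groups)


def lines_for_rpc(lines, rpc_id):
    """Return only the lines that carry the given RPC ID."""
    return group_by_rpc(lines).get(rpc_id, [])


# Shortened, hypothetical sample lines in the same layout.
sample = [
    '<FLTR> <TID: 0000000439> <RPC ID: 0000053193> <Queue: Fast> Checking "A" (510)',
    '<SQL > <TID: 0000000515> <RPC ID: 0000053276> <Queue: Fast> INSERT INTO T3749 ...',
    '<FLTR> <TID: 0000000439> <RPC ID: 0000053193> <Queue: Fast>    --> Passed -- perform actions',
]
selected = lines_for_rpc(sample, "0000053193")
print(len(selected))
```

For a real log, the same function could be fed an open file object instead of the sample list, since it only iterates over lines.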

          • 2. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
            Edison Pioneer

            Hi Robert

             

            Thanks for taking interest in this!

             

            I have already gathered API+FLTR+SQL logs for this. I shall examine the FLTR and SQL log lines for that particular API call (SVE - ServiceEntry).

             

            Having said this, I am afraid the issue was not captured in the logs while I was reproducing it. I need to check further.

             

            Thanks again

            • 3. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
              Andreas Mitterdorfer

              Have you set the server plugin default timeout to a very high value?

              Enable client side logging (al/api/filter/sql - make sure your user is in the client side logging group and the group is not set to public) and open the job console. It should give you an indication, where the workflow is stuck.

              • 4. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
                Edison Pioneer

                Hi Robert

                 

                I reproduced the issue and captured API + FLTR + SQL logs while doing so. Please go through them and let me know what you find.

                 

                Thanks.

                • 5. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
                  Robert Poulos

                  This workflow was firing during the delay:

                   

                      Line 3810: <FLTR> <TID: 0000000439> <RPC ID: 0000053193> <Queue: Fast      > <Client-RPC: 390620   > <USER: g704053                                      > <Overlay-Group: 1         > /* Tue Sep 25 2018 12:17:37.3570 */ <Filter Level:0 Number Of Filters:30> Checking "DMT:DSV:Service~DMT_PRE_CHECK_SG" (510)

                      Line 3811: <FLTR> <TID: 0000000439> <RPC ID: 0000053193> <Queue: Fast      > <Client-RPC: 390620   > <USER: g704053                                      > <Overlay-Group: 1         >    --> Passed -- perform actions

                      Line 3812: <FLTR> <TID: 0000000439> <RPC ID: 0000053193> <Queue: Fast      > <Client-RPC: 390620   > <USER: g704053                                      > <Overlay-Group: 1         >         0 : Set Fields

                      Line 48360: <FLTR> <TID: 0000000439> <RPC ID: 0000053193> <Queue: Fast      > <Client-RPC: 390620   > <USER: g704053                                      > <Overlay-Group: 1         >               z1D_ServerGroup (304397811) = 0

                      Line 48361: <FLTR> <TID: 0000000439> <RPC ID: 0000053193> <Queue: Fast      > <Client-RPC: 390620   > <USER: g704053                                      > <Overlay-Group: 1         >               z1D_Char01 (301325300) = Run finished with failed checks

                   

                      Line 48362: <FLTR> <TID: 0000000439> <RPC ID: 0000053193> <Queue: Fast      > <Client-RPC: 390620   > <USER: g704053                                      > <Overlay-Group: 1         > /* Tue Sep 25 2018 12:19:44.0390 */ <Filter Level:0 Number Of Filters:31> Checking "DMT:DSV:Service~GetProgressCounts_0Total" (515)

                   

                  So you'll want to examine the filter DMT:DSV:Service~DMT_PRE_CHECK_SG in Developer Studio to determine what it's designed to do.  What is the source of the Set Fields data assignment that seems to be taking 2 minutes?
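
One way to spot a stall like this without reading the log by eye is to compare consecutive timestamps. The sketch below is only an illustration: it assumes the `/* Tue Sep 25 2018 12:17:37.3570 */` timestamp layout seen in the excerpt above, and `parse_stamp` and `largest_gap` are hypothetical helper names.

```python
import re
from datetime import datetime

# Timestamp comment as it appears in the excerpt above,
# e.g. "/* Tue Sep 25 2018 12:17:37.3570 */".
STAMP = re.compile(
    r"/\* \w{3} (\w{3}) (\d{1,2}) (\d{4}) (\d{2}):(\d{2}):(\d{2})\.(\d{4}) \*/"
)
MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def parse_stamp(line):
    """Return the datetime embedded in a log line, or None if there is none."""
    m = STAMP.search(line)
    if not m:
        return None
    mon, day, year, hh, mm, ss, frac = m.groups()
    # The fraction has four digits, so multiply by 100 to get microseconds.
    return datetime(int(year), MONTHS[mon], int(day),
                    int(hh), int(mm), int(ss), int(frac) * 100)


def largest_gap(lines):
    """Return (seconds, earlier_line, later_line) for the biggest time gap."""
    stamped = [(parse_stamp(l), l) for l in lines]
    stamped = [(t, l) for t, l in stamped if t is not None]
    best = (0.0, None, None)
    for (t1, l1), (t2, l2) in zip(stamped, stamped[1:]):
        gap = (t2 - t1).total_seconds()
        if gap > best[0]:
            best = (gap, l1, l2)
    return best


# The two timestamped filter lines from the excerpt, shortened.
sample = [
    '/* Tue Sep 25 2018 12:17:37.3570 */ <Filter Level:0 Number Of Filters:30> '
    'Checking "DMT:DSV:Service~DMT_PRE_CHECK_SG" (510)',
    '/* Tue Sep 25 2018 12:19:44.0390 */ <Filter Level:0 Number Of Filters:31> '
    'Checking "DMT:DSV:Service~GetProgressCounts_0Total" (515)',
]
gap, before, after = largest_gap(sample)
print(f"{gap:.3f} s")
```

On the two lines quoted above this reports a gap of roughly 126.7 seconds, which corresponds to the 2-minute delay in the Set Fields action.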

                   

                  If you look at the last thing in the SQL log before the API call resumes you see this:

                   

                  <API > <TID: 0000000515> <RPC ID: 0000053276> <Queue: Fast  > <Client-RPC: 390620   > <USER: Remedy Application Service               > <Overlay-Group: 1     > /* Tue Sep 25 2018 12:19:44.0310 */ +CE  ARCreateEntry -- schema ConfigCheckerLog from Unidentified Client (protocol 24) at IP address 10.164.244.112 using RPC // :q:0.1s

                   

                  <SQL > <TID: 0000000515> <RPC ID: 0000053276> <Queue: Fast  > <Client-RPC: 390620   > <USER: Remedy Application Service               > <Overlay-Group: 1     > /* Tue Sep 25 2018 12:19:44.0340 */ INSERT INTO T3749(C46007,C46008,C46009,C46010,C46011,C46012,C2,C7,C8,C5,C3,C6,C1) VALUES('ERROR','FAIL','UDM','Check UDM plugin config','The check could not query the server parameters: HW-ITSMQA3 ','Please check if AR Server is running and is reachable on specified port','Remedy Application Service',0,'Remedy Application Service','Remedy Application Service',1537892384,1537892384,'000000000000920')

                   

                  So again I'm thinking there's a plug-in involved and that plug-in may have made this ARCreateEntry call as it was completing what it was supposed to do.  Follow these clues and see if it leads to a better understanding of what's occurring.  Good luck -
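
To make an INSERT like the one above easier to read, the following sketch pairs each column with its value and converts the date columns from epoch seconds. The labels for C1 to C8 follow the standard AR System core field IDs; the C46xxx columns are specific to the form and are left unnamed. The regex-plus-`csv` parsing is a simplification of my own, not a full SQL parser, and `decode_insert` is a hypothetical helper name.

```python
import csv
import re
from datetime import datetime, timezone

# Standard AR System core field IDs; other columns are form-specific.
CORE_FIELDS = {
    "C1": "Request ID", "C2": "Submitter", "C3": "Create Date",
    "C5": "Last Modified By", "C6": "Modified Date",
    "C7": "Status", "C8": "Short Description",
}

INSERT = re.compile(r"INSERT INTO (\w+)\((.*?)\) VALUES\((.*)\)\s*$", re.S)


def decode_insert(sql):
    """Return (table, {column_label: value}) for a simple single-row INSERT."""
    m = INSERT.search(sql)
    if not m:
        raise ValueError("not a recognised INSERT statement")
    table, cols, vals = m.groups()
    columns = [c.strip() for c in cols.split(",")]
    # csv handles the single-quoted, comma-separated value list.
    values = next(csv.reader([vals], quotechar="'", skipinitialspace=True))
    if len(columns) != len(values):
        raise ValueError("column/value count mismatch")
    row = {}
    for col, val in zip(columns, values):
        if col in ("C3", "C6"):  # date fields hold epoch seconds
            val = datetime.fromtimestamp(int(val), tz=timezone.utc).isoformat()
        row[CORE_FIELDS.get(col, col)] = val
    return table, row


# Shortened version of the statement quoted above.
sql = ("INSERT INTO T3749(C46007,C46008,C46011,C2,C7,C3,C1) "
       "VALUES('ERROR','FAIL',"
       "'The check could not query the server parameters: HW-ITSMQA3 ',"
       "'Remedy Application Service',0,1537892384,'000000000000920')")
table, row = decode_insert(sql)
print(table, row["Create Date"])
```

The Create Date value 1537892384 decodes to 16:19:44 UTC on Sep 25 2018, which would line up with the 12:19:44 log timestamp if the server clock is four hours behind UTC (an inference, not something stated in the log).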

                  • 6. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
                    Edison Pioneer

                    Hi Robert

                     

                    A million thanks for such a quick response!

                     

                    I will follow your directions and let you know how it goes.

                     

                    Thanks again

                    • 7. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
                      Edison Pioneer

                      Hi Robert

                       

                      I opened DMT:DSV:Service~DMT_PRE_CHECK_SG on my test server (not the QA server, where I am facing the issue) and found that the plugin involved is ARSYS.FILTERAPI.CONFIGCHECK.

                       

                      Then I found this KA (which, surprisingly, was created just today): Unable to Access Job Console

                       

                      However, I went through my QA servers (where the problem is happening, and which are different from my test server) and found that the plugin entry is intact in my ar.conf.

                       

                      I also found this: Cannot open Data Management - Job Console encounter error "Server group member 'XXXX' is missing from UDM:Config", but dismissed it as different from my issue.

                       

                      The above screenshot is from my test server.

                       

                      Would you kindly guide me on how I should proceed?

                       

                      Thanks

                      • 8. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
                        Carl Wilson

                        Hi,

                        Disable the filter that is running the check.

                        The process of the check is convoluted at best via the plugin, so as long as you have the correct configuration in place to allow the DMT to run you can skip the pre-checker which stops the Console from opening.

                         

                        Cheers

                        Carl

                        • 9. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
                          Robert Poulos

                          As mentioned by Carl, disabling the filter is certainly an option.  If you want to pursue the error further, I'd suggest looking at the entries in the ConfigCheckerLog form to see if you can resolve some of the issues the ConfigChecker plug-in is complaining about.  From the SQL, you can see various log records being created noting issues like "Server group member 'HW-ITSMQA3' is missing from the UDM:Config form" and "Check UDM plugin config','The check could not locate an entry for 'HW-ITSMQA1' in the UDM:RAppPassword form".  Clearing up those issues might make the lengthy connection attempt go away and allow the filter to complete in a timely manner.  The bottom line is the UDM configuration seems to be causing the pre-checker problems, but as Carl suggested, the checking process may not be designed very well.  Good luck -

                          • 10. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
                            Edison Pioneer

                            Hi Robert

                             

                            I have disabled the filter - DMT:DSV:Service~DMT_PRE_CHECK_SG, but things haven't changed.

                             

                            1) We have 4 AR servers in our QA environment: Server1, Server2, Server3 and Server4. There are also 2 mid tier servers: MT1 and MT2.

                            I disabled the filter only on Server1, expecting that doing so would also disable it on the other three servers, since they are all in a server group. Apparently, that did not happen.

                            Do I need to disable that filter on every AR server, including the mid tier servers?

                             

                            4) Furthermore, I opened UDM:Config and found none of our AR servers or mid tier servers listed there, apart from a single AR server entry that my colleague appears to have added in the hope of getting the Job Console started. That does not seem to be working. I also opened the UDM:PermissionInfo form, and its records relate only to AR Server 1.

                             

                            Seeking your advice again. Please reply.

                             

                            Thanks in advance

                            • 11. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
                              Edison Pioneer

                              Hi Carl

                               

                              Please check my latest update above on the unresponsive Job Console issue. I need your advice.

                               

                              Thanks in advance

                              • 12. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
                                Robert Poulos

                                <Do I need to disable that filter on every AR server?>

                                 

                                No. You make the change on one server only (the one running the Admin operation), and the others will be notified.  If the other servers are still running that filter then you have a communication problem between your servers.  That would require another troubleshooting effort to understand.  If you can restart those other AR Servers the filter change should take effect.

                                • 13. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
                                  Edison Pioneer

                                  Thanks for your response, Robert.

                                   

                                  Of course, as you say, understanding why the filters haven't changed on other servers will take another troubleshooting effort.

                                   

                                  However, after disabling the filter on ARServer1, I still get ARERR 93 while trying to access Data Management → Job Console from the fly out menu.

                                  What do you think about that? Shouldn't that work, at least?

                                   

                                  Thanks again

                                  • 14. Re: ARERR 93 → Timeout during data retrieval due to busy server -- retry the operation
                                    Robert Poulos

                                    The next step would be to gather the API/SQL/Filter log segment like you did before to verify the filter is no longer executing and see what is taking so much time to execute.  There's no way to draw any conclusions without that information.
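
A quick way to check whether the filter is still being evaluated in a fresh log is to count the `Checking "<filter name>"` lines per filter. The sketch below assumes the filter log layout shown earlier in this thread; `filters_checked` is a hypothetical helper, and the sample lines are shortened. If the filter is really disabled, one would expect its name not to show up at all.

```python
import re
from collections import Counter

# A filter evaluation line contains:  Checking "<filter name>" (<number>)
CHECKING = re.compile(r'<FLTR>.*Checking "([^"]+)" \((\d+)\)')


def filters_checked(lines):
    """Count how many times each filter was evaluated in the log lines."""
    counts = Counter()
    for line in lines:
        m = CHECKING.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


# Shortened, hypothetical sample in the layout of the earlier excerpt.
sample = [
    '<FLTR> <TID: 0000000439> <RPC ID: 0000053193> '
    'Checking "DMT:DSV:Service~DMT_PRE_CHECK_SG" (510)',
    '<FLTR> <TID: 0000000439> <RPC ID: 0000053193>    --> Passed -- perform actions',
    '<FLTR> <TID: 0000000439> <RPC ID: 0000053193> '
    'Checking "DMT:DSV:Service~GetProgressCounts_0Total" (515)',
]
counts = filters_checked(sample)
still_running = counts["DMT:DSV:Service~DMT_PRE_CHECK_SG"] > 0
print(still_running)
```

Running this against logs from each server in the group would also show which servers, if any, have not picked up the change.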
