28 Replies Latest reply: Nov 12, 2013 3:45 AM by Jan Sierens
  • 15. Remedy failover
    pega

    Check your conf file to see if there's a Db-Connection-Retries entry; add it if it's not there.

    100 is default.
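
    A minimal sketch of that check, assuming the config uses "Name: value" lines and "#" for comments. The function name get_setting and the sample values are made up for illustration:

```python
def get_setting(cfg_text, key):
    """Return the value of the first 'key: value' line, or None if absent."""
    for line in cfg_text.splitlines():
        stripped = line.strip()
        # Assumption: blank lines and lines starting with '#' carry no settings.
        if not stripped or stripped.startswith("#"):
            continue
        name, sep, value = stripped.partition(":")
        if sep and name.strip().lower() == key.lower():
            return value.strip()
    return None


# Hypothetical config excerpt; the server name is invented.
sample = """\
Server-Name: remedy01
Private-RPC-Socket: 390620 2 2
"""

retries = get_setting(sample, "Db-Connection-Retries")
print("Db-Connection-Retries is missing" if retries is None else retries)
```

    In practice you would read the real file into cfg_text instead of using the inline sample.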

  • 16. Remedy failover
    Laurent Matheo

    I see, thanks. I rather thought it would have been something on the SQL Server side, but well, as long as it's working...

  • 17. Re: Remedy failover
    Jan Sierens

    Did you manage to solve your problems?

     

    We are facing the same issue: since the upgrade to 7.6.04 SP3, the server crashes with:

    "Approaching physical stack limit. (ARERR 8749)"

     

    On 7.6.04 SP2 we saw the errors "[DBNETLIB][ConnectionWrite (send()).]General network error. Check your network documentation. (SQL Server 11)" but the AR System didn't crash. These errors appear at very random times, but here they are more likely at exactly 11:50.

     

    sqlserver11.png

     

    We tried to change Db-Connection-Retries, but it didn't solve our problem.

     

    We noticed some other strange Remedy behaviour.  The queues are configured as follows:

    Private-RPC-Socket: 390601 1 1

    Private-RPC-Socket: 390603 1 1

    Private-RPC-Socket: 390620 2 2

    Private-RPC-Socket: 390635 2 2

    By definition

    390620 = Fast

    390635 = List

     

    But the log shows:

    <API > <TID: 0000000284> <RPC ID: 0000028576> <Queue: List > <Client-RPC: 390620 > <USER: Remedy Application Service > <Overlay-Group: 1 > /* vr jun 01 2012 05:47:14.4940 */-GES OK

    <SQL > <TID: 0000004564> <RPC ID: 0000028577> <Queue: Fast > <Client-RPC: 390620 > <USER: Remedy Application Service > <Overlay-Group: 0 > /* vr jun 01 2012 05:47:15.9900 */SELECT Feature, NumberField, LicenseQualifier, ExpireDate, LicenseKey FROM license_cache

     

    Both the Fast and the List queue run on the same RPC socket, 390620, while the List queue should run on 390635.
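
    A rough sketch for spotting such lines in a larger log, assuming the "<Queue: ...>" and "<Client-RPC: ...>" fields always look like the two lines quoted above. The mapping is only the one given in this post, and the names EXPECTED_QUEUE and queue_mismatches are made up:

```python
import re

# Mapping as stated in the post; other program numbers are simply ignored.
EXPECTED_QUEUE = {"390620": "Fast", "390635": "List"}

LINE_RE = re.compile(
    r"<Queue:\s*(?P<queue>[^>]*?)\s*>.*?<Client-RPC:\s*(?P<rpc>\d+)\s*>"
)


def queue_mismatches(lines):
    """Yield (rpc, expected_queue, logged_queue) where the log disagrees with the mapping."""
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        expected = EXPECTED_QUEUE.get(match.group("rpc"))
        if expected is not None and match.group("queue") != expected:
            yield match.group("rpc"), expected, match.group("queue")


# The two log lines from the post, cut off after the Client-RPC field.
log_lines = [
    "<API > <TID: 0000000284> <RPC ID: 0000028576> <Queue: List > <Client-RPC: 390620 >",
    "<SQL > <TID: 0000004564> <RPC ID: 0000028577> <Queue: Fast > <Client-RPC: 390620 >",
]

for rpc, expected, logged in queue_mismatches(log_lines):
    print(f"RPC {rpc}: expected queue {expected}, log shows {logged}")
```

    Only the first line is reported, since it logs the List queue for 390620.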

  • 18. Remedy failover
    Laurent Matheo

    Wow, those are very low...

    Private-RPC-Socket: 390601 1 1

    Private-RPC-Socket: 390603 1 1

    Private-RPC-Socket: 390620 2 2

    Private-RPC-Socket: 390635 2 2

     

    Escalation should be 3, and there are formulas for list / fast (nb_cpu*4 for example). I don't have the formulas in my head right now, but 2 is very low.

  • 19. Remedy failover
    Jan Sierens

    These are the values for an OOTB installation of ARS with a minimum of configuration. We started having crashes after we upgraded ARS 7.6.04 from SP2 to SP3. We reproduced the crashes on an OOTB ARS install without installing any application on top.

     

    These are the settings we use on our development server:

    Private-RPC-Socket:  390601   1   1

    Private-RPC-Socket:  390603   1   3

    Private-RPC-Socket:  390620   6  30

    Private-RPC-Socket:  390621   5  16

    Private-RPC-Socket:  390626  16  32

    Private-RPC-Socket:  390627   4  10

    Private-RPC-Socket:  390635  10  30

    Private-RPC-Socket:  390680   2   2

    Private-RPC-Socket:  390698  10  10

  • 20. Remedy failover
    Laurent Matheo

    Did you open a ticket with BMC?

  • 21. Re: Remedy failover
    Bartosz M

    Hello everyone,

     

    Today we have a critical issue with our system: 7.6.04 SP3.

     

    Sun Nov 18 21:02:27 2012  390603 : The SQL database operation failed. (ARERR 552)

    Sun Nov 18 21:02:27 2012     [DBNETLIB][ConnectionWrite (send()).]General network error. Check your network documentation. (SQL Server 11)

    Sun Nov 18 21:03:52 2012 : Action Request System(R) Server x64 Version 7.6.04 SP2 201110080614

    (c) Copyright 1991-2011 BMC Software, Inc.

    Mon Nov 19 01:56:41 2012  390620 :  : The Request Assignee Group fields are invalid.  Use the menus for the Support Company, Support Organization, Support Group Name, and Request Assignee fields to select this information. (ARERR 150497)

    Mon Nov 19 01:56:40 2012  390620 : An application command failed. (ARERR 4554)

    Mon Nov 19 01:56:40 2012     Application-Delete-Entry "SYS:Action" 000000023006849

    Mon Nov 19 02:27:27 2012  390620 :  : The Request Assignee Group fields are invalid.  Use the menus for the Support Company, Support Organization, Support Group Name, and Request Assignee fields to select this information. (ARERR 150497)

    Mon Nov 19 02:27:26 2012  390620 : An application command failed. (ARERR 4554)

    Mon Nov 19 02:27:26 2012     Application-Delete-Entry "SYS:Action" 000000023006905

    Mon Nov 19 02:45:41 2012  390620 :  : The Request Assignee Group fields are invalid.  Use the menus for the Support Company, Support Organization, Support Group Name, and Request Assignee fields to select this information. (ARERR 150497)

    Mon Nov 19 02:45:40 2012  390620 : An application command failed. (ARERR 4554)

    Mon Nov 19 02:45:40 2012     Application-Delete-Entry "SYS:Action" 000000023007878

    Mon Nov 19 05:19:31 2012 : Action Request System(R) Server x64 Version 7.6.04 SP2 201110080614

    (c) Copyright 1991-2011 BMC Software, Inc.

    Mon Nov 19 07:25:01 2012  390620 : Required field cannot be blank. : AST:CMDB Associations : Request Description01 (ARERR 326)

    Mon Nov 19 07:25:01 2012  390620 : An application command failed. (ARERR 4554)

    Mon Nov 19 07:25:01 2012     Application-Delete-Entry "SYS:Action" 000000023012028

    Mon Nov 19 08:10:54 2012  390620 : Required field cannot be blank. : AST:CMDB Associations : Request Description01 (ARERR 326)

    Mon Nov 19 08:10:53 2012  390620 : An application command failed. (ARERR 4554)

    Mon Nov 19 08:10:53 2012     Application-Delete-Entry "SYS:Action" 000000023013429

    Mon Nov 19 08:34:45 2012  390620 : Required field cannot be blank. : AST:CMDB Associations : Request Description01 (ARERR 326)

    Mon Nov 19 08:34:45 2012  390620 : An application command failed. (ARERR 4554)

    Mon Nov 19 08:34:45 2012     Application-Delete-Entry "SYS:Action" 000000023015216

    Mon Nov 19 10:06:29 2012  AssignEng : Timeout during database update -- the operation has been accepted by the server and will usually complete successfully (SABAD11131)  ARERR - 92

    Mon Nov 19 10:06:29 2012  AssignEng : AR System Application server terminated -- fatal error encountered (ARAPPNOTE 4501)

     

    I checked our ar.cfg file and there is no Db-Connection-Retries line. I also read about the Select-Query-Hint: NOLOCK line; it is not in my ar.cfg file either, but I read that it is highly recommended by BMC. Do you think adding these two lines can solve this issue or reduce how often it occurs? Is this safe, or will it have a bad influence on performance?

     

    Could you also check whether our RPC queues are configured properly?

    Private-RPC-Socket:  390601   1   1

    Private-RPC-Socket:  390603  12  12

    Private-RPC-Socket:  390620  12  12

    Private-RPC-Socket:  390626  10  10

    Private-RPC-Socket:  390635  20  20

    Private-RPC-Socket:  390680  20  20

    Private-RPC-Socket:  390681  10  10

    Private-RPC-Socket:  390682   5  10

    Private-RPC-Socket:  390698  10  20

    Private-RPC-Socket:  390699   3   7
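
    A small sketch for reviewing a list like this one, assuming each line has the form "Private-RPC-Socket: <program number> <min threads> <max threads>". The threshold is whatever the reviewer picks; the value 8 below is arbitrary and not a BMC recommendation. The function names are made up:

```python
def parse_rpc_sockets(cfg_text):
    """Return {program_number: (min_threads, max_threads)} from Private-RPC-Socket lines."""
    sockets = {}
    for line in cfg_text.splitlines():
        name, sep, value = line.strip().partition(":")
        if not sep or name.strip() != "Private-RPC-Socket":
            continue
        parts = value.split()
        if len(parts) != 3:
            continue  # skip lines that do not have exactly three numbers
        rpc, min_threads, max_threads = (int(p) for p in parts)
        sockets[rpc] = (min_threads, max_threads)
    return sockets


def below_threshold(sockets, threshold):
    """Return the program numbers whose maximum thread count is below threshold."""
    return sorted(rpc for rpc, (_, max_t) in sockets.items() if max_t < threshold)


# Three of the lines from the list above.
config = """\
Private-RPC-Socket:  390601   1   1
Private-RPC-Socket:  390620  12  12
Private-RPC-Socket:  390699   3   7
"""

sockets = parse_rpc_sockets(config)
print(below_threshold(sockets, 8))  # arbitrary threshold, for illustration only
```

    Some queues are low on purpose (390601 1 1 appears in every config posted in this thread), so a flagged entry is a prompt to look, not an error.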

  • 22. Re: Remedy failover
    pega

    This is our RPC config.

    390601     1     1

    390603     1     1    

    390620     16     32

    390621     5     16

    390626     2     4

    390635     2     16

    390680     32     32

  • 23. Re: Remedy failover
    Bartosz M

    Hello,

     

    Today, again we had serious issues with the production environment and we had to restart several application servers.

     

    Mon Nov 26 09:43:45 2012  AssignEng : Timeout during database update -- the operation has been accepted by the server and will usually complete successfully (SABAD11131)  ARERR - 92

    Mon Nov 26 09:43:45 2012  AssignEng : AR System Application server terminated -- fatal error encountered (ARAPPNOTE 4501)

     

    This error has been appearing each Monday for three weeks. Last weekend we added a new parameter to the ar.cfg file:

    Select-Query-Hint: NOLOCK, but it did not solve our problem.

    We were able to log this issue. I will try to add some logs.

    Have you got any idea what can cause this problem?

  • 24. Re: Remedy failover
    Laurent Matheo

    Well, if it's each Monday, there is a lead... Do you have something special running on weekends or on Mondays? Integration? Heavy workflow?
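
    To confirm the weekday pattern from the error log itself, here is a minimal sketch that counts matching lines per weekday. It assumes every line starts with a timestamp like "Mon Nov 19 10:06:29 2012", as in the excerpts posted earlier, and that day and month names are in English. The function name weekday_counts is made up:

```python
from collections import Counter
from datetime import datetime


def weekday_counts(lines, marker):
    """Count lines containing marker, grouped by the weekday of their timestamp."""
    counts = Counter()
    for line in lines:
        if marker not in line:
            continue
        try:
            # The timestamp occupies the first 24 characters of the line.
            stamp = datetime.strptime(line.strip()[:24], "%a %b %d %H:%M:%S %Y")
        except ValueError:
            continue  # line without a leading timestamp
        counts[stamp.strftime("%A")] += 1
    return counts


# Lines taken from the excerpts posted earlier in this thread.
log_lines = [
    "Sun Nov 18 21:02:27 2012  390603 : The SQL database operation failed. (ARERR 552)",
    "Mon Nov 19 10:06:29 2012  AssignEng : AR System Application server terminated -- fatal error encountered (ARAPPNOTE 4501)",
    "Mon Nov 26 09:43:45 2012  AssignEng : AR System Application server terminated -- fatal error encountered (ARAPPNOTE 4501)",
]

print(weekday_counts(log_lines, "ARAPPNOTE 4501"))
```

    Running the same count for other markers (ARERR 552, for example) would show whether the database errors cluster on the same days.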

  • 25. Re: Remedy failover
    Bartosz M

    Laurent, we are checking it now. If we find something interesting I will share it. Can you also please have a look at our logs?

    http://www.sendspace.com/filegroup/jH9YRLuzjeHGThmeDER40w

     

    Regards

    Bartosz

  • 26. Re: Remedy failover
    andrew NameToUpdate

    What ever came of this?  I thought you were going to share.

  • 27. Re: Remedy failover
    Luciano Muller Nicoletti

    Hello,

     

    We have the same issue, and adding Db-Connection-Retries: 200 did not solve the problem.

    How did you fix this?

    Thanks.

  • 28. Re: Remedy failover
    Jan Sierens

    We had a ticket with BMC for a similar issue which was never solved.

     

    The behaviour we analysed is this: if there was a connection problem to the DB, the AR server would reconnect to the DB. No problem so far. After such an occurrence, the AR server would hang, not immediately but within a few days. Although we sent extensive logs and analyses, we couldn't convince BMC this was a bug. They kept saying this was a DB problem, which is certainly not the case. The problem only started after installing SP3 of 7.6.04.

     

    Now we monitor for connection problems, and if they occur we restart the AR server preventively.
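
    A minimal sketch of that kind of monitoring, assuming connection problems show up in the error log with the messages quoted earlier in this thread. The marker list and function names are made up, and the restart itself is deliberately left out because it depends on the platform:

```python
# Assumption: these substrings identify a DB connection problem, based on
# the error messages quoted earlier in this thread.
CONNECTION_MARKERS = ("General network error", "ARERR 552")


def connection_problem_lines(lines, markers=CONNECTION_MARKERS):
    """Return the log lines that contain any of the connection-problem markers."""
    return [line.rstrip("\n") for line in lines if any(m in line for m in markers)]


def needs_restart(lines):
    """True if at least one connection problem was logged."""
    return bool(connection_problem_lines(lines))


# Lines taken from the excerpts posted earlier in this thread.
log_lines = [
    "Sun Nov 18 21:02:27 2012  390603 : The SQL database operation failed. (ARERR 552)",
    "Mon Nov 19 07:25:01 2012  390620 : An application command failed. (ARERR 4554)",
]

if needs_restart(log_lines):
    print("Connection problem logged; schedule a preventive restart.")
```

    A scheduled job could feed it only the lines written since the last check, so that one old error does not trigger repeated restarts.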

