Remedy - Server - Server Group Coordinator Causing Non-Responsiveness

Version 1

    This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.


    PRODUCT:

    Remedy AR System Server


    COMPONENT:

    AR System Server


    APPLIES TO:

    9.1.04+



    PROBLEM:

    The AR System Server does not respond to changes, particularly configuration changes. Examples include changes made in Centralized Configuration, changes to Server Group rankings, and starting a new process (such as FTS or the Email Engine) that might change the active operation owner.

    How to tell if you're affected by this issue:

    1. Take a thread dump of the AR System Server using VisualVM or another tool.

    https://docs.bmc.com/docs/brid1902/jvm-monitoring-837866011.html

    2. Look for threads in the BLOCKED state, such as:

    "DefaultMessageListenerContainer-1" - Thread t@466
       java.lang.Thread.State: BLOCKED

    3. In that thread's stack, look for the "waiting to lock" line, which gives the lock ID and the thread that owns the lock:

    - waiting to lock <68edc27a> (a java.util.concurrent.ConcurrentHashMap) owned by "DefaultMessageListenerContainer-1" t@457

    4. Look up the owning thread (the t@xxx number at the end of that line) and check whether it is in the WAITING state.

    "DefaultMessageListenerContainer-1" - Thread t@457
       java.lang.Thread.State: WAITING

    5. Check whether the "org.apache.activemq.transport.FutureResponse.getResult" function appears in the owning thread's stack.

    If the owning thread is a DefaultMessageListenerContainer thread and its stack includes "org.apache.activemq.transport.FutureResponse.getResult", this issue is confirmed.
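
    The checks in steps 2 through 5 can also be scripted. The following is a minimal sketch, not a BMC tool. It assumes the thread dump was saved as text in the VisualVM layout shown in the excerpts above (a quoted thread name followed by "- Thread t@NNN", then a "java.lang.Thread.State:" line). The names parse_dump and find_coordinator_hangs are hypothetical helpers introduced only for this illustration.

```python
import re
from dataclasses import dataclass, field

# Patterns assume the VisualVM text layout shown in the steps above.
HEADER = re.compile(r'^"(?P<name>[^"]+)" - Thread t@(?P<tid>\d+)')
STATE = re.compile(r'java\.lang\.Thread\.State: (?P<state>[A-Z_]+)')
OWNER = re.compile(r'waiting to lock <[0-9a-f]+> .* owned by "[^"]+" t@(?P<owner>\d+)')
MARKER = "org.apache.activemq.transport.FutureResponse.getResult"
LISTENER_PREFIX = "DefaultMessageListenerContainer"


@dataclass
class ThreadInfo:
    name: str
    tid: str
    state: str = ""
    owner_tid: str = ""  # thread holding the lock this thread waits for
    lines: list = field(default_factory=list)


def parse_dump(text):
    """Split a thread dump into ThreadInfo records keyed by thread ID."""
    threads = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = ThreadInfo(header["name"], header["tid"])
            threads[current.tid] = current
            continue
        if current is None:
            continue  # text before the first thread header
        current.lines.append(line)
        state = STATE.search(line)
        if state and not current.state:
            current.state = state["state"]
        owner = OWNER.search(line)
        if owner and not current.owner_tid:
            current.owner_tid = owner["owner"]
    return threads


def find_coordinator_hangs(text):
    """Return (blocked_tid, owner_tid) pairs that match steps 2 through 5."""
    threads = parse_dump(text)
    hits = []
    for thread in threads.values():
        if thread.state != "BLOCKED":
            continue  # step 2: only BLOCKED threads
        owner = threads.get(thread.owner_tid)  # step 3: lock owner
        if owner is None or owner.state != "WAITING":
            continue  # step 4: owner must be WAITING
        if not owner.name.startswith(LISTENER_PREFIX):
            continue
        if any(MARKER in frame for frame in owner.lines):  # step 5
            hits.append((thread.tid, owner.tid))
    return hits
```

    To use it, read the saved dump into a string and pass it to find_coordinator_hangs. A non-empty result suggests the pattern described above is present. Review the dump manually before acting on it, because other tools may lay out thread dumps differently.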


    CAUSE:

    When a server group member sends a message to the server group coordinator, the coordinator does not respond promptly. This can cause blocking or hangs in other, unrelated processes on the server.


    SOLUTION:

    The only solution to this issue is to stop the coordinator server so that another server can assume the Server Group Coordinator responsibility.

    Follow these steps to determine which server is currently the coordinator:

    1. Run the following SQL statement against the database:

    SELECT * FROM servgrp_config

    2. The server returned by the query is the coordinator, and it is the server that needs to be restarted.


    Article Number:

    000174240


    Article Type:

    Solutions to a Product Problem


