Troubleshooting DMT UDM Load Balance and Server Group issues

Version 12

    This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.


    PRODUCT:

    Remedy IT Service Management Suite


    COMPONENT:

    Remedy ITSM Foundation



    DETAILS:

    This guide is divided into the following sections:

    1. What logs do I gather to troubleshoot a server group setup?
    2. Example server group configuration
    3. Sample error messages you may receive and how to correct them
    4. Links to other useful DMT / UDM articles

    It is important to read and understand the documentation for Load Balance environments, because this guide covers only some of the basics. See the following documentation:

    How to set up UDM in a load balance environment
    Configuration Form
    Data Management recommendations
    Configuring the load path

    What logs and config files do I utilize when troubleshooting Load Balance setups?

    • arjavaplugin.log file, typically found in the C:\Program Files\BMC Software\ARSystem\Arserver\Db directory
    • arcarte.log file, typically found in the C:\Program Files\BMC Software\ARSystem\Arserver\Db directory
    • arerror.log file, typically found in the C:\Program Files\BMC Software\ARSystem\Arserver\Db directory
    • ar.cfg file, typically found in the C:\Program Files\BMC Software\ARSystem\Conf directory
    • pluginsvr_config.xml file, typically found in the C:\Program Files\BMC Software\ARSystem\pluginsvr directory
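
    When the same set of files has to be gathered from every server in the group, a small script can save time. The following is a minimal sketch, assuming the default Windows paths listed above; the function name collect_files and the archive name are illustrative and are not part of any BMC tooling.

```python
from pathlib import Path
import zipfile

# Default Windows locations quoted in this article; adjust them for your install.
AR_HOME = Path(r"C:\Program Files\BMC Software\ARSystem")
DEFAULT_FILES = [
    AR_HOME / "Arserver" / "Db" / "arjavaplugin.log",
    AR_HOME / "Arserver" / "Db" / "arcarte.log",
    AR_HOME / "Arserver" / "Db" / "arerror.log",
    AR_HOME / "Conf" / "ar.cfg",
    AR_HOME / "pluginsvr" / "pluginsvr_config.xml",
]


def collect_files(paths, archive_path):
    """Zip every file in `paths` that exists and return the paths that were missing."""
    missing = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in map(Path, paths):
            if path.is_file():
                # Store only the file name so the archive is flat.
                archive.write(path, arcname=path.name)
            else:
                missing.append(path)
    return missing
```

    Calling collect_files(DEFAULT_FILES, "udm_logs.zip") on each AR server returns the files that could not be found, which is itself a useful hint that a log is being written somewhere else.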
      
    Example Server group configuration and how to set it up correctly.

    Recommendations:

    • Before configuring the UDM:Config form for a server group environment, you must rank the Atrium Integrator servers
      by using the AR System Server Group Operation Ranking form. The server to which you assign ranking 1 becomes the primary
      server and runs the jobs. If the primary server fails, the secondary (failover) server runs the jobs. The failover server is
      the server to which you assigned ranking 2. If you do not assign rankings to the servers in a server group environment, jobs
      run on the server that receives the request first. For details, see Setting failover rankings for servers and operations.
    • BMC recommends that you select a non-user-facing server (admin server) as the primary server.
    • The Default check box should be selected for the primary server in the UDM:Config form.
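
    The ranking rule described above can be pictured as a small selection function. This is only an illustrative model of the documented behavior, not BMC code: the function name and the server names in the usage note are hypothetical, and returning None when every ranked server is down is an assumption of this sketch.

```python
def pick_job_server(rankings, available, receiving_server):
    """Return the server expected to run a new job.

    rankings: dict of server name -> rank (1 = primary, 2 = failover, ...)
    available: set of server names that are currently up
    receiving_server: the server that received the request
    """
    if not rankings:
        # No ranking assigned: the job runs on the server that receives the request first.
        return receiving_server
    ranked_and_up = [(rank, name) for name, rank in rankings.items() if name in available]
    if not ranked_and_up:
        # Assumption of this sketch: no ranked server is up, so nothing can run the job.
        return None
    # The lowest rank number wins (rank 1 before rank 2).
    return min(ranked_and_up)[1]
```

    For example, with rankings {"ar-admin": 1, "ar-user": 2}, the function returns "ar-admin" while that server is up and "ar-user" once it is not. Note that this models only where new jobs run; jobs that were already running on the failed server are not moved automatically.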
      

    Configuration for server groups

    1. UDM:Config form configuration

    • The Atrium Integrator engine server name should match the Server-Connect-Name value in the ar.cfg file.
    • The 'Host' value for each entry should match the diserver server host name defined in armonitor.cfg for the diserver/carte plugin.
    • No long names, aliases, IP addresses or host names should be entered in the UDM:Config form.
    • 'Is Default' should be set to Yes for the server defined as rank 1 in the AR System Server Group Operation Ranking form.
    • The 'Failover Server Name' field should not contain any entries.
    • The port value should be 20000.
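
    Several of the rules above are mechanical and can be checked against an export of the UDM:Config records. The following is a minimal sketch under stated assumptions: the UdmConfigEntry field names are illustrative rather than the real form field names, ar.cfg is assumed to use "Option-Name: value" lines, and the 'Host' comparison against armonitor.cfg is not covered.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class UdmConfigEntry:
    """One UDM:Config record (field names are illustrative)."""
    engine_server_name: str
    host: str
    port: int
    is_default: bool
    failover_server_name: str = ""


def read_server_connect_name(ar_cfg_text):
    """Return the Server-Connect-Name value from ar.cfg text, or None if absent."""
    for line in ar_cfg_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "server-connect-name":
            return value.strip()
    return None


def looks_like_ip(value):
    """True if `value` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def check_entries(entries, connect_names, rank1_name):
    """Return a list of problems; an empty list means none of the rules was violated."""
    problems = []
    defaults = [e.engine_server_name for e in entries if e.is_default]
    if defaults != [rank1_name]:
        problems.append(f"'Is Default' should be Yes only for {rank1_name}, found {defaults}")
    for e in entries:
        name = e.engine_server_name
        if name not in connect_names:
            problems.append(f"{name}: does not match any Server-Connect-Name")
        if looks_like_ip(name):
            problems.append(f"{name}: IP address used as server name")
        if e.failover_server_name:
            problems.append(f"{name}: 'Failover Server Name' should be empty")
        if e.port != 20000:
            problems.append(f"{name}: port is {e.port}, expected 20000")
    return problems
```

    Here connect_names would be the set of Server-Connect-Name values read from ar.cfg on each server in the group, and rank1_name the server ranked 1 in the ranking form.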
      
    Below is the correct way to configure the entries for a server group with three AR servers. In this example, the diserver/carte plugin is enabled on all three servers in the server group; this is the recommended best practice. If the default server goes down, the second server in the ranking form runs the jobs, because the plugin is available and running there to handle the jobs that are created after server 1 goes down.

    [Images: UDM:Config entries for the three servers]
       
    • In this scenario the job always runs on NEWSC-PD-AR-01, regardless of where it was triggered and which AR server the user session is on, because the 'Is Default' value is set to "Yes" for server NEWSC-PD-AR-01.
    • The recommendation is to always restart the AR service after making any changes to the UDM:Config form.
    • Important note: If UDM jobs (for example, three) were running when server 1 went down, those jobs must be reviewed, and you need to manually create a new job with the non-promoted data and run it again on the second server. Failover is not automatic; it means only that jobs can be run on the second server if server 1 goes down.
      
    2. UDM:RAppPassword form:

    This form is used to authenticate the Remedy Application Service password for the $SERVER$ value passed from the mid-tier, and then to find the correct server name in the UDM:Config form.

    This form should contain entries for ALL possible server names that can be used to connect to the AR Server, including:

    • Host names
    • IP addresses
    • Alias names
    • Load balancer names

    Changes to the UDM:RAppPassword form do not require an AR server restart.

    3. Below is the configuration for the UDM:RAppPassword form, using the sample servers from the UDM:Config example above:

    • newsc-s: AR Server alias name
    • newscorp-vip: LB alias name

    [Image: UDM:RAppPassword entries for the sample servers]
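
    Because a missing UDM:RAppPassword entry is a common cause of the errors described later, it can help to compare the names clients actually use against the names present in the form. A minimal sketch, assuming both lists have been gathered by hand or exported; the function name is hypothetical, and the exact string comparison is a deliberately strict assumption.

```python
def missing_rapp_entries(connection_names, rapp_password_servers):
    """Return the connection names that have no UDM:RAppPassword entry.

    connection_names: every host name, IP address, alias and load balancer
                      name that can be used to reach the AR Server
    rapp_password_servers: server names found in UDM:RAppPassword
    """
    # Exact comparison on purpose: a name that differs only in spelling or case
    # is reported so that it can be reviewed.
    present = set(rapp_password_servers)
    return sorted(name for name in connection_names if name not in present)
```

    With the sample names above, missing_rapp_entries(["NEWSC-PD-AR-01", "newsc-s", "newscorp-vip"], ["NEWSC-PD-AR-01", "newsc-s"]) reports that the load balancer alias newscorp-vip still needs an entry.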

    UDM:PermissionInfo:

    • UDM:PermissionInfo is a regular form that lists all Pentaho transformations, jobs, database connections, slave servers, partition schemas, directories and cluster schemas, together with the corresponding user group permissions in field 112.
    • Carte Server Name (optional): If the "Carte Server Name" is set for a particular transformation or job, the ARDBC plugin always executes that transformation or job on that Carte server. In this way, data integration jobs can be load balanced across multiple Carte servers. If the Carte server name is not configured for a transformation or job, it is always executed on the local Carte server.
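
    The Carte Server Name rule amounts to a lookup with a fallback. The sketch below only illustrates that rule; the function name, the dictionary representation of UDM:PermissionInfo and the job and server names in the usage note are hypothetical.

```python
def resolve_carte_server(carte_server_by_object, object_name, local_carte_server):
    """Return the Carte server on which a transformation or job would execute.

    carte_server_by_object: dict of transformation/job name -> "Carte Server Name"
                            value (empty or absent when not configured)
    local_carte_server: the Carte server local to the AR server handling the request
    """
    configured = carte_server_by_object.get(object_name)
    # A configured name always wins; otherwise execution stays on the local Carte server.
    return configured or local_carte_server
```

    For example, with {"Sample_Job": "carte-02"} the job Sample_Job resolves to carte-02 from any server, while an unlisted job resolves to whichever local Carte server is passed in.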
    Forms that would contain data if a database migration has taken place, and where to clean them:

    Server references should be fixed in the following forms:

    • UDM:ExecutionInstance
    • UDM:PermissionInfo
    • UDM:Config
    • UDM:RAppPassword

    If you migrated your database or changed server names, review the documentation on Migration of environments.

    Data should be cleaned up in the following forms:

    • DMT:Thread Manager
    • CAI:Events
    • CAI:EventParams
    • UDM:Variable
      
    Now that you have configured your environment for UDM in a load balance setup, let's look at solutions to some problems you may face.

    What are some example errors or problems I may face with a load balance environment for UDM?
                                                                                                                                                                                                   
    Error:
    2015/12/22 14:02:36 - CI-CS-CMDBErrorOutput.0 - ERROR (version 4.1.0, build 1 from 2014-07-20 22.27.24) : org.pentaho.di.core.exception.KettleDatabaseException:
    2015/12/22 14:02:36 - CI-CS-CMDBErrorOutput.0 - ERROR (version 4.1.0, build 1 from 2014-07-20 22.27.24) : Did not find Remedy Application Service password for server X in UDM:RAppPassword Form on server Y

    Resolution:
    Verify that the load balance server is listed in the UDM:RAppPassword form and that the Windows hosts file contains the following entries:

          AR Server IP servergroupname
          AR Server IP servergroupname.domain.net
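
    The hosts file check can be scripted when several servers need to be verified. The following is a minimal sketch that parses hosts-file text and reports the required names that are absent; the function names are illustrative, and the text would typically be read from C:\Windows\System32\drivers\etc\hosts.

```python
def hosts_entries(hosts_text):
    """Map each host name found in hosts-file text to its IP address."""
    mapping = {}
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Host names are case-insensitive; keep the first mapping found.
            mapping.setdefault(name.lower(), ip)
    return mapping


def missing_hosts(hosts_text, required_names):
    """Return the required names that have no entry in the hosts-file text."""
    known = hosts_entries(hosts_text)
    return [name for name in required_names if name.lower() not in known]
```

    Passing ["servergroupname", "servergroupname.domain.net"] as required_names mirrors the two entries shown above; an empty result means both the short and the fully qualified name are present.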

    Error:
    The jobs are failing with the error messages "Error Connecting to ARSystem" and "Did not find Remedy Application Service password for server xxxxxxx in UDM:RAppPassword Form"

    Resolution:
    Make sure the UDM:Config and UDM:RAppPassword forms contain the correct entries mentioned in this guide.

    Error:
    ERROR (90): Cannot establish a network connection to the AR System server; servername:31500

    Resolution:
    Make sure the UDM:Config and UDM:RAppPassword forms contain the correct entries mentioned in this guide.
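
    When ERROR (90) appears, it can also be worth confirming that the server name and port in the message are reachable at the TCP level from the server running the job. This is a generic connectivity sketch, not a BMC utility; the function name is illustrative, and a successful connection shows only that the port is open, not that the form entries are correct.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

    For example, can_connect("servername", 31500) tests the address from the error above, and the same call with port 20000 tests the port configured in the UDM:Config form.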

    Error:
    Error while fetching data from form UDM:ExecutionStatus
    ERROR (623): Authentication failed; aradmin

    Resolution:
    • Ensure the UDM:Config form is configured correctly.
    • Make sure the UDM:RAppPassword form is configured and the passwords are correct.
    • Make sure Atrium Integrator (AI) is configured in a server group.

    Error:
    ARDBCPluginRepository.java:445       > /* Tue Jul 19 2016 02:38:28.981 */  getListEntryWithFields() FAILs in plugin: ARSYS.ARDBC.PENTAHO
    ERROR (623): Authentication failed; aradmin

    Resolution:
    • Ensure the UDM:Config form is configured correctly.
    • Make sure the UDM:RAppPassword form is configured.
    • Make sure Atrium Integrator (AI) is configured in a server group.

    Error:
    ERROR [pool-4-thread-25] com.bmc.arsys.pluginsvr.plugins.a (?:?) - createEntry() FAILs in plugin: ARSYS.ARDBC.PENTAHO
    ERROR (8753): Error in plugin; servername.name.com

    Resolution:
    Verify that the load balance server is listed in the UDM:RAppPassword form and that the Windows hosts file contains the following entries:

          AR Server IP servergroupname
          AR Server IP servergroupname.domain.net

    Error:
    Error in plugin : servername.xyz.com (ARERR 8753)
    An application command failed. (ARERR 4554)
    Application-Delete-Entry "DMT:Action" 000000000060008

    Resolution:
    Verify that the load balance server is listed in the UDM:RAppPassword form and that the Windows hosts file contains the following entries:

          AR Server IP servergroupname
          AR Server IP servergroupname.domain.net

    Error:
    Error in plugin : No Carte Server with name servername exists in UDM:Config form. (ARERR 8753)
    Tue Jun 03 18:36:39 2014 390626 : An application command failed. (ARERR 4554)
    Tue Jun 03 18:36:39 2014 Application-Delete-Entry "DMT:Action" 000000000002304

    Resolution:
    Verify the server entries in the UDM:Config form.

    Problem:
    Nothing is being written to the arcarte and arcarte-stdout log files.

    Resolution:
    If you are in a server group, verify that the logs are being written on the correct servers, as outlined in the UDM:Config form. If the logs are being written on server 2 and not on the default server, the server marked as default in the UDM:Config form is not being used as the primary.

    Error:
    Checker error trying to join Job Console: The check could not query the server parameters: %s

    Resolution:
    1. In the central configuration, ensure that the Server-Connect-Name setting is not missing.
    2. Validate that Server-Connect-Name and the other parameters are written correctly in the ar.conf file (for example, check for misspellings such as "conect").
    3. Remove the REPORTING server names from the following tables:
       select * from servgrp_board
       select * from servgrp_resources
    4. Restart the servers.
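
    The error and resolution pairs in this section can be collected into a simple lookup for a first pass over a log file. The following is a minimal sketch: the patterns are taken from the sample messages in this guide and the advice strings paraphrase its resolutions, while the function and variable names are illustrative. It is a triage aid, not a substitute for reading the logs.

```python
import re

# (pattern, advice) pairs in priority order: more specific messages come first.
KNOWN_ISSUES = [
    (re.compile(r"Did not find Remedy Application Service password"),
     "Check the UDM:RAppPassword entries and the hosts file entries for the server group name."),
    (re.compile(r"No Carte Server with name .* exists in UDM:Config"),
     "Verify the server entries in the UDM:Config form."),
    (re.compile(r"ERROR \(623\)"),
     "Check UDM:Config, the UDM:RAppPassword passwords, and the Atrium Integrator server group setup."),
    (re.compile(r"ERROR \(90\)"),
     "Check that UDM:Config and UDM:RAppPassword contain the correct entries."),
    (re.compile(r"ARERR 8753|ERROR \(8753\)"),
     "Check the UDM:RAppPassword entries and the hosts file entries for the server group name."),
    (re.compile(r"could not query the server parameters"),
     "Check Server-Connect-Name, remove REPORTING servers from the servgrp tables, and restart."),
]


def suggest_checks(log_text):
    """Return the advice for every known message found in `log_text`, without duplicates."""
    suggestions = []
    for line in log_text.splitlines():
        for pattern, advice in KNOWN_ISSUES:
            if pattern.search(line):
                if advice not in suggestions:
                    suggestions.append(advice)
                break  # only the first (most specific) match per line counts
    return suggestions
```

    Running suggest_checks over the text of arjavaplugin.log or arerror.log gives a short checklist to start from; lines that match no pattern are ignored.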

     

       UDM/DMT Troubleshooting Video


    Links to other useful DMT / UDM troubleshooting articles

    Troubleshooting DMT UDM Load step issues
    Troubleshooting DMT UDM Validate issues
    Troubleshooting DMT UDM Promote issues

     


    Article Number:

    000163160


    Article Type:

    Product/Service Description


