Troubleshooting Job Console DMT UDM load issues

Version 41

    This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.


    PRODUCT:

    Remedy IT Service Management Suite


    COMPONENT:

    Remedy ITSM Foundation



    PROBLEM:

    This is a maintained guide on how to troubleshoot UDM load step issues in Remedy ITSM versions 9.x and up. BMC Support will maintain this article and update it as needed.

    This article is also available in Spanish as KA000168542.


    SOLUTION:

    Before reading this guide, review the overview of the UDM / DMT data load process listed below. It links to the DMT configuration and the data load job overview, and is followed by some best practices:

    1. Overview of the data load process

    2. Best practices for the load step:

    • A job that failed in the load step cannot be re-run. Cancel it, clean it up, and run the data as a new job.
    • Most load step issues can be avoided if the environment is configured properly for UDM, for example the load balancer configuration and the data paths.
    • Ensure that DMT:SYS:SequencingEngine has all the sequence records. These records are created when you run your first UDM job load.
    • Periodically clean up UDM:Variable, the load forms (such as CTM:LoadPeople), DMT:ThreadManager, and CAI:Events, especially if large volumes of data have piled up.

    If you are in a load-balanced environment, see the following KB article in addition to this one:
    Troubleshooting DMT/UDM Load balance guide

    This guide is divided into the following sections:

    1. What logs do I gather for load steps stuck in queued
    2. How do I troubleshoot if my load step is stuck "In Progress"
    3. Typical settings that need to be verified
    4. Forms that are used in the load step
    5. Example root causes and how to correct them
    6. Links to other useful DMT / UDM articles

    What logs do I gather for stuck DMT / UDM load jobs?

    All load step error details can be found in the following logs:

    • arcarte.log, typically found in the C:\Program Files\BMC Software\ARSystem\Arserver\Db directory
    • arjavaplugin.log, typically found in the C:\Program Files\BMC Software\ARSystem\Arserver\Db directory
    • Additionally, verify that escalations are turned on and are running on the primary server
    • Pentaho plugin logs
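    When reviewing these logs, it can help to pull out only the error lines first. The following is a minimal Python sketch, not a BMC tool. It assumes that the lines of interest contain the marker ERROR or ARERR, as in the sample messages quoted in this article; the function name and the first sample line are made up for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: lines of interest contain "ERROR" or "ARERR", as in the
# sample messages quoted in this article.
ERROR_MARKER = re.compile(r"ERROR|ARERR")


def find_error_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that contains an error marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ERROR_MARKER.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


sample_log = [
    # Made-up informational line, for illustration only.
    "2015/06/06 05:49:16 - COM_LoadCompany.0 - Starting step\n",
    "2015/06/06 05:49:16 - COM_LoadCompany.0 - ERROR (version 4.1.0, build 1 from 2014-07-20 22.27.24) : Error initializing step [COM_LoadCompany]\n",
    "An application command failed. (ARERR 4554)\n",
]

for number, text in find_error_lines(sample_log):
    print(f"{number}: {text}")
```

    To scan a real log, pass an open file object instead of the sample list, for example `with open(path, errors="replace") as handle: find_error_lines(handle)`.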

    What do I need to inspect if my Load step is stuck "In progress"?

    A job that is stuck in progress can have several causes.

    A. Escalation pool configuration. Verify that the following escalations are running, that they run on the same pool, and that the pool is not shared with other escalations:

    • DMT:DJS:SetStatus fires every 1 minute and sets the z1D_Action field to "CHECKTRANSSTATUS"
    • Filter DMT:DJS:ChkStepCompleted triggers immediately once the above field value is set

    For a guide on how to do this, see Setting escalation pools.

    B. A custom Pentaho job that is not configured correctly. This is considered a customization and may require assistance outside of BMC Support. See the following documentation and communities:

    Pentaho Communities
    Managing Customizations
    Creating Custom Jobs

    C. ar.cfg and pluginsvr_config.xml files that contain incorrect data. See the example error messages later in this article.

    What are the typical settings utilized for DMT / UDM and how do I verify these if they are correct?
     
    Detailed documentation is available in the following topics. An incorrect configuration can impact the load process.

    Configuring data management
    Data Management recommendations
    Configuring the load path
    [SYS:System Settings is the form for direct access]
    Data Management Application preferences

    Once you understand the settings used for DMT / UDM, verify the CAI Plug-in Registry settings. The following values are recommended, depending on the load.

    • Set the Total threads value:
      • 6 for normal production server data loads
      • 8 or 12 for bulk data loads (onboarding, for example)
    • Set the Private Queue value to one of 390620-390625 or 390627-390629

    Configuration settings for normal data loads (in the 100s and 1,000s) on a production server, in the CAI Plugin Registry form:

    • Total threads can be set to either 3 or 6. 3 is sufficient in most cases.
    • Use one of the fast list thread values for the Private Queue field, between 390620 and 390629. Do not use 390626, as the AR Server uses it for its loopback functionality.

    Configuration settings for bulk data loads (in the 10,000s and above) for an onboarding scenario before the production server is live, or for a bulk load after go-live at a time when users are off the server:

    • Set the Total threads field to a value between 8 and 12
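    The recommendations above can be expressed as a small check. The following Python sketch only encodes the values stated in this article (thread counts, the private queue range, and the reserved queue 390626). The function and constant names are hypothetical, and it does not read the CAI Plugin Registry form itself; you supply the values you see on the form.

```python
from typing import List

# Values taken from the recommendations in this article.
LOOPBACK_QUEUE = 390626               # used by the AR Server for loopback; do not use
QUEUE_RANGE = range(390620, 390630)   # 390620 through 390629 inclusive
NORMAL_THREADS = {3, 6}
BULK_THREADS = range(8, 13)           # 8 through 12 inclusive


def check_cai_settings(total_threads: int, private_queue: int, bulk_load: bool) -> List[str]:
    """Return a list of warnings; an empty list means the values match the guidance."""
    warnings = []
    if private_queue == LOOPBACK_QUEUE:
        warnings.append("Private Queue 390626 is used by the AR Server for loopback")
    elif private_queue not in QUEUE_RANGE:
        warnings.append("Private Queue should be between 390620 and 390629")
    if bulk_load and total_threads not in BULK_THREADS:
        warnings.append("Total threads should be between 8 and 12 for bulk loads")
    if not bulk_load and total_threads not in NORMAL_THREADS:
        warnings.append("Total threads should be 3 or 6 for normal loads")
    return warnings


# A normal production load with acceptable values, then a bulk load with two problems.
print(check_cai_settings(total_threads=3, private_queue=390621, bulk_load=False))
print(check_cai_settings(total_threads=6, private_queue=390626, bulk_load=True))
```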

    What forms are utilized in the load step of UDM and what should I look for?

    Server references should be fixed in the following forms:
    **If you migrated your database or changed server names**, review the documentation on migration of environments.

    Data should be cleaned up in:

    • DMT:Thread Manager
    • CAI:Events
    • CAI:EventParams
    • UDM:Variable

    What example errors should I look for and how do I correct them?

    **If the load step fails, copy the job to a new one. DO NOT run the failed job again.**

    Loads seem to point to the load balancer, and the arjavaplugin log shows an error regarding a connection timeout:

    "ERROR (91): RPC call failed; {AR servername}:{AR Port} Connection reset" on CAI:Events running a UDM job

    • Verify that the load balancer server is listed in the UDM:RAppPassword form and that the Windows hosts file contains:

          AR Server IP servergroupname
          AR Server IP servergroupname.domain.net
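    To confirm that both the short and the fully qualified server group names are present in the hosts file, a check along the following lines can be used. This is a minimal Python sketch that parses hosts-file text. The function names are hypothetical, and the IP address in the sample is a made-up value for illustration.

```python
from typing import Dict, Iterable, List, Set


def parse_hosts(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each host name in hosts-file text to the set of IPs it is listed under."""
    names: Dict[str, Set[str]] = {}
    for line in lines:
        entry = line.split("#", 1)[0].split()   # drop comments, split on whitespace
        if len(entry) < 2:
            continue                            # blank, comment-only, or malformed line
        ip, *hostnames = entry
        for name in hostnames:
            names.setdefault(name.lower(), set()).add(ip)
    return names


def missing_names(lines: Iterable[str], required: List[str]) -> List[str]:
    """Return the required host names that do not appear in the hosts-file text."""
    known = parse_hosts(lines)
    return [name for name in required if name.lower() not in known]


# Made-up hosts content: only the short name is present.
sample_hosts = [
    "# sample hosts file\n",
    "10.0.0.15  servergroupname\n",
]
print(missing_names(sample_hosts, ["servergroupname", "servergroupname.domain.net"]))
```

    On Windows the hosts file is normally C:\Windows\System32\drivers\etc\hosts; pass its lines to `missing_names` in place of the sample list.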

    No file(s) specified! Stop processing.
    2015/06/06 05:49:16 - COM_LoadCompany.0 - ERROR (version 4.1.0, build 1 from 2014-07-20 22.27.24) : Error initializing step [COM_LoadCompany]
    2015/06/06 05:49:16 - COM_LoadCompanyAlias.0 - ERROR (version 4.1.0, build 1 from 2014-07-20 22.27.24) : No file(s) specified! Stop processing.
    2015/06/06 05:49:16 - COM_LoadCompanyAlias.0 - ERROR (version 4.1.0, build 1 from 2014-07-20 22.27.24) : Error initializing step [COM_LoadCompanyAlias]

    "DMTATTACHPATH" variable missing for the involved job in the "DMT:TransformationParam" form

    Solution

    The following advanced search on the "DMT:TransformationParam" form (on a working environment) shows the records related to this variable for the AI jobs:

          'Variable Name' = "DMTATTACHPATH"

    Exporting the missing records from the working environment and importing them into the non-working environment resolved the issue.

          
    2019/01/09 10:15:15 - CI-APP-AROutput.0 - Impact String : couldn't convert String to Integer
    2019/01/09 10:15:15 - CI-APP-AROutput.0 - Unparseable number: " 2000.0"

    • This indicates a data issue in a column of the spreadsheet. In Microsoft Excel, verify that the text fields show a small green triangle. If a field does not, click in the field and the value will be set. This is a Microsoft Excel data issue.
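    The log above shows a value with a leading space and a decimal part (" 2000.0") being rejected for an integer field. As an illustration of this class of problem, the following Python sketch checks spreadsheet values before a load. It does not reproduce the Pentaho parsing code, and the function name is hypothetical.

```python
from typing import Optional


def to_clean_integer(cell: str) -> Optional[int]:
    """Return the integer a cell represents, or None if it is not a whole number.

    Accepts values such as " 2000.0" that a strict integer parser rejects.
    """
    text = cell.strip()
    try:
        value = float(text)
    except ValueError:
        return None
    if not value.is_integer():   # also rejects inf and nan
        return None
    return int(value)


for raw in [" 2000.0", "2000", "20.5", "abc"]:
    print(repr(raw), "->", to_clean_integer(raw))
```

    Values that come back as None are candidates for correction in the spreadsheet before the job is run again as a new job.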

    Load step is skipped.

    • The load step is missing a valid Atrium Integrator job name. Check the step or, as the DMT Admin, check for a valid entry in Atrium Integrator Jobs from the Job Console. No data was loaded for this step or for the related validate and promote steps. Make sure Atrium Integrator is configured.

    "Data management process flow is currently being initialized or rebuilt, this will finish shortly, please run the job in a few minutes"

    • Wait a few minutes. On very slow systems, the underlying records that UDM relies on for ordering operations are still being built.
    • If this still happens after a period of time, an administrator can go to the DMT:SYS:StagingFormDependency form and trigger a rebuild of the data management process flow records using the steps below:

    1. Disable all logs (ensure no logs are on).
    2. Open the 'DMT:SYS:SequencingEngine' form in search mode.
    3. Search for records with 'Parent_Job_GUID' = "DO NOT REMOVE" on that form and delete all matching records.
    4. Open the 'DMT:SYS:StagingFormDependency' form in search mode (blank search screen) and click the Rebuild Sequence Table button.
    5. Wait a few minutes (it may take some time, so wait until the hourglass disappears from the screen), then query the 'DMT:SYS:SequencingEngine' form and check for records whose Stream field (304302140) contains values with the 'GO' prefix.
     
    Load step stays In Progress forever, even though all data may be in the staging form, seemingly done.

    • Have an administrator check that the Escalation Server is turned on (check the Server Information form).
    • The administrator can query the UDM:ExecutionStatus form (the Execution Instance name is the Instance ID of the load step). If there is no record, the AI engine was never called (custom AI job/transformation).

    java.lang.OutOfMemoryError: Java heap space

    • This can occur when multiple concurrent jobs are running on the carte server. By default, 1 GB of heap space is supported; if required, increase it to accommodate more concurrent jobs.

    After a load balancer name change, any search on UDM:Execution returns: Error in plugin: Get List Entry With Fields not supported on form UDM:Execution (ARERR 8753)

    Error in plugin : Invalid Execution Instance. Execution Instance AGHAA5V0GG40RAN7D2XYEGI6DXP3N5 does not exist or user Remedy Application Service is not allowed to access it. (ARERR 8753)
    An application command failed. (ARERR 4554)
    Application-Delete-Entry "DMT:Action"

    • Verify that the server name is correct in ar.cfg.
    • Verify that the UDM load path can still be reached.
    • Add the load balancer names to the Windows hosts file:

          AR Server IP servergroupname
          AR Server IP servergroupname.domain.net

    We failed to initialize at least one step. Execution can not begin!
    No file(s) specified! Stop processing.
    Error initializing step [CTM_LoadPeopleModification
    ERROR (version 6.0.1.0-386, build 1 from 2017-05-10 13.39.58 by buildadm) : Source folder/file [/opt/bmc/ARSystem/db/UDM/DJBnumber] can not be found!
          
           
    • Verify that the UDM load path can be reached and that it is a shared path with Read and Write permissions.
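    One way to confirm that the load path is reachable and writable is to test it directly from each server, under the account that runs the AR System server. The following Python sketch is a generic filesystem check, not a BMC utility; the function name is hypothetical and the paths used in the example are placeholders on the local machine.

```python
import os
import tempfile
from typing import List


def check_load_path(path: str) -> List[str]:
    """Return a list of problems found with the load path (empty if none)."""
    if not os.path.isdir(path):
        return [f"{path} does not exist or is not a directory"]
    problems = []
    if not os.access(path, os.R_OK):
        problems.append(f"{path} is not readable")
    try:
        # Creating and removing a temporary file is a more reliable write
        # test on network shares than os.access alone.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        problems.append(f"{path} is not writable: {exc}")
    return problems


# Example run against the local temp directory and a path that should not exist.
print(check_load_path(tempfile.gettempdir()))
print(check_load_path(os.path.join(tempfile.gettempdir(), "no-such-udm-dir")))
```

    In a server group, run the same check on every server against the UDM path set in the configuration.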
          
                
    <ERROR> <ARDBCPluginRepository > < ARDBCPluginRepository.java:184 > /* Wed Sep 26 2018 04:09:56.972 */ createEntry() FAILs in plugin: ARSYS.ARDBC.PENTAHO      
           
    • Verify that the carte server is running by opening the URL http://<carte host>:20000. It should display a user name and password security dialog. Enter any AR Server user name and password, and the page displays the current state of any jobs running on that carte server. If carte is down, the browser reports that it could not connect.
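    The same check can be scripted. The following Python sketch requests the carte URL without credentials and classifies the result. It assumes that the login dialog described above corresponds to an HTTP 401 response; that mapping and the function name are assumptions, not taken from BMC documentation.

```python
import http.client
import urllib.error
import urllib.request


def carte_status(url: str, timeout: float = 5.0) -> str:
    """Classify the response from a carte URL without sending credentials."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return f"up (HTTP {response.status})"
    except urllib.error.HTTPError as exc:
        # Assumption: a 401 means the server answered and is asking for a
        # user name and password, matching the login dialog described above.
        if exc.code == 401:
            return "up (authentication required)"
        return f"responding with HTTP {exc.code}"
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, DNS failure, timeout, or a broken response:
        # carte could not be reached.
        return "unreachable"
```

    Call it as `carte_status("http://<carte host>:20000")`, replacing `<carte host>` with the carte server name. A result of "unreachable" corresponds to the browser reporting that it could not connect.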
    ARPentahoPlugin > < ?:? > /* Fri Sep 14 2018 06:29:17.743 */ Error while creating an entry on form UDM:ScheduleProcessor
    com.bmc.arsys.pdi.ardbc.data.InsufficientPermissionException: Either job PowerBI Job
     
    Exception in thread "main" java.lang.IllegalArgumentException: Malformed \uxxxx encoding.

    • The carte process treats "\u" as the start of a Unicode escape, so any entry in ar.conf that contains "\u" causes this error. For example: BMC Software\ARSystem\ARServer\Db\userupload.log. Correct the path and restart the AR Server.
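    To locate the offending entry, the configuration file can be scanned for a backslash followed by "u" that is not followed by four hexadecimal digits. The following Python sketch does this for a list of lines. The function name and the option name in the sample are hypothetical, and the pattern is an approximation of how Java-style Unicode escapes are validated.

```python
import re
from typing import Iterable, List, Tuple

# An unescaped backslash followed by "u" but not by four hex digits.
# Pairs of backslashes (an escaped backslash) are skipped.
BAD_UNICODE_ESCAPE = re.compile(r"(?<!\\)(?:\\\\)*\\u(?![0-9a-fA-F]{4})")


def find_bad_escapes(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for lines containing a malformed \\u sequence."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if BAD_UNICODE_ESCAPE.search(line)
    ]


sample_config = [
    "Server-Name: servername\n",
    # Hypothetical option name; the path is the example from this article.
    r"Some-Log-File: C:\Program Files\BMC Software\ARSystem\ARServer\Db\userupload.log" + "\n",
]

for number, text in find_bad_escapes(sample_config):
    print(f"line {number}: {text}")
```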

    92: "Timeout during database update -- the operation has been accepted by the server and will usually complete successfully", "servername:32825 ONC/RPC call timed out";

    • Verify that the CAI Plug-in Registry has the correct settings, as described earlier in this article, and increase the AR timeout settings.

    I have checked the forms mentioned in the server reference cleanup, and old servers are still mentioned. I ran the following queries and found that the old server is still listed:

    select * from servgrp_applic
    select * from servgrp_board
    select * from servgrp_config
    select * from servgrp_ftslic
    select * from servgrp_op_mstr
    select * from servgrp_resources
    select * from servgrp_userlic
    select * from AR_System_Server_Group_Operati
    select * from AR_System_Service_Failover_Ran
    select * from AR_System_Service_Failover_Whi
     
    Error Code: 12103
    Error Message: An error was encountered during CI data load
    CMDB Error Message: java.lang.NullPointerException
     
    Couldn't open file file:////servername/D$/Workspace/UDM/DJB000000003317/cicmdbfile.txt

    See KB article 000097446. If this is a server group, verify that the servers can communicate with each other and can read from and write to the UDM path set in the configuration.

    Error Code: 12116
    Error Message: You do not have access to modify the Company information supplied. You must either have the Company added to your Access Restrictions, or have Unrestricted Access set to Yes in your People profile.

    Check whether the kettle.properties file includes the variable AR_USER. Variables in this file are read when the Pentaho/UDM plugin starts and supersede any that are defined on the job itself.
    Comment out the AR_USER parameter in the kettle.properties file, restart the AI, and then test the issue.
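    Commenting out the parameter means prefixing its line with "#". As an illustration, the following Python sketch applies that change to the lines of a properties file. The function name and the sample values are hypothetical, and the sketch only handles the common key=value form.

```python
from typing import List


def comment_out_property(lines: List[str], key: str) -> List[str]:
    """Return a copy of the lines with every active `key=...` entry commented out."""
    result = []
    for line in lines:
        stripped = line.lstrip()
        name = stripped.split("=", 1)[0].strip()
        is_comment = stripped.startswith(("#", "!"))
        if "=" in stripped and name == key and not is_comment:
            result.append("#" + line)
        else:
            result.append(line)
    return result


# Made-up file contents for illustration only.
sample = ["AR_USER=Demo\n", "OTHER_SETTING=value\n"]
print("".join(comment_out_property(sample, "AR_USER")), end="")
```

    Back up kettle.properties before changing it, and restart the AI afterwards as described above.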

    Another solution is to add "Remedy Application Service" to "CTM:People Permission Groups" as follows.

    Open "CTM:People Permission Groups" in New mode and enter:

        Remedy Login ID* = Remedy Application Service
        Person ID* = PPL000000001
        Permission Group* = Unrestricted Access
        Permission Group ID* = 1000000000
        Permission Group Type = System Only
        Status = Enabled

    ERROR (417): The group name is not a defined group 
    ARSYS.ARDBC.PENTAHO : RPC: Miscellaneous tli error - System error (Connection refused)
    Wed May 01 00:19:45 2019  390620 : Cannot establish a network connection to the AR System Plug-In server : Server Name (9999) ARSYS.ARDBC.PENTAHO : RPC: Miscellaneous tli error - System error (Connection refused) (ARERR 8760)
     
    See KB 000099262

    ARERR 8753 Error in plugin. Either job Operational_Catalog(Objec Id null, Directory Id 000000000000) does not exist Or User appadmin is not allowed to access job Operational_Catlaog ARERR 8753

    See KB 000166371

     
    Could not create file "file:///usr/jdk/instances/jdk1.8.0/jre/bin/server/cicmdbfile.txt".

    CI-CS-AttributeErrorLogging-GetFieldInfo.0 - ERROR org.pentaho.di.trans.steps.textfileoutput.TextFileOutput.init
    Verify in Spoon/Pentaho that the related steps have a filename path of ${JOBDIR}cicmdbfile

      




    Links to other useful DMT / UDM troubleshooting articles


    Troubleshooting DMT UDM Load balance issues
    Troubleshooting DMT UDM Validate issues
    Troubleshooting DMT UDM Promote issues


      

     


    Article Number:

    000163011


    Article Type:

    Solutions to a Product Problem


