How to troubleshoot a cell that fails to start

Version 1

    This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.


    PRODUCT:

    BMC Event Manager Base


    APPLIES TO:

    BMC Event Manager Base



    QUESTION:

    This KA describes how to troubleshoot a cell that fails to start and lists the common causes.


    ANSWER:

     

    Legacy ID:KA399828

      

    Enabling cell tracing and starting cell in foreground

      

    To determine why the cell is unable to start, enable cell trace and start the cell in foreground mode. Starting the cell in foreground mode is the preferred way to troubleshoot cell startup failures because it will output messages even if the cell is unable to write to the trace file.

      
       
    • In the pw/server/etc/<cell> directory, create a file named mcell.trace (if not already present) with the contents: ALL ALL stderr
      
       
    • Start the cell in foreground mode: mcell -d -n <cell>
      

    The trace is displayed in the console window (on stderr, as configured in mcell.trace) and shows why the cell failed to start.
    One possible workaround is to rename the xact file to xact.1. If the xact file is corrupt and does not contain any events or data of consequence, you can instead remove it; the cell may then start right away.
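    The trace setup described above can be scripted. The following is a minimal sketch, not a BMC-supplied tool: the function names ensure_trace_file and foreground_command are hypothetical, the etc directory is passed in by the caller, and the command is only built, not executed.

```python
from pathlib import Path

# Trace configuration line given in this article.
TRACE_LINE = "ALL ALL stderr"


def ensure_trace_file(etc_dir, cell, filename="mcell.trace"):
    """Create <etc_dir>/<cell>/mcell.trace if it is not already present.

    An existing file is left untouched. The cell directory itself must
    already exist; a missing directory raises FileNotFoundError, which
    usually means the path or cell name is wrong.
    """
    trace_path = Path(etc_dir) / cell / filename
    if not trace_path.exists():
        trace_path.write_text(TRACE_LINE + "\n")
    return trace_path


def foreground_command(cell):
    """Return the argument list for starting the cell in foreground mode."""
    return ["mcell", "-d", "-n", cell]
```

    The returned list can be passed to subprocess.run from a shell where mcell is on the PATH, for example subprocess.run(foreground_command("Admin")).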

    The common causes of cell startup failure are as follows:

      

    Cell fails to start due to corrupt mcdb

      

    Starting the cell in the foreground displays the following error:

      

    20130515 102612.423000 mcell: EVTLOG: BMC-IMC065024E: Error in transaction history file C:/Program Files/BMC Software/ProactiveNet/pw/server/var/Admin/mcdb, line 1: bad command header

      

    20130515 102612.426000 mcell: CONTROL: BMC-IMC200015F: Could not reload State file

      

    20130515 102612.472000 mcell: SERVICE: BMC-IMC050141V: Disconnecting from destinations...

      

    20130515 102612.473000 mcell: SERVICE: BMC-IMC050114V: Disconnecting clients...

      

    20130515 102612.473000 mcell: SERVICE: BMC-IMC050106V: Cell shutdown...

      

    The preceding error message indicates an inconsistency in the state file (mcdb). In this situation, revert to a previous mcdb file located in the same directory. A directory listing shows:

      

    Directory of C:\Program Files\BMC Software\ProactiveNet\pw\server\var\Admin

      

    15/05/2013 10:38 <DIR> .
    15/05/2013 10:38 <DIR> ..
    15/05/2013 10:38 11 datid.txt
    15/05/2013 10:38 11 evtid.txt
    15/05/2013 10:38 0 mcdb
    15/05/2013 08:26 237,483 mcdb.11932a910
    15/05/2013 09:28 237,326 mcdb.119338a70
    15/05/2013 10:17 237,495 mcdb.119344760
    14/05/2013 12:51 12 smid
    15/05/2013 10:37 4,286 xact
    15/05/2013 09:28 47,343 xact.119338a70.1
    15/05/2013 10:16 38,431 xact.119344760.1
    15/05/2013 10:17 299 xact.119344870.1
    15/05/2013 10:37 9,105 xact.119349320.1
    12 File(s) 811,802 bytes
    2 Dir(s) 10,376,409,088 bytes free

      

    The directory listing shows that mcdb.119344760 is the previous state file and xact.119349320.1 is the transaction file that needs to be reapplied. Follow these steps to avoid losing data:

      
       
    • Take a backup of the cell var directory
    • Rename mcdb to mcdb.bak
    • Rename mcdb.119344760 to mcdb
    • Rename xact to xact.2
    • Rename xact.119349320.1 to xact.1
    • Run statbld to create a new mcdb from the xact.1 and xact.2 files with the command: statbld -n <cell>
    • After statbld has run successfully, start the cell.
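    The backup and rename steps above can be sketched as a small script. This is an illustration under stated assumptions, not an official procedure: the function name recover_mcdb is hypothetical, the caller supplies the names of the previous mcdb and the transaction file to reapply (the sketch does not try to choose them), and statbld is still run by hand afterwards.

```python
import shutil
from pathlib import Path


def recover_mcdb(var_dir, previous_mcdb, trailing_xact, backup_dir):
    """Back up the cell var directory, then apply the renames listed above.

    previous_mcdb and trailing_xact are file names chosen by the operator
    from the directory listing. Returns the (source, target) pairs renamed.
    """
    var_dir = Path(var_dir)
    # Back up the whole var directory first; copytree fails if backup_dir exists.
    shutil.copytree(var_dir, backup_dir)
    renames = [
        ("mcdb", "mcdb.bak"),        # set the inconsistent state file aside
        (previous_mcdb, "mcdb"),     # restore the previous state file
        ("xact", "xact.2"),          # current transaction file
        (trailing_xact, "xact.1"),   # transaction file to be reapplied
    ]
    for src, dst in renames:
        target = var_dir / dst
        # Refuse to overwrite: rename would silently replace files on POSIX.
        if target.exists():
            raise FileExistsError(f"{target} already exists")
        (var_dir / src).rename(target)
    return renames
```

    For the listing shown in this article the call would be recover_mcdb(var_dir, "mcdb.119344760", "xact.119349320.1", backup_dir), followed by statbld -n <cell> from a command window.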
      

    Cell fails to start due to statbld not working

      

    Starting the cell in the foreground shows the following error:

      

    20130515 103831.109000 mcell: EVTLOG: BMC-IMC065102V: Checking for trailing transaction log file C:/Program Files/BMC Software/ProactiveNet/pw/server/var/Admin/xact

      

    20130515 103831.111000 mcell: EVTLOG: BMC-IMC065103V: Processing trailing transaction log file

      

    20130515 103831.126000 mcell: EVTLOG: BMC-IMC065051I: Performing State Build - please wait

      

    BMC Impact State Builder 9.0.20 (Build 231155889 - 18-Feb-2013) [w4]

      

    Copyright 1998-2012 BMC Software, Inc. as an unpublished work. All rights reserved.

      

    20130515 103831.293000 mcell: SYSTEM: BMC-IMC012011V: Executed program C:/Program Files/BMC Software/ProactiveNet/pw/server/bin/statbld.exe - exit code 1

      

    20130515 103831.294000 mcell: EVTLOG: BMC-IMC065012E: State Builder failed to process trailing transactions

      

    20130515 103831.296000 mcell: EVTLOG: BMC-IMC065011E: Cannot activate State Builder

      

    20130515 103831.296000 mcell: EVTLOG: BMC-IMC065004F: Cannot start with trailing transaction log file C:/Program Files/BMC Software/ProactiveNet/pw/server/var/Admin/xact - repair first

      

    20130515 103831.297000 mcell: SERVICE: BMC-IMC050141V: Disconnecting from destinations...

      

    20130515 103831.298000 mcell: SERVICE: BMC-IMC050114V: Disconnecting clients...

      

    20130515 103831.298000 mcell: SERVICE: BMC-IMC050106V: Cell shutdown...

      

    The preceding error messages indicate a problem with the statbld process, which can fail for a number of reasons. First run the mlogchk -n <cell> command, which performs a consistency check and advises of any action required. If mlogchk does not find any inconsistency, run statbld with trace enabled:

      
       
    • In the pw/server/etc directory, modify the statbld.trace file so that it contains: ALL ALL stderr
      
       
    • Run statbld from a command window: statbld -n <cell>
      

    The trace is displayed in the console window (on stderr, as configured in statbld.trace) and shows why statbld failed.
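    When reviewing a saved cell trace, the statbld result can be picked out of the BMC-IMC012011V line shown earlier in this section. The sketch below assumes that line keeps the "Executed program ... - exit code N" layout seen in this article; the function name program_exit_code is hypothetical.

```python
import re

# Matches the "Executed program <path> - exit code <n>" trace line.
EXEC_RE = re.compile(
    r"BMC-IMC012011V: Executed program (?P<program>.+?) - exit code (?P<code>-?\d+)\s*$"
)


def program_exit_code(line):
    """Return (program_path, exit_code) from a trace line, or None if absent."""
    match = EXEC_RE.search(line)
    if match is None:
        return None
    return match.group("program"), int(match.group("code"))
```

    A non-zero exit code for statbld.exe, as in the trace above, is the cue to run mlogchk and then statbld with trace enabled.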

      

    Cell fails to start with message "Impossible to bind endpoint"

      

    Starting the cell in the foreground shows the following error:

      

    20130515 142116.785000 mcell: SERVICE: BMC-IMC050005F: Server <ANY> 10.64.9.165/ 1827 setup error 5 (Impossible to bind endpoint (10048))

      

    This indicates that the cell has been unable to bind to the port defined in the pw\server\etc\mcell.dir file. From the message we can see it is port 1827. The following are known reasons for this problem:

      
       
    • Another process is using that port. Run a netstat command to see whether anything is already listening on that port.
    • This is an HA cell and the definitions for that cell in the mcell.dir file on the primary server and the secondary server differ.
    • This is a secondary HA cell and its mcell.conf incorrectly contains CellDuplicateMode=1
    • The cell is already running.
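    As a complement to netstat, the first cause above can be checked by trying to bind the port directly. This is a minimal sketch with a hypothetical function name (port_in_use); it reports only whether the address is taken, not which process holds it. The 10048 in the trace message is the Windows "address already in use" socket error (WSAEADDRINUSE).

```python
import errno
import socket

# Windows reports "address already in use" as WSAEADDRINUSE (10048).
WSAEADDRINUSE = 10048


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding host:port fails because the address is in use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno in (errno.EADDRINUSE, WSAEADDRINUSE):
                return True
            raise  # other failures (for example, permissions) are not "in use"
    return False
```

    For the trace shown above, port_in_use(1827, "10.64.9.165") run on the cell host would indicate whether something, possibly the cell itself, already holds the port.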
      

    Cell fails to start with message "BMC-IMC032205F: Cannot read knowledge base file"

      

    Starting the cell in the foreground shows the following error:

      

    20130515 160755.218000 mcell: BAROC: BMC-IMC032270V: Signature 1691 3811407335

      

    20130515 160755.219000 mcell: BAROC: BMC-IMC032269V: Installing from file C:/Program Files/BMC Software/ProactiveNet/pw/server/etc/Admin/kb/rules/mv_admin.wic

      

    20130515 160755.222000 mcell: BAROC: BMC-IMC032270V: Signature 1071 2064456062

      

    20130515 160755.223000 mcell: EVTPROC: BMC-IMC090004F: Failed to load knowledgebase definitions

      

    20130515 160755.302000 mcell: SERVICE: BMC-IMC050141V: Disconnecting from destinations...

      

    20130515 160755.303000 mcell: SERVICE: BMC-IMC050114V: Disconnecting clients...

      

    20130515 160755.304000 mcell: SERVICE: BMC-IMC050106V: Cell shutdown...

      

    This indicates that the cell is unable to load the knowledge base (KB). Open a command window and run mccomp -n <cell> to recompile the KB. Resolve any errors it reports, then start the cell again.
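    The four causes covered in this article can be recognized from the message IDs in the foreground trace. The sketch below maps only the IDs quoted in this article to the matching section; the mapping table and the function name diagnose are illustrative assumptions, not a complete list of cell startup errors.

```python
import re

# Message IDs quoted in this article, mapped to the section that covers them.
KNOWN_CAUSES = {
    "BMC-IMC065024E": "corrupt mcdb: revert to a previous mcdb and run statbld",
    "BMC-IMC065012E": "statbld failure: run mlogchk, then statbld with trace",
    "BMC-IMC050005F": "cannot bind port: check mcell.dir and port usage",
    "BMC-IMC032205F": "KB load failure: recompile with mccomp",
    "BMC-IMC090004F": "KB load failure: recompile with mccomp",
}

MESSAGE_ID = re.compile(r"BMC-IMC\d{6}[A-Z]")


def diagnose(trace_text):
    """Return the known causes found in a trace, in order of first appearance."""
    found = []
    for message_id in MESSAGE_ID.findall(trace_text):
        cause = KNOWN_CAUSES.get(message_id)
        if cause is not None and cause not in found:
            found.append(cause)
    return found
```

    An empty result means the trace contains none of the IDs listed here and needs to be read manually.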

     


    Article Number:

    000030584


    Article Type:

    FAQ/Procedural


