Discovery: Core file(s) have been created in /usr/tideway/cores. What should be done when such files are found?

Version 3

    This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.


    PRODUCT:

    BMC Discovery.


    COMPONENT:

    BMC Discovery 11.3


    APPLIES TO:

    BMC Discovery



    QUESTION:

    Discovery: Core file(s) have been created in /usr/tideway/cores

    What are these files?
    What should be done when such files are found?
    What are the known root causes and solutions?

       
          
       
          


    ANSWER:

     

    What is a core file?

      

    A core file is a memory image of a process, written by the operating system when the process terminates unexpectedly. The core file identifies the PID and name of the process that generated it. For example, a core file named "core.2675" was generated by the failing process with PID 2675.
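
    As a minimal sketch of the naming convention above, the shell functions below read the PID back from a core file name. The names core_pid and list_cores are hypothetical helpers written for this illustration and are not part of BMC Discovery. The sketch assumes the core files are named core.<PID>, optionally followed by .gz once they have been compressed.

```shell
# Hypothetical helper: print the PID encoded in a core file name
# such as core.2675 or core.2675.gz.
core_pid() {
    local name
    name=$(basename "$1")
    name=${name%.gz}        # drop a trailing .gz, if present
    echo "${name#core.}"    # drop the leading "core."
}

# Hypothetical helper: list each core file in a directory with its PID.
list_cores() {
    local dir=${1:-/usr/tideway/cores} core
    for core in "$dir"/core.*; do
        [ -f "$core" ] || continue   # skip when the glob matches nothing
        echo "$(core_pid "$core") $core"
    done
}

list_cores /usr/tideway/cores
```

    Running "file" on an uncompressed core file usually also reports the name of the program that generated it.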

      


    What should be done when such files are found in Discovery?

      

    If possible, do not delete the content of /usr/tideway/cores.

      
       
    • Open a BMC Customer Support case     
           
      • Open an SSH session as the tideway user and run "tw_check_cores"
    • For releases prior to 11.2 ONLY      
           
      • Create a new file named /usr/tideway/cores/gdbCommandFile with the following content:
      
        info threads
        thread apply all bt
        quit
      
       
    • Run the command below and attach its text output (gdb_cores.txt) to the support case:
      
      find /usr/tideway/cores -type f -name "core.*" -exec ls -lsa {} \; -exec file {} \; -exec gdb -batch -quiet -x /usr/tideway/cores/gdbCommandFile python {} \; -exec gzip {} \; > gdb_cores.txt 2>&1
      
       
    • Upload the /usr/tideway/cores/*.txt files to the support case

    • If possible, do not delete core files until the end of the investigation. If the UI is unavailable because /usr is full, try to back up at least the two most recent *.gz files to another location before deleting them.

    • Gather the logs for the date(s) matching the core creation. The most important logs are Model, Performance, Others, SAR log, and system messages. 
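
    For releases prior to 11.2, the gdb command file described in the steps above can be created in one step. The following is a minimal sketch: write_gdb_command_file is a hypothetical helper name, and the default path is the one given in this article.

```shell
# Hypothetical helper: write the gdb command file used by the find/gdb
# command in this article. The target path can be overridden for testing.
write_gdb_command_file() {
    local target=${1:-/usr/tideway/cores/gdbCommandFile}
    # The quoted 'EOF' keeps the shell from expanding anything in the body.
    cat > "$target" <<'EOF'
info threads
thread apply all bt
quit
EOF
}
```

    Calling write_gdb_command_file with no argument as the tideway user writes /usr/tideway/cores/gdbCommandFile.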

      
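
    If /usr is full and core files must be deleted, the backup of the two most recent *.gz files mentioned above could look like the sketch below. backup_recent_cores is a hypothetical helper name, and the destination directory is a placeholder that should be on a filesystem other than /usr. The sketch assumes that core file names contain no whitespace.

```shell
# Hypothetical helper: copy the N most recent *.gz files from a cores
# directory to a backup directory (N defaults to 2).
backup_recent_cores() {
    local src=$1 dest=$2 count=${3:-2} f
    mkdir -p "$dest" || return 1
    # ls -t sorts by modification time, newest first.
    ls -t "$src"/*.gz 2>/dev/null | head -n "$count" | while read -r f; do
        cp -p "$f" "$dest"/     # -p preserves timestamps for the investigation
    done
}
```

    For example, "backup_recent_cores /usr/tideway/cores /path/to/backup" copies the two newest compressed cores, where /path/to/backup stands for a location chosen by the administrator.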


    Known root causes and solutions:

      

    Root cause 1: Datastore corruption
    This is the most frequent cause, and it often leads to repeated cores/crashes. For this root cause, the cores are always generated by the python model process. To confirm this cause and attempt a repair (not always possible), see KA 000134382.

    Root cause 2: Hyper-threading is enabled on an HP server
    In one case (HP ProLiant DL380 Gen9), the model process was generating cores during malloc. There were other symptoms, but no corruption was reported by db_verify (see root cause #1 above).
    Workaround 2: It is not confirmed, but root cause 2 appears to occur less frequently when the model process is started with the option "-ORBmaxClientThreadPoolSize 500" added in data/installed/startup/15model.
    Solution 2: Disable hyper-threading.

    Root cause 3: Memory management errors
    In this case, cores are typically dumped while the process is trying to allocate memory (malloc). A db_verify does not report any errors, and the crashes/cores occur less frequently than with root cause #1. These cores can be dumped because the OS sent signal 6 (abort) to the process, or because there was a segmentation fault in the process. In certain cases these cores were dumped while the OS was swapping intensively. There is no known solution for root cause 3 at this time, but the information requested above may allow BMC to find one.
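
    To see which signals terminated the dumped processes, the gdb output collected earlier can be summarized. The sketch below is an assumption-laden illustration: summarize_core_signals is a hypothetical helper name, and it relies on gdb printing its usual "Program terminated with signal ..." line for each core it loads (on Linux, signal 6 is abort and signal 11 is a segmentation fault).

```shell
# Hypothetical helper: count the terminating signals reported by gdb in
# one or more text files such as gdb_cores.txt, most frequent first.
summarize_core_signals() {
    grep -h 'Program terminated with signal' "$@" | sort | uniq -c | sort -rn
}
```

    For example, "summarize_core_signals gdb_cores.txt" prints one line per distinct signal, prefixed with the number of cores that reported it.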

    Root cause 4: CMDB Sync creates core dumps in the model process
    This impacts Discovery 11.2.x and 11.3.x prior to 11.3.0.4 when CMDB Sync is used. For more details, see KA 000161376.
    Solution 4: Upgrade to 11.3.0.4 or later.

    Root cause 5: An OS upgrade for RHEL 6 was executed on an appliance running CentOS 6
    Solution 5: If possible, restore the appliance from a VM snapshot taken before the OS upgrade, then apply the OS upgrade that matches the appliance's operating system.

    Root cause 6: A Python 2.7 defect: https://bugzilla.redhat.com/show_bug.cgi?id=1561170
    In this case, the cores are caused only by segmentation faults. This has been diagnosed once so far (May 2020): it caused a single core and was not reproduced in the following two months. The core was generated by the "options" process, but the defect can probably affect other Python processes.
    Solution 6: Upgrade to Discovery 12.0, which uses Python 3 and contains the fix for this defect.
    Workaround 6 (not verified): Reducing the load may reduce the frequency of this issue.

     


    Article Number:

    000346728


    Article Type:

    FAQ/Procedural


