This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.
BMC Discovery 11.3
Discovery: Core file(s) have been created in /usr/tideway/cores
What are these files?
What should be done when such files are found?
What are the known root causes and solutions?
What is a core file?
A core file is an image of a process, created by the operating system when the process terminates unexpectedly. The core file identifies the PID and the name of the process that generated it. For example, a core file has a name like "core.2675", where 2675 is the PID of the failing process.
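As a minimal illustration of that naming convention, the PID can be read straight off the file name. The name "core.2675" below is the article's example, not a real file; on an appliance, substitute a file actually present in /usr/tideway/cores:

```shell
# The PID is the numeric suffix of the core file name (example name from the article)
core=core.2675
pid=${core##*.}          # strip everything up to and including the last dot
echo "PID recorded in the file name: $pid"
# If the file actually exists, 'file' also reports which executable crashed
if [ -f "$core" ]; then
    file "$core"
fi
```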
What should be done when such files are found in Discovery?
If possible, do not delete the content of /usr/tideway/cores.
- Open a BMC Customer Support case
- Open an ssh session as the tideway user and execute "tw_check_cores"
- For releases prior to 11.2 ONLY: create a new file named /usr/tideway/cores/gdbCommandFile containing the following single line:
thread apply all bt
- Upload the resulting /usr/tideway/cores/*.txt files to the support case
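The collection steps above can be sketched as a short session run as the tideway user. The temporary-directory fallback and the guard around tw_check_cores are only there so the sketch can be dry-run off the appliance; on the appliance itself, /usr/tideway/cores and tw_check_cores exist as described above:

```shell
CORES_DIR=/usr/tideway/cores
[ -d "$CORES_DIR" ] || CORES_DIR=$(mktemp -d)   # demo fallback off the appliance

# Releases prior to 11.2 ONLY: gdb command file used when analysing the cores
printf 'thread apply all bt\n' > "$CORES_DIR/gdbCommandFile"

# Produce the *.txt analysis files next to each core
if command -v tw_check_cores >/dev/null 2>&1; then
    tw_check_cores
else
    echo "tw_check_cores not found - run this on the appliance"
fi

# These are the files to upload to the support case
ls "$CORES_DIR"/*.txt 2>/dev/null || echo "no *.txt files yet"
```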
If possible, do not delete the core files until the end of the investigation. If the UI is unavailable because /usr is full, try to back up at least the two most recent *.gz files to another location before deleting them.
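A minimal sketch of that emergency backup, assuming a hypothetical destination directory (BACKUP_DIR) on a filesystem other than the one holding /usr; the temporary-directory fallback only lets the sketch run off the appliance:

```shell
CORES_DIR=/usr/tideway/cores
[ -d "$CORES_DIR" ] || CORES_DIR=$(mktemp -d)     # demo fallback off the appliance
BACKUP_DIR=${BACKUP_DIR:-/var/tmp/core_backup}    # hypothetical destination
mkdir -p "$BACKUP_DIR"

# ls -t sorts newest first; keep only the two most recent compressed cores
for f in $(ls -t "$CORES_DIR"/*.gz 2>/dev/null | head -n 2); do
    cp -p "$f" "$BACKUP_DIR"/
done
ls -l "$BACKUP_DIR"
```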
Gather the logs for the date(s) matching the core creation. The most important logs are the Model, Performance, and Others logs, the SAR log, and the system messages.
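One way to sketch that gathering step. The log locations in the commented tar line are assumptions (the standard appliance log directory /usr/tideway/log plus the stock SAR and system-messages locations) and should be adjusted to your install:

```shell
CORES_DIR=/usr/tideway/cores
# Date the newest core was written, if any core is present
core=$(ls -t "$CORES_DIR"/core.* 2>/dev/null | head -n 1)
if [ -n "$core" ]; then
    date -r "$core" '+%Y-%m-%d'      # -r prints the file's modification date
else
    echo "no core files found in $CORES_DIR"
fi

# Then archive the logs for that date for the support case, e.g.:
# tar czf /var/tmp/logs_for_case.tgz \
#     /usr/tideway/log /var/log/sa /var/log/messages*
```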
Known root causes and solutions:
Root cause 1: Datastore corruption
This is the most frequent cause and typically leads to frequent cores/crashes. For this root cause, the cores are always generated by the python model process. To confirm this cause and attempt a repair (not always possible), see KA 000134382.
Root cause 2: Hyper-threading is enabled on an HP server
In one case (an HP ProLiant DL380 Gen9), the model process was generating cores during malloc calls. There were other symptoms, but db_verify reported no corruption (see root cause 1 above).
Workaround 2: Not confirmed, but root cause 2 appears to occur less often when the model process is started with the option "-ORBmaxClientThreadPoolSize 500" added in data/installed/startup/15model.
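Before editing the startup file by hand, it can help to check whether the option is already present. The full path below is an assumption, expanding the article's relative "data/installed/startup/15model" under the default /usr/tideway install root:

```shell
# Assumed full path for the article's relative "data/installed/startup/15model"
STARTUP=/usr/tideway/data/installed/startup/15model
if [ -f "$STARTUP" ]; then
    grep -n 'ORBmaxClientThreadPoolSize' "$STARTUP" || echo "option not set yet"
else
    echo "$STARTUP not found - run this on the appliance"
fi
```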
Solution 2: Disable hyper-threading.
Root cause 3: Memory management errors
In this case, cores are typically dumped while the process is trying to allocate memory (malloc). A db_verify does not report any errors, and the crashes/cores occur less frequently than with root cause 1. These cores can be dumped because the OS sent signal 6 (SIGABRT) to the process, or because the process hit a segmentation fault. In some cases the cores were dumped while the OS was swapping intensively. There is no known solution for root cause 3 at this time, but the information requested above may allow BMC to find one.
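Because root cause 3 correlates with heavy swapping, checking the swap counters around the crash time can help. A sketch using the sysstat tools (the sa15 file name is illustrative; pick the saved day matching the core's creation date):

```shell
# Swap space utilisation for a saved day of SAR data (file name is illustrative)
if command -v sar >/dev/null 2>&1 && [ -f /var/log/sa/sa15 ]; then
    sar -S -f /var/log/sa/sa15
else
    echo "sar data not available here"
fi

# Live swap-in/swap-out (si/so columns) as a quick spot check
if command -v vmstat >/dev/null 2>&1; then
    vmstat 1 2
fi
```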
Root cause 4: CMDB Sync creates core dumps in the model process. This impacts Discovery 11.2.x and 11.3.x prior to 11.3.04 when CMDB Sync is used. For more details, see KA 000161376.
Solution 4: Upgrade to 11.3.04 or later.
Root cause 5: A RHEL 6 OS upgrade was executed on an appliance running CentOS 6.
Solution 5: If possible, restore the appliance from a VM snapshot taken before the OS upgrade, then apply the correct upgrade for the installed OS.
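A one-line sanity check before any OS upgrade, to make sure the RHEL procedure is not applied to a CentOS appliance:

```shell
# Both CentOS and RHEL record their identity in this file
if [ -f /etc/redhat-release ]; then
    cat /etc/redhat-release    # e.g. "CentOS release 6.x (Final)"
else
    echo "/etc/redhat-release not found"
fi
```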
Root cause 6: A python 2.7 defect: https://bugzilla.redhat.com/show_bug.cgi?id=1561170
In this case, the cores are caused only by segmentation faults. This was diagnosed once so far (May 2020); it caused only one core and was not reproduced in the following two months. The core was generated by the "options" process, but the defect can probably impact other python processes.
Solution 6: Upgrade to Discovery 12.0, which uses python 3 and contains the fix for this defect.
Workaround 6 (not verified): Reducing the load may reduce the frequency of this issue.