BMC Discovery scan stuck at 99%
I have logged a case with BMC Tech Support, but I am also posting it in the community to get more help.
BMC Tech Support is suggesting two approaches to resolve this situation:
- Reset everything (model wipe), which amounts to an altogether new installation. The only consequence is that the Hash ID of the hosts (which we use as the unique ID for ServiceNow records) will change (BMC Tech Support is investigating this further); I will be able to take care of everything else.
- Repair the broken chain; this will take a long time (approx. 2-3 months).
Here is the whole summary:
We discussed the root causes listed in the following knowledge article and concluded that "Root Cause 2" matches our use case:
Discovery: A scan (Discovery Run) never ends, it is stuck and does not complete
Root cause 2: Some Discovery Access chains are broken
If the unfinished endpoints are rescanned, it does not always reproduce the issue.
To confirm the root cause, execute the query below. If it returns any nodes, the root cause is confirmed:
SEARCH DiscoveryAccess WHERE NODECOUNT(TRAVERSE Next:Sequential:Previous:DiscoveryAccess) = 0 AND _first_marker NOT DEFINED AND NOT _doomed show endpoint, state, result, end_state, reason, device_summary, discovery_starttime, discovery_endtime, is_being_held, _first_marker, id(#)
Solution 2A: Reset the datastore, reinstall, or restore a backup
Solution 2B: Contact Customer Support to ask about a repair script. It's recommended to check with db_verify that there are no other integrity problems before trying to repair.
>> By resetting the datastore, they mean running tw_model_wipe. The tw_model_wipe utility enables you to delete all data in the datastore. The utility does not delete configuration data held outside the datastore. However, some configuration information is held in the datastore and is lost when running tw_model_wipe. This is:
Scheduled scan ranges
CAM (saved queries, groups and subgroups, named values, functional component definitions, and generated patterns)
SAAM Model definitions
DDD Removal blackout windows
Configuration as a scanner or consolidator
Registered Windows proxies
Saved Queries (on the Reports page)
CMDB Sync configuration (both connections and filters)
For reference, please see documentation link: https://docs.bmc.com/docs/display/DISCO111/tw_model_wipe
See also the knowledge article "Best practices for steps to take before and after running tw_model_wipe or tw_model_init to reset the Discovery datastore".
The second approach we discussed is repairing the broken chains with a repair script. We can try to resolve the issue this way; however, it would take more time.
There should be some other way out of this situation.
Our ITSM tool (ServiceNow) depends heavily on this BMC Discovery data and uses it as its CMDB. If we perform a model wipe, we will afterwards also need to remove all the duplicate data from ServiceNow.
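One way to limit the ServiceNow cleanup after a model wipe might be to match hosts from before and after the wipe on attributes that do not change, and use that to map each old Hash ID to its new one instead of creating duplicates. Below is a minimal Python sketch of that idea. It assumes the host lists were exported before and after the wipe as rows with "hostname", "serial" and "hash_id" fields; these field names, the sample values and the function name build_id_remap are hypothetical and are not part of any BMC Discovery or ServiceNow API.

```python
def build_id_remap(old_rows, new_rows, key_fields=("hostname", "serial")):
    """Map old Hash ID -> new Hash ID for hosts that share the same stable key.

    old_rows / new_rows are lists of dicts (hypothetical export format).
    Returns (remap, unmatched), where unmatched lists old Hash IDs that have
    no unique counterpart in the new export and need manual review.
    """
    def key(row):
        # Normalise case and whitespace so "WEB01" and "web01 " compare equal.
        return tuple(row[f].strip().lower() for f in key_fields)

    new_by_key = {}
    ambiguous = set()
    for row in new_rows:
        k = key(row)
        if k in new_by_key:
            # Two new hosts share the same key: do not guess which one is right.
            ambiguous.add(k)
        new_by_key[k] = row["hash_id"]

    remap, unmatched = {}, []
    for row in old_rows:
        k = key(row)
        if k in new_by_key and k not in ambiguous:
            remap[row["hash_id"]] = new_by_key[k]
        else:
            unmatched.append(row["hash_id"])
    return remap, unmatched


# Made-up sample data, for illustration only.
old_export = [
    {"hostname": "web01", "serial": "SN-1", "hash_id": "old-aaa"},
    {"hostname": "db01", "serial": "SN-2", "hash_id": "old-bbb"},
    {"hostname": "gone01", "serial": "SN-9", "hash_id": "old-ccc"},
]
new_export = [
    {"hostname": "WEB01", "serial": "sn-1", "hash_id": "new-111"},
    {"hostname": "db01", "serial": "SN-2", "hash_id": "new-222"},
]

remap, unmatched = build_id_remap(old_export, new_export)
print(remap)      # old Hash ID -> new Hash ID for uniquely matched hosts
print(unmatched)  # old Hash IDs that still need manual handling
```

The resulting mapping could then drive an update of the existing ServiceNow records rather than a delete-and-recreate. Whether hostname plus serial number is stable and unique enough in a given environment is an assumption that would need checking first.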
I'm looking for help in finding other ways to handle this situation.