This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.
BMC Discovery 11.3
The number of Discovery Accesses (DAs) eligible for removal does not reduce to zero. To check this, go to Administration > Performance > DDD Removal graph and observe the progress of the red line.
In this case, the datastore size will be larger than it should be, and can result in performance issues.
The following example chart shows roughly 3.75 million DAs eligible for removal, with the number not decreasing. A backlog of this size typically causes a measurable performance impact.
The datastore file *_nDroppedEndpoints_pidx (in /usr/tideway/var/tideway.db/data/datadir) can grow very quickly when this happens. It is more likely when Discovery frequently scans a large number of IP addresses (more than 500K per day) that do not respond, generating large numbers of DroppedEndpoint nodes (each holding many IPs). The conditions described in the solution section below cause some DAs to not be deleted. This prevents the related Discovery Run from being deleted, which in turn prevents the DroppedEndpoint nodes from being deleted.
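One quick way to gauge the disk impact is to check the size of the *_nDroppedEndpoints_pidx files on the appliance. A minimal sketch, assuming shell or script access to the appliance as a user that can read the datastore directory (the helper name is illustrative, not a BMC tool):

```python
import glob
import os

# Datastore directory from the article; adjust if your datastore lives elsewhere.
DATADIR = "/usr/tideway/var/tideway.db/data/datadir"

def dropped_endpoint_index_sizes(datadir=DATADIR):
    """Return {filename: size_in_bytes} for the *_nDroppedEndpoints_pidx files."""
    return {
        os.path.basename(path): os.path.getsize(path)
        for path in glob.glob(os.path.join(datadir, "*_nDroppedEndpoints_pidx"))
    }

if __name__ == "__main__":
    for name, size in sorted(dropped_endpoint_index_sizes().items()):
        print(f"{name}: {size / 1024**2:.1f} MiB")
```

If these files are large and still growing while the DDD Removal backlog is flat, the symptom described above is likely present.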
Once this issue is fixed (by whichever method), it is recommended to:
- wait until the red and orange waves in the Administration > Performance > DDD Removal graph converge towards 0. This can take more than a day, depending on the volume and the performance of the appliance.
- run a compaction. The disk overhead caused by this issue will not be released until then.
Root cause 1: Some Discovery Access chains are missing the _first_marker. This condition may be caused by model process crashes resulting from an Out of Memory condition when using SAAM models (DRUD1-23551).
To confirm this root cause, execute the query below. If it returns any nodes, the root cause is confirmed.
SEARCH DiscoveryAccess WHERE NODECOUNT(TRAVERSE Next:Sequential:Previous:DiscoveryAccess) = 0 AND _first_marker NOT DEFINED AND NOT _doomed show endpoint, state, result, end_state, reason, device_summary, discovery_starttime, discovery_endtime, is_being_held, _first_marker, id(#)
Workaround 1: Contact Customer Support to ask about a repair script.
Solution 1: Upgrade to a version that contains the fix for defect DRUD1-23827. As of March 2020, this is still in progress.
Root cause 2: Some Discovery Access nodes are missing the relationship to their Discovery Run and, as a result, are not aging out (DRUD1-16082).
To confirm this root cause, execute the query below. If it returns any nodes, the root cause is confirmed.
SEARCH DiscoveryAccess WHERE NOT _doomed AND NODECOUNT(TRAVERSE Member:List:List:DiscoveryRun) = 0 show endpoint, state, result, end_state, reason, device_summary, discovery_starttime, discovery_endtime, is_being_held, _first_marker, _last_marker, id(#)
(Note: the "NOT _doomed" clause is included because it is acceptable for doomed DAs to not have a _first_marker.)
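Either confirmation query can also be run programmatically instead of in the UI. A minimal sketch using Python's standard library, assuming the appliance's REST API is enabled and that a POST to /api/v1.1/data/search with a JSON body of the form {"query": ...} is accepted (the endpoint path, API version, and token handling are assumptions — check the API documentation for your Discovery version):

```python
import json
import urllib.request

def build_search_request(appliance, token, query):
    """Build a POST request for the assumed /api/v1.1/data/search endpoint."""
    return urllib.request.Request(
        appliance + "/api/v1.1/data/search",
        data=json.dumps({"query": query}).encode(),
        headers={
            # Bearer-token authentication is an assumption; adapt to your setup.
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Confirmation query for root cause 2, taken from this article.
QUERY = (
    "SEARCH DiscoveryAccess WHERE NOT _doomed "
    "AND NODECOUNT(TRAVERSE Member:List:List:DiscoveryRun) = 0 "
    "show endpoint, state, result, end_state, reason, id(#)"
)

if __name__ == "__main__":
    # Hypothetical appliance URL and token -- replace with your own values.
    req = build_search_request("https://discovery.example.com", "your-api-token", QUERY)
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read()))
```

If the response contains any result rows, the root cause is confirmed, just as with the UI query.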
Workaround 2: Contact Customer Support to ask about a repair script.
Solution 2: Upgrade to a version that contains the fix for defect DRUD1-16082. As of March 2020, this is still in progress.