Issues

Version 2

    Links to Other Chapters:  ADDM Support Guide

     

    Please edit this Support Guide and add content.
    These pages will only be valuable if everyone contributes.

     

     

    These are some issues that I documented for our mappers who were new to ADDM at the time, so some of them are pretty basic:

     

    Issue 1:  No Access – when Running Discovery Job

    Running a Discovery Job on the Consolidation Appliance shows the following results:

     

    adc1.png


    This is caused by the lack of credentials defined on the Consolidation Appliance.  The resolution is to run all your discovery jobs on the Discovery Appliances, not on the Consolidation Appliance.  New Application Patterns should still be defined on the Consolidation Appliance, but all discovery runs should run on the Discovery Appliances.

    Issue 2: Can’t Delete Application Instance

    There is no Destroy command for Application Instances.  This appears to be an oversight by BMC (or it may be by design).

    The workaround is to delete the associated Pattern.  This will delete the associated Application Instances.

    Issue 3: CAM - Subgroups are not imported

    When you Import the Application Map, you will find that all Subgroups defined during the Prototyping Phase are missing.

    I suspect the export mechanism was intentionally designed this way by BMC. When you import a Map definition (i.e. a Group), there is no guarantee that the destination ADDM Appliance has the same set of Software Instances, so the members of the subgroups might be absent.

    Issue 4: CAM - Groups Display very slowly

    • To resolve, click the option Only show Working Set

    Issue 5: CAM Map is "updating"

     

    • After editing a Map, the Map may appear to be updating… 

     

    Cam Map Updating.png

     

    • In addition, the button Generate TPL may be greyed out.
    • You must delete any Named Values (that are defined as identity) so that each hierarchy in your map has only one identity.

     

    Explanation

     

    • Only one identity named variable can exist within each Hierarchy in the Map.   If you delete one Named Value, the Map will stop updating.
    • This appears to be a bug in the CAM Module.  The CAM Module seems to be trying to tell you that only one identity is allowed, and it is waiting for you to fix the problem.

     

    Reasoning won't start

     

    Problem Summary

    ADDM Reasoning won't start on my DEV ADDM System.

     

    Problem Description

    I was writing a pattern to detect Hosts with a Lifecycle Status of "Retired".  The pattern created and linked a LifeCycleStatus Node, but it also updated the "description" attribute on the Host Node.  The pattern is triggered by the creation or confirmation of the Host Node.  This logic put the pattern into an infinite loop: the change to the Host Node triggered the pattern, which then updated the Host Node, which then triggered the pattern, and so on.

     

    I therefore stopped the Discovery and put the Appliance in and out of Maintenance mode and cancelled the test discovery run which triggered the whole problem.  This did not work.  The Appliance was still running with 100% CPU.  I therefore stopped all the tideway services and restarted them.  This did not work either.  Reasoning took forever to start and when it was started, CPU Utilization was at 100%.  After several more attempts to restart the tideway service, I am now in a situation where Reasoning fails to start.  The startup process gets stuck at the Reasoning step.  See below.

    CPU Utilization (according to top) is quiet - but the tideway service just won't start anymore.

    Problem Resolution

    Stop the Appliance

    This is how you can “reboot the appliance” from the command line:

      
    sudo /sbin/service tideway stop
    sudo /sbin/service omniNames stop
    sudo /sbin/service appliance stop

     

    Kill any remaining processes

    ps -ef | grep python

    kill -9 <pid>

     

    ## repeat above 2 steps until there are no more python processes
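    The two steps above can be scripted.  The following is a minimal sketch, not a BMC-supplied tool: the function names python_pids and kill_python_processes are hypothetical, and it assumes the Linux "ps -ef" column layout (PID in column 2, command from column 8 onward).  It kills every process whose command line mentions python, so it only makes sense on a dedicated appliance.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` output on stdin and print the PID of
# every process whose command (column 8 onward) mentions "python".
# The header line and the grep/awk helper processes are skipped.
python_pids() {
    awk 'NR > 1 && $8 != "awk" && $8 != "grep" {
        cmd = ""
        for (i = 8; i <= NF; i++) cmd = cmd " " $i
        if (cmd ~ /python/) print $2
    }'
}

# Hypothetical helper: repeat "list, then kill -9" until nothing is left.
# Gives up after 10 rounds so a process that cannot be killed does not
# turn this into an endless loop.
kill_python_processes() {
    attempts=0
    while pids=$(ps -ef | python_pids) && [ -n "$pids" ]; do
        attempts=$((attempts + 1))
        if [ "$attempts" -gt 10 ]; then
            echo "python processes still running: $pids" >&2
            return 1
        fi
        # $pids is deliberately unquoted so each PID becomes its own argument.
        kill -9 $pids 2>/dev/null
        sleep 1
    done
}
```

    Nothing runs until kill_python_processes is called explicitly, and only after the three services have been stopped.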

     

    Remove Files

    #### Optional step in case of emergency (This is the “hammer”)

    ### Only remove the “pq” files in case of dire emergency.

    ### this clears the “reasoning” cache … so reasoning will stop trying to do the rest of the discoveries in its queue

    ### also reasoning takes care of TKU activation, etc.  BE CAREFUL !!!!!

    ##### Really, don’t use this very often.

    cd /usr/tideway/var/persist/reasoning/engine/queue/

    rm -f *.pq
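    Because this step is destructive, it can be wrapped in a guard so that rm never runs in the wrong directory.  This is only a sketch: remove_pq_files is a hypothetical name, and the default path is the queue directory quoted above.  All the warnings above still apply.

```shell
#!/bin/sh
# Hypothetical helper: delete the *.pq files in the reasoning queue directory.
# Refuses to do anything if the directory does not exist, and touches only
# regular files ending in .pq directly inside that directory.
remove_pq_files() {
    queue_dir="${1:-/usr/tideway/var/persist/reasoning/engine/queue}"
    if [ ! -d "$queue_dir" ]; then
        echo "no such directory: $queue_dir" >&2
        return 1
    fi
    count=0
    for f in "$queue_dir"/*.pq; do
        # If the glob matched nothing, $f is the literal pattern: skip it.
        [ -f "$f" ] || continue
        rm -f -- "$f" && count=$((count + 1))
    done
    echo "removed $count .pq file(s) from $queue_dir"
}
```

    Calling remove_pq_files with no argument uses the default directory; passing a directory as the first argument allows a trial run against a scratch copy first.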

     

    Restart the Appliance

    sudo /sbin/service appliance start

    sudo /sbin/service omniNames start

    sudo /sbin/service tideway start
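    The stop and start sequences mirror each other: services are stopped in the order tideway, omniNames, appliance, and started in the reverse order.  A small wrapper can keep that ordering in one place.  This is a sketch only: appliance_services and the SERVICE_CMD variable are hypothetical names, and the service names are the ones used in the commands above.

```shell
#!/bin/sh
# Command used to control a service; override it (for example with "echo")
# to preview what would be run without touching the appliance.
SERVICE_CMD="${SERVICE_CMD:-sudo /sbin/service}"

# Hypothetical helper: stop or start the three services in the documented order.
appliance_services() {
    case "$1" in
        stop)  order="tideway omniNames appliance" ;;
        start) order="appliance omniNames tideway" ;;
        *)
            echo "usage: appliance_services stop|start" >&2
            return 2
            ;;
    esac
    for svc in $order; do
        # $SERVICE_CMD is unquoted on purpose so "sudo /sbin/service" splits
        # into a command and its first argument.
        $SERVICE_CMD "$svc" "$1" || return 1
    done
}
```

    With SERVICE_CMD=echo, "appliance_services stop" just prints the three service/action pairs in order, which is a cheap way to check the sequence before running it for real.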

     

    Wrong FQDN on Windows Host

     

    There may be a hidden Network Device that confuses ADDM – especially if the host is created using the P2V (Physical to Virtual) process.

     

    Remove Unused Devices

     

    To view unused devices in Device Manager, do the following:

     

    1. Open a Command Prompt by going to Start > Run and typing "cmd" (without quotes).
    2. At the prompt, type the following lines, pressing Return after each:

    set devmgr_show_nonpresent_devices=1
    devmgmt.msc

    3. In Device Manager, select View > Show hidden devices.

     

    ADDM got confused by the unused devices.  I suspect that the original Physical host was in a different domain.

     

    To remove unused devices, select the device and press the Del key.  This will uninstall the device driver.

    The step "Remove Unused Devices" should be included in the P2V process for older Windows OS versions.

    Darkspace Parameter

    RE:  Darkspace Parameter in Administration -> Model Maintenance

     

    I recently reset my Model Maintenance parameters back to their defaults because we are having issues with duplicate Host node creation on Solaris.  What I did not realize is that ADDM would keep a huge number of extra DA Nodes for every NoResponse IP address.  Our consolidation server is now having serious issues with keeping up.  The impact on performance is serious.  I have reset the Darkspace parameter back to "Remove All".  I am going to run tw_darskpace_remove tonight.  I am not 100% sure that this was the issue, but it is looking very likely.  Are most people already aware of the impact of this parameter?  Shouldn't "Remove All" be the default?  Are there any advantages at all in setting it to "Keep Most Recent"?

    Reply from Andrew Waters

    Keep Most Recent means that your most recent scan gives an indication that all the endpoints were scanned and that some of them gave no response.

     

    ADDM 10.1 has removed this option and handles dark space differently <http://discovery.bmc.com/confluence/display/101/Dark+space+scanning>. It records information in DroppedEndpoints nodes associated with the run and does not build dark space DAs which then need to be managed.

     

     

    Darkspace Parameter

     

    I sent this request internally within ANZ:

     

    A) Darkspace Parameter

    Please set the parameter Administration -> Model Maintenance -> Darkspace to the value "Remove All" on all ADDM Appliances.

    I noticed that the Administration -> Model Maintenance -> Darkspace parameter is set to "Keep Most Recent" on the Full Discovery appliances.  It is currently set correctly on the Sweep Scanners but not on the Discovery Appliances or the Consolidation Appliances.

    This option can slow down the scanning process and increase scan times by a factor of 2 or more.
    For example, the NZ datacentre scan takes 6.5 hours (with 16 cores on this Physical appliance).

    With the Option set to "Keep Most Recent", this time extends to 24 to 25 hours.
    "Keep Most Recent" keeps DA Nodes (records) in the database for every IP-Address scanned - even if nothing was found.  This bloats out the database and causes significant performance degradation. The Option needs to be set to "Remove All" on all Scanning Appliances (Full Scanners, Sweep Scanners and Consolidation Appliances).

     

    B)  tw_remove_darkspace

    Please also run tw_remove_darkspace to remove darkspace records on all ADDM Appliances.  It is best to suspend all scans before running tw_remove_darkspace.  It may take 24 hours for this utility to complete on each scanner.  I noticed that some Sweep Scan jobs have been run on Full Discovery scanners, so the databases on these devices may have a lot of darkspace DA Nodes.  It is therefore best to run tw_remove_darkspace on all appliances.  Once the database is back to its minimum size, scans will perform better.

     

     

     

     

     
