2 Replies Latest reply on Apr 25, 2017 4:02 AM by Duncan Grisby

    Query Generator for SAAM and CAM

    Doug Connell

      I created a simple Query Generator for SAAM using a spreadsheet (Template - Query Generator.xlsx). This spreadsheet works as follows:


      1. You paste in a list of hosts.  Most of the application Support teams here can easily give you a list of hosts on which their application runs.  A list of hosts is the most easily obtainable "seed data".
      2. The spreadsheet then generates this query which lists all the Software Types on these hosts:



      //Software Types

      SEARCH Host WHERE name in ['host1','host2','host3','host4','host5','host6','host7','host8','host9','host10','host11','host12','host13','host14','host15','host16','host17','host18','host19','host20','host21','host22','host23','host24'] traverse Host:HostedSoftware:RunningSoftware:SoftwareInstance SHOW type PROCESS WITH countUnique(0)



         3.   You then select the types of software that you want. The spreadsheet then generates this query:



      //Software Instances

      SEARCH Host WHERE name in ['host1','host2','host3','host4','host5','host6','host7','host8','host9','host10','host11','host12','host13','host14','host15','host16','host17','host18','host19','host20','host21','host22','host23','host24'] traverse Host:HostedSoftware:RunningSoftware:SoftwareInstance WHERE type IN ['Apache Derby Database Engine','Apache Webserver','IBM FileNet Image Services','IBM Sterling Connect:Direct Server','IBM WebSphere Application Server','IBM WebSphere Application Server Node agent','McAfee VirusScan','Microsoft IIS Webserver','Oracle Database Server']



         4.  This query then gives you a starting point for SAAM with only the most relevant SIs on the screen.
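
For anyone without the spreadsheet, the same two queries can be built with a few lines of Python. This is only a minimal sketch of the string assembly described in the steps above, not the spreadsheet's actual formulas. The function names are hypothetical, and values containing a single quote are rejected rather than escaped, because the escaping rules are not covered in this thread.

```python
# Traversal used by both queries in the post above.
TRAVERSAL = "Host:HostedSoftware:RunningSoftware:SoftwareInstance"


def quoted_list(values):
    """Render values as ['a','b',...], dropping blanks and duplicates."""
    cleaned = []
    for value in values:
        value = value.strip()
        if not value:
            continue  # skip empty rows pasted from a spreadsheet
        if "'" in value:
            # Assumption: we refuse such values instead of guessing
            # how the query language escapes quotes.
            raise ValueError(f"unsupported quote in value: {value!r}")
        if value not in cleaned:
            cleaned.append(value)
    return "[" + ",".join(f"'{v}'" for v in cleaned) + "]"


def software_types_query(hosts):
    """Step 2: list the distinct software types on the given hosts."""
    # PROCESS WITH is written here as two keywords.
    return (
        f"SEARCH Host WHERE name in {quoted_list(hosts)} "
        f"traverse {TRAVERSAL} SHOW type PROCESS WITH countUnique(0)"
    )


def software_instances_query(hosts, types):
    """Step 3: restrict the traversal to the selected software types."""
    return (
        f"SEARCH Host WHERE name in {quoted_list(hosts)} "
        f"traverse {TRAVERSAL} WHERE type IN {quoted_list(types)}"
    )


if __name__ == "__main__":
    hosts = ["host1", "host2", "host3"]
    print(software_types_query(hosts))
    print(software_instances_query(hosts, ["Apache Webserver", "Oracle Database Server"]))
```

Run the first query, pick the types you care about from its results, then feed that selection into the second function.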


      What do you think? Is this useful?

        • 1. Re: Query Generator for SAAM and CAM
          Doug Connell

          I have spent three intensive weeks analysing CAM and SAAM for a training course.  There is no substitute for Visualizer, the Query Language and a good brain.  My attempt to create a formula or spreadsheet for the task was not the right approach.  It can't be done.  The best advice I can give is some rules of thumb and examples:


          1. Start with the database - search DatabaseDetail
          2. For the application server layer - search for SIs - but if you find none look at processes and services
          3. For the web layer - search Software Components
          4. Beware of using Observed Communication.  The application may not have been developed with permanent connection pools, so on the next scan the communications may be gone, causing the map to flip-flop.


          There are also some rules of thumb around performance.  For example, "ends with" can be very slow, so you need to make sure your regexes are optimized.  I am still trying to figure out what is indexed in ADDM and what is not.  The documentation is good, but scant in certain areas such as how ADDM uses hash keys, b-trees and regex.  For example, "contains word" is just a regex in TPL with a look-ahead and a look-behind assertion.  Bad regexes can be slow.  I am just testing an Advanced Query regex for the string "fred$".  I only selected DiscoveredProcesses.  We have 5 million.  My search is still running after 43 minutes!  I will do a post about this.

          • 2. Re: Query Generator for SAAM and CAM
            Duncan Grisby

            Exactly. There cannot be a single "right formula" for finding the parts of an application. Different applications can be very different from each other, so the best way to find their constituent parts and model them within Discovery varies too. Clearly, if you have a large number of in-house developed applications that all have the same basic structure, then it makes sense to have a common approach to modelling them, but for the general case of diverse applications, it is counter-productive to try to limit the approach.


            On the subject of performance, it is important to distinguish between finding nodes in the first place, and filtering nodes when you already have some. When finding an initial set of nodes, the data store uses its indexes; when filtering existing nodes it does not. Absolutely all the data is indexed in a full-text word/phrase index. String values are also indexed by hash. That means that if you are searching for nodes, it is fast to do exact equality tests and to do subword tests. Substring tests are worse, but can often use the word indexes too. A very few special cases of regular expressions are handled as substring or subword tests, but in general regular expressions involve retrieving all the data for the attribute(s) in question, and testing against the regex.


            When you already have a set of nodes, either because a pattern is acting on some data, or because you have a search that filters the results of a traversal, the indexes are not used. The data store just directly retrieves the data it needs to evaluate the filtering expressions.
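
To make the distinction concrete, here is a toy model in Python. It is emphatically not how the Discovery data store is implemented; the class and method names are invented for illustration. It only mimics the behaviour described above: exact-equality and word lookups go through indexes, while a general regex has to read every value, and filtering an existing set reads only the values in that set.

```python
import re
from collections import defaultdict


class ToyStore:
    """Hypothetical in-memory store with a hash index and a word index."""

    def __init__(self, values):
        self.values = list(values)
        self.by_hash = defaultdict(set)  # exact string -> row ids
        self.by_word = defaultdict(set)  # lowercased word -> row ids
        self.reads = 0                   # values fetched without an index
        for row_id, value in enumerate(self.values):
            self.by_hash[value].add(row_id)
            for word in re.findall(r"\w+", value.lower()):
                self.by_word[word].add(row_id)

    def find_equal(self, value):
        """Finding by equality: one hash lookup, no values read."""
        return set(self.by_hash.get(value, ()))

    def find_word(self, word):
        """Finding by whole word: one word-index lookup, no values read."""
        return set(self.by_word.get(word.lower(), ()))

    def find_regex(self, pattern):
        """Finding by general regex: every value is read and tested."""
        return self.filter_regex(range(len(self.values)), pattern)

    def filter_regex(self, row_ids, pattern):
        """Filtering an existing set: only those rows are read and tested."""
        compiled = re.compile(pattern)
        hits = set()
        for row_id in row_ids:
            self.reads += 1
            if compiled.search(self.values[row_id]):
                hits.add(row_id)
        return hits


if __name__ == "__main__":
    store = ToyStore(["run fred", "fred daemon", "alfred", "java -jar app.jar"])
    candidates = store.find_word("fred")            # index lookup, 0 reads
    print(candidates, store.reads)
    print(store.filter_regex(candidates, r"fred$"), store.reads)  # 2 reads
    print(store.find_regex(r"fred$"), store.reads)  # reads all 4 values
```

In this toy, narrowing with a word lookup first and then applying the regex to the small result reads two values, while the bare regex search reads all of them. With millions of process nodes that difference is the whole story.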
