5 Replies. Latest reply: Mar 16, 2011 2:01 PM by Carl Gerlach

Using a CSV File for Server IP Addresses - Need to track failures by IP

Carl Gerlach

Hello everyone:

 

I created a basic login / logout script to cycle through the 20 Server IP addresses I have that are in a cluster of application servers that support a URL request to http://ourapplication.company.com.

 

As my BDL scripting skills have improved (not by much, I'm sorry to say), I decided to use a CSV file that contains the IP addresses for the servers and simply execute the script once every 15 minutes:

 

transaction TMain

var
  hWeb0 : number;

begin

  // Place file name from Project attributes into file name parameter.
  sFileName := s_filename;

  // Load the CSV file and count its rows.
  FileCSVLoad(hFile, sFileName);
  nMaxRows := FileGetNumRows(hFile);

  // START of FOR Loop: one pass per server IP address
  For nRow := 1 to nMaxRows do

    FileGetNextRow(hFile);
    sServerIPAddress := FileGetCol(hFile, 1, MAX_LEN);
    Server_Address := ("http://" + sServerIPAddress + "/");

    .... {rest of script}

  end; // END of FOR Loop

end TMain;

 

In general, this script worked OK except that I have received the following request:

  1. Test all IP addresses every 10 minutes. (The "Try Script" run takes 8 minutes and 25 seconds, but I need to simulate an "Average User", so a full run can take longer than 10 minutes, sometimes up to 12.)
  2. Send an email "Warning" to support staff if the script fails once in a series for an IP Address
  3. Send an email "Critical" to support staff if the script fails twice in a series for an IP address

 

The "key" issue with 2 and 3 is that the IP address sent in the email for #2 above may not be the same IP address sent in the email for #3 above. While knowing that two IP addresses failed in a series is of some value, it's not as valuable or as critical as knowing that the SAME IP address failed twice in a row.

 

I thought of two options to resolve this issue and would appreciate any additional ideas, thoughts etc that you might have.

 

Please note that I may be able to verbalize what I would like to do, however programming it may be a whole different story.

 

Option 1: After the Login/Logout portion of the script completes, but before END TMain, check whether an error was raised (any error at this point) and, if so, execute the Login/Logout portion of the script again for the SAME IP address. If it fails the second time, send an alert.
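To make Option 1 concrete, here is a minimal sketch of the retry-once idea. It is written in Python rather than BDL, purely to show the control flow; `check_with_retry`, `run_check`, and the sample addresses are hypothetical names, and `run_check` stands in for the Login/Logout portion of the script.

```python
from typing import Callable, Optional


def check_with_retry(ip: str, run_check: Callable[[str], bool]) -> Optional[str]:
    """Run the check once; on failure, retry the SAME address once.

    Returns None if either attempt succeeds, otherwise an alert message.
    """
    if run_check(ip):
        return None
    # First attempt failed: retry the same address before alerting.
    if run_check(ip):
        return None
    return f"CRITICAL: {ip} failed twice in a row"


# Stand-in for the Login/Logout check: one hypothetical address always fails.
def fake_check(ip: str) -> bool:
    return ip != "10.0.0.2"


alerts = []
for ip in ["10.0.0.1", "10.0.0.2"]:
    message = check_with_retry(ip, fake_check)
    if message is not None:
        alerts.append(message)
```

One trade-off with this approach: every failing address is checked twice within the same run, so a failure makes that run longer.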

 

Option 2: Same as Option 1, except that instead of running the Login/Logout portion of the script again, I would update a new field called nStatus in the same row as the IP address in the CSV file. I would add 1 to nStatus if an error was raised and set nStatus to 0 if not. I would then apply IF-THEN-ELSE logic: IF nStatus = 1, send a Warning email; IF nStatus = 2, send a Critical email; ELSE go to END.
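Option 2 amounts to keeping a per-address count of consecutive failures between runs. The following is a rough sketch of that bookkeeping in Python (not BDL), assuming a two-column CSV of `ip,nStatus`; the helper names and sample addresses are hypothetical, and the CSV is held in a string here so the example stays self-contained.

```python
import csv
import io


def load_rows(csv_text):
    """Parse 'ip,nStatus' lines into a list of [ip, nStatus] rows."""
    return [row for row in csv.reader(io.StringIO(csv_text)) if row]


def dump_rows(rows):
    """Serialize rows back to CSV text, one row per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def update_status(rows, passed):
    """Reset nStatus to 0 on success, add 1 on failure, and collect alerts.

    passed maps each ip to True if this run's check succeeded.
    """
    updated, alerts = [], []
    for ip, status in rows:
        n = 0 if passed[ip] else int(status) + 1
        if n == 1:
            alerts.append(("Warning", ip))
        elif n >= 2:
            # Assumption: keep sending Critical while the failures continue.
            alerts.append(("Critical", ip))
        updated.append([ip, str(n)])
    return updated, alerts


# Two consecutive runs in which the same hypothetical address fails.
state = "10.0.0.1,0\n10.0.0.2,0\n"
results = {"10.0.0.1": True, "10.0.0.2": False}

rows, first_alerts = update_status(load_rows(state), results)
state = dump_rows(rows)
rows, second_alerts = update_status(load_rows(state), results)
state = dump_rows(rows)
```

Because the counter is stored per row, the Warning and the Critical alert are guaranteed to refer to the same IP address. The sketch uses `n >= 2` rather than `n = 2` so that a third consecutive failure is still reported; that is a design choice, not part of the original description.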

 

Am I making too much out of this? Is there an easier way? Is Option 2 possible within the confines of a BDL script?

 

Any thoughts, ideas, comments would be greatly appreciated.

 

Thank you.

 

 

Carl

  • 1. Re: Using a CSV File for Server IP Addresses - Need to track failures by IP
    Adam Wemlinger

    Hi Carl,

     

    Sounds like you barely have time to run through the script before needing it to run again. If you have several execution servers, you could divide the targets among them so each location is doing only a portion and thus gets done faster. You should be able to use the same project and just grab the agent's hostname to determine what file to use or use another column in the one file.
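The "another column in the one file" idea can be sketched as follows. This is Python rather than BDL, and it only illustrates the selection logic: the agent names (`exec01`, `exec02`), the addresses, and both helper functions are hypothetical, and how the script obtains its own agent hostname is left open.

```python
import csv
import io


def targets_for_agent(csv_text, agent):
    """Return the IPs whose second column names this agent (case-insensitive)."""
    reader = csv.reader(io.StringIO(csv_text))
    return [
        row[0]
        for row in reader
        if len(row) >= 2 and row[1].strip().lower() == agent.lower()
    ]


def split_round_robin(ips, agents):
    """Alternative without an extra column: deal the addresses out in turn."""
    plan = {name: [] for name in agents}
    for i, ip in enumerate(ips):
        plan[agents[i % len(agents)]].append(ip)
    return plan


# Hypothetical file: column 1 is the server IP, column 2 the agent that tests it.
csv_text = "10.0.0.1,exec01\n10.0.0.2,exec02\n10.0.0.3,exec01\n"
mine = targets_for_agent(csv_text, "EXEC01")
plan = split_round_robin(["a", "b", "c", "d"], ["x", "y", "z"])
```

With an explicit agent column, the assignment stays stable when servers are added; the round-robin variant needs no extra column but reshuffles assignments whenever the list changes.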

     

     

     

    As for catching the same server failure for the alarms, is this all done in TM ART, or do you feed this into ProactiveNet? In essence, it sounds like you need to create a unique measure for each server through the loop. I know this would work if you feed info to Pan and alert from there. Not sure about TM ART alone.

     

    // START of FOR Loop

    For nRow := 1 to nMaxRows do

      FileGetNextRow(hFile);
      sServerIPAddress := FileGetCol(hFile, 1, MAX_LEN);
      Server_Address := ("http://" + sServerIPAddress + "/");

      // Custom timer named after the server, so each IP gets its own measure.
      MeasureStart(Server_Address);

      .... {rest of script}

      MeasureStop(Server_Address);

    end; // END of FOR Loop

     

     

     

    Each target IP would then become an individual metric in ProactiveNet that can have thresholds set against it.

     

     

     

    -Adam

  • 2. Using a CSV File for Server IP Addresses - Need to track failures by IP
    Hal DeVore

    Carl,

     

    Nice work on the script.  I do have a caution for you.

     

    There is a global "dead script" timer that will summarily kill any monitor that runs more than 15 minutes.

     

    If you have a script that takes 12 minutes when everything is more or less normal, you are definitely at risk of having it go over 15 minutes if you get, for example, a host down so that you have to wait for timeouts.

     

    When the "dead script" timer is exceeded, the process that is running the monitor is killed.  No results or truelogs can be salvaged if this happens.  It is intended as a last resort for a monitor that is hopelessly stuck.
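To put rough numbers on that risk, here is a back-of-the-envelope sketch in Python. The 15-minute limit, the 20 servers, and the 12-minute run come from this thread; the 120-second timeout per unreachable host is purely an assumed figure for illustration, and the function name is hypothetical.

```python
DEAD_SCRIPT_LIMIT_S = 15 * 60  # 15-minute "dead script" limit from this thread


def estimated_runtime_s(n_servers, normal_s, n_down, timeout_s):
    """Estimate a serial run: healthy hosts take normal_s, down hosts timeout_s."""
    return (n_servers - n_down) * normal_s + n_down * timeout_s


# A 12-minute run over 20 servers is about 36 seconds per server.
per_server_s = (12 * 60) / 20

# ASSUMPTION: an unreachable host costs 120 seconds of timeouts.
assumed_timeout_s = 120.0

two_down = estimated_runtime_s(20, per_server_s, 2, assumed_timeout_s)
three_down = estimated_runtime_s(20, per_server_s, 3, assumed_timeout_s)
```

Under these assumptions, two unreachable hosts bring the run to 888 seconds, just under the 900-second limit, and a third pushes it to 972 seconds, which is over.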

     

    Is it necessary for all the servers to be tested by a single script so that each one is tested serially?

     

    --Hal

  • 3. Re: Using a CSV File for Server IP Addresses - Need to track failures by IP
    Hal DeVore

    Adam,

     

    The approach you describe will work perfectly well for TM ART without BPPM.

     

    --Hal

  • 4. Re: Using a CSV File for Server IP Addresses - Need to track failures by IP
    Carl Gerlach

    Thanks for your reply Adam!

     

    I have broken the file into "Environment Type" in order to keep the number of servers to a manageable level. What you suggested...

     

    "If you have several execution servers, you could divide the targets among them so each location is doing only a portion and thus gets done faster. You should be able to use the same project and just grab the agent's hostname to determine what file to use or use another column in the one file."

     

    ... sounds like a better alternative because we will continue to roll out the application to additional countries, thereby adding more servers.

     

    I have three execution servers available for this script. If I can bother you for some additional information, could you elaborate just a bit more on the "just grab the agent's hostname to determine what file to use or use another column in the one file"?

     

    Yes, this is all handled with TM ART. (We have an APM project this year in which we will be reviewing ProactiveNet along with a few other top vendors in the APM market.)

     

    Thanks for the suggestion to use a custom measure. This should do the trick for now.

     

     

    Thanks again for your reply, you have been very helpful.

     

    Warmest Regards,

     

    Carl

  • 5. Re: Using a CSV File for Server IP Addresses - Need to track failures by IP
    Carl Gerlach

    Thanks Hal,

     

    I recall reading a previous discussion thread some time ago about the global "Dead Script" timer. I completely forgot about it.

     

    "Is it necessary for all the servers to be tested by a single script so that each one is tested serially?"

     

    No, not necessarily. The goal is to test all of the servers every 10 minutes. The idea of using a CSV file was to simplify the maintenance of the script.

     

    Warmest Regards,

     

    Carl