
Server Automation

3 Posts authored by: Bill Robinson Moderator

This seems to be a fairly common case: you want to run a job from the blcli, wait for it to finish, and then do something with the result.  You can use something like Job.executeJobAndWait, but that can be a problem if the job runs long enough for the blcli-to-appserver connection to be deemed idle.  The blcli sits there with no output, so you get no indication that anything is going on or that the connection is still alive (and that silence is what leads to the idle timeout).  Instead, you can start the job, get back something you can use to check the run status, and then loop over the status command until the job run has finished.

 

The first thing to do is to find the right blcli command to run.  There are a couple that look promising: Job.executeJobAndReturnScheduleID and Job.executeJobAndWaitForRunID.  We need a command that will output something we can use to get the running status, and I see a couple commands in the JobRun namespace that look like they will do that: JobRun.getJobRunStatusByScheduleId and JobRun.getJobRunIsRunningByRunKey.  (One thing that's a little confusing, as we'll see later, is that executeJobAndWaitForRunID actually returns the jobRunKey, not the jobRunId.  That's ok because we can use the key with the getJobRunIsRunningByRunKey command.)  There are some other commands that let you 'execute against' a set of targets for that run and they also return the scheduleId or jobRunKey.  As long as the execution command returns the scheduleId, jobRunKey or jobRunId, it looks like we will be in good shape.  As you can tell, there is no single right blcli command.

 

We've found a couple different sets of commands that look like they will accomplish our goal of starting a job run and returning something that we can use to check status with another command.  We can use either command set.  The schedule one looks nice to me because it gives more information about status than running or not.  Let's look at that one first.  We need to get the jobKey, run it and get the scheduleId and then do a loop until the job returns a 'COMPLETE' status.  That's pretty straightforward:

 

blcli_execute NSHScriptJob getDBKeyByGroupAndName "/Workspace" "myJob"
blcli_storeenv JOB_KEY
blcli_execute Job executeJobAndReturnScheduleID ${JOB_KEY}
blcli_storeenv JOB_SCHEDULE_ID
JOB_STATUS="INCOMPLETE"
while [[ -n "${${JOB_STATUS//COMPLETE}//$'\n'}" ]]
     do
     sleep 10
     blcli_execute JobRun getJobRunStatusByScheduleId ${JOB_SCHEDULE_ID}
     blcli_storeenv JOB_STATUS
     echo "Schedule ${JOB_SCHEDULE_ID} status is: ${JOB_STATUS//$'\n'/ }"
done

 

There are some zsh-isms going on there I can explain: ${JOB_STATUS//$'\n'/ } replaces the new lines in the return of getJobRunStatusByScheduleId with spaces so the echo stays on one line.  ${${JOB_STATUS//COMPLETE}//$'\n'} removes the new lines and also removes the string COMPLETE from the output, and then the while test checks whether what is left is non-empty.  When the job completes it shows a status of COMPLETE only, so removing that string leaves an empty value and the loop ends.
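
To see what the while test is looking at, here is a minimal sketch that does the same stripping in two steps.  The function name status_remainder is made up for this example (it is not a blcli command), and the nested ${${...}} form is split in two so the snippet also runs in bash.

```shell
# status_remainder is a hypothetical helper, not a blcli command.
# It prints whatever is left of a status string once COMPLETE and new lines are removed.
status_remainder() {
    local remainder="${1//COMPLETE/}"    # drop every occurrence of COMPLETE
    remainder="${remainder//$'\n'/}"     # drop every new line
    printf '%s' "${remainder}"
}

status_remainder $'COMPLETE\n'   # prints nothing: the -n test fails and the loop ends
status_remainder "INCOMPLETE"    # prints IN: the -n test passes and the loop continues
```

Anything left over keeps the loop going, which is why the starting value of INCOMPLETE works as a "not done yet" marker.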

 

You could put a counter in there as well if you want to prevent it from getting stuck in the loop, so something like:

 

blcli_execute NSHScriptJob getDBKeyByGroupAndName "/Workspace" "myJob"
blcli_storeenv JOB_KEY
blcli_execute Job executeJobAndReturnScheduleID ${JOB_KEY}
blcli_storeenv JOB_SCHEDULE_ID
JOB_STATUS="INCOMPLETE"
COUNT=1
while [[ -n "${${JOB_STATUS//COMPLETE}//$'\n'}" ]] && [[ ${COUNT} -le 100 ]]
     do
     sleep 10
     blcli_execute JobRun getJobRunStatusByScheduleId ${JOB_SCHEDULE_ID}
     blcli_storeenv JOB_STATUS
     echo "Schedule ${JOB_SCHEDULE_ID} status is: ${JOB_STATUS//$'\n'/ }"
     let COUNT+=1
done
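
One thing the counter version does not tell you is why the loop ended: the job finished, or the count ran out.  A rough sketch of one way to separate the two is to move the polling into a function that returns non-zero when it gives up.  The name poll_until and its arguments are my own invention for this example, and true/false stand in for the real status check so it runs without an appserver.

```shell
# poll_until is a hypothetical helper: poll_until MAX_TRIES DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds; returns 1 if MAX_TRIES attempts all fail.
poll_until() {
    local max_tries="$1" delay="$2" count=1
    shift 2
    while ! "$@"; do
        if [ "${count}" -ge "${max_tries}" ]; then
            return 1                     # out of attempts: treat as a timeout
        fi
        count=$((count + 1))
        sleep "${delay}"
    done
    return 0
}

# Stand-in checks so the sketch runs anywhere.
poll_until 3 0 true  && echo "finished"
poll_until 3 0 false || echo "gave up after 3 tries"
```

In the real script the command passed in would be a small function that runs getJobRunStatusByScheduleId, stores the result and succeeds only when the status is COMPLETE.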

Now I might want to check if the job run had errors, dump the job run log items out, or use one of the Utility commands to export a job result or log to a file.  With the scheduleId approach I'll need to use a couple unreleased blcli commands to convert the schedule id into something I can use.

 

I can run something like this to convert the schedule id into a job run id or job run key:

blcli_execute JobRun findByScheduleId ${JOB_SCHEDULE_ID}
blcli_execute JobRun getJobRunId
blcli_storeenv JOB_RUN_ID

or

blcli_execute JobRun findByScheduleId ${JOB_SCHEDULE_ID}
blcli_execute JobRun getJobRunKey
blcli_storeenv JOB_RUN_KEY

 

Now let's look at the getJobRunIsRunningByRunKey version.  It's very similar:

blcli_execute Job executeJobAndWaitForRunID ${JOB_KEY}
blcli_storeenv JOB_RUN_KEY
JOB_IS_RUNNING="true"
while [[ "${JOB_IS_RUNNING}" = "true" ]]
        do
        sleep 10
        blcli_execute JobRun getJobRunIsRunningByRunKey ${JOB_RUN_KEY}
        blcli_storeenv JOB_IS_RUNNING
        echo "${JOB_IS_RUNNING}"
done

 

In a normally functioning environment these will more or less work the same.  A problem with the getJobRunIsRunning approach is that if the job doesn't start running, for example you have a job routing rule sending the job to a downed appserver, your appservers are maxed out on WorkItemThreads, or you've reached the MaxJobs threshold, then your executeJobAndWaitForRunID is going to sit waiting for the job to start running and you are back to the possible idle connection timeout happening.

 

There are other commands that look interesting in the JobRun namespace (released and unreleased).  The JobRun.showRunningJobs looks neat - it outputs a nice table - but it only shows the job name which is not enough to know if your job is actually running since you can have duplicate job names in different workspace folders. 

 

The above example is pretty simple.  This can be extended to manage multiple job runs - for example if you wanted to kick off a number of jobs at once and then wait until they were all complete, check their status, and then move on to some other action.  One approach to that would be some arrays to hold the job information and another loop around what I've done above.  That may be a future post.
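
As a rough idea of what that outer loop could look like, here is a sketch that keeps the still-running ids in an array and polls until the array is empty.  Everything in it is an assumption for illustration: wait_for_all, get_status, POLL_DELAY and the ids are made up, and get_status is a stand-in that would need to wrap the real status command.

```shell
POLL_DELAY="${POLL_DELAY:-10}"           # seconds between passes (assumed default)

# Hypothetical stand-in for the status lookup; in NSH this would wrap
# getJobRunStatusByScheduleId and blcli_storeenv as in the loops above.
get_status() {
    echo "COMPLETE"
}

# wait_for_all ID...: keep polling until no id is left pending.
wait_for_all() {
    local id
    local -a pending still_pending
    pending=("$@")
    while [ "${#pending[@]}" -gt 0 ]; do
        still_pending=()
        for id in "${pending[@]}"; do
            if [ "$(get_status "${id}")" != "COMPLETE" ]; then
                still_pending+=("${id}")     # not done yet, check again next pass
            fi
        done
        pending=("${still_pending[@]}")
        if [ "${#pending[@]}" -gt 0 ]; then
            sleep "${POLL_DELAY}"
        fi
    done
    return 0
}

# The ids here are placeholders, not real schedule ids.
wait_for_all 1001 1002 1003 && echo "all runs complete"
```

A counter like the one in the single-job loop would still be worth adding so one stuck job can't hold the whole thing up forever.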


The Utility.exportPatch* and related commands only pull out the patch analysis results.  That won't help if you want to look at the log messages to troubleshoot an error during the run.  For that you need to get the contents of the job run log.  There's no command to dump that (yet?).  Also it would make sense to only get the log messages for the servers that threw a warning or error, since you probably don't care about the successful ones.

 

The Patching Job is like a Batch Job - it contains the Analysis Job and possibly the Remediation and Download Jobs.  Since the Analysis Job is doing most of the work, that's where the information we need will be stored.  So we start by getting the latest run of a Patching Job and the associated Analysis Job Run (there should be only one, and we exclude the Download and Remediation runs), then find the failed targets for that run, pull out the log messages for each server and write them to a file.

 

You could also

  • Look at a particular Patching Job run instead of the latest, if you know the start date or you just ran it and have the run key.
  • Email off the file(s), collect them, or ship them to some other system.
  • Get the logs for all the servers in the job.
  • Get the top-level job run logs for the patching job and analysis job runs with 'LogItem.getLogItemsByJobRun' or 'JobRun.getLogItemsByJobRunId' (the second one actually calls the first).

 

 

 

#!/bin/nsh
blcli_setjvmoption -Dcom.bladelogic.cli.execute.quietmode.enabled=true
blcli_setoption serviceProfileName defaultProfile
blcli_setoption roleName BLAdmins
blcli_connect

patchingJob="/Workspace/Patching Jobs/RedHat Patching Job"
blcli_execute PatchingJob getDBKeyByGroupAndName "${patchingJob%/*}" "${patchingJob##*/}"
blcli_storeenv jobKey
blcli_execute JobRun findLastRunKeyByJobKey ${jobKey}
blcli_storeenv jobRunKey
blcli_execute JobRun jobRunKeyToJobRunId ${jobRunKey}
blcli_storeenv jobRunId
blcli_execute JobRun findPatchingJobChildrenJobsByRunKey ${jobRunId}
blcli_execute JobRun getJobRunId
blcli_execute Utility setTargetObject
blcli_execute Utility listPrint
blcli_storeenv patchChildJobRunIds
patchChildJobRunIds="$(awk 'NF' <<< "${patchChildJobRunIds}")"
while read patchChildJobRunId
        do
        blcli_execute JobRun findById ${patchChildJobRunId}
        blcli_execute JobRun getType
        blcli_storeenv jobRunType
        blcli_execute JobRun getJobKey
        blcli_storeenv analysisKey
        # skip the Remediation and Download job runs
        if [[ ${jobRunType} -ne 7033 ]] && [[ ${jobRunType} -ne 7031 ]]
         then
           blcli_execute JobRun getServersStatusByJobRun ${patchChildJobRunId}
           blcli_execute Utility mapPrint
           blcli_storeenv serverStatusMap
           serverStatusMap="$(awk 'NF' <<< "${serverStatusMap}")"
           while read serverStatus
                do
                     serverKey="$(awk '{print $1}' <<< "${serverStatus}")"
                     echo "ServerKey: ${serverKey}"
                     serverResult="$(awk '{print $3}' <<< "${serverStatus}")"
                     # if it is an error get the job run log
                     if [[ ${serverResult} -eq 1 ]] || [[ ${serverResult} -eq 2 ]]
                          then
                          blcli_execute Server findByDBKey ${serverKey}
                          blcli_execute Server getName
                          blcli_storeenv serverName
                          blcli_execute Server getServerId
                          blcli_storeenv serverId
                          blcli_execute LogItem getLogItemsByDevice ${analysisKey} ${patchChildJobRunId} ${serverId}
                          blcli_execute Utility listPrint
                          blcli_storeenv jobRunLog
                          echo "${jobRunLog}" > "/tmp/${patchChildJobRunId}-${serverName}"
                     fi
           done <<< "${serverStatusMap}"
        fi
done <<< "${patchChildJobRunIds}"
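
A note on the getDBKeyByGroupAndName line: the script never stores the group and the job name separately, it splits the full job path with two parameter expansions.  Shown on their own (the variable names jobGroup and jobName are only for this illustration):

```shell
patchingJob="/Workspace/Patching Jobs/RedHat Patching Job"

jobGroup="${patchingJob%/*}"     # strip the shortest trailing /... match: the group path
jobName="${patchingJob##*/}"     # strip the longest leading .../ match: the job name

echo "${jobGroup}"               # prints /Workspace/Patching Jobs
echo "${jobName}"                # prints RedHat Patching Job
```

That way only the one patchingJob variable needs to change to point the script at a different job.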

This has come up at a couple customers recently so I figured I'd post something about it.

 

Problem: You have a number of approved exceptions that should be set on components associated w/ a component template.  As new servers are added and components created, there is no mechanism to have exceptions applied to the new components so they are in place before the next compliance run.

 

Solution: While there is no built-in mechanism to do this in BSA currently this can be accomplished with some blcli and nsh scripting.

 

There's a couple parts to this - first is where to keep the exceptions.  A couple obvious places are the Property Dictionary or a text file.  So of course I'm going to use the Property Dictionary because BSA! but the text file is fine also.

 

So I have a custom property class called ApprovedExceptions that contains properties that define the exception (exceptionName, exceptionDescription, etc).  I ended up writing a script to create and populate it, more on that later.  In this class I created a sub-class for each Component Template that I need to set exceptions on, and under each sub-class I create an instance for each exception I need to set.  That's a pretty simple setup since I'm only working w/ the out-of-the-box component templates.  You might need to make something with more depth, or add some properties to better identify the template.  The point here was to make this as automated and self-resolving as possible.

I also define the Templates workspace folder my templates are in.  The script loops through all the templates in that folder and matches each one to the sub-class of the same name that lists its exceptions.  Once I know what template I'm looking at I grab all the exceptions I need to set.  Then I grab all the components for the template and loop through them.  I see what exceptions are already set, compare them to the exceptions that should be set, and if one is missing, set it.  New exceptions are new property instances that can be added in the Property Dictionary manually through the gui or via the blcli.
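
The compare step can be looked at on its own without any blcli involved.  This is a minimal sketch, not the code from the script: missing_exceptions is a made-up helper, and the exception names are sample values.

```shell
# missing_exceptions is a hypothetical helper: missing_exceptions EXISTING WANTED...
# EXISTING is a newline-separated list of exception names already on the component.
# Prints each WANTED name that does not appear as a whole line in EXISTING.
missing_exceptions() {
    local existing="$1" name
    shift
    for name in "$@"; do
        # -F fixed string, -x whole line, -q quiet: match exact names only
        if ! printf '%s\n' "${existing}" | grep -Fxq -- "${name}"; then
            printf '%s\n' "${name}"
        fi
    done
}

missing_exceptions $'exception-1\nexception-3' exception-1 exception-2
# prints exception-2
```

Matching on whole lines matters here, otherwise exception-1 would also count as a match for exception-1a.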

 

A few ideas:

  • Around line 80 instead of getting all the components for the template, just get the components created in the last few days and only process those.  If the default exception list is static that should reduce the processing time, especially in larger environments.  This could be done w/ the creation of a smart component group based on the template and the DATE_CREATED property.
  • Read the exceptions out of a file instead of pulling the exceptions out of the Property Dictionary.  Might work as an integration point with another compliance tracking system or CMDB.
  • Parameterize the script so it can process either just the newly created components or all of them.  Then the same script can handle newly created components or push out a new exception across the existing components.
  • Put this in a batch job that runs Discovery, this NSH Script and then Compliance so you know the Components always have the exceptions set before Compliance runs.
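
For the read-from-a-file idea, a sketch of the file side might look like the following.  It assumes a five-field csv (template name, exception name, description, comment, rule path/name) and that none of the first four fields contains a comma.  The helper name exceptions_for_template and the temp file are made up for the example.

```shell
# exceptions_for_template is a hypothetical helper: exceptions_for_template FILE TEMPLATE
# Prints "name|description|comment|rule" for every row of FILE whose first field is TEMPLATE.
exceptions_for_template() {
    local file="$1" wanted="$2"
    local templateName exceptionName exceptionDescription exceptionComment exceptionRuleName
    while IFS=, read -r templateName exceptionName exceptionDescription exceptionComment exceptionRuleName; do
        if [ "${templateName}" = "${wanted}" ]; then
            printf '%s|%s|%s|%s\n' "${exceptionName}" "${exceptionDescription}" \
                "${exceptionComment}" "${exceptionRuleName}"
        fi
    done < "${file}"
}

# Try it against a throwaway two-row file.
sample="$(mktemp)"
cat > "${sample}" <<'EOF'
CIS - Windows Server 2008 R2,exception-1,description-1,comment-1,/1.1.1.1 System Services/1.1.1.1.3 Configure IIS Admin Service
CIS - Windows Server 2012 R2,exception-2a,description-2a,comment-2a,/1 Account Policies/1.1.3 Set Minimum password age to 1 or more days
EOF
exceptions_for_template "${sample}" "CIS - Windows Server 2012 R2"
rm -f "${sample}"
```

Because the rule path is the last variable on the read, it picks up the rest of the line, so a comma in the rule name would survive.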

 

 

#!/bin/nsh

blcli_setjvmoption -Dcom.bladelogic.cli.execute.quietmode.enabled=true
blcli_setoption serviceProfileName defaultProfile
blcli_setoption roleName BLAdmins
blcli_connect

exceptionClass="Class://SystemObject/ApprovedExceptions"
exceptionFile="/tmp/exceptions.csv"
templateGroup="/CIS Compliance Content"


# get all the templates in the group
blcli_execute TemplateGroup groupNameToDBKey "${templateGroup}"
blcli_storeenv templateGroupKey
blcli_execute Template findAllByGroup ${templateGroupKey} true
blcli_execute Template getDBKey
blcli_execute Utility setTargetObject
blcli_execute Utility listPrint
blcli_execute Utility setTargetObject
blcli_storeenv templateKeys
templateKeys="$(awk 'NF' <<< "${templateKeys}")"

blcli_execute Utility getCurrentRole
blcli_storeenv rbacRole
blcli_execute Utility getCurrentUserName
blcli_storeenv rbacUser


# for each template, get the exception name and other info
while read templateKey
        do
        blcli_execute Template findByDBKey ${templateKey}
        blcli_execute Template getName
        blcli_storeenv templateName
        blcli_execute Template getGroupId
        blcli_storeenv templateGroupId
        blcli_execute Group getQualifiedGroupName 5008 ${templateGroupId}
        blcli_storeenv templateGroupPath
        echo "Processing ${templateGroupPath}/${templateName}..."
        blcli_execute PropertyClass isPropertyClassDefined "${exceptionClass}/${templateName}"
        blcli_storeenv classExists

        if [[ "${classExists}" = "false" ]]
                then
                echo "   Cannot find exception class ${exceptionClass}/${templateName}, will not set exceptions for ${templateName}..."
                # skip this template only and move on to the next one
                continue
        fi

        # get the instance (exception) names in the class (for the template)
        blcli_execute PropertyClass listAllInstanceNames "${exceptionClass}/${templateName}"
        blcli_storeenv instanceNames
        instanceNames="$(awk 'NF' <<< "${instanceNames}")"

        typeset -a exceptionNames
        typeset -A exceptionDescriptions
        typeset -A exceptionComments
        typeset -A exceptionRuleNames

        if [[ -n ${instanceNames} ]]
                then
                while read i
                        do
                        echo "   Getting exception information for: ${i}..."
                        blcli_execute PropertyInstance getPropertyValue "${i}" "exceptionName"
                        blcli_storeenv exceptionName
                        exceptionNames+=("${exceptionName}")
                        blcli_execute PropertyInstance getPropertyValue "${i}" "exceptionDescription"
                        blcli_storeenv exceptionDescription
                        exceptionDescriptions[${exceptionName}]="${exceptionDescription}"
                        blcli_execute PropertyInstance getPropertyValue "${i}" "exceptionComment"
                        blcli_storeenv exceptionComment
                        exceptionComments[${exceptionName}]="${exceptionComment}"
                        blcli_execute PropertyInstance getPropertyValue "${i}" "exceptionRuleName"
                        blcli_storeenv exceptionRuleName
                        exceptionRuleNames[${exceptionName}]="${exceptionRuleName}"
                done <<< "${instanceNames}"

                # get all the components for a template
                blcli_execute Component getAllComponentKeysByTemplateKey ${templateKey}
                blcli_storeenv componentKeys
                componentKeys="$(awk 'NF' <<< "${componentKeys}")"

                while read componentKey
                        do
                        blcli_execute Component componentKeyToName ${componentKey}
                        blcli_storeenv componentName
                        echo "   Processing exceptions on ${componentName}..."
                        blcli_execute Component findComponentExceptions ${componentKey}
                        blcli_execute ComponentException getName
                        blcli_execute Utility setTargetObject
                        blcli_execute Utility listPrint
                        blcli_storeenv componentExceptions
                        for i in {1..${#exceptionNames}}
                                do
                                exceptionName="${exceptionNames[${i}]}"
                                echo "    Checking ${componentName} for ${exceptionName}..."
                                hasException="false"
                                while read componentException
                                        do
                                        if [[ "${exceptionName}" = "${componentException}" ]]
                                                then
                                                hasException="true"
                                                echo "     ${componentName} has exception: ${exceptionName}..."
                                                break
                                        fi
                                done <<< "${componentExceptions}"
                        
                                if [[ "${hasException}" = "false" ]]
                                        then
                                        echo "     ${componentName} needs exception: ${exceptionName}, setting..."
                                        echo "      Adding exception: ${exceptionName} ${exceptionDescriptions[${exceptionName}]} ${exceptionComments[${exceptionName}]} on rule: ${exceptionRuleNames[${exceptionName}]}..."
                                        blcli_execute ComponentException createComponentExceptionWithOneRule ${componentKey} "${exceptionName}" "${exceptionDescriptions[${exceptionName}]}" "${rbacRole}" "${rbacUser}" "${exceptionComments[${exceptionName}]}" "${templateGroupPath}" "${templateName}" "${exceptionRuleNames[${exceptionName}]}"
                                        blcli_storeenv componentKey
                                fi
                        done 
                done <<< "${componentKeys}"

                unset exceptionNames exceptionDescriptions exceptionComments exceptionRuleNames
        else    
                echo "   No exceptions for ${templateName}..."
        fi
done <<< "${templateKeys}"

 

When I was building the script above I realized I wanted a way to create the Property Dictionary instances by processing a text file so I had something to work with in the other script, instead of creating the instances by hand.  So that script is below.  It reads a csv file (further below) and creates the instances.

 

#!/bin/nsh

blcli_setjvmoption -Dcom.bladelogic.cli.execute.quietmode.enabled=true
blcli_setoption serviceProfileName defaultProfile
blcli_setoption roleName BLAdmins
blcli_connect

exceptionClass="Class://SystemObject/ApprovedExceptions"
typeset -A exceptionProperties
exceptionProperties=(exceptionName Primitive:/String exceptionDescription Primitive:/String exceptionComment Primitive:/String exceptionRuleName Primitive:/String)
exceptionFile="/tmp/testExceptions.csv"
bulkPropertyFile="/tmp/bulkSetPSIProps.csv"
templateGroup="/CIS Compliance Content"

[[ -f "${bulkPropertyFile}" ]] && rm -f "${bulkPropertyFile}"

blcli_execute PropertyClass isPropertyClassDefined "${exceptionClass}"
blcli_storeenv classExists

if [[ "${classExists}" = "false" ]]
        then
        blcli_execute PropertyClass createSubClass "${exceptionClass%/*}" "${exceptionClass##*/}" ""
fi

for i in ${(k)exceptionProperties}
        do
        blcli_execute PropertyClass isPropertyDefined "${exceptionClass}" "${i}"
        blcli_storeenv propertyExists

        if [[ "${propertyExists}" = "false" ]]
                then
                blcli_execute PropertyClass addProperty "${exceptionClass}" "${i}" "${i}" "${exceptionProperties[${i}]}" true false ""
        fi
done

# get all templates in the group, will create a subclass for each one to hold the exceptions
blcli_execute TemplateGroup groupNameToDBKey "${templateGroup}"
blcli_storeenv templateGroupKey
blcli_execute Template findAllByGroup ${templateGroupKey} true
blcli_execute Template getDBKey
blcli_execute Utility setTargetObject
blcli_execute Utility listPrint
blcli_execute Utility setTargetObject
blcli_storeenv templateKeys
templateKeys="$(awk 'NF' <<< "${templateKeys}")"

while read templateKey
        do
        blcli_execute Template findByDBKey ${templateKey}
        blcli_execute Template getName
        blcli_storeenv templateName
        blcli_execute PropertyClass isPropertyClassDefined "${exceptionClass}/${templateName}"
        blcli_storeenv subClassExists
        if [[ "${subClassExists}" = "false" ]]
                then
                blcli_execute PropertyClass createSubClass "${exceptionClass}" "${templateName}" ""
        fi
done <<< "${templateKeys}"

# read the csv file that lists the env-wide exceptions for each template and create the PSI and set the property values for each one.
if [[ -f "${exceptionFile}" ]]
        then
        while IFS=, read templateName exceptionName exceptionDescription exceptionComment exceptionRuleName
                do
                blcli_execute PropertyClass isPropertyClassInstanceDefined "${exceptionClass}/${templateName}" "${exceptionName}"
                blcli_storeenv instanceExists

                if [[ "${instanceExists}" = "false" ]]
                        then
                        blcli_execute PropertyInstance createInstance "${exceptionClass}/${templateName}" "${exceptionName}" ""
                fi
                # queue the property values for new and existing instances alike
                echo "\"${exceptionClass}/${templateName}/${exceptionName}\",\"exceptionName\",\"${exceptionName}\"" >> "${bulkPropertyFile}"
                echo "\"${exceptionClass}/${templateName}/${exceptionName}\",\"exceptionDescription\",\"${exceptionDescription}\"" >> "${bulkPropertyFile}"
                echo "\"${exceptionClass}/${templateName}/${exceptionName}\",\"exceptionRuleName\",\"${exceptionRuleName}\"" >> "${bulkPropertyFile}"
                echo "\"${exceptionClass}/${templateName}/${exceptionName}\",\"exceptionComment\",\"${exceptionComment}\"" >> "${bulkPropertyFile}"
        done < "${exceptionFile}"
        # only call the bulk set if there is something in the bulk file
        if [[ -s "${bulkPropertyFile}" ]]
                then
                blcli_execute PropertyInstance bulkSetPropertyValues "${bulkPropertyFile%/*}" "${bulkPropertyFile##*/}"
        fi
else
        echo "Cannot find ${exceptionFile}..."
        exit 1
fi

 

Sample input file for the Property Instance creation script:

The columns are: template name, exception name, exception description, comment, rule path/name.

CIS - Windows Server 2008 R2,exception-1,description-1,comment-1,/1.1.1.1 System Services/1.1.1.1.3 Configure IIS Admin Service
CIS - Windows Server 2008 R2,exception-2,description-2,comment-2,/1.1.1.1 System Services/1.1.1.1.4 Configure Windows Installer
CIS - Windows Server 2012 R2,exception-1a,description-1a,comment-1a,/1 Account Policies/1.1.1 Set Enforce password history to 24 or more passwords
CIS - Windows Server 2012 R2,exception-2a,description-2a,comment-2a,/1 Account Policies/1.1.3 Set Minimum password age to 1 or more days

 

This was a couple hours of hacking at it.  There are likely some performance improvements and error handling to be added and it hasn't been tested at any large scale.
