
This seems to be a fairly common case: you want to run a job from the blcli, wait for it to finish, and then do something with the result. You can use something like Job.executeJobAndWait, but this can be problematic if the job runs long enough that the blcli-to-appserver connection is deemed idle. The blcli sits there with no output, so you get no indication that anything is happening or that the connection to the appserver is still alive (hence the idle timeout problem). Instead, you can start the job, get back something you can use to check the run status, and then loop over a status command until you see that the job run has finished.

 

The first thing to do is to find the right blcli command to run. There are a couple that look promising: Job.executeJobAndReturnScheduleID and Job.executeJobAndWaitForRunID. We need a command that outputs something we can use to get the running status, and I see a couple of commands in the JobRun namespace that look like they will do that: JobRun.getJobRunStatusByScheduleId and JobRun.getJobRunIsRunningByRunKey. (One thing that's a little confusing, as we'll see later, is that executeJobAndWaitForRunID actually returns the jobRunKey, not the jobRunId. That's OK because we can use the key with the getJobRunIsRunningByRunKey command.) There are some other commands that let you 'execute against' a set of targets for that run, and they also return the scheduleId or jobRunKey. As long as the execution command returns the scheduleId, jobRunKey, or jobRunId, it looks like we will be in good shape. As you can tell, there is no single right blcli command.

 

We've found a couple different sets of commands that look like they will accomplish our goal of starting a job run and returning something that we can use to check status with another command.  We can use either command set.  The schedule one looks nice to me because it gives more information about status than running or not.  Let's look at that one first.  We need to get the jobKey, run it and get the scheduleId and then do a loop until the job returns a 'COMPLETE' status.  That's pretty straightforward:

 

blcli_execute NSHScriptJob getDBKeyByGroupAndName "/Workspace" "myJob"
blcli_storeenv JOB_KEY
blcli_execute Job executeJobAndReturnScheduleID ${JOB_KEY}
blcli_storeenv JOB_SCHEDULE_ID
JOB_STATUS="INCOMPLETE"
while [[ -n "${${JOB_STATUS//COMPLETE}//$'\n'}" ]]
     do
     sleep 10
     blcli_execute JobRun getJobRunStatusByScheduleId ${JOB_SCHEDULE_ID}
     blcli_storeenv JOB_STATUS
     echo "Schedule ${JOB_SCHEDULE_ID} status is: ${JOB_STATUS//$'\n'/ }"
done

 

There are some zsh-isms going on there that I can explain. ${JOB_STATUS//$'\n'/ } replaces each newline in the return of getJobRunStatusByScheduleId with a space, which keeps the trailing newline out of the echo. ${${JOB_STATUS//COMPLETE}//$'\n'} deletes the string COMPLETE and every newline from the output, and the while test then checks whether anything is left. The starting value INCOMPLETE leaves IN behind, so the loop is entered. When the job completes, the status is COMPLETE only; removing that string leaves an empty value and the loop ends.
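
If the nested expansion is hard to read, the same test can be pulled out into a small helper. This is only a sketch: is_complete is a name I made up, not a blcli command, and it assumes the status comes back as the bare word COMPLETE plus newlines, as described above. It sticks to syntax that zsh and bash share, and it avoids a variable called status because zsh reserves that name.

```shell
# Hypothetical helper (not part of blcli): succeeds only when the stored
# status, with all newlines stripped, is exactly the word COMPLETE.
is_complete() {
    local job_status="${1//$'\n'/}"   # drop every newline from the stored output
    [[ "${job_status}" == "COMPLETE" ]]
}
```

With that in place the loop condition could read: while ! is_complete "${JOB_STATUS}". An exact match is slightly stricter than deleting the substring, so any status other than the bare word COMPLETE keeps the loop going.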

 

You could put a counter in there as well if you want to prevent it from getting stuck in the loop, so something like:

 

blcli_execute NSHScriptJob getDBKeyByGroupAndName "/Workspace" "myJob"
blcli_storeenv JOB_KEY
blcli_execute Job executeJobAndReturnScheduleID ${JOB_KEY}
blcli_storeenv JOB_SCHEDULE_ID
JOB_STATUS="INCOMPLETE"
COUNT=1
while [[ -n "${${JOB_STATUS//COMPLETE}//$'\n'}" ]] && [[ ${COUNT} -le 100 ]]
     do
     sleep 10
     blcli_execute JobRun getJobRunStatusByScheduleId ${JOB_SCHEDULE_ID}
     blcli_storeenv JOB_STATUS
     echo "Schedule ${JOB_SCHEDULE_ID} status is: ${JOB_STATUS//$'\n'/ }"
     let COUNT+=1
done
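
One thing the counter version does not tell you is why the loop ended: the job finished, or the count ran out. A minimal sketch of one way to separate the two is below. The names wait_for_complete and fake_refresh are hypothetical, and the RUNNING status is a placeholder I invented so the sketch runs without an appserver. The function passed in is assumed to update the global JOB_STATUS, the way the blcli_execute and blcli_storeenv pair does above.

```shell
# Hypothetical helper: call a refresh function until JOB_STATUS is COMPLETE
# or the attempts run out. Returns 0 on completion, 1 on giving up.
# $1 = function that updates the global JOB_STATUS
# $2 = maximum number of polls, $3 = seconds to sleep between polls
wait_for_complete() {
    local refresh_cmd="$1" max_attempts="$2" interval="$3"
    local attempt=1
    while [[ ${attempt} -le ${max_attempts} ]]; do
        "${refresh_cmd}"
        if [[ "${JOB_STATUS//$'\n'/}" == "COMPLETE" ]]; then
            return 0
        fi
        sleep "${interval}"
        attempt=$((attempt + 1))
    done
    return 1
}

# Stand-in for the real status call (assumption, for illustration only):
# reports the placeholder status RUNNING twice, then COMPLETE.
FAKE_POLLS=0
fake_refresh() {
    FAKE_POLLS=$((FAKE_POLLS + 1))
    if [[ ${FAKE_POLLS} -ge 3 ]]; then
        JOB_STATUS=$'COMPLETE\n'
    else
        JOB_STATUS=$'RUNNING\n'
    fi
}

if wait_for_complete fake_refresh 5 0; then
    echo "finished after ${FAKE_POLLS} polls"
else
    echo "gave up waiting"
fi
```

In a real NSH script the stand-in would be replaced by a function that runs blcli_execute JobRun getJobRunStatusByScheduleId ${JOB_SCHEDULE_ID} followed by blcli_storeenv JOB_STATUS, and the interval would be something like 10 rather than 0.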

Now I might want to check whether the job run had errors, dump the job run log items, or use one of the Utility commands to export a job result or log to a file. With the scheduleId approach I'll need to use a couple of unreleased blcli commands to convert the schedule ID into something I can use.

 

I can run something like this to convert the schedule ID into a job run ID or job run key:

blcli_execute JobRun findByScheduleId ${JOB_SCHEDULE_ID}
blcli_execute JobRun getJobRunId
blcli_storeenv JOB_RUN_ID

or

blcli_execute JobRun findByScheduleId ${JOB_SCHEDULE_ID}
blcli_execute JobRun getJobRunKey
blcli_storeenv JOB_RUN_KEY

 

Now let's look at the getJobRunIsRunningByRunKey version. It's very similar:

blcli_execute Job executeJobAndWaitForRunID ${JOB_KEY}
blcli_storeenv JOB_RUN_KEY
JOB_IS_RUNNING="true"
while [[ "${JOB_IS_RUNNING}" = "true" ]]
        do
        sleep 10
        blcli_execute JobRun getJobRunIsRunningByRunKey ${JOB_RUN_KEY}
        blcli_storeenv JOB_IS_RUNNING
        echo "${JOB_IS_RUNNING}"
done

 

In a normally functioning environment these will work more or less the same. A problem with the getJobRunIsRunning approach shows up when the job doesn't start running: for example, you have a job routing rule sending the job to a downed appserver, your appservers are maxed out on WorkItemThreads, or you've reached the MaxJobs threshold. In those cases executeJobAndWaitForRunID sits waiting for the job to start running, and you are back to the possible idle connection timeout.

 

There are other commands that look interesting in the JobRun namespace (released and unreleased).  The JobRun.showRunningJobs looks neat - it outputs a nice table - but it only shows the job name which is not enough to know if your job is actually running since you can have duplicate job names in different workspace folders. 

 

The above example is pretty simple. This can be extended to manage multiple job runs: for example, if you wanted to kick off a number of jobs at once, wait until they were all complete, check their status, and then move on to some other action. One approach to that would be some arrays to hold the job information and another loop around what I've done above. That may be a future post.
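
As a rough sketch of that idea, and not a tested implementation against a real appserver, the outer loop could look something like this. Everything here is an assumption for illustration: refresh_status stands in for the getJobRunStatusByScheduleId call, the schedule IDs 1, 2 and 3 are made up, and the stand-in simply pretends that schedule N completes on polling round N.

```shell
# Hypothetical stand-in for the per-schedule status call. In NSH this would wrap
# blcli_execute JobRun getJobRunStatusByScheduleId "$1" and blcli_storeenv JOB_STATUS.
# Made-up behaviour: schedule N reports COMPLETE from polling round N onwards.
ROUND=0
refresh_status() {
    if [[ ${ROUND} -ge $1 ]]; then
        JOB_STATUS=$'COMPLETE\n'
    else
        JOB_STATUS=$'RUNNING\n'     # placeholder status, not taken from blcli
    fi
}

PENDING=(1 2 3)     # schedule IDs, e.g. collected from executeJobAndReturnScheduleID
MAX_ROUNDS=10       # safety limit, same idea as the COUNT variable above

while [[ ${#PENDING[@]} -gt 0 && ${ROUND} -lt ${MAX_ROUNDS} ]]; do
    ROUND=$((ROUND + 1))
    STILL_PENDING=()
    for id in "${PENDING[@]}"; do
        refresh_status "${id}"
        # keep only the schedules that have not reported COMPLETE yet
        if [[ "${JOB_STATUS//$'\n'/}" != "COMPLETE" ]]; then
            STILL_PENDING+=("${id}")
        fi
    done
    PENDING=("${STILL_PENDING[@]}")
    sleep 0         # use a real interval such as 10 against an appserver
done

echo "rounds: ${ROUND}, still pending: ${#PENDING[@]}"
```

The pending list shrinks as each schedule reports COMPLETE, so the loop ends either when the list is empty or when the round limit is hit; anything left in PENDING at that point is a run that never finished in time.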