If the commands you are running take down the NIC that BladeLogic uses to talk to the box, you need to set the 'out of band reboot' option on those items in the blpackage.
I already did that for the two items that I expected to break the connection.
But the log also shows:
05/07/14 19:41:18.761 WARN bldeploy - [Execute Teamingw2k8.cmd] A potential out of band reboot as indicated by the item did not occur. This out of band reboot could be conditional. If the reboot was mandatory for this item then there may have been an issue while deploying the item.
05/07/14 19:41:18.995 WARN bldeploy - [SetIPAddress MGT] A potential out of band reboot as indicated by the item did not occur. This out of band reboot could be conditional. If the reboot was mandatory for this item then there may have been an issue while deploying the item.
That sounds to me as if the option was not used!?
But this entry from the job log proves that we are losing the connection:
Warning May 7, 2014 5:42:38 PM Failed to receive a heartbeat within 90 seconds.
I have now added the "out of band reboot" option to all items. It still fails with the exact same error.
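To see at a glance which items reported a skipped out of band reboot, the bldeploy warnings quoted above can be filtered out of the log. The following is only a minimal sketch: the regular expression is modelled on the two warning lines in this post, and the function name `items_without_oob_reboot` is made up for illustration, not part of any BladeLogic tooling.

```python
import re

# Pattern modelled on the two WARN lines quoted in this thread (an assumption,
# not an official log format). The item name is the text inside the brackets.
OOB_WARNING = re.compile(
    r"WARN\s+bldeploy\s+-\s+\[(?P<item>[^\]]+)\]\s+"
    r"A potential out of band reboot.*did not occur"
)


def items_without_oob_reboot(log_lines):
    """Return the item names whose expected out of band reboot did not occur."""
    items = []
    for line in log_lines:
        match = OOB_WARNING.search(line)
        if match:
            items.append(match.group("item"))
    return items


# The two log lines from this post, shortened after "did not occur."
sample_log = [
    "05/07/14 19:41:18.761 WARN bldeploy - [Execute Teamingw2k8.cmd] "
    "A potential out of band reboot as indicated by the item did not occur.",
    "05/07/14 19:41:18.995 WARN bldeploy - [SetIPAddress MGT] "
    "A potential out of band reboot as indicated by the item did not occur.",
]

for item in items_without_oob_reboot(sample_log):
    print(item)
```

If an item that is known to drop the connection shows up in this list, the agent never saw the expected interruption for it.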
The logs suggest that bltargetmanager has restarted.
A connection loss between the appserver and the target does not necessarily mean a reboot, but the appserver has assumed that the server is rebooting.
Did the target server actually reboot?
If not, then bldeploy continues running, and the restart of bltargetmanager has triggered a new instance of bldeploy that runs on the same deploy artifacts. That could mean that the two instances are overwriting each other's files in the same staging location.
Can we confirm two things: has the server actually rebooted? And, if not, do we see two instances of bldeploy running with the same set of parameters?
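The second check can be sketched in a few lines once you have a process listing from the target. This is a rough illustration only: how the (PID, command line) pairs are collected is left to whatever tool you have on the box, and the command lines in the sample are invented placeholders, not real bldeploy arguments.

```python
from collections import defaultdict


def find_duplicate_instances(processes, name="bldeploy"):
    """Group matching processes by command line and keep groups with several PIDs.

    `processes` is an iterable of (pid, command_line) pairs gathered by any
    means; this helper is hypothetical and not part of BladeLogic.
    """
    by_command = defaultdict(list)
    for pid, command_line in processes:
        if name.lower() in command_line.lower():
            by_command[command_line.strip()].append(pid)
    # Only command lines that are running more than once are of interest.
    return {cmd: pids for cmd, pids in by_command.items() if len(pids) > 1}


# Invented sample data: PIDs and arguments are placeholders for illustration.
sample_processes = [
    (1204, "bldeploy.exe PACKAGE_A"),
    (3380, "bldeploy.exe PACKAGE_A"),
    (2210, "bldeploy.exe PACKAGE_B"),
    (900, "svchost.exe"),
]

for cmd, pids in find_duplicate_instances(sample_processes).items():
    print(f"{len(pids)} instances of: {cmd} (PIDs {pids})")
```

Any command line that is reported here is being run by more than one bldeploy instance at the same time, which would fit the staging-location conflict described above.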
Indeed, the server did not restart during the deployment.
I fixed it by adding a reboot at the end of the last item's deployment.
It is not ideal, as this is an unnecessary reboot, but it works.