Currently, BDA cannot restart a build job, either automatically or manually, when a failure occurs.
During database provisioning, the Provisioning Progress is displayed both under the Jobs section and in the Job Summary for a given job. However, if a job fails, BDA neither cleans up the node being provisioned nor backs out the components already installed on it, so the install cannot be restarted.
Additionally, there is no way to restart a Provisioning Job, automatically or manually, at the point of failure.
BDA Provisioning Recovery Idea:
Add functionality to BDA to automatically recover or intervene at the point of failure following a provision job failure. This functionality would include:
- Post-job failure recovery, whereby BDA performs any needed failure clean-up and then restarts the database provision at the point of job failure.
- The ability to manually fail a provision job and restart at the point of manual failure. This functionality currently exists in BMC BladeLogic Server Automation (BSA).
- Configurability to set a finite number of job failures before permanently failing the provision, plus a function to completely roll back (full installer clean-up on the node) regardless of the current stage of the provision job.
- Modularity of this functionality in the BDA-CLI, allowing each task type (roll-back, clean-up, etc.) to be called individually.
- BDA-API access to these new functions.
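To make the requested behavior concrete, here is a minimal Python sketch of the recovery logic described in the list above: retry from the failed step after cleaning it up, stop after a configurable number of failures, and roll back everything on permanent failure. It is only an illustration of the idea. The names `Step`, `ProvisionJob`, `max_failures`, and `rollback` are hypothetical and do not correspond to any existing BDA, BDA-CLI, or BDA-API call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One stage of a provision job (hypothetical model, not a BDA type)."""
    name: str
    run: Callable[[], None]      # performs the step; raises on failure
    cleanup: Callable[[], None]  # undoes the step's work; assumed safe to repeat


@dataclass
class ProvisionJob:
    steps: List[Step]
    max_failures: int = 3  # failures allowed before the job fails permanently
    failures: int = 0
    next_step: int = 0     # index of the first step not yet completed
    status: str = "pending"

    def rollback(self) -> None:
        """Full rollback: clean up every step reached so far, newest first."""
        last = min(self.next_step, len(self.steps) - 1)
        for step in reversed(self.steps[: last + 1]):
            step.cleanup()
        self.next_step = 0
        self.status = "rolled_back"

    def run(self) -> str:
        """Run remaining steps, resuming at the point of failure on error."""
        self.status = "running"
        while self.next_step < len(self.steps):
            step = self.steps[self.next_step]
            try:
                step.run()
            except Exception:
                self.failures += 1
                if self.failures >= self.max_failures:
                    # Too many failures: back out everything and give up.
                    self.rollback()
                    self.status = "failed"
                    return self.status
                # Clean up only the failed step, then retry that same step.
                step.cleanup()
                continue
            self.next_step += 1
        self.status = "succeeded"
        return self.status


# Demo: the "install" step fails once, is cleaned up, and is then retried.
log: List[str] = []
attempts = {"install": 0}


def flaky_install() -> None:
    attempts["install"] += 1
    if attempts["install"] < 2:
        raise RuntimeError("simulated installer failure")
    log.append("install")


job = ProvisionJob(steps=[
    Step("prepare", lambda: log.append("prepare"),
         lambda: log.append("cleanup prepare")),
    Step("install", flaky_install,
         lambda: log.append("cleanup install")),
    Step("configure", lambda: log.append("configure"),
         lambda: log.append("cleanup configure")),
])
print(job.run())  # succeeded
print(log)        # ['prepare', 'cleanup install', 'install', 'configure']
```

In this sketch the completed "prepare" step is not repeated after the failure, which is the "restart at the point of failure" behavior requested above. Exposing `run` and `rollback` as separate operations mirrors the request for individually callable task types.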
This functionality would improve efficiency and provisioning flexibility, and would reduce resource waste when a provision failure occurs. Adding this recovery feature would also allow earlier phases in the server build process (OS provisioning) to pause or restart, via API or BDA-CLI calls, regardless of previous BDA provisioning failures.