These are the options for rolling back a complete set of RHEL patches, including the kernel ones.
1) A snapshot done at the storage level, e.g. yum-fs-snapshot, which can make an LVM or BTRFS snapshot. Fully automating this would only work on systems that a) have enough free volume space, and b) have a new enough version of yum. (This wouldn't work on RHEL 5 or earlier machines.)
2) A snapshot done at the platform level, e.g. a snapshot taken in VMware. Less useful here, as our physical server count is very high.
3) A file system-level rollback. What was done here was an LD_PRELOAD library that offers its own versions of unlink/open/fdopen etc., backs up any file being opened for writing, and then passes the call on to the real libc library. (It would also keep track of symlink creation/changes and new file creation, so that those can be undone or deleted on a rollback.) This was ultimately a copy-on-write design at the file-system level. It got around the limitations of the other solutions: it didn't care whether the machine was virtual or physical, and it didn't matter how much LVM volume space was left. All that mattered was how much file system space was left.
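To make the bookkeeping in option 3 concrete, here is a minimal Python sketch of the copy-on-write idea only. It is not the LD_PRELOAD library itself (that would be a C shared object interposing on libc), and the names `FileJournal`, `open_for_write` and `demo` are made up for illustration. It assumes regular files and ignores symlinks, permissions and directories.

```python
import os
import shutil
import tempfile


class FileJournal:
    """Toy model: back up a file the first time it is touched, undo on rollback."""

    def __init__(self, backup_dir):
        self.backup_dir = backup_dir
        self.backups = {}   # original path -> path of the saved copy
        self.created = []   # paths that did not exist before the transaction

    def _remember(self, path):
        path = os.path.abspath(path)
        if path in self.backups or path in self.created:
            return  # only the state at first touch matters
        if os.path.exists(path):
            copy = os.path.join(self.backup_dir, str(len(self.backups)))
            shutil.copy2(path, copy)      # save the old contents before they change
            self.backups[path] = copy
        else:
            self.created.append(path)     # new file: delete it on rollback

    def open_for_write(self, path, mode="w"):
        self._remember(path)
        return open(path, mode)           # stands in for "pass the call on to libc"

    def unlink(self, path):
        self._remember(path)
        os.unlink(path)

    def rollback(self):
        for path in self.created:
            if os.path.exists(path):
                os.unlink(path)
        for path, copy in self.backups.items():
            shutil.copy2(copy, path)      # restore the saved contents
        self.backups.clear()
        self.created.clear()


def demo():
    """Modify one file, create another, roll back, and report the result."""
    with tempfile.TemporaryDirectory() as root:
        backup_dir = os.path.join(root, "backup")
        os.mkdir(backup_dir)
        existing = os.path.join(root, "pkg.conf")
        with open(existing, "w") as f:
            f.write("old\n")

        journal = FileJournal(backup_dir)
        with journal.open_for_write(existing) as f:
            f.write("new\n")
        added = os.path.join(root, "added.conf")
        with journal.open_for_write(added) as f:
            f.write("x\n")

        journal.rollback()
        with open(existing) as f:
            restored = f.read()
        return restored, os.path.exists(added)
```

The space cost is one copy of every file that gets modified, which is why only free file system space matters for this approach.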
So the questions I had:
1) Can BladeLogic leverage things like yum-fs-snapshot natively?
2) Is there any better solution available now, or coming soon, that would offer a proper, fast, fully automated rollback?
3) Can BladeLogic execute a job with certain environment variables set, like LD_PRELOAD, so that we could make use of the 3rd option if we were to write such a tool?
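To illustrate what question 3 is asking for, this is roughly what the job would have to do on the target host, sketched in Python. It says nothing about how BladeLogic itself launches jobs, which is exactly the open question. The function names and the library path in the usage note are hypothetical.

```python
import os
import subprocess


def preload_env(lib_path, base_env=None):
    """Return a copy of the environment with lib_path placed first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD")
    # ld.so accepts a colon- or space-separated list; keep any existing entries.
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return env


def run_with_preload(cmd, lib_path):
    """Run cmd (a list of arguments) with the preload library and return its exit code."""
    return subprocess.run(cmd, env=preload_env(lib_path), check=False).returncode
```

A patch run would then look something like `run_with_preload(["yum", "-y", "update"], "/opt/rollback/librollback.so")`, where the `.so` path is a placeholder for wherever such a library would be installed.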