Performing zero down-time upgrade and viewing the upgrade status

Version 5

    Refer to this topic if you are performing the zero down-time (ZDT) upgrade of Remedy platform components (AR, Atrium CMDB, and Atrium Integrator) from 9.x to 9.1.04 and above. This topic provides information about the zero down-time upgrade flow, platform installer enhancements, viewing the upgrade status, and rollback utility.

     

     

    ZDT upgrade flow

    The ZDT upgrade sequence is as follows:

    1. Upgrade Mid Tier
    2. Upgrade platform components on your primary server
    3. Upgrade platform components on your secondary servers

     

    The following webinar demonstrates the enhancements introduced in version 9.1.04 relevant to zero-downtime upgrade:

    https://communities.bmc.com/community/bmcdn/bmc_it_service_support/blog/2017/12/08/remedy-9104zero-down-time-zdt-upgrade-recorded-session

     

    Platform installers: enhancements

     

    The following enhancements are introduced in Remedy 9.1.04 to support the zero down-time upgrade of platform components:

     

    Creating the backup of existing binary files

    The platform installer checks the version from which you are upgrading. If you are upgrading from 9.x (9.0 or 9.1), the installer backs up the existing binary files so that the upgrade can be rolled back if the installation fails. Although you can perform a zero down-time upgrade of the platform components from Remedy 7.x and 8.x, the installer does not create a backup for those versions because rollback is not supported for upgrades from 7.x and 8.x.

     

    If you are upgrading AR System, the installer backs up the entire AR System folder. It also backs up the FTS collection (indexes) folder if that folder resides outside the AR System folder.

     

    [Screenshot: default location of the backup folder]

     

    • The default name of the backup folder under the installation directory is ZdtArBackup.
    • If you select a location under the installation directory with any other name, for example <AR System installation directory>\zdtbk, the following error message is displayed:

         Invalid backup directory name

    • You can create the backup in a different directory or on a different drive.
    • You cannot create the backup in a temp folder. For example, the backup directory cannot be under C:\tmp.
    • Creating the backup requires additional disk space.
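
    The backup folder rules above can be expressed as a small check. The following is a minimal sketch, not the installer's actual logic. The function name, the installation path used in the usage note, the set of temp folder names, and the temp folder message are assumptions for illustration; only the ZdtArBackup name and the "Invalid backup directory name" message come from this topic.

```python
from pathlib import PureWindowsPath

DEFAULT_BACKUP_NAME = "ZdtArBackup"  # default folder name given in this topic

# Assumption: the topic only gives C:\tmp as an example of a temp folder.
TEMP_FOLDER_NAMES = {"tmp", "temp"}


def check_backup_dir(install_dir: str, backup_dir: str) -> str:
    """Return "OK" or a message describing why the backup location is rejected."""
    install = PureWindowsPath(install_dir)
    backup = PureWindowsPath(backup_dir)

    # Rule: the backup cannot be created in a temp folder.
    if any(part.lower() in TEMP_FOLDER_NAMES for part in backup.parts):
        return "Backup directory cannot be in a temp folder"  # hypothetical message

    # Rule: a different directory or drive is allowed.
    try:
        relative = backup.relative_to(install)
    except ValueError:
        return "OK"

    # Rule: under the installation directory, only the default name is accepted.
    if relative.parts != (DEFAULT_BACKUP_NAME,):
        return "Invalid backup directory name"
    return "OK"
```

    For example, with a hypothetical installation directory C:\BMC\ARSystem, check_backup_dir accepts C:\BMC\ARSystem\ZdtArBackup and D:\backups\ar, and rejects C:\BMC\ARSystem\zdtbk and C:\tmp\bk.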

     

     

    Setting and resetting the operating mode

    Consider a Remedy stack where AR System, Atrium Core, and Atrium Integrator are installed. When you perform the upgrade, the AR System installer sets the operating mode and the Atrium Integrator installer resets it, based on the following conditions:

    • The AR System installer sets the operating mode. If it detects that Atrium Core is installed, it does not reset the operating mode and does not run the sanity test.
    • The Atrium Core installer sets the operating mode. If it detects that Atrium Integrator is installed, it does not reset the operating mode.
    • The Atrium Integrator installer finally resets the operating mode.
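
    The conditions above amount to a simple rule: an installer leaves the operating mode set when the next platform component in the chain is installed. The following is a minimal sketch of that rule. The function and dictionary names are hypothetical, and the case where the next component is absent (the installer resets the mode itself) is inferred from the conditions rather than stated in this topic.

```python
from typing import Iterable, Optional, Dict

# Component that each installer checks for, as described in the conditions above.
NEXT_COMPONENT: Dict[str, Optional[str]] = {
    "AR System": "Atrium Core",
    "Atrium Core": "Atrium Integrator",
    "Atrium Integrator": None,  # last installer in the chain
}


def resets_operating_mode(installer: str, installed: Iterable[str]) -> bool:
    """Return True if this installer resets the operating mode when it finishes."""
    next_component = NEXT_COMPONENT[installer]
    # Inferred: with no later component installed, nothing else would reset the mode.
    return next_component is None or next_component not in set(installed)


stack = {"AR System", "Atrium Core", "Atrium Integrator"}
for name in NEXT_COMPONENT:
    print(name, "resets operating mode:", resets_operating_mode(name, stack))
```

    For the full stack, only the Atrium Integrator installer resets the operating mode.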

     

    Why change the operating mode?

    Consider a Remedy environment with a primary server and secondary servers, where the FTS indexer is configured on the primary server.

    • When you upgrade the primary server, the installer backs up the FTS collection folder and sets the operating mode.
    • While the operating mode is set, the ft_pending table continues to be updated with the records received from the secondary servers. Thus, the operating mode keeps the backup and FTS collection folders in sync.
    • After the primary server is upgraded, the operating mode is reset. The FTS collection folder is then updated with the records received from the primary server, and the sync between the FTS collection and backup folders is lost.

    For more information about operating mode, see Operating Mode.

     

    Additional column in the Control table

     

    The Remedy database version is maintained in the Control table. In 9.1.03 and earlier versions, the dbVersion column in the Control table is updated after the primary server is upgraded, and the upgraded database version prevented non-upgraded servers from restarting. To remove this restriction, a new column, currentDbVersion, is added to the Control table. Upgraded servers refer to the currentDbVersion column, and non-upgraded servers refer to the dbVersion column. The dbVersion column is updated with the latest database version only after all the servers in a server group are upgraded.
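
    The effect of the two columns can be illustrated with a small model. The following is a sketch under stated assumptions, not the server's actual startup check: the class and function names are hypothetical, the version numbers are made up, and the idea that a server starts only when its own database version matches the column it reads is inferred from the description above.

```python
from dataclasses import dataclass


@dataclass
class ControlRow:
    # Column names from this topic; the values used below are made-up numbers.
    dbVersion: int
    currentDbVersion: int


def version_column(server_upgraded: bool) -> str:
    """Upgraded servers read currentDbVersion; non-upgraded servers read dbVersion."""
    return "currentDbVersion" if server_upgraded else "dbVersion"


def can_start(server_db_version: int, server_upgraded: bool, control: ControlRow) -> bool:
    """Assumed check: the server starts only if its version matches its column."""
    return getattr(control, version_column(server_upgraded)) == server_db_version


# Primary upgraded, secondaries not yet upgraded (hypothetical versions 50 and 51).
control = ControlRow(dbVersion=50, currentDbVersion=51)
print(can_start(51, True, control))   # upgraded server reads currentDbVersion
print(can_start(50, False, control))  # non-upgraded server reads dbVersion
```

    In this model, both kinds of server can restart while the upgrade is in progress. After dbVersion is also raised to the new version, a server that was never upgraded no longer matches.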

    How to verify whether all the servers are upgraded

     

    The following fields are added to the AR System Server Group Operation Ranking form:

     

      • AR Server Version

      • CMDB Version

      • Atrium Integration(AI) Version

     

    After the platform components are successfully upgraded on all the servers of a server group, the platform installers update these fields with the latest version, for example, 9.1.04. Next, the ZDT post-installation tasks are initiated. After the post-installation tasks complete successfully, the upgrade status is marked as Done.

      

     

    Platform upgrade status

    The following scenarios explain when the server group upgrade status is marked as Done.

    Scenario 1

    The server group contains three servers that are being upgraded:

    Server 1: Upgrading
    Server 2: Active but not upgraded
    Server 3: Active but not upgraded

    Active means that the AR Server is up and running.

    In this scenario, the post-installation tasks start when all three servers in the server group are upgraded. The upgrade status is marked as Done when the post-installation tasks are completed.

    Scenario 2

    The server group contains three servers that are being upgraded:

    Server 1: Upgraded
    Server 2: Upgraded
    Server 3: Active but not upgraded

    In this scenario, the post-installation tasks start when all three servers in the server group are upgraded. The upgrade status is marked as Done when the post-installation tasks are completed.

    Note: Server 3 must be upgraded. If you do not upgrade it, the upgrade status is not marked as Done.

    Scenario 3

    The server group contains three servers that are being upgraded:

    Server 1: Upgraded
    Server 2: Upgraded
    Server 3: Inactive and not upgraded

    An inactive server is one that is not up and running. Dummy or invalid entries in the AR System Server Group Operation Ranking form are sometimes also treated as inactive servers.

    After the active servers in the server group are upgraded, the installer waits for 48 hours and then starts the post-installation tasks. To avoid this 48-hour delay, it is recommended that you delete the inactive server entries from the AR System Server Group Operation Ranking form before you start the upgrade.

    After the post-installation tasks for the active servers are completed, the upgrade status is marked as Done.

    In this scenario, the inactive server cannot start because of the database version mismatch. It starts only after you upgrade it.
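
    The three scenarios can be condensed into one rule for when the post-installation tasks start. The following is a minimal sketch of that rule, assuming the behavior described in the scenarios. The Server class, the function name, and the elapsed-hours parameter are hypothetical; the 48-hour wait is taken from Scenario 3.

```python
from dataclasses import dataclass
from typing import List

INACTIVE_WAIT_HOURS = 48  # wait described in Scenario 3


@dataclass
class Server:
    name: str
    active: bool    # AR Server is up and running
    upgraded: bool


def post_install_can_start(servers: List[Server], hours_since_active_upgraded: float) -> bool:
    """Return True when the post-installation tasks would start."""
    # Scenarios 1 and 2: every active server must be upgraded first.
    if not all(s.upgraded for s in servers if s.active):
        return False
    # Scenario 3: inactive, non-upgraded entries delay the start by 48 hours.
    inactive_pending = [s for s in servers if not s.active and not s.upgraded]
    if inactive_pending:
        return hours_since_active_upgraded >= INACTIVE_WAIT_HOURS
    return True


scenario_3 = [
    Server("Server 1", active=True, upgraded=True),
    Server("Server 2", active=True, upgraded=True),
    Server("Server 3", active=False, upgraded=False),
]
print(post_install_can_start(scenario_3, hours_since_active_upgraded=1))
print(post_install_can_start(scenario_3, hours_since_active_upgraded=48))
```

    Removing the inactive entry from the list models the recommendation to delete such entries before the upgrade: the function then returns True without any wait.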

     

     

    Viewing the server group upgrade status

    You can view the server group upgrade status at AR System Administration > Server Information > Platform > Server Group Upgrade Status. Alternatively, you can view it in the UpgradeStatus column of the Control table in the database.

    • After the primary server is upgraded, the server group upgrade status is marked as Pending. This means that the post-installation tasks are not yet completed.
    • After all the active servers in a server group are upgraded, the upgrade status is marked as Done. This means that the post-installation tasks are completed.

    Note: You can start upgrading the ITSM application only when the Server Group Upgrade Status is marked as Done.
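
    As an illustration of the database check, the following sketch reads the UpgradeStatus column and reports whether the ITSM application upgrade can start. It is a sketch under loud assumptions: an in-memory SQLite database stands in for the real Remedy database, the table is assumed to be named control, and the stored values are assumed to be the literal strings Pending and Done shown in the console. Verify the actual table name and stored values in your own database before relying on a query like this.

```python
import sqlite3


def itsm_upgrade_allowed(conn: sqlite3.Connection, done_value: str = "Done") -> bool:
    """Return True if the (assumed) UpgradeStatus value equals done_value."""
    row = conn.execute("SELECT UpgradeStatus FROM control").fetchone()
    return row is not None and row[0] == done_value


# Stand-in database for demonstration only; not the real Remedy schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE control (UpgradeStatus TEXT)")
conn.execute("INSERT INTO control VALUES ('Pending')")
print(itsm_upgrade_allowed(conn))  # status is still Pending

conn.execute("UPDATE control SET UpgradeStatus = 'Done'")
print(itsm_upgrade_allowed(conn))  # status is now Done
```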

     

     

    Post installation tasks

    The following post-installation tasks are triggered after all the servers of a server group are upgraded:

    • Setting the useSHA256 flag to true in the Control table
    • Creating an index in the ft_pending table
    • Deleting email ranking information from the ranking form
    • Deleting or setting UDM-related metadata
    • Importing Atrium Integrator sample jobs
    • Updating the dbVersion column in the Control table

    This list applies to the Remedy 9.1.04 release only. The list is updated for each release as needed.

     

    Rollback and Cleanup utilities

     

    When a zero down-time upgrade fails, the platform components and file system are automatically rolled back to the older version. For example, if the upgrade of a primary server fails, the installer automatically triggers the rollback mechanism. However, automatic rollback might fail, for example, because a process has locked the binaries. In that case, you must run the rollback utility manually.

     

    After upgrading the platform components successfully, the installer deletes the backup folders. If the backup folders are not deleted automatically, you must run the cleanup utility manually.

    For information about running the rollback utility, see Troubleshooting installer failure during upgrade.

     

    Rollback and restore are supported only for upgrades from version 9.x and later. They are not supported for upgrades from 7.x and 8.x because, as explained above, the installer does not back up the file system for those versions and cannot restore it. For 7.x and 8.x, an installer failure is simply reported, as in earlier releases.

     

    There are two cases:

     

    Case 1: Installer rollback on failure. If the installation fails during the upgrade (for example, because of a def import or RIK error), the installer itself triggers the rollback and performs the following actions:

      1. Resets the upgrade mode
      2. Restores the old database version
      3. Stops all processes related to the new AR Server (for example, AR Server and Flashboards)
      4. Deletes all the Windows registry services (this step is not needed on UNIX)
      5. Deletes the new binary files (91SP4, that is, 9.1.04, in this case)
      6. Restores the old binaries from the ZDT_Backup folder
      7. Creates the Windows registry services for the old server
      8. Starts the AR Server and its related processes

     

    Case 2: Command-line utilities for rollback and cleanup. BMC ships two batch files with the installer (..\ARSuiteKitWindows9.1.04\installcompletionutility), one for rollback and one for cleanup. Use the rollback utility when, for example, the installer fails to roll back (perhaps because it could not stop a process or delete files) and you need to return to the earlier version. The utility performs the following actions:

      1. Resets the upgrade mode
      2. Restores the old database version
      3. Stops all the processes
      4. Kills the processes if it detects that they were not stopped
      5. Deletes all the Windows registry services (this step is not needed on UNIX)
      6. Deletes the new binary files (91SP4, that is, 9.1.04, in this case)
      7. Restores the old binaries from the ZDT_Backup folder
      8. Creates the Windows registry services for the old server
      9. Starts the AR Server and its related processes
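
    The two step lists differ only in the kill step and in the Windows-only registry steps. The following sketch builds the ordered list of steps for a given platform so that the difference is easy to see. It only models the sequence described above; it does not call the BMC utilities, and the function name and parameters are hypothetical. Treating "processes stopped" as a known input is a simplification, because the real utility can only detect this after it tries to stop them.

```python
from typing import List


def rollback_plan(is_windows: bool, processes_stopped: bool) -> List[str]:
    """Return the ordered rollback steps described in Case 1 and Case 2."""
    steps = [
        "Reset upgrade mode",
        "Restore old database version",
        "Stop all processes",
    ]
    if not processes_stopped:
        # Extra step performed only by the command-line utility (Case 2).
        steps.append("Kill processes that were not stopped")
    if is_windows:
        steps.append("Delete Windows registry services")
    steps.append("Delete new binary files")
    steps.append("Restore old binaries from the backup folder")
    if is_windows:
        steps.append("Create Windows registry services for the old server")
    steps.append("Start AR Server and its related processes")
    return steps


for number, step in enumerate(rollback_plan(is_windows=True, processes_stopped=False), 1):
    print(f"{number}. {step}")
```

    On Windows with processes that did not stop, the plan has the nine steps of Case 2. With a clean stop it has the eight steps of Case 1, and on UNIX the two registry steps are omitted.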

    Note: You can run this utility multiple times if required.

    Note: After fixing the issue, you can rerun the 9.1.04 (91SP4) installer to upgrade your server.

    Note: You cannot skip the backup and rollback for upgrades from 9.x.

    Note: To run this utility, you need two things: the backup folder and the installation XML file. If both are in place, the utility can restore your setup even if the AR Server is down or the Windows registry services have been deleted.
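
    Before running the rollback utility, it can help to confirm that both prerequisites are present. The following is a minimal sketch of such a check. The function name is hypothetical, and the paths are passed in as parameters because this topic does not give the exact location or name of the installation XML file.

```python
from pathlib import Path
from typing import List


def missing_rollback_inputs(backup_dir: Path, install_xml: Path) -> List[str]:
    """Return a list of problems; an empty list means both prerequisites exist."""
    problems = []
    if not backup_dir.is_dir():
        problems.append(f"backup folder not found: {backup_dir}")
    if not install_xml.is_file():
        problems.append(f"installation XML file not found: {install_xml}")
    return problems


# Hypothetical paths; substitute the locations used in your environment.
for problem in missing_rollback_inputs(Path("ZdtArBackup"), Path("installation.xml")):
    print(problem)
```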

     

    FAQs and troubleshooting

     

    To troubleshoot ZDT upgrade issues, see Troubleshooting installer failure during upgrade.