2 Replies Latest reply on Feb 18, 2020 8:47 PM by Paul Robins

    Enterprise wide system Maintenance Procedures for Control-M

    Daniel Sivar

      Hi All,


      Apologies if this topic has already been covered; I searched and found nothing, so here is my situation.


      I got sucked into Control-M several years ago, not knowing anything.  At that time we had migrated from UC4 to Control-M and BMC support folks built our system, and also created a document on what to do during system maintenance.


      At my company, they do system-wide patching of servers (both Windows and Linux), then restart them.  Back when we were on the earlier version, all I had to do was follow that document; it was very straightforward and nothing crazy ever happened.


      Basically it had you do the following (FYI: our Control-M servers are all Windows):

      *System stop to keep ALL jobs from running in the application

      *Log into the EM secondary (we have High Availability on):

           Stop the Server and EM config agents in service manager

      *Open CCM and stop : BIM, Forecast Server, Gateway, GCS, GUI Server, Self Service Server, and Web Server on primary.

      *Log into primary EM and shut down: EM & Server config agents, and Control-M/Server.


      Once the Windows patching completes and the DB work is done, we basically reverse the order and bring everything back up.
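
      The stop order described above, and its reversal for start-up, can be written down as data so the reverse sequence does not have to be worked out by hand. This is only a minimal sketch in Python: the labels are descriptive strings taken from the list above, not actual Windows service names or Control-M commands, and the function names are made up for illustration. It builds the plan and prints it; it does not stop or start anything.

```python
# Stop order as described in the post. Each entry is (location, component).
# These are descriptive labels only, NOT real service names or commands.
COMPONENT_STOP_ORDER = [
    ("EM secondary", "Server config agent"),
    ("EM secondary", "EM config agent"),
    ("CCM (primary)", "BIM"),
    ("CCM (primary)", "Forecast Server"),
    ("CCM (primary)", "Gateway"),
    ("CCM (primary)", "GCS"),
    ("CCM (primary)", "GUI Server"),
    ("CCM (primary)", "Self Service Server"),
    ("CCM (primary)", "Web Server"),
    ("EM primary", "EM config agent"),
    ("EM primary", "Server config agent"),
    ("EM primary", "Control-M/Server"),
]


def stop_plan():
    """Hold jobs first, then stop components in the documented order."""
    steps = ["hold all jobs"]
    steps += [f"stop {name} on {where}" for where, name in COMPONENT_STOP_ORDER]
    return steps


def start_plan():
    """Start components in the reverse order, then release jobs last."""
    steps = [f"start {name} on {where}"
             for where, name in reversed(COMPONENT_STOP_ORDER)]
    steps.append("release held jobs")
    return steps


if __name__ == "__main__":
    for step in stop_plan() + ["-- patch and reboot --"] + start_plan():
        print(step)
```

      Keeping a single list and deriving the start-up order from it means the two sequences cannot drift apart when a component is added or removed.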


      Again, I've done this many times and it has always been straightforward.  Recently (Dec. 2019) I upgraded EM, Server, etc. to a newer version, starting with our non-prod system.  The upgrade went smoothly; however, once we hit our first maintenance window in Jan (the first maint. window on this version), bringing the system back up with the steps above proved complicated.  I had to do all kinds of random things, like killing background processes in the Windows Task Manager, restarting services, etc.  We just had another window and the same thing occurred; prod will also be patched later this month, and I am guessing it will occur there too.


      My questions are:

      1.) What is really the best practice in Starting/Stopping the application for Enterprise wide server maintenance?

                *Does anything really need to be done at all?

      2.) Should I even be doing this?  Or can the application be left alone?



      Any guidance/suggestions would be helpful.


      Thank you,

        • 1. Re: Enterprise wide system Maintenance Procedures for Control-M
          Debra Greszler

          We too have HA configured. When I first got involved with Control-M two years ago, I did the manual service stops and server reboots as this is what I had been told we need to do. I have had this same conversation with BMC support and was told that we did not have to stop the services or reboot the servers in any special order. Our next patching cycle, I let it run automatically including the server restarts and all went well. The only thing I did was stop the services on our HA server since I did not want it to be primary. It patches in a later patching window. What we have since done is set up a Control-M job to stop the HA services prior to the automatic patching window and to put our servers in maintenance mode in our enterprise monitoring system (Zabbix). Now we do nothing for our automated patching that occurs Sunday morning. We have mostly safeguarded this window from having jobs scheduled in Control-M as this is our standard maintenance window for the majority of our systems.


          Hope this helps.

          • 2. Re: Enterprise wide system Maintenance Procedures for Control-M
            Paul Robins

            We have our primary and secondary servers in different patching windows at least 1 week apart (the servers are on different VM infrastructure at different sites, including MSSQL cluster setup).

            We fail over to secondary prior to our primary patching, then we fallback to primary prior to our secondary patching. This way if the patching introduces any issues on the primary system we can run from secondary and delay the secondary patching until the issue is resolved.

            It also means we don't get called at 4am to PVT Control-M.
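
            The staggered approach above can be laid out as a simple plan. This is a rough sketch in Python under stated assumptions: the function name, the `primary_ok` flag, and the step wording are invented for illustration, and the only rule taken from the post is that the two windows are at least one week apart. It does not perform any failover; it only lists the intended steps and dates.

```python
from datetime import date, timedelta


def patch_plan(primary_window, gap_days=7, primary_ok=True):
    """Return (date, step) pairs for one staggered patching cycle.

    primary_window: date the primary is patched.
    gap_days: days between the two windows (at least 7, per the post).
    primary_ok: False if patching introduced an issue on the primary.
    """
    if gap_days < 7:
        raise ValueError("windows should be at least 1 week apart")

    plan = [
        (primary_window, "fail over to secondary"),
        (primary_window, "patch primary"),
    ]
    if not primary_ok:
        # Keep running from secondary; its patching waits for the fix.
        plan.append((None, "stay on secondary; delay secondary patching"))
        return plan

    secondary_window = primary_window + timedelta(days=gap_days)
    plan += [
        (secondary_window, "fall back to primary"),
        (secondary_window, "patch secondary"),
    ]
    return plan


if __name__ == "__main__":
    for when, step in patch_plan(date(2020, 2, 2)):
        print(when, step)
```

            The point of the ordering is that whichever node is being patched is never the one running the workload at that moment.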