9 Replies Latest reply on Aug 18, 2015 2:30 AM by Mark Francome

    Load Balancing in a virtual environment

    Mark Francome

      Hi,

      We use Control-M Agents on Linux hosts and the jobs are set to run on virtual package names (as opposed to the underlying physical name of the server). Control-M works well in this way (btw, we are on Version 8 with all latest patches).

       

      However, the user now wants to limit the number of jobs running on any one physical box - our virtual packages can dynamically fail over so potentially all virtual packages could be running on one server.

       

      It might be easier to give an example:

       

      Job DBA_001 runs on virtual package (i.e. %%NODEID) ora_bnkg1; this job also runs on virtual packages ora_bnkg2, ora_bnkg3 and so on. These virtual packages can run on a variety of servers (i.e. lbnds100, lbnds101, lbnds102 and so on). It is possible that all ora_bnkg* packages could be running on the one server (e.g. lbnds101). How can we limit (via QRs or CRs) the number of DBA jobs running on lbnds101 at any one time?

       

      I can get the actual server from $(hostname) - so using this command -

       

      phyhost=$(hostname) && echo "$phyhost" && ecaqrtab UPDATE "$phyhost" 0

       

      but I cannot use the above for an already defined job ... I have also tried to get this to work via CTMLOADSET but I think that only works via the defined NODEID and doesn't actually reflect the physical server - is this correct?
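
      To make the failover scenario concrete, here is a minimal sketch that tallies how many virtual packages currently sit on each physical server. The input format (one "package host" pair per line) and the function name packages_per_host are assumptions for illustration only; how the mapping is obtained from the cluster is not shown.

```shell
#!/bin/sh
# Hypothetical helper: read "package host" pairs on stdin and print
# the number of virtual packages per physical host, sorted by host name.
packages_per_host() {
    awk '{ count[$2]++ } END { for (h in count) print h, count[h] }' | sort
}

# Example mapping (made up to match the names used in the post):
printf '%s\n' \
    'ora_bnkg1 lbnds101' \
    'ora_bnkg2 lbnds101' \
    'ora_bnkg3 lbnds102' | packages_per_host
```

      A host with a high count is the one where a per-server job limit would matter most.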

        • 1. Re: Load Balancing in a virtual environment
          AVINASH MAHAPATRA

          Hi Mark,

           

          I think you can try with "Hosts Management" in CCM for v8.

           

          Thanks,

          Avinash

          • 2. Re: Load Balancing in a virtual environment
            Robert Stinnett

            Avinash is correct. There are two ways to accomplish this:

             

            Via Host Restrictions in the Hosts Manager of the CCM you can define the maximum number of jobs (total) to run on the agent at any given time.

             

            If you wanted to get a little more sophisticated with it, you could use a Workload Policy. In this manner you could say that, for example, at any given time you only want a grand total of x jobs that meet certain criteria running. You would then filter by host (or by your host group).

             

            Hope that helps.

             

            Cheers,

            Robert Stinnett

            • 3. Re: Load Balancing in a virtual environment
              MunKeong Lee

              Hi Mark

              If I understand your question correctly, Host Restrictions will not work in your case. Control-M is going to use the virtual package names to execute jobs. You can control the no. of jobs running on a particular virtual package name through host restrictions but you can't do it on the physical host because Control-M is not even aware which physical host is used.

              I suggest that you look at the maximum no. of jobs across all physical hosts instead of each individual physical host. This is based on the assumption that your virtual environment will do the necessary load balancing. Using your example, let's say you have 2 physical hosts, lbnds101 and lbnds102, and each one of them can handle 10 jobs. Create a workload policy with the following criteria:

              For filter: set Host=ora_bnkg*

              For running jobs: set to 20 (if both physical hosts are available), set to 10 (if only 1 physical host is available)

              If both lbnds101 and lbnds102 are available, you can have up to 20 ora_bnkg* running jobs. The load balancing may not be perfect. There may be more jobs on lbnds101 than lbnds102 but the difference should not be great. When lbnds101 is not available, update the max running jobs to 10 and now you can have up to 10 ora_bnkg* jobs running on lbnds102.
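
              The arithmetic behind the running-jobs value can be sketched as below. This is only an illustration: the function name policy_limit and the "host:up" / "host:down" argument format are assumptions, and neither the availability check nor the step that updates the workload policy is shown.

```shell
#!/bin/sh
# Hypothetical helper: the first argument is the per-host job capacity,
# the remaining arguments are "host:up" or "host:down" entries.
# Prints the max running jobs to set for the ora_bnkg* filter.
policy_limit() {
    per_host=$1
    shift
    up=0
    for entry in "$@"; do
        case $entry in
            *:up) up=$(( up + 1 )) ;;   # count only available hosts
        esac
    done
    echo $(( per_host * up ))
}

policy_limit 10 lbnds101:up lbnds102:up     # both hosts available
policy_limit 10 lbnds101:down lbnds102:up   # lbnds101 unavailable
```

              With a capacity of 10 per host, the two calls print 20 and 10, matching the values in the example above.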


              Regards,

              MK

              • 4. Re: Load Balancing in a virtual environment

                Another alternative would be to use the quantitative resources load balancing.

                 

                For example:

                • Create a node group named ‘my-group’ that includes ‘host-a’ and ‘host-b’.
                • Create 2 quantitative resources as follows:

                          cpu@host-a       20

                          cpu@host-b       10

                • In the job definitions (those that will run on ‘my-group’), configure how many resources each individual job requires:

                          cpu@    5

                 

                This means that 4 jobs will run in parallel on ‘host-a’, but only 2 will run in parallel on ‘host-b’.

                 

                The benefit of this alternative is that you can automate changes to the load balancing configuration using the ‘ecaqrtab’ and ‘ctmloadset’ utilities.

                Search for these 2 utilities in the utilities guide and you’ll find more information and examples.
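
                As a quick check of the numbers in this example, the sketch below divides a host's resource total by the quantity each job requests. The function name parallel_jobs is made up for illustration; it only does the arithmetic and does not call any Control-M utility.

```shell
#!/bin/sh
# Hypothetical helper: how many jobs can run at once on a host, given
# the host's quantitative resource total and the per-job requirement.
parallel_jobs() {
    total=$1      # e.g. 20 for cpu@host-a
    per_job=$2    # e.g. 5, the cpu@ quantity in the job definition
    if [ "$per_job" -le 0 ]; then
        echo "per-job quantity must be positive" >&2
        return 1
    fi
    echo $(( total / per_job ))   # integer division: leftovers run no job
}

parallel_jobs 20 5   # host-a
parallel_jobs 10 5   # host-b
```

                The two calls print 4 and 2, the parallel job counts stated above for ‘host-a’ and ‘host-b’.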

                • 5. Re: Load Balancing in a virtual environment
                  Mark Francome

                  Hi,

                   

                  Thanks guys, all the replies gave me help and I have made progress. The fundamental issue (as MK points out) is that my host Agent is not reflected in the NODEID field and virtualization makes it hard for me to control the job submission with QRs when I don't know the relevant QR before the job is executing. Maybe it's one for the Control-M "wish list" - i.e. make a variable QR (or exclusive CR) based on the returned value of "hostname" as part of a pre-command step.

                   

                  I'm a little bit cautious and don't want to see any "unintended consequences" in the batch, therefore I'm going to try Avinash and Robert's suggestion of limiting via host management (in the CCM). I'm going to try limiting the jobs by CPU % on a single virtual node. I've just run a ctmstats -list report to give me an idea of the expected numbers involved and hopefully we'll keep our DBAs happy with their housekeeping jobs. DBAs, as you know, can be angry people - especially when you use the BMC supplied PostGres for your DB

                   

                  Thanks again,

                   

                  Mark.

                  • 6. Re: Load Balancing in a virtual environment
                    MunKeong Lee

                    Hi Mark:

                    Thanks for sharing your approach. Going with host management is definitely much better and less complicated. I also noticed that you are limiting by CPU on the virtual node. This is correct, as CPU is measured just before executing the job. If the limit were based on max. concurrent jobs, the count would be based on the multiple physical nodes that were assigned to the virtual node over time, and this would not reflect the true loading of the host.

                    Regards,

                    Mun Keong

                    • 7. Re: Load Balancing in a virtual environment
                      Mark Francome

                      Hi Mun Keong,

                      Is there any way to get further information on how the CPU load is calculated by Control-M? The only clue I have is by running the ctmstats utility on the Control-M Server and viewing the "Avg. CPU" number listed. I guess if a job starts with low CPU usage and then suddenly increases then we'd only see that in subsequently scheduled jobs being held back from execution.

                      Regards,

                      Mark.

                      • 8. Re: Load Balancing in a virtual environment
                        MunKeong Lee

                        Hi Mark:

                        Your question led me to this knowledge article: https://kb.bmc.com/infocenter/index?page=content&id=KA389062 . It's a good read on how CPU data on the agent is sent to the server. I expected the server to pull the CPU data from the agent before deciding whether to execute on the agent, but it turns out that the agent pushes it to the server at periodic intervals. I'm not sure what will actually happen in your environment, but these are my thoughts. The physical host will be the one pushing the information. For every push, the CPU data will end up in 1 of the many virtual hosts, and the virtual host that gets the CPU data may not be the next one to send a job to this physical host. Overall it can be quite messy, and the host restriction rule may be wrongly applied because it is not using the most up-to-date data.

                        Unless there are other advantages that this virtual environment can offer, my suggestion is to switch to physical hosts to make use of Control-M load balancing. Create a host group consisting of all the physical hosts that your ora* jobs will use. Set the host restrictions for all the physical hosts. Next, do a Find and Update and change the host value from ora* to the host group name. Control-M will now select the physical host for job execution based on the host restriction criteria that you have imposed.

                        Regards,

                        MK

                        • 9. Re: Load Balancing in a virtual environment
                          Mark Francome

                          Ok, after opening a case with BMC I've discovered the following -

                           

                          1. The CPU limitation only works with physical servers, not virtual/aliased nodes. Limiting on number of jobs, however, works fine in both scenarios.

                           

                          2. To verify that you have defined the CPU limitation you can check in the CM Server DB -

                           

                               select * from CMS_NODEID_RUNNING;


                          3. But if the limitation is not in place then there will be no relevant output from this CM Server command -


                               ctmipc -dest ce -msgid ctl -data "CpuLimits PrintAll"


                          I will try and find a work-around for this.