20 Replies Latest reply: Jun 12, 2012 1:05 PM by Bill Robinson

    help on issue with compliance jobs hanging

    Johann BIGOT

      Hello,

       

      I have an issue with software compliance jobs hanging without starting anything.

      My version is 8.0 SP11.

      I have a batch job which contains several compliance jobs (22). When I run the batch job I only see the first 5 compliance jobs in the task view.

      If I open the result log of the batch job I can see all the others are supposed to be launched. But nothing happens even for the 5 I see in the task view.

      I have no entries in the appserver.log except the ones about the run of the jobs.

      When I open the result log of a compliance job, there's also nothing except the line that says the job is running.

      I have waited up to one hour but the first 5 jobs don't move.

      I have tried recreating the compliance jobs, restarting the appservers, and rebooting the whole infrastructure, but with the same result :-(

      Does anyone have an idea or suggestion about my issue?

       

      Regards

       

      Johann

        • 1. Re: help on issue with compliance jobs hanging
          Sean (BladeLogic Fan) Berry

          How many appservers do you have, and are they on more than one host?

          • 2. help on issue with compliance jobs hanging
            Joe Piotrowski

            I would try to break it down into smaller troubleshooting steps:

            - If you run the first compliance job outside of the batch job does it run successfully?

            - Is the batch job set to use job targets or specify targets?

            - Are your target servers online?

            - Do you use logic in your Smart Server Groups to only run against responding agents?

            - Do the rest of the compliance jobs run successfully individually?

            - What OS are your targets?

            - What version are your agents?

            - Have you run a USP job recently?

            • 3. help on issue with compliance jobs hanging
              Johann BIGOT

              Hi Sean

               

              I have 2 physical servers, each hosting 2 job server instances.

              • 4. help on issue with compliance jobs hanging
                Johann BIGOT

                Hi Joe,

                 

                1) I tried to run the same compliance jobs one by one outside the batch job but with the same result.

                2) batch job uses job targets

                3) yes, we run an update properties job first and we use smart groups which contain only alive machines

                5) no

                6) we are targeting Windows 2003 and 2008 R2 machines

                7) the oldest agent version is 8.0 SP4, but the majority are on SP10

                 

                regards

                • 5. help on issue with compliance jobs hanging
                  Joe Piotrowski

                  OK, so the problem is your compliance jobs are failing? Are these custom compliance jobs or OOTB jobs supplied by BMC? What type of compliance jobs are they? From each of your appservers, can you successfully run an agentinfo against a target server that is hanging/failing? Also, do all of your target servers have large mapped shared storage?

                  • 6. help on issue with compliance jobs hanging
                    Bill Robinson

                    what is the job parallelism set to on the compliance jobs?

                     

                    is there a job_timeout property and/or job_part_timeout property value set for the CJ ?

                     

                    is the 'ignore compliance failures' option selected in the cj ?

                     

                    as joe mentioned, what is in the templates?  EOs? 

                    • 7. help on issue with compliance jobs hanging
                      Johann BIGOT

                      First of all...thanks for all your answers :-)

                      My problem is that the compliance jobs are doing nothing. They are just launched by the appserver and then do nothing.

                      They are custom compliance jobs for software compliance. I can run them from any of my appservers and the result is the same...nothing :-(

                      Yes, I can successfully run an agentinfo on any target machine, but the compliance job doesn't even start to query the machines. It seems to be waiting for something, and I don't know what.

                      And no most of my servers don't have large storage.

                      • 8. Re: help on issue with compliance jobs hanging
                        Bill Robinson

                        What conditions are in the templates ?  what parts are they looking at?

                         

                        When you say they are doing nothing, can you tail the appserver and target agent logs when you run the job and see if there is any activity ?
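One way to watch both logs while the job runs is a small polling helper. This is only a sketch: the log paths in the usage comment are placeholders (not taken from this thread), each log has to be read on the host where it lives, and `read_new_lines` / `watch` are illustrative names rather than anything BladeLogic ships.

```python
import os
import time


def read_new_lines(path, offset=0):
    """Return (complete_lines, new_offset) for data appended after `offset`."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        chunk = fh.read()
    # Only consume up to the last newline so a half-written line is
    # picked up whole on the next poll.
    end = chunk.rfind(b"\n") + 1
    text = chunk[:end].decode("utf-8", errors="replace")
    return text.splitlines(), offset + end


def watch(paths, needle, polls=60, delay=1.0):
    """Print lines containing `needle` that appear in any of `paths`."""
    # Start at the current end of each file, like `tail -f`.
    offsets = {p: os.path.getsize(p) for p in paths}
    for _ in range(polls):
        for p in paths:
            lines, offsets[p] = read_new_lines(p, offsets[p])
            for line in lines:
                if needle in line:
                    print(f"{p}: {line}")
        time.sleep(delay)


# Hypothetical usage (placeholder path, adjust to your installation):
# watch(["/path/to/appserver.log"], "PROD_CAN01", polls=600)
```

Start it just before launching the compliance job; if nothing at all is printed for the job name on either side, that supports the "never reaches the targets" reading.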

                        • 9. help on issue with compliance jobs hanging
                          Johann BIGOT

                          parallelism in our compliance jobs is set to unlimited.

                           

                          There is no job_timeout or job_part_timeout set, but since the jobs don't even start to query the targets, I'm not sure job_part_timeout would be useful.

                           

                          No the option "Continue despite compliance data collection errors" is not set.

                           

                          In fact, since we can't use a smart group in the component template filtering, we set a fake template and use a component smart group that groups all the software/machines we want to check.

                           

                          Everything was running perfectly until 1 or 2 months ago, when we began to see some slowness in the jobs; now they do nothing at all.

                          • 10. help on issue with compliance jobs hanging
                            Joe Piotrowski

                            Have your Component Templates changed? If so, have you tried running a Discover job again? As Bill asked, what discovery conditions (if any) are there, what parts are you including and what are the rules?

                            • 11. help on issue with compliance jobs hanging
                              Johann BIGOT

                              We are mainly looking at registry keys as parts. The compliance rules are very basic, with if conditions.

                              Here is an example of a compliance rule:

                               

                              if
                                 ??TARGET._SERVERPROFILE.SERVERPROFILES_V_4._ADAM_32_V_4?? != "_empty"
                              then
                                 "Registry Value:HKEY_LOCAL_MACHINE\SOFTWARE\Packages\??TARGET._SERVERPROFILE.SERVERPROFILES_V_4._ADAM_32_V_4._SIGNATURE??" exists

                               

                              When I run the jobs, they are launched but there isn't activity.

                              Here is an extract from my appserver log

                               

                              [01 Jun 2012 13:16:07,869] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running the job '6.4.1.3 - PROD - Start Release Compliance' on application server 'p2twisdt01l01job01'(4)

                              [01 Jun 2012 13:16:08,492] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN01

                              [01 Jun 2012 13:16:08,679] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN02

                              [01 Jun 2012 13:16:09,395] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN03

                              [01 Jun 2012 13:16:09,582] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN04

                              [01 Jun 2012 13:16:09,784] [Job-Execution-1] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Compliance] Started running the job '6.4.1.3.1 - PROD_CAN04' on application server 'p2twisdt01l01job01'(4)

                              [01 Jun 2012 13:16:09,893] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN05

                              [01 Jun 2012 13:16:09,908] [Job-Execution-4] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Compliance] Started running the job '6.4.1.3.1 - PROD_CAN02' on application server 'p2twisdt01l01job01'(4)

                              [01 Jun 2012 13:16:09,908] [Job-Execution-0] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Compliance] Started running the job '6.4.1.3.1 - PROD_CAN03' on application server 'p2twisdt01l01job01'(4)

                              [01 Jun 2012 13:16:09,908] [Job-Execution-3] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Compliance] Started running the job '6.4.1.3.1 - PROD_CAN01' on application server 'p2twisdt01l01job01'(4)

                              [01 Jun 2012 13:16:10,547] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN06

                              [01 Jun 2012 13:16:10,733] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN07

                              [01 Jun 2012 13:16:10,936] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN08

                              [01 Jun 2012 13:16:11,107] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN09

                              [01 Jun 2012 13:16:11,278] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN10

                              [01 Jun 2012 13:16:11,512] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_AUTH01

                              [01 Jun 2012 13:16:11,698] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_AUTH02

                              [01 Jun 2012 13:16:11,916] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_BATCH01

                              [01 Jun 2012 13:16:12,103] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_BATCH02

                              [01 Jun 2012 13:16:12,290] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_DIV01

                              [01 Jun 2012 13:16:12,477] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_DIV02

                              [01 Jun 2012 13:16:12,664] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_FS01

                              [01 Jun 2012 13:16:12,835] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_FS02

                              [01 Jun 2012 13:16:13,006] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_MON01

                              [01 Jun 2012 13:16:13,177] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_MON02

                              [01 Jun 2012 13:16:13,395] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_BMC01

                              [01 Jun 2012 13:16:13,566] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_BMC02

                              [01 Jun 2012 13:16:13,753] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Compliance] Started running the job '6.4.1.3.1 - PROD_CAN05' on application server 'p2twisdt01l01job01'(4)

                              [01 Jun 2012 13:16:48,076] [Scheduled-System-Tasks-Thread-6] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 207159296,Free JVM (B): 82156176,Used JVM (B): 125003120,VSize (B): 360439808,RSS (B): 325246976,Used File Descriptors: 1967

                              [01 Jun 2012 13:17:47,863] [Scheduled-System-Tasks-Thread-5] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 209387520,Free JVM (B): 98815200,Used JVM (B): 110572320,VSize (B): 364261376,RSS (B): 330264576,Used File Descriptors: 1967

                              [01 Jun 2012 13:18:47,636] [Scheduled-System-Tasks-Thread-1] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 209387520,Free JVM (B): 43430456,Used JVM (B): 165957064,VSize (B): 363270144,RSS (B): 329441280,Used File Descriptors: 1935

                              [01 Jun 2012 13:19:47,423] [Scheduled-System-Tasks-Thread-2] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 211419136,Free JVM (B): 61654760,Used JVM (B): 149764376,VSize (B): 367083520,RSS (B): 330346496,Used File Descriptors: 1950

                              [01 Jun 2012 13:20:47,239] [Scheduled-System-Tasks-Thread-6] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 212860928,Free JVM (B): 79198168,Used JVM (B): 133662760,VSize (B): 368549888,RSS (B): 332754944,Used File Descriptors: 1939

                              [01 Jun 2012 13:21:47,173] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 194641920,Free JVM (B): 76605056,Used JVM (B): 118036864,VSize (B): 363749376,RSS (B): 331141120,Used File Descriptors: 1970

                              [01 Jun 2012 13:22:47,107] [Scheduled-System-Tasks-Thread-3] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 194641920,Free JVM (B): 21709640,Used JVM (B): 172932280,VSize (B): 374521856,RSS (B): 341393408,Used File Descriptors: 1940

                              [01 Jun 2012 13:23:47,025] [Scheduled-System-Tasks-Thread-3] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 206831616,Free JVM (B): 47513552,Used JVM (B): 159318064,VSize (B): 372621312,RSS (B): 341745664,Used File Descriptors: 1962

                              [01 Jun 2012 13:24:46,959] [Scheduled-System-Tasks-Thread-1] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 189202432,Free JVM (B): 40909032,Used JVM (B): 148293400,VSize (B): 364376064,RSS (B): 336453632,Used File Descriptors: 1949

                              [01 Jun 2012 13:25:46,892] [Scheduled-System-Tasks-Thread-5] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 201916416,Free JVM (B): 60937352,Used JVM (B): 140979064,VSize (B): 365125632,RSS (B): 338190336,Used File Descriptors: 1958

                              [01 Jun 2012 13:26:46,879] [Scheduled-System-Tasks-Thread-3] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 215089152,Free JVM (B): 80853384,Used JVM (B): 134235768,VSize (B): 412426240,RSS (B): 349573120,Used File Descriptors: 1936

                              [01 Jun 2012 13:27:46,850] [Scheduled-System-Tasks-Thread-1] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 234946560,Free JVM (B): 90904480,Used JVM (B): 144042080,VSize (B): 412868608,RSS (B): 350666752,Used File Descriptors: 1958

                              [01 Jun 2012 13:28:46,836] [Scheduled-System-Tasks-Thread-2] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 204931072,Free JVM (B): 64538152,Used JVM (B): 140392920,VSize (B): 411471872,RSS (B): 350109696,Used File Descriptors: 1944

                              [01 Jun 2012 13:29:46,807] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 233046016,Free JVM (B): 96633216,Used JVM (B): 136412800,VSize (B): 408035328,RSS (B): 346959872,Used File Descriptors: 1959

                              [01 Jun 2012 13:30:46,799] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 205258752,Free JVM (B): 73302296,Used JVM (B): 131956456,VSize (B): 406773760,RSS (B): 345780224,Used File Descriptors: 1944

                              [01 Jun 2012 13:31:46,788] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 230752256,Free JVM (B): 102232168,Used JVM (B): 128520088,VSize (B): 404807680,RSS (B): 343863296,Used File Descriptors: 1959

                              [01 Jun 2012 13:32:46,794] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 205193216,Free JVM (B): 82029384,Used JVM (B): 123163832,VSize (B): 402321408,RSS (B): 341467136,Used File Descriptors: 1953

                              [01 Jun 2012 13:33:46,753] [Scheduled-System-Tasks-Thread-3] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 228589568,Free JVM (B): 109260800,Used JVM (B): 119328768,VSize (B): 400371712,RSS (B): 339435520,Used File Descriptors: 1975

                              [01 Jun 2012 13:34:46,727] [Scheduled-System-Tasks-Thread-3] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 205258752,Free JVM (B): 90117456,Used JVM (B): 115141296,VSize (B): 396881920,RSS (B): 336244736,Used File Descriptors: 1962

                              [01 Jun 2012 13:35:46,685] [Scheduled-System-Tasks-Thread-3] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 226230272,Free JVM (B): 114320808,Used JVM (B): 111909464,VSize (B): 395505664,RSS (B): 334688256,Used File Descriptors: 1973

                              [01 Jun 2012 13:36:46,658] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 205258752,Free JVM (B): 94888200,Used JVM (B): 110370552,VSize (B): 392491008,RSS (B): 331853824,Used File Descriptors: 1967

                              [01 Jun 2012 13:37:46,616] [Scheduled-System-Tasks-Thread-6] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 205258752,Free JVM (B): 38882136,Used JVM (B): 166376616,VSize (B): 393621504,RSS (B): 332730368,Used File Descriptors: 1957

                              [01 Jun 2012 13:38:46,614] [Scheduled-System-Tasks-Thread-1] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 223936512,Free JVM (B): 62437176,Used JVM (B): 161499336,VSize (B): 391106560,RSS (B): 330366976,Used File Descriptors: 1950

                              [01 Jun 2012 13:39:46,597] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 205389824,Free JVM (B): 47360544,Used JVM (B): 158029280,VSize (B): 389541888,RSS (B): 328159232,Used File Descriptors: 1957
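The Memory Monitor lines can be reduced to a few numbers to see whether heap or file descriptors trend upward while the jobs sit idle. A minimal sketch, assuming the line layout shown in this extract; `memory_samples` is an illustrative name, and "Total JVM" is taken at face value from the log without assuming it is the configured maximum heap.

```python
import re

# Field names are copied from the Memory Monitor lines in the extract.
MEM_RE = re.compile(
    r"Total JVM \(B\): (?P<total>\d+),\s*Free JVM \(B\): (?P<free>\d+),\s*"
    r"Used JVM \(B\): (?P<used>\d+),.*?Used File Descriptors: (?P<fds>\d+)"
)

SAMPLE_LINE = (
    "[01 Jun 2012 13:16:48,076] [Scheduled-System-Tasks-Thread-6] [INFO] "
    "[System:System:] [Memory Monitor] Total JVM (B): 207159296,"
    "Free JVM (B): 82156176,Used JVM (B): 125003120,VSize (B): 360439808,"
    "RSS (B): 325246976,Used File Descriptors: 1967"
)


def memory_samples(lines):
    """Extract used-heap fraction and descriptor count from each sample."""
    samples = []
    for raw in lines:
        m = MEM_RE.search(raw)
        if not m:
            continue  # not a Memory Monitor line
        total, used = int(m["total"]), int(m["used"])
        samples.append({
            "used_bytes": used,
            "used_fraction": used / total,
            "fds": int(m["fds"]),
        })
    return samples


samples = memory_samples([SAMPLE_LINE])
```

Reading the extract by hand, used heap moves between 110370552 and 172932280 bytes and the descriptor count stays between 1935 and 1975, so these lines on their own do not show a steady climb in either value.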

                               

                               

                              As you can see, more than 20 minutes after the jobs were launched, nothing has happened.
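To compare what the batch job dispatched with what actually logged a start, the extract can be parsed mechanically. This is a rough sketch that assumes the `[timestamp] [thread] [level] [user:role:] [job type] message` layout seen above; `summarize_dispatch` is a made-up helper name, and the three sample lines are copied from the extract.

```python
import re

# Layout of the appserver.log lines quoted in this thread:
# [timestamp] [thread] [level] [user:role:] [job type] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<thread>[^\]]+)\] \[(?P<level>[^\]]+)\] "
    r"\[(?P<ctx>[^\]]*)\] \[(?P<kind>[^\]]+)\] (?P<msg>.*)$"
)
MEMBER_RE = re.compile(r"^Started running member job (?P<name>.+)$")
STARTED_RE = re.compile(r"^Started running the job '(?P<name>[^']+)'")

PREFIX = "[INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:]"
SAMPLE = [
    f"[01 Jun 2012 13:16:08,492] [Job-Execution-2] {PREFIX} [Batch] "
    "Started running member job 6.4.1.3.1 - PROD_CAN01",
    f"[01 Jun 2012 13:16:09,908] [Job-Execution-3] {PREFIX} [Compliance] "
    "Started running the job '6.4.1.3.1 - PROD_CAN01' on application "
    "server 'p2twisdt01l01job01'(4)",
    f"[01 Jun 2012 13:16:10,547] [Job-Execution-2] {PREFIX} [Batch] "
    "Started running member job 6.4.1.3.1 - PROD_CAN06",
]


def summarize_dispatch(lines):
    """Compare member jobs a batch dispatched with jobs that logged a start."""
    dispatched, running, threads = [], [], set()
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # blank or unrecognized line
        if m["thread"].startswith("Job-Execution-"):
            threads.add(m["thread"])
        if m["kind"] == "Batch":
            member = MEMBER_RE.match(m["msg"])
            if member:
                dispatched.append(member["name"])
        elif m["kind"] == "Compliance":
            started = STARTED_RE.match(m["msg"])
            if started:
                running.append(started["name"])
    return {
        "dispatched": dispatched,
        "running": running,
        "pending": [n for n in dispatched if n not in running],
        "threads": sorted(threads),
    }


report = summarize_dispatch(SAMPLE)
```

Counting the full extract by hand gives 22 member jobs dispatched, 5 compliance jobs that logged a start (PROD_CAN01 to PROD_CAN05), and 5 distinct Job-Execution threads (0 to 4). That matches the 5 jobs visible in the task view, though the log alone does not confirm whether a thread limit is what holds back the other 17.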

                              • 12. help on issue with compliance jobs hanging
                                Joe Piotrowski

                                Have you tried re-running the Discover job that creates the Components? Can you successfully live Browse the Components on the servers afterwards?

                                • 13. help on issue with compliance jobs hanging
                                  Johann BIGOT

                                  Hi Joe,

                                   

                                  I have successfully re-run the discovery jobs several times.

                                  Each time I was able to browse components on servers after that.

                                  I'm currently running cleanup jobs (appserver, file server, database) to see if that helps with my issue.

                                  • 14. help on issue with compliance jobs hanging
                                    Bill Robinson

                                    what does the property SERVERPROFILES_V_4._ADAM_32_V_4._SIGNATURE resolve to?

                                     

                                    can the role running the compliance job see that property instance and property value ?
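To make the question concrete: the rule's registry path only becomes a real path once the `??...??` reference is replaced by a value. The sketch below mimics that substitution outside BladeLogic so an unresolved reference is easy to spot. It is not BladeLogic's own resolver, `expand` is an illustrative name, and the value `ADAM_32_EXAMPLE` is invented for the example.

```python
import re

# A property reference is written as ??NAME?? in the rule quoted earlier.
PROP_RE = re.compile(r"\?\?(?P<name>[^?]+)\?\?")

RULE_PATH = (
    r"Registry Value:HKEY_LOCAL_MACHINE\SOFTWARE\Packages"
    r"\??TARGET._SERVERPROFILE.SERVERPROFILES_V_4._ADAM_32_V_4._SIGNATURE??"
)
PROP_NAME = "TARGET._SERVERPROFILE.SERVERPROFILES_V_4._ADAM_32_V_4._SIGNATURE"


def expand(template, values):
    """Replace ??NAME?? references; return (text, unresolved_names)."""
    missing = []

    def substitute(match):
        name = match["name"]
        if name in values:
            return values[name]
        missing.append(name)
        return match.group(0)  # leave the reference visible

    return PROP_RE.sub(substitute, template), missing


# Hypothetical value, only to show the shape of a resolved path.
resolved, unresolved = expand(RULE_PATH, {PROP_NAME: "ADAM_32_EXAMPLE"})
# With no value available, the reference stays in the text and is reported.
raw_text, not_found = expand(RULE_PATH, {})
```

If the role running the job cannot see the property instance, the second case is the one to worry about: the path still contains the literal `??...??` reference.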
