20 Replies Latest reply: Jun 12, 2012 1:05 PM by Bill Robinson

help on issue with compliance jobs hanging

Johann BIGOT

Hello,

 

I have an issue with software compliance jobs hanging without starting anything.

My version is 8.0 SP11.

I have a batch job which contains 22 compliance jobs. When I run the batch job, I only see the first 5 compliance jobs in the task view.

If I open the result log of the batch job I can see all the others are supposed to be launched. But nothing happens even for the 5 I see in the task view.

I have no entries in the appserver.log except the ones about the run of the jobs.

When I open the result log of a compliance job, there's also nothing except the line that says the job is running.

I have waited up to one hour but the first 5 jobs don't move.

I have tried recreating the compliance jobs, restarting the appservers, and rebooting the whole infrastructure, but with the same result :-(

Does anyone have an idea or suggestion about my issue?

 

Regards

 

Johann

  • 1. Re: help on issue with compliance jobs hanging
    Sean (BladeLogic Fan) Berry

    How many appservers do you have, and are they on more than one host?

  • 2. help on issue with compliance jobs hanging
    Joe Piotrowski

    I would try to break it down into smaller troubleshooting steps:

    - If you run the first compliance job outside of the batch job does it run successfully?

    - Is the batch job set to use job targets or specify targets?

    - Are your target servers online?

    - Do you use logic in your Smart Server Groups to only run against responding agents?

    - Do the rest of the compliance jobs run successfully individually?

    - What OS are your targets?

    - What version are your agents?

    - Have you run a USP job recently?

  • 3. help on issue with compliance jobs hanging
    Johann BIGOT

    Hi Sean

     

    I have 2 physical servers, each hosting 2 job server instances.

  • 4. help on issue with compliance jobs hanging
    Johann BIGOT

    Hi Joe,

     

    1) I tried to run the same compliance jobs one by one outside the batch job but with the same result.

    2) batch job uses job targets

    3) yes, we run an update properties job first, and we use smartgroups which contain only live machines

    5) no

    6) we are targeting Windows 2003 and 2008 R2 machines

    7) the oldest version is 8.0 SP4, but the majority are on SP10

     

    regards

  • 5. help on issue with compliance jobs hanging
    Joe Piotrowski

    OK, so the problem is your compliance jobs are failing? Are these custom compliance jobs or OOTB jobs supplied by BMC? What type of compliance jobs are they? From each of your appservers, can you successfully run an agentinfo against a target server that is hanging/failing? Also, do all of your target servers have large mapped shared storage?
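
    The agentinfo check can be repeated over a list of targets from each appserver. A minimal Python sketch, assuming the `agentinfo` command is on the PATH, takes the target hostname as its argument, and returns a non-zero exit code on failure (the exit-code behaviour is an assumption, not something confirmed in this thread). The function name and the example hostnames are hypothetical.

```python
import subprocess


def check_agents(hosts, command=("agentinfo",), timeout=30):
    """Run the check command against each host; return {host: (ok, detail)}."""
    results = {}
    for host in hosts:
        try:
            proc = subprocess.run(
                [*command, host],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            # Assumption: a zero exit code means the agent answered.
            ok = proc.returncode == 0
            detail = (proc.stdout if ok else proc.stderr or proc.stdout).strip()
        except subprocess.TimeoutExpired:
            ok, detail = False, f"no answer within {timeout}s"
        except FileNotFoundError:
            ok, detail = False, f"command not found: {command[0]}"
        results[host] = (ok, detail)
    return results


def failing_hosts(results):
    """Return only the hosts whose check did not succeed."""
    return sorted(host for host, (ok, _) in results.items() if not ok)
```

    Running something like `failing_hosts(check_agents(["target01", "target02"]))` on each appserver host would show whether the same targets fail everywhere or only from one appserver. The timeout matters here because a hanging agent would otherwise block the whole loop.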

  • 6. help on issue with compliance jobs hanging
    Bill Robinson

    what is the job parallelism set to on the compliance jobs?

     

    is there a job_timeout property and/or job_part_timeout property value set for the CJ ?

     

    is the 'ignore compliance failures' option selected in the cj ?

     

    as joe mentioned, what is in the templates?  EOs? 

  • 7. help on issue with compliance jobs hanging
    Johann BIGOT

    First of all...thanks for all your answers :-)

    My problem is that the compliance jobs are doing nothing. They are just launched by the appserver and then do nothing.

    They are custom compliance jobs for software compliance. I can run them from any of my appservers and the result is the same...nothing :-(

    Yes, I can successfully run an agentinfo on any target machine, but the compliance job doesn't even start to query machines. It seems to be waiting for something, but I don't know what.

    And no, most of my servers don't have large storage.

  • 8. Re: help on issue with compliance jobs hanging
    Bill Robinson

    What conditions are in the templates ?  what parts are they looking at?

     

    When you say they are doing nothing, can you tail the appserver and target agent logs when you run the job and see if there is any activity ?

  • 9. help on issue with compliance jobs hanging
    Johann BIGOT

    parallelism in our compliance jobs is set to unlimited.

     

    There is no job_timeout or job_part_timeout set, but since the jobs don't even start to query the targets, I'm not sure job_part_timeout would be useful.

     

    No, the option "Continue despite compliance data collection errors" is not set.

     

    In fact, since we can't use a smartgroup for the component template filtering, we set a fake template and use a component smartgroup which groups all the software/machines that we want to check.

     

    Everything was running perfectly until 1 or 2 months ago, when we began to see some slowness in the jobs, but now they do nothing.

  • 10. help on issue with compliance jobs hanging
    Joe Piotrowski

    Have your Component Templates changed? If so, have you tried running a Discover job again? As Bill asked, what discovery conditions (if any) are there, what parts are you including and what are the rules?

  • 11. help on issue with compliance jobs hanging
    Johann BIGOT

    We are mainly looking at registry keys as parts. The compliance rules are very basic, with if conditions.

    Here is an example of a compliance rule:

     

    if
         ??TARGET._SERVERPROFILE.SERVERPROFILES_V_4._ADAM_32_V_4?? != "_empty"
    then
         "Registry Value:HKEY_LOCAL_MACHINE\SOFTWARE\Packages\??TARGET._SERVERPROFILE.SERVERPROFILES_V_4._ADAM_32_V_4._SIGNATURE??" exists

     

    When I run the jobs, they are launched but there is no activity.

    Here is an extract from my appserver log

     

    [01 Jun 2012 13:16:07,869] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running the job '6.4.1.3 - PROD - Start Release Compliance' on application server 'p2twisdt01l01job01'(4)

    [01 Jun 2012 13:16:08,492] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN01

    [01 Jun 2012 13:16:08,679] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN02

    [01 Jun 2012 13:16:09,395] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN03

    [01 Jun 2012 13:16:09,582] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN04

    [01 Jun 2012 13:16:09,784] [Job-Execution-1] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Compliance] Started running the job '6.4.1.3.1 - PROD_CAN04' on application server 'p2twisdt01l01job01'(4)

    [01 Jun 2012 13:16:09,893] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN05

    [01 Jun 2012 13:16:09,908] [Job-Execution-4] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Compliance] Started running the job '6.4.1.3.1 - PROD_CAN02' on application server 'p2twisdt01l01job01'(4)

    [01 Jun 2012 13:16:09,908] [Job-Execution-0] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Compliance] Started running the job '6.4.1.3.1 - PROD_CAN03' on application server 'p2twisdt01l01job01'(4)

    [01 Jun 2012 13:16:09,908] [Job-Execution-3] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Compliance] Started running the job '6.4.1.3.1 - PROD_CAN01' on application server 'p2twisdt01l01job01'(4)

    [01 Jun 2012 13:16:10,547] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN06

    [01 Jun 2012 13:16:10,733] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN07

    [01 Jun 2012 13:16:10,936] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN08

    [01 Jun 2012 13:16:11,107] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN09

    [01 Jun 2012 13:16:11,278] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3.1 - PROD_CAN10

    [01 Jun 2012 13:16:11,512] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_AUTH01

    [01 Jun 2012 13:16:11,698] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_AUTH02

    [01 Jun 2012 13:16:11,916] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_BATCH01

    [01 Jun 2012 13:16:12,103] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_BATCH02

    [01 Jun 2012 13:16:12,290] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_DIV01

    [01 Jun 2012 13:16:12,477] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_DIV02

    [01 Jun 2012 13:16:12,664] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_FS01

    [01 Jun 2012 13:16:12,835] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_FS02

    [01 Jun 2012 13:16:13,006] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_MON01

    [01 Jun 2012 13:16:13,177] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_MON02

    [01 Jun 2012 13:16:13,395] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_BMC01

    [01 Jun 2012 13:16:13,566] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Batch] Started running member job 6.4.1.3 - PROD_BMC02

    [01 Jun 2012 13:16:13,753] [Job-Execution-2] [INFO] [devcp@beprod01.eoc.net:EC_Windows_Evolution_PROD:] [Compliance] Started running the job '6.4.1.3.1 - PROD_CAN05' on application server 'p2twisdt01l01job01'(4)

    [01 Jun 2012 13:16:48,076] [Scheduled-System-Tasks-Thread-6] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 207159296,Free JVM (B): 82156176,Used JVM (B): 125003120,VSize (B): 360439808,RSS (B): 325246976,Used File Descriptors: 1967

    [01 Jun 2012 13:17:47,863] [Scheduled-System-Tasks-Thread-5] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 209387520,Free JVM (B): 98815200,Used JVM (B): 110572320,VSize (B): 364261376,RSS (B): 330264576,Used File Descriptors: 1967

    [01 Jun 2012 13:18:47,636] [Scheduled-System-Tasks-Thread-1] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 209387520,Free JVM (B): 43430456,Used JVM (B): 165957064,VSize (B): 363270144,RSS (B): 329441280,Used File Descriptors: 1935

    [01 Jun 2012 13:19:47,423] [Scheduled-System-Tasks-Thread-2] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 211419136,Free JVM (B): 61654760,Used JVM (B): 149764376,VSize (B): 367083520,RSS (B): 330346496,Used File Descriptors: 1950

    [01 Jun 2012 13:20:47,239] [Scheduled-System-Tasks-Thread-6] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 212860928,Free JVM (B): 79198168,Used JVM (B): 133662760,VSize (B): 368549888,RSS (B): 332754944,Used File Descriptors: 1939

    [01 Jun 2012 13:21:47,173] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 194641920,Free JVM (B): 76605056,Used JVM (B): 118036864,VSize (B): 363749376,RSS (B): 331141120,Used File Descriptors: 1970

    [01 Jun 2012 13:22:47,107] [Scheduled-System-Tasks-Thread-3] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 194641920,Free JVM (B): 21709640,Used JVM (B): 172932280,VSize (B): 374521856,RSS (B): 341393408,Used File Descriptors: 1940

    [01 Jun 2012 13:23:47,025] [Scheduled-System-Tasks-Thread-3] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 206831616,Free JVM (B): 47513552,Used JVM (B): 159318064,VSize (B): 372621312,RSS (B): 341745664,Used File Descriptors: 1962

    [01 Jun 2012 13:24:46,959] [Scheduled-System-Tasks-Thread-1] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 189202432,Free JVM (B): 40909032,Used JVM (B): 148293400,VSize (B): 364376064,RSS (B): 336453632,Used File Descriptors: 1949

    [01 Jun 2012 13:25:46,892] [Scheduled-System-Tasks-Thread-5] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 201916416,Free JVM (B): 60937352,Used JVM (B): 140979064,VSize (B): 365125632,RSS (B): 338190336,Used File Descriptors: 1958

    [01 Jun 2012 13:26:46,879] [Scheduled-System-Tasks-Thread-3] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 215089152,Free JVM (B): 80853384,Used JVM (B): 134235768,VSize (B): 412426240,RSS (B): 349573120,Used File Descriptors: 1936

    [01 Jun 2012 13:27:46,850] [Scheduled-System-Tasks-Thread-1] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 234946560,Free JVM (B): 90904480,Used JVM (B): 144042080,VSize (B): 412868608,RSS (B): 350666752,Used File Descriptors: 1958

    [01 Jun 2012 13:28:46,836] [Scheduled-System-Tasks-Thread-2] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 204931072,Free JVM (B): 64538152,Used JVM (B): 140392920,VSize (B): 411471872,RSS (B): 350109696,Used File Descriptors: 1944

    [01 Jun 2012 13:29:46,807] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 233046016,Free JVM (B): 96633216,Used JVM (B): 136412800,VSize (B): 408035328,RSS (B): 346959872,Used File Descriptors: 1959

    [01 Jun 2012 13:30:46,799] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 205258752,Free JVM (B): 73302296,Used JVM (B): 131956456,VSize (B): 406773760,RSS (B): 345780224,Used File Descriptors: 1944

    [01 Jun 2012 13:31:46,788] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 230752256,Free JVM (B): 102232168,Used JVM (B): 128520088,VSize (B): 404807680,RSS (B): 343863296,Used File Descriptors: 1959

    [01 Jun 2012 13:32:46,794] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 205193216,Free JVM (B): 82029384,Used JVM (B): 123163832,VSize (B): 402321408,RSS (B): 341467136,Used File Descriptors: 1953

    [01 Jun 2012 13:33:46,753] [Scheduled-System-Tasks-Thread-3] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 228589568,Free JVM (B): 109260800,Used JVM (B): 119328768,VSize (B): 400371712,RSS (B): 339435520,Used File Descriptors: 1975

    [01 Jun 2012 13:34:46,727] [Scheduled-System-Tasks-Thread-3] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 205258752,Free JVM (B): 90117456,Used JVM (B): 115141296,VSize (B): 396881920,RSS (B): 336244736,Used File Descriptors: 1962

    [01 Jun 2012 13:35:46,685] [Scheduled-System-Tasks-Thread-3] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 226230272,Free JVM (B): 114320808,Used JVM (B): 111909464,VSize (B): 395505664,RSS (B): 334688256,Used File Descriptors: 1973

    [01 Jun 2012 13:36:46,658] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 205258752,Free JVM (B): 94888200,Used JVM (B): 110370552,VSize (B): 392491008,RSS (B): 331853824,Used File Descriptors: 1967

    [01 Jun 2012 13:37:46,616] [Scheduled-System-Tasks-Thread-6] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 205258752,Free JVM (B): 38882136,Used JVM (B): 166376616,VSize (B): 393621504,RSS (B): 332730368,Used File Descriptors: 1957

    [01 Jun 2012 13:38:46,614] [Scheduled-System-Tasks-Thread-1] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 223936512,Free JVM (B): 62437176,Used JVM (B): 161499336,VSize (B): 391106560,RSS (B): 330366976,Used File Descriptors: 1950

    [01 Jun 2012 13:39:46,597] [Scheduled-System-Tasks-Thread-4] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 205389824,Free JVM (B): 47360544,Used JVM (B): 158029280,VSize (B): 389541888,RSS (B): 328159232,Used File Descriptors: 1957

     

     

    As you can see, more than 20 minutes after the jobs were launched, nothing has happened.
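
    To see at a glance which member jobs were launched by the batch but never picked up, the log can be compared against itself. A rough Python sketch, relying only on the two message shapes visible in the extract above ("[Batch] Started running member job ..." and "[Compliance] Started running the job '...'"); the function name is hypothetical, and other appserver versions may word these lines differently.

```python
import re

# Line shapes taken from the appserver.log extract in this post.
BATCH_MEMBER = re.compile(r"\[Batch\] Started running member job (?P<name>.+?)\s*$")
JOB_STARTED = re.compile(r"\[Compliance\] Started running the job '(?P<name>[^']+)'")


def find_unstarted_members(log_lines):
    """Return member jobs the batch launched that never logged a Compliance start."""
    launched = []   # keeps the order in which the batch launched its members
    started = set()
    for line in log_lines:
        match = BATCH_MEMBER.search(line)
        if match:
            launched.append(match.group("name"))
            continue
        match = JOB_STARTED.search(line)
        if match:
            started.add(match.group("name"))
    return [name for name in launched if name not in started]


# Small inline sample in the same shape as the extract.
sample = [
    "[01 Jun 2012 13:16:08,492] [Job-Execution-2] [INFO] [user:role:] "
    "[Batch] Started running member job 6.4.1.3.1 - PROD_CAN01",
    "[01 Jun 2012 13:16:10,547] [Job-Execution-2] [INFO] [user:role:] "
    "[Batch] Started running member job 6.4.1.3.1 - PROD_CAN06",
    "[01 Jun 2012 13:16:09,908] [Job-Execution-3] [INFO] [user:role:] "
    "[Compliance] Started running the job '6.4.1.3.1 - PROD_CAN01' "
    "on application server 'p2twisdt01l01job01'(4)",
]
print(find_unstarted_members(sample))
```

    An open file works as `log_lines` too, since the function only iterates over lines. On the extract above it should list the 17 member jobs from PROD_CAN06 onward, which matches the 5 jobs visible in the task view out of 22.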

  • 12. help on issue with compliance jobs hanging
    Joe Piotrowski

    Have you tried re-running the Discover job that creates the Components? Can you successfully live Browse the Components on the servers afterwards?

  • 13. help on issue with compliance jobs hanging
    Johann BIGOT

    Hi Joe,

     

    I have successfully re-run the discovery jobs several times.

    Each time I was able to browse components on the servers afterwards.

    I'm currently running cleanup jobs (appserver, file server, database) to see if that helps with my issue.

  • 14. help on issue with compliance jobs hanging
    Bill Robinson

    what does the property SERVERPROFILES_V_4._ADAM_32_V_4._SIGNATURE resolve to?

     

    can the role running the compliance job see that property instance and property value ?
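
    To illustrate why the resolved value matters: the rule's registry path only makes sense once the `??...??` reference has been replaced by a real value. A small Python sketch of that substitution, purely as a local model and not BladeLogic's own resolver; the function name and the "example-signature" value are hypothetical.

```python
import re

# A property reference is written between double question marks.
PLACEHOLDER = re.compile(r"\?\?(.+?)\?\?")


def expand(text, properties):
    """Replace ??NAME?? with properties[NAME]; collect names that do not resolve."""
    missing = []

    def substitute(match):
        name = match.group(1)
        if name in properties:
            return properties[name]
        missing.append(name)
        return match.group(0)  # leave the unresolved reference untouched

    return PLACEHOLDER.sub(substitute, text), missing


rule_path = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Packages"
    r"\??TARGET._SERVERPROFILE.SERVERPROFILES_V_4._ADAM_32_V_4._SIGNATURE??"
)
# Hypothetical value, only to show the shape of a resolved path.
known = {
    "TARGET._SERVERPROFILE.SERVERPROFILES_V_4._ADAM_32_V_4._SIGNATURE": "example-signature"
}
resolved, missing = expand(rule_path, known)
print(resolved)
print(missing)
```

    If the property is empty or not visible to the role, the `missing` list (or an odd-looking resolved path) shows immediately which reference to check on the server's property instance.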
