3 Replies Latest reply on May 15, 2015 6:57 AM by Bill Robinson

    Is there a way to correlate defunct nsh processes to bl jobs?

    Jason Culver

      I have some defunct/zombie processes reported on the BL (8.1.0) app server that are consuming system resources on a RHEL 5.5 platform.  I was curious whether there is a way to cross-reference a defunct nsh process to the specific BL job that could be causing the runaway process.  This would be helpful so I can kill the specific culprit job and have the job owner review the job/script before running it again.

       

      [culverj@lit-vablm-p001 ~]$ ps aux | grep 'Z'

      USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND

      bladmin  12588  0.1  0.0      0     0 ?        Z    18:45   0:00 [nsh] <defunct>

      bladmin  13105  0.1  0.0      0     0 ?        Z    18:45   0:00 [nsh] <defunct>

      bladmin  13400  0.0  0.0      0     0 ?        Z    18:45   0:00 [nsh] <defunct>

      culverj  13425  0.0  0.0  61172   764 pts/1    R+   18:45   0:00 grep Z
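A note on the listing above: `grep 'Z'` matches any line that contains a capital Z, which is why the `grep Z` process itself shows up. A minimal sketch of a stricter filter follows, assuming a procps-style `ps` that accepts `-o` with empty headers. The function names `filter_zombies` and `list_zombies` are made up for illustration. Printing the PPID as well shows which parent process has not yet reaped each zombie.

```shell
# Keep only rows whose third column (STAT) starts with Z.
# Expects rows shaped like: PID PPID STAT USER COMMAND
filter_zombies() {
    awk '$3 ~ /^Z/'
}

# List zombie processes with their parent PID.
# Assumption: ps supports "-o name=" to suppress the header line.
list_zombies() {
    ps -e -o pid= -o ppid= -o stat= -o user= -o comm= | filter_zombies
}
```

Running `list_zombies` on the app server should then print one row per defunct process and nothing else.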

        • 1. Re: Is there a way to correlate defunct nsh processes to bl jobs?
          Bill Robinson

          The PIDs should be logged in the appserver or spawner logs when the job starts.

          • 2. Re: Is there a way to correlate defunct nsh processes to bl jobs?
            Jason Culver

            Thanks Bill!  I found them in appserver.log and was able to find the matching PIDs, which sheds light on the owner and role for the job.  Do you know if there is a way from here to match the WorkItem-Thread number to the job name running in the BBSA console?

             

            [root@lit-vablm-p001 br]# cat appserver.log | grep 12588

            [14 May 2015 19:28:40,337] [WorkItem-Thread-48] [INFO] [sesettyr:BLAdmins:] [NSHScript] Executing command : /opt/bmc/BladeLogic/8.1/NSH/bin/nsh --norc -c /opt/bmc/BladeLogic/8.1/NSH/tmp/application_server/scripts/job__ef6808f8-6d2d-4ac5-93f5-f973c3170066/master_b5679e6b-cf3a-4613-b90f-5b2ce6c12588

            [14 May 2015 19:28:44,054] [WorkItem-Thread-48] [INFO] [sesettyr:BLAdmins:] [NSHScript] Started pid 711: /opt/bmc/BladeLogic/8.1/NSH/bin/nsh --norc -c /opt/bmc/BladeLogic/8.1/NSH/tmp/application_server/scripts/job__ef6808f8-6d2d-4ac5-93f5-f973c3170066/master_b5679e6b-cf3a-4613-b90f-5b2ce6c12588

            [14 May 2015 20:18:29,323] [WorkItem-Thread-22] [INFO] [sesettyr:Application:] [NSHScript] Started pid 12588: /opt/bmc/BladeLogic/8.1/NSH/bin/nsh --norc -c /opt/bmc/BladeLogic/8.1/NSH/tmp/application_server/scripts/job__b42c5973-9307-4641-916f-840f1c0dffdd/master_023e3f5b-d469-4503-b5a3-8b9efffc75ea

            [14 May 2015 20:18:41,714] [WaitForProcessThread-PID-12588-307252] [INFO] [sesettyr:Application:] [NSHScript] Process finished: 12588

             

            [root@lit-vablm-p001 br]# cat appserver.log | grep 13105

            [14 May 2015 20:06:42,985] [WorkItem-Thread-15] [INFO] [sesettyr:BLAdmins:] [NSHScript] Started pid 13105: /opt/bmc/BladeLogic/8.1/NSH/bin/nsh --norc -c /opt/bmc/BladeLogic/8.1/NSH/tmp/application_server/scripts/job__3a1cd89f-2b4d-4a76-a361-663c6e419af5/master_3399d383-6a0c-4b64-ad44-74a205ac9ed6

            [14 May 2015 20:06:45,656] [WaitForProcessThread-PID-13105-304536] [INFO] [sesettyr:BLAdmins:] [NSHScript] Process finished: 13105

             

            [root@lit-vablm-p001 br]# cat appserver.log | grep 13400

            [14 May 2015 19:02:40,052] [WorkItem-Thread-7] [INFO] [sesettyr:BLAdmins:] [NSHScript] Executing command : /opt/bmc/BladeLogic/8.1/NSH/bin/nsh --norc -c /opt/bmc/BladeLogic/8.1/NSH/tmp/application_server/scripts/job__cdc7e521-b377-4827-97c4-a134006d9eb7/master_67246603-e1a1-4604-bb38-fdaff540fc3b

            [14 May 2015 19:02:40,621] [WorkItem-Thread-7] [INFO] [sesettyr:BLAdmins:] [NSHScript] Started pid 24594: /opt/bmc/BladeLogic/8.1/NSH/bin/nsh --norc -c /opt/bmc/BladeLogic/8.1/NSH/tmp/application_server/scripts/job__cdc7e521-b377-4827-97c4-a134006d9eb7/master_67246603-e1a1-4604-bb38-fdaff540fc3b

            [root@lit-vablm-p001 br]#
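One caveat with the greps above: a bare number also matches inside UUIDs. The `12588` search hit a `master_` UUID ending in `...5b2ce6c12588`, and the `13400` search only hit a job UUID containing `a134006d9eb7`, so no `Started pid 13400` line was actually found. A minimal sketch that matches only the `Started pid <PID>:` entries and pulls out the thread, user/role, and job UUID is below. It assumes the log line layout shown in this thread, and the function name `started_pid_jobs` is hypothetical.

```shell
# Print "timestamp|thread|user:role:|job-uuid" for every
# "Started pid <PID>:" entry in the given log file.
started_pid_jobs() {
    pid=$1
    log=$2
    # Accept digits only so the PID cannot alter the sed expression.
    case $pid in
        ''|*[!0-9]*)
            echo "usage: started_pid_jobs PID LOGFILE" >&2
            return 2
            ;;
    esac
    # Fields: [timestamp] [thread] [level] [user:role:] [NSHScript] Started pid N: path
    sed -n "s#^\[\([^]]*\)\] \[\([^]]*\)\] \[[^]]*\] \[\([^]]*\)\] \[NSHScript\] Started pid ${pid}: .*/job__\([0-9a-f-]*\)/.*#\1|\2|\3|\4#p" "$log"
}
```

For example, `started_pid_jobs 12588 appserver.log` would report only the 20:18:29 entry on WorkItem-Thread-22. Since PIDs get reused over time, the last line printed is the most recent start for that PID.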

            • 3. Re: Is there a way to correlate defunct nsh processes to bl jobs?
              Bill Robinson

              The ‘ef6808f8-6d2d-4ac5-93f5-f973c3170066’-style strings should be the UUID of the NSH script object or job, which is stored in the database.  You should be able to trace it back from there.
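Each script path in the log carries two UUIDs, one after `job__` and one after `master_`. The thread does not confirm which of them maps to which database object, so a small sketch that splits out both may help when searching. The function name `script_path_uuids` is made up for illustration, and the path layout is assumed to match the log lines quoted above.

```shell
# Split an NSH script path from appserver.log into its two UUIDs.
# Prints: job=<uuid> master=<uuid>  (nothing if the path does not match)
script_path_uuids() {
    printf '%s\n' "$1" |
        sed -n 's#.*/job__\([0-9a-f-]*\)/master_\([0-9a-f-]*\).*#job=\1 master=\2#p'
}
```

Either value can then be used as the search key when tracing the job back from the database or console.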