14 Replies Latest reply on May 29, 2015 1:05 PM by richard mcleod

    Windows 2012 Teaming job hangs - BSA

      BSA 8.2 SP4

       

      When a post-command runs a cmd file that configures NIC teaming and sets the team's IP address, the job hangs. The cmd file is built with the following commands:

       

      echo Powershell "New-NetLbfoTeam -Name 'Management LAN' -TeamMembers 'NIC1 Mgmt' -TeamingMode SwitchIndependent -Confirm:$FALSE" > C:/teaming.cmd

      echo Powershell "start-sleep 30" >> C:/teaming.cmd

      echo Powershell "get-netadapter 'Management LAN' | new-netipaddress -ipaddress 'xxx.xxx.xxx.xxx' -addressfamily ipv4 -prefixlength 26" >> C:/teaming.cmd

      echo netsh interface ip add address "Management LAN" gateway=xxx.xxx.xxx.xxx gwmetric=1 >> C:/teaming.cmd

      exit >> C:/teaming.cmd

       

      The teaming.cmd does configure the Team correctly but the job doesn't finish.

       

      The rscd.log is showing the following:

       

      dadfb282dd9d38d4a9d0 0000001032 01/24/14 10:58:21.147 INFO1    rscd -  10.120.248.36 4296 BladeLogicRSCD@TestServer->Administrator@TestServer:PrivilegeMapped (BLAdmins:BLAdmin): CM: > [Deploy] Job 'setTeamingWindows2012' is executing a dry run

      bdc31aeb1a65b116904b 0000001033 01/24/14 10:58:39.422 INFO1    rscd -  10.120.248.36 5004 BladeLogicRSCD@TestServer->Administrator@TestServer:PrivilegeMapped (BLAdmins:BLAdmin): CM: > [Deploy] Deleting //TestServer/C/Program Files/BMC Software/BladeLogic/RSCD//Transactions/log/tmp/bldeploy-6e6d649fb39032409b4919e78d049633.log

      3f9e52f774fea4f10fb5 0000001034 01/24/14 10:58:41.896 INFO1    rscd -  10.120.248.36 2780 BladeLogicRSCD@TestServer->Administrator@TestServer:PrivilegeMapped (BLAdmins:BLAdmin): CM: > [Deploy] Retrieving the root filesystem

      0227d13ad600dd2addb6 0000001035 01/24/14 10:58:41.974 INFO1    rscd -  10.120.248.36 2780 BladeLogicRSCD@TestServer->Administrator@TestServer:PrivilegeMapped (BLAdmins:BLAdmin): CM: > [Deploy] Copying '//bmcfile/D/storage/blpackages/765c2292-a492-4df2-9fa6-f2dfcf07c504' to '//TestServer/tmp/stage/33d704fed3cb363e86dacfd227630fa7'

      123311ae22934b7270fe 0000001036 01/24/14 10:58:44.664 INFO1    rscd -  10.120.248.36 4424 BladeLogicRSCD@TestServer->Administrator@TestServer:PrivilegeMapped (BLAdmins:BLAdmin): CM: > [Deploy] Job 'setTeamingWindows2012' is applying

      92b635e1e7ca39d5cc61 0000001037 01/24/14 10:59:05.895 INFO1    rscd -  10.120.248.36 3816 BladeLogicRSCD@TestServer->Administrator@TestServer:PrivilegeMapped (BLAdmins:BLAdmin): CM: > [Deploy] Deleting //TestServer/C/Program Files/BMC Software/BladeLogic/RSCD//Transactions/log/tmp/bldeploy-6e6d649fb39032409b4919e78d049633.log

      e55c94991de2a4adecab 0000001038 01/24/14 10:59:05.973 INFO1    rscd -  10.120.248.36 3816 BladeLogicRSCD@TestServer->Administrator@TestServer:PrivilegeMapped (BLAdmins:BLAdmin): CM: > [Deploy] Job 'setTeamingWindows2012' is executing a post-command of 'start C:/teaming.cmd exit'

      22d7a06e086af91bdddf 0000001039 01/24/14 10:59:09.698 INFO     rscd -  10.120.248.36 3816 BladeLogicRSCD@TestServer->Administrator@TestServer:PrivilegeMapped (BLAdmins:BLAdmin): CM: Nexec: Connection got reset errorno:10054

      594728539d1a5f36c2fc 0000001040 01/24/14 10:59:10.214 INFO1    rscd -  10.120.248.36 3816 BladeLogicRSCD@TestServer->Administrator@TestServer:PrivilegeMapped (BLAdmins:BLAdmin): CM: Connection aborted by the peer process

      44d700049d3b9029b3e3 0000001041 01/24/14 10:59:10.214 ERROR    rscd -  TestServer 3816 SYSTEM (Not_available): (Not_available): SSL error : .\ssl\s3_pkt.c:740 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry

      2ce85a04a65e78390950 0000001042 01/24/14 10:59:10.229 ERROR    rscd -  TestServer 3816 SYSTEM (Not_available): (Not_available):

       

      Any suggestions?
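For anyone trying to spot the same pattern in their own agent log, here is a rough Python sketch (not a BMC tool, purely illustrative) that scans rscd.log lines for the three messages visible in the excerpt above and reports which post-command was running when the connection dropped. The marker strings and the timestamp layout are copied from the excerpt. The function name and the returned tuple layout are my own assumptions.

```python
import re

# Substrings copied from the rscd.log excerpt in this thread.
DROP_MARKERS = (
    "Connection got reset",
    "Connection aborted by the peer process",
    "SSL3_WRITE_PENDING",
)

# Timestamp layout as it appears in the excerpt: MM/DD/YY HH:MM:SS.mmm
TIMESTAMP = re.compile(r"\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}")

# The agent logs the post-command text between single quotes.
POST_COMMAND = re.compile(r"is executing a post-command of '(.*)'")


def find_connection_drops(lines):
    """Return (timestamp, marker, last_post_command) for each suspicious line.

    last_post_command is the most recent post-command seen before the hit,
    or None if no post-command line preceded it.
    """
    last_post_command = None
    hits = []
    for line in lines:
        match = POST_COMMAND.search(line)
        if match:
            last_post_command = match.group(1)
        marker = next((m for m in DROP_MARKERS if m in line), None)
        if marker is not None:
            stamp = TIMESTAMP.search(line)
            hits.append((stamp.group(0) if stamp else None, marker, last_post_command))
    return hits
```

Usage would be something like `find_connection_drops(open(path_to_rscd_log, errors="replace"))`, where `path_to_rscd_log` is wherever your agent writes its log.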

        • 1. Re: Windows 2012 Teaming job hangs - BSA
          Steffen Kreis

          Hi Kate,

           

          we have gone through a lot of problems with Teaming Jobs ourselves.

           

          What we basically see is that the RSCD agent kills all related processes and scripts when the server loses its network connections. You can observe this with Process Explorer, for example.

           

          As a result, bldeploy.exe, for example, never ends the Deploy Job properly, and the job then hangs forever in BSA, or times out if you have set a timeout.

          You should see that the corresponding bldeploy.log under RSCD_INSTALL_DIR\Transactions\logs\bldeployxxxxxx.log just ends in the middle of the execution.

           

          We have reported that to BMC and have been told that this behavior will be fixed in the upcoming 8.3 SP3 release.

           

          As a workaround, you can try launching your PowerShell script with start.

          So, as an External Command: "start C:/teaming.cmd"

           

          This will launch the script outside the context of the dying bldeploy.exe.

          The job should finish successfully, but because the script is launched outside that context, you can't tell from the job status whether your script worked or failed.

           

          To work around this, we run a second job after the teaming job that checks whether the teaming has been put in place correctly.

           

          Steffen
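A minimal sketch of what such a follow-up check could look like, written in Python only for illustration (the check described above was not posted, and a real second job would more likely be an NSH or PowerShell script). It assumes that `Get-NetLbfoTeam`, the counterpart of the `New-NetLbfoTeam` cmdlet used in the original post, exposes a `Status` property that reads `Up` for a healthy team. Please verify that on your own build. The function names and the injectable `run` parameter are hypothetical.

```python
import subprocess


def team_status(team_name, run=subprocess.run):
    """Return the team's Status text, or None if it cannot be read.

    Assumption: (Get-NetLbfoTeam -Name ...).Status prints a single word.
    team_name must not contain a single quote, since it is placed inside
    a single-quoted PowerShell string without escaping.
    """
    command = [
        "powershell", "-NoProfile", "-Command",
        "(Get-NetLbfoTeam -Name '{}').Status".format(team_name),
    ]
    result = run(command, capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


def team_is_up(team_name, run=subprocess.run):
    """True only when the reported status is exactly 'Up' (assumed value)."""
    return team_status(team_name, run) == "Up"
```

The second job could then exit non-zero when `team_is_up("Management LAN")` is false, so the failure shows up in the job status even though the teaming script itself ran detached.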

          • 2. Re: Windows 2012 Teaming job hangs - BSA

            Hi Steffen,

             

            I have tried this but unfortunately even though the teaming is successful the job fails...

             

            Any other suggestions?

             

            Kate  

            • 3. Re: Windows 2012 Teaming job hangs - BSA
              Bill Robinson

              So there was a fix in a recent version of the agent where, if the appserver -> agent connection is terminated, any child processes are killed. This is needed when we have canceled jobs that cannot cleanly terminate whatever they are running on the target. But this should not have been implemented for bldeploy, because with a bldeploy you can have a package-initiated reboot or network cut. So I believe we created a hotfixed agent with the fix, as well as getting the fix into 8.3.03. You probably also want to make sure to set the 'out of band reboot' option on the item in the blpackage that will cut the network.

              • 4. Re: Windows 2012 Teaming job hangs - BSA

                Once the second NIC is added the Job comes back as successful...

                • 5. Re: Windows 2012 Teaming job hangs - BSA
                  Jim Campbell

                  The modified agent worked for us.  We encountered the problem that Steffen described for teaming, though in our case it occurred because our script looked like :

                   

                  1) Team NICs

                  2) Set MAC address for the team

                   

                  Because the network connection would briefly cut out during the team creation, the process would be terminated using the standard agent (we tried both 8.2sp4 and 8.3sp1).  Thus, the second step would never actually execute.  The modified version of the agent we were provided by support (8.3sp1) does not have this issue.  I believe you have to request it through support and there may not be a version available for 8.2sp4.

                   

                  This problem occurred in a number of places: anything that terminated the network connection (network adapter driver updates being a prime example) would cause the process to abort and leave the result of the job in an indeterminate state (often with network adapters disabled, because the process was aborted mid-driver update). We also went with the solution that Steffen mentioned (use 'start' in the external command), but this requires some ugly hacks to ensure that the job completes before moving on to the next job in the batch, if this is part of a post-provisioning batch job.
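One way such a hack might look, sketched in Python purely as an illustration (the actual hacks were not posted): the detached script writes a marker file as its last step, and a later step polls for that file with a timeout before the batch moves on. The marker file convention, the function name, and the default timings are all assumptions.

```python
import os
import time


def wait_for_marker(path, timeout_s=300.0, interval_s=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until `path` exists or `timeout_s` seconds have elapsed.

    Returns True if the marker appeared, False on timeout. The marker is
    assumed to be written by the detached script as its final action.
    """
    deadline = clock() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The marker should be deleted before each run, otherwise a stale file from a previous deployment would make the wait return immediately.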

                  • 6. Re: Windows 2012 Teaming job hangs - BSA
                    Steffen Kreis

                    Hi Bill,

                     

                    we tested the behaviour of the 8.3 SP3 agent on a 2012 R2 box again today.

                    Unfortunately it still doesn't work as expected.

                     

                    We have a BLpackage that includes a PowerShell script that:

                         - Disables all Network Adapters

                         - Sleeps for 10 seconds

                         - Activates the Adapters again.

                    The External-Command item in the BLpackage is set to Out-of-Band Reboot and the Job is set to Use-Item-Defined Reboot setting.

                     

                    When monitoring the deployment on the target via Process Explorer, we can see that at the moment the PS script disables the adapters, all child processes under RSCD.exe are terminated (incl. bldeploy.exe), and because of that the network never comes back either.

                     

                    We will re-open the ticket for this.

                    Can you confirm that the fix indeed made it into SP3?

                     

                    Looking here https://docs.bmc.com/docs/display/public/bsa83/Known+and+corrected+issues+in+Agent+and+Network+Shell

                     

                     

                    Steffen

                    • 7. Re: Windows 2012 Teaming job hangs - BSA
                      Bill Robinson

                      It seems there is an additional fix that needs to be added.  Working on a hotfix.

                      • 8. Re: Windows 2012 Teaming job hangs - BSA
                        richard mcleod

                        We're seeing a similar error when installing VMware Tools:

                         

                        05/28/15 17:10:03.090 ERROR    rscd -  SERVER02 3528 SYSTEM (Not_available): (Not_available): SSL error : .\ssl\s3_pkt.c:781 error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry

                        05/28/15 17:10:03.106 ERROR    rscd -  SERVER02 3528 SYSTEM (Not_available): (Not_available): SSL_write

                         

                        Agent is 8.5.01.260

                        • 9. Re: Windows 2012 Teaming job hangs - BSA
                          Bill Robinson

                          Did you specify the 'out of band reboot' option for the item in the blpackage? VMware Tools drops the NIC. To BSA that looks like an out-of-band reboot.

                          • 10. Re: Windows 2012 Teaming job hangs - BSA
                            richard mcleod

                            This is done via an NSH script, where we copy the executable and then run nexec to install it.

                            • 11. Re: Windows 2012 Teaming job hangs - BSA
                              Bill Robinson

                              Why? Why not use a blpackage?

                              • 12. Re: Windows 2012 Teaming job hangs - BSA
                                richard mcleod

                                I can only assume this is not a blpackage because they didn't want to have to continually upload a new binary to Blade, whereas with NSH they can just blindly copy from a "current" directory NSH path.

                                • 13. Re: Windows 2012 Teaming job hangs - BSA
                                  Bill Robinson

                                  You can do that w/ depot software (nsh_copy_at_staging url type) and soft-link the depot software into the blpackage.

                                  • 14. Re: Windows 2012 Teaming job hangs - BSA
                                    richard mcleod

                                    Thanks, I will see if I can get a sustainable solution going using this method.