19 Replies Latest reply on Oct 2, 2012 2:44 AM by Scott Dunbar

    File Deploy Job Exit code 13

    Iain Taylor

      Hi, I am putting together a number of file deploy jobs for upgrading RSCD agents to the latest agent version, 8.2.2, after we upgraded from 8.1 SP3.

      The file deploy job comes back with a number of different errors even though the Agent is successfully upgraded.

       

      The main one being: //"servername"/c/:I/O error

      which results in an Exit Code 13 error.

       

      I also get the same with an exit code 1 error on other servers.

       

      In the Post Command I have the following: msiexec /i c:\tmp\blade\RSCD82-SP2-WIN64.msi /qn /l* c:\logs\rscd_update.log REBOOT=ReallySuppress

       

      I am guessing the cause is that the agent installation starts, and therefore stops the running agent, before the file deploy job has completed, which produces the I/O error.

       

      Would it be worth putting a WAIT 30 into the Post Command for this error? There are over 3000 servers that we need to upgrade, and resolving this would help with error checking and avoid having to check for false negative results.

       

      Many thanks

        • 1. Re: File Deploy Job Exit code 13

          Iain,

           

          We've seen a variety of exit codes since we switched to the MSI installers, and in most cases the installer completed successfully, just as in your case. The exit code can depend on the state of the system, the presence of locked files, previous agent installation history, upgrade paths, etc. Development is aware of this and is working on stabilizing the MSI installer.

           

          With regard to the wait 30, I personally do not know, but if you are able to test it, please do and share your results.

           

          Lazar

          • 2. Re: File Deploy Job Exit code 13
            Steffen Kreis

            Hi,

             

            the Exit Code 13 looks like an old friend to me :-(

             

            We've seen this in our NSH scripts running against the target, and it happens every time the App-Server loses its connection to the agent.

            (We have an NSH script that runs specific Hardware-Tool jobs, where one of those jobs installs NIC drivers and therefore the connection drops.)

             

            Within the NSH scripts we could fix it by adding a "disconnect" command to the script, which closes unused network connections.

             

            Unfortunately this doesn't help with a file deploy job, but as you are facing the same error in a similar situation (connection loss because of the agent upgrade), this could be related.

             

            Bill Robinson explained the "Exit Code 13" like this:

             

            ------------------------------------------

            Essentially, the exit code 13 means “broken pipe.” In cases when the NSH script exits normally with “exit code 0”, but the script job exits with “exit code 13”, it appears that the target server encounters an error closing down a remote connection during the cleanup from exiting the script to returning the job result to the appserver. This error code is then returned as the job result.

             

            Adding the disconnect statement to the script at the right point causes the target server to close down remote connections before exiting the script. At this point even if exit code 13 is encountered there is nothing to consume that error. The script then runs to successful completion, and exit code 0 is always returned both as the script result and the job result.

             

            The exit code 13 is essentially benign. It indicates that the target server encountered a broken pipe while it was closing a remote connection after completing the NSH script being run by the script job. This exit code is returned as the job result, but it essentially is meaningless.  The disconnect statement is a good way to suppress this error.

            ------------------------------------------------------

             

            Again, not the solution for a file deploy job, but probably related as a File-Deploy job is some kind of NSH thingy as well.

             

            Cheers

            Steffen

            • 3. Re: File Deploy Job Exit code 13

              Hi

               

              I am being plagued with this issue at the moment.  We are deploying BPM v9.0, and for the base agent install, 90% of jobs fail with this error.

               

              Our script already includes the cd //@;disconnect command

               

              Deployment of BPM patches, BPA and KMs does not generate the error code.  What else can I diagnose?

               

               

              Info    Sep 25, 2012 7:33:46 PM    uid=502(bladmin) gid=502(bladmin) groups=501(bbsa),502(bladmin)

              Info    Sep 25, 2012 7:33:59 PM    cBQelgsKACD7BQG2vsjRvLTv6iq3nFTI

              Info    Sep 25, 2012 7:41:11 PM    The installation completed successfully.

              Info    Sep 25, 2012 7:42:13 PM    0

              Info    Sep 25, 2012 7:42:13 PM    Final check for Agent status

              Info    Sep 25, 2012 7:42:14 PM    SERVICE_NAME: PatrolAgent

              Info    Sep 25, 2012 7:42:14 PM    TYPE               : 10  WIN32_OWN_PROCESS

              Info    Sep 25, 2012 7:42:14 PM    STATE              : 4  RUNNING

              Info    Sep 25, 2012 7:42:14 PM    (STOPPABLE, NOT_PAUSABLE, ACCEPTS_SHUTDOWN)

              Info    Sep 25, 2012 7:42:14 PM    WIN32_EXIT_CODE    : 0  (0x0)

              Info    Sep 25, 2012 7:42:14 PM    SERVICE_EXIT_CODE  : 0  (0x0)

              Info    Sep 25, 2012 7:42:14 PM    CHECKPOINT         : 0x0

              Info    Sep 25, 2012 7:42:14 PM    WAIT_HINT          : 0x0

              Info    Sep 25, 2012 7:42:15 PM    STATE              : 4  RUNNING

              Info    Sep 25, 2012 7:42:15 PM    Patrol Agent install complete

              Info    Sep 25, 2012 7:42:15 PM    Cleaning up InstallTemp

              Info    Sep 25, 2012 7:42:15 PM    Exit Code 13

               

               

              ... script ends

              echo "All done...exiting"

              cd //@;disconnect

               

              exit 0

              • 4. Re: File Deploy Job Exit code 13
                Steffen Kreis

                Hm...

                 

                in our situation, our script was just running the correct HW-Tools jobs (HP, Dell, or VM), which it executed via blcli.

                 

                So as it did not actually do anything ON the target server, we put the cd //@; disconnect at the top of the NSH script.

                 

                Does your script actually execute stuff on the target, or is it also just "controlling" stuff centrally?

                What I'm trying to say is, I think you need to run the disconnect before anything drops the connection to your target.

                 

                Bit crazy that so many people face this issue now and there seems to be no real solution !

                 

                Cheers

                Steffen

                • 5. Re: File Deploy Job Exit code 13
                  Bill Robinson

                  Scott - are you seeing exit code 13 in the FDJ ?

                   

                  as noted here:

                  https://communities.bmc.com/communities/message/246162#246162

                   

                  Essentially, the exit code 13 means “broken pipe.” In cases when the NSH script exits normally with “exit code 0”, but the script job exits with “exit code 13”, it appears that the target server encounters an error closing down a remote connection during the cleanup from exiting the script to returning the job result to the appserver. This error code is then returned as the job result.

                   

                  Adding the disconnect statement to the script at the right point causes the target server to close down remote connections before exiting the script. At this point even if exit code 13 is encountered there is nothing to consume that error. The script then runs to successful completion, and exit code 0 is always returned both as the script result and the job result.

                   

                  The exit code 13 is essentially benign. It indicates that the target server encountered a broken pipe while it was closing a remote connection after completing the NSH script being run by the script job. This exit code is returned as the job result, but it essentially is meaningless. The disconnect statement is a good way to suppress this error.

                   

                  specifically in the context of an agent upgrade job i'd expect to see this as the connection to the agent drops when the agent upgrade process starts.  adding a sleep won't really do any good - the sleep would keep the connection open - you could maybe do something like:

                   

                  start sleep 30 & msiexec ...

                   

                  to background the msiexec so the job exits w/ a success before the upgrade process starts and drops the agent connection - though i don't know if 'sleep' is installed in windows by default.

                   

                  steffen - i'm not sure what you mean by not having a solution?  is the 'cd //@;disconnect' not working?  i believe the explanation above describes what is going on, why you see the error and how to resolve it.

                  • 6. Re: File Deploy Job Exit code 13

                    All of the scripts are using nexec to call installers on the targets.  Only the BPM base agent installer has this issue.

                     

                    EDIT to say there should be no reason why the comms to the agent fail at this stage.  We are installing Patrol (BPPM).

                    EDIT to say these are not FDJ they are NSH scripts that run after a BLP Deploy Job

                    • 7. Re: File Deploy Job Exit code 13
                      Bill Robinson

                      What scripts?  I thought you were using file deploy jobs ?

                      • 8. Re: File Deploy Job Exit code 13
                        Iain Taylor

                        Thank you Bill, I will ask the customer if they have any more servers we can test the "sleep" command on.

                        If sleep is not installed, we will have to create a BLPackage that packages sleep together with the RSCD agent, and put the command in the post commands.

                        I think they are willing to give anything a try, as they have over 3k agents to upgrade and the overhead of checking false negatives isn't one that they wish to take on.

                        • 9. Re: File Deploy Job Exit code 13

                          hey bill

                           

                          see my edits above

                          these are not FDJ they are NSH scripts that run after a BLP Deploy Job

                          • 10. Re: File Deploy Job Exit code 13
                            Bill Robinson

                            Ah.  So you need to add the ‘cd //@;disconnect’ at the end of the script…  try that.  by end I mean before any script exit points.
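One way to make sure the disconnect runs before every exit point is to route all exits through a single helper. This is only a sketch: `finish` is a made-up helper name, and the `command -v` guard is an assumption added so that the snippet also runs outside NSH, where `disconnect` does not exist.

```shell
# Hypothetical helper (not part of NSH): close remote connections, then exit.
finish() {
    rc=$1
    # 'disconnect' is only expected to exist inside NSH; skip it elsewhere (assumption).
    if command -v disconnect >/dev/null 2>&1; then
        cd //@; disconnect
    fi
    exit "$rc"
}

# Every exit point in the script then calls finish instead of exit, e.g.:
#   some_step || finish 1
#   echo "All done...exiting"
#   finish 0
```

With this in place, no code path can leave the script without going through the disconnect first.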

                            • 11. Re: File Deploy Job Exit code 13
                              Bill Robinson

                              I typically just ignore the FDJ and use something like:

                               

                              Batch job of

                              FDJ targeted at a smart group job of RSCD_VERSION != x.x.xx.xxx

                              USP job targeted at the same group

                               

                              Just keep re-running that and ignore the fdj output ☺  someday we will get the UAI to handle upgrades…

                              • 12. Re: File Deploy Job Exit code 13

                                Hey bill

                                I am using cd //@;disconnect at the end of the script, before any exit points.

                                Starting to think it's a bug in the Windows BPA 9 agent install.

                                • 13. Re: File Deploy Job Exit code 13
                                  Bill Robinson

                                  Well, that’s a good question – is the exit coming from the nexec?  Can you get the exit code from the installer run or installer log ?
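A minimal sketch of capturing the installer's own exit code, separately from the exit code the job reports. `run_and_report` is a hypothetical helper name; the command passed to it is whatever the script already uses to start the installer (for example, the existing nexec call).

```shell
# Hypothetical helper: run a command, log its exit code, and hand it back.
run_and_report() {
    "$@"
    rc=$?
    echo "installer exit code: $rc"
    return "$rc"
}

# Usage (placeholder, not a real command line):
#   run_and_report <your existing installer command>
```

If the logged value is 0 while the job still ends with Exit Code 13, that would suggest the 13 comes from the connection teardown rather than from the installer itself.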

                                  • 14. Re: File Deploy Job Exit code 13
                                    Iain Taylor

                                    Hi Bill, unfortunately I don't know if that is going to be an option. I had already put a smart group together where RSCD Version does not equal 8.2.2.321 and the agent is alive, but they deem this a "workaround".

                                     

                                    They want something that is going to show the correct results, i.e. correctly successful or failed. At the moment, 80% of the time the FDJ shows as failed because of the "broken pipe" even though the job is actually successful, and they do not want to have to go through each server that has had a failure to check for false negatives.
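One possible way to get a result that reflects the real outcome is to judge each server by the agent version it reports after the job, rather than by the FDJ exit code alone. The sketch below assumes the job's exit code and the version reported by the agent have already been collected by some other means; `classify_upgrade` and its arguments are hypothetical names, and 8.2.2.321 is the target version mentioned above.

```shell
# Hypothetical helper: decide success or failure from the reported agent version.
classify_upgrade() {
    job_rc=$1      # exit code reported by the job (e.g. 0, 1 or 13)
    reported=$2    # version the agent reports afterwards (empty if unreachable)
    target=$3      # version the upgrade should have installed
    if [ "$reported" = "$target" ]; then
        echo "success (job exit code $job_rc ignored)"
    else
        echo "failed (agent reports '${reported:-nothing}', job exit code $job_rc)"
    fi
}

classify_upgrade 13 8.2.2.321 8.2.2.321   # broken pipe, but the agent is on the target version
classify_upgrade 1 "" 8.2.2.321           # agent did not report a version
```

This keeps the "broken pipe" cases out of the failure list while still flagging servers whose agent never reached the target version.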
