5 Replies Latest reply on Sep 2, 2010 2:11 PM by Chris NameToUpdate

    Repeatable Error in

      Here are the steps to reproduce:

      1. Successfully provision a Linux machine.  (Or start a provisioning run and have to cancel it for whatever reason...the result is the same.)
      2. Attempt to provision another machine without making any changes to the environment.


      Here's what's going on.


      Under "Configurations -> Image Files -> Skip Linux Pre-Installation" choose edit.  The "Kernel name" box is the source of the error.




      If I try to provision a machine WITHOUT CHANGING THE VALUE in this text box, the target device will literally try to boot to whatever is listed in that text box.


      In the example above, if I change the value in "Kernel_Name" from "Used_Prior" to "Error_Here", the target device will boot and provision as normal.  If I try to provision again using "Error_Here", the target device will error out saying "Could not find boot file: Error_Here".


      If I then shut the provisioning run down and edit "Error_Here" to be "Some_New_Value", the provisioning run will work as expected the next run.


      One last oddity to throw at this situation.  I booted one time using "asdfjkl" as the "Kernel Name".  On the next run, I changed "Kernel Name" to read ONLY "asdf".  The boot failed while trying to boot the "asdf" boot image.


      So the "Kernel Name" box needs to be COMPLETELY DIFFERENT from the previous run to work correctly.  If it's a substring of what was used before, it will error out.
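To make the rule stated above concrete, here is a minimal Python sketch of it. It only models the behavior observed in this post and says nothing about how the product actually decides what to boot; the function name `expected_to_fail` is made up for illustration.

```python
def expected_to_fail(previous_name, new_name):
    """Model of the observed rule: the next run fails when the new
    "Kernel Name" is identical to, or a substring of, the previous one."""
    return new_name in previous_name


# The four runs described in this post: (previous value, new value, did it fail?)
observations = [
    ("Used_Prior", "Error_Here", False),      # changed value: provisions normally
    ("Error_Here", "Error_Here", True),       # unchanged value: boot error
    ("Error_Here", "Some_New_Value", False),  # changed value: works again
    ("asdfjkl", "asdf", True),                # substring of previous value: boot error
]

for previous, new, failed in observations:
    assert expected_to_fail(previous, new) == failed
```

The substring check also covers the "unchanged value" case, since a string is always a substring of itself.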


      I'm currently investigating REV 236 but would really like to avoid patching again.  Has anyone seen this behavior before, and does anyone know how to fix it?

        • 1. Re: Repeatable Error in

          I'm a TSA for the BladeLogic Provisioning team and this thread was brought to my attention.


          The correct "Kernel Name" value for this placeholder boot image should be "dummy" by default.  It should not be changed.  Can you try setting it to "dummy", deleting all of the files in the <tftproot>/x86pc/pxelinux/pxelinux.cfg folder on the tftp server, and then booting successfully multiple times?
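For anyone who ends up repeating the cleanup step between test runs, it could be scripted roughly as follows. This is only a sketch: the function name `clear_pxelinux_cfg` is made up, the caller has to supply the real tftp root (the `<tftproot>` placeholder above), and it assumes the script runs on the tftp server with permission to delete those files.

```python
from pathlib import Path


def clear_pxelinux_cfg(tftp_root):
    """Delete the regular files directly inside
    <tftp_root>/x86pc/pxelinux/pxelinux.cfg and return their names.

    Subdirectories are left untouched. If the folder does not exist,
    nothing is deleted and an empty list is returned.
    """
    cfg_dir = Path(tftp_root) / "x86pc" / "pxelinux" / "pxelinux.cfg"
    removed = []
    if not cfg_dir.is_dir():
        return removed
    for entry in sorted(cfg_dir.iterdir()):
        if entry.is_file():
            entry.unlink()  # remove the stale per-device boot config
            removed.append(entry.name)
    return removed
```

Calling `clear_pxelinux_cfg(...)` with your own tftp root returns the list of file names it removed, which is handy for checking what was left over from the previous run.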



          If that doesn't work, create a ticket and reference this thread.

          • 2. Re: Repeatable Error in

            Hi Tim,


            This bug kind of crept up on me.  I'll provide a brief history of how I ran into it here since I'll be creating a Support ticket referencing this thread.


            I freshly patch to 212 and take a VM snapshot.  I set up a Linux Provisioning job and it works.  (Run #1 on "dummy")  Happy with my work, I take another VM snapshot and move on to testing other issues.  For one reason or another, my attention gets drawn back to Provisioning.  I attempt to provision another Linux machine.  This time, it fails for "could not boot to file: dummy".  (Run #2 on "dummy").


            I get confused because I know that this was _just working_ a short time ago and I've made no changes.  I start digging around, trying to figure out what's going on.  I've booted to that "dummy" error so many times, I just want to see something go through....  So I changed "dummy" to point to my valid OEL54/vmlinuz file.  Next time I boot, it worked!


            So, now I get into a cycle of being able to get it to work sometimes, but never back to back.  I finally narrow it down to a repeatable point and post this in the forums.


            Here are the steps I took after your reply:


            1. Delete files in pxelinux\pxelinux.cfg
            2. Change "Kernel Name" back to default of "dummy"
            3. Boot target device into PXE using Windows boot files.
            4. Provision with "Skip Linux Pre-Install"
            5. Server does the switch boot step and errors on "Could not boot to file: dummy"
            6. Power down target device and cancel provisioning job in PM.
            7. Set target device properties back to Windows boot files.
            8. Change "Kernel Name" from "dummy" to "random_string_of_characters"
            9. Delete files in pxelinux\pxelinux.cfg
            10. Boot target device into PXE using Windows boot files.
            11. Provision with "Skip Linux Pre-Install"
            12. Success


            I say "Success" but, to be clear, my provisioning job gets stuck at step 15/16.  What happens is that OEL installs on the target device and then sits at a prompt.  I can log in and then run "startx" and the GUI pops up as it should.  I'm not sure whether that's a bug, as I've not been able to focus on testing it yet.


            Just to be sure, I fired up a new VM after step 12 and tried to provision it.  It failed after switch-boot with "Could not boot to file: random_string_of_characters".

            • 3. Re: Repeatable Error in

              Patching to 236 seems to have made it worse.  I now cannot get any provisioning run to go through even with the tricks I was using before.  The failure is the same. "Could not boot to file: dummy"


              I'm going to reset my VMs to out-of-box 115 and patch to 236 directly as a test while I wait to hear back on the Support ticket.  One question I have about the 236 patch: if the file "w2k8_r2_inserts_Oracle.sql" is required for REV 212, why is that file NOT included with the REV 236 .ZIP?

              • 4. Re: Repeatable Error in
                Reinhard Vielhaber

                "why is that file NOT included with the REV 236 .ZIP?"


                Did you ever get an answer to this?

                • 5. Re: Repeatable Error in

                  No, I did not.  The situation is now resolved, of course.  We ended up staying at 212 due to this question among other issues.


                  I think the final answer to the issue this thread was originally about had something to do with the version of squash in the supplied provisioning_files.zip not matching the OS that I was using to build the boot files.  I downgraded my build OS and the proper boot files were generated.