5 Replies Latest reply on Apr 1, 2011 3:08 PM by Jim Campbell

    Blade and Name resolution

    Jim Campbell

      We've been getting some odd behaviour that may be related to DNS.  Our appservers are Windows and have a DNS suffix search order of x.y.z.com followed by z.com.  Some of our rebuilds have required moving servers between networks, and when this happens the names usually don't match: server.x.y.z.com resolves to the new IP, while server.z.com still resolves to the old IP.  This has caused two problems.  First, during provisioning the agent sometimes (not always) fails to enroll at the end of the process; eventually we're able to run a verify on the server object and move on.  Second, our QA job fails, even long after any cached DNS names (cached by Windows at least) would have expired.  The same error occurs when live browsing the Hardware Information node for affected servers:


      Error Mar 26, 2011 7:49:35 PM com.bladelogic.app.collector.AssetCollectionException: com.bladelogic.app.service.agentservice.AgentConnectionException: com.bladelogic.app.remote.BlRemoteException: org.apache.xmlrpc.XmlRpcException: Failed to connect to SERVER:4750(component=QA_TEMPLATE (SERVER), selector=LogicalStorageDevices:/Hardware/StorageDevices/LogicalStorageDevices)


      Jobs run against SERVER, I can live browse other nodes, and extended objects run, but when I try to access the Hardware Information node I get essentially the same error.  I ran a network trace while attempting to live browse the Hardware Information node and saw that traffic was being sent to the old IP, i.e. the one to which server.z.com resolves.  Running nslookup, ping, etc. on the appserver all return the IP for server.x.y.z.com.


      Does Blade have any sort of separate name caching?  Or is there any reason it would resolve names differently than Windows?  All of our server objects use a single, unqualified host name.  The server object has the correct FQDN in its fq_host property, and its ip_address property is also correct (i.e. the new IP).  Is there any way Blade could be using an old IP from the previous server object of the same name?  (The previous server object was decommissioned prior to reprovisioning the server.)
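
One way to compare what a JVM sees with what nslookup and ping report is to resolve the name from Java directly. The sketch below is a minimal standalone check and is not part of BladeLogic; the class name `JvmResolveCheck` and the default host are placeholders for illustration. It uses only the standard `java.net.InetAddress` API. A freshly started JVM has an empty lookup cache, so this shows how Java resolves the name right now, not what a long-running appserver may already have cached.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class JvmResolveCheck {

    /** Returns every address the JVM resolves for a host name or IP literal. */
    static List<String> resolve(String host) throws UnknownHostException {
        List<String> addresses = new ArrayList<>();
        for (InetAddress address : InetAddress.getAllByName(host)) {
            addresses.add(address.getHostAddress());
        }
        return addresses;
    }

    public static void main(String[] args) {
        // Placeholder default; pass the unqualified and qualified names as arguments.
        String[] hosts = args.length > 0 ? args : new String[] {"localhost"};
        for (String host : hosts) {
            try {
                System.out.println(host + " -> " + resolve(host));
            } catch (UnknownHostException e) {
                System.out.println(host + " -> could not be resolved");
            }
        }
    }
}
```

Running it with the unqualified name and both qualified names as arguments would show whether Java on the appserver picks the same suffix as the Windows tools do.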

        • 1. Re: Blade and Name resolution
          Gerardo Bartoccini


          I have experienced something like what you described.

          As I understand it, some sort of IP caching happens, so if you change the IP of one of your servers, its name may still resolve to the old IP.

          An appserver restart will solve the issue, although I understand it's not very elegant.

          • 2. Re: Blade and Name resolution
            Bill Robinson

            There's a file in the Java install in the BladeLogic directory; on Unix this is br/java/lib/security/java.security, IIRC.


            Look at this: 



            and then make the appropriate modifications to the java.security file and restart your appserver.  This should deal with the DNS caching issues.

            • 3. Re: Blade and Name resolution
              Jim Campbell

              Well, this definitely looks like the problem.  However, I'm confused as to which setting is the culprit:





              Other posts seemed to indicate the latter, but in our case I think we're caching a successful result.  The comments in the file say:


              # The Java-level namelookup cache policy for successful lookups:

              # any negative value: caching forever
              # any positive value: the number of seconds to cache an address for
              # zero: do not cache

              # default value is forever (FOREVER). For security reasons, this
              # caching is made forever when a security manager is set. When a security
              # manager is not set, the default behavior is to cache for 30 seconds.

              # NOTE: setting this to anything other than the default value can have
              #       serious security implications. Do not set it unless
              #       you are sure you are not exposed to DNS spoofing attack.


              Does this really mean that once Blade has cached a name/IP resolution, it keeps that mapping indefinitely?  So once I resolve the name of a managed server, it's in the cache permanently until I restart the appserver?  How does this work with multiple appservers?  Does each one maintain its own separate, permanent cache?  How would we know if a 'security manager is set'?  If I understand correctly, the 'security implications' are no different than those the standard Windows DNS client faces at all times anyway.
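
Some of these questions can be checked from code. The following is a minimal sketch, assuming the settings under discussion are the standard JDK security properties `networkaddress.cache.ttl` and `networkaddress.cache.negative.ttl` (the thread does not name them explicitly); the class and method names are made up for illustration. It reports the configured values and whether a security manager is installed, and interprets the successful-lookup value using the rules in the file comments quoted above. The lookup cache lives inside the JVM process, so each appserver would hold its own.

```java
import java.security.Security;

public class DnsCachePolicyCheck {

    /**
     * Interprets a successful-lookup TTL value using the rules quoted from
     * java.security: negative = forever, zero = no caching, positive = seconds.
     * An unset value falls back to the default described in those comments.
     */
    static String describeTtl(String rawValue, boolean securityManagerSet) {
        if (rawValue == null || rawValue.trim().isEmpty()) {
            return securityManagerSet
                    ? "unset: cache forever (security manager is set)"
                    : "unset: default without a security manager (30 seconds per the file comments)";
        }
        int seconds;
        try {
            seconds = Integer.parseInt(rawValue.trim());
        } catch (NumberFormatException e) {
            return "not a number: " + rawValue;
        }
        if (seconds < 0) {
            return "cache forever";
        }
        if (seconds == 0) {
            return "do not cache";
        }
        return "cache for " + seconds + " seconds";
    }

    @SuppressWarnings("removal")
    public static void main(String[] args) {
        // System.getSecurityManager() is deprecated in recent JDKs but still present;
        // it returns null when no security manager is installed.
        boolean securityManagerSet = System.getSecurityManager() != null;

        // Security.getProperty returns null when the property is commented out or absent.
        String ttl = Security.getProperty("networkaddress.cache.ttl");
        String negativeTtl = Security.getProperty("networkaddress.cache.negative.ttl");

        System.out.println("security manager set: " + securityManagerSet);
        System.out.println("networkaddress.cache.ttl = " + ttl
                + " (" + describeTtl(ttl, securityManagerSet) + ")");
        System.out.println("networkaddress.cache.negative.ttl = " + negativeTtl);
    }
}
```

Run with the appserver's own Java binary, this should read the same java.security file, assuming the appserver does not override these properties at startup. It only reports what a standalone JVM would use; whether the appserver itself installs a security manager would have to be checked from inside that process.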

              • 4. Re: Blade and Name resolution
                Bill Robinson

                I think the caching change will fix the 2nd problem.


                For the 1st one, I'm not sure I understand what's happening.  When during the provisioning process is the server moving from one domain to the other?  What domain should the server resolve in?  And what domain is the server in when the agent is enrolled?

                • 5. Re: Blade and Name resolution
                  Jim Campbell

                  The servers should resolve in both domains.  The problem occurs when we rebuild a server that was on one network and it has to be moved to another network so that it can reach our PXE server.  The z.com domain is (I believe) BIND and the x.y.z.com domain is a Windows DNS domain.  During the provisioning process, the server successfully registers its DNS name in the Windows x.y.z.com domain but does not change its registration in the z.com domain.


                  I think our problems stem from the Java caching, however.  Everything appears to be correctly using the x.y.z.com domain as its first suffix.