23 Replies Latest reply on Feb 29, 2012 2:38 PM by Jeff Claunch

    Out of java heap space (java.lang.OutOfMemoryError)

    S Crawford

      We are currently running on BladeLogic 7.4.5 (on RHEL).


      Some of our jobs cannot run because we are out of heap space. This is the error we receive in the 'Refresh Server Properties' job:


      java.lang.OutOfMemoryError: Java heap space


      What do I need to change to add more heap space? Is it just as simple as changing the blappserv file? This is what it currently has set for heap size (see attached file also):


      $JAVA_HOME/bin/java -Xss1m -Xmx1024M


      Is it recommended to go over 1024MB for the App Server JVM instance? Our RHEL server has 8 GB total available. Here are the BladeLogic java processes running on that box:


      +bladmin 3519 3516 0 Feb16 ? 00:00:01 /usr/nsh/br/java/bin/java -Xss1m -Xmx1024M -Djava.security.egd=file:/dev/../dev/urandom -Djava.io.tmpdir=/usr/nsh/tmp -Dblx.rootdir=/usr/nsh -Dblx.cmrootdir=/usr/nsh/br -classpath /usr/nsh/br:/usr/nsh/br/bllocale.jar:/usr/nsh/br/bladelogic.jar:/usr/nsh/br/java/jre/lib/rt.jar:/usr/nsh/br/stdlib:/usr/nsh/br/stdlib/activation.jar:/usr/nsh/br/stdlib/bcmail-jdk15-136.jar:/usr/nsh/br/stdlib/bcprov-jdk15-136.jar:/usr/nsh/br/stdlib/binding-1.0.jar:/usr/nsh/br/stdlib/commons-codec-1.3.jar:/usr/nsh/br/stdlib/commons-dbcp-1.2.2.jar:/usr/nsh/br/stdlib/commons-httpclient-3.0.1.jar:/usr/nsh/br/stdlib/commons-logging-1.1.jar:/usr/nsh/br/stdlib/commons-pool-1.3.jar:/usr/nsh/br/stdlib/forms-1.0.5.jar:/usr/nsh/br/stdlib/http.jar:/usr/nsh/br/stdlib/jaxp.jar:/usr/nsh/br/stdlib/jdom.jar:/usr/nsh/br/stdlib/log4j-1.2.4.jar:/usr/nsh/br/stdlib/looks-1.1.2.jar:/usr/nsh/br/stdlib/mailapi.jar:/usr/nsh/br/stdlib/mssqlserver.jar:/usr/nsh/br/stdlib/objectprofiler.jar:/usr/nsh/br/stdlib/oracle.jar:/usr/nsh/br/stdlib/parser.jar:/usr/nsh/br/stdlib/smtp.jar:/usr/nsh/br/stdlib/SNMP4J-agent.jar:/usr/nsh/br/stdlib/SNMP4J.jar:/usr/nsh/br/stdlib/spin.jar:/usr/nsh/br/stdlib/ws-commons-util-1.0.1.jar:/usr/nsh/br/stdlib/xerces.jar:/usr/nsh/br/stdlib/xmlrpc-client-3.0.jar:/usr/nsh/br/stdlib/xmlrpc-common-3.0.jar:/usr/nsh/br/stdlib/xml-writer.jar:/usr/nsh/br/stdlib/xpp3_min- com.bladelogic.app.profile.ServerProfileService


      bladmin 3605 3604 0 Feb16 ? 00:11:00 /usr/nsh/br/java/bin/java -Xss1m -Djava.security.egd=file:/dev/../dev/urandom -Xmx128M -Djava.io.tmpdir=/usr/nsh/tmp -Dblx.rootdir=/usr/nsh -Dblx.cmrootdir=/usr/nsh/br -classpath /usr/nsh/br:/usr/nsh/br/bllocale.jar:/usr/nsh/br/bladelogic.jar:/usr/nsh/br/java/jre/lib/rt.jar:/usr/nsh/br/stdlib:/usr/nsh/br/stdlib/activation.jar:/usr/nsh/br/stdlib/bcmail-jdk15-136.jar:/usr/nsh/br/stdlib/bcprov-jdk15-136.jar:/usr/nsh/br/stdlib/binding-1.0.jar:/usr/nsh/br/stdlib/commons-codec-1.3.jar:/usr/nsh/br/stdlib/commons-dbcp-1.2.2.jar:/usr/nsh/br/stdlib/commons-httpclient-3.0.1.jar:/usr/nsh/br/stdlib/commons-logging-1.1.jar:/usr/nsh/br/stdlib/commons-pool-1.3.jar:/usr/nsh/br/stdlib/forms-1.0.5.jar:/usr/nsh/br/stdlib/http.jar:/usr/nsh/br/stdlib/jaxp.jar:/usr/nsh/br/stdlib/jdom.jar:/usr/nsh/br/stdlib/log4j-1.2.4.jar:/usr/nsh/br/stdlib/looks-1.1.2.jar:/usr/nsh/br/stdlib/mailapi.jar:/usr/nsh/br/stdlib/mssqlserver.jar:/usr/nsh/br/stdlib/objectprofiler.jar:/usr/nsh/br/stdlib/oracle.jar:/usr/nsh/br/stdlib/parser.jar:/usr/nsh/br/stdlib/smtp.jar:/usr/nsh/br/stdlib/SNMP4J-agent.jar:/usr/nsh/br/stdlib/SNMP4J.jar:/usr/nsh/br/stdlib/spin.jar:/usr/nsh/br/stdlib/ws-commons-util-1.0.1.jar:/usr/nsh/br/stdlib/xerces.jar:/usr/nsh/br/stdlib/xmlrpc-client-3.0.jar:/usr/nsh/br/stdlib/xmlrpc-common-3.0.jar:/usr/nsh/br/stdlib/xml-writer.jar:/usr/nsh/br/stdlib/xpp3_min- com.bladelogic.app.process.ProcessSpawnerManager


      bladmin 4198 3519 1 Feb16 ? 08:00:31 /bladelogic/nsh-app/br/java/bin/java -Djava.library.path=/bladelogic/nsh-app/br/java/lib/i386/server:/bladelogic/nsh-app/br/java/lib/i386:/bladelogic/nsh-app/br/java/../lib/i386:/usr/nsh/lib -Djava.security.egd=file:/dev/../dev/urandom -Djava.io.tmpdir=/usr/nsh/tmp -Djava.class.path=/usr/nsh/br:/usr/nsh/br/bllocale.jar:/usr/nsh/br/bladelogic.jar:/usr/nsh/br/java/jre/lib/rt.jar:/usr/nsh/br/stdlib:/usr/nsh/br/stdlib/activation.jar:/usr/nsh/br/stdlib/bcmail-jdk15-136.jar:/usr/nsh/br/stdlib/bcprov-jdk15-136.jar:/usr/nsh/br/stdlib/binding-1.0.jar:/usr/nsh/br/stdlib/commons-codec-1.3.jar:/usr/nsh/br/stdlib/commons-dbcp-1.2.2.jar:/usr/nsh/br/stdlib/commons-httpclient-3.0.1.jar:/usr/nsh/br/stdlib/commons-logging-1.1.jar:/usr/nsh/br/stdlib/commons-pool-1.3.jar:/usr/nsh/br/stdlib/forms-1.0.5.jar:/usr/nsh/br/stdlib/http.jar:/usr/nsh/br/stdlib/jaxp.jar:/usr/nsh/br/stdlib/jdom.jar:/usr/nsh/br/stdlib/log4j-1.2.4.jar:/usr/nsh/br/stdlib/looks-1.1.2.jar:/usr/nsh/br/stdlib/mailapi.jar:/usr/nsh/br/stdlib/mssqlserver.jar:/usr/nsh/br/stdlib/objectprofiler.jar:/usr/nsh/br/stdlib/oracle.jar:/usr/nsh/br/stdlib/parser.jar:/usr/nsh/br/stdlib/smtp.jar:/usr/nsh/br/stdlib/SNMP4J-agent.jar:/usr/nsh/br/stdlib/SNMP4J.jar:/usr/nsh/br/stdlib/spin.jar:/usr/nsh/br/stdlib/ws-commons-util-1.0.1.jar:/usr/nsh/br/stdlib/xerces.jar:/usr/nsh/br/stdlib/xmlrpc-client-3.0.jar:/usr/nsh/br/stdlib/xmlrpc-common-3.0.jar:/usr/nsh/br/stdlib/xml-writer.jar:/usr/nsh/br/stdlib/xpp3_min- -Xss1m -Xmx1034027008 -Dblx.rootdir=/usr/nsh -Dblx.cmrootdir=/usr/nsh/br com.bladelogic.mfw.fw.BlManager config1-lx-blogic-p1 config1-lx-blogic-p1 CONFIGURATION,NSH_PROXY LogfileName=ConfigAppServer1.log ConsoleLogfileName=ConfigAppServer1.console.log TempDirectoryName=AppServer1 ProxySvcPort=9842 AuthSvcPort=9840 AppSvcPort=9841 SRPPort=9829 ClientControlTimeout=20


      bladmin 4202 3519 3 Feb16 ? 19:38:30 /bladelogic/nsh-app/br/java/bin/java -Djava.library.path=/bladelogic/nsh-app/br/java/lib/i386/server:/bladelogic/nsh-app/br/java/lib/i386:/bladelogic/nsh-app/br/java/../lib/i386:/usr/nsh/lib -Djava.security.egd=file:/dev/../dev/urandom -Djava.io.tmpdir=/usr/nsh/tmp -Djava.class.path=/usr/nsh/br:/usr/nsh/br/bllocale.jar:/usr/nsh/br/bladelogic.jar:/usr/nsh/br/java/jre/lib/rt.jar:/usr/nsh/br/stdlib:/usr/nsh/br/stdlib/activation.jar:/usr/nsh/br/stdlib/bcmail-jdk15-136.jar:/usr/nsh/br/stdlib/bcprov-jdk15-136.jar:/usr/nsh/br/stdlib/binding-1.0.jar:/usr/nsh/br/stdlib/commons-codec-1.3.jar:/usr/nsh/br/stdlib/commons-dbcp-1.2.2.jar:/usr/nsh/br/stdlib/commons-httpclient-3.0.1.jar:/usr/nsh/br/stdlib/commons-logging-1.1.jar:/usr/nsh/br/stdlib/commons-pool-1.3.jar:/usr/nsh/br/stdlib/forms-1.0.5.jar:/usr/nsh/br/stdlib/http.jar:/usr/nsh/br/stdlib/jaxp.jar:/usr/nsh/br/stdlib/jdom.jar:/usr/nsh/br/stdlib/log4j-1.2.4.jar:/usr/nsh/br/stdlib/looks-1.1.2.jar:/usr/nsh/br/stdlib/mailapi.jar:/usr/nsh/br/stdlib/mssqlserver.jar:/usr/nsh/br/stdlib/objectprofiler.jar:/usr/nsh/br/stdlib/oracle.jar:/usr/nsh/br/stdlib/parser.jar:/usr/nsh/br/stdlib/smtp.jar:/usr/nsh/br/stdlib/SNMP4J-agent.jar:/usr/nsh/br/stdlib/SNMP4J.jar:/usr/nsh/br/stdlib/spin.jar:/usr/nsh/br/stdlib/ws-commons-util-1.0.1.jar:/usr/nsh/br/stdlib/xerces.jar:/usr/nsh/br/stdlib/xmlrpc-client-3.0.jar:/usr/nsh/br/stdlib/xmlrpc-common-3.0.jar:/usr/nsh/br/stdlib/xml-writer.jar:/usr/nsh/br/stdlib/xpp3_min- -Xss1m -Xmx1034027008 -Dblx.rootdir=/usr/nsh -Dblx.cmrootdir=/usr/nsh/br com.bladelogic.mfw.fw.BlManager job1-lx-blogic-p1 job1-lx-blogic-p1 JOB LogfileName=JobAppServer1.log ConsoleLogfileName=JobAppServer1.console.log TempDirectoryName=JobAppServer1 RegistryPort=9936 RMIExecutionPort=9937 JMXManagementPort=9938 MaxJobs=25 MaxWorkItemThreads=30 ClientControlTimeout=20+
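The listing above mixes two notations for the maximum heap: `-Xmx1024M` and `-Xmx128M` use a unit suffix, while the two BlManager processes use a bare byte count, `-Xmx1034027008`. As a minimal sketch (the helper names `xmx_bytes` and `to_mib` are made up for this illustration, and the command lines are abbreviated), the following Python normalizes both forms so they can be compared:

```python
import re

# Multipliers for the unit suffixes the JVM accepts on -Xmx (case-insensitive).
_SUFFIX = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def xmx_bytes(cmdline: str):
    """Return the size in bytes of the first -Xmx flag found, or None."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)(?=\s|$)", cmdline)
    if match is None:
        return None
    number, suffix = match.groups()
    return int(number) * _SUFFIX[suffix.lower()]


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return n_bytes / 1024**2


# Abbreviated versions of the command lines in the listing above.
launcher = "java -Xss1m -Xmx1024M -Djava.io.tmpdir=/usr/nsh/tmp"
job_server = "java -Xss1m -Xmx1034027008 -Dblx.rootdir=/usr/nsh"

print(to_mib(xmx_bytes(launcher)))    # 1024.0
print(to_mib(xmx_bytes(job_server)))  # 986.125
```

By this arithmetic the BlManager instances run with slightly less than 1024 MiB. They are also child processes of the launcher (parent PID 3519) and carry their own `-Xmx` value.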

        • 1. Re: Out of java heap space (java.lang.OutOfMemoryError)
          Bill Robinson

          Max heap on 32-bit RHEL is roughly 1536M to 2048M, so you can raise the heap. Also, open a support ticket on this.

          • 2. Re: Out of java heap space (java.lang.OutOfMemoryError)
            S Crawford

            Thanks Bill. After more investigating, we've found what could be the root cause. There is a server called 'birch' that keeps causing our jobs to hang whenever it is targeted. It was re-enrolled not too long ago. Could the old entry still be referenced? We think an app server restart will take care of this.


            This is one of our two job servers. We can't get a job to start on EITHER job server.


            Work Item Thread status:

            Name,State,Time in state (ms),Additional Info
            Job-WorkItem-Thread-1,STUCK,442h:45m:39s62ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
            Job-WorkItem-Thread-2,EXECUTING,393h:47m:5s4ms,NSH Script Job:OS Config; Server:vx-pmecf-d5;
            Job-WorkItem-Thread-4,STUCK,322h:45m:24s68ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
            Job-WorkItem-Thread-5,STUCK,394h:45m:42s911ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
            Job-WorkItem-Thread-6,EXECUTING,394h:0m:27s386ms,NSH Script Job:OS Config; Server:vx-pmecf-tf;
            Job-WorkItem-Thread-7,STUCK,466h:45m:31s340ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
            Job-WorkItem-Thread-8,EXECUTING,393h:52m:8s333ms,NSH Script Job:OS Config; Server:vx-pmecf-tc;
            Job-WorkItem-Thread-11,STUCK,346h:45m:16s62ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
            Job-WorkItem-Thread-12,STUCK,514h:45m:28s748ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
            Job-WorkItem-Thread-16,STUCK,250h:45m:11s740ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
            Job-WorkItem-Thread-17,STUCK,202h:45m:7s50ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
            Job-WorkItem-Thread-18,EXECUTING,393h:57m:59s531ms,NSH Script Job:OS Config; Server:vx-pmencf-t2;
            Job-WorkItem-Thread-20,STUCK,418h:45m:43s161ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
            Job-WorkItem-Thread-21,EXECUTING,473h:19m:13s297ms,Compliance Job:Test-CIS; Template:CIS - AIX 4.3-5L - v1.0.1; Component:CLINT; Server:CLINT;
            Job-WorkItem-Thread-22,EXECUTING,225h:50m:15s814ms,NSH Script Job:OS Config; Server:lx-blogic-rp2;
            Job-WorkItem-Thread-26,STUCK,490h:45m:56s299ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
            Job-WorkItem-Thread-27,STUCK,178h:44m:18s635ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
            Job-WorkItem-Thread-29,STUCK,226h:45m:3s888ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
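A status dump like the one above can be tallied by state and target server to make a pattern such as "every STUCK thread points at birch" easy to spot. The following is only a sketch: the function name `summarize` is hypothetical, the sample holds just four rows copied from the dump, and it assumes the Additional Info column contains no commas.

```python
import csv
from collections import Counter
from io import StringIO

# A few rows copied from the status dump above (not the full listing).
SAMPLE = """\
Name,State,Time in state (ms),Additional Info
Job-WorkItem-Thread-1,STUCK,442h:45m:39s62ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
Job-WorkItem-Thread-2,EXECUTING,393h:47m:5s4ms,NSH Script Job:OS Config; Server:vx-pmecf-d5;
Job-WorkItem-Thread-4,STUCK,322h:45m:24s68ms,Update Server Properties Job:Refresh Server Properties; Server:birch;
Job-WorkItem-Thread-21,EXECUTING,473h:19m:13s297ms,Compliance Job:Test-CIS; Template:CIS - AIX 4.3-5L - v1.0.1; Component:CLINT; Server:CLINT;
"""


def summarize(dump: str) -> Counter:
    """Count (state, server) pairs in a Name,State,Time,Info dump."""
    counts = Counter()
    for row in csv.reader(StringIO(dump)):
        if len(row) < 4 or row[0] == "Name":
            continue  # skip blank lines and the header row
        state, info = row[1], row[3]
        # The info column is a ';'-separated list of "Key:Value" fields.
        fields = dict(
            part.strip().split(":", 1) for part in info.split(";") if ":" in part
        )
        counts[(state, fields.get("Server", "?"))] += 1
    return counts


for (state, server), n in summarize(SAMPLE).most_common():
    print(f"{state:10} {server:15} {n}")
```

Run against the full dump, the same tally would show whether any STUCK thread targets a server other than birch.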

            • 3. Re: Out of java heap space (java.lang.OutOfMemoryError)
              Bill Robinson

              If you removed the old reference to the server, no jobs should be running against it, unless it was de-enrolled while a job was running.


              If you restart the appserver services, this should clear the stuck jobs.

              • 4. Re: Out of java heap space (java.lang.OutOfMemoryError)
                S Crawford

                Actually the server is currently enrolled in BL. I'm not sure if the server was removed during a job or not. That could certainly be an issue if it was removed then. I will be restarting the app servers ASAP (along with the process spawner, I assume this should be restarted as well?).

                • 5. Re: Out of java heap space (java.lang.OutOfMemoryError)
                  Bill Robinson

                  The process spawner would only come into play if you were running something that used an extended object or NSH script.

                  • 6. Re: Out of java heap space (java.lang.OutOfMemoryError)
                    Steven Alexson

                    We are experiencing this same memory error with 7.5.320. It occurs when we run the Inventory Snapshots used by BSARA (I believe these jobs were provided by a VPC). Our snapshot job runs a discovery first, then the actual snapshot. We have roughly 3800 servers being targeted by this job. The snapshot does capture all extended objects. Currently, the job is set to run against unlimited targets. The memory error begins sometime during the discovery phase of the job.


                    We have 3 physical servers, each running a config server and 2 job servers. These servers are running Red Hat Linux AS4 64-bit. The app servers are currently allocated 1GB RAM each (plans are in place to increase this to 1.5GB this weekend). There is 12GB RAM installed on the server. The servers have 4 CPUs (2 Dual-Core AMD Opteron 2222 Processors).


                    A few questions...


                    1. Is the memory limit for the JVMs on 64-bit Linux still 2GB, or is it greater? Either way, what is the optimal memory allocation for a server such as ours?


                    2. Should the inventory jobs be throttled to a fixed number of servers? If so, is there some way to calculate an appropriate throttling level for jobs based on factors like the number of job servers, number of target servers, and amount of memory available?


                    3. Is there a way to limit how many processes a process spawner starts? We have had jobs in the past that have crashed our servers because the process spawner started so many nsh instances that our load averages were > 100. We have had to throttle those jobs to prevent this. I think it would be better to be able to prevent the process spawner from overrunning the server.

                    • 7. Re: Out of java heap space (java.lang.OutOfMemoryError)
                      Bill Robinson

                      Are you running the x64 version of BladeLogic or the x86 on your boxes? You need the x64 to use the 64-bit JVM...


                      1 - memory limit is removed, though w/ a larger heap you run into garbage collection issues so you may not want to run a 10GB heap.  I think we've done some deployments w/ a 4-5GB heap, still gathering some real-world-metrics on that though.  I'd start out w/ a 3GB heap and go from there.  I think in 7.5 or maybe 7.6 there are some more details in the Infrastructure Management so you can see more about what's going on in the jvm.


                      2 - You're looking at Work Item Threads, each job spawns a number of WITs.  They are going to compete w/ WITs from other jobs going on in the environment.  For some other large environments (5k, 6k, etc) we've set the default parallelism to 20 instead of unlimited (there is a patch for this in 7.6) for all jobs, which gives those customers a balance between speed and not hosing their appservers.  You have to look at how many WITs you have available in the environment (how many appservers/instances x WIT/server).  I think it's something like 1 WIT per server per job, but that may be too simple.


                      3 - You could change ulimits, but that will only cause the app to error out. This may be an RFE... I've also hosed a box by running 20 blcli scripts in parallel (20x256M heap = lots of swap thrashing). I think my load average got over 100 too. Other than throttling back the jobs themselves, I don't know of a way to do this.


                      In general, 'unlimited' seems to be bad for large environments. I've fixed a lot of things by setting the parallelism to a fixed number.
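The Work Item Thread (WIT) arithmetic from point 2 can be sketched in a few lines of Python. This is a rough model under stated assumptions, not BladeLogic's actual scheduler: it takes the tentative "1 WIT per server per job" guess at face value, borrows `MaxWorkItemThreads=30` from the job server command line in the first post, and applies it to the six job servers (3 hosts x 2) described earlier. The function name `wit_usage` is made up for this example.

```python
import math


def wit_usage(targets, job_instances, wits_per_instance, parallelism=None):
    """Estimate how many WITs one job occupies and how many waves it needs.

    Assumes one WIT per target per job; parallelism=None means 'unlimited'.
    """
    total = job_instances * wits_per_instance
    wanted = targets if parallelism is None else min(targets, parallelism)
    in_use = min(wanted, total)  # a job cannot hold more WITs than exist
    return {
        "total_wits": total,
        "wits_in_use": in_use,
        "share_of_pool": in_use / total,
        "waves": math.ceil(targets / in_use),
    }


# 3800 targets, 6 job server instances, 30 WITs each (assumed values).
unlimited = wit_usage(3800, 6, 30)
capped = wit_usage(3800, 6, 30, parallelism=20)

print(unlimited)  # the job can occupy all 180 WITs
print(capped)     # the job holds 20 WITs and runs in 190 waves
```

Under these assumptions an unlimited job can monopolize the whole pool, while a cap of 20 leaves most WITs free for other jobs at the cost of a longer run.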

                      • 8. Re: Out of java heap space (java.lang.OutOfMemoryError)
                        Steven Alexson

                        We are running the x64 version of BladeLogic, so it is good to know that we are able to use more memory. We will probably start by increasing to 2GB per JVM. Since we only have 12GB in these servers currently, we don't want to overdo it. We do plan on installing more memory into these servers in the future. What type of performance impact should we expect from increased memory allocation?

                        • 9. Re: Out of java heap space (java.lang.OutOfMemoryError)
                          Bill Robinson

                          I don't know that we have any specifics we can hand out yet - I'd google a bit on the impact of Garbage Collection and heap size on performance - that's where I think most of the hit will come.  It's more of a java thing than a bladelogic thing, though I agree this would be useful for us to have.

                          • 10. Re: Out of java heap space (java.lang.OutOfMemoryError)

                            I believe a heap larger than 6GB provides no incremental value since you will rarely need more than that size. At that point it is more efficient to spin up another Application Server to utilize the remaining 6GB more effectively. Another option is 3 instances X 4GB, etc. A lot of it depends on the type and frequency and size of jobs you are running as to which option is better.


                            • 11. Re: Out of java heap space (java.lang.OutOfMemoryError)
                              Bill Robinson

                              You'd still want to leave some room on the box for other things, like blcli scripts running, extended objects, etc. Some of those will happen outside of the appserver heap. So something like 3x3GB or 2x4GB on the 12GB box. Otherwise I think you could start hitting swap badly...


                              Like I mentioned previously, I hosed a Solaris box by kicking off a job w/ a ton of blcli in parallel across 20 or 30 boxes. I did not have enough memory left on the appserver to handle it all.

                              • 12. Re: Out of java heap space (java.lang.OutOfMemoryError)

                                I think we are going to start with 3X2GB. Once we stabilize the environment, then upgrade the memory, we may move to a 3X4GB. For now, though, I don't want to use more than 50% since we do run many jobs that spawn nsh/blcli commands.
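The layouts discussed in this part of the thread (3x3GB, 2x4GB, and 3x2GB on a 12GB box) can be compared with some quick arithmetic. The Python below is a sketch only: `heap_budget` is a hypothetical helper, it counts just the configured maximum heaps (a JVM also uses memory outside the heap), and the "burst" of 20 blcli processes at 256MB each is borrowed from the earlier anecdote purely as an illustrative load.

```python
def heap_budget(total_ram_gb: float, instances: int, heap_gb: float) -> dict:
    """Summarize how much RAM the configured max heaps could claim."""
    committed = instances * heap_gb
    return {
        "committed_gb": committed,
        "headroom_gb": total_ram_gb - committed,
        "share": committed / total_ram_gb,
    }


# Illustrative burst of external processes: 20 blcli JVMs at 256 MB each.
burst_gb = 20 * 256 / 1024

for instances, heap_gb in [(3, 3), (2, 4), (3, 2)]:
    b = heap_budget(12, instances, heap_gb)
    fits = b["headroom_gb"] >= burst_gb
    print(
        f"{instances} x {heap_gb} GB: {b['share']:.0%} of RAM, "
        f"{b['headroom_gb']} GB left, burst fits: {fits}"
    )
```

On these numbers only the 3x2GB layout leaves enough headroom for such a burst, which matches the decision to stay at 50% until more memory is installed.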

                                • 13. Re: Out of java heap space (java.lang.OutOfMemoryError)
                                  Steven Alexson

                                  So, I reconfigured our app servers to run with a 2GB heap, but it still doesn't seem to be allocating more than 1.5GB. We are running Red Hat Linux AS4 64-bit and BladeLogic 64-bit (OM750-264-RHAS4X86_64.SH installer with hotfix 320 applied).


                                  I modified the blappserv file in the <OM_DIR>/br directory so that it runs with 2GB allocated:


                                  $JAVA_HOME/bin/java -Xss1m -Xmx2048M -Djava.security.egd=file:/dev/../dev/urandom -Djava.io.tmpdir=$BLADELOGIC_HOME/tmp -Dblx.rootdir=$BLADELOGIC_HOME -Dblx.cmrootdir=$BLADELOGIC_HOME/br -classpath $CLASSPATH com.bladelogic.app.profile.AppServerLauncher $@


                                  After making the changes, I restarted all app servers. In the Infrastructure Management window of the console, it only shows 1.5GB max JVM memory:




                                  As you can see from the above image, I am running the 64-bit JRE on 64-bit architecture. The allocated memory reports the same within the appserver logs.


                                  Any ideas why? Am I missing something?

                                  • 14. Re: Out of java heap space (java.lang.OutOfMemoryError)
                                    Bill Robinson

                                    You need to set this in Infrastructure Management for the particular instance of the appserver (and change the blappserv file back).
