Max heap for each instance?
Memory on the box?
Load on the appserver when this happens?
Any DB issues when this happens?
Version of BSA?
max heap for each instance? - 8192M
memory on the box ? - 64GB
load on the appserver when this happens? -
Will check on other metrics too and update.
[29 Feb 2016 06:30:09,870] [Scheduled-System-Tasks-Thread-13] [INFO] [System:System:] [Memory Monitor] Total JVM (B): 7564034048,Free JVM (B): 617319464,Used JVM (B): 6946714584,VSize (B): 10225319936,RSS (B): 9018077184,Used File Descriptors: 388
Any DB issues when this happens? - No. If there were an issue on the DB side, shouldn't it impact all the other instances too?
Version of BSA? - 8.3.03
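For reference, the Memory Monitor line quoted above can be turned into numbers that are easier to compare against the 8192M max heap. This is only a rough sketch: the function name `parse_memory_monitor` is hypothetical, and it assumes that "Total JVM" is the currently committed heap and that the counters always appear as comma-separated `Name: value` pairs, as in the one line posted here.

```python
import re

# The Memory Monitor entry exactly as quoted in this thread.
LINE = (
    "[29 Feb 2016 06:30:09,870] [Scheduled-System-Tasks-Thread-13] [INFO] "
    "[System:System:] [Memory Monitor] Total JVM (B): 7564034048,"
    "Free JVM (B): 617319464,Used JVM (B): 6946714584,VSize (B): 10225319936,"
    "RSS (B): 9018077184,Used File Descriptors: 388"
)


def parse_memory_monitor(line):
    """Return the 'Name: value' counters after '[Memory Monitor]' as ints.

    Hypothetical helper; the layout is inferred from a single sample line.
    """
    _, _, payload = line.partition("[Memory Monitor]")
    pairs = re.findall(r"([^,:]+):\s*(\d+)", payload)
    return {name.strip(): int(value) for name, value in pairs}


stats = parse_memory_monitor(LINE)

# Share of the committed heap that is in use (assumes Total JVM = committed heap).
used_pct = 100.0 * stats["Used JVM (B)"] / stats["Total JVM (B)"]
total_mib = stats["Total JVM (B)"] / (1024 * 1024)

print(f"heap used: {used_pct:.1f}% of {total_mib:.0f} MiB committed")
print(f"open file descriptors: {stats['Used File Descriptors']}")
```

On the line above this works out to roughly 92% of the committed heap in use, so it may be worth checking the same counters just before the instance is marked down.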
I don't think large object cache cleaning will have any effect on the heartbeat. The Large Object cleaning task doesn't talk to the database at all.
A failing heartbeat purely means your database connection is timing out.
Can you please provide the output of blasadmin -a show Database all?
OK, the logs below are from another instance which went down recently.
[01 Mar 2016 06:59:02,250] [Scheduled-System-Tasks-Thread-8] [ERROR] [System:System:] [Large Object Cleanup Task] Failed to cleanup large object cacheAn error occurred while attempting to access the database:
Message : IO Error: Connection reset SQLState: null ErrorCode: 0
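To line up errors like this one across instances, the appserver log entries can be split into fields and filtered by level. A minimal sketch, assuming the `[timestamp] [thread] [LEVEL] [context] [task] message` layout seen in the lines quoted in this thread; `parse_entry` and `errors_only` are hypothetical helper names, and the month abbreviation is parsed with an English/C locale in mind.

```python
import re
from datetime import datetime

# Assumed layout, inferred only from the log lines quoted in this thread:
# [timestamp] [thread] [LEVEL] [context] [task] message
ENTRY = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<thread>[^\]]+)\] \[(?P<level>[A-Z]+)\] "
    r"\[(?P<context>[^\]]*)\] \[(?P<task>[^\]]+)\] (?P<message>.*)$"
)


def parse_entry(line):
    """Split one log line into fields; return None for continuation lines."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # e.g. "01 Mar 2016 06:59:02,250" (milliseconds after the comma)
    entry["ts"] = datetime.strptime(entry["ts"], "%d %b %Y %H:%M:%S,%f")
    return entry


def errors_only(lines):
    """Keep only the parsed entries logged at ERROR level."""
    entries = (parse_entry(line) for line in lines)
    return [e for e in entries if e is not None and e["level"] == "ERROR"]


sample = [
    "[01 Mar 2016 06:59:02,250] [Scheduled-System-Tasks-Thread-8] [ERROR] "
    "[System:System:] [Large Object Cleanup Task] Failed to cleanup large "
    "object cacheAn error occurred while attempting to access the database:",
    "Message : IO Error: Connection reset SQLState: null ErrorCode: 0",
]

for entry in errors_only(sample):
    print(entry["ts"].isoformat(), entry["task"], "|", entry["message"])
```

Running the same filter over the logs of each instance gives a list of timestamps that can be compared with the time the heartbeat failed.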
This is from the instance we have been discussing:
lrcha60226# blasadmin -a show data all
blasadmin now running against the following deployments: APPINST0226c, APPINST0226b, _template, APPINST0226a
Or is [Large Object Cleanup Task] a symptom, or a task performed before the instance is marked as down?
The Large Object Cleanup Task works on in-memory objects. It is a maintenance task, much like a garbage collector.
I see that you don't have any value set for MaxWaitTime.
What value do you have for this parameter on the other appservers where you don't see the issue?