5 Replies Latest reply on Sep 12, 2019 4:01 PM by Edison Pioneer

    Probable causes for Sev-1 on ROD

    Edison Pioneer

      Hi everyone,

       

      We are on ROD right now, and I thought being on ROD would be a respite since BMC looks after the servers.

      One person went so far as to say that BMC monitors the entire environment with monitoring tools every single second and gets notified well in advance if anything is about to go haywire.

       

      However, last week we started seeing an "SQL error" message while performing routine actions on some forms, which honestly caught me off guard. Since then, I have been running daily health checks as soon as we log in at the start of the day.

       

      Still, these measures have done little to allay my fears.

       

      So, I am trying to compile a list of things that might cause a critical or widespread issue, such as a full-blown system outage or a module breaking down and becoming inaccessible to end users.

       

      I have decent on-premises experience, and some of the things that come to mind are RAM being fully used, hard drives filling up due to incessant log collection, the email engine server being out of whack, and some Java processes running amok or not running at all.
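
      To make the disk-full case concrete, here is a rough sketch of the kind of check I would run on-premises. It assumes OS-level access to the server, which we presumably do not have on ROD since BMC looks after the servers. The thresholds and the function names (classify_usage, disk_used_pct) are my own placeholders, not anything from BMC.

```python
import shutil

# Hypothetical thresholds; tune them for your own environment.
WARN_PCT = 80.0
CRIT_PCT = 90.0


def classify_usage(used_pct, warn=WARN_PCT, crit=CRIT_PCT):
    """Map a usage percentage to a status label."""
    if used_pct >= crit:
        return "CRITICAL"
    if used_pct >= warn:
        return "WARNING"
    return "OK"


def disk_used_pct(path="."):
    """Return the percentage of space used on the volume that holds path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    # Check the volume holding the current directory; point this at the
    # log directory on a real server.
    pct = disk_used_pct(".")
    print(f"disk usage {pct:.1f}% -> {classify_usage(pct)}")
```

      The same classify_usage helper could be reused for RAM or any other percentage-style metric, as long as something supplies the number.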

       

      Would someone please note down some probable causes that might create a Sev-1 issue in a ROD environment, based on their knowledge or experience? And what can be done to mitigate them?

       

      I know it's quite a tall order, but any feedback would be really useful here.

      Thanks in advance for any suggestion put forth.

       

      Even the smallest suggestion would be received with the utmost gratitude.