Patrol Agent - Corrupt History files questions and fix_hist switch best practices.

Version 1
    This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.


    PRODUCT:

    BMC Performance Manager Reporting


    APPLIES TO:

    BMC Performance Manager Reporting



    QUESTION:

    Corrupt History files questions and fix_hist switch best practices.

    Running fix_hist without the -no_backup switch returned an error saying that the history was corrupt.

    1: How do we catch corrupt history before we need the data?

    2: There is a way of 'attaching' the fix_hist switch to an agent's restart. How is this done, and is it best practice?
          


    ANSWER:

     

    This resolution contains commands that are intended for Unix systems and is written with a Unix-oriented audience in mind. Most of the principles described in this resolution apply to Windows systems as well, unless otherwise noted.

    A) How does history get corrupted?

      

    History files can become corrupted when the PatrolAgent is shut down ungracefully. The three main ways this occurs are:

    - the computer crashes

    - the computer is shut down gracefully, but the PatrolAgent is still running

    - the PatrolAgent process was killed using the 'kill -9' command.

    The PatrolAgent usually has the history file open for writes. If the PatrolAgent is brought down ungracefully, then the history file can get corrupted.

      


    B) How do you fix history corruption?

    The error message "History was not closed with a proper agent termination after the above date" indicates corrupted history files and is generally caused by the improper shutdown of an Agent.

    In order to correct this issue you will need to do the following:

      

    1) Stop the Agent

    2) Rename or remove the history files. (The advantage of renaming the directory is that this old history data can be accessed via the dump_hist utility.) The history files are located in the $PATROL_HOME/log/history/<hostname>/<portnumber> directory and are called annotate.dat, dir, and param.hist.

    3) Remove the PEM files. They are located in the $PATROL_HOME/log directory and are called PEM_{host}_{agent port #}.log and .archive. This is a precautionary measure. Many times the PEM files become corrupted along with the history files.

    4) Restart the Agent. When the Agent is unable to locate these files, it will automatically create new ones.

    5) If the agent still will not start successfully, once again remove PEM* and the history files, and additionally rename or remove config_{host}-{port}, which is located in $PATROL_HOME/config. It is very rare for the config DB to have a problem that blocks startup, but it can nonetheless happen. The PATROL Configuration Manager, pconfig +Reload {cfg file}, or a restore of the config file from an OS backup can be used to recover if needed. Please note that removing the config file will remove any KM configurations that it contains. Those KMs will need to be reconfigured after the config file is removed.
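
    Steps 2 and 3 above can be scripted. The following is a minimal sketch, not a BMC-supplied tool: the function name, the timestamped ".corrupt" suffix, and the example host and port in the usage note are illustrative assumptions. Only the directory layout and the PEM file names come from the steps above. It assumes the Agent has already been stopped (step 1).

```python
import time
from pathlib import Path


def archive_history_and_remove_pem(patrol_home, host, port):
    """Rename the agent's history directory and delete its PEM files.

    Assumes the agent has already been stopped. Returns the new path of
    the renamed history directory, or None if no directory was found.
    """
    home = Path(patrol_home)

    # Step 2: rename <portnumber> so the old data stays readable by dump_hist.
    history_dir = home / "log" / "history" / host / str(port)
    archived = None
    if history_dir.is_dir():
        # The timestamp suffix is an illustrative naming choice.
        stamp = time.strftime("%Y%m%d%H%M%S")
        archived = history_dir.with_name(f"{port}.corrupt.{stamp}")
        history_dir.rename(archived)

    # Step 3: remove PEM_{host}_{port}.log and PEM_{host}_{port}.archive.
    for suffix in (".log", ".archive"):
        pem_file = home / "log" / f"PEM_{host}_{port}{suffix}"
        if pem_file.exists():
            pem_file.unlink()

    return archived
```

    For example, archive_history_and_remove_pem(os.environ["PATROL_HOME"], "myhost", 3181) would archive the history for a hypothetical agent on host "myhost" and port 3181 (with os imported by the caller). Renaming rather than deleting keeps the old data available to dump_hist.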

      


    Note: A fix_hist utility is provided in the $PATROL_HOME/bin directory which can be used to repair history files. This utility will scan the entire history file and resync the database and its indexes. Please refer to the PatrolAgent Reference Manual, chapter 8 for information on how to run fix_hist.
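
    One way to catch corruption before the data is needed is to search captured output for the error message quoted in section B. The sketch below is only an illustration: where the message is written (a tool's output or a log file) is not specified in this article, so the caller is assumed to supply the text, and the function name is hypothetical.

```python
# Leading part of the error message quoted in section B.
CORRUPTION_MARKER = "History was not closed with a proper agent termination"


def find_corruption_warnings(text):
    """Return the lines of `text` that contain the corruption message."""
    return [
        line.strip()
        for line in text.splitlines()
        if CORRUPTION_MARKER in line
    ]
```

    A scheduled job could pass the captured text to find_corruption_warnings and raise an alert whenever the returned list is not empty.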

    C) What happens when an Agent improperly shuts down?

      

    When the machine crashes or the agent terminates improperly, several things could happen. They include corrupted history files and/or corrupted PATROL Event log files.

    Corrupt History Files

    The agent creates and subsequently uses history files in the $PATROL_HOME/log/history/<hostname>/<portnumber> directory. If the agent goes down abruptly and does not have the opportunity to write all cached data to the history files, corruption will most likely occur.

    In some cases running the fix_hist utility will repair the history files. If fix_hist doesn't work then the files will need to be removed or renamed. The easiest method is to rename or remove the <portnumber> directory mentioned above. The advantage of renaming the directory is that this old history data can be accessed via the dump_hist utility.

    Note: See the Agent Reference Manual for info on fix_hist and dump_hist.
    More details about fix_hist are available at the below link:
    https://docs.bmc.com/docs/display/PA11302/fix_hist+utility

    Renaming the <portnumber> history directory also makes it possible to run the dump_hist utility against the archived files, provided they are not too badly corrupted, so that their data can be extracted. See the Agent Reference Manual for more information on dump_hist.

      


    D) How can you prevent history corruption?

      

    PATROL history files contain parameter data and annotations that have been collected over a period of time. History corruption can occur if PatrolAgents are shut down abruptly or inappropriately. Over time, history corruption can result in poor agent performance, including increased time to execute recovery actions and update parameter data, abnormal agent termination, and an agent failing to start.

    With PATROL Agent version 3.5.30 or later, the /AgentSetup/fixHistFlag agent configuration variable is used to fix the history files when the agent starts up. It accepts the following values:

    always --> attempts repair, packing, or both

    dirtybit --> attempts repair, packing, or both if the history database indicates it may have been left in an unknown or incomplete state

    nobackup --> does not back up the history database before attempting repair, packing, or both

    nopackann --> does not pack annotations when processing the history database
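
    As an illustration of setting this variable, the sketch below builds a configuration rule as text. The rule layout (a PATROL_CONFIG header followed by a REPLACE assignment) is an assumption based on the usual PATROL configuration file format and should be checked against the Agent Reference Manual. The function name is hypothetical, and whether several values can be combined is not stated here, so the sketch accepts one value at a time.

```python
# Values listed above for /AgentSetup/fixHistFlag.
VALID_FLAGS = ("always", "dirtybit", "nobackup", "nopackann")


def fix_hist_flag_rule(value):
    """Build a configuration rule that sets /AgentSetup/fixHistFlag."""
    if value not in VALID_FLAGS:
        raise ValueError(f"unsupported fixHistFlag value: {value!r}")
    # Assumed rule layout; verify against the agent documentation.
    return (
        "PATROL_CONFIG\n"
        f'"/AgentSetup/fixHistFlag" = {{ REPLACE = "{value}" }}\n'
    )
```

    The text returned by fix_hist_flag_rule("dirtybit") could be saved to a .cfg file and applied with PATROL Configuration Manager or pconfig, so that the agent attempts a repair at startup only when the history database was left in an unknown state.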

      
        

     


    Article Number:

    000031247


    Article Type:

    FAQ/Procedural


