What is the best way to 'take a remote node offline' (stop collecting data and prevent the bgsagent from restarting) in the TSCO Agent/BPA Agent product?

Version 7

    This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.


    PRODUCT:

    BMC Performance Assurance for Virtual Servers


    APPLIES TO:

    TrueSight Capacity Optimization 11.5, 11.3.01, 11.0, 10.7, 10.5, 10.3, 10.0 ; BMC Performance Assurance 9.5



    QUESTION:

     
       What is the best way to 'take a remote node offline' (stop collecting data and prevent the bgsagent from restarting) in the TrueSight Capacity Optimization (TSCO) Gateway Server or BMC Performance Assurance for Servers product?   
      
    When a collection request is registered on a remote node, the TSCO Gateway Server/BMC Performance Assurance (BPA) product works to keep data collection active on that node. By default, the TSCO Gateway Server console issues collector query requests to verify that the collector is gathering data, and restarts the bgsagent process if necessary. Thus, simply clearing a collection request or stopping the bgsagent process may not be sufficient to keep data collection stopped on a remote node.
      
    TrueSight Capacity Optimization (Gateway Server) 11.5, 11.3.01, 11.0, 10.7, 10.5, 10.3, 10.0
    BMC Performance Assurance for Servers 9.5, 9.0, 7.5.10, 7.5.00
    BMC Performance Assurance for Virtual Servers 9.0, 7.5.10, 7.5.00
       
          


    ANSWER:

     

    There are two parts to the solution for this issue: a short-term solution that disables data collection for the remainder of the current day, and a long-term solution that disables data collection from now until a future day when you choose to re-enable it.

      
          

    Section I: Short Term

       
        

     

        

    Windows Agent

        
       Step 1  
      
    Stop the TSCO Agent on the Windows machine:   
      
    > "%BEST1_COLLECT_HOME%\bgs\bin\best1agent_stop"   
      
       Step 2  
      
    Stop the Service Daemon running on the machine and set it to 'Manual' startup.   
      
    The Service Daemon can be stopped either via the Services Control panel (Stop the bgssd_service) or via an Administrative Command prompt:   
      
       > net stop bgssd_service  
      
    If it is necessary to set the service to disabled (to ensure it remains stopped after a machine reboot), that must be done via the Services Control panel.    

    Unix Agent

        

    To stop the TSCO Gateway Server console from restarting the TSCO Agent process on the remote node (and thus re-starting TSCO data collection) one can use the Authorization.cfg file on the remote node to restrict access to the remote node so the Managing node cannot restart data collection.

        


    Option A: Use the Authorization.cfg to remove user authorization to access the TSCO Agent

        


    Step 1

    Back up your existing $BEST1_HOME/local/setup/Authorization.cfg file on the remote node.

        

    Step 2

    Stop the running bgsagent process:

        

    > $BEST1_HOME/bgs/scripts/best1agent_stop -b $BEST1_HOME -a
    executing /usr/adm/best1_default/bgs/bin/bgsagent_stop -b /usr/adm/best1_default
    Trying to connect to Agent on node topgun using port 6767
      Connection made - sending stop Agent message
      Waiting for Agent to stop .
      Agent stopped successfully


    Step 3: Restrict access to the remote node using the Authorization.cfg file

        

    Create a new $BEST1_HOME/local/setup/Authorization.cfg file with the following contents:

        

    BEGIN_AUTHORIZATION
      PERMISSION = NONE
    END_AUTHORIZATION

        

    Step 4

    At this point the bgsagent process should not be running on the agent machine, and requests from the TSCO Gateway Server console to restart the TSCO Agent should be rejected by the Service Daemon with an 'Account Unauthorized' message. As a result, neither the TSCO Agent (bgsagent) nor the Perform Collector (bgscollect) process remains running on the machine.

        

    See the "After creating the Authorization.cfg file there is still an agent running that I can't stop!" section below
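Steps 1 and 3 of Option A can be scripted. The following is a minimal sketch, not a BMC-supplied tool: `lock_agent_access` is a hypothetical helper name, and the `.bak` suffix used for the backup is an assumption. It takes the setup directory as its argument, keeps a one-time copy of the existing file, and then writes the deny-all contents shown in Step 3. It should only be run after the agent has been stopped (Step 2).

```shell
#!/bin/sh
# Hypothetical helper: back up Authorization.cfg, then replace it with a
# deny-all version. Argument: the setup directory that holds the file.
# Usage (assumes BEST1_HOME is set):
#   lock_agent_access "$BEST1_HOME/local/setup"
lock_agent_access() {
    cfg="$1/Authorization.cfg"

    # Step 1: copy the existing file once. An existing backup is never
    # overwritten, so running the helper twice cannot lose the original.
    if [ -f "$cfg" ] && [ ! -f "$cfg.bak" ]; then
        cp -p "$cfg" "$cfg.bak" || return 1
    fi

    # Step 3: write the deny-all contents.
    printf '%s\n' \
        'BEGIN_AUTHORIZATION' \
        '  PERMISSION = NONE' \
        'END_AUTHORIZATION' > "$cfg"
}
```

To re-enable collection later, copy the backup back over Authorization.cfg.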

        

    Option B: Stopping the Service Daemon (if running in Standalone mode)

        

    If the Service Daemon is running in standalone mode (a common configuration on Linux), it can be stopped, which prevents the TSCO Gateway Server from communicating with the machine.

    IMPORTANT NOTE: Stopping the Service Daemon will not survive a system reboot.  On reboot the Service Daemon will be automatically restarted by the service daemon startup that has been registered with the system.

    Step 1

    Check that the Service Daemon is running in Standalone mode:

    ps -ef | grep bgssd
    perform   2869     1  0 Aug12 ?        00:00:38 /etc/bgs/SD/bgssd.exe -d /etc/bgs/SD -s


    Seeing a running 'bgssd.exe ... -s' process confirms the Service Daemon is running in standalone mode.

    Step 2

    Stop the running TSCO Agent

    $BEST1_HOME/bgs/scripts/best1agent_stop -a

    Step 3

    Confirm that the TSCO Agent and Collector have stopped:

    ps -ef | grep bgs

    You should not see a 'bgsagent' or 'bgscollect' process running.

    Step 4

    Stop the Service Daemon

    $BEST1_HOME/bgs/bin/bgssd.exe -k

        

    Step 5

    Validate that the Service Daemon has been stopped:

    ps -ef | grep bgssd
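The manual `ps` checks in Option B (Steps 1, 3, and 5) can be wrapped in small helpers. This is a sketch under assumptions: the function names are hypothetical, and both helpers read `ps -ef` output on standard input rather than calling `ps` themselves, so they can be tried against saved output.

```shell
#!/bin/sh
# Hypothetical helpers; each reads `ps -ef` output on stdin.

# Succeeds when a bgssd.exe process line carries the -s (standalone) flag.
sd_is_standalone() {
    grep 'bgssd\.exe' | grep -Eq -- ' -s( |$)'
}

# Succeeds when a bgsagent or bgscollect process line is present.
# Lines belonging to grep itself are ignored.
agent_still_running() {
    grep -v 'grep' | grep -Eq 'bgsagent|bgscollect'
}

# Usage:
#   ps -ef | sd_is_standalone    && echo "Service Daemon is standalone"
#   ps -ef | agent_still_running || echo "agent and collector are stopped"
```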
     

       
       

    Section II: Long Term

       
       For the long term, the best option is to remove the node from the domain or policy file used in your active Manager run.   
       
       

    TSCO 11.x (Agent list implementation):

       
      
       Step 1:  
      
    Open the Gateway Manager section and select the Gateway Server that collects data from the Agent machine.   
      
       Step 2  
      
    Download the Agent List file from the Server and remove the node from the list   
      
       Step 3  
      
    Upload the updated Agent List file to the Gateway Server via the TSCO UI.   
       
       

    Deployments with no Agent List implementation:

       
      
    To remove the node from the domain file (or policy file on Windows) used in your active Manager run:   
      
       Step 1  
      
    Load the domain/policy file in the BMC Performance Assurance for Servers console GUI and delete the node.   
      
       Step 2  
      
    Save the domain/policy file.   
      
    Once this is done, your Manager run will no longer attempt to collect data from this remote node in future runs until you put it back in the domain file.   
      
    This change takes effect the next time collection requests are issued for this Manager run (typically around 11:30 PM), so collection from the node stops on the day after the change is made. To stop data collection immediately, use the Authorization.cfg file to prevent the console from communicating with the remote node (which would cause the agent to start) for the remainder of the day.   
      
    There is no mechanism available on the Perform console to stop the console from issuing query requests to the remote node in the middle of a Manager run. But, using the Authorization.cfg file those console requests will be rejected on the remote node and data collection will stop.   
      
    The Perform console will also attempt data transfers from the remote node for the data collection request that it started before the Authorization.cfg file was implemented on the machine. If the Authorization.cfg file is in place, these transfer requests will fail and the TSCO Agent will not be restarted.  
       

    Section III: Common Questions

       
        

    Q: Do I need to actively clear the active collection request?

    No, you don't need to clear the active collection request registered with the TSCO Agent since the Authorization.cfg file will prevent the agent from restarting on the machine. But, there is no harm in clearing the active collection request before stopping the agent and implementing the Authorization.cfg file update.

    If using the Unix console, you can clear the active data collection request from the Agent's .als file by running the following command from the Perform console:

        

      > $BEST1_HOME/bgs/scripts/best1collect -n [remote hostname] -K

        

    This command must be run from the Perform console as the user that submitted the collection request. This is possible to do on the Windows console but difficult because the collection request will have been submitted as the 'SYSTEM' user.

    But, since the TSCO Agent won't restart after the Authorization.cfg change there is no technical reason to clear the request.

    Q: Should I delete the .als file from the remote node after stopping the Agent?

    There is no reason to delete the .als file from the remote node after stopping the agent. The $BEST1_HOME/bgs/monitor/log/[hostname]-bgsagent_6767.als file on Unix and the %BEST1_COLLECT_HOME%\bgs\monitor\log\[hostname]-bgsagent_6767.als file on Windows contain a list of active collection requests and alerts, but once the Authorization.cfg file is used to prohibit the restart of the agent, the .als file contents no longer matter.   
      
       Q: Are there any future enhancements planned to make this easier?  
      
    Request for Enhancement (RFE) QM00540694 asked for the ability to mark a remote node as being in 'maintenance mode' so that it would not accept incoming requests that start the TSCO Agent (and thus initiate data collection). That request has been reviewed by the product team and is not currently planned for inclusion in a future release of the TrueSight Capacity Optimization Gateway Server. The reasons are the complexity it would introduce in the UDR Collection Manager workflow, the expected infrequency of its use, and the limited benefit it would provide over the current workaround (since it would basically just stop the bgssd.exe binary from being started on a remote node).   
      
       Q: After creating the Authorization.cfg file there is still an agent running that I can't stop!    

    The Authorization.cfg file change must be implemented when there is no TSCO Agent process running. Sometimes, after the agent is stopped but before the updated Authorization.cfg file is saved, the Perform console restarts the TSCO Agent process. Once the Authorization.cfg file is in place, it also prevents that running TSCO Agent from being stopped on the machine. To address this, revert the Authorization.cfg file change, stop the running agent, and implement the Authorization.cfg file change again before a new TSCO Agent process has been started.
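The revert, stop, and re-implement sequence is easiest to win against the console when it runs as one script with no manual pause between steps. The following is a minimal sketch under assumptions: `relock_agent` is a hypothetical helper name, the backup file is whatever copy of the original Authorization.cfg you saved earlier, and the agent stop command is passed in by the caller rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helper. Arguments:
#   $1    path to the live Authorization.cfg
#   $2    path to the saved copy of the original file
#   $3..  the command (with arguments) that stops the agent
relock_agent() {
    cfg="$1"
    backup="$2"
    shift 2

    # Revert the change so the stop request is authorized again.
    cp -p "$backup" "$cfg" || return 1

    # Stop the running agent; leave the original file in place on failure.
    "$@" || return 1

    # Re-implement the deny-all file immediately after the stop succeeds.
    printf '%s\n' \
        'BEGIN_AUTHORIZATION' \
        '  PERMISSION = NONE' \
        'END_AUTHORIZATION' > "$cfg"
}

# Usage (assumes BEST1_HOME is set and a backup exists at the given path):
#   relock_agent "$BEST1_HOME/local/setup/Authorization.cfg" \
#       /path/to/Authorization.cfg.backup \
#       "$BEST1_HOME/bgs/scripts/best1agent_stop" -b "$BEST1_HOME" -a
```

If the stop command fails, the helper returns non-zero and leaves the original file restored, so the stop can be retried before access is restricted again.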

    Q: Should the Service Daemon (bgssd.exe) process still be running or able to start on the machine?

        

    The Service Daemon (bgssd.exe) is an [x]inetd service that listens on port 10128 and accepts requests from the Perform console.  There isn't really a great way to disable the Service Daemon when it is being fronted via xinetd.

    When using the Authorization.cfg file, you are allowing the Service Daemon to run as usual when it receives a request.  It passes that request on to the TSCO Agent (bgsagent), but the TSCO Agent then sees that the request isn't from an authorized user (since no users are now authorized) and terminates.

    You generally shouldn't see a bgssd.exe process running for more than 60 seconds (generally it will run for less than 10 seconds) unless it is scheduled to run in 'standalone' mode (then the running process will have a '-s' flag in 'ps -ef' output).  When running in standalone mode it is possible to stop the Service Daemon since it is running as a standalone process that is started via an rc startup script.  So, in this environment you can just stop the Service Daemon (either by killing it or by stopping it via its rc startup script).  Connections from the Perform Console would then fail since there wouldn't be a process listening on port 10128.
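To confirm that nothing is listening on port 10128 after the standalone Service Daemon has been stopped, the listening sockets can be inspected. This is a sketch that assumes a Linux host where the `ss` utility is available; `port_is_listening` is a hypothetical helper name. It reads `ss -ltn` output on standard input and checks the local address column for the given port.

```shell
#!/bin/sh
# Hypothetical helper. Argument: TCP port number.
# Stdin: output of `ss -ltn` (column 4 is the local address:port).
port_is_listening() {
    awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Usage:
#   ss -ltn | port_is_listening 10128 \
#       && echo "Service Daemon port is still open" \
#       || echo "nothing is listening on 10128"
```

Note that when the Service Daemon is fronted by [x]inetd, the port stays open because inetd itself holds the listening socket, so this check is only meaningful in standalone mode.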

    Standalone mode is an alternate configuration typically seen on Linux for when inetd isn't configured to run on the machine.
      000106580: Is there a way to run the Perform Service Daemon on a machine without the use of inetd or xinetd?

       
      
      
    Related Products:  
       
    1. TrueSight Capacity Optimization
    2. BMC Performance Assurance for Servers

    Legacy ID: KA399308

     


    Article Number:

    000031745


    Article Type:

    FAQ/Procedural


