TSCO 10.7.01 intermittently but frequently reporting, "BCO_AFS_FAIL102: Detected a lock in AFS main executor"

Version 4
    Share This:

    This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.


    PRODUCT:

    TrueSight Capacity Optimization


    APPLIES TO:

    TrueSight Capacity Optimization 10.7.01



    PROBLEM:

    Problem symptoms include:
      * TSCO 10.7.01 is intermittently but frequently reporting, "BCO_AFS_FAIL102: Detected a lock in AFS main executor - last activity date: Wed Aug 16 02:58:17 CDT 2017 - please restart component" and "ORA-01013: user requested cancel of current operation"

      * An Oracle AWR shows a large volume of Physical Read activity for the AFS SQL:

        SELECT SYS.SYSID, ( SELECT SUM (CASE WHEN DAT_CPU.AVGVALUE <= :1 THEN DURATION ELSE 0 END) FROM SYS_OBJECT OBJ_CPU JOIN SYS_DATA_D DAT_CPU ON OBJ_CPU.SYSOBJID = DAT_CPU.SYSOBJID AND OBJ_CPU.OBJNAME = 'CPU_UTIL' AND OBJ_CPU.SUBOBJNAME = 'GLOBAL' AND DAT_CPU.TS > :2 AND DAT_CPU.TS < :3 WHERE OBJ_CPU.SYSID = SYS.SYSID ) AS CPU_VALUE, ( SELECT SUM (CASE WHEN DAT_MEM.MAXVALUE >= :4 THEN 1 ELSE 0 END) FROM SYS_OBJECT OBJ_MEM JOIN SYS_DATA_D DAT_MEM ON OBJ_MEM.SYSOBJID = DAT_MEM.SYSOBJID AND OBJ_MEM.OBJNAME = 'MEM_BALLOONING' AND OBJ_MEM.SUBOBJNAME = 'GLOBAL' AND DAT_MEM.TS > :5 AND DAT_MEM.TS < :6 WHERE OBJ_MEM.SYSID = SYS.SYSID ) AS MEM_VALUE FROM SYS_DEF SYS WHERE 1 = 1 AND SYS.sysid in (:7 , :8 , :9 , :10 , :11 , :12 , :13 , :14 , :15 , :16 , :17 , ... lot of numbers from :18 to :777 ..., :778 , :779 , :780 , :781 , :782 , :783 , :784 , :785 , :786 , :787 , :788 , :789 , :790 , :791 , :792 , :793 , :794 , :795 , :796 , :797 , :798 , :799 , :800 , :801 , :802 , :803 , :804 , :805 , :806 , :807 , :808 , :809 , :810 , :811 )

      * An Oracle 'Explain Plan' for the SQL ID of the AFS SQL (from the AWR) shows that the execution plan for the SQL is very, very expensive.  This isn't consistent with the creation of an Explain Plan for the SQL itself via SQL Developer (so the execution plan for the running SQL is very bad compared to the predicted execution plan for the SQL).

    From the afs.log:

    2017-08-16 04:00:02,017 FAILED - BCO_AFS_FAIL102: Detected a lock in AFS main executor - last activity date: Wed Aug 16 02:58:17 CDT 2017 - please restart component
    <-- cut similar messages -->
    2017-08-16 05:20:01,640 FAILED - BCO_AFS_FAIL102: Detected a lock in AFS main executor - last activity date: Wed Aug 16 02:58:17 CDT 2017 - please restart component
    2017-08-16 05:26:17,979 INFO  - END 858 systems of type 25 processed. 0 reservation containers considered, 10377 indicators computed and 1499 metadatas produced. Profile data extracted in 8709.017 seconds, data processed in 0.053 seconds, indicators stored in 168.603 seconds, lastcounter updated in 3.232 seconds. Total execution time 8880.905 seconds.
    2017-08-16 05:26:17,980 INFO  - AFS AFSLauncherProcess processed 2594 systems in 16735.713 seconds.
    2017-08-16 05:26:18,174 INFO  - AFS AFSLauncherProcess processed 0 applications in 0.194 seconds. Now sleeping for 300 seconds.
    2017-08-16 05:31:18,174 INFO  - AFS AFSLauncherProcess awaken from sleep after 300.0 seconds.
    2017-08-16 05:31:18,564 INFO  - UPDATE 1247 systems need to be reevaluated due to new data being processed by the data warehouse.
    2017-08-16 05:31:36,869 INFO  - END 3 systems of type 27 processed. 0 reservation containers considered, 24 indicators computed and 6 metadatas produced. Profile data extracted in 3.424 seconds, data processed in 0.001 seconds, indicators stored in 14.444 seconds, lastcounter updated in 0.347 seconds. Total execution time 18.216 seconds.
    2017-08-16 05:31:40,350 INFO  - END 3 systems of type 35 processed. 0 reservation containers considered, 102 indicators computed and 69 metadatas produced. Profile data extracted in 2.705 seconds, data processed in 0.002 seconds, indicators stored in 0.751 seconds, lastcounter updated in 0.018 seconds. Total execution time 3.476 seconds.
    <-- cut similar messages -->
    2017-08-16 05:33:18,494 INFO  - END 409 systems of type 0 processed. 0 reservation containers considered, 4609 indicators computed and 1226 metadatas produced. Profile data extracted in 52.643 seconds, data processed in 0.011 seconds, indicators stored in 66.135 seconds, lastcounter updated in 0.903 seconds. Total execution time 119.692 seconds.
    2017-08-16 05:42:12,250 ERROR - Fail during systems processing [[3406, 3419, 3650, 3664, 3746, 3754, 4301, 7310, 7311, 21246, 26943, 26946, 26949, 26956, < -- ... cut lots of numbers ... -- >
    28575, 28606, 28607, 28771, 28773]]
    2017-08-16 05:42:12,251 ERROR - BCO_DH_ERR011: Error while executing AFS thread.
    StackTrace: java.util.concurrent.ExecutionException: com.neptuny.cpit.querymanager.QueryTimeoutException: ORA-01013: user requested cancel of current operation

            at java.util.concurrent.FutureTask.report(FutureTask.java:122)
            at java.util.concurrent.FutureTask.get(FutureTask.java:192)
            at com.bmc.bco.service.afs.AFSLauncher$AFSLauncherProcess.run(AFSLauncher.java:357)
    Caused by: com.neptuny.cpit.querymanager.QueryTimeoutException: ORA-01013: user requested cancel of current operation


    SOLUTION:

    The problem was that, for some reason, Oracle was applying a very, very expensive execution plan to the AFS SQL.

    This problem is being tracked by Defect TSCO-13329: 'Force correct explanation plan to Oracle DB for AFS query that extracts all the VMs' and a fix is available in the TSCO 10.7.01 C00005 and later Cumulative Hot Fix (CHF) package (10.7.01.00.C00005).

    The fix is to add optimizer hints to the SQL to direct Oracle to pick an efficient execution path for the SQL.

     


    Article Number:

    000359731


    Article Type:

    Solutions to a Product Problem



      Looking for additional information?    Search BMC Support  or  Browse Knowledge Articles