This document contains official content from the BMC Software Knowledge Base. It is automatically updated when the knowledge article is modified.
CMF MONITOR - all supported releases
We see the following CMF messages, which seem to be related to problems with sampler CA5H.
N A=x00F9 xxx 2018078 18:00:00.45 MVSPAS CMFCAC75E IDCSS01 COUNTS CALL FAILED FOR ALL SUBSYSTEMS: RC=0038. CACHE CPM SAMPLER CA5H
N A=x00F9 xxx 2018078 18:00:01.00 MVSPAS CMFCAC53W SAMPLER SKIPPING INTERVAL 18078 01750 DUE TO ERRORS OR TIMEOUT, CACHE CPM SAMPLER CA5H
N A=x00F9 xxx 2018078 18:10:00.45 MVSPAS CMFCAC75E IDCSS01 COUNTS CALL FAILED FOR ALL SUBSYSTEMS: RC=0038. CACHE CPM SAMPLER CA5H
N A=x00F9 xxx 2018078 18:10:01.00 MVSPAS CMFCAC53W SAMPLER SKIPPING INTERVAL 18078 01800 DUE TO ERRORS OR TIMEOUT, CACHE CPM SAMPLER CA5H
N A=x00F9 xxx 2018078 18:20:00.45 MVSPAS CMFCAC75E IDCSS01 COUNTS CALL FAILED FOR ALL SUBSYSTEMS: RC=0038. CACHE CPM SAMPLER CA5H
N A=x00F9 xxx 2018078 18:20:01.00 MVSPAS CMFCAC53W SAMPLER SKIPPING INTERVAL 18078 01810 DUE TO ERRORS OR TIMEOUT, CACHE CPM SAMPLER CA5H
N A=x00F9 xxx 2018078 18:30:00.43 MVSPAS CMFCAC75E IDCSS01 COUNTS CALL FAILED FOR ALL SUBSYSTEMS: RC=0038. CACHE CPM SAMPLER CA5H
N A=x00F9 xxx 2018078 18:30:00.68 MVSPAS BBDDA013I New WLM policy activated in GOAL mode detected
N A=x00F9 xxx 2018078 18:30:00.69 MVSPAS BBDDW006I Initializing WLM Workloads
N A=x00F9 xxx 2018078 18:30:01.00 MVSPAS *CMFSDV86 CPM SAMPLER CA5 TERMINATED, RECORD ERROR X'00000008'
N A=x00F9 xxx 2018078 18:30:01.00 MVSPAS CMFCAC54E SAMPLER TERMINATING DUE TO EXCESSIVE I/O ERRORS OR TIMEOUTS CACHE CPM SAMPLER CA5H
What can we do about these?
The help for message CMFCAC75E says:
IDCSS01 xxxxxx CALL FAILED FOR ALL SUBSYSTEMS: RC=yy. CACHE xPM SAMPLER CAxH
A call to IDCSS01 to request performance data or sense status data failed in a CPM or IPM
(xPM in the message) cache sampler. The return code (RC=yy) describes the nature of the error.
This IDCSS01 call was intended to obtain information for all sub-systems.
'CAxH' identifies the cache sampler that produced this message.
Note that there are multiple cache samplers. Each collects data for a different control unit
model and each has a different identifier CAxH.
Information about the return codes can be found at the end of SSGARGL - argument list
obtained from IDCDF70 macro.
If the IDCSS01 call is for COUNTS data, no SMF records for this interval are produced
for all subsystems belonging to this model. If the IDCSS01 call is for STATUS data, the
SMF record for this interval is not produced for the particular subsystem encountering this error.
In the messages above, RC=0038 on the CMFCAC75E message indicates that the IBM-supplied Cache API (IDCSS01) returned return code X'38' (decimal 56). The macro that maps the API data describes this code only as "SERIOUS ERROR MSG". No other explanation of the cause of this error is available.
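As an illustration of how the return code can be read off the message text, the following is a minimal Python sketch that pulls the RC field out of a CMFCAC75E line and converts it from hexadecimal to decimal. The function name `extract_rc` and the regular expression are assumptions made for this example; they are not part of CMF MONITOR or of the IBM API.

```python
import re

# Hypothetical pattern: matches the RC=yy field of a CMFCAC75E message,
# where yy is a hexadecimal return code as described in the message help.
RC_PATTERN = re.compile(r"CMFCAC75E\b.*?\bRC=([0-9A-Fa-f]+)")


def extract_rc(line):
    """Return the IDCSS01 return code as an int, or None if the line has none."""
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    # The message prints the code in hex, so parse it with base 16.
    return int(match.group(1), 16)


sample = (
    "N A=x00F9 xxx 2018078 18:00:00.45 MVSPAS CMFCAC75E IDCSS01 COUNTS "
    "CALL FAILED FOR ALL SUBSYSTEMS: RC=0038. CACHE CPM SAMPLER CA5H"
)
rc = extract_rc(sample)
print(f"RC=X'{rc:02X}' (decimal {rc})")  # prints: RC=X'38' (decimal 56)
```

A script like this could be run over a saved copy of the job log to confirm that every failing interval reports the same return code before the problem is referred to IBM.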
If MainView for z/OS (CMF) receives an error from the Cache API, we attempt to analyse it. The product can handle some error conditions, but for this return code no further information is available about what the API found wrong. If four successive calls to the Cache API fail, the Cache Sampler terminates (as shown in the messages above).
When this sampler terminates, no SMF data (74-5 and 74-8) is written and no data is available for the CACHxxxx views, because this data is obtained from the Cache Sampler.
If this happens just once, there is no need for further action. The Cache Sampler skips this interval and will attempt to recover at the end of the next interval.
Only if this happens four times in a row without a successful Cache API call will the Cache Sampler shut down.
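The shutdown rule described above (terminate after four consecutive failed calls, with any successful call resetting the count) can be sketched as a simple counter. This is only a model of the behaviour as described in this article, not the product's actual code; the names `MAX_CONSECUTIVE_FAILURES` and `termination_interval` are hypothetical.

```python
# Threshold taken from the article: four failed Cache API calls in a row.
MAX_CONSECUTIVE_FAILURES = 4


def termination_interval(call_results):
    """Model the sampler over a sequence of per-interval API outcomes.

    call_results is an iterable of booleans (True = the API call succeeded).
    Returns the 1-based interval at which the sampler would shut down,
    or None if it never reaches four consecutive failures.
    """
    consecutive_failures = 0
    for interval, succeeded in enumerate(call_results, start=1):
        if succeeded:
            # A successful call clears the failure streak.
            consecutive_failures = 0
            continue
        consecutive_failures += 1
        if consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
            return interval
    return None


# Four failures in a row, as in the log above: shutdown at the 4th interval.
print(termination_interval([False, False, False, False]))
# A success after three failures resets the count, so no shutdown.
print(termination_interval([False, False, False, True, False]))
```

In the log extract above, the four CMFCAC75E messages at 18:00, 18:10, 18:20 and 18:30 correspond to the first case: three intervals are skipped (CMFCAC53W) and the fourth failure produces CMFCAC54E.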
If the Cache Sampler shuts down, an error message is issued (CMFCAC54E in the example above). To restart it, enter MODIFY pasname,CPM=xx to recycle the samplers.
The Cache Sampler will restart even if the error has not been resolved (but will shut down again if there are four further consecutive intervals with errors).
If the error condition persists, we recommend that you open an ETR with IBM to see if they can identify the cause.
By the time that MainView receives control back from the API, the only information available is the return code itself, with no diagnostic information to determine what the API found wrong. Setting a SLIP to produce a dump at the point where we receive the bad return code therefore provides no further information.
IBM should be able to provide a SLIP trap within the IDCSS01 API at the point where it detects the error that results in the bad return code. This ensures that all of the API's internal work areas, etc., are captured in the dump.