Duplicate Records

Version 4
    Share:|

    Overview

     

    To find duplicates in any dataset.  The first thing you need to do is identify the duplicates records within the data class.  What are the data Object Class?  The Classes are Base relationship and Base Element.

     

    Note: Please reference the Common Data Model (CDM) for classes, and it's child object.  You can do this by use the atrium Explorer or look at the document that's overwhelming. I recommend you look at the atrium Explorer to understand the data model.

     

    Once you identify the cause of the duplicate records within the dataset, you need to clean the duplicate record within the dataset and then fix the issues that are causing the duplicates.  You also need to figure out an automated way of cleaning the duplicates in most cases.

     

    Here are the BMC documented steps on investigating CMDB data issues.  If you in a server group environment, you need to review this Schedule Recon Job Duplication to help you why duplication of jobs can cause CMDB issue.

     

    Assumptions

     

    Author assume that the user who is reading this document has knowledge of CMDB, SQL, Remedy forms, and database tables.  The author doesn't have a warranty or supports for this paper.  You should always test the SQL query or script within your development or test environment.  SQL query provided in this document does not modify any record in your Action Request database.  It only queries the database.

     

    The SQL queries presented in this document use @dataset has variable for dataset.  @dataset should be replace with dataset names.  For example: bmc.addm, bmc.asset, bmc.asset.sandbox and bmc.import.config etc.

     

    Query for Duplication data class

     

    Here some example to query class of data like OS, Computer, and serial number.

    1. --Show dups in all classes.  You use this to go to each class and find common missing data 
    2. select reconciliationIdentity, ClassId from BMC_CORE_BMC_BaseElement 
    3. Where DatasetId = 'BMC.ASSET' group by ClassId, ReconciliationIdentity having count(*) > 1;

     

    Determined Duplicates

    If you did not change the out of the box recon settings, you'd have the similar issue in your environment. The problem in my environment was identity rule was weak.  See figure 1 for the how to determine duplicates within the CMDB.

     

    Figure 1:

    capture.png

     

    Here is an example to help you understand the syntax:

    1. select reconciliationIdentity from BMC_CORE_Bmc_BaseElement 
    2. Where DatasetId = 'BMC.ASSET' group by ReconciliationIdentity having count(*) > 1; 
    3. -- 'BMC.ASSET' is where @dataset variable can be used.  This gives you duplicate reconciliationIdentity 
    4. -- query BMC.ASSET; You can change the query to different dataset like bmc.config.import or bmc.addm 

      Note:  The above query should not result in any duplicates.  If it does you need to determine the extent of you, duplicate within your dataset.

     

    This query give you the list of duplicate by recon ID and how many of the are in your dataset

    1. select COUNT(*),ReconciliationIdentity from BMC_CORE_BMC_BaseElement 
    2. where DatasetId = @dataset group by ReconciliationIdentity having COUNT (*) > 1;

      You need to copy one of the duplicates into the BMC.CORE: BMC_BaseElement form to determine the cause of you duplicates. Also, you need to search 'FailedAutomaticidentification' to see yes how many records fail to identify in the BaseElement.

     

    Once you identify the failed auto-identification records, you need to determine how many records failed.

    1. select count(*),addmintegrationid from BMC_CORE_BMC_BASEELEMENT 
    2. WHERE datasetid = @dataset group by addmintegrationid having count (*) > 1 ; 
    3. -- you need to query source and dest dataset with this query

      You need to write down record count for support for each dataset.

     

    The query below identifies data that have recon ID has 0 on the dataset.  Thus, determines the data that is failing to Auto-ID itself between two records during a recon process.

    1. select classid, count(*) from BMC_CORE_BMC_BaseElement 
    2. WHERE ReconciliationIdentity = '0'  AND DatasetId = @dataset 
    3. group by classid order by classid;

     

    Now, you have a good set of information for BMC support to help you resolve your duplication issue. "Here more of the queries I use for evaluating these counts in Database queries for evaluating CMDB data distribution in the product documentation and there is also a data report snapshot idea proposed on Communities for a way to make this data more accessible in the product."  Happy reconcile datasets!

     

    Notes:

    - Then took reconciliation id from one record.

    - Did search on BMC_BaseElement form with that recon id, got three records. One from BMC.CORE.CONFIG Dataset & Two entries from BMC.ASSET

    - One record was having related CI & another wasn't.

    - So we did reporting on another record & copied value of "LastREJobrunId" field.

    - Did search with that value on RE: Job_Runs form & found that it is not present.

    - So we ran the following query:

     

    ** Query to find all the CIs which got reconciled on same recon job:

    - select * from BMC_CORE_BMC_BaseElement where ReconciliationIdentity IN('@data') AND LastREJobrunId = '@id' and DatasetId = 'BMC.ASSET'

      (@data is reconciliation ids that we picked from the first query, @id is LastREJobrunId that was copied from reporting)

     

    - We took some CIs instance id from above query & did the search on BMC_BaseElement form to check whether it has any related CIs or not. Found that there are no related CIs to it. So we ran the following query to delete it.

     

    ** Deleted all the CIs which were duplicate:

    - delete from BMC_CORE_BMC_BaseElement where InstanceId in ('@data')

      (@data is InstanceIds from above query)

     

    Here are vim commands to edit the strings ESC Key s/$/',/g  :s/^/'/g :

     

    Defect #SW00410163: Duplicates with the same Reconciliation ID created in same dataset during reconciliation.

     

    To delete the orphan & duplicates without CMDBDiag.  Can do the following:

     

    -- To Class need to updated for the duplicates or Orphan to delete thru recon job

    update BMC_CORE_BMC_OPERATINGSYSTEM set MarkAsDeleted = '1' where SystemName is NULL and SystemClassId is NULL and DatasetId = 'BMC.ASSET'

     

    --to find class duplicates or operant @class is the variable for each class.  Class is BMC_Product or BMC_OperatingSystem

    select instan from @Class where SystemName is NULL and SystemClassId is NULL and DatasetId = 'BMC.ASSET'

     

    select SystemName, SystemClassId, Datasetid from @Class where InstanceId = 'OI-10B3DB2A95A443B082FA3B865B063DDE'

     

    Need to create qualification group for each class