As a member of BMC Customer Support, I have helped many customers work through data issues in their CMDB. I have found that many issues are caused by implementation decisions, unexpected dependencies, and poor data loaded into the CMDB long before problem symptoms appeared. I recently contributed a topic to the BMC Atrium CMDB Suite 8.1 product documentation, Investigating CMDB Data issues, which shares my experience on how to investigate, troubleshoot, and resolve these kinds of issues when they do occur. The goal of this blog post is to share ways to successfully manage data in your CMDB so you never have to look at that troubleshooting document.
Several of the Atrium webinars have discussed best-practice principles for using BMC Atrium CMDB Suite and BMC Atrium Discovery and Dependency Mapping features, which provide great context on overall planning, implementation decisions, and designing for performance. In this post, I will restrict the topic to the data management aspects of the overall plan and data flow.
Define your use cases and determine what data you need in the CMDB
The first step is to define your use cases and determine what data you need in the CMDB. The single worst decision you can make is to put everything you have in the CMDB and let the tools sort out the mess later. A good use case answers each of the following:
- Which devices do you need to manage in the CMDB?
- What kind of data – which CI classes and relationships?
- Which applications and users will consume the data in the CMDB?
- How will the users and applications consume or access the data?
A few examples illustrate the point:
Example 1: I need server and application information for all servers in the data center. This includes computer systems, application services, databases and relationships between them. I will use this data to manually define service models which define how these CIs support business services. I will use Event Management monitoring of these servers to implement Intelligent Ticketing by creating incidents in ServiceDesk that show the impact on critical business services. Automated compliance jobs will verify these servers in the data center are configured properly and create change requests to bring them under compliance if necessary.
Example 2: Service Desk technicians need comprehensive, current information about workstations so they can investigate issues effectively.
The first example illustrates a case where several classes of CIs and relationships need to be populated to the CMDB. Some features, such as defining the relationship to business services, must be implemented in Atrium CMDB Suite because the features which implement them require the data in this location. There are also fewer server systems, they change less frequently, and they are managed more closely. So in addition to the requirement to store the data in the CMDB for several applications, there is more value in propagating the information to a central location where all the different applications can leverage it.
The second example is an opportunity for federation. Since the main use case is for service desk technicians to view the information, check whether this information can be viewed in the discovery application by reference to the computer system. This scenario, where you populate minimal information to the CMDB, reduces the amount of data to be managed. Now assume a third use case is added: performing software license management on Microsoft Office products. This functionality in BMC Remedy Asset Management requires the product information to be stored in Atrium CMDB. To accommodate this third use case, you can populate product information from these computer systems to the CMDB, but only Microsoft Office products, because those are the only products the use case requires.
This strategy of populating only what you need in the CMDB reduces the amount of data to be managed, which simplifies many aspects of managing the data.
Identify the fewest, best data providers necessary for the use cases
Let’s assume we have the following data providers available for the use cases described above:
- BMC Atrium Discovery and Dependency Mapping (ADDM), which discovers information about servers in the data center
- BMC BladeLogic Client Automation, which discovers information about workstations, and the last time installed products were accessed.
- Microsoft System Center Configuration Manager (SCCM), which contains information on both servers in the data center and end user workstations.
- A data export from a legacy asset tracking application, which has manually entered information which cannot be discovered from the systems.
Which data provider should you use for discovering servers in the data center? Which data provider should you use for discovering workstations?
Putting the data export aside for the moment, there are two products that have server information. Let’s assume that after comparing BMC ADDM and Microsoft SCCM, the former was considered the better product for server discovery in the data center because it discovers more comprehensive application relationships. Is there value in also discovering the same servers using Microsoft SCCM and populating that data to the CMDB? There are three factors to consider before adding a second data source for the same CIs:
- Do both data sources provide identical values for sufficiently many identification attributes?
- Does the second data source provide better data than the first for non-identification attributes?
- Are the strengths of the second product needed for planned use cases?
Note: The first requirement is described further in a later section on identifying data deficiencies.
Similarly, the same evaluation should be made when considering the best single data provider to use for discovering workstations and user devices.
In my experience, the best starting point is to use:
- one discovery provider for servers in the data center, which provides the best information about servers, and only discovers servers
- a second discovery provider for workstations and non-server devices, which is specialized for this purpose, and only discovers workstations
- no devices are discovered by multiple discovery sources
Secondary data providers can be added later if they meet all three requirements described above.
By contrast, consider the case of using a single discovery provider for both servers and workstations because it is already deployed and appears to be good enough for the initial use cases for the CMDB. A year later you determine there is a need for a more specialized discovery product for servers to capture the detail you need, and a third discovery product to discover and manage workstations. Since operators have been working with the existing data, you cannot simply remove it and replace it with a new data provider, because the existing relationships must be retained. This can lead to a situation where data from extra data providers must be managed in the CMDB for historical reasons, even though they provide no benefit over the preferred data provider.
The key point here is that evaluating discovery requirements up front can save a lot of effort managing data in the CMDB in the future.
Handling legacy data imports
It is often recommended to use an automatic discovery process to populate and periodically update data in the CMDB. Discovery and data source providers are key participants in data management in the CMDB: if the data source does not maintain the data, who does?
The general principle is: use automatic discovery for any attributes which can be automatically discovered by a discovery product, and use manual data only for attributes which are not discoverable.
A frequent challenge I see in managing data in the CMDB is legacy data or manual data that was imported as the first data source in the CMDB. This introduces any data issues from the legacy system directly into the CMDB, which may not be recognized until much later when it is difficult to fix.
A far better approach is to do the following:
- Use an automatic discovery data provider to populate a source data set, and reconcile it as the first into the golden dataset.
- Load the legacy data into a source dataset, for example BMC.LEGACY
- Create a Reconciliation job which does NOT use standard rules. Add an Identification activity which does NOT generate IDs for the source dataset (BMC.LEGACY in our example). Define identification rules based on the attributes populated in the legacy dataset which should match the discovered data, but do not auto-identify, so any data that does not find a match remains unidentified.
- Run the reconciliation job.
- Investigate any CIs in the Legacy dataset which are not identified. This may reveal cases where the legacy data has incorrect data for identification attributes, cases where the CI is no longer in the environment, or cases where discovery has not discovered the device for other reasons.
- Add a precedence group which assigns precedence for attributes which are not discoverable, or for which the legacy data is preferred.
- Add a merge job to the custom Reconciliation job, and run the reconciliation job to merge the data together.
- Look at the data in the golden dataset to verify it correctly includes the best data from both sources.
This process ensures legacy data only updates data which is discoverable and available in the environment, and only updates attributes which have more useful values.
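Under the assumption that CI records can be exported as simple attribute dictionaries, the investigation step above can be sketched in Python. The attribute names and function below are illustrative, not a product API; the real work is done by the Reconciliation Engine:

```python
# Illustrative sketch only -- not the Reconciliation Engine. It mimics the
# review step: flag legacy CIs whose identification attributes find no match
# in the discovered golden data, so they can be investigated before a merge.
# The identification attributes chosen here are assumptions for the example.

ID_ATTRS = ("Hostname", "Domain", "SerialNumber")

def unidentified_legacy_cis(legacy, golden):
    """Return legacy CIs with no identification match in the golden dataset."""
    golden_keys = {tuple(ci.get(attr) for attr in ID_ATTRS) for ci in golden}
    return [ci for ci in legacy
            if tuple(ci.get(attr) for attr in ID_ATTRS) not in golden_keys]
```

Each CI this returns is worth investigating by hand: it may have incorrect identification data, it may have been decommissioned, or discovery may simply not have reached it yet.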
Analyze data dependencies and deficiencies
Earlier, I introduced the need for data sources to provide identical identification attribute values for the same CI. This is important so they can be identified as the same CI through Reconciliation Identification. For example, the standard identification rules for computer system include combinations of the following attributes: TokenId, Hostname, Domain, SerialNumber, isVirtual, and PartitionId. Correctly matching the same computer system discovered by two different data sources requires matching values in at least one of the sets of identification attributes. For example, the third standard identification rule for ComputerSystems looks for a match on three attributes: Hostname, Domain, and isVirtual. Looking at the data itself is important here. If the hostname is stored as “Unknown” when it cannot be discovered, this can present challenges when using this attribute value for identification.
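A minimal sketch of how rule-based identification behaves, including the “Unknown” pitfall, may make this concrete. The rule sets and guard values below are illustrative assumptions, not the actual Reconciliation Engine logic:

```python
# Hypothetical sketch of rule-based CI identification across two datasets.
# Attribute names mirror the standard ComputerSystem rules mentioned in the
# text; the exact rule sets and the "unreliable value" guard are assumptions.

IDENTIFICATION_RULES = [
    ("TokenId",),
    ("SerialNumber", "isVirtual", "PartitionId"),
    ("Hostname", "Domain", "isVirtual"),
]

# Placeholder values that look populated but carry no real identity.
UNRELIABLE_VALUES = {None, "", "Unknown"}

def same_computer_system(ci_a, ci_b):
    """Return True if any rule's attributes all match with reliable values."""
    for rule in IDENTIFICATION_RULES:
        values_a = [ci_a.get(attr) for attr in rule]
        values_b = [ci_b.get(attr) for attr in rule]
        if any(v in UNRELIABLE_VALUES for v in values_a + values_b):
            continue  # an "Unknown" hostname must not drive identification
        if values_a == values_b:
            return True
    return False
```

The guard shows why placeholder values are dangerous: without it, two unrelated devices that both report Hostname “Unknown” would be identified as the same CI.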
Note: Many classes of CI derive part of their identity from the computer system which hosts it. For example, the default identification rules for an IPEndpoint are:
- <Related to the same computer system> AND <matching values of TokenId>
- <Related to the same computer system> AND <matching values of Name>
Since the earlier step validates that data requirements are met for the Computer System, this step just confirms that either the TokenId or Name attribute is populated consistently between the two datasets.
The key takeaway here is to understand the data dependencies and examine the data before reconciling it in the CMDB. Failure to do this can lead to duplicate CIs in the CMDB and other challenges managing data in the CMDB. It is better to understand the deficiencies in the data before introducing it into production.
Another kind of data deficiency is incomplete data in the data source. For example, the combination of ManufacturerVendor and ProductName are used for product catalog normalization. How should you handle the case where the data source only includes one of these values, potentially even after discovering the data from the device itself? Normalization aliases can address some of these challenges, but the right way to address them depends on the particular situation and the uniqueness of the data which is available. See knowledge article KA403062 for further detail.
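To illustrate how aliases can bridge inconsistent source values, here is a hypothetical sketch; the alias table, canonical names, and function are invented for illustration and are not the Normalization Engine's actual mechanism:

```python
# Invented example of alias-based normalization: map raw discovered
# (manufacturer, product) strings onto canonical catalog entries, and flag
# incomplete combinations for manual review. All names are illustrative.

ALIASES = {
    ("Microsoft Corp.", "Office Std 2013"):
        ("Microsoft Corporation", "Microsoft Office Standard"),
    ("Microsoft", "Office Std 2013"):
        ("Microsoft Corporation", "Microsoft Office Standard"),
}

def normalize_product(manufacturer, product):
    """Return a canonical (manufacturer, product) pair, or None when the
    combination is incomplete and cannot be normalized automatically."""
    if not manufacturer or not product:
        return None  # incomplete source data: needs manual investigation
    return ALIASES.get((manufacturer, product), (manufacturer, product))
```

The sketch highlights the limitation in the text: when one of the two values is missing at the source, no alias lookup can recover it, and the record needs a case-by-case decision.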
Determine how to handle deleted data
Another important element of managing data in the CMDB is requiring discovery sources and data providers to update data in the CMDB when it changes. This includes removing the CIs and relationships when they are removed from the data source or the environment. The data provider updates the CIs in the source dataset to be soft-deleted, Reconciliation propagates this change to the data in the golden dataset, and Reconciliation purge jobs remove it from both datasets in the CMDB. At the completion of this process, the data is no longer in the data source, nor in the source dataset in CMDB, nor in the golden dataset in CMDB – so it is consistent throughout.
There is sometimes a requirement to retain Computer Systems after they are removed from the environment for reporting purposes, or to retain relationships to incidents or change requests in the BMC Remedy IT Service Management Suite. This presents a few challenges:
- When a computer system is soft-deleted, all the relationships to hosted CIs are also soft-deleted.
- Some of the computer system details are not visible because the relationships are deleted.
- The computer system still resides in the golden dataset with identification attributes populated.
- If the computer system is later replaced with a device with similar identification attribute values, the new computer system can reconcile with and update the old one, creating a mix of obsolete and current data.
This series of events can cause big problems for data management. One effective method to address this situation appears to be:
- Use a reconciliation Copy dataset activity to archive data to a separate dataset before the computer system is soft-deleted. This preserves the relationships between the computer system and its processors, memory, and IPEndpoints in a separate dataset for archival purposes.
- Purge classes other than ComputerSystem in the golden dataset.
- When the Computer System is soft-deleted, update the AssetLifecycleStatus and Name to indicate it is no longer in the environment, and update the identification attributes so they will never match. For example, append "-old" to all of the values.
This approach seems to avoid most data management problems because it removes most of the data as appropriate and retains only a minimum number of CIs which no longer exist in the environment but are kept for historical reasons.
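The retirement step can be sketched as follows. The attribute names come from the text, but the function, the chosen status value, and the list of identification attributes to mangle are illustrative assumptions:

```python
# Hedged sketch of the retirement step: mark a retired computer system and
# mangle its identification attributes so a replacement device with similar
# values can never reconcile against it. Names and values are illustrative.

ID_ATTRS = ("TokenId", "Hostname", "SerialNumber")

def retire_computer_system(ci):
    """Return a retired copy of the CI, leaving the original untouched."""
    retired = dict(ci)
    retired["AssetLifecycleStatus"] = "End of Life"
    retired["Name"] = ci.get("Name", "") + "-old"
    for attr in ID_ATTRS:
        if retired.get(attr):
            retired[attr] = retired[attr] + "-old"  # will never match live discovery
    return retired
```

Mangling every populated identification attribute is the key move: it guarantees that no identification rule, whichever attribute combination it uses, can match the retired record against a new device.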
Examine data distribution periodically
In my experience, following the guidelines described above leads down a much more successful path for managing data in your CMDB. It reduces the number of participants and overlapping data and it identifies and avoids problem areas by evaluating data quality and dependencies before failures are encountered downstream. Periodically examining the amount of data in your CMDB is another good practice. Some interesting counts would include:
- Number of Computer Systems per dataset
- Number of CIs in related classes per dataset
- Number of CIs per class by value of NormalizationStatus
- Number of unidentified CIs by class and dataset
- Number of soft-deleted CIs by class and dataset
These queries can provide insight into the size and location of data discrepancies, and where to focus your efforts in investigating data issues.
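Assuming CI records can be exported as dictionaries carrying common CMDB fields, the counts above can be sketched as follows. The field and class names here are illustrative, not guaranteed to match your schema:

```python
# Sketch of the periodic health counts, computed over an in-memory export of
# CI records. Field names (DatasetId, ClassId, ReconciliationIdentity,
# MarkAsDeleted) are illustrative assumptions about the exported data shape.

from collections import Counter

def distribution_counts(cis):
    """Tally CI records along the dimensions listed in the text."""
    return {
        "computer_systems_per_dataset": Counter(
            ci["DatasetId"] for ci in cis
            if ci["ClassId"] == "BMC_COMPUTERSYSTEM"),
        "cis_per_class_and_dataset": Counter(
            (ci["ClassId"], ci["DatasetId"]) for ci in cis),
        "unidentified_per_class": Counter(
            ci["ClassId"] for ci in cis
            if not ci.get("ReconciliationIdentity")),
        "soft_deleted_per_class": Counter(
            ci["ClassId"] for ci in cis if ci.get("MarkAsDeleted")),
    }
```

Comparing these tallies run-over-run makes drift visible: a dataset whose computer system count diverges from its peers, or a growing pool of unidentified or soft-deleted CIs, points directly at where to investigate.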
I shared some of the queries I use for evaluating these counts in Database queries for evaluating CMDB data distribution in the product documentation, and there is also a data report snapshot idea proposed on Communities for a way to make this data more accessible in the product.
Hopefully this article provides some insight into ways to manage data in your CMDB more successfully by using a few good practices including:
- Define your use cases and determine what data you need in the CMDB
- Identify the fewest, best data providers necessary for the use cases
- Handle legacy data imports conservatively
- Analyze data dependencies and deficiencies
- Determine how to treat deleted data
- Examine data distribution periodically
These practices have worked well for me to avoid some of the challenges of managing data in the CMDB. I am interested in your experiences in this area. Are there other good practices that should be added to the list? Please share feedback or your own tips and techniques used to address challenges and lead to a low-maintenance CMDB with high-quality data.