Generating Configuration Management Databases Using Data-Driven Synthesis


The traditional configuration management database (CMDB) is big, complex, difficult to grow and change, and very expensive. Compiling data through data-driven synthesis gives IT organizations a better and more cost-effective method of providing the capabilities of a CMDB. This article explains data-driven synthesis, how it is used to generate CMDBs, and its measurable benefits.

When IT professionals think of a configuration management database (CMDB), most think of a huge, complex, and very expensive data warehouse-like platform. It revolves around a single large database that requires many integrations, both to feed it data from upstream sources and to drive the downstream targets that consume its data. Most well-known commercial off-the-shelf CMDBs, whether they are installed locally or run in the cloud, follow this paradigm.

Anyone who has installed, fed, and maintained one of these traditional CMDBs over the long term has war stories to tell, and those stories support the industry data showing consistently high failure rates for CMDB implementation projects. They can also rattle off a long list of challenges that make CMDB projects fail, or at least fall short of delivering the real value the business expects. Two examples are the cost to design, build, deploy, and maintain integrations, and the cost and complexity of identifying, harvesting, and continuously maintaining relationships between the things we load into and track in our CMDBs (i.e., configuration items, also known as CIs).

[Figure: The common data compiler/data synthesizer]

Significant computing resources are now much more affordable and accessible, and paradigms like big data have advanced the means for dealing with large and constantly changing sets of data. As a result, there are new and interesting methods for solving the same types of problems that CMDBs were originally designed for. One of these methods is to use data compilation to automatically generate the entire CMDB and all of its supporting knowledge constructs (e.g., relationships, reports, views, and visualizations) directly from your enterprise data. Using data compilers to generate your CMDB is based on an approach known as data-driven synthesis.

Understanding Data-Driven Synthesis

A compiler, quite simply, takes something in and changes its representation. For example, a software compiler takes in source code files that are written in textual formats and converts them to binary constructs like libraries, modules, executables, and packages.

Synthesis is a form of compilation that turns the outputs of a compiler into more advanced constructs than just software. For example, since the early 1990s, the semiconductor industry has been leveraging synthesis to convert high-level hardware description languages (such as Verilog and VHDL) into functional semiconductor designs that are used to fuel simulation, emulation, and manufacturing for very complex semiconductors.

More recently, these synthesis concepts have found their way into the data analytics fields, where informatics and data science professionals use data and rules to create complex interactive visualizations for identifying and taking advantage of data patterns, trends, and anomalies.

Data-driven synthesis is a compilation paradigm that takes in baseline data, along with processing rules, in order to yield something new as its output. The type of data you’re working with could dictate the types of outputs you can synthesize. For example, if you’re working with musical notes, you can automatically synthesize songs. If you’re working with elements on the periodic table, you could possibly synthesize chemical compounds. In the case of an enterprise, if you’re working with enterprise data, you can synthesize an entire CMDB.

Synthesizing Your CMDB from Data

Creating a CMDB from data requires the use of a data compiler that knows how to ingest data and output your CMDB. This is not simple, so it is always recommended that you leverage existing tools before trying to build your own.

As with any compiler, there are inputs and outputs. The inputs to the synthesizer will be your data—servers, software, applications, databases, people, organizations, etc.—along with data processing rules that direct the compiler on what to do with the data, such as how to find and harvest relationships between CIs, how to format CIs, and how to index CIs.
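The idea of data plus processing rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: the CI records, the Rule class, and the harvest_relationships function are all hypothetical names invented here. Each rule says that a field on one CI type points at a CI of another type, and the function turns every match into a relationship.

```python
from dataclasses import dataclass

# Hypothetical CI records; in practice these would come from exports of
# your source systems (spreadsheets, CSV files, API dumps, and so on).
CIS = [
    {"type": "server", "id": "srv-01", "owner": "alice"},
    {"type": "application", "id": "billing", "host": "srv-01", "owner": "bob"},
    {"type": "person", "id": "alice"},
    {"type": "person", "id": "bob"},
]


@dataclass(frozen=True)
class Rule:
    """Hypothetical processing rule: a field on one CI type refers to another CI type."""
    source_type: str
    field: str
    target_type: str
    label: str


RULES = [
    Rule("application", "host", "server", "runs_on"),
    Rule("application", "owner", "person", "owned_by"),
    Rule("server", "owner", "person", "owned_by"),
]


def harvest_relationships(cis, rules):
    """Return (source_id, label, target_id) triples for every rule that matches."""
    known = {(ci["type"], ci["id"]) for ci in cis}
    triples = []
    for ci in cis:
        for rule in rules:
            if ci["type"] != rule.source_type:
                continue
            target = ci.get(rule.field)
            # Only keep relationships whose target CI actually exists.
            if target is not None and (rule.target_type, target) in known:
                triples.append((ci["id"], rule.label, target))
    return triples


RELATIONSHIPS = harvest_relationships(CIS, RULES)
for source, label, target in RELATIONSHIPS:
    print(f"{source} --{label}--> {target}")
```

Keeping the rules as data rather than code means that adding a new relationship type is a one-line change followed by a recompile.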

[Figure: Data-driven synthesis of a CMDB from enterprise data]

An important thing to note is that the CMDB synthesizer, because it is a compiler, can easily be tied to your build and deployment systems, facilitating better DevOps.

Upon execution, your synthesizer will generate a data network, from which it will harvest all CI types, CI instances, and, most importantly, massive quantities of semantic relationships. Traditional CMDBs try to store these outputs in one big database. However, data synthesizers may or may not use databases; they might just output the results to a fully traversable HTML file tree that acts as one big multi-component document.
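To make the HTML file tree idea concrete, here is a minimal sketch that writes one page per CI into a directory per CI type and links related CIs to each other. The layout (root/type/id.html), the sample data, and the write_html_tree function are assumptions made for illustration; real synthesizers may organize their output quite differently.

```python
import html
import tempfile
from pathlib import Path

# Hypothetical compiled output: CIs plus (source, label, target) relationships.
CIS = [
    {"type": "server", "id": "srv-01"},
    {"type": "application", "id": "billing"},
]
RELATIONSHIPS = [("billing", "runs_on", "srv-01")]


def write_html_tree(cis, relationships, root):
    """Write one page per CI under root/<type>/<id>.html and return the page paths."""
    root = Path(root)
    type_of = {ci["id"]: ci["type"] for ci in cis}
    for ci in cis:
        items = []
        for source, label, target in relationships:
            if source == ci["id"]:
                # Relative link from this CI's page to the related CI's page.
                href = f"../{type_of[target]}/{target}.html"
                items.append(
                    f'<li>{html.escape(label)}: '
                    f'<a href="{html.escape(href)}">{html.escape(target)}</a></li>'
                )
        page = (
            f"<html><body><h1>{html.escape(ci['id'])}</h1>"
            f"<ul>{''.join(items)}</ul></body></html>"
        )
        type_dir = root / ci["type"]
        type_dir.mkdir(parents=True, exist_ok=True)
        (type_dir / f"{ci['id']}.html").write_text(page, encoding="utf-8")
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.html"))


# Usage: compile into a throwaway directory and inspect the result.
with tempfile.TemporaryDirectory() as tmp:
    PAGES = write_html_tree(CIS, RELATIONSHIPS, tmp)
    BILLING_PAGE = (Path(tmp) / "application" / "billing.html").read_text(encoding="utf-8")
```

Because the output is just files, it can be served by any web server or opened directly from disk, with no database behind it.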

A good synthesizer will then go far beyond just generating data for your CMDB. It will also format, categorize, and organize all your data into constructs that people can readily digest, such as diverse CI views, inventories, reports, dashboards, visualizations, indexes, glossaries, or catalogs. These are called knowledge constructs, and they make it easier for people to understand the data that comes from the data network. The goal is to automatically generate knowledge constructs that are usable by many people, both technical and nontechnical, including your business stakeholders.
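As a small example of a knowledge construct, the sketch below groups CIs by type into an inventory and renders it as plain text that a nontechnical reader could scan. The sample records and the build_inventory and render_inventory functions are hypothetical, chosen only to show the shape of the idea.

```python
from collections import defaultdict

# Hypothetical CI records for illustration.
CIS = [
    {"type": "server", "id": "srv-02"},
    {"type": "application", "id": "billing"},
    {"type": "server", "id": "srv-01"},
]


def build_inventory(cis):
    """Group CI ids by CI type, with types and ids sorted alphabetically."""
    grouped = defaultdict(list)
    for ci in cis:
        grouped[ci["type"]].append(ci["id"])
    return {ci_type: sorted(ids) for ci_type, ids in sorted(grouped.items())}


def render_inventory(inventory):
    """Render the inventory as indented plain text with a count per type."""
    lines = []
    for ci_type, ids in inventory.items():
        lines.append(f"{ci_type} ({len(ids)})")
        lines.extend(f"  {ci_id}" for ci_id in ids)
    return "\n".join(lines)


INVENTORY = build_inventory(CIS)
print(render_inventory(INVENTORY))
```

The same grouped structure could feed an index, a catalog page, or a dashboard count, which is why it helps to build it once and render it several ways.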

The Benefits of Data Compilers and Synthesizing Your CMDB

Using a data compiler to synthesize your CMDB has many advantages over a traditional CMDB.

Data diversity: Traditional CMDBs rarely work well beyond highly technical infrastructure data. It appears this is because without data compilers, there are no obvious tools to create relationships between nontechnical CIs. Data compilers let you mix and match almost any data, so you can commingle and interrelate both technical and nontechnical data and relationships.

Simplicity: Data compilers are very simple to install, set up, and use. There are no databases or application servers, and there is no need for expensive, complex, and direct integrations to other systems.

Speed: Because they’re simple, you can often have a data compiler installed, running, and generating your first fully synthesized CMDB in less than an hour. They’re also blazing fast and can generate millions of relationships in minutes.

Support for agile: One of the biggest advantages of synthesizing your CMDB with a data compiler is that compilation is inherently iterative, which fits agile development well. When you want to add new data or change existing data, you simply recompile, as many times as it takes to get your data right.

Higher quality and consistency: Because data compilers are fast, iterative, and agile, you can quickly identify data problems. This means your data gets cleaned far quicker than with traditional CMDBs. In fact, good data compilers can easily do things like identify complex data holes and even create negative data reports and views.
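One possible reading of a negative data report is a list of what is absent: required fields that are empty and references that point at CIs that do not exist. The sketch below takes that reading; the REQUIRED_FIELDS table, the sample records, and the negative_report function are assumptions made for illustration.

```python
# Hypothetical CI records with two deliberate data holes.
CIS = [
    {"type": "server", "id": "srv-01"},  # no owner recorded
    {"type": "application", "id": "billing", "host": "srv-99", "owner": "bob"},
    {"type": "person", "id": "bob"},
]

# Hypothetical rule table: fields each CI type is expected to carry,
# where every value should be the id of another known CI.
REQUIRED_FIELDS = {
    "server": ["owner"],
    "application": ["host", "owner"],
}


def negative_report(cis, required_fields):
    """Return (ci_id, field, problem) tuples for missing or dangling values."""
    known_ids = {ci["id"] for ci in cis}
    findings = []
    for ci in cis:
        for field in required_fields.get(ci["type"], []):
            value = ci.get(field)
            if not value:
                findings.append((ci["id"], field, "missing"))
            elif value not in known_ids:
                findings.append((ci["id"], field, f"unknown reference: {value}"))
    return findings


FINDINGS = negative_report(CIS, REQUIRED_FIELDS)
for ci_id, field, problem in FINDINGS:
    print(f"{ci_id}.{field}: {problem}")
```

Running a check like this on every compile gives a shrinking to-do list for whoever owns the source data.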

Versioning and history: Because the outputs of a data compiler are nothing more than snapshots, you can easily version the entire compiled output, allowing you to see and compare changes to your enterprise over time. A data compiler can also create intranet web sites (i.e., digital libraries) with catalogs, indexes, inventories, reports, and visualizations, which allows the CMDB and the asset register to be fully collapsed into the intranet and replaces three separate tools with one.
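Comparing two compiled snapshots can be as simple as a dictionary diff. The following sketch assumes each snapshot is a mapping from CI id to its record; the sample data and the diff_snapshots function are hypothetical.

```python
# Hypothetical snapshots from two compiler runs, keyed by CI id.
OLD = {
    "srv-01": {"type": "server", "os": "linux-5"},
    "srv-02": {"type": "server", "os": "linux-5"},
}
NEW = {
    "srv-01": {"type": "server", "os": "linux-6"},
    "billing": {"type": "application", "host": "srv-01"},
}


def diff_snapshots(old, new):
    """Report which CI ids were added, removed, or changed between snapshots."""
    old_ids, new_ids = set(old), set(new)
    return {
        "added": sorted(new_ids - old_ids),
        "removed": sorted(old_ids - new_ids),
        "changed": sorted(i for i in old_ids & new_ids if old[i] != new[i]),
    }


CHANGES = diff_snapshots(OLD, NEW)
print(CHANGES)
```

If the compiled output is a file tree, committing each run to a version control system is another way to get the same history with ordinary diff tools.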

Multiple instances: Traditional CMDBs are often so complex and expensive that you can only afford one. With data compilers, you can have as many different CMDBs as you’d like. For example, you could have one for infrastructure data and one for sales data.

Lower costs: Data compilers are often a fraction of the cost of traditional CMDBs. The fact that you can have data-driven CMDBs up and running in less than a day also factors in, because populating most traditional CMDBs with relevant data can require significant investment over many months or even years.

Data-Driven Synthesis: More Value for Less Effort

Data-driven synthesis is a rapidly evolving model that can help you automatically generate a CMDB directly from your data. Doing so can help enterprises get to the real value of a CMDB much faster, with far less effort, higher levels of quality, and much less investment than traditional database-centric CMDBs. Simply put, this means better business results and value.

CMCrossroads is a TechWell community.
