Clarifying Data Governance: What is a Business Glossary, a Data Dictionary, and a Data Catalog?

I often see conflicting and overlapping definitions of business glossaries, data dictionaries, and data catalogs, and consensus on standard definitions of each remains elusive. Some of this confusion is easily understood considering how data governance typically evolves within an organization. For instance, it can be efficient to start by creating a data dictionary or data catalog and then build a data governance program on top of it; the same goes for a data quality initiative. This approach delivers quick wins in data governance while embracing the spirit of ‘agile’. I put forth the following as suggested definitions and elements of each. My intent is to capture the joint value of these assets, provide specific definitions of each, explain how they fit into a data governance program, and provide examples of each.

Summary of Business Glossary, Data Dictionary, and Data Catalog

Business Glossary

A business glossary is focused on business language and is easily understood in any business setting, from boardrooms to technology standups. Business terms aren’t meant to define data, metadata, transforms, or locations, but rather to define what each term means in a business sense. What do we mean by a conversion? A sale? A prospect? These types of questions can be answered with a business glossary. Having a business glossary brings a common understanding of the vocabulary used throughout an organization. The scope of a business glossary should be enterprise-wide, or at least division-wide in cases where different divisions have significantly different business terminology. Because of the scope and the expertise needed, the business glossary is owned by the business rather than by technology. Often a data steward or business analyst will have this as a sole responsibility.
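
As a rough illustration, a glossary entry can be sketched as a small record holding only business-facing fields. Everything in this sketch is an assumption made for the example: the `GlossaryTerm` fields, the `build_glossary` helper, the owner names, and the sample definitions are hypothetical, not a standard schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryTerm:
    """A business term described in business language, with no storage details."""
    name: str
    definition: str
    owner: str                 # business-side owner, e.g. a data steward
    scope: str = "enterprise"  # or a division name where vocabulary differs
    synonyms: tuple = ()       # other words the business uses for the same term


def build_glossary(terms):
    """Index terms by name and synonym; refuse two definitions for one word."""
    glossary = {}
    for term in terms:
        for word in (term.name, *term.synonyms):
            key = word.lower()
            if key in glossary:
                raise ValueError(
                    f"{word!r} is already defined as {glossary[key].name!r}"
                )
            glossary[key] = term
    return glossary


# Illustrative definitions only; a real organization would write its own.
EXAMPLE_TERMS = [
    GlossaryTerm(
        "Prospect",
        "A person or company that has shown interest but has not yet purchased.",
        owner="Sales Operations",
        synonyms=("Lead",),
    ),
    GlossaryTerm(
        "Conversion",
        "A prospect completing a first purchase.",
        owner="Marketing",
    ),
]

glossary = build_glossary(EXAMPLE_TERMS)
print(glossary["lead"].definition)
```

Rejecting a second definition for the same word is one way to enforce the "common understanding" goal: a conflict has to be settled by the business owners instead of living on as two competing entries.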

Data Dictionary

A data dictionary should be focused on the descriptions and details involved in storing data. There should be one data dictionary for each database in the enterprise. The data dictionary includes details about the data such as data type, permissible length, lineage, transformations, and so on. This metadata helps data architects, engineers, and data scientists understand how to join, query, and report on the data, and it also explains the granularity of the data. Because of the need for technical and metadata expertise, ownership of a data dictionary lies within technology, frequently with roles such as database administrators, data engineers, data architects, and/or data stewards.
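
To make the kinds of metadata listed above concrete, here is a minimal sketch of data dictionary entries for a single database. The `ColumnEntry` fields, the `sales_db` table and column names, and the helper functions are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnEntry:
    """Technical metadata for one column in one database."""
    table: str
    column: str
    data_type: str
    max_length: Optional[int]  # permissible length; None when not applicable
    lineage: str               # upstream source of the value
    transformation: str        # how the stored value is derived
    grain: str                 # what one row of the table represents


# Hypothetical entries for a single database; values are illustrative.
SALES_DB_DICTIONARY = [
    ColumnEntry("orders", "order_id", "CHAR", 12,
                "web checkout service", "none", "one row per order"),
    ColumnEntry("orders", "order_total", "DECIMAL(10,2)", None,
                "sum of order line amounts", "converted to USD",
                "one row per order"),
]


def columns_for(dictionary, table):
    """Return the dictionary entries that belong to one table."""
    return [entry for entry in dictionary if entry.table == table]


def fits(entry, value):
    """True when a string value respects the entry's permissible length."""
    return entry.max_length is None or len(value) <= entry.max_length


for entry in columns_for(SALES_DB_DICTIONARY, "orders"):
    print(f"{entry.table}.{entry.column}: {entry.data_type} ({entry.grain})")
```

Keeping the grain next to each column is a design choice for this sketch; it puts the granularity in front of anyone deciding how to join or aggregate the table.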

Data Catalog

The data catalog serves as a single-point directory for locating information, and it further provides the mapping between the business glossary and the data dictionaries. The data catalog is an enterprise-wide asset providing a single reference source for the location of any data set required for varying needs such as operational, BI, analytics, and data science use. Just as with the business glossary, if one division of an enterprise is significantly different from the others, it would be reasonable for the data catalog to be exclusive to that division rather than to the enterprise. The data catalog would most reasonably be developed after the successful creation of both the business glossary and the data dictionaries, but it can also be assembled incrementally as the other two assets evolve over time. A data catalog may be presented in a variety of ways, such as an enterprise data marketplace. The marketplace would serve as the distribution or access point for all, or most, enterprise-certified data sets for a variety of purposes. Because the mapping work requires both business and technical expertise, assembling the data catalog is a collaborative effort.
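
The mapping role of the catalog can be sketched as a lookup from a business term to the physical locations that hold it. This is a simplified illustration under stated assumptions: the database, table, and column names are hypothetical, and a real catalog would carry far more metadata, such as certification status and access rules.

```python
from collections import defaultdict

# Each link ties one business glossary term to one column described in some
# database's data dictionary. All names below are hypothetical.
CATALOG_LINKS = [
    ("Sale", "sales_db", "orders", "order_total"),
    ("Sale", "warehouse", "fact_sales", "net_amount"),
    ("Prospect", "crm_db", "leads", "lead_id"),
]


def build_catalog(links):
    """Group physical locations under the business term they represent."""
    catalog = defaultdict(list)
    for term, database, table, column in links:
        catalog[term.lower()].append(f"{database}.{table}.{column}")
    return dict(catalog)


def locate(catalog, term):
    """Return every known location for a business term, or an empty list."""
    return catalog.get(term.lower(), [])


catalog = build_catalog(CATALOG_LINKS)
print(locate(catalog, "Sale"))
```

Because links can be added one at a time, a structure like this also fits the incremental approach, growing as the glossary and the dictionaries evolve.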

[Figure: comparison table of the Business Glossary, Data Dictionary, and Data Catalog]

Summary

Of course, the success you realize from the assembly and use of these data governance assets is entirely dependent on other pillars of a solid data governance program such as a data quality initiative, master data management, compliance and security concerns, etc. Please share your thoughts in the comments section or by direct message.

Dirk Garner is Principal Consultant at Garner Consulting providing data strategy consulting and advisory services. He can be contacted via email: dirkgarner@garnerconsulting.com or through LinkedIn: http://www.linkedin.com/in/dirkgarner

See more on the Garner Consulting blog: http://www.garnerconsulting.com/blog-busglossdatadictdatacat.html

 
