IData Insights Blog

Best Way to Build a Data Dictionary

Written by Jim Walery | May 23, 2018 12:58:00 PM

A data dictionary is a repository of definitions of the data you use to make decisions at your organization. Drawing on our experience, this post suggests a preferred way to build your data dictionary. It covers the main drivers, the processes you need, who is involved and some practical suggestions.

At a minimum, each entry in a data dictionary contains the name of the definition, a functional description and a technical description.
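
As a rough illustration, a single entry with those three parts could be modeled as shown here. This is only a sketch: the class name DataDefinition, its field names, the add_definition helper and the sample "Active Student" definition are all hypothetical, and are not taken from the Data Cookbook or any other product.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DataDefinition:
    """One data dictionary entry (hypothetical structure for illustration)."""

    name: str                    # the term being defined
    functional_description: str  # what the term means in business language
    technical_description: str   # where the data lives and how it is derived


# A tiny in-memory dictionary keyed by lower-cased name. This is an assumption
# for the example; a real data dictionary would live in a shared knowledge base.
dictionary: Dict[str, DataDefinition] = {}


def add_definition(definition: DataDefinition) -> None:
    """Store a definition so it can be looked up by name, ignoring case."""
    dictionary[definition.name.lower()] = definition


add_definition(
    DataDefinition(
        name="Active Student",
        functional_description="A student enrolled in at least one course this term.",
        technical_description="Rows in a (hypothetical) enrollment table where status = 'E'.",
    )
)

print(dictionary["active student"].functional_description)
```

Keeping the functional and technical descriptions as separate fields lets a report consumer read the plain-language meaning while a report developer can go straight to the technical detail.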

First, let’s discuss the main drivers for having an online data dictionary:

  • Agreement on the data (one version of the truth), which leads to better decision making
  • Transparency, which saves time
  • Shared knowledge, which avoids repeat questions
  • Reference documentation for reporting, which allows for faster report development
  • A source of training, which means faster onboarding of new employees

Next, let’s talk about the processes you will need when building data definitions:

  1. A process for data requests, which includes a ticket system that can handle different types of requests (reports, extracts, dashboards, etc.). The requester should approve the final product when it is delivered.
  2. A process for gathering requirements, which includes a conversation with the requester so that the reason for the request is known. The details are written down in the specifications.
  3. A process for approving definitions, which is usually done by the data stewards or the data owners.
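
To make the approval process more concrete, here is a minimal sketch of how a definition might move from draft to approved. The status names, the allowed transitions, the role strings and the change_status function are assumptions made for this example; your own process may have different steps.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in review"
    APPROVED = "approved"


# Allowed status changes (assumed for illustration). An approved definition
# can go back to draft when it needs to be revised.
ALLOWED = {
    Status.DRAFT: {Status.IN_REVIEW},
    Status.IN_REVIEW: {Status.APPROVED, Status.DRAFT},
    Status.APPROVED: {Status.DRAFT},
}

# Roles that may approve a definition, following the process described above.
APPROVER_ROLES = {"data steward", "data owner"}


def change_status(current: Status, new: Status, actor_role: str) -> Status:
    """Return the new status, or raise if the change is not permitted."""
    if new not in ALLOWED[current]:
        raise ValueError(f"Cannot move from {current.value} to {new.value}")
    if new is Status.APPROVED and actor_role not in APPROVER_ROLES:
        raise PermissionError("Only a data steward or data owner can approve")
    return new


status = Status.DRAFT
status = change_status(status, Status.IN_REVIEW, actor_role="report developer")
status = change_status(status, Status.APPROVED, actor_role="data steward")
print(status.value)
```

Anyone can submit a definition for review in this sketch, but only a data steward or data owner can approve it, which mirrors the third process in the list.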

Then, let’s look at who is involved in building data definitions:

  • Data definition requester
  • Report developer and reporting manager, who need to understand the data they are putting in reports
  • Data steward or owner, who will need to review and approve the definition
  • Report consumer, who will look at the data and want to know the definition of what they are looking at

Finally, here are our suggestions and thoughts about data dictionary creation that we have gathered from our experience:

  1. Write down your list of main drivers for a data dictionary
  2. Gather data definitions in a central knowledge base like the Data Cookbook
  3. Don’t try to define everything at once. Build up your data definitions over time, focusing on the most widely used data first. Grow your data dictionaries organically based on the questions that are being asked
  4. Get data dictionary approval by the data stewards or data owners. Data consumers need to know who the data dictionary approvers are. Data stewards need to know their responsibilities
  5. Provide a platform for data definition collaboration such as the Data Cookbook
  6. Focus on process: define the data definition approval process, the data request process and the process for gathering requirements
  7. Allow the data dictionaries to be accessed right from the reports themselves to save time and frustration
  8. You don’t need a big project budget or a formal committee to debate every data definition
  9. Remember that the world changes so data definitions will change
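
Suggestion 3 in the list above (start with the most widely used data and let the questions being asked guide you) could be sketched like this. The list of questions, the set of already defined terms and the rank_terms function are hypothetical; in practice the input might come from your ticket system or from the questions report developers hear most often.

```python
from collections import Counter

# Hypothetical log of terms that people asked about.
questions = [
    "retention rate",
    "Active Student",
    "retention rate",
    "FTE",
    "active student",
    "Retention Rate",
]


def rank_terms(asked, already_defined):
    """Return (term, count) pairs for undefined terms, most asked first."""
    counts = Counter(term.strip().lower() for term in asked)
    return [
        (term, count)
        for term, count in counts.most_common()
        if term not in already_defined
    ]


# Terms that already have a definition are skipped.
backlog = rank_terms(questions, already_defined={"fte"})
for term, count in backlog:
    print(f"{term}: asked {count} times")
```

The terms at the top of the backlog are the ones to define next.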

If you follow these suggestions, your organizational data will improve, the data will be used more effectively and your organization will get more benefit from it. Other resources regarding data dictionaries and definitions can be found in this blog post. If you are interested in learning more about the Data Cookbook, data governance (data intelligence) or our data governance services, feel free to contact us.

(image credit StockSnap_6T45JNFUXH_ToysBuildingDictionary #1062)