Data governance covers a great deal of content, and there needs to be agreement on what is being governed. This blog post covers 12 key content areas and explains the terminology that we use. You may use different terminology, and that is fine, but it is important to agree on terminology within your organization before confusion or frustration sets in. In 2018, we published a blog post on content for higher education institutions. This post updates it with more detail and applies to all types of organizations.
Here are the content areas:
- Business Glossary (functional and technical definitions) (data dictionary) - The glossary is the place for the definitions of your business terms. It does not hold many thousands of entries; it probably has hundreds or low thousands. Typical entries answer questions such as: What is an employee? What is an active employee? What is a receivable? A functional definition should be agnostic to the data system in which it is implemented. Additionally, you have technical definitions, which include technical guidance. There can be multiple technical definitions associated with one functional definition. The set of technical definitions is often referred to as a data dictionary, although we acknowledge that this usage is not universal.
- Data System Inventory (system details and technical data models) - This content covers all the data systems in your organization, big or small, including shadow systems. For each system it answers: What is it? Where does it live? Why does it exist? How do you get access to it? Who is in charge of it? What security classifications does it have? Additionally, a data system inventory holds the full data model for each system (technical metadata). Some people refer to the data model as a data dictionary, which is a source of confusion. We prefer to call it a technical data model or technical metadata. Associate your technical data model with a glossary term through the technical definition. Some people focus heavily on defining, tagging, and augmenting their data model with additional detail.
- Data Deliverables Catalog (Reports, ETLs, Surveys, APIs) (Specifications) - This content is the information and documentation about any data processing you are doing, for example dashboards, reports, surveys, ETLs (extract, transform, load), and APIs. Anything that moves data out of one system and into another should be documented. A specification can describe something you are about to build or something that is already built.
- Data Lineage – This content describes how data moves and transforms from one system to another.
- Data Requests – This content is the data requests that people have submitted through your data request process, along with the tracking and resolution of each request.
- Data Stewards, Functional Classification, Ownership – This content manages the data stewards, functional classification, and ownership of all the other content. It answers questions such as: Who are the stewards? How is the functional organization set up? How does that relate to the other content in the system?
- Data Quality (rules, assessments, monitoring, issue resolution) – This content includes data quality issue reporting and resolution, rules, and assessments. Someone may report a data quality issue that may or may not be an actual data quality problem. For example, someone sees a report saying the division did 2 million dollars of revenue last year but believes the figure was 2.5 million. This could be a definitional problem, a contextual problem, a problem with the report, a misunderstanding, or bad data. Whatever the cause, you must acknowledge the potential quality issue and document the resolution, which is also part of the content. An issue may become a rule, which is another form of data quality content. Once you have a rule, you can run assessments: count how many entries you have and how many violate the rule. The results are assessment content as well.
- Reference Data Management – Reference data is the list of code values (such as postal codes) or statuses (such as employee statuses). These lists need to be managed, either as a master list or per data system, along with the relationships between them.
- Data Access, Security, Sharing, and Privacy Content – This content is about the attributes that you are going to assign to the different content items: data access rules, security classifications, data sharing agreements, privacy codes and classifications.
- Change Management Content – This content is the versioning of content, so that when something changes you can view its history and revert to an earlier version if necessary.
- Data Policies – This content is the overall data policies necessary for the organization.
- External Data Standards and Reporting – This content is outside your control. You receive it from a regulatory agency, a consortium that you are part of, or a vendor that you are involved with. How do you keep these items up to date and synchronized so that you are compliant with the most recent version?
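To make the data quality rule and assessment content from the list above more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the rule, the field names (`status`, `hire_date`), and the sample records are hypothetical and are not taken from any particular data system or product. It shows a rule stored as a named check and an assessment that counts how many entries exist and how many violate the rule.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict[str, object]


@dataclass
class QualityRule:
    """A named data quality rule; is_valid returns True when a record passes."""
    name: str
    is_valid: Callable[[Record], bool]


@dataclass
class Assessment:
    """Result of checking one rule against a set of records."""
    rule_name: str
    total: int
    violations: int

    @property
    def violation_rate(self) -> float:
        # Avoid dividing by zero when there are no records to assess.
        return self.violations / self.total if self.total else 0.0


def assess(rule: QualityRule, records: list[Record]) -> Assessment:
    """Count all records and those that violate the rule."""
    violations = sum(1 for record in records if not rule.is_valid(record))
    return Assessment(rule.name, len(records), violations)


# Hypothetical sample data, invented for this example.
employees: list[Record] = [
    {"id": 1, "status": "active", "hire_date": "2019-04-01"},
    {"id": 2, "status": "active", "hire_date": None},
    {"id": 3, "status": "terminated", "hire_date": "2015-09-15"},
    {"id": 4, "status": "active", "hire_date": ""},
]

# Hypothetical rule: an active employee must have a non-empty hire date.
rule = QualityRule(
    name="Active employees must have a hire date",
    is_valid=lambda r: r["status"] != "active" or bool(r["hire_date"]),
)

result = assess(rule, employees)
print(f"{result.rule_name}: {result.violations} of {result.total} records violate the rule")
```

Keeping the rule separate from the assessment mirrors the distinction drawn above: the rule is one piece of content, and each assessment result is another that can be stored and compared over time.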
You do not have to tackle all the above content at once. Figure out who is affected by each area, prioritize the content, and create a strategy. Start with the areas that have the greatest impact, and look at return on investment. Use just-in-time data governance, where content is created when you need it. For example, create your data deliverables catalog entries when a data request is submitted and approved. We hope this blog post is useful to you in creating and managing data governance content.
IData has a solution, the Data Cookbook, that can aid employees and the organization in their data governance, data stewardship, and data quality initiatives. IData also has experts who can assist with data governance, reporting, integration, and other technology services on an as-needed basis. Feel free to contact us and let us know how we can assist.
Photo Credit: StockSnap