A key part of data governance is an easily accessible knowledge base. An important part of this knowledge base is a data processing catalog, which some call a report catalog or a data flow catalog. In a previous blog post we covered expanding your report catalog into a data processing catalog that includes more than just reports. This blog post covers the many uses of a data processing catalog: inventory, curation, requests, design and build, and user documentation.
The first main use for a data processing catalog is inventory, which includes:
- Import
- Synchronize
- Search and locate
- Access to run or view
- Access to develop or code
Identify the different data processing items your organization has (reports, extracts, integrations, dashboards, etc.) and place information about them in your data processing catalog. Often, some information about these items can be imported into the data processing catalog from the source systems and kept synchronized. The catalog is a central place where data users can search for a data processing item and view basic information about it, such as how to access it, what it is written in, general information, or technical details. Because users first need to learn that a report exists, each catalog entry needs descriptions or details that aid the search. Link each catalog entry to the actual report or dashboard so that it can be easily accessed.
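To make the inventory idea concrete, here is a minimal sketch of what a catalog entry and a search over entries could look like. The field names, the sample entries, and the `example.org` links are all made up for illustration; they are not the schema of any particular catalog product.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """One data processing item (report, extract, integration, dashboard, ...)."""
    name: str
    item_type: str        # e.g. "report", "extract", "dashboard"
    description: str      # text that helps users find the item
    source_system: str    # hypothetical: where the entry was imported from
    url: str              # link to the actual report or dashboard
    written_in: str = ""  # tool or language the item is built with


def search(entries, term):
    """Return entries whose name or description contains the term (case-insensitive)."""
    needle = term.lower()
    return [
        e for e in entries
        if needle in e.name.lower() or needle in e.description.lower()
    ]


# Illustrative entries only; names and links are invented.
catalog = [
    CatalogEntry("Monthly Sales Summary", "report",
                 "Totals of completed orders by region and month.",
                 "reporting-tool", "https://example.org/reports/sales", "SQL"),
    CatalogEntry("Payroll Extract", "extract",
                 "Nightly file of payroll transactions sent to finance.",
                 "hr-system", "https://example.org/extracts/payroll", "Python"),
]

for entry in search(catalog, "orders"):
    print(entry.name, "->", entry.url)
```

The point of the sketch is that the description does the work: a user searching for "orders" finds the sales report only because its entry describes what the report contains, and the entry then links straight to the report.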
The second main use for a data processing catalog is curation, which includes:
- Adding context
- Clarity
- Transparency
- Policy attributes
- Usage
- Other custom attributes
Now that you have a catalog of basic information about your reports and extracts, curation adds information that makes the data processing catalog even more beneficial. This additional information can include the purpose of the report, who owns it, the definitions or glossary entries for terms that appear on it, its policy attributes and security classification, whether it requires a data sharing agreement, its data retention policy, and whether it contains PII. This curated information adds a great deal of value to a data processing catalog entry.
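One way to picture curation is as a set of optional attributes attached to an entry, filled in over time. The sketch below assumes hypothetical attribute names taken from the list above and shows a small helper that reports which attributes are still empty; it is an illustration, not a prescribed data model.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Curation:
    """Curated attributes for one catalog entry (all names are illustrative)."""
    purpose: Optional[str] = None
    owner: Optional[str] = None
    glossary_terms: list = field(default_factory=list)
    security_classification: Optional[str] = None
    requires_data_sharing_agreement: Optional[bool] = None
    retention_policy: Optional[str] = None
    contains_pii: Optional[bool] = None


def missing_attributes(curation):
    """List curated attributes that have not been filled in yet."""
    missing = []
    for f in fields(curation):
        value = getattr(curation, f.name)
        # None means "not answered"; an empty list means no terms linked yet.
        # A boolean False is a real answer and does not count as missing.
        if value is None or value == []:
            missing.append(f.name)
    return missing


partly_curated = Curation(
    purpose="Track completed orders by region.",
    owner="Finance Office",
    contains_pii=False,
)
print(missing_attributes(partly_curated))
```

A list like this can help decide where curation effort goes next, since it separates "we know this report has no PII" from "nobody has checked yet".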
The third main use for a data processing catalog is requests, which includes:
- Request new processes
- Request changes to processes
- Curation request
- Run request
- Access request
After searching the catalog, a data user can submit a request: for a new report if none was found, or for a change to an existing report if one was. The request can also be for curation of the report, for a run of it, or for access to it. As you process a request, you often add new information to the data processing catalog entry.
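The link between the search result and the kind of request can be sketched in a few lines. The request type names mirror the list above; the function name and the rule it encodes are an assumption about how such a form might behave, not a description of a specific tool.

```python
from enum import Enum


class RequestType(Enum):
    NEW_PROCESS = "new process"
    CHANGE = "change to process"
    CURATION = "curation"
    RUN = "run"
    ACCESS = "access"


def available_requests(entry_found):
    """Hypothetical rule: offer request types based on whether the search found an entry."""
    if not entry_found:
        # Nothing in the catalog matched, so the only sensible request is a new item.
        return [RequestType.NEW_PROCESS]
    # An entry exists, so requests are made against that entry.
    return [
        RequestType.CHANGE,
        RequestType.CURATION,
        RequestType.RUN,
        RequestType.ACCESS,
    ]


print([r.value for r in available_requests(entry_found=False)])
print([r.value for r in available_requests(entry_found=True)])
```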
The fourth main use for a data processing catalog is design and build, which includes:
- Requirements
- Design specifications
- Design sign-off
- Approve build
Once there is a request, you work through the process: capture the requirements, design the item, get sign-off on the design, get it built, and then get approval. All of this requirements information must be placed in the data processing catalog for later use. Design is curation on the fly.
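As a rough illustration, the stages listed above can be treated as an ordered checklist, with a note recorded against each stage as it is completed so the information ends up in the catalog entry. The stage names come from the list; the function and the in-order rule are assumptions made for this sketch.

```python
# Stage names follow the list above; the strict ordering is an assumption.
STAGES = [
    "requirements",
    "design specifications",
    "design sign-off",
    "approve build",
]


def complete_stage(history, stage, note):
    """Record a finished stage and its note; stages must be completed in order."""
    expected = STAGES[len(history)] if len(history) < len(STAGES) else None
    if stage != expected:
        raise ValueError(f"expected stage {expected!r}, got {stage!r}")
    history.append((stage, note))
    return history


# Illustrative notes only.
history = []
complete_stage(history, "requirements", "Needs monthly totals by region.")
complete_stage(history, "design specifications", "One table and one trend chart.")
print([stage for stage, _ in history])
```

Keeping the notes alongside the stage names is what makes design "curation on the fly": the requirements captured here become part of the entry that later users search and read.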
The fifth main use for a data processing catalog is user documentation, which includes:
- Provide understanding
- Submit questions
Data users might have a deeper question about a particular report, extract, or dashboard than the data processing catalog entry answers. Document the answers to these questions and make them part of the catalog entry. In some cases, all this information, especially the curated information, can be used to create user documentation for later use.
In this blog post we covered the five main uses of a data processing catalog. Build out the information about a report, extract, or other data processing item over time, as it is needed, such as when a curation request is made. Make sure there is an easy point of entry for people requesting additional information. Make sure that the reports themselves link to this curated data processing content, so people see it when they view a report. As people look at this information, they will often have additional questions or want additional information, which will help build out the catalog content. You want constant improvement of the data processing catalog. The goal is to give people the information they need to use the organization’s data and a greater understanding of that data, which will improve their trust in it.
Additional resources about a data processing or report catalog can be found here.
IData has a solution, the Data Cookbook, that can aid employees and the organization in their data governance, data intelligence, data stewardship, and data quality initiatives. IData also has experts who can assist with data governance, reporting, integration, and other technology services on an as-needed basis. Feel free to contact us and let us know how we can assist.