In an earlier post I described the components of a data quality program. Today I want to discuss one of those components: data quality rules and a catalog to organize them. A data quality rule is a tool for codifying what accuracy and completeness mean for a data attribute.
Once a rule is established, processes can be put in place to ensure that data follows the rule. Without a rule, there is nothing to measure accuracy and completeness against. Creating new quality rules does not have to take a lot of time. It is best to create each rule as the business need for it becomes obvious; gradually, the collection of rules will become comprehensive. Rules are divided into two types: business process rules and technical rules.
Business process rules ensure that data supports critical business activity. For example, Advancement staff want accurate addresses for recent graduates so that solicitation campaigns are successful. A quality rule can be defined that stipulates a recent graduate’s mailing address must be no older than 90 days, that is, confirmed within the last 90 days. To enact the rule, a requirement to confirm the mailing address could be added to an existing form that prospective graduates must complete. For ease of reporting, a time stamp could also be stored with the mailing address. Without the formality of a data quality rule, it would be more difficult to justify implementing the time stamp and the required confirmation.
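To make the 90-day rule concrete, here is a minimal sketch of how it could be checked once a confirmation time stamp is stored with each address. The field name `address_confirmed_on`, the sample records, and the function name are all hypothetical; only the 90-day threshold comes from the rule described above.

```python
from datetime import date, timedelta

# Threshold from the business rule: address confirmed within the last 90 days.
MAX_ADDRESS_AGE = timedelta(days=90)


def address_is_current(confirmed_on, as_of):
    """Return True if the address was confirmed no more than 90 days before as_of.

    A missing time stamp fails the rule, and so does a time stamp in the
    future (an assumption made here, since it suggests a data entry error).
    """
    if confirmed_on is None:
        return False
    age = as_of - confirmed_on
    return timedelta(0) <= age <= MAX_ADDRESS_AGE


# Hypothetical sample records; "address_confirmed_on" is an assumed field name.
graduates = [
    {"id": "G001", "address_confirmed_on": date(2024, 5, 1)},
    {"id": "G002", "address_confirmed_on": date(2024, 1, 15)},
    {"id": "G003", "address_confirmed_on": None},
]

as_of = date(2024, 6, 1)
stale = [
    g["id"]
    for g in graduates
    if not address_is_current(g["address_confirmed_on"], as_of)
]
print(stale)  # ['G002', 'G003']
```

The list of failing records is what Advancement staff would act on before a campaign, for example by asking those graduates to reconfirm their address.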
Technical rules are the second type of data quality rule. They pertain to data format and completeness. A format rule ensures that data conforms to a specified format, such as a phone number that includes country code, area code, and number. A completeness rule stipulates that a value must be present, for example that a person record must contain a phone number. The need for a technical rule should also align with business purposes so that rules are not created arbitrarily. Returning to the earlier example of contacting recent graduates, if email is the exclusive communication method, it may not be necessary to have a quality rule measuring the completeness of phone numbers.
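As an illustration of the two kinds of technical rule, the sketch below checks a phone number against one assumed format (a plus sign and country code, then a three-digit area code and a seven-digit number split 3-4, separated by spaces or hyphens) and measures how complete the phone field is across a set of records. The pattern, the `phone` field name, and the sample records are assumptions for the example; a real rule would spell out whichever format the institution has agreed on.

```python
import re

# Assumed format: +<country code> <area code> <3 digits> <4 digits>,
# with single spaces or hyphens as separators, e.g. "+1 555 123 4567".
PHONE_PATTERN = re.compile(r"\+\d{1,3}[ -]\d{3}[ -]\d{3}[ -]\d{4}")


def phone_format_ok(phone):
    """Format rule: the whole value must match the agreed pattern."""
    return PHONE_PATTERN.fullmatch(phone) is not None


def phone_complete(record):
    """Completeness rule: the record must contain a non-blank phone value."""
    value = record.get("phone")
    return value is not None and value.strip() != ""


def completeness_rate(records):
    """Share of records that satisfy the completeness rule (0.0 if none given)."""
    if not records:
        return 0.0
    return sum(phone_complete(r) for r in records) / len(records)


# Hypothetical person records.
people = [
    {"name": "A", "phone": "+1 555 123 4567"},
    {"name": "B", "phone": "555-1234"},
    {"name": "C", "phone": ""},
    {"name": "D"},
]

print(completeness_rate(people))  # 0.5
badly_formatted = [
    p["name"] for p in people if phone_complete(p) and not phone_format_ok(p["phone"])
]
print(badly_formatted)  # ['B']
```

Keeping the two checks separate mirrors the distinction in the text: a record can have a phone number and still fail the format rule.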
In recent versions of enterprise systems, both business and technical quality rules can be recorded within a module of the system. However, a free-standing catalog that gathers all rules from all systems into one place is important for discovery and documentation across the institution and is more efficient in the long run. A catalog provides transparency into all rules and makes it easy to assess a quality program holistically, without needing to log in to a data system or risk exposing the actual data it holds. A catalog also makes it easier to govern rules. Governance of quality rules means that someone with the appropriate authority, perhaps a data steward, has approved the rule and thus the rule can be trusted.
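A free-standing catalog can be pictured as a simple structure that stores rule definitions, never the underlying data. The sketch below is one possible shape, not a description of any particular product: the class names, field names, system names, and rule names are all hypothetical, and "trusted" is modeled simply as "has a recorded approver".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class QualityRule:
    """One catalog entry. It describes a rule; it holds no institutional data."""
    name: str
    source_system: str
    rule_type: str  # "business" or "technical"
    description: str
    approved_by: Optional[str] = None  # e.g. the data steward who signed off

    @property
    def is_trusted(self) -> bool:
        # Governance modeled minimally: a rule is trusted once it has an approver.
        return self.approved_by is not None


class RuleCatalog:
    """Gathers rules from every system into one searchable place."""

    def __init__(self) -> None:
        self._rules: Dict[str, QualityRule] = {}

    def add(self, rule: QualityRule) -> None:
        if rule.name in self._rules:
            raise ValueError(f"rule already cataloged: {rule.name}")
        self._rules[rule.name] = rule

    def search(self, rule_type=None, source_system=None, trusted_only=False) -> List[QualityRule]:
        """Return rules matching every filter that was supplied."""
        results = []
        for rule in self._rules.values():
            if rule_type is not None and rule.rule_type != rule_type:
                continue
            if source_system is not None and rule.source_system != source_system:
                continue
            if trusted_only and not rule.is_trusted:
                continue
            results.append(rule)
        return results


catalog = RuleCatalog()
catalog.add(QualityRule(
    name="recent-grad-address-age",
    source_system="Advancement CRM",
    rule_type="business",
    description="Mailing address confirmed within the last 90 days.",
    approved_by="Data Steward, Advancement",
))
catalog.add(QualityRule(
    name="person-phone-present",
    source_system="Student Information System",
    rule_type="technical",
    description="Person record must contain a phone number.",
))

trusted = [r.name for r in catalog.search(trusted_only=True)]
print(trusted)  # ['recent-grad-address-age']
```

Because the catalog stores only descriptions and approvals, anyone can browse it to see which rules exist and which have been approved without being granted access to the source systems.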
In summary, rules are the cornerstone of a data quality program.
If you need help in implementing data governance, remember that IData provides data governance services. A data governance solution like the Data Cookbook can help in successful implementation of data governance at a higher education institution.