Data quality is a key component of data governance (also called data intelligence). This post lists impacts of poor quality data and identifies sources of poor quality data. If you can’t get traction to establish a formal program, you might use the information in this post as talking points at your higher education institution. Use this information to persuade administrators to formalize data quality activities.
Poor data quality impacts many areas at an organization (such as a higher education institution) including:
Here are some common sources and some suggested solutions:
1. Data entry error – For example, a student’s Social Security Number (SSN) or zip code is entered incorrectly, such as with transposed digits. Data entry errors are reduced by hiring excellent individuals who are trained well on the software and are rewarded for accuracy over speed.
2. Mismatch between multiple systems or integration/migration issues – When information is imported into a new data system, codes can change. One system may spell out Ohio and the new system may use OH. If a data steward is consulted during migration projects, these errors can be reduced.
3. Lack of data validation or software issue – Invalid data includes nulls, a birth date that does not exist in a student’s record, or illogical data, such as a birth date occurring after an application date. Software validation errors occur if the name or email address is too long for the data entry field or a man is listed on the women’s soccer team. The solution is to report the issue in the data quality issue tracking system. The data steward will review the issue, create a quality rule, and work with IT staff to change the software so that the issue will not happen again. The affected records will also need to be updated.
4. Data does not conform to business rules or policies – For example, a student who is a senior does not have a declared major. When an unclear business data quality rule is reported to the appropriate data steward through the data quality issues tracking system, the current rule should be reviewed and updated. IT should be involved to implement the business quality rule into the appropriate system.
5. Duplicate data – For example, a student is listed in the same campus organization multiple times due to using different email addresses. If a point-of-entry check is possible and a warning can be flagged, this would help eliminate duplicates. If no point-of-entry check is possible, the data steward will need to regularly review the database and merge duplicate records.
6. Obsolete data – For example, a prospective student for a community college moves out of the country and thus should no longer get marketing emails and letters. A solution for obsolete data is to identify what makes the data obsolete and create automatic reports or cleanup tools that will either notify the data steward of obsolete data or remove it.
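To make a few of the sources above concrete, here is a minimal Python sketch of how some of these checks might be automated. It is an illustration under stated assumptions, not a description of any particular product: the `StudentRecord` fields, the `STATE_CODES` mapping, and the function names are all hypothetical, and the duplicate check assumes each student has a stable ID that is independent of email address.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

# Hypothetical mapping used to reconcile state values between two systems.
STATE_CODES = {"ohio": "OH", "indiana": "IN"}


@dataclass
class StudentRecord:
    # All field names are assumptions made for this illustration.
    student_id: str
    state: str
    birth_date: Optional[date]
    application_date: date
    class_level: str
    major: Optional[str]


def normalize_state(value: str) -> str:
    """Source 2: map a spelled-out state to its code; leave unknown values alone."""
    cleaned = value.strip()
    if len(cleaned) == 2:
        return cleaned.upper()
    return STATE_CODES.get(cleaned.lower(), cleaned)


def find_issues(rec: StudentRecord) -> List[str]:
    """Sources 3 and 4: return the validation and business rule violations found."""
    issues = []
    if rec.birth_date is None:
        issues.append("missing birth date")
    elif rec.birth_date >= rec.application_date:
        issues.append("birth date on or after application date")
    if rec.class_level == "senior" and not rec.major:
        issues.append("senior without declared major")
    return issues


def find_duplicate_members(
    memberships: List[Tuple[str, str, str]]
) -> List[Tuple[str, str]]:
    """Source 5: flag (student_id, organization) pairs that appear more than once.

    Each membership is (student_id, organization, email); email is ignored
    on purpose, since differing emails are what hide the duplicate.
    """
    seen = set()
    duplicates = []
    for student_id, organization, _email in memberships:
        key = (student_id, organization)
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates
```

In practice, the output of checks like these would feed the data quality issue tracking system so that the data steward can review each finding, rather than the script changing records on its own.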
In other blog posts we discuss data quality programs, data quality rules, data quality assessments and data quality issues. Feel free to check out our other data quality resources in our data quality resources blog post. We hope this post on the impacts and sources of poor quality data helped.
IData has a solution, the Data Cookbook, that can aid the employees and the institution in its data governance. IData also has experts that can assist with data governance, reporting, integration and other technology services on an as needed basis. Feel free to contact us and let us know how we can assist.
(image credit StockSnap_4RFELNP8DB_ImpactDataQuality #1039)