IData Insights Blog

Impacts and Sources of Data Quality Problems

Written by Jim Walery | Jan 2, 2019 11:01:44 PM

Data quality is a key component of data governance (also called data intelligence). This post lists the impacts of poor-quality data and identifies its sources. If you can’t get traction to establish a formal data quality program, you can use the information in this post as talking points to persuade administrators at your higher education institution to formalize data quality activities.

Poor data quality affects many areas of an organization (such as a higher education institution), including:

  • Productivity - Poor data quality slows productivity because it forces staff to perform a manual workaround or to submit a quality issue. Under pressure from a critical deadline, a staff member may make ad hoc data corrections themselves instead of fixing the error systematically.
  • Reputation - Poor data quality damages reputation. A student or former employee who experiences poor data quality can take to social media to share a negative experience with the institution. This could affect the decisions of prospective employees (or students) to join the organization. Also, if inaccurate information is submitted to agencies, the media might report the inaccuracies.
  • Decision Making – The better the information, the better the decisions made from it. Decisions about the selection behavior of prospective students, the retention behavior of current students and the satisfaction behavior of alumni are critical for higher education institutions and deserve accurate data.
  • Communication - Inaccurate contact information affects the ability to reach prospective students, students, employees and organizations that the higher education institution must contact. An institution wants to communicate the necessary information to the right people at the right time.

Identifying all of the sources of data quality problems most prevalent at an institution is the first step in prioritizing and then eliminating the sources.

Here are some common sources and some suggested solutions:

1. Data entry error – For example, a student’s Social Security Number (SSN) or zip code is entered incorrectly, such as with transposed digits. Data entry errors are reduced by hiring excellent individuals who are trained well on the use of the software and are rewarded for accuracy over speed.

2. Mismatch between multiple systems or integration/migration issues – When information is imported into a new data system, codes can change. One system may spell out Ohio and the new system may use OH. If a data steward is consulted during migration projects, these errors can be reduced.

3. Lack of data validation or software issue – Invalid data includes nulls, such as a birth date that is missing from a student’s record, and illogical data, such as a birth date occurring after an application date. Software validation errors occur if a name or email address is too long for the data entry field or a man is listed on the women’s soccer team. The solution is to report the issue in the data quality issue tracking system. The data steward will review the issue, create a quality rule and work with IT staff to change the software so that the issue will not happen again. The affected records will also need to be updated.

4. Data does not conform to business rules or policies – For example, a student who is a senior does not have a declared major. When an unclear business data quality rule is reported to the appropriate data steward through the data quality issue tracking system, the current rule should be reviewed and updated. IT should be involved to implement the business quality rule in the appropriate system.

5. Duplicate data – For example, a student is listed in the same campus organization multiple times because they used different email addresses. If a point-of-entry check is possible and a warning can be flagged, this would help eliminate duplicates. If no point-of-entry check is possible, the data steward will need to regularly review the database and merge duplicate records.

6. Obsolete data – For example, a prospective student for a community college moves out of the country and thus should no longer get marketing emails and letters. A solution for obsolete data is to identify what makes the data obsolete and create automatic reports or cleanup tools that will either notify the data steward of obsolete data or remove it.
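
Several of the checks suggested in sources 2 through 6 can be expressed as small, repeatable rules. The following is a minimal Python sketch, not a description of how any particular student information system or product implements them. The Student record, its field names, the STATE_CODES mapping, and the assumption that a stable student_id exists for matching duplicates are all hypothetical and chosen only to mirror the examples above.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


@dataclass
class Student:
    # Hypothetical record layout, not taken from any real system.
    student_id: str
    email: str
    state: str
    class_level: str
    major: Optional[str]
    birth_date: Optional[date]
    application_date: date
    country: str = "US"


# Source 2: map spelled-out values from an old system to the new system's codes.
STATE_CODES = {"ohio": "OH", "indiana": "IN"}  # illustrative subset only


def normalize_state(value: str) -> str:
    cleaned = value.strip()
    return STATE_CODES.get(cleaned.lower(), cleaned.upper())


# Sources 3 and 4: each rule returns True when the record passes.
RULES: dict[str, Callable[[Student], bool]] = {
    "birth date is present": lambda s: s.birth_date is not None,
    "birth date precedes application date": lambda s: (
        s.birth_date is None or s.birth_date < s.application_date
    ),
    "senior has a declared major": lambda s: (
        s.class_level != "senior" or bool(s.major)
    ),
}


def failed_rules(student: Student) -> list[str]:
    """Return the names of the rules this record violates."""
    return [name for name, passes in RULES.items() if not passes(student)]


# Source 5: the same student listed more than once on one organization's roster.
# Matching on student_id (an assumption) catches entries with different emails.
def duplicate_members(roster: list[Student]) -> set[str]:
    seen: set[str] = set()
    duplicates: set[str] = set()
    for s in roster:
        if s.student_id in seen:
            duplicates.add(s.student_id)
        else:
            seen.add(s.student_id)
    return duplicates


# Source 6: prospects who moved abroad should be flagged for the data steward.
def obsolete_prospects(
    prospects: list[Student], home_country: str = "US"
) -> list[Student]:
    return [s for s in prospects if s.country != home_country]
```

A rule list like RULES could be run at the point of entry to warn the person entering the data, or on a schedule to produce a report for the data steward. Which approach fits depends on whether the software allows point-of-entry checks.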

In other blog posts we discuss data quality programs, data quality rules, data quality assessments and data quality issues. Feel free to check out our other data quality resources in our data quality resources blog post. We hope this post on the impacts and sources of poor data quality helped.

IData has a solution, the Data Cookbook, that can aid employees and the institution in data governance. IData also has experts who can assist with data governance, reporting, integration and other technology services on an as-needed basis. Feel free to contact us and let us know how we can assist.

 
