Periodically, you will want to modify or remove data in your organization that is incorrect, incomplete, improperly formatted, unnecessary, or duplicated. In other words, you will want to clean your data. Clean data improves decision-making, boosts efficiency, and reduces inconsistencies, which makes staff members and the organization happier. This blog post covers data cleaning, including the why, who, when, and how (culture, documentation, processes, training, framework, tools).
Every organization is different in how it uses and accumulates data, so every organization should have a data cleanup plan tailored to its needs. Having a plan for data cleanup allows you to focus on the data that is important to the success of your organization. Of course, if possible, you want to do just-in-time cleanup of your data issues: for example, fixing capitalization or filling in a missing area code in a phone number as you see them, and resolving data quality issues and requests as you receive them. Contacts in your database whose emails have bounced or been returned, or who have left the organization (such as through retirement), are just some of the data cleanups that need to be done. You want to clean redundant data, remove unnecessary or no-longer-needed data, and fix bad data. Emphasize quality over quantity. Remember that you are paying to store your data, so make sure the information is useful to your organization and helps create more efficient operations.
Who – Everyone in the organization is responsible for the cleaning of data. If you see an issue, fix it or report it. Determine who will perform specific cleaning tasks and assign data stewards to specific data areas. Of course, the assignment will depend on what the data is, what caused the issue, and how it will be fixed. An integration issue will probably need to be fixed by a technical staff member. Missing information can usually be filled in by a functional staff member who knows how to use the software. An important part of data cleaning is knowing who is responsible.
When – Data cleaning occurs at various times such as:
- Do not let it happen in the first place – We know, easier said than done. Strive to enter the correct data at the point of entry whenever possible (when entry is done by staff). This requires proper staff training and educational resources. Having data governance in place, data quality rules set up, and integrations available is important. Make sure that these integrations are maintained and updated when changes occur.
- As they occur or are seen – Make sure that the culture of your organization empowers your staff to fix (or report) data problems as they occur or are seen. Have data request and data quality resolution processes in place, with known points of entry, so they can be easily used.
- Immediate notification – On import or integration, have a notification sent when an error is detected and routed immediately to the appropriate person to resolve.
- On a regular basis – Certain data cleaning tasks should be on a regular schedule. This could be end of day (EOD), end of month (EOM), end of quarter (EOQ), or end of year (EOY). For example, at the end of each month we look at marketing contacts that are not associated with an organization (usually because they used a personal email address) and try to associate them with one. We also review possible duplicates of contacts and organizations and decide whether they are OK to leave alone or need to be merged. These tasks are on our EOM checklist. Determine how often to run data quality assessments against your data quality rules to discover issues that need to be resolved. Assign these regular tasks to specific individuals.
- When identified to resolve – These are bigger tasks that probably involve project management. Discuss the best time to do the project (for example, if business is slow in December, that might be a good time for a major data cleanup project).
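A scheduled data quality assessment like the one described above could be sketched as follows. This is only an illustration: the field names (`email`, `organization`, `phone`), the three rules, and the ten-digit phone assumption (North American numbers) are hypothetical and would need to be replaced by your own data quality rules.

```python
import re
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the record passes

# Deliberately loose email pattern; an assumption, not a full validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical data quality rules for a contact record.
RULES = [
    Rule("email_format", lambda r: bool(EMAIL_RE.match(r.get("email") or ""))),
    Rule("has_organization", lambda r: bool((r.get("organization") or "").strip())),
    # Assumes 10-digit numbers, so fewer digits suggests a missing area code.
    Rule("phone_has_area_code",
         lambda r: len(re.sub(r"\D", "", r.get("phone") or "")) >= 10),
]

def assess(records):
    """Return (record id, rule name) for every rule a record fails."""
    issues = []
    for rec in records:
        for rule in RULES:
            if not rule.check(rec):
                issues.append((rec["id"], rule.name))
    return issues

contacts = [
    {"id": 1, "email": "ann@example.org", "organization": "Example Org",
     "phone": "(555) 010-1234"},
    {"id": 2, "email": "bob@gmail", "organization": "", "phone": "010-1234"},
]

issues = assess(contacts)
for record_id, rule_name in issues:
    print(f"contact {record_id} failed rule {rule_name}")
```

The resulting list of issues could be added to an EOM checklist or routed to the data steward responsible for contact data.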
How – Some ways to do data cleaning:
Set up data request and data quality issue reporting and resolution processes. Assign data stewards to be involved in the resolution processes. Educate your staff on these data governance-related processes and make sure that they are aware of the points of entry. Learn from the resolutions, solve root causes, and put new data quality rules in place when the need is identified. Use the results to prevent future data issues: staff can examine them to determine which data is having issues and why. Train your staff on the standards for input. Data cleanup takes time, so allocate staff time to perform this important function. Make resolving data issues part of their job description. Let staff know that they need to “weed the data garden” and that they are important to clean, trusted data.
Have a data governance framework and data intelligence content in place so that data can be used effectively and efficiently. Having a solution like the Data Cookbook in place is a big help. Use the tools your system vendors provide, or those created by your organization, to clean your data. There are tools that look for duplicates. For example, your CRM system might have a tool that shows possible duplicate contacts or organizations. Review the duplicates and decide whether a merge should be done. Use tools to find missing information, improper formats, duplicate records, etc.
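If your system has no built-in duplicate finder, a simple check can be sketched in a few lines. The example below is a minimal illustration, assuming contact records with hypothetical `id` and `email` fields; it groups contacts whose email addresses match after lowercasing and trimming whitespace. It only surfaces candidates, and a person still decides whether to merge them.

```python
from collections import defaultdict

def normalize(value):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join((value or "").lower().split())

def possible_duplicates(contacts):
    """Group contact ids that share a normalized email address."""
    groups = defaultdict(list)
    for contact in contacts:
        key = normalize(contact.get("email"))
        if key:  # skip contacts with no email; those are a separate issue
            groups[key].append(contact["id"])
    return [ids for ids in groups.values() if len(ids) > 1]

contacts = [
    {"id": 1, "name": "Ann Lee", "email": "Ann.Lee@example.org"},
    {"id": 2, "name": "ann lee", "email": " ann.lee@example.org "},
    {"id": 3, "name": "Bob Ray", "email": "bob.ray@example.org"},
]

print(possible_duplicates(contacts))  # [[1, 2]]
```

Matching on email alone is a design choice made for brevity; real duplicate detection often also compares names, phone numbers, or organizations.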
Documentation is very important for clean data. Understand your data. Build a business glossary and data processing (report) catalog. Identify critical data systems you are managing. Figure out the data fields in each data system. Look for data silos and eliminate the ones you no longer need. Have data policies in place regarding retention and removal of data. Remember to communicate these policies to the staff and create ways to enforce the policies.
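One possible way to help enforce a retention policy is a periodic check that flags records for review. The sketch below assumes a hypothetical three-year retention period and a hypothetical `last_activity` date on each record; your own policy will define the actual period and the field that matters.

```python
from datetime import date, timedelta

RETENTION_DAYS = 365 * 3  # hypothetical policy: keep about three years of data

def past_retention(records, today, retention_days=RETENTION_DAYS):
    """Return ids of records whose last activity is older than the cutoff."""
    cutoff = today - timedelta(days=retention_days)
    return [r["id"] for r in records if r["last_activity"] < cutoff]

records = [
    {"id": 1, "last_activity": date(2020, 1, 15)},
    {"id": 2, "last_activity": date(2024, 6, 1)},
]

# Pass the date explicitly so the check is repeatable.
flagged = past_retention(records, today=date(2025, 1, 1))
print(flagged)  # [1]
```

Flagging rather than deleting keeps a person in the loop, so the removal itself can follow the documented policy.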
Make data governance a part of the culture of the organization. Have a discussion about data cleaning and identify the gaps in your current approach. Review your data cleaning processes, tasks, and tools at least once a year. Remember that training and education are important to clean, trusted data. We hope that you found this blog post beneficial and wish you great success in cleaning your data.
IData has a solution, the Data Cookbook, that can aid employees and the organization in their data governance, data intelligence, data stewardship, and data quality initiatives. IData also has experts who can assist with data governance, reporting, integration, and other technology services on an as-needed basis. Feel free to contact us and let us know how we can assist.