Why Don't We Trust Our Data? And What Can We Do about It?

One of the reasons we got into this business was a desire to help organizations make better use of data. Over the years we have worked with hundreds of organizations to introduce data into the decision-making process, to provide easier and more secure access to data, and to document the range and types of data collected and the potential uses to which data can be put. In this blog post we will discuss why we do not trust our data and what we can do to improve trust.

While we have observed a change with respect to the volume and variety of data collected or otherwise made available, and a similar kind of change in the data-handling and presentation capabilities of business intelligence and analysis tools, one thing that has not changed quite so much is how well organizations are able to utilize their data. While nearly all of our clients express interest in using data strategically, and relying on it across their operations, many of them remain hesitant when it comes to putting these goals into practice. When we ask them, we often hear that they still - after all this time and work - don't really trust their data.

Why don't they trust data? We have heard the following, and more.

  1. They've been burned by inaccurate or misleading data in the past. This leads to a continued lack of confidence in the underlying data, or perhaps in the analysts who are providing it. Too often there is little visibility into or transparency around data sets, sources, and manipulation.

  2. The data doesn't conform to expectations or align with their hopes. Sometimes the results of a data analysis are so far outside expectations that they don't seem credible, especially when those results seem to show poor performance. Sometimes a rigorous analysis suggests a plan of action that is daunting, or risky.

  3. Data consumers don't understand data well enough to take informed action around it. We have observed that analytics of any kind can easily become a black box, where the methods are murky, the outcomes are opaque, and data providers cannot adequately explain their work or justify their conclusions. At a certain managerial level, consumers are likely not to have much familiarity with the real-life concepts to which the data refers. And data literacy capabilities often lag or are underdeveloped, which can render any sophisticated analysis inaccessible. In some cases managers who lack data literacy can be actively hostile ("we don't want to hear what the nerds have to say"), and in other cases managers whose data literacy capabilities, real or imagined, outstrip the capabilities of others in the organization, will take on unwarranted gatekeeping and interpretation roles.

  4. Data and analytics are delivered in an inappropriate or otherwise ineffectual format. While charts and graphs may convey certain information better than tables or bullet points, not all visualizations are created equal. A dashboard with too much detail, or too jarring a color scheme, or too many objects competing for attention, may well be worse than a dashboard or traditional report containing less information. No tool can overcome the confusion that arises when labels and data terminology are not explained, when assumptions and methods are not named, or when unexpected data points are not annotated.

  5. It takes too long to verify and deliver the data. Not all decisions are baked into a multi-year strategic plan; in fact, an increasing number of decisions that would be improved with recourse to relevant data are time-sensitive. If those decisions must be made on incomplete or partial data, it is not too big a leap to decide to continue to operate without data. Unverified data is often easy to lay hands on, and it can drown out or otherwise displace verified results. When the perfect becomes the enemy of the good, you often get neither.

  6. Useful data is simply not available. This can be related to the point above, where it is too troublesome to incorporate certain kinds of data. Organizations where data is siloed anywhere in its lifecycle find it difficult to put a wide range of trustworthy data in front of executives. This can also manifest in a form of analysis paralysis, which is related to (2) and (3) above. We should also note that the human desire to see patterns can feed the belief that running one more test, or developing one more KPI, will finally yield that dispositive result. (It won't.)

As we review this list, some common themes emerge.

First, there are concerns about data quality: the accuracy, completeness, and timeliness of data. These concerns about data quality have been with us since we started keeping digital records, but the explosion of the amount of data we collect, and the number of systems and tools we use to collect it, make these concerns ever more pressing.
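To make two of these dimensions concrete, here is a minimal sketch of how completeness and timeliness might be measured on a small record set. The record layout, field names, and the 30-day freshness threshold are all assumptions made for illustration, not a prescribed method. Accuracy is left out on purpose, since it generally requires comparison against a trusted reference source.

```python
from datetime import date

# Hypothetical sample records; the fields are illustrative assumptions.
records = [
    {"id": 1, "email": "a@example.edu", "updated": date(2024, 5, 1)},
    {"id": 2, "email": None, "updated": date(2024, 4, 28)},
    {"id": 3, "email": "c@example.edu", "updated": date(2023, 1, 15)},
    {"id": 4, "email": "d@example.edu", "updated": date(2024, 4, 30)},
]


def completeness(rows, field):
    """Share of rows in which the given field is populated."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


def timeliness(rows, field, as_of, max_age_days):
    """Share of rows whose date field is within max_age_days of as_of."""
    if not rows:
        return 0.0
    fresh = sum(1 for r in rows if (as_of - r[field]).days <= max_age_days)
    return fresh / len(rows)


as_of = date(2024, 5, 2)  # assumed reporting date
print(completeness(records, "email"))              # 0.75
print(timeliness(records, "updated", as_of, 30))   # 0.75
```

Even simple ratios like these, published alongside a report, can give data consumers some visibility into the condition of the underlying data.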

Second, problems often exist at various points in the data request and delivery pipeline. Reported data is inconsistent with expectations, often because we have skimped in the requirements-gathering phase. Sources and manipulations of data are inscrutable, so it becomes increasingly difficult for end-users to validate results. And as we noted above, analyzed or processed data is too often delivered in formats that obscure rather than clarify.

Third, data literacy capabilities at all levels of all kinds of organizations have not kept pace with data tools, nor with the growing volume and sophistication of data-related questions. Data literacy issues can plague organizations everywhere: they can cause substandard data collection practices that torpedo data integrity, they can cause sophisticated statistical analyses to be performed on inappropriately modeled or even irrelevant data, and they can cause data-savvy leaders to lose trust in their data teams and try to act as citizen data scientists, to everyone's detriment.

Data governance and data intelligence are often proposed as solutions to help organizations build trust in data. But data governance does not just happen, and even when ambitious data governance efforts are underway, trust in the data does not necessarily follow. Just because we all agree on a piece of business terminology doesn't mean that dashboards featuring that term are built the same way or even from the same sources. Just because we have a policy governing the rogue purchase of new software doesn't mean a user isn't maintaining a satellite data set and using it to source reports or to validate reported data. And, of course, when data governance efforts are perceived as onerous regulations that prevent people from performing their work, they often engender behaviors that lead to silos, data dueling, and further distrust of data.

Data governance or data intelligence as a one-size-fits-all solution may make you feel better, but whether and how much it helps you utilize data more effectively is a big question. For many years now, our tools and our practice have exemplified a pragmatic approach to data governance, specifically using the practice of data governance to solve specific problems.

If it turns out you have legitimate data quality issues, then focusing your data governance efforts on improving data quality will pay off. Maybe that's as simple as identifying the system of record for particular pieces of information, and making public which office or unit is responsible. Maybe it's more complicated than that, and requires writing new standards and training protocols, and instituting a monitoring process.

Developing a system to certify key data products (e.g., reports and dashboards), and to curate the data sets used to generate those products, can be the path forward to build up a collection of trusted outputs based on trustworthy data. Documentation, collaborative approval, and transparency are critical to this effort, and modeling them in a limited fashion now may make it easier to replicate on a wider scale down the road.
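As a rough illustration of the idea, the sketch below treats a data product as certified only when every source data set has been curated and every required approver has signed off. The class names, the approver roles, and the rule itself are hypothetical, chosen for the example; they do not describe how any particular tool implements certification.

```python
from dataclasses import dataclass, field

# Hypothetical approver roles; a real organization would define its own.
REQUIRED_APPROVERS = {"data_steward", "business_owner"}


@dataclass
class DataSet:
    name: str
    curated: bool = False  # has the data set been reviewed and documented?


@dataclass
class DataProduct:
    name: str
    sources: list
    approvals: set = field(default_factory=set)


def is_certified(product):
    """Certified = has sources, all sources curated, all approvals present."""
    has_sources = bool(product.sources)
    all_curated = all(ds.curated for ds in product.sources)
    approved = REQUIRED_APPROVERS <= product.approvals
    return has_sources and all_curated and approved


enrollment = DataSet("enrollment_snapshot", curated=True)
scratch = DataSet("analyst_spreadsheet", curated=False)

dashboard = DataProduct(
    "Enrollment Dashboard",
    sources=[enrollment],
    approvals={"data_steward", "business_owner"},
)
ad_hoc = DataProduct(
    "Ad Hoc Report",
    sources=[enrollment, scratch],
    approvals={"data_steward"},
)

print(is_certified(dashboard))  # True
print(is_certified(ad_hoc))     # False
```

The point of keeping the rule this explicit is transparency: anyone can see why a given report is or is not on the trusted list.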

If data literacy is your stumbling block, then an organizational knowledge base consisting of key data concepts, terminology, usage, and presentation methods may help prepare your employees to better interact with and respond to organizational analytics. Perhaps it makes sense to also provide (or even require) training in data concepts, the scope and range of data collected, and in how your organization seeks to use data.

Our data intelligence and data governance tool, the Data Cookbook, can be instrumental in your journey to trusted and trustworthy data. Our roadmap services can help you identify barriers to trusted data, strategies to break down those barriers, and a prioritized schedule of tasks to realize those strategies.

Not sure if you're ready for those? Just want to talk about your situation? Take a look at our library of data governance resources, which includes additional material on data trust. Schedule a free consultation call with us. Do something. Without trust in your data, your team cannot make wise use of this increasingly valuable asset. We hope you found this blog post beneficial.

The Data Cookbook can assist an organization in its data governance, data intelligence, data stewardship, and data quality initiatives. IData also has experts who can assist with data governance, reporting, integration, and other technology services on an as-needed basis. Feel free to contact us and let us know how we can assist.

Photo Credit: StockSnap_REVXK9HRNF_climbertools_datatrust_Bp #B1241

Aaron Walker
About the Author

Aaron joined IData in 2014 after over 20 years in higher education, including more than 15 years providing analytics and decision support services. Aaron’s role at IData includes establishing data governance, training data stewards, and improving business intelligence solutions.
