Over the past few months, we have explored the concept of data literacy in an individual and organizational capacity. Organizations are being pushed to be more data-centric just to keep up with the world around them, and many of them are now being led by people who want to be more data-informed in the way they manage their organizations and plan for the future.
A quick look around the web will uncover a multitude of articles describing how to assess your organization's level of data literacy, or how to create a data literacy program. As we have noted, your organizational baseline is probably stronger than you think, and many of your employees already strongly value data and take admirable steps to protect and secure it, as well as to maintain its quality. (And to be candid, the notion that you are going to carve out time and resources for a training program like this, when so many organizations struggle with basic data management, strikes us as fanciful.)
Instead we thought we would mention some practical approaches that can contribute to organizational data literacy, without making a big deal or causing much disruption. Many of these practices are part of our standard suite of data intelligence and data governance recommendations, and so they may sound somewhat familiar to regular readers.
In most cases, gaps in data literacy show up when analytics are developed and presented. Data consumers and users often have an emotional reaction to data, whether that is discomfort from having trouble interpreting data, or resistance to figures that fly in the face of expectations, or elation (or at least validation) when a trend seems to be positive. We can notice these reactions, and we can try to manage them. In our opinion, managing these emotional reactions is easier if they are anticipated at the beginning of the process – when data is originally requested – rather than closer to the end.
When a data request is made, is it tied to a business need? Does the request make clear what questions are being asked, or what problems are being addressed? If, as is often the case, data is used to provide a status update on a performance indicator, or to report progress toward a goal, then do we have a complete understanding of what that indicator measures, or how we decided on the goal? Is the data in question restricted, confidential, or in some way proprietary? If so, do we know whom it can be shared with, and in what form, and what might be done with it down the line? A data request process that prompts for these answers up front is important to have in place.
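For teams that capture requests in a form or ticketing tool, the intake questions above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the `DataRequest` fields, the classification labels, and the `missing_items` helper are hypothetical names, not part of any particular product.

```python
from dataclasses import dataclass, field

# Hypothetical classification labels; substitute your organization's own.
RESTRICTED_LEVELS = {"restricted", "confidential", "proprietary"}


@dataclass
class DataRequest:
    requester: str
    business_need: str                  # why the data is needed
    questions: list[str]                # what the data should help answer
    classification: str = "internal"    # sensitivity of the requested data
    approved_audience: list[str] = field(default_factory=list)


def missing_items(request: DataRequest) -> list[str]:
    """Return the intake questions this request leaves unanswered."""
    gaps = []
    if not request.business_need.strip():
        gaps.append("business need")
    if not request.questions:
        gaps.append("questions to be answered")
    # Restricted data needs an explicit statement of who may see it.
    if request.classification in RESTRICTED_LEVELS and not request.approved_audience:
        gaps.append("approved audience for restricted data")
    return gaps
```

A request that comes back with an empty list is ready to be worked; anything else is a prompt for a short conversation with the requester before any data is pulled.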
Assuming these grounding steps have been taken, we should be in a better position to interact with analytics and business intelligence deliverables. Often, the disconnect when presented with data is that what the numbers tell us – or what we in the moment think the numbers tell us – is at odds with our “lived experience” in our organization. Even the most functional teams are subject to episodes of groupthink and echo-chamber agreement, so reflexively discounting data tends to be a bad practice; at the same time, that lived experience is real, and our coworkers are knowledgeable, so seeking a balance is usually helpful.
How do we arrive at that balance? One thing we can do is to make sure we are all using the same terminology. It does not make any difference what the raw data is, what manipulation it has been subjected to, or what analytical methods have been applied to it, if we are not fundamentally talking about the same concepts and objects.
Thus, the need for a business glossary. A glossary oriented around the terminology of strategic objectives, metrics, and key performance indicators, and crowdsourced from subject matter experts, technical resources, and business analysts, gives everyone a shared reference point when the numbers are discussed.
We can also try to put data outputs in context, which often means finding something about which we are more knowledgeable, or about which there is relative organizational certainty, and comparing that information to this newer information. The better analytics will already include this, if possible, either by comparing this period’s numbers to something historical, or by overlaying new slices of data on previously agreed-on slices.
Whether we are confident or uncertain in our understanding or interpretation of the analytics, and whether we are pleased or displeased with what we think they tell us, we should still work through some version of the following checklist.
- What and who are the sources of the data or the statistics?
- Can we trace the lineage from collection through storage and manipulation to analysis?
- What might have been excluded or otherwise not counted, and why?
A data catalog of key reporting and analytics products is important because it foregrounds the questions being asked, displays the lineage and processing of the data in motion, and tells consumers where to go for additional information.
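To make the checklist concrete, here is a minimal sketch of what a catalog entry might record and how it could flag the questions it cannot yet answer. The `CatalogEntry` and `LineageStep` structures and the stage names are our own hypothetical choices for illustration; a real catalog will have its own fields.

```python
from dataclasses import dataclass, field


@dataclass
class LineageStep:
    stage: str          # e.g. "collection", "storage", "transformation", "analysis"
    description: str


@dataclass
class CatalogEntry:
    name: str
    question_addressed: str
    sources: list[str] = field(default_factory=list)
    lineage: list[LineageStep] = field(default_factory=list)
    exclusions: list[str] = field(default_factory=list)
    contact: str = ""   # where consumers go for additional information


def open_questions(entry: CatalogEntry) -> list[str]:
    """List the checklist questions this entry cannot yet answer."""
    unanswered = []
    if not entry.sources:
        unanswered.append("What and who are the sources?")
    stages = {step.stage for step in entry.lineage}
    # Simplification: we only check that both ends of the lineage are documented.
    if not {"collection", "analysis"} <= stages:
        unanswered.append("Can we trace the lineage from collection to analysis?")
    # Simplification: an empty list is treated as "not documented",
    # so record "none" explicitly when nothing was excluded.
    if not entry.exclusions:
        unanswered.append("What was excluded, and why?")
    return unanswered
```

The point of the sketch is less the code than the habit: if an entry cannot answer one of the three questions, that gap is visible before the report reaches its audience.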
Even those of us who are not statisticians or analysts (which is to say, most of us) have heard of inadvertent bias in algorithms and data sets. Infamous examples include facial recognition technology flopping because it was “trained” only on a limited palette of skin tones, and drugs being approved or rejected based on studies of their efficacy that failed to take into account differences between men and women. It is easier to say “interrogate assumptions behind algorithms and data sets” than to perform that interrogation, but we can always ask who or what is missing, and how our conclusions might change if they were included.
A repository of data quality rules and reference data could help users learn how comprehensive and representative a data set is; a place where data users can further investigate potential data quality issues might also be desirable here.
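As a rough sketch of the idea, data quality rules can be expressed as simple named checks, and a quick tally of group shares can hint at who or what is missing. Everything named here (the `RULES` dictionary, the `age` and `sex` fields, and both helper functions) is a hypothetical example rather than a recommended rule set.

```python
from collections import Counter
from typing import Callable

Record = dict[str, object]
Rule = Callable[[Record], bool]

# Hypothetical rules; each returns True when a record passes.
RULES: dict[str, Rule] = {
    "age is present": lambda r: r.get("age") is not None,
    "age is plausible": lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
}


def failures_by_rule(records: list[Record], rules: dict[str, Rule]) -> dict[str, int]:
    """Count how many records fail each named rule."""
    return {
        name: sum(1 for record in records if not rule(record))
        for name, rule in rules.items()
    }


def group_shares(records: list[Record], key: str) -> dict[str, float]:
    """Share of records in each group, to compare against the expected population."""
    counts = Counter(str(record.get(key, "missing")) for record in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}
```

Comparing the output of `group_shares` to what we know about the population being described is one simple way to start asking who is underrepresented, and whether our conclusions would hold if they were included.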
We should always be skeptical of what we think we know. Particularly when the news is good, so to speak, we should look for disconfirming information, methods of analysis and blending that cannot be replicated, errors in data curation, and, as much as possible, blind spots caused by our own assumptions. When presented with an analysis, confirm its appropriateness. Before drawing conclusions, try a different visualization or other way of organizing the data. And, while it may make sense to defer to statistical authority, before acting on that deference, make sure you can explain the data, the analysis, and the conclusions, in the language of your business.
We wish you well as you grow your organization’s data literacy, and on your path to data and analytics maturity. Our tools, such as the Data Cookbook, and our team can help you get on and stay on that path. We hope that you enjoyed this post. Additional data governance and data intelligence resources can be found at www.datacookbook.com/dg.
The Data Cookbook can assist an organization in its data governance, data intelligence, data stewardship and data quality initiatives. IData also has experts who can assist with data governance, reporting, integration and other technology services on an as-needed basis. Feel free to contact us and let us know how we can assist.
Photo Credit: StockSnap