We speak with many clients and other interested parties about data literacy, and it’s clear from those conversations that no unified understanding of the phrase exists. Well, everyone will say something about understanding and communicating with and about data, but that is a broad and vague and fundamentally unsatisfying description.
Coincidentally, we recently ran into a former colleague, and we had a chance to catch up on some work they've been doing around organizing survey data for reporting and analysis.
Surveys are in and of themselves an interesting data management challenge. You have to take steps to ensure that the survey respondents are representative of the population you want to learn about; you have to design a survey instrument that allows respondents to provide as honest a response as they are capable of, and that doesn't in its design, format, or wording inadvertently skew responses; you have to slip the occasional question in to ensure that the respondent is taking the survey seriously. Moreover, after you collect the responses, you may have to do some preliminary work to identify nonserious responses. A basic statistical summary of the entire body of responses is often useful to identify anomalies or outliers, and in some cases that initial summary might contain enough warning signs to make you consider throwing out the sample altogether!
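To make that screening step concrete, here is a minimal sketch in Python with pandas. The file name and column names (attention_check, duration_seconds, the q_ rating columns) are hypothetical stand-ins for whatever your survey tool exports, not a prescribed layout:

```python
import pandas as pd

# Hypothetical survey export: one row per respondent.
# All column names below are illustrative, not from any particular survey tool.
responses = pd.read_csv("survey_responses.csv")

# Flag respondents who failed the embedded attention-check question
# (here assumed to require the answer "agree").
failed_check = responses["attention_check"] != "agree"

# Flag respondents who finished implausibly fast (the threshold is a judgment call).
too_fast = responses["duration_seconds"] < 60

# Flag "straight-liners" who gave the same answer to every rating question.
rating_cols = [c for c in responses.columns if c.startswith("q_")]
straight_lined = responses[rating_cols].nunique(axis=1) == 1

suspect = failed_check | too_fast | straight_lined
print(f"Flagged {suspect.sum()} of {len(responses)} responses for review")

# A basic statistical summary of the remaining responses can surface
# anomalies or outliers before any real analysis begins.
clean = responses[~suspect]
print(clean[rating_cols].describe())
```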
Assuming that your survey methodology and responses are acceptable, survey data can be both richly informative and easy to act on, in large part because the results are easy to share and drawing inferences from them feels almost intuitive. If our survey results say that 75% of our customers are unhappy with a specific product, that in itself may be prima facie evidence that we need to change our product! (Or educate our customers better about the product.) Without asking this question directly and tabulating the results, our other options involve considerably more estimation, imputation, data manipulation, and statistical analysis. And the more advanced data work we have to do, the harder it is for our stakeholders and consumers to interact with the outputs based on that work.
But survey analysis is a place where basic data literacy skills quickly show their limitations. Even the surveys that seem most dispositive are far from infallible. Surveys are one way to express an opinion, but people also vote with their feet and their pocketbooks. A survey that suggests widespread dissatisfaction in the face of record sales or extremely high customer loyalty could as a whole be an outlier, or it might tell us more about the market and our competition than about ourselves. So, it may be that all a survey tells us is that maybe we should do more surveys. Of course, even if you're happy with what you have, you'll want to survey periodically as a way to see if opinions change. And you'll probably want to run A/B surveys to test whether a change shown to only one group has an impact on respondents. And so on and so forth.
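For that A/B comparison, one common (though by no means only) way to judge whether a difference between the two groups is bigger than chance alone would explain is a two-proportion test. Here is a minimal sketch with made-up counts, using Python and statsmodels:

```python
from statsmodels.stats.proportion import proportions_ztest

# Illustrative numbers only: group A saw the current product, group B the
# revised one; "unhappy" counts respondents who reported dissatisfaction.
unhappy = [150, 120]   # dissatisfied respondents in A and B
surveyed = [200, 200]  # total respondents in A and B

stat, p_value = proportions_ztest(unhappy, surveyed)
print(f"A: {unhappy[0] / surveyed[0]:.0%} unhappy, B: {unhappy[1] / surveyed[1]:.0%} unhappy")
print(f"z = {stat:.2f}, p = {p_value:.4f}")

# A small p-value suggests the gap between the groups is unlikely to be
# noise alone; it does not, by itself, tell you why the groups differ.
```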
It doesn't take long, then, for survey data to join the rest of your massive collections of data that no one understands, or is able to rigorously analyze, or is able to use to make informed decisions. This is where our former colleague came in. They modestly described this work as standing up a small data warehouse, but when we dug into the details we learned of hundreds of hours spent classifying data, defining terminology, and organizing metadata, plus multiple rounds of testing and comparing results. Frankly, they could have used a tool like the Data Cookbook to support and document this work!
Another friend of ours described an event at their workplace where one employee embarked on a very casual observational study and then attempted to present preliminary findings at a meeting. They were barraged with methodological questions about randomization, periodicity, repeatability, and so on, as well as with questions about hypothesis testing and follow-up research. (And, of course, considerable complaints about the appropriateness and legibility of the presentation method were lodged as well!) This employee hadn't really tried to draw any conclusions from the data, other than that there might be some interesting information in it, but another set of colleagues was impatient to draw or refute possible inferences.
On the one hand, we were pleased to hear that those employees, who represented multiple roles and departments, had substantive responses; on the other hand, we fear that this employee’s work was received so poorly that they might never seek to build on it. Which is too bad: that person had an idea that data might provide some insight, and that person took initiative to act on that idea. Organizations of all shapes and sizes need to encourage that behavior, while at the same time facilitating ways for it to be productive and useful.
Obviously, that employee has a lot to learn about collecting, analyzing, and presenting data. Right now their aspirations outstrip their skills, but we know that could change, given the right resources, the opportunity to take advantage of those resources, and the willingness to put in the work. They're surrounded by numerous colleagues who might be available as mentors, but those colleagues, too, need to make themselves available as resources and allow their expertise to be tapped (and even questioned); they, too, need to be willing to put in the work to support their colleague's data literacy journey.
We noted above that there are many ways to define data literacy, and many places to see it practiced. It follows that, if understandings of the term vary, programs to build data literacy are going to vary as well. And that’s OK! As we have mentioned here previously, data literacy competencies span a number of skill sets, practices, attitudes, learning modalities, data sources, tools, etc. As you think about creating a data skills program, or purchasing an existing one, or hiring trainers, here are some guidelines to help you move forward.
Treat increasing data literacy competencies (“upskilling”) as a business investment. If data is an asset, then the ability to make better use of that asset has positive and to some extent measurable value. Define real objectives (not "use data better"), identify the business units and people within them that will be targeted, and develop rubrics for assessing new competencies and achievements. Investments don’t happen without expense, and there is always a chance that your return will not be as high as expected.
Tie data training to people's actual work, and take a data-first approach to building training programs and modules. (We might quietly suggest that a business glossary that defines data-related terminology in natural language could be helpful here.) Resist the temptation to focus on quantitative skills, business statistics, or particular business intelligence (BI) tools. We have said for years that you'll do better when you train in the data, not the tool. When you present the results of years of customer satisfaction surveys to stakeholders, the ability to understand and act on that data probably does not require much knowledge of the survey instruments, or the specific tool used to graph the results. When users are coached about the scope and quality of a data set, and when queries of that data set are explained in a straightforward, nontechnical manner, and when the output is carefully shared in an organized display, users can grow their data literacy competencies and at the same time utilize their business expertise. (Perhaps a tool that catalogs data sources, assets, and products would be valuable?)
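To illustrate what natural-language-first definitions can look like, here is one hypothetical shape for a glossary entry, sketched in Python. The fields and values are ours, for illustration only; they are not a description of the Data Cookbook or any other product:

```python
from dataclasses import dataclass, field

@dataclass
class GlossaryEntry:
    """A business-glossary entry: plain language first, technical detail second."""
    term: str
    plain_definition: str                 # written for business users, not engineers
    source_systems: list = field(default_factory=list)
    steward: str = ""                     # who answers questions about this term
    caveats: str = ""                     # known quality or scope limitations

entry = GlossaryEntry(
    term="Customer satisfaction score",
    plain_definition=(
        "The share of survey respondents who rated us 4 or 5 on the "
        "five-point satisfaction question, measured quarterly."
    ),
    source_systems=["quarterly_survey_warehouse"],
    steward="Customer Insights team",
    caveats="Excludes responses flagged as nonserious during screening.",
)
```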
We know now that people learn visually, and they learn from stories, and some can even learn from a dry recitation of facts. Don't expect this effort to succeed by offering a one-size-fits-all approach; instead consider a medley of courses, workshops, videos, independent study projects, even gamified instruction! If you have in-house resources, find ways to free up their time to help, whether formally as instructors and trainers, or informally as hosts of meetups or drop-in office hours, or something in the middle. Empower employees to increase and/or impart their knowledge, to improve their skills (or assist others in improving theirs), and reward them when they do so.
Every organization we work with is rich in data about at least some aspect of the work that they do. Most of them have not governed that data well historically, and many of them have not developed a data-enabled workforce. (It's hard not to think that those two actions, or inactions, are related.) Consequently, they struggle to manage and organize data, much less analyze it carefully and act on it wisely.
But as our colleague showed their client: you can make a manageable, properly directed, sustained effort to understand the data you are rich in; you can take the time and effort to explain it for business users and to catalog it in ways those users can access; and you can commit to utilizing data analysis, rudimentary and sophisticated alike, as the basis for decisions and actions.
Ultimately, governed data is easier for users to understand; it lends itself to simple and advanced exploration alike; it eases the generation of consistent results; and it supports the ability to ask more and better questions, all across your organization. Modern business enterprises desperately seek data literacy competencies from their entire workforce, and those competencies are enabled and enhanced when your data governance practices are strong and widespread.
Hope you found this blog post useful. IData has a solution, the Data Cookbook, that can aid employees and their organization in data governance, data intelligence, data stewardship, and data quality initiatives. IData also has experts who can assist with data governance, reporting, integration, and other technology services on an as-needed basis. Feel free to contact us and let us know how we can assist.
(image credit: StockSnap)