Data visualization tools, as well as design shops (an insufficient and reductive label for the talented professionals who take data visualizations and tailor them into often stunning infographics), will often describe their method as telling stories with data. We think this description is fine, as far as it goes. People process the world visually, but they also seem to be wired for narratives, even simple ones.
Aggregating data, plotting it on a graph or organizing it into charts, comparing data across segments or at various points in time: there can be no question that these techniques can contribute to insight, reveal patterns, or otherwise convey information succinctly. After all, we continue to work with clients whose primary method is to export far too much data into a spreadsheet or similar tool that allows for further analysis, however imprecise and unsophisticated. They spend considerable time and energy producing data objects that are rarely attractive and only slightly more often useful.
Contemporary analytics tools can filter, aggregate, perform calculations on, and attractively display data in seconds, and they are undoubtedly a massive upgrade on VLOOKUP and pivot tables. (Full disclosure: we use VLOOKUP and pivot tables regularly. Just not in the products we intend to publish or share widely!) Having said that, it seems certain that the admonition to "tell stories with data" is at some level a recognition that improving the display of information is no guarantee that insight into or even understanding of that information will follow. Data literacy competencies vary widely and are unevenly distributed, after all, and even those analytical outputs that distill complex data into a single metric can be misinterpreted, incompletely understood, or flat-out rejected (especially if the apparent conclusions fly in the face of received wisdom, personal experience, or even fervent wishful thinking).
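To make that contrast concrete, here is a minimal sketch, in Python with pandas, of the kind of filter-and-aggregate step these tools perform in seconds. The file name and column names are hypothetical, and pandas simply stands in for whatever spreadsheet or BI tool you actually use:

```python
# A minimal sketch: filter a (hypothetical) enrollment extract to one term,
# then total credit hours by department. This is the sort of summary a pivot
# table or a VLOOKUP-laden spreadsheet would otherwise produce by hand.
import pandas as pd

df = pd.read_csv("enrollment.csv")  # hypothetical extract

summary = (
    df[df["term"] == "2024FA"]                        # filter to the current term
    .groupby("department", as_index=False)["credit_hours"]
    .sum()                                            # aggregate
    .sort_values("credit_hours", ascending=False)     # rank departments
)

print(summary)
```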
We are data people, and we have been for a long time. Still, we are astounded by the capabilities of modern visualization tools, and our socks have been knocked off, figuratively, by great infographics. But we were also fans of legacy BI tools: the modeling, the transparency, the sheer horsepower that many of them brought (and still bring!) to querying and organizing data. We have seen sophisticated and compelling outputs from stats packages, and to be completely honest we are to this day susceptible to elegant SQL or Python.
All of these tools have power that most users do not touch, and they generate outputs that most consumers cannot use. That is OK: we assume our car can travel in excess of 100 mph, or that the engine can handle 7500 rpm, but we never intend to find out.
So, yes, we wholeheartedly support enriching your data analysis by telling stories with it. Some savvy consumers will read or hear those stories and respond with good questions. They may ask: why do you think the data supports this conclusion? Or: what other data could we bring to bear on this decision? Possibly even: how will we evaluate the decisions we make based on this data? And so on.
But anyone who traffics in data, whether that is analytics or modeling or curating data sets or whatever, knows that not all consumers are quite so savvy. So, you will get questions, but those questions may be much more basic. In our experience, before you can tell good stories with your data, you may need to tell stories about your data. In fact, many of the immediate responses you will get to your carefully crafted data-as-story are really questions about the story of the data.
You will often hear: where did this data come from? This generally means: who collected the data, and who provided it to the collector? It may also subtly imply a lack of trust in the collection process. Dashboard developers frequently do not have real answers to these questions. They may be able to identify the database and data model objects, but that is a technical answer to a business question. The inability to answer questions like these may grind all further discussion to a halt. If the provenance of the data is unknown, or if the reliability of the original collection is open to question, your data story will have to carry a heavy load to be heard.
Another story to tell about data is how long we have had it. This might mean certifying that it is up to date. The other side of that is evaluating whether we have had it long enough to verify it. Or even: why have I not been told that we have this? These are questions worth asking during the development of your data product, and they are certainly questions we would encourage data consumers to pose.
We might also want to know where this data resides. Some systems have a reputation for being difficult to extract data from. Some systems may be entirely unknown to data consumers. Some systems may be managed by offices whose data practices are untrustworthy. Sometimes data that is provided to consumers originated in a trustworthy system, but it lived for a time in a satellite data set, possibly aging past its prime, almost certainly not being updated with newer attributes.
Other stories about your data might be a little harder to tell. Why do we have this data, for example, is not always a question with a simple answer. What is the purpose of the data? What are we ultimately going to do with it (keep it forever, archive it, delete it)? Who else are we planning to share it with?
So, finally, what would it look like to be able to tell stories about your data before trying to tell stories with it? We suspect it would look a little like the basic lessons of journalism. Who collected this data? What are we doing with this data? Where does the data live now and where might it go in the future? Why do we think this data is useful for our enterprise? (Or why do we think this analysis is appropriate for it?) When was the data accessed for this particular data product?
How do we prepare ourselves to tell these stories and answer these questions? Data intelligence work could include creating a catalog of data assets with both technical and business information about them. What sorts of data are we storing in which kinds of systems, using which applications? What is the shape and scope of that data? Who is responsible for the maintenance of data and the upkeep of these data applications?
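As one illustration (every system, owner, and value below is a hypothetical assumption, not a prescription), a catalog entry might pair the technical facts about an asset with the business context needed to answer the questions above:

```python
# A sketch of a data catalog entry combining technical and business metadata.
# All names, systems, and values are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    # Technical information: where the data lives and what it looks like
    asset_name: str
    source_system: str
    storage_location: str
    row_count: int
    last_refreshed: str
    # Business information: who owns it, why we have it, how it may be used
    business_owner: str
    collected_by: str
    purpose: str
    retention_policy: str
    approved_uses: list = field(default_factory=list)

entry = CatalogEntry(
    asset_name="Admissions Funnel",
    source_system="CRM",
    storage_location="warehouse.admissions.funnel_weekly",
    row_count=48_210,
    last_refreshed="2024-09-30",
    business_owner="Office of Admissions",
    collected_by="Online application portal",
    purpose="Weekly yield reporting",
    retention_policy="Archive after 7 years",
    approved_uses=["enrollment forecasting", "yield dashboards"],
)
```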
And these data intelligence queries flow naturally into establishing productive data governance practices. Data governance decisions involve identifying who is responsible for data at various points in its lifecycle, defining data, determining how it is to be used, managing access to and otherwise securing data, archiving or disposing of data, and serving as a custodian for it along its travels.
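For example (the roles and lifecycle stages here are illustrative assumptions, and your organization's will differ), those governance decisions can be captured as a simple mapping of responsibility to each point in the data's lifecycle:

```python
# A sketch of assigning governance responsibility across the data lifecycle.
# Roles and stages are hypothetical; adapt them to your own organization.
lifecycle_responsibilities = {
    "collection": {"responsible": "Registrar's Office", "decision": "what is collected, and how"},
    "definition": {"responsible": "Data stewards", "decision": "what each field means"},
    "usage":      {"responsible": "Data governance committee", "decision": "approved uses and sharing"},
    "access":     {"responsible": "Information security", "decision": "who may see or query the data"},
    "retention":  {"responsible": "Records management", "decision": "archiving and disposal schedules"},
}

for stage, assignment in lifecycle_responsibilities.items():
    print(f"{stage}: {assignment['responsible']} decides {assignment['decision']}")
```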
While you can use a number of tools and methods in support of this work, our Data Cookbook solution provides a one-stop shop to set you up and send you on your way. The Data Cookbook allows you to catalog your data assets and data products, to define your data terminology and quality rules, and to maintain a living repository of this information curated by subject matter experts and blessed by data stewards and managers. It will not be the only place to record the stories you tell about your data. But it can certainly be the place your users go first, and it can provide them with the information they need when they want to read further.
The Data Cookbook can assist an organization in its data governance, data intelligence, data stewardship, and data quality initiatives. IData also has experts who can assist with data governance, reporting, integration, and other technology services on an as-needed basis. Feel free to contact us and let us know how we can assist.