Shadow systems commonly show up as a spreadsheet, or series of spreadsheets, used to generate lists or track activities, and often the data in these spreadsheets duplicates, or purports to duplicate, data stored in your ERP or other key information systems. Sometimes these shadow databases use classic database structures, often built in Microsoft Access, that reproduce (or even store) a subset of institutional data. Other times they’re data sets used for reporting. Occasionally they’re lists written in a word processor or text editor!
Let’s look at some common scenarios where shadow systems emerge.
Scenario One: I’m a casual or occasional user of our organization’s very comprehensive and complicated ERP. Extracting data from an ERP can be a challenge, and it is a time-consuming task, since I have to submit a request for the data. Once I’ve got the list of records I need, it is easier and more convenient to work from my list for future analysis, despite its age or other limitations, than to request a fresh data set. If I use this list to generate mail, maybe I track updates here – again, it’s simpler to make changes to my spreadsheet, a tool I use daily, than to try to navigate changing an address in the official system (assuming I even have access and training to perform this work).
Scenario Two: I’m an administrator or manager who has an idea, but for one reason or another I don’t know whether the data I need even exists, or whom to ask if it does, or how to get access to it. So I may begin my own data-gathering efforts and record that data outside of my institution’s systems of record.
Scenario Three: I’m a data analyst or other kind of power user, and I’m in a hurry, or I’ve got a request from above that our existing toolset doesn’t seem capable of handling. Our business intelligence (BI) tool doesn’t make graphs and charts as easily as this desktop application or piece of open-source software I have experience with. Or the data that comes from our central systems needs to be reformatted, massaged, or otherwise altered prior to use or presentation. Or I need to compare current period statistics to a previous period, and it’s a quick shortcut to reuse the data set I requested last time rather than go through the proper channels to construct a new one, even though errors may since have been corrected or clean-up may have occurred in the source system. Expediency is my top need, even though no one else can reconstruct or confirm the numbers I report.
So why are shadow systems problematic? There are several reasons. A major one is security, since they are generally stored unencrypted on a user’s workstation. A second critical failing is data quality: data extracted from a transactional system into a shadow data source (or, worse, entered by hand) will not have been validated, and the longer it sits in the shadow system the more likely it is to be out of date. In some cases, these shadow systems become replacements for systems of record, and users try to keep them up to date. Consistency tends to fade in these situations, and more critical data that ought to be updated in our system of record may not be updated. A third compelling concern about shadow systems is that they result in operating inefficiencies and losses of productivity, because they represent a duplication of effort. Worse, that duplication is often a recipe for data inconsistency, errors of accuracy and completeness, and potentially even conflict (what one of our colleagues called “data brawling”).
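To make the data quality concern concrete, here is a minimal sketch of how one might check a shadow list against a fresh extract from the system of record. Everything in it is an assumption for illustration: the record IDs, the field names, the sample values, and the `find_drift` helper are hypothetical, and a real comparison would depend on how your ERP exports data.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Row = Dict[str, str]

# Hypothetical marker for records that exist only in the shadow copy.
NOT_IN_OFFICIAL = "<not in system of record>"


@dataclass(frozen=True)
class Discrepancy:
    record_id: str
    field: str
    shadow_value: Optional[str]
    official_value: Optional[str]


def find_drift(shadow: Dict[str, Row], official: Dict[str, Row],
               fields: List[str]) -> List[Discrepancy]:
    """Report where shadow rows disagree with system-of-record rows."""
    problems: List[Discrepancy] = []
    for record_id, shadow_row in shadow.items():
        official_row = official.get(record_id)
        if official_row is None:
            # The record was probably entered by hand and never validated.
            problems.append(Discrepancy(record_id, NOT_IN_OFFICIAL, None, None))
            continue
        for field in fields:
            if shadow_row.get(field) != official_row.get(field):
                problems.append(Discrepancy(
                    record_id, field,
                    shadow_row.get(field), official_row.get(field)))
    return problems


# Illustrative data only: a stale spreadsheet versus a fresh official extract.
shadow_list = {
    "1001": {"name": "Ada Park", "address": "12 Elm St"},
    "1002": {"name": "Lee Cho", "address": "9 Oak Ave"},
    "1003": {"name": "Sam Ortiz", "address": "3 Bay Ct"},
}
official_extract = {
    "1001": {"name": "Ada Park", "address": "40 Pine Rd"},
    "1002": {"name": "Lee Cho", "address": "9 Oak Ave"},
}

drift = find_drift(shadow_list, official_extract, ["name", "address"])
for item in drift:
    print(item)
```

In this made-up example the shadow list still carries an old address for one person and contains a record the system of record has never seen, which are exactly the kinds of inconsistency described above.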
So how and why do people resort to shadow systems? In our experience, the primary reason is that users work more efficiently using their shadow systems, and that efficiency comes from three things.
One, they are often easy to use, or at least users have a comfort level with them. Spreadsheets are ubiquitous, and many people have at least a passing familiarity with them. Desktop database tools such as Access and FileMaker have been in existence for decades, and savvy users have figured out that they are customizable and portable. Contrast this with legacy ERP applications and complicated reporting and BI tools, which offer more features and security but can be intimidating to use and difficult to learn.
Two, shadow systems provide us with information right away. Some of this is because they’re easy to use, particularly if we’ve customized them to our personal preferences. Some of this is because they’re on our desktop or even removable storage (!), and we don’t have to connect to a remote system or go through another person to get the information we’re after.
Three, the work is often repeatable. That is, we do the same thing every time, and generally with fewer steps than we’d have to go through using organizationally sanctioned tools. In the case of shadow analytics, I may have developed a data frame (if I’m using a statistics package) or set of custom variables (in Tableau, for example) that I would have to recreate from scratch every time I load new data.
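As a rough illustration of why this repeatability is attractive, the sketch below defines a set of derived fields once and re-applies them to each new extract. The field names (`current_period`, `previous_period`), the sample rows, and the `add_derived_fields` helper are all hypothetical; the point is only that the calculation lives in one place instead of being rebuilt by hand each time.

```python
from typing import Dict, List, Union

Value = Union[str, float, bool]


def add_derived_fields(rows: List[Dict[str, str]]) -> List[Dict[str, Value]]:
    """Return copies of the rows with period-over-period fields added."""
    enriched_rows: List[Dict[str, Value]] = []
    for row in rows:
        enriched: Dict[str, Value] = dict(row)  # leave the input untouched
        current = float(row["current_period"])
        previous = float(row["previous_period"])
        enriched["change"] = current - previous
        enriched["grew"] = current > previous
        enriched_rows.append(enriched)
    return enriched_rows


# Illustrative extracts only: the same logic is reused on each fresh load.
january_extract = [
    {"unit": "Admissions", "current_period": "120", "previous_period": "100"},
]
february_extract = [
    {"unit": "Admissions", "current_period": "90", "previous_period": "120"},
]

january_report = add_derived_fields(january_extract)
february_report = add_derived_fields(february_extract)
print(january_report[0]["change"], february_report[0]["change"])
```

If sanctioned tools let users save and share definitions like these, one of the main conveniences of the shadow approach goes away.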
We can see the appeal of shadow systems to casual users and savvy analysts alike. Traditional practice has been to try to stamp these systems out, either via desktop control or simply loudly declared policy, and that practice is often not successful.
As we mentioned in previous posts, the goal of data governance ought to be to help people do their work more effectively, which is the same goal people have when they resort to shadow systems! Once we see how users employ shadow systems, then we can think about how we might enable them to accomplish their work without these insecure, inferior workarounds.
Some thoughts:
What we’re describing here is data governance: good, flexible, open processes, managed by and responsive to people, making data available for decision support and organizational management. Shadow systems exist by and large because people can’t get the data they need when they need it, in a form they can use. Better technology, whether it’s the current stack or the tools you’re evaluating, will only do as much to eradicate shadow systems as these other data governance pillars allow. Remember: shadow systems are a symptom, and efforts to wipe them out will not succeed if you don’t resolve the underlying problems that motivated users to create them.
Also feel free to review our other data system inventory resources located at this blog post.
IData has a solution, the Data Cookbook, that can aid employees and the organization in their data governance and data quality initiatives. IData also has experts who can assist with data governance, reporting, integration, and other technology services on an as-needed basis. Feel free to contact us and let us know how we can assist.
Let us know about your thoughts on shadow systems.