If It Were Easy, Wouldn't You Already Have Done It?

We were recently reminded of #MonkeyFirst, which was something between a thought experiment and an analogy produced by Astro Teller of Google X. Briefly, for those of you who missed it the first time around or whose memories, like ours, have started to rust, the idea is this. Say you have a project to train monkeys to recite Shakespeare while standing upon a pedestal: where do you start? Well, you know you can build a pedestal, or you can hire a contractor to build one for you, or you can even buy one; you don't know whether monkeys can be trained to recite Shakespeare. So, where do you begin? Teller said, don't start by working on the pedestal. That's the easy part. Instead, tackle the hard part first. That is what this blog post is about, with a particular focus on data governance.

Why try to train the monkeys first? Well, if you realize early on that you're never going to succeed, then you know you need to kill, rescope, or otherwise change the project. Better to know early that failure is imminent than to spend time, money, and effort only to provide the illusion of progress. The problem is, at too many companies and on too many projects, it's safer to build that pedestal first. In Teller's own words, "...I bet at least a couple of people will rush off and start building a really great pedestal first. Why? Because at some point the boss is going to pop by and ask for a status update – and you want to be able to show off something other than a long list of reasons why teaching a monkey to talk is really, really hard."

That sounds like a great TED talk—and, in fact, it was. But we realize that analogies are not logical proofs, and we are aware that there could be plenty of ways to push back on or poke holes in this argument.

What if we need pedestals for other things? Can we go ahead and build them now, knowing they're going to be used later? (Well, how much later? When is pedestal building going to be the bottleneck?)

What if it turns out that we can train monkeys to recite Shakespeare, but they need a specially formed pedestal in order to stand? (Well, is there any pedestal construction requirement more challenging than training your monkeys?)

What if we have 10 monkey trainers who can get started on that right away? Should our pedestal builder sit idle until they're finished? (This is a bad-faith response, it seems to us: any kind of project plan recognizes that different resources are needed at different times, and accounts for that.)

But it certainly seems reasonable to worry that our project funding is going to get cut off if we don't show progress within a certain timeframe, or that our ability to continue to attract resources will be challenged. In that case, shouldn't we pluck some low-hanging fruit or demonstrate some quick wins? Then again, are empty accomplishments really a demonstration of progress? If your project will be shut down for lack of progress, when do you want it to end? Relatively early, when it becomes apparent that given existing resources and constraints you won't be successful, or somewhat later, after squandering funds, people, and time, each of which could have been put to more productive use?

We admit it: we’re as guilty as anyone of highlighting quick wins that might not be that meaningful, or settling for taking the path of least resistance instead of the path of greatest value.

Those of us who work in data-related roles and professions are probably not going to find ourselves working on projects as outlandish and improbable as training a monkey to recite Shakespeare while standing on a pedestal, but that doesn't mean our projects won't have uncertain outcomes and potentially unmeetable challenges. But, when it comes to your data initiative, how can you tell the monkey training from the pedestal building? And, just as importantly, how do you decide to focus on the one rather than the other?

Prospective clients often come to us saying something like, "Management has decided it's time to do data governance and data intelligence, and we need a tool to help us do it." We often spend a lot of time unpacking statements like these, in order to learn more about what is needed, and whether and how our products and services can help.

Data governance is not an end in and of itself; it is a means to an end. We are always asking both prospective and current clients: What are you trying to accomplish? What problems are you trying to solve? What end do you hope to reach using the data governance means? Why are you interested in data governance now?

Sometimes that end is described fairly generically as something like, "We want to make better use of our data." Sometimes the end is to get management off our backs! But usually that end involves leveraging data to make better decisions, and in order to properly leverage data there are numerous requirements: identifying needed data, identifying missing data, cleaning up or organizing data for analysis, connecting analytics tools to the data, creating and delivering analytics products, interpreting the analytics, and making decisions based on or informed by those interpretations. Well, once those requirements are laid out and described, can we start to say which ones are most like pedestals, and which ones are most like monkeys?

And it turns out that even when management has explicitly called for data governance, and only data governance, clients are still moving forward with other critical data innovations and transformations. Some are looking to switch from one system (or set of systems) that performs certain business tasks to other systems that will, they hope, help those tasks be performed more quickly, or more easily, or at less expense, or for greater return. Sometimes they're looking to add new tools to perform additional tasks, but they want to push data between new and old tools and systems. Frequently we see clients who are planning, or even in the midst of, changes to their BI stack, such as deploying, expanding, or moving a data warehouse, or upgrading or switching to (or adding) another analytics tool. In our client base it's still rare, but increasingly we see data lakes and data mesh and data fabric entering the conversation.

Among other things, we see these as opportunities to commit to superior data governance practices, and to increase the overall store of data intelligence at an organization. But organizations don’t decide on these data-related initiatives in order to improve the way they govern data; they choose to do them because there are strategic goals to meet or challenges to face. It’s just that without governing their data they’re going to find it nearly impossible to use data to meet those goals or face those challenges!

We hear plenty of anecdotes about how data governance efforts have foundered, or even failed altogether. Indeed, our webinars, blog posts, conference presentations, and client counseling sessions are replete with examples of what to avoid as well as what to embrace. Maybe not that surprisingly, it turns out that data governance is just as hard as any other task you set yourself when it comes to improving your data usage.

If you find yourself tasked with “doing data governance,” our longstanding advice is to wrap that work around ongoing data operations, and, if possible, use data governance practices and principles to support and improve major and highly visible activities. And then you might ask yourself, is data governance the monkey or the pedestal in your data initiative? Well, what does it look like when you build the pedestal first, or when you work towards a quick win, when you're migrating to a new analytics tool or platform, or when you're developing a data warehouse?

Sometimes, we know, organizations want to embark on a program of formal data governance, and that program can take on a life of its own. The question to ask, then, is where are our data governance efforts most likely to prove intractable? Which of those data governance activities and practices will be hardest to succeed at?

In our experience, there’s not usually too much that prevents organizations from setting up a Data Governance Council, appointing data stewards and data trustees, and writing a charter. Every organization has committees, mission statements, and names for roles that neither convey much meaning nor confer much authority. Obviously, navigating bureaucratic hurdles and entrenched norms isn’t always easy, but we know how to do it. So, there’s a good chance this might be a pedestal.

By the same token, even when you do a lot of comparison shopping, put in the due diligence, check the references, and do whatever else is involved in kicking the tires, it’s generally only money that stands in the way of purchasing a data governance platform, data catalog, metadata management tool, or other technology solutions. Yes, there’s an opportunity cost—you could always spend that money elsewhere—but there’s a good chance selecting a data governance application might be a pedestal.

Many organizations want to know where they are storing data in a way that puts them—or the people, entities, transactions, products, etc., referred to by that data—at risk. Is the data not secured against unauthorized access? Is the system vulnerable to attack? Are the users not properly trained and cautioned about protecting personal privacy? Why don't organizations know this already? What is preventing them from answering this question? So—maybe this isn’t quite as much a pedestal?

Sometimes we learn that clients don't have an inventory of data systems and assets. Sometimes that inventory exists, but it is informal, and it doesn't include critical information about risk. Frequently the inventory, even if it's formal, is incomplete, because not every unit participates (or is even asked to participate). Without this inventory, multiple versions of some pieces of data are probably being captured and stored (potentially not very securely, given the previous paragraph). Funds and personnel are probably being allocated inefficiently, as some systems end up competing with each other, and others are underused (or possibly even abandoned). Is this a pedestal?

Everyone wants to provide data to decision-makers so they can use it to make better decisions. Again, what is preventing organizations from providing that data right now? Is it that we (or they) don't know what data is needed? Is there a lack of data acumen among decision-makers? Is this a data strategy issue? Is it a lack of skilled data providers? Is it a data pipeline problem? Is the data technology not up to the task? What if the answer to all of these questions is yes? Then you have to decide what's more like training monkeys, and what's more like building pedestals, and what it would look like to direct your data-governance-as-problem-solving efforts to the monkey stuff.

Sometimes the blog runs a little philosophical, and this is one of those times. It's been fun for us to think about this, and we hope you've enjoyed reading it.

It’s quite possible this monkey and pedestal (or #monkeyfirst, you can look it up) analogy isn’t the most useful heuristic. But it’s a colorful and memorable way to help us focus on some basics. What are our goals for using and managing data, and which objectives in the service of those goals will adopting data governance practices and principles help us fulfill? Which activities will actually represent progress towards those goals, and which activities might only represent the fact that we’re taking action? Of those activities that represent real progress, which ones will be the hardest to complete, and will failing to complete them scuttle the success of our entire effort?

Our data governance and data intelligence solution, the Data Cookbook, is by design a lightweight and flexible application that is well suited for agile deployment in exactly those areas where you need to make real progress and demonstrate meaningful accomplishment. It is an accessible repository of knowledge of your organization’s data assets, terminology, and structure; it is an engine for making, recording, and publicizing decisions about data; and it provides multiple points of engagement for your data stakeholders, whether they are seeking to learn more, suggest changes, or add to the organizational knowledge base. Used appropriately, it is a pragmatic tool to help you tackle those difficult challenges up front.

(image credit: StockSnap_JENKYX8RSH_monkeyfirst_projecthardBP #1260)

Aaron Walker
About the Author

Aaron joined IData in 2014 after over 20 years in higher education, including more than 15 years providing analytics and decision support services. Aaron’s role at IData includes establishing data governance, training data stewards, and improving business intelligence solutions.
