Getting Started with Data Governance

By now you've decided that data governance is a good idea for your organization. You want to use organizational data to drive better decisions, and to better understand the effect of actions taken. You're tired of inconsistently and/or unreliably reported data, you've had enough of silos, you're sick of not knowing whether data is collected and whether you can access it, you're through with data quality errors--whatever it is that's preventing you from making the best use of your organization's data.

Stated somewhat grandly, the basic problem is that your organization has yet to properly recognize data as a strategic asset. Or, perhaps, your organization has recognized that data is a critical asset, but hasn't yet been able to use it as effectively and consistently as other assets. Ultimately, this is evidence that you are falling short in your data management activities.

The relationship between data governance and data management is inexact. Some would argue that data governance is one category of data management, and others conceive of data governance as a foundation for data management, or as a kind of rubric under which data management takes place. Either way, poor or insufficient data governance will not give rise to effective data management activities, nor is there any guarantee that currently successful data management initiatives will continue to be successful without strong data governance around them.

You want to do more with your data, and you know data governance will help. But how to proceed next?

Let's start by stating (or re-stating) some basic observations. You'll find many definitions of data governance that sound something like this: data governance is the body of policies and procedures that determine the exercise of formal control over data and data assets. That's fine, as far as it goes, but what does it really mean? And what would it entail to write policies and develop procedures for formal authority over data?

Don't get us wrong: policies are usually good, and they do good when they're useful. But the best policies are probably descriptive, meaning they reflect actual agreed-upon practices, rather than prescriptive. And formal accountability is also desirable, especially when it comes into play for something other than placing blame, but we like to see a focus on being accountable more than we worry about how formal the arrangement is.

But let's see if we can't think a bit more practically about the problems in front of us, and how we might tackle them. When we discuss data governance, we'll often preface the phrase with the word "pragmatic." What we mean by that is, what actions can you take now that address real data governance needs, and that will contribute to more effective data management techniques?

There is a playbook out there for data governance that involves creating a data governance council, writing and publishing a charter, staffing that council with data stewards and trustees, and getting to work on policies and procedures. For some organizations, that playbook will work; for others, it may just be a recipe for death by committee. No matter what structure you choose, however, our recommendation for quick success is to identify one or two critical data management problems, and to focus on solving them using our baseline data governance best practices.

Here's a by-no-means exhaustive list of common data management problems:

  1. We don't know where all the data we store resides, or how it is captured
    • We might be collecting or maintaining duplicate data
    • We are probably collecting critical data that analysts don't know about or don't have access to and so it isn't used in decision support
  2. Our collected data isn't of uniformly high quality
    • We know there are problems with data quality but haven't been able to solve them
    • We just don't trust the data that comes out of our systems in reports
  3. It takes too long to get data we can use for strategic purposes
    • We don't have the right tools or people (or we don't have enough resources to use our tools)
    • Our data request process is fractured and too much of our data providers' work doesn't actually involve providing data
  4. Data fluency is lacking
    • Business users don't know enough about their data to make informed use of it
    • Technical users don't know enough about the organization's business to guide business users' discovery
    • We lack a common business terminology for our data, so our reporting and analysis efforts end up comparing apples to oranges
  5. Too much (or too little) energy is spent securing data, rather than making it available in the proper format and amount to consumers
    • By "securing" we really mean blocking access to
    • For most users, broad access is no more valuable than limited access

An approach we have seen our clients succeed with is to pick out one (maybe two) of your data management issues, to identify people who have both a big stake in them and the expertise to address them, and to provide them a framework in which they can make progress. Now, that framework will look a little different depending on your organization and issues, but we think there are some common features.

Towards a common data governance framework

People - especially data stewards - are central to the effort

This should maybe go without saying, but the people who are most knowledgeable about organizational data, and who either use it the most and/or make decisions regarding its use, need to drive any process improvements involving data.

The work needs to be done in the open, as transparently as possible

This is not the same as saying, "Information wants to be free." Certainly one of the goals of any data governance program is likely to be increased security around data. But secrecy, or a misplaced emphasis on security, or a culture of hoarding access to organizational data, all tend to lead to silos, shadow systems and other unsanctioned workarounds, and data dueling (which always seems to crop up at exactly the moment your teams need to be in sync). Remember, security and control are not synonymous.

The perfect cannot become the enemy of the good

We see it all the time: analysts afraid to release their work because there's just one more thing to check; managers unwilling to commit to or take responsibility for data definitions because they haven't covered every possibility; architects reluctant to grant access to data structures because they're not complete or there's some upstream data quality issue; and so on. Our advice has always been to share what you know now, and update what you've shared when you know more or different. Working in iterations goes hand in hand with working publicly and collaboratively, and it allows data managers and subject matter experts to leverage shared knowledge.

Just because your organization is a bureaucracy doesn't necessarily mean it cannot move quickly and nimbly. But we have found that smaller groups, with defined goals and clearly circumscribed project parameters, tend to work better.

Regardless of your approach, however, the news isn't all bad, and you're not starting from scratch.

You are already governing your data

First, whether or not you have people in your organization you label data stewards, there are people who are responsible for data collection and management in their office or department, and they are likely to be highly knowledgeable about the business usage of data, or the technical location and lineage of data, or both. They are crafters and interpreters and occasionally enforcers of data policies in their areas.

Second, there are additional subject matter experts and technical resources who have knowledge of (some of) your organization's data, and who are already making use of it in their work. These people are talking to each other, to data stewards, and to data consumers, probably daily, about data. They discuss terminology, they look at data to see if it's erroneously entered or outdated or otherwise of poor quality, they are involved in requesting and providing data in many formats.

Third, your organization is increasingly populated by consumers who are hungry for data. They may have much to learn about your organization's data, they may need additional training in understanding and accessing the data deliverables they're provided, and they may have unrealistic expectations about what's available, but they want to use data to perform their tasks more successfully and to choose better courses of action for the organization.

  • So questions are being asked and answered (or not answered, as the case may be) all the time in your organization.
    • Imagine if those questions and those answers were recorded and made publicly available!
  • People are stewarding data all the time, whether that's defining data or clarifying terminology, approving requests for access to data and data sets, providing guidance concerning the appropriate display and sharing of data, etc.
    • Imagine if this work were documented and shared widely!
    • And imagine if all your data stewards were working from the same set of principles and guidelines, perhaps one that they as a group had primary responsibility for developing and maintaining!
  • Data is being accessed, and shared, and reviewed, and analyzed, and interrogated by consumers at all levels.
    • Imagine if that consumption were catalogued!
    • Imagine if consumers were able to search to see if someone else had asked the same question about data, and what was provided to them!
    • Imagine if there were a streamlined way to evaluate data requests and to determine whether new research was necessary or whether some previous work would, with minimal new effort, meet a request!
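
To make the "imagine if" items above a little more concrete, here is a minimal sketch in Python of a shared log of data questions that consumers can search before filing a new request. It is only an illustration under stated assumptions: the names DataRequest, RequestCatalog, record, and search are hypothetical, the sample entry is made up, and this is not how the Data Cookbook or any other product is implemented.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DataRequest:
    """One question asked about organizational data, plus what was provided."""
    asked_by: str
    question: str
    answer: str
    steward: str          # who answered or approved the request
    asked_on: date
    tags: set[str] = field(default_factory=set)


class RequestCatalog:
    """A shared, searchable log of data questions and their answers (hypothetical)."""

    def __init__(self) -> None:
        self._requests: list[DataRequest] = []

    def record(self, request: DataRequest) -> None:
        """Document a request so that others can find it later."""
        self._requests.append(request)

    def search(self, *keywords: str) -> list[DataRequest]:
        """Return requests whose question or tags mention every keyword."""
        wanted = [k.lower() for k in keywords]
        matches = []
        for req in self._requests:
            haystack = (req.question + " " + " ".join(sorted(req.tags))).lower()
            if all(k in haystack for k in wanted):
                matches.append(req)
        return matches


# Made-up sample entry, for illustration only.
catalog = RequestCatalog()
catalog.record(DataRequest(
    asked_by="admissions analyst",
    question="How many first-year students enrolled last fall?",
    answer="See the fall census enrollment report.",
    steward="registrar's office",
    asked_on=date(2024, 9, 15),
    tags={"enrollment", "census"},
))

# Before filing a new request, check whether someone already asked.
previous = catalog.search("enrollment", "fall")
for req in previous:
    print(f"{req.asked_on}: {req.question} -> {req.answer}")
```

Even something this simple, such as a shared spreadsheet with the same columns, captures the idea: questions, answers, and the steward who provided them are recorded once and reused, rather than researched anew every time.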

So back to our data governance framework: ours is a framework that expects people to communicate with each other about data, and to freely share their expertise and knowledge; our framework needs this communication to occur in public, as much as possible, and for the outcomes of these conversations to be shared even more widely; and our data governance framework relies on and improves on work previously performed, rather than creating it anew every time.

IData has a tool, the Data Cookbook, that is organized around this framework, and that enables your journey to improved data governance. Of course we'd love for you to use it! But a tool is just a tool, and without training, instruction, feedback, and practice, you're not likely to use it very well. (For more information on getting started with data governance using the Data Cookbook, check out this webinar by our founder, Brian Parish.)

So while your goals for utilizing and managing data should be ambitious, you will probably need to start small. Identify a real problem, apply appropriate data governance practices inside a manageable but scalable framework, and determine which tools and other resources will be most valuable.

Since you were interested in this blog post, you might be interested in these videos:

Data Governance - Where You Should Start https://youtu.be/R-cpCCAHqSE
Start with Why in Data Governance https://youtu.be/5-c1haR-tE4
Setting Data Governance Goals https://youtu.be/bbzHp2wcsJI
Create a First Step Process in Data Governance https://youtu.be/vUESWLajwb8

And don't forget we have additional data governance related resources located on our resources page.

We are happy to be part of that conversation at any point in your process!  Feel free to contact us and let us know how we can assist.

Image Credit StockSnap_NTDGYW24VE_Hike_GettingStartedDG_BP #1024

Aaron Walker
About the Author

Aaron joined IData in 2014 after over 20 years in higher education, including more than 15 years providing analytics and decision support services. Aaron’s role at IData includes establishing data governance, training data stewards, and improving business intelligence solutions.
