Bringing Together Artificial Intelligence, Data Skills, and Institutional Knowledge

Written by Aaron Walker | May 10, 2026 5:58:49 AM

We've been writing in this space about how artificial intelligence (AI), large language model edition, can enable employees to make better use of data in their work. Many of our thoughts have been about using AI to clear out some of the simple but labor- and time-intensive tasks that slow so many of us down. Can AI help us inventory and describe our data sets, so we know what data might be available for further analysis? Or, what data we're not collecting that could be useful? Can AI help with data profiling? What about scanning our library of reports and dashboards, looking specifically for inconsistencies and potential inaccuracies? Like many people, we're intrigued by the possibility that AI could help non-technical users develop their own queries and construct their own analytics, although as we mention in this blog, many of the things that need to be in place for that future to come to pass are not in place.

Now, before AI came along as a possible helper in these endeavors, we--or our colleagues--would probably have had to develop some new data skills, or improve our existing ones, in order to do our jobs. Take data quality work, for example. A person can probably learn quite a bit about data profiling just by reading about it, and with pretty basic tools that same person could probably generate some useful output about the data they're examining.

However, it's one thing to identify some number of records that don't seem to conform to the pattern established by the majority of records, and it's quite another thing to understand why that might be, and whether it even matters. The institutional memory and knowledge possessed by more experienced coworkers will almost certainly be necessary, first to determine if what looks like an error is really an error, then to determine whether something needs to be done, and, if something needs to be done, what that something is.
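
To make that distinction concrete, here is a minimal Python sketch of the easy half: flagging values that don't share the majority pattern. The function names (`shape`, `profile`) and the sample ID values are hypothetical and chosen only for illustration. This is one simple approach to pattern profiling, not a description of any particular tool.

```python
from collections import Counter


def shape(value: str) -> str:
    """Reduce a value to its pattern: letters become A, digits become 9,
    and every other character is kept as-is."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


def profile(values):
    """Return the most common shape and the values that do not match it."""
    shapes = Counter(shape(v) for v in values)
    majority, _count = shapes.most_common(1)[0]
    outliers = [v for v in values if shape(v) != majority]
    return majority, outliers


# Hypothetical ID values, invented for this example.
sample_ids = ["A1234", "B2345", "C3456", "12345", "D4567", "E-567"]
majority_shape, outliers = profile(sample_ids)
print("Majority pattern:", majority_shape)
print("Values to review:", outliers)
```

The output is only a list of values to review. Whether an ID without a leading letter is a typo, a legacy format, or a perfectly valid record from another system is exactly the question that still needs an experienced colleague.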

One of the chores many of our clients struggle with is a fairly basic one: managing access to databases. We say it's fairly basic because the actual task of granting or revoking access to a database, or to an application that handles data, is typically not complicated. But figuring out who should have what kind and amount of access can, depending on the data and the roles, be quite complicated, and staying on top of changing roles, new data elements, and employee turnover is a never-ending set of tasks that eats a lot of time.

It's easy to imagine ways AI could help with this work, such as agents that accept access requests, compare the requesting user's role to those of existing users, even ask a few canned follow-up questions, before taking some action. Maybe the action is just a recommendation: yes/no, or more information is still needed. Maybe that agentic action is more independent.
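
As a rough illustration of the recommendation-only version, the sketch below compares a requester's role to the roles of people who already hold access. Everything in it is an assumption made for the example: the `User` class, the `recommend` function, the `min_peers` threshold, and the sample names and roles. It uses a plain rule rather than an AI model, because the point is the shape of the decision, not the technology behind it.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    role: str


def recommend(requester, resource, grants, min_peers=2):
    """Return a recommendation only: 'approve', 'needs_info', or 'deny'.

    grants maps a resource name to the list of users who already have access.
    min_peers is an arbitrary threshold picked for this sketch.
    """
    holders = grants.get(resource, [])
    peers = [u for u in holders if u.role == requester.role]
    if len(peers) >= min_peers:
        return "approve"      # enough people in the same role already have access
    if peers:
        return "needs_info"   # some precedent, but not enough to decide
    return "deny"             # nobody in this role has access today


# Hypothetical grants, invented for this example.
grants = {
    "finance_db": [
        User("Ana", "budget analyst"),
        User("Raj", "budget analyst"),
        User("Lee", "registrar"),
    ]
}

for person in [User("Sam", "budget analyst"), User("Kim", "registrar"), User("Pat", "advisor")]:
    print(person.name, recommend(person, "finance_db", grants))
```

Note that a rule like this can only repeat precedent. If the existing grants are wrong or out of date, the recommendation will be too, which is why the complicated part remains deciding who should have access in the first place.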

We use AI at IData, each of us in different ways. The tool where we write and share these blog posts has AI features! We've started to add some AI capabilities to our products, and there's no reason to think we won't include more and more of them. Our use cases, which must be common, include improved personal productivity, speedier or more sophisticated coding (or both), more efficient processing of large tasks or large data sets, and similar. Our clients are trying to use AI to, among other things, make more data accessible and make data easier to understand, and the specific manifestations of those efforts vary quite widely.

There's widespread concern in the world about AI taking jobs, or making positions unnecessary. We're not qualified to speak to that idea, and as with many stories we'd probably fall back to the old standby observation that the plural of anecdote is not data. We've recently been introduced to the work of Matt Beane, who raises the concern that one of the consequences of widespread adoption of AI (along with robots and smart technology and other automations yet to appear) could be a sort of hollowing out of the workforce, as opportunities for skill-building, engagement, and professional growth might dwindle or disappear.

Fewer jobs mean fewer opportunities, of course, but as we understand this particular possibility, an additional potential issue is the quality, rather than the quantity, of work. AI could very well disrupt the important connections we form with co-workers, supervisors, and mentors. Because AI is so fast and versatile, challenging tasks that in the long run help us build skills and grow our knowledge base might never be assigned to us, or tasks that in the past contained a healthy complexity might be broken down into simpler components. If that complexity is what hones our problem-solving skills, and that complexity is abstracted away, what we're left with might be less valuable to us, professionally and personally.

Nearly all of our clients have a long history of hiring a junior analyst, and expecting them to learn on the job, via independent study, by taking on more and more difficult tasks, and by picking up some of the expertise of their more senior colleagues. Can we envision a situation where companies no longer hire that person because Claude or ChatGPT is able to provide users with data on request? Or where the person in that role never achieves the skills and engagement of their senior colleagues, because the conditions that support that growth no longer exist?

Just because AI can write a query across multiple languages and platforms, does that mean AI knows what it's doing when it writes that query? We're not the only people to observe that in some ways, adding AI to your analytics team is not all that different from bringing on a new data scientist, one who appears knowledgeable about data architecture, query writing, statistics, etc. Maybe in some cases they have some ability to generate visualizations, but they know almost nothing about what makes your business unique, and they have had no meaningful training on your data.

What are the problems that clients come to us with? A short list includes frustration with competing and incompatible definitions of data-related terminology; the inability to handle massive collections of reports and dashboards that are rarely looked at and only somewhat understood; the existence of multiple data sets feeding those reports, few if any of which have been curated and some of which are entirely unsanctioned; trying and mainly failing to maintain a patchwork system of access grants and security protocols; and frequently contending with a widespread suspicion that the overall quality of the organization's data is questionable, potentially even compromised beyond reliability.

For each of these problems, we can think of several potential uses for AI as it currently exists, to say nothing of what it might develop into. Setting AI loose on your data could expose the most critical weaknesses of your data foundation, it could show up the gaps in your data intelligence, and it could make it very clear where your data governance practices have been inconsistent or nonexistent.

But the notion that we could just create agents around these issues and set those agents loose fixing data problems seems fanciful, if not outright fantasy. We know from real-life experience how difficult it is to determine whether reports and dashboards are still being used, particularly if the number we need to review is in the hundreds or thousands. While it might not be entirely trivial, we can probably ask AI to review our analytics catalog (which isn't really a catalog, it's more just an agglomeration of various kinds of deliverables) to see if the same data element name is used in multiple products, but is calculated differently, for example. Then what?

Think of all the context still needed to decide if these discrepancies are meaningful, and if they're worth doing anything about. Is it really a problem if one office calls something FTE in an internal report, and another office calls something else FTE in another internal report, if those products never cross paths? What if that abbreviation shows up on a dashboard that hasn't been refreshed since 2020--is there reason to think that the office in question is still even using that language? What if, because of other variables, the way we derive or count a metric varies, but those variations are legitimate? Several humans would probably be needed to review that issue prior to resolving it, whatever that looks like.
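
For a sense of how mechanical the detection step is compared to the judgment that follows, here is a minimal Python sketch of the catalog check described above: finding a data element name that appears in multiple products with different calculations. The catalog structure (product, element name, definition), the `find_conflicts` function, and the sample entries are all assumptions made for this example. It compares definitions as text only, so two formulas that are equivalent but written differently would still be flagged.

```python
from collections import defaultdict


def find_conflicts(catalog):
    """Group products by element name and definition, and return only the
    names that carry more than one distinct definition."""
    by_name = defaultdict(lambda: defaultdict(list))
    for product, name, definition in catalog:
        # Normalize case and whitespace so trivial differences are ignored.
        key = " ".join(definition.lower().split())
        by_name[name.strip().upper()][key].append(product)
    return {name: dict(defs) for name, defs in by_name.items() if len(defs) > 1}


# Hypothetical catalog entries, invented for this example.
catalog = [
    ("HR headcount report", "FTE", "hours_worked / 2080"),
    ("Enrollment dashboard", "FTE", "credit_hours / 15"),
    ("Budget summary", "fte", "Hours_Worked / 2080"),
    ("Enrollment dashboard", "Headcount", "count(distinct student_id)"),
]

conflicts = find_conflicts(catalog)
for name, definitions in conflicts.items():
    print(name)
    for definition, products in definitions.items():
        print("  ", definition, "used in", products)
```

The sketch can report that FTE means two different things. It cannot say whether that matters, whether both definitions are legitimate, or whether either product is still in use.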

(Few, if any, of our clients have the luxury of hiring a new person or team and setting them the task of investigating and then beginning to resolve data debts like these. In this particular instance, AI would not be taking a job, because despite the value of the work there will more likely than not never be anyone to do it. At best, some resources might be devoted to this work when a new BI tool is adopted, or when a change to warehousing architecture is underway.)

Organizations across all industries are looking to use data to inform or drive their decisions. Even ones that could be described as data-driven would surely like to extend their abilities in this area. We've argued for years that poor data governance essentially ties one hand behind your back in this endeavor, but even organizations that do a pretty good job governing data don't necessarily find themselves consistently and productively utilizing their data. What too often seems to go unremarked is the quantity and breadth of decisions that organizations make about data, starting with what data to collect, where to store data, how long to retain it, whether to enrich it in some fashion, who is allowed to access data and under what circumstances, and what is to be done if a (potential) data problem is discovered.

Those decisions, and the institutional knowledge that drives them, are the things that determine what data your organization uses, and as a whole they provide the context for any business intelligence or analytics efforts you're making. For employees wanting to utilize data, the challenge might lie in how well they utilize a tool, and whether their skills are growing. The complexity might come in recognizing which data is relevant, and which missing data might be useful; it might also lie in figuring out the best way to display and share the results of analysis and exploration. The connections to other employees might come in the form of bouncing ideas off each other prior to running an analysis, or asking for a sanity check when the numbers seem too good (or too bad) to be true, or simply sharing new techniques.

Surely there's a role, and quite possibly there are many roles, for AI in this arrangement. We've suggested a number of ways it could be used, and undoubtedly those barely scratch the surface of what's possible. But, at least for now, AI won't curate your data sets, AI won't create entries in your business glossary, and AI won't know the backstory of your organization's data. Maybe when you've achieved a data governance framework, when you've transferred tacit institutional knowledge into an accessible, organized knowledge base--maybe then AI can start creating useful dashboards and cleaning up data quality problems and handling your data warehouse maintenance. We'll see. But until then it's probably a better use of the tool to put it to work building that framework and compiling that knowledge. And stay tuned. We'll surely have additional thoughts about how we can partner with you to get to that point!

We hope this blog post was of assistance to you and your organization. All our data governance and data intelligence resources (blog posts, videos, and recorded webinars) can be accessed from our data governance resources page. IData has a solution, the Data Cookbook, that can aid employees and the organization in their data governance, data intelligence, data stewardship, and data quality initiatives. IData also has experts who can assist with data governance, reporting, integration, and other technology services on an as-needed basis. Feel free to contact us and let us know how we can assist.

Image Credit: StockSnap_SNK7GS7JV9_togetherblocks_AISkillstogether_BP #B1317
