Written by Toffer Winslow
Digital transformation initiatives have become the reflex reaction for big companies fearing existential threats from ‘digital natives’ — those disruptive competitors that leverage technology and data to fundamentally reshape the competitive dynamics of entire industries. It’s a well-intentioned response, but also one that exposes the data debt crisis that is the Achilles Heel of so many large enterprises’ efforts to reinvent themselves. Without a competency in agile data management (what we call dataops), the ‘activation energy’ to find the insights that power effective digital transformation is simply too high. Radically reducing the ‘cost to know’ needs to be a top priority.
Data scientists, the alchemists of digital transformation, are increasingly seen as an answer to this problem. They command jaw-dropping salaries, a reflection of both their skills and the expectations their employers have for them. Yet the fact that they spend the majority of their time (up to 80% according to some studies) on tedious, time-consuming data preparation work shows just how high the ‘cost to know’ is. For those who push ahead with a quest for answers to important questions, these high costs are seen as a necessary investment given the potential payoff.
A more insidious problem is the opportunity cost of not even trying to answer critical questions because of the expected sky-high level of effort in finding, assembling, preparing, and analyzing the necessary data. This is where the ‘cost to know’ becomes sand in the gears of digital transformation initiatives. These costs are highest when data is siloed and sits in innumerable sources with different formats and structures. Grand plans are stymied by an inability to gather enough data for meaningful analysis; worse still, those plans might proceed under political pressure with a small fraction of the data needed.
To succeed with digital transformation, companies need to build an agile data management capability that enables them to attack the ‘cost to know’ problem. This enables them to do three essential things:
- Supercharge data science efforts: Radically reducing the amount of time data science teams spend on data prep frees them to focus on the high-value work that is their specialty.
- Finally solve the ‘too hard’ problems: When the ‘cost to know’ is slashed by an order of magnitude, the projects that have been deferred for years because of the anticipated high cost and risk can finally be addressed.
- Respond to the unexpected: Digital transformation will never be a straight shot. Unanticipated questions and challenges will inevitably arise. An agile data management capability provides the capacity to respond effectively when the unexpected happens.
So how do you build an agile data management capability, you ask? The answer is bigger than any one company, any single technology, and even technology itself. It’s an approach that involves the deliberate coordination of people, processes, and tools, and one that borrows heavily from the ideas that made the devops movement such a transformational force in software development and deployment. We’ve written extensively about this subject, and our thinking is well summarized in a recent piece titled Building A Next Generation Data Engineering Organization.
Unsurprisingly, we think that enterprise data unification is a core capability of agile data management. And we’re not alone in that view. Gartner identified ‘Data Unification & Consolidation’ as one of four distinct components of what they call Agile Enterprise Information Management in their December 2017 Market Guide for Data Preparation (Gartner subscription required). And we both agree that Data Unification is an upstream complement to self-service data preparation capabilities that the likes of Trifacta and Alteryx have built into an important market segment.
Enterprise data unification harnesses advances in machine learning to solve the thorny challenges of combining, cleaning, mastering, and classifying the myriad internal and external data sources that are the root cause of the data debt that drives the high ‘cost to know’. You can learn more about Tamr’s unique approach in our technical white paper or, if you really want to get into it, read our patent.
Agile data management is the equalizer for large enterprises that aren’t digital natives. And enterprise data unification is a core capability in agile data management. Collectively, they reduce a company’s ‘cost to know’ the answers buried in its data and make it easier to deliver the insights necessary to drive successful digital transformation.