datamaster summit 2020

The Modern Approach to MDM: Emerging Technology for Data Mastering

 

Paul Balas

former Chief Advisor Advanced Analytics at Newmont

Paul Balas, former Chief Advisor Advanced Analytics at Newmont, discusses new strategies for replacing rules-based master data management (MDM) with modern approaches

Transcript

Megan LaFlamme:
Hi, everyone. Thank you so much for joining us today. My name is Megan LaFlamme, I am the Director of Product Marketing here at Tamr. And I’m so pleased to be joined by Paul Balas, Digital Transformation Executive. Hi, Paul, how are you?

Paul Balas:
Good, how are you Megan?

Megan LaFlamme:
Great. Thank you again for joining us. Today we’re here to talk about something that I think is top of mind for a lot of data leaders in the world today who are trying to leverage their data as an asset, and are looking for ways to better leverage their data mastering solutions and their data management ecosystem, and are curious about alternatives to their rules based systems that just have not been able to unlock the insights that they need to navigate this world that we are in today.

Megan LaFlamme:
So Paul has joined us, a great friend of Tamr, and he’s really going to talk to us about how he changed the rules on data mastering to accelerate analytic insights. And Paul’s held numerous roles leveraging data to drive analytic outcomes for large organizations. He’s been Chief Advisor, Advanced Analytics at Newmont Corporation, Chief Information Architect and VP of Business Intelligence at Digital Realty, and has also had a role as Executive Technical Architect at IHS.

Megan LaFlamme:
So Paul, thank you again for joining us. I’m really excited about this discussion coming out of our recent summit Data Masters, this topic was top of mind for a lot of people, so thank you for putting together this content. And I’ll hand things over to you.

Paul Balas:
Yes, thank you very much Megan, I’m very excited to be here and share some of my learnings with you. I’ve been doing data for quite a while now, a few decades, and I’ve had a lot of different experiences, so I want to share some of those with you around the topic of Master Data Management. I’ve implemented it a number of times, in a number of different companies, with a number of different types of data, so not just customer and product. And what I’ve learned is what to avoid and what to embrace. So, we’re going to talk about the past, present, and future. I’m not going to dwell too long on the past, we’ll spend a little time on the present, and we’ll try to spend more time on the future, and it’s all about MDM approaches. What have vendors been doing? Where did the market come from? Where is it today? And where is it going?

Paul Balas:
So, what is MDM? For those of you who aren’t familiar with it, it’s an abbreviation for Master Data Management. And it basically is cleaning up your key business data. So when you think about your business and how it operates, you think about things like your products and your customers, but there’s a lot of other data that’s critical, your general ledger, for example, you can have master general ledger data. Location data, for example, when you’ve got a very distributed company in multiple locations, you might have a financial view of your locations and you might have a physical view of your locations, and the two seemingly never shall meet but with MDM you can solve some of those problems.

Paul Balas:
And the way it’s described today is that it’s a partnership between technology and the business. And why is that important? It’s because there’s certain components in MDM that the IT team needs to help the business with, and certain components that the business can take control for themselves. And I have a bias, I like to see solutions that really focus on end user enablement. And with machine learning and some advanced technologies and development in the software arena for MDM, some vendors have really taken a step change in terms of that end user enablement.

Paul Balas:
So, this is an important adage, it’s, if you don’t have trust in your data, then you can’t really use it to drive company performance, because you’re always going to be second guessing the data. We see that in the public sector around COVID data where we’re not confident in the test numbers, the infection rates. And that’s just a macro view of your company, where a lot of people spend a lot of time arguing about the data, they spend a lot of time preparing the data in parallel, we call it parallel data play, and certain people in one organization might come up with a different answer than others. So trust is key and we’re going to talk a little bit about how to build it with MDM.

Paul Balas:
What should you achieve if you invest in an MDM solution? What’s best in class? The end result is that your stakeholders will have trust in their data. Your data stewards, who are engaged in cleaning the data and taking their intellectual property, your intellectual property, not theirs, about how the business rules work, how your organization works as it’s expressed in data, they have to translate that into some sort of system for you to capture and retain that IP. And it’s an investment in their time, so we have to value it. We need to align across teams, and MDM unto itself really won’t solve that problem. MDM and data governance are like peanut butter and jelly in the data world, they go together and you need one to be able to really be effective with the other.

Paul Balas:
So, we want to be able to spend more time analyzing data with trust and less time on data preparation. A whole class of tools for end user data wrangling has come into being because vendors recognize there’s this problem, it’s a productivity gap, and so we want to solve for that. And we want to get faster time to value. We’ll talk about how long MDM solutions have traditionally taken to implement and how vendors are really trying to spend their IP and their development dollars on shortening that time so you get value faster.

Paul Balas:
So let’s talk just briefly about the past. MDM came up, I don’t know if it’s the chicken or the egg, PIM or customer, but the first one that I ever worked with was around customer mastering, and the sales operations folks, and the SVP, and the chief sales officer, were all very excited about it because they wanted a 360 degree view of their customers. And creative people in the software industry said, “Hey, that’s a problem, we can fix that, let’s create match-merge and some other techniques programmatically that will solve for that problem, where we can really identify Paul Balas as the golden customer at our company, even though he may have interacted with us many times.”

Paul Balas:
And then product has a similar type of problem where if you’ve got a very large corporation, you buy a lot of products from a lot of different vendors, you need to have the catalogs and you need to have some sort of unique identifier for those products. But what was happening is people were buying products, they weren’t necessarily picking from the product pick list, and then there was other data in other parts of the organization that were capturing the revenue for the product and maybe the costs. So we’ve got maybe two, or three, or four different places you have to bring that data together. So those are basically the areas, the domains, where MDM began, and the solutions were really, at that point in time, not super great in terms of the level of accuracy, and they took a lot more work, and there were a lot of failures.

Paul Balas:
So let’s talk a little bit about the present. And what’s interesting is, when you listen to the analysts, Gartner, Forrester, and others who are practitioners in Master Data Management, they all say pretty much the same thing: it’s not the license cost that’s going to get you in MDM, the software that you purchase, it’s the amount of time it takes to actually implement it. And at one point, Gartner was estimating 4X, people costs at four times the cost of the software license. So reducing the time to implement is critical in driving the cost down.

Paul Balas:
Now, you may have a very good value case for your MDM, it might be tens of millions of dollars if you implement it, five million, a million. So maybe it can offset some of that cost, but we still want to be efficient with our hard earned dollars and how we spend it. So it’s not a silver bullet, the analysts also say this and everyone else who says this, it’s really a people powered exercise. So even though you’ve got software to help you, you really need a lot of engagement with people in your organization who have domain expertise. And the reason why MDM and data governance go hand in hand is because you might solve one domain, but unless it’s broadly shared and standardized through the organization through governance rules and standards and agreement, then it won’t be as effective and you won’t leverage the full value of your investment.

Paul Balas:
Why are analysts scared of MDM? We hear these warnings, warnings. And a lot of analysts in the past decade have made a lot of cases for different methodologies for delivering MDM faster. Well it’s not just a methodology issue, but that’s important. The MDM initiatives fail when organizations don’t ensure organizational readiness, according to the analysts. And they confuse what is and isn’t master data, it’s an abstract concept, what’s a business entity? What’s an entity? So there’s this new language and we try to simplify that language so we don’t make the business people learn a new language and can speak of master data in their own terms, a location, an asset, a fleet, so real world things that make sense within business context. But these warnings by the analysts are well earned, because it is hard if you don’t get people aligned, and we’re going to talk about how to more efficiently align people and enable them through the use of software.

Paul Balas:
So because it’s people, why is it people? Why are people the problem in any MDM program? Well, data hoarding is real, data protectionism is real, we’ve all heard that it’s my data, I control it, I will communicate it, I certify it before it goes out. So it’s a source of power for some people in organizations. And I’ve always said that data should be Switzerland, the Switzerland of data, it doesn’t pick one side or another, it’s just what it is, it’s a corporate asset. So you’ve had to have these techniques for engaging and working with people in MDM projects who have data ownership to participate and collaborate. Sometimes it can be done bottom up because they’re enthused and tired of all the work they have to put into the data, and sometimes they’re not, and you have to do some cajoling and maybe some top down. So there is this enterprise nature and organizational change nature to data because of some of these issues.

Paul Balas:
And then when you start to do more meaningful things in master data, and you want to share it broadly across the organization, what ends up happening is unique cross-functional teams come together to agree on what things are, because they have their own interpretations and really through governance you have to standardize that.

Paul Balas:
Executive support, we talked a little bit about top down. And the reason why the analysts and the consultants are saying, “You need this top-down support before you get started”, is for what I just alluded to, sometimes some people don’t want to get on the bus, and they don’t want to engage on the project. And they need that “you’re going to do this, we’re going to measure your performance in supporting this corporate initiative.” So that top-down can be very helpful and is almost always necessary.

Paul Balas:
The real reason that traditional MDM vendors don’t address this issue effectively and why the issue of getting people to do this work isn’t and hasn’t been too effective in the past and a lot of projects fail, is because it’s typically time consuming for these data stewards to deliver a solution within the product to get their IP to put the rules in place. It’s very painful to get good results on legacy technologies, because frankly, they’re not that sophisticated, so you have to apply more people power where the software isn’t lifting its fair share of the load. And then this creates this high burden on the organization and the data stewards, and so, if you’ve been in the industry for a while, we’ve got this tentative nature in how we approach these projects.

Paul Balas:
And if you’ve done this and you’ve experienced the objections from your data stewards, it goes something like this, I already have a day job, I need a team, I can’t do this, I can’t possibly do what I do and do this as well. Why can’t we hire someone? So there’s excuse, after excuse, after excuse, I’ve heard all these excuses before and I’m sure a few of you have heard some of them. And if we can really enable and overcome that objection, enable the data steward, overcome the objection of I don’t have enough time, then that would be a key benefit to any MDM initiative, and we’re going to talk a little bit about that.

Paul Balas:
So, every MDM vendor says, empower the steward. Yes, we recognize that the steward is the critical piece of these projects, and what you need to ask yourself when you’re talking to your vendor or you’re looking at a new vendor is, how long does it take to train a steward? And if they give you that “it depends,” no, they should be able to give you a timeframe, a low end timeframe for a simple solution and a max amount of time, and that’s to basically get the model trained, get those rules into the model so that you can get something of value out of the master data work.

Paul Balas:
How much of their day is actually fixing the data, right, or correcting the results? And that becomes key because it gives you what the operating cost is for the model, and it has a big impact on how you roll out. You may be able to, if you have a very effective solution, roll it out broadly to people who do have day jobs and they do have other purposes, and this becomes a very small part of what they do. And if it helps them out and they’ve got skin in the game, they get a benefit from their work, that’s the ideal marriage of the type of data steward to correct the business problems with the data entity that you’ve got to solve for. So vendors that can support that are very interesting because they’re enablers.

Paul Balas:
And then the idea of collaboration in this day and age is even more critical. So having an efficient set of activities, a process around the data to fix it and allow people who have different perspectives to participate in that workflow is pretty important. And the way a lot of projects in MDM and data governance start is we get together and we start defining what the rules are, and we define those rules, and we have to get agreement, we have meetings in person. Well, with some of the vendors, you actually can have collaborative workflows to do those definitional processes and then to fix the data together. So when Joe doesn’t understand how these two pieces of data should go together, maybe Sally does, and they can do it efficiently through these workflows and route problems to one another, and it’s a birds of a feather approach.

Paul Balas:
Every CXO says, “No more tribal knowledge.” Everybody wants the IP of what’s in their colleagues’ heads put into some sort of system or framework, especially in industries where we’ve got an aging population and people are starting to retire or they’re soon to retire. I’ve been in a few companies like that and it’s no joke. People have domain experience of 30 years of working at a company, and to lose that is a real loss in value. So we want to figure out efficient ways to get what people know into the system. And there are really two approaches to doing that today, there’s an old approach and a new approach. So I’m going to talk to you a little bit about the old and the new approach.

Paul Balas:
The other issue is that there’s business versus IT. And there’s a friction, there tends to be a friction in a lot of organizations where IT is being asked to do things that are not in their day jobs. So we’re introducing MDM to an organization, it’s new functionality, and the business users have to engage with those IT folks and they have to co-build the solution. I’ve got the requirements as the business user, I’m going to tell you what I need, and you’re going to go code. And so, that tends to be a very expensive cycle of development. It’s a lot of back and forth, it’s a lot of time, and that’s one of the key issues why we have this friction between IT and the business, IT has to build something for the business, the business doesn’t want to wait, natural tension.

Paul Balas:
So MDM tends to fail when your platform requires active IT involvement to make all the changes to the rules. We want to look for modern platforms that enable those business users to do as much as they can for themselves, and where we do have to depend on IT, we depend on IT. And depending on IT tends to be a very slow and frustrating experience and it can create some animosity in the organization. So, we really want to enable those data stewards because they hate manual data entry. So let’s talk a little bit about that and how we can improve their lives.

Paul Balas:
One of the techniques to improve their lives is Agile. And Agile is a great way for most types of development work to deliver incremental results in quick iterations with very fast feedback. And what it tends to do is it tends to reduce the length of time it takes before you fail. So you can fail more quickly or get it right more quickly because you’ve got a better interactive model for development. And it is a good thing for MDM as well, and in fact, it’s the approach that I always recommend when I engage on these projects. So it’ll increase your odds of success, but it can’t cure an archaic MDM legacy platform that really hasn’t done anything to modernize and take advantage of some new capabilities in the market.

Paul Balas:
Aging MDM platforms kill agility. So why do I say that? They’ve been developing this code for 25 years now, some of the vendors haven’t modernized their code and so it’s brittle. There’s a lot of technical debt in it, modern changes to improve the way the code is built, the efficiency, the flexibility aren’t there. So when you see a vendor who’s been in business for a long time, ask them how long their code has been in place. It can be telling, and if it’s been in place a long time, it may not be very good for you, the customer.

Paul Balas:
When IT is telling you as a business person, we need this complex methodology, we’re going to do a [Boli 00:20:14] Bot Framework, and I’m going to teach you in two weeks the new language. The business hates that, because they’ve got day jobs, and they’ve got problems to solve. So if there’s this process of learning or a process to implement change, it creates a lot of friction, and ultimately, it’ll create a lot of dissatisfaction. So think about modernizing your platform if you have one today. And I’m going to give you some tips and tricks for what to look for in a modern MDM solution if you don’t have one.

Paul Balas:
MDM agility killers are effectively where you’ve got procedural rules. So now we’re getting a little bit more into the nitty gritty. Procedural rules are if-then-else logic, and that’s the way 95% of the solutions out there implement logic. With the IP that the subject matter expert has, they tell you what the problem is: you can’t have somebody who’s older than 100 years old and on probation. It’s an if-then-else rule, it’s conditional, and it requires coding. So that’s a problem. Software that requires a lot of training, a new language, “yes, we’ve got three different tracks of training for our solution,” the business doesn’t have time for that. Changing rules is expensive and requires a significant amount of testing, so if your platform is not agile from a DataOps perspective or enabling end users and giving them immediate feedback in a production setting, then you’ve got these cycles of development and you have to test them. It’s very time consuming, very expensive, and very frustrating.
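
As an illustration of the procedural style described here, the following is a minimal sketch of a hand-coded if-then-else rule. The `PersonRecord` fields and the `validate` function are hypothetical names invented for this example; they mirror the "older than 100 and on probation" case from the talk and do not come from any particular MDM product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PersonRecord:
    # Hypothetical fields, chosen only to mirror the spoken example.
    name: str
    age: int
    on_probation: bool


def validate(record: PersonRecord) -> List[str]:
    """Return the rule violations for one record using hard-coded conditions."""
    problems = []
    if record.age < 0:
        problems.append("age is negative")
    elif record.age > 100 and record.on_probation:
        problems.append("older than 100 and on probation")
    return problems


# Made-up sample records for illustration.
records = [
    PersonRecord("A. Example", 42, True),
    PersonRecord("B. Example", 104, True),
]
violations = {r.name: validate(r) for r in records}
print(violations)
```

Every new business rule in this style means another branch in `validate`, followed by a code change, a test cycle, and a redeployment, which is the cost the speaker is pointing at.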

Paul Balas:
And then software that doesn’t make the data steward’s life easier. If they’re doing something in their spreadsheet and it works for them, they’re going to give you an excuse like, this is going to take me forever, I can’t do this. So it needs to be frictionless. And then there’s no mechanism for collaboration, we talked about how important collaboration is. You really need that way to work together when you’re not able to go to the office. And I think that’s going to be our reality for a while, and maybe it’ll be even more so in our new culture as things happen with this pandemic. So the other thing is when you want to add anything new. Why most people don’t have more than a customer or a product MDM is because it takes too much work. And so they don’t master a lot of their data. And they just stop with, okay, well, we’ll just do customer and product, no, I can’t take another project.

Paul Balas:
You need to be in The Cloud. If you’re not in The Cloud today, you need to be in The Cloud, and there are true benefits in terms of agility and time to value. And it really takes away the need to have a lot of people On-Prem to administer servers, all the different disciplines, you still need a few people to do that, but The Cloud will abstract some of the infrastructure management for you and make it a lot easier. And importantly, what it does is it allows you to scale up and scale down as you have needs and demands from a compute power and storage perspective, so that you don’t have to have long lead times in getting new infrastructure in place, because we just bought or acquired a new company, we integrated it into the warehouse, and I need five more servers. So you can just basically twist the knob in The Cloud, and you can scale up. And it’s real, it works very well, and it’s secure.

Paul Balas:
So best of breed versus integrated offering, this is one other concept. And some vendors’ strategy, they have everything, they have data discovery, they have Master Data Management, they have data quality, they have data catalog, they have integration, they have machine learning, they have everything. And that is not always a good thing, because certain problems where we see a lot of friction, we want to leverage innovation in the market. So I have a bias personally, I recommend this bias for you, and do your own exploration, around best of breed over platform.

Paul Balas:
IT likes platform because it sounds good, like, well, I just have to deal with this one vendor. But the fact is that if they have a large broad offering, they have different teams to support it, and they have different levels of investment in each of those components. So some vendors don’t even modernize their MDM, they haven’t touched the code base for a couple years. One very large vendor, who I won’t name, is exactly in that situation. And there are innovations in this market that you really need to be able to take advantage of and decouple from that large integrated platform offering.

Paul Balas:
So Multi-Entity Mastering is Now Table-Stakes. Your business is more than customers and products. There are certain products out there in the MDM world which do just customer, they focus on that, or they focus on product. And in fact, you want to be able to do different types of business entities in terms of things like your suppliers, your transactions, your parts, your work orders, you should be able to really govern and master anything in your environment that’s meaningful and core to the business.

Paul Balas:
So let’s talk about the future. Next-Gen MDM. What’s Next-Gen MDM? Next-Gen MDM is about machine learning and applying the power of machine learning to enable the data steward and improve the precision of the Master Data Management process. Machine learning is allowing match-merge logic to be trained instead of coded with procedural rules. It’s a more efficient path to get the data stewards to train a model by just saying, yes, that works, no, that doesn’t. It’s basically yes, no. And then the machine learns as you train it and you can get really high precision and accuracy with that model of learning. And in fact, I’ve proven that out myself personally, and many other customers who have used machine learning in this way have as well.
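
To make "trained instead of coded" concrete, here is a deliberately tiny sketch, not any vendor's actual algorithm. It assumes a single feature (string similarity from Python's standard-library `difflib`) and learns one cutoff from a steward's yes/no answers. The company names, `train_threshold`, and `predict` are hypothetical and exist only for this example; a production system would use many features and a far richer model.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

LabeledPair = Tuple[Tuple[str, str], bool]


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def train_threshold(labeled_pairs: List[LabeledPair]) -> float:
    """Pick the similarity cutoff that disagrees with the fewest steward answers."""
    scored = [(similarity(a, b), label) for (a, b), label in labeled_pairs]
    best_cutoff, best_errors = 1.0, len(scored) + 1
    for cutoff in sorted({score for score, _ in scored}):
        errors = sum((score >= cutoff) != label for score, label in scored)
        if errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff


def predict(a: str, b: str, cutoff: float) -> bool:
    """Assert a match when the pair is at least as similar as the learned cutoff."""
    return similarity(a, b) >= cutoff


# Hypothetical steward feedback: True means "yes, these are the same entity".
labeled_pairs = [
    (("Acme Corp", "ACME Corp."), True),
    (("Globex Inc", "Globex Inc."), True),
    (("Acme Corp", "Zenith Ltd"), False),
    (("Globex Inc", "Initech LLC"), False),
]

threshold = train_threshold(labeled_pairs)
print(predict("Initech LLC", "Initech, LLC", threshold))
print(predict("Acme Corp", "Initech LLC", threshold))
```

The steward never writes a rule here. Adding more yes/no answers and calling `train_threshold` again is the whole change process, which is the contrast with procedural logic that the talk draws.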

Paul Balas:
It allows the data steward to capture their rules simply with minimal effort, and it’s much more flexible and adaptable to changes. And I’m going to show you a graph about that, about TCO over time, and complexity to administer really is a differentiating factor in the MDM market right now. It’s also easier to sell internally. So if I have a way that I make life easy and a data steward can spend minutes a day or a week fixing any exceptions in the data, they’ll be in, and they have very few excuses, if any, not to participate in that workflow. And then the idea that legacy MDM vendors have to rewrite their current match-merge technology to leverage machine learning is a red flag for you.

Paul Balas:
So ML, as we talked about, is the next step: it’s allowing you to do match-merge logic, it’s allowing you to capture these rules, it’s more flexible and adaptable, it’s much easier to sell internally, and the vendors are having to rewrite their code. These are all the critical things with which you should be assessing any new technology that you want to purchase. The issue around ML versus Procedural MDM is exposed in this graphic.

Paul Balas:
So we have effort on the Y axis, and on the X axis we have time. So how much effort do we have to put into training or building this solution over time? And the gold line or the orange-gold line is procedural rules, if-then-else logic. The business user tells the IT person, this is what the rules should be, the IT person codes it, or maybe the business user can code it, but they’re still coding rules. The ML line is the green line. And that’s basically taking your data and saying, this is how the data should match up, go tell me what the rules are by validating my assertions. So the ML will actually assert what should be a match from its belief based on how you seed the training. And then you just inform it, you say, yes, that’s a match, nope, that’s not a match. What happens is over time, your ML model gets very sophisticated and your accuracy and precision go way up, your level of effort goes way down, because the model is dealing with all those exceptions, and it really enables the data steward.

Paul Balas:
When you get that orange line, you get a roller coaster. You go through this exercise of IT and the business users talking, implementing rules. Okay, we’re training the model, our accuracy is not very good. Okay, we’re improving our accuracy but the effort is going down, that’s awesome. Okay, we’ve deployed, we’re in production. We added a new data source. We’ve got to go through this whole rule process, we’re getting a lot of rules. Okay, the complexity is getting harder. Okay, we got through that hump, we have a business rule change. And then you can see where I’m going with this. Over time, this curve tends to rise with the complexity and the number of rules you’ve implemented in your framework. And if you’ve had one of these in place for a long time, you will in fact be nodding your head and you’ll be going, yes, it’s really getting hard to QA any changes we make and we’re very nervous about doing it in our framework.

Paul Balas:
So this whole thing time, time, time to value, more time analyzing, more time strategizing, less time munging data. And when you take the perspective of a business analyst or a data scientist, they spend 80% of their time, you’ve probably heard this before, fixing data before they can get it ready to analyze and build their models. And what certain solutions do is they turn that equation on its head. So ML can actually do that 80% for you, okay, maybe 75%, they can do a large part of that work in munging the data and fixing it, and you can spend more time on thinking about it.

Paul Balas:
So your MDM vendor may not have modernized their solution to this way of implementation, and it’s basically the innovator’s dilemma. And I’m going to quote Dr. Michael Stonebraker, who is quoting the Harvard Business School. Clayton Christensen suggested that when technology changes and you’re a vendor, that old stuff is very difficult to pivot from into the new stuff. And so when you’re talking about a vendor who’s been doing this for a while, again, you really have to be thoughtful about, how old is their stuff? Are they really taking advantage of the innovations in software development to enable our outcomes? It’s a very telling exercise to go through.

Paul Balas:
Who’s enabling the data steward? We made the case that ML is important, and are there any vendors that are using machine learning in their products today? Well, there is one vendor and I think there may be one other vendor who is doing it a slightly different way. But the one that’s solidly been embracing machine learning for many, many years now and has a lot of successes is Tamr.

Paul Balas:
Tamr is really a pivot on the old way of doing things, as we’ve described. I alluded to the fact that there’s this long development cycle in the old way, and it goes like this, it’s that rules based process where you find your source data, you identify your developers, you get your business input, so those back and forth discussions about what the data should look like. If you’re doing it right, you’re profiling your data and understanding what the data tells you in addition to what the business user is telling you, then you write rules. And then you get a QA person involved to QA that the software was built correctly. You do if-then procedural logic, then you sit down and review with the business users and they go, that’s not quite right. So DataOps to the rescue. Which is good, we’re going to put DataOps in to validate the solution. But this whole cycle in the old way of doing things is very long and it’s very painful, and it can take months and years to deliver one entity, one business entity for MDM. And your accuracy may not be that great, and it probably isn’t.

Paul Balas:
So the new way of doing things is Tamr uses machine learning. And it’s put it squarely in the spot of the workflow for the data steward, so that they can really take their knowledge and quickly and efficiently transfer it into the system. So we put the source data in the system, yes, probably IT, but we consolidate, cleanse, and categorize that data, it’s iterated and informed by data stewards who are experts, they can collaborate, E-collaboration, around problems that they may not know, like Joe says, “I don’t know how those parts should match, are they really the same thing? Tamr tells me it’s the same thing but it’s not positive.” And then he can field that over to John in Omaha, and say, “Hey, John, take a look at this.” John takes a look at it and goes, yes, those are the same thing, because he’s procurement in Omaha.
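
The Joe-and-John hand-off can be sketched as a small routing step. This is an assumption-laden illustration, not Tamr's API: the `MatchSuggestion` class, the `EXPERTS` lookup, the part numbers, and the 0.95 auto-accept threshold are all hypothetical names and values made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical lookup of which steward knows which domain.
EXPERTS = {"procurement": "John (Omaha)"}


@dataclass
class MatchSuggestion:
    record_a: str
    record_b: str
    confidence: float  # the model's confidence that the records match
    domain: str
    decision: Optional[bool] = None
    history: List[str] = field(default_factory=list)


def triage(s: MatchSuggestion, auto_threshold: float = 0.95) -> str:
    """Auto-accept confident suggestions; route uncertain ones to a domain expert."""
    if s.confidence >= auto_threshold:
        s.decision = True
        s.history.append("auto-accepted")
        return "auto"
    expert = EXPERTS.get(s.domain, "unassigned")
    s.history.append(f"routed to {expert}")
    return expert


def review(s: MatchSuggestion, reviewer: str, is_match: bool) -> None:
    """Record the expert's yes/no answer, which can later feed model training."""
    s.decision = is_match
    s.history.append(f"{reviewer} answered {'yes' if is_match else 'no'}")


# Made-up example: the model thinks two part numbers match but is not positive.
suggestion = MatchSuggestion("Part 4471-A", "PN 4471A", 0.71, "procurement")
assignee = triage(suggestion)
review(suggestion, assignee, True)
print(suggestion.history)
```

Keeping the `history` list on each suggestion is one simple way to retain who decided what, so the expert's knowledge stays in the system rather than in an email thread.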

Paul Balas:
So this idea is very efficient and very effective in the new world. And you get very high accuracy. I recently built seven different projects in Tamr with 90 plus percent accuracy, and I did it in weeks. So I’m a believer, I encourage you strongly to test it out for yourself. So when you want to select the right solution, think about these things because you’ve got a lot of choices to make and there’s a lot of vendors in the marketplace. And by the way, Gartner and other analysts predict, this is a $25 billion industry in the next five years.

Paul Balas:
The guidelines for vendor selection. Vendors you want to run from, from my perspective, are vendors that highlight technology features over proving value through example. Do they have customers and examples that resonate with business outcomes, and have they achieved those for clients, and are they referenceable? Can they do more than one or two domains? You need more than just products and customers. Can they offer those strong references to you? Are they 100% referenceable? Can you pick and choose? Show me people like me, always a great way to have confidence in the solutions you’re going to purchase. If they offer a fully integrated package, it is not necessarily a benefit, and so you have to be thoughtful about what’s most important to you and your organization in achieving your MDM desires. And then if they haven’t modernized their approach in a long time, they’re still crazy after all these years, you really need to think twice about those vendors.

Paul Balas:
So you want to run towards vendors that are focused on business outcomes. They leverage these modern approaches, the innovations in the market, and they simplify the life of the data steward. They make it easy to operate the system. Ask your vendor, how do I staff for this? And if you get a pretty long laundry list of people, then you might be concerned. One of the critical features in modern MDM vendor software is that the data steward trains the model and can do it simply.

Paul Balas:
So procedural versus ML, if you want to use something like Venn process analysis, you want to use other methodologies to determine how fast that data steward can actually do their work, it’s worthwhile. And so when you’re evaluating vendors, actually prototype the software, prototype the experience, day in the life of the data steward, it’s the most key thing you can do to get buy-in on any new technology. And if they can do that efficiently and effectively, they’re going to be happy, they’re going to nod their heads. And if you get a good result, then the sponsors of the project are going to be happy as well. That data steward collaboration framework is super critical, can’t underplay the importance of it. Allowing people to collaborate with purpose around data is the most efficient and effective way to fix your data, because it will require dialogue from people, it will require alignment and agreement.

Paul Balas:
If you have multi-domain MDM software where you can do more than one type of thing, more than product, more than customer, and you’ve heard me say this over and over again, it’s because it’s really important to be able to iterate through your different business assets at your corporation and truly leverage the value of your data, not just for customers or products, but for your operations, for your finance, for every area in your company, your HR. So the critical capabilities are what I believe you should focus on, and these are what they are.

Paul Balas:
So what are important capabilities? Well, there are some things about actually how you build a solution that are really important, like how you master schemas. So a schema is basically just a table, and you’re going to have a bunch of tables from different databases that have to all come together. So how do you map those schemas together? That’s schema mastering. And some tools actually make it pretty efficient to map those things together because they’ve applied some intelligence and smarts in the software to do what’s called a structural schema mapping. And it’s basically PROD ID, product ID, yes, those got to be the same column on these two tables. Let me map that for you and give you a recommendation. So that’s a very value added feature when you’re trying to construct things because it does take a lot of forensic analysis work to look at your data and start to understand even what it is.
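As a rough sketch of the "PROD ID, product ID" recommendation described above, the following suggests column mappings by normalizing column names and comparing them. This is an assumption-laden illustration, not any vendor's algorithm: the `ABBREVIATIONS` table, the `threshold` value, and the function names are hypothetical, and real tools typically also look at the data values, not just the names.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real one would be much larger.
ABBREVIATIONS = {
    "prod": "product",
    "cust": "customer",
    "desc": "description",
    "qty": "quantity",
}


def normalize(column: str) -> str:
    """Lowercase, split on punctuation and spaces, expand abbreviations."""
    tokens = re.split(r"[^a-z0-9]+", column.lower())
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens if t)


def name_score(a: str, b: str) -> float:
    """Similarity of two column names after normalization (0.0 to 1.0)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def suggest_mapping(source_cols, target_cols, threshold=0.8):
    """Recommend, for each source column, the closest target column."""
    suggestions = {}
    for source in source_cols:
        best = max(target_cols, key=lambda target: name_score(source, target))
        if name_score(source, best) >= threshold:
            suggestions[source] = best  # a recommendation for a person to confirm
    return suggestions


print(suggest_mapping(["PROD ID", "QTY", "warehouse"],
                      ["product_id", "quantity", "unit_price"]))
```

Columns with no confident counterpart are left out of the result, so a person still does the forensic work for those.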

Paul Balas:
Data quality is pretty critical. So being able to look at the data, have a profile of it as you’re trying to build your project is critical. You either do it manually or the software enables you. If the software enables you to do that, then it makes the process of having conversations about what the data is much faster, and it’s part of that workflow within the framework. Metadata management, almost every solution and master data that you’ll implement, you’ll want to enrich your data with external or other data in the organization. And it does a lot of things for you, it helps with your matching, it helps with your ability to categorize and view your data by things that are important, say Industry Classification of your customers, you might go to D&B for that. That’s metadata management.
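A data profile of the kind mentioned above can be as simple as a few statistics per column. The sketch below is an illustration using only the Python standard library; the function name `profile`, the chosen statistics, and the sample rows are assumptions, not features of any particular product.

```python
from collections import Counter


def profile(rows: list) -> dict:
    """Per-column null rate, distinct count, and most common value.

    `rows` is a list of dicts; None, "" and missing keys count as null.
    """
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v not in (None, "")]
        report[col] = {
            "null_rate": 1 - len(present) / len(values),
            "distinct": len(set(present)),
            "top": Counter(present).most_common(1)[0][0] if present else None,
        }
    return report


sample = [
    {"country": "US", "zip": "80202"},
    {"country": "US", "zip": ""},
    {"country": "CA", "zip": None},
    {"country": "US"},
]
for column, stats in profile(sample).items():
    print(column, stats)
```

Numbers like these give the business users and the builders something concrete to discuss, in addition to what people believe the data contains.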

Paul Balas:
Hierarchy management, the ability for you to model hierarchies that are relevant to your business. So what does our organizational structure look like? What does our physical or our legal structure look like? It’s a very important hierarchy that you can model in some of these solutions. Match-merge, golden master, all these capabilities within the ML world, there’s really only one vendor that I know, maybe a second one coming up where they use ML in this process. But if they’re just using procedural rules, then it’s not as good, it’s not as precise and accurate. And the match-merge golden master otherwise, is pretty much not a differentiator. So if you’re looking at vendors and you’re saying, well, you don’t do ML but I like you for these other reasons, then it’s probably not going to be a big differentiator in how they get to golden masters.
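To illustrate the merge half of match-merge, here is a minimal sketch that builds a golden record from a cluster of records already judged to be the same entity. The survivorship rule used here (most frequent non-empty value per field) is an assumption picked for simplicity, not a description of how any vendor does it, and the function name and sample fields are hypothetical.

```python
from collections import Counter


def golden_record(cluster: list) -> dict:
    """Merge matched records into one, keeping the most common value per field.

    None and "" are treated as missing. Ties go to the value seen first.
    """
    fields = sorted({field for record in cluster for field in record})
    golden = {}
    for field in fields:
        values = [
            record[field]
            for record in cluster
            if record.get(field) not in (None, "")
        ]
        golden[field] = Counter(values).most_common(1)[0][0] if values else None
    return golden


cluster = [
    {"name": "Acme Inc.", "phone": "", "city": "Denver"},
    {"name": "Acme Inc.", "phone": "303-555-0100", "city": "Denver"},
    {"name": "ACME", "phone": None, "city": ""},
]
print(golden_record(cluster))
```

The merge step is the same regardless of how the cluster was formed, which fits the point above that the difference between approaches lies in how records get matched.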

Paul Balas:
Core data catalog. So that’s that ability to then document what your data is so that you can share it broadly. It’s a nice to have. There’s a lot of catalog vendors out there who do a great job and they offer a lot of innovation in this area that you may want to consider and they don’t have to be from one vendor. Data modeling, the ability just to model your data, it’s a technical thing that the IT data architects do, it’s important but it’s not critical. And then analytics, if your platform offers some analytics about the quality of your data, that’s pretty important. Some way to measure the quality and the output of what you’ve got, if they’re trying to offer you an analytical project building capability like dashboarding, not so interesting. You probably already have Power BI, you have Tableau, you have Qlik, you have something else.

Paul Balas:
Nice to have things, master data governance, I would personally look outside to another vendor for the governance aspects, the catalog and discovery we just mentioned. Data integration, not critical. Probably most organizations who are investing in MDM already have some sort of integration technology, data sharing exchange, so the ability some vendors offer. Well, you can participate in our ecosystem and you can get data and share data, it’s not to your benefit, it’s to their benefit to monetize the product and create pull through for the demand.

Paul Balas:
Data syndication, your ability to then send your data, this is not a hard problem for IT professionals to solve, integration professionals. So if they really tout that feature, not too interesting to me, very specific use cases. Other considerations, hey, did we build it on graph? One vendor has a claim to fame on that, they have a great product, but it’s like, well, okay, what business problems can I solve and how fast can I get this thing built? And is it appropriate for all my different use cases? Big data shouldn’t be a feature, that’s a thing of the past in terms of marketing. Do they use a relational database? So forth and so on. These things really aren’t that important. If they fit within your ecosystem, great. If you’re in the cloud, you have fewer concerns about the management of these things, though it doesn’t eliminate the management of these things.

Paul Balas:
So let’s move on to what customers say about these products. And this is from Gartner Peer Insights, and powerful products, struggles to deliver. Good as a traditional all in schema first MDM platform, lagging on AI and ML, speaking to my points in this presentation. Overall comment, incredibly flexible, model everything, awesome. Overall experience has been okay. First implementation of MDM, it’s been highs and lows. Well they probably experienced some of the pain I just walked you through.

Paul Balas:
What makes a great MDM vendor? At this point, I’m probably repeating myself, but sometimes repetition is valuable. And a great MDM vendor is people working together efficiently so that they can transfer their expertise into the mastering of the data. And then that drives decisions within the organization that people trust. So vendors who do a great job respecting the time of the data champion and the people that participate in this workflow, are the vendors that you want to be with. And great MDM vendors provide smart ways for organizations to have conversations on live data and fix it without IT.

Paul Balas:
So thank you very much. I can talk for a long time on some other subjects, but I think there’s a few questions about this subject that people might be interested in finding out about.

Megan LaFlamme:
Yes.

Paul Balas:
So, Megan.

Megan LaFlamme:
Sure. That was great. Really great deep dive. And I think you touched so much on the cross-functional and agile approach, and better aligning these cross-functional teams from the business and IT side. So two part question, one is, who’s typically sponsoring such a huge shift in technology and process? And when you embarked on this journey, how did you sell this internally? Who had a seat at the table? What pitch did you give to get all of this executive alignment on such a big transition both in technology and in process?

Paul Balas:
Yes, so I could just speak from my personal experiences having done this a number of times. And the people who come to you, if you’re an enterprise architect or a business intelligence data warehouse person, often in the past were the sales and marketing people. So the sales and marketing people were my first project, they said, “I don’t know which customers are buying what from me, and I don’t know how much my sales are. We’ve acquired four different companies and we’ve never integrated the customer database.” And it’s like, okay, as an enterprise architect, I go, there’s a solution for that, it’s called MDM, and this is what we’ll do, and this is how much it will cost, and these are the people that we need. So it was really a business driven conversation.

Paul Balas:
And then after having done the first project and there was more awareness about Master Data Management in the marketplace, it was becoming more topical. Customer 360 became a thing, and then ERP vendors started going, hey, we need a product 360 because our ERP doesn’t do a very good job of mastering product data, and so that came along. So business people were starting to become aware of those things. But sometimes I go into a company and I would see this problem that required Master Data Management, they had no idea. So there’s this process of education, what is Master Data Management? How does it work? Why do you need it? And I got better and better at translating that into business value and focusing on business outcomes, and then the MDM as an enabler to that, and then really focusing on the people piece of it.

Paul Balas:
What I learned over time, as we had this, sometimes I was suggesting MDM, sometimes the business was demanding or needing it, what I learned over time is that it really came down to getting the data stewards engaged, and the data champions, and then the data governance. And I started having that be my first foot forward, is, yes, you want this, what we need is to have all these people engaged. And that became a barrier to entry and it became a barrier to delivering and convincing people.

Paul Balas:
Once the technology started getting better and better, it became a much easier conversation to have with people. Really, at that point in time, I’ll say seven years ago, MDM was really fairly mature, I won’t say really mature, but fairly mature, and a lot of people had awareness about it and the value that it could bring, so there was this buzz in the industry. So the conversations became much easier, however, project success always hinged on getting those data stewards engaged and understanding what they knew about the data and translating that to a system.

Megan LaFlamme:
I think, so you talked about the need for education around MDM at one time, and so I’m curious, at what point did you need to transition into education around machine learning, and did you get challenges or pushback with ML being this magical black box? And I think you really succinctly described it on this webinar around how it learns from seeded training data. But at the time, when you were embracing these new technologies, how did you navigate any concerns around machine learning technology and adopting new approaches?

Paul Balas:
So at first, there was one company, Tamr, that I tracked for about seven years. And they started introducing a [inaudible 00:48:07] to their product, and I thought, that’s an innovation, that’s interesting, because they’re solving that friction for the data steward. And that’s really the way I saw the product as a focus. And so, at first, when I became aware of it, there was really no other game in town and I’ve never been on the cutting edge of technology, I wait till ideas prove themselves out. And so through this tracking, I started seeing success, success, success, and I thought, okay, this is really good. But I didn’t really speak to people at first in terms of ML and solutioning, because it really didn’t have that brain trust that MDM was starting to have, and I didn’t want to confuse people or have them think that I was just excited about a technology for technology’s sake.

Paul Balas:
So I talked about time to value, right, enabling the data steward in those conversations. And then as I had opportunity actually to invest in Tamr in a product for a supply chain optimization project at Newmont Goldcorp, it was my first introduction to actually getting my hands on this, we did an evaluation, yes, Tamr came out on top, we looked at other vendors. And what I found was I could start talking about machine learning as a differentiating feature, and people were going, okay, I understand basically what ML is but I really understand this time to value concept and enabling the data steward, and yes, I agree that, that’s really one of the big barriers to entry and getting MDM to be successful.

Megan LaFlamme:
Yes, that’s great. And I’m curious on the flip side, what do you see as some of the biggest reasons for organizations staying the course with their legacy procedural based MDM solutions? What holds them back from investing in something else?

Paul Balas:
It’s really hard to walk away from sunk cost, even though sunk cost shouldn’t be considered in determining your next steps. So sometimes people have their credibility on the line because they’re the ones who proposed the original technology. They went through years of pain to get to where they are, and they don’t want to go through years of pain again. So it becomes an exercise in education, and enlightenment, and some persuasion to show people that there is really a better way, that there are some innovations in the marketplace. You can leverage these innovations and you can really change the experience that you had from one that was painful to go through, to one that is relatively painless to go through, and reduce your burden on the organization and get better value. So that’s really one of the big barriers to entry.

Paul Balas:
The other barrier to entry is for people who still are working in organizations that aren’t as high tech. Maybe they’re more large infrastructure oriented and the technology is more, and the advancements are more in the engineering piece of it rather than the data piece of it. So in those types of organizations, there’s an educational process of how you do this, you have to basically be able to prove and demonstrate that other people have been successful who are like them, it’s always a very powerful way to incentivize and get people to go, okay, maybe we could do that. So those are really the two classes of client that I’ve engaged with, and they have a different way in which we solve that problem on the barrier to move forward with new technology.

Megan LaFlamme:
Awesome. Well, I think we’re just about at time. And I really enjoyed our conversations over the last few weeks. And thank you for joining us at Data Masters as well. I encourage everyone who’s listening today to also take a look at Paul’s talk about the way that he has been leveraging this technology to uncover some of the challenges around COVID-19 data, and that session is available on demand on our website. And Paul, I hope you don’t mind me saying that I think you love nerding out over this stuff. So it’s been so great and I just wanted to open it up to our audience to reach out to you if they want to talk about data strategy and some of these other topics that I’m sure you’d be happy to dig into.

Paul Balas:
Yes, I love talking about this. Really enjoyed my partnership with Tamr, great company to work with, great people, they can walk their talk. So thank you very much, I really enjoyed this experience.

Megan LaFlamme:
Great. Thank you, Paul. We’ll talk soon.

Paul Balas:
All right.