Data Masters Summit 2020

The Business Benefits of Establishing “Data as an Asset”

 

Sarah Gadd and Young Kim

Sarah Gadd Head of Semantic Technology, Analytics & Machine Intelligence, Credit Suisse
Young Kim Head of Digital Solutions & Group CDO, SK Group

Hear first-hand why data mastering at scale is critical to achieving their business goals, and how they are proving to the business the truth behind the phrase: data as an asset.

Join this roundtable discussion led by Anthony Deighton, Tamr’s Chief Product Officer, with SK Group’s chief data officer, Young Kim, and Credit Suisse’s Sarah Gadd, to learn more about why Tamr customers value data mastering at scale and what it allows them to accomplish for the business.

Transcript

Melissa Campbell:
Hello, and welcome to the second day of the Data Master Summit. I’m Melissa Campbell, Chief Revenue Officer at Tamr.

Larry Simmons:
And I’m Larry Simmons, Chief Customer Officer at Tamr.

Melissa Campbell:
Thank you all for joining us on day two of Data Masters. Yesterday, we heard from business and data leaders who are leading the way and leveraging data as an asset. Larry, what was your biggest takeaway of the day?

Larry Simmons:
I really enjoyed hearing business leaders solve real business problems using their data. I loved how they talked about increasing their topline and making revenues stronger, look at bottom line and supply chain and spend optimization. They reduce risk using data files they already have in their business. I also just like how they use real clean data to make solid business decisions. It’s refreshing to see that in the business decision making process. Melissa, what were some of your favorite sessions?

Melissa Campbell:
Hands down, the CDO panel. We had Alaina, Alica Conna from Dannon, we had Kathleen Maley from KeyBank, Barkha Saxena from Poshmark. It was outstanding. We heard what these fantastic data leaders are doing to enact change within their respective organizations. From reducing spending risks and delivering overall better business outcomes, the role of the CDO is crucial in these times of change, and these panelists have led incredible initiatives, and it was really, really a privilege to hear their stories.
We also heard from Paul Balas, who saw a real problem in the way coronavirus data was being handled, and he looked at how he and the technologies around him can fight the spread of the disease. Lastly, we also loved hearing from Mike Stonebraker and our CEO, Andy Palmer, live and chatting with the attendees. It was great. It was such a great opportunity to engage with our audience and get Mike’s unfiltered perspective on what’s working, what’s not working in data management today.

Larry Simmons:
I had a couple that were really outstanding. Mark Marinelli, who runs cloud and partner enablement had a couple of great sessions yesterday. One was seven components of the data ops ecosystem. He talked about the layering of the data ops ecosystem, the solutions they need, the people involved. It was a good multifaceted look at how they solve business decisions. The second was data ops architecture, keep human in the loop. One of the most incredibly important things in data, especially in regards to ML, is human feedback and having active learning play a role in accelerating data at scale. It’s crucial that humans have a role to play in data curation. Last, from Mark Alvarez, at Thomson Reuters. He did a great session on, don’t boil the data lake. As we all know, data warehouses have a ton of complexity and a ton of data, and you have to incrementally work your way through it and treat data like an asset, instead of a byproduct of the IT process.

Melissa Campbell:
Right, right. So thanks, Larry. By the way, if you missed any of these sessions, they’re available on demand now and through the platform and on Tamr’s website after the summit. So be sure to check it out. So with that, we’re ready to kick off day two.

Larry Simmons:
Very exciting.

Melissa Campbell:
It is exciting. So day two, we’ll focus on the key components of a scaled up data management in the cloud, and we’ll host breakout sessions for specific industries. So we have 14 sessions lined up, covering firsthand stories from Tamr customers on why data mastering at scale is critical to achieving their business goals, and how they’re proving to the business the truth behind the phrase, data as an asset.

Larry Simmons:
We also have several panels, with Blackstone, Santander, and other financial institutions who are using machine learning to more precisely manage their complex and diverse data [inaudible 00:03:19] systems.

Melissa Campbell:
We also have three federal focus sessions on data mastering for geospatial data conflation, best practices for data ops in the public sector, with the US Air Force and a panel on leveraging federal data as an asset, with the former CTO of the Obama administration.

Larry Simmons:
That’s exciting. What a great session.

Melissa Campbell:
That is exciting. So without reading off the entire agenda, I’ll just share that we’re thrilled to host Tamr partners, Fivetran and DataRobot, who will both be speaking with Tamr CEO, Andy Palmer, on making the most out of your data warehouse with AI and ML.

Larry Simmons:
I have to say, today is my favorite day of Data Masters.

Melissa Campbell:
Well, because we’re here hosting it.

Larry Simmons:
That’s true. The [inaudible 00:03:56] is jam packed with fantastic speakers who are really making a difference in the world, and it all involves data. With that, everyone, let’s kick off day two of the inaugural Data Masters summit.

Melissa Campbell:
Awesome. So if you have any questions or want to talk to a Tamr expert, please visit Tamr’s virtual booth. You can access that booth at the top right hand of the home page, and thanks so much and enjoy the day, everyone.

Larry Simmons:
Thank you. Enjoy the day.

Speaker 3:
Data Masters Summit 2020, presented by Tamr.

Anthony Deighton:
Hi everyone. I’m Anthony Deighton, Chief Product Officer here at Tamr. Thanks for joining this session, The Business Benefits of Establishing Data As An Asset. Now, if you’ve watched any of the earlier keynotes or presentations, you’ll have heard how, at Tamr, we believe data is a uniquely transformative asset for all businesses. If you haven’t already, I’d highly recommend taking the time to watch our CEO, Andy Palmer’s keynote on this very topic. Of course, watch this session first. But as everyone knows, talk is cheap, and if we want to think about a slogan like data as an asset, we’ve really got to get to proving to your organization the truth behind this idea.
In this session, I’ll have the good fortune to talk with two data leaders about exactly how they got to data as an asset. First off is Sarah Gadd. She’s the Head of Semantic Technology, Analytics and Machine Intelligence at Credit Suisse. We’re going to have a great conversation about data and its changing nature, especially in the financial services sector. Now, one of my favorite parts of this conversation is where Sarah tells us about the key missing ingredient in the popular adage, data is the new oil.
Next up is a conversation with Young Kim, who is the Group Chief Digital Officer at SK Group. Young is leading the change to turn data into an asset at one of the world’s largest conglomerates, with businesses ranging from construction to healthcare. We’ll get into a really interesting conversation about his journey to find and demonstrate data’s value in the 119 SK Group-affiliated companies. I really enjoyed these conversations, and I hope you will, as well. If the topic resonates with you, I’d love to start the conversation about how Tamr can help you turn your business’s data into an asset.
Well, thank you for joining us, Sarah. Welcome to Data Masters. I thought what we might start with, if you could share a bit more specifics around Credit Suisse, the work that you do, the structure of the organization and your role within it, that’d be lovely.

Sarah Gadd:
Sure. So Credit Suisse is a global wealth management company, we have around 48,000 employees globally. It’s an interesting model, because it’s more of the hub and spoke. We have one central division, which is called group CIO, and that contains… You could think of it as the foundation to Credit Suisse; things like IT, infrastructure operations are sitting within this kind of central area. And then around that, we’ve got the divisions. And the divisions themselves, including some of the corporate divisions, had their own IT areas, and so drive their own initiatives. From the central model, where I sit, in group CIO, we actually provide that framework, the enablement functions, the governance functions that help the bank broadly, and that the various different divisions can tap into.

Anthony Deighton:
Excellent. That actually brings up, I think, an interesting issue and sort of a theme I’ve heard in our Data Masters podcast and in my conversations with data-driven businesses all over the world. And it’s this question of like, centralization versus decentralization, what groups should do what function, how do we empower the business to get value out of data, to treat data as an asset, but then how do we create leverage and structure from the center, cost efficiencies, et cetera? I am sure this is a question that you struggle with and that Credit Suisse thinks about. Maybe you could share a little bit about how you approach that, and potentially even, how you’ve seen changes in the business from central [inaudible 00:08:42] central over time.

Sarah Gadd:
Yeah, that is an interesting… It’s one we’ve pondered quite often. So, you’ve got these different approaches, I would say, in different industries, and people have tried a number of them and figured out that they don’t work well. I think from our perspective, we have a group CDO office, and through our group CDO office, we have a data management framework, and that just focuses on certain pillars around usage and insights, around architecture, around governance, around security and around data quality. Each division then has a CDO that sits within that division, and each division, especially on the business side, has its own business strategy.
So the people who are going to be able to execute the value out of the data, which is a massively valuable asset, are going to need to sit close to the divisions and within the divisions and [inaudible 00:09:39] themselves, because they understand they’re the SMEs. They get what the data can do, they understand the data. But we need a central governance function to make sure that we’ve got the right framework in place and also the right tools to enable the simplified efficient access to quality data. It’s great we’ve got all this data, but if nobody knows what to find where, when, or it takes three months to be able to access the data, then that’s going to create problems in itself, and it’s very, very inefficient to the pipeline. So essentially, providing that structure, the tools, the governance, and providing the enablement then to the divisions to execute close to the business needs, I think, is the model Credit Suisse at least works with and operates.

Anthony Deighton:
I love that idea of enablement. What it sort of brings up is this idea that those close to the data, those close to the business understand the problems they’re trying to solve, the opportunities they’re trying to capture, and then the role of the center is sort of a lubricant in the machine to make sure that they’re operating as efficiently as they can and with the speed that they need.

Sarah Gadd:
And with the safety and security. We’re moving to a situation and have moved to a situation where data volumes are exploding, and we’re also looking more and more at combining external data with internal data and looking at alternative data sources. So you’ve got this exponential growth, and the traditional approaches to accessing data, a lot of which are role-based, need that uplift to start thinking about things in a different way. How do you entitle more granular access to data, more on an attribute basis? How do you protect it as you’re working in all these hybrid environments, whether it’s on-prem or public cloud or private cloud? How do you enable people to execute in different environments and platforms while maintaining security and integrity around the data and adhering to all the rules and regs around data privacy, data movements? There’s a lot out there. As a big investment bank or a big wealth management company, I should rather say, we have to adhere to regulations from a huge number of regulators globally, not just one. [crosstalk 00:11:51]. So we have to make sure we’re checking all the boxes.

Anthony Deighton:
You’re truly an international operation, and so… and it’s an excellent point in the sense that you’re not just… You have the risk associated with data leakage, if there’s a sort of negative risk, there’s also the regulatory risk, because you have to comply with regulators, and [inaudible 00:12:12], that’s global. But there’s also the other side, or the positive side of that, which is you want people to make decisions with the best data. So if the data is bad or dirty or stuck in silos, you just can’t make good decisions with that data, no matter how… In fact, you might lock it down so much that they can’t even get access to it.

Sarah Gadd:
One of the principles we have as part of our group CDO organization is focusing on this simplified efficient access to quality data. So a lot of time is spent in group CDO looking at, what are the right data sources to get the data from? So data, as everyone knows, gets propagated to make a data change when it’s going between different systems. So what is the master system? Whether it’s a distribution or a source for that data, what is that system? We capture that information through our central governance tooling, and what we’re working with now is being able to take the governance view, so in other words, what are the critical data elements? What are the security characteristics? Is it personally identifying? Who owns it, who manages it? And then actually linking that back to the ability to shop through a catalog for data, and then execute up actual analytics type environments in terms of sandboxes from that data that has been strategically given the thumbs up by the CDO in the division.

Anthony Deighton:
This is a concept that I’ve referred to as the data knowledge graph of the business, so this idea that if we understand… I think people tend to focus energy on understanding the data. What they forget is that there are people using that data, so there are people who do a good job with that data, people who do a poor job with that data. There’s data that’s good, there’s data that’s in poor quality, and the role of the central organizations is to find the bad data and make it better, find the best people that use that data and make sure they’re empowered with it to take advantage of it. It sounds like exactly the systems and process that you’re putting in place at Credit Suisse.

Sarah Gadd:
I love the Clive Humby quote from back in 2006, which is that data’s the new oil. But what people don’t realize is there’s a whole other part of that quote, which is, it’s only valuable if you can refine it. You have to change it into gas and plastic and chemicals to create that valuable entity, so you can drive a profitable activity. So you have to break your data down and analyze it to have value. Quite often, we all quote that data is the new oil, and then we forget about all the plumbing that has to go in to get to the new oil.
So the fact that we have massive amounts of data is fantastic, and that oil is sitting under the ground, waiting to be tapped into, but if you don’t curate it, you don’t govern it, you don’t understand it, you don’t know where the quality is, you don’t bless it, and then you don’t put the pipes in place, so that people… you commoditize it, so that people can actually access it in a safe way, then it’s just going to sit under the ground.

Anthony Deighton:
I couldn’t agree more. I think one of the things I often think about is in the context of oil, some of the great innovation of the last 100 years is automating the extraction and refinement of that raw material into refined pieces. It feels like in the context of data, machine learning is that refining technology. I’m curious on your views on machine learning and thinking about how technology innovation is enabling, to use your analogy, new types of data refinements.

Sarah Gadd:
No, it’s a super interesting area. Just because you can do it doesn’t mean you should do it, which sometimes we forget. We have to, especially in our seat, understand the balance between explainability, when we’re using machine learning, and the value that it’s going to bring. So there are certain cases, especially when we’re looking at data quality approaches and machine learning around data quality type approaches, where we can use pretty advanced algorithms to deal with data quality issues, or identify data quality issues. That’s okay if it’s not data that is highly sensitive or client-identifying, where if you are going to take approaches, machine learning approaches to help cleanse that data, or to aggregate that data, you have to be very clear about what you’re doing from a regulatory standpoint.
So even though the more advanced algos could bring you better results, you can’t necessarily deploy those, because you need that sense of explainability. That will keep commoditizing through time. We’re seeing more of the citizen data scientists now. So the tooling is becoming easier to explain, easier to use, and that will continue to grow, but we have to balance the technology with the needs to be able to explain what we’re doing depending on the data type. So it’s a really, really interesting space.
Obviously, within the bank ourselves, we use a mix of kind of homegrown and open source technology, as well as vendor products. Then there’s the fact that we sit on this mass amount of data. Credit Suisse is a company that’s been around for a long time. Being able to use that for training when they’re trying to apply machine learning type approaches, especially supervised, is fantastic, because it means we can use data that has been understood, been used at the bank, has been in the flow for a long time and is very specific to that function, that division, that specific business, versus trying to find that data from sources which really just don’t exist externally.
So I think we’ve got all the right pieces. We do deploy machine learning around certain data quality areas, but also being able to help with integration and identify where things should be tied together. Is this the same as that? Does it make sense? What does that golden record look like if you’re bringing five different sources together? And how you can use machine learning in those spaces.

Anthony Deighton:
I think that’s a… As you point out, machine learning is a tool. I love this idea of explainability, because it’s really on the interface point between the machine and the human, and whether that human’s a regulator or an employee or an executive and you’re trying to explain your results and know why something happened. If you can’t explain it, then no one will believe it. As you point out, even if it’s true.

Sarah Gadd:
Yeah, exactly. It helps build that trust. Having a data scientist explain it in data science terms is not going to fly. You need to be able to explain it in people terms that anybody can really sink their teeth into. In some cases, but there’s other cases where if it’s a non-sensitive, more commodity type data, some of the [inaudible 00:19:34] data type of field, where you can use very interesting methods to do reconciliation data quality, because that’s… You’re just looking for the results. [crosstalk 00:19:44].

Anthony Deighton:
The underlying data is that data, so if you can make it better, that would be… Yeah. Turning to people for a minute, and continuing this theme of sort of explainability and the human to machine interface point, much like when the car was introduced, there were people who were scared by that innovation and sort of protested that change. Do you find that there are groups of people, types of people that are skeptical of the idea that data’s an asset? Do you find those people riding horses when you’ve invented a car?

Sarah Gadd:
No. In terms of the concept, data is an asset, I think that’s been embedded in DNA, especially within our industry forever. You’ve got people, whether they’re reading stuff off paper, getting information from newspapers or looking through Excel spreadsheets, it’s all been data, and there’s data they’ve needed to ingest to do their roles, and that’s pretty much in every role across the bank. But I think where the fear factor can come in is that trust. So now we’re taking a new approach, how do we make sure that that is the right approach that is secure?
Especially with some of the AI approaches, we have to look at things like ethics and bias, and some of the more… the terms which are very specific to AI and machine learning, which weren’t really around when you were looking at spreadsheets. You didn’t really ask if your spreadsheet was ethical, or whether it was biased. So the board is changing, and change can be difficult. So I think generally speaking, people look to embrace change, but they also look to question it. Is this the right thing? I think that’s a healthy attitude to have. But certainly, I think everyone gets the data is an asset piece of it.

Anthony Deighton:
No, I think that’s actually a really nice way of putting it. They have the right starting, they start on the right foot, but they have different expectations and different questions that they ask than they might have about older technology. [crosstalk 00:21:59]. That’s a really important one. So, if I could shift a little bit and ask about… This [inaudible 00:22:12] brings together this conversation about machine learning and data and security, privacy, bias, it all comes together in the idea of the cloud. It feels like there’s this really important shift going on, cloud technology, highly elastic, scalable, et cetera. How does the bank, and how do you think about that as a platform technology?

Sarah Gadd:
Well, I think it’s an enabler. So we’ve seen, especially with COVID, companies like Zoom and the scalability that’s been seen there, online universities, virtual schooling. We’ve seen a tremendous increase in adoption, in some cases, forced, of cloud platform type technologies. What’s interesting to me is when… I run the big data platform at Credit Suisse, the commoditized big data platform. We started life there as a bare metal platform. As we implement new technologies, we become more and more platform-agnostic. So we don’t care where you’re running it, we just need to ensure it can run on these different platforms, whether it’s cloud, on-prem or private. And then you obviously have cloud-native.
So, I think their understanding, and especially from a data perspective, if you’re aggregating massive amounts of external data on cloud, along with your sensitive internal data, how do you do that safely? Going back to regulatory again, we’ve got very big regional jurisdictions around things like cross-border. So we can’t take Swiss data and just stick it in the rest of [inaudible 00:23:51]. Same applies to a lot of jurisdictions in APAC. So how do you deal with that on cloud? How do you make sure that your cloud data centers are in the right place? And then how do you govern that framework, and where do you govern it?
Do you really want to govern all your external data that you’re ingesting on cloud? That becomes quite challenging, especially if it’s not sensitive data. But you do need to figure out that if you’re bringing in sensitive type data that’s external, you need to know how it’s being used and what the end result is. So it adds complexity, but I think it’s an amazing opportunity, and it will help drive innovation. Going back to the citizen data scientist, you’re going to be able to create algorithms to address problems or to utilize algorithms to address problems and have no idea what that algorithm is actually doing under the hood. And then somebody is going to come along and say, “Well, how are you arriving at that decision?” And question that. So bringing that all together within a… almost like a machine learning development framework is pretty key.

Anthony Deighton:
I think that idea of the cloud as an enabling, as a sort of platform technology that is neither good nor bad, that opens new opportunities. Your point about the location of the data is the perfect example, because we think of the cloud as this magical thing. Even in the name, it’s in the clouds, it’s ephemeral, it’s… And it’s not. Like, there really is a data center somewhere, and the data really does sit somewhere, and regulators really do care where that data is physically. So-

Sarah Gadd:
[inaudible 00:25:35].

Anthony Deighton:
… it brings home that difference.

Sarah Gadd:
The cloud is not a fluffy white thing in the sky. It’s physical-

Anthony Deighton:
So that-

Sarah Gadd:
… hardware sitting somewhere. To be honest, I think that when you’re… Especially the younger generation, my daughter doesn’t think about cloud. She just gets on with stuff. She has no idea whether it’s sitting underneath my desk or whether it’s sitting in a data center in Switzerland somewhere. She’s just innovating, developing, pushing, doing her online schooling, whatever it might be, Zoom. Here we are, on cloud. So I think that the platform will become further and further abstracted, but within our environment, within a bank, in a regulated industry, as many other areas of the industry are, we have to worry about that stuff.

Anthony Deighton:
Last thing, just a complete left turn, but really want to get your view on it, is creating the data-driven culture, the ultimate people question, how you get people to think. How do you win the hearts and minds and build a winning culture around data at the bank?

Sarah Gadd:
Actually, this was something we kicked off a couple of years ago. So we created what we call the Analytics Community Forum. We decided that we weren’t going to do that with an IT focus or business focus, we were going to open it to everybody across all levels, across the entire bank, regardless of division. What we do is we do knowledge sharing sessions, we have individual groups present use cases, highlight successes, and we connect people together. There’s a lot of dot-connecting so people can reuse work that’s been done. That’s been very successful. People have really come together, and also started to see what you can do, what is possible, what people actually put into production, solutions that they found.
So there’s that piece of it. And then the other piece is being able to open up access to data. If you can commoditize that access in a safe way, it means that if I’ve got an idea, I can go check it out. I can go play around with something. I don’t have to wait three months when I’m then busy with something else and I get told I’ve now got access to that data. So that’s a pretty critical part of it. But the Analytics Community Forum, I think has helped drive the fact that, A, it’s a community, and it involves everybody. It’s not just people in certain functions within the bank, it’s really everybody across the bank. What have different people done in different areas, and how successful they’ve been.

Anthony Deighton:
So I love this idea that the way you create culture is you bring people together, and you actually create those networks of relationships. That maps so nicely to this idea of breaking data silos down and bringing data together creates new opportunities. I think this is a wonderful theme for the conference and for what we’re trying to get done here at Data Masters.

Sarah Gadd:
I agree. We’ll see this, I think, continue to grow out. We already see it with Google, for example. Many of the algos they develop are out there in the open source community. You can take them, you can train them on your data, come up with your results. But this community approach, which has been embedded, I think, in the data science community for a long time, we’re democratizing across data, as well. I’m sure you’ve seen with some of the data shopping cart-type tooling out there, you can actually take a community, “Hey, this is a good data set, and this is how you can use it” approach, which provides that human overlay, apart from someone saying, “Yes, this is quality, it comes from a strategic source,” which is more on the governance side. So it’s bringing the pieces together.

Anthony Deighton:
Brilliant. I think, again, the theme here is this idea of breaking silos, bringing people and data together, and then enabling new opportunities. Well, Sarah, thank you so much for your time and for joining us at Data Masters. It’s been a real pleasure.

Sarah Gadd:
Well, thank you. Thank you for the opportunity. Look forward to hearing questions and being able to [inaudible 00:30:08].

Anthony Deighton:
So welcome, Young. Maybe we could start a bit with some background and information on SK holdings.

Young Kim:
Thank you, Anthony. Thank you for having me today. So SK Group is made up of a number of different companies that belong to several different industries. I think the simplest way to explain what SK Group has is that the only industries it doesn’t serve are the financial and insurance industries. In everything else, SK Group owns major companies that are leading each of those industries in Korea, as well as expanding into the global business market. One example is SK Innovation, which is investing heavily in batteries. Batteries are one of the next large initiatives that SK believes in and that we’re embarking on. So they’re building a battery plant to serve Volkswagen and other companies out there in order to serve that particular industry.
So we actually have nine major companies that own nearly 120 subsidiary companies, and we’re the third largest group in Korea behind Samsung and Hyundai, companies that you’re familiar with. And then our motto is happiness. The reason our chairman has promoted happiness is that as we strive for excellence, a number of employees, as well as the business leaders, believe that happiness will consistently grow the business, including throughout this transformation journey.

Anthony Deighton:
I love that. It’s a really nice focus. So maybe we could start with you talking a bit about your plans around data and innovation at the SK Group, in particular, your desire to show how to be down to earth and achievable with your data goals.

Young Kim:
So interesting that you ask, Anthony. Over the years, we, throughout various IT systems and the number of products and services that we have created, generated a lot of data. Unfortunately, it was hidden in lots of different data systems, if you will. One of the things that we realized about three to four years ago is that the data must hold some sort of knowledge where we can actually go find a new source of innovation, if not a new source of business opportunities. That was one of the things that the chairman has driven, with his advisors, as well as a number of CEOs. The last two years, they’ve been really focused on, how do we uncover data?
But unfortunately, they didn’t have certain processes, if you will, to really uncover and expose this data to their benefit. So one of the things that we’re doing today is putting a certain level of lifecycle-driven principles, as well as practices, into these companies. Of course, we have a lot of different large to small companies that we need to serve, therefore we prioritize, making sure that their data sets, as well as their processes, are in place to execute on uncovering the data.
So if I may say one more item to kind of bring it down, Anthony, into that level, where, what is achievable in that we’re reviewing all the legacy data. We’re looking at what type of data that we’re looking at, as well as how it relates to the business, and we’re also validating some of those data and tying that into internal and external data to really make sense out of those data. That’s how we’re bringing it down, but we’re bringing it down into a small chunks as we go through this particular journey.

Anthony Deighton:
I love this idea that there’s this kind of hidden gem, this hidden asset inside every one of the SK Group companies, and that’s the data. Like, every business is, at its core, a data business.

Young Kim:
Absolutely. Of course, not everybody’s data sets are perfect. Don’t get me wrong, we don’t have a perfect, beautiful data set from the get go and say, that set of data is going to help us our business. Some of them don’t actually have any data. They thought they did, but they don’t. So we’re uncovering what’s reality versus what they thought they had in real cases.

Anthony Deighton:
I think everybody listening will empathize with this idea that there is no such thing as perfect data. Like, nobody’s found that perfect data. So I love that. Earlier, when we were talking, you had shared with me this analogy for how you think about data and how you think about engaging the organization and data, and you’d said that we need to build the horse first in order to get to ride it into the sunset. I love that. Maybe share with me a little bit of your thinking behind that.

Young Kim:
Absolutely, Anthony. I think it’s important. It’s important to manage how we deal with companies that has different maturity level when it comes to, one, looking at data, or at least having some sort of a data in their places, as well as through their transformation journey, that they’re leveraging their own data. What I meant earlier was that when we deal with multiple different company, unlike dealing with one set of organization, we thought we had had certain direction, if you will, certain ways we’re going to handle each of the companies, and where they are, meet them at the layers, if not level, where there need to be.
So we wanted to make sure that we start with the right process before we go in and do things randomly, if you will. So, we have built… As part of holdings company, there is another smaller company called C&C. We are the technical arm that helps and enables our companies when it comes to IT and data centers, and of course, all these transformation journey. When we sat down, we said, “We need two things. We need a set of principles, as well as handle different phases of how we deal with the data and uncovering and getting into decision making, if not a insightful information. And secondly, we actually need a shared technology that we could help our companies to drive their successful journey out there.”
So we wanted to prep those two, and then bring it out to our majority of companies, and that’s how we’re taking on this journey, so that we’re able to have a consistent experience. Of course, there’s a dynamic aspect of that aspect of it in that each companies, when they are different, we also adapt into their environment, so that we could have a successful outcome.

Anthony Deighton:
So it’s like you’ve created a framework, a flexible framework to meet the company where they are.

Young Kim:
Absolutely.

Anthony Deighton:
In that sense, it sounds like each of the companies in the SK Group works closely with you, at the center. As you pointed out in the introduction, the SK Group is a massive organization. So how do you think about them as customers, and what’s that sort of customer intimate relationship?

Young Kim:
The companies that we work with, closely, we actually built two things. One, we assigned a group of individuals that’s leading a… serving that particular company, and then we also built relationship. While we actually respect working with IT in doing the data journey, transformation or transformation by data, we have built a relationship, deeper relationship with their strategy office, as well as the business leaders, so that we’re able to serve and understand each of these companies and how they do their business within their respective industries. So again, just to summarize, we have a dedicated people who’s consistently working with them, being on site, and making sure that we’re in sync as we take this particular journey.

Anthony Deighton:
Got it. So you’re really thinking of the IT groups within each of those companies as your customer?

Young Kim:
Right. And we act as advisors or set of advisors, as well as we serve knowledge, as well as the shared tools, so that they are able to embrace these and accelerate on what they’re trying to do.

Anthony Deighton:
Yeah, supercharging their work. That’s great. Now, I understand you’ve begun some work, some sort of pretty big data mastering initiatives using Tamr, so we’re excited about that, of course. But maybe you could share with those listening some of the things you’ve learned as you’ve begun that work.

Young Kim:
Absolutely. Absolutely. One of the things that we’re using Tamr is to target the… working with SK Constructions. So construction builds companies or manufacturing companies, and they build a huge plants out there. In fact, they built a three football size, US football field size, a semiconductor building for SK Hynix. That’s the second largest semiconductor company globally. And then we’re also building, as I spoke about earlier, about the battery manufacturing plants, in Hungary, as well as the Georgia in US, and China. So as they build these buildings… Of course, they build a lot of buildings in Korea, as well. But as they build these buildings, we work with a lot of different workers on the manufacturing floors or building plant.
When we looked at Tamr, we thought it would be great to understand each worker who’s building these buildings, how they behave historically, as well as mapping into their roles and all of the necessary information that was spread out into different systems. The reason why we did this is because we wanted to figure out how we could protect their safety as they work in a more dangerous areas, as they work with big equipments, as well as areas that could be hazardous to our health. So, we applied Tamr to look at each of the source of data, the IoT data that’s coming in, and we wanted to make sure it’s centered around that particular worker and making sure their skillset, the location they work at, and their reactions, of course, we’re applying some of the video analysis around there to attach those to ensure that their safety is kept as they work on these construction sites that they belong to every single day.

Anthony Deighton:
Wow. That’s [crosstalk, inaudible 00:42:59] incredible. That’s such an interesting use case, this idea of health and safety on the workplace. It’s such an important issue and a wonderful focus. It’s really great. So Young, we’ve talked a lot about the SK Group structure and you have lots of these businesses working together. I would imagine that sharing progress is really important to making sure you’re getting early wins and sharing that progress. So how do you report progress to others in the business? And then even more importantly, what are some of the business benefits, the benefits that the organizations are getting from this collaboration?

Young Kim:
Great question. Anthony, one of the things that we do is we formed a collaborative team that consists of business leaders, as well as the strategy leaders that formed by the respectable companies, as well as our team provides number of resources to form a collaborative team. Within this team, as we define how each companies that we work with, how they would like to start that data-driven journey, as we progress into looking at legacy data, formulating it, cleansing it, analyzing it, and when we get to the sensible data, we actually share that with the group first we evaluated. And then that particular look and feel, if not through visualization, along with the report, goes to the CEO, and we play out that whole scenario, and then help the CEO to understand what we were able to uncover and what we can do in moving forward to formulate the next steps. That’s how we are able to communicate, share data, as well as looking at the right set of information to take the next steps.

Anthony Deighton:
Got it. So that’s really a strongly collaborative effort. I really love that you-

Young Kim:
Absolutely.

Anthony Deighton:
… involve sort of all levels of the organization, from the CEO down. Do you find that there are particular groups of people in the organization who are skeptical about the idea that data can be an asset, or is it the opposite, like, everyone knows data is available in a perfect state and ready to go?

Young Kim:
I wish I had the perfect answer for you, but because of the number of different companies, which you will find different set of leaders and knowledge workers, we actually have both types of folks, in that some are skeptical, some truly, truly believe data is the only asset, and in fact, some believe that it could actually be monetized and make money off of data. And then some are like, “Ah, we don’t even have data.” So there’s a lot of angles and a lot of viewpoints. However, one by one, we’re winning their hearts and showing them. But most importantly, if they don’t have the data or right data, we’re helping them to aggregate these data, but we’re also learning what source of data is out there, so that we can actually put it all together to help their businesses move forward.

Anthony Deighton:
Got it. I love that. So like a data factory-

Young Kim:
Exactly.

Anthony Deighton:
… mixing ingredients to come up with something special.

Young Kim:
Exactly.

Anthony Deighton:
So, if we think about it as a data factory mixing these ingredients together, is there… maybe, is there a secret ingredient you’ve discovered for turning raw data into a valuable business asset?

Young Kim:
Until recently, and when I say recently, Anthony, it’s very recent, government in Korea weren’t allowing us to share data across each companies. If you just imagine SK Group owns telecom, all the way to a company that builds a little wafers that you typically see on your motherboards, as well as memory chip, and it goes back to… And then vertically they own hotels, all the way to golf courses. So imagine amount of consumer data versus manufacturing data that you could collect along with construction and IoT. So, it’s unimaginable amount of data that these guys collect, and yet they weren’t able to crisscross and share data among each other’s companies. But the crazy part was, many of these companies, sort of same… a single consumer, if not, a single case where they’re able to find more assets, if not knowledge, around the data. But recently, we can [inaudible 00:48:07] number of these set of datas, so that we can actually cross-share to look at the more holistic information around this data. Now, to answer your question directly, we didn’t really have a secret sauce, but I think we’re just finding one as we speak, upcoming days.

Anthony Deighton:
Got it. No, I love that. The value comes when you bring these data together.

Young Kim:
Absolutely.

Anthony Deighton:
It’s nice to see that you now have been opened up or allowed to do that.

Young Kim:
And it’s a great opportunity, in my opinion.

Anthony Deighton:
So, we talked a little bit about the amazing work that you’re doing with data. When we think about Tamr, the real value is this machine learning based approach to bringing this data together. Maybe you could share some of the surprising insights that you’ve gotten as you’ve worked with Tamr, and just your perspective on Tamr, but also this idea of machine learning for data mastering at scale.

Young Kim:
It was January, it was cold in January, in Boston, as expected. When I sat down with Andy, your CEO, Andy Palmer and the senior leadership, I learned something. It was very attractive in the way Tamr leadership presenting themselves, explaining their approach to build a tool to master data for all of us as an enterprise customers. Two things that really caught my eyes: how you guys took a vanilla approach to a AI and really making it real, and second thing is involving all of the people who could be contributing into making that data element to be real.
I thought that was brilliant, in that mastering data is new, and at the same time, what you guys have done was modernizing it, making sure that there is a machine learning element that once the business users recognize that data and help them make that decision, you learn from that and you aggregate a number of those decision points and scaling it out to making sure that large set of data is coming towards to the business users as a single of a entity, so that they are able to look at and assemble master set of data, very simple and easily moving forward. I thought that was brilliant. That’s where it got me started, and we thought, knowing how much data set that is [inaudible 00:51:06] out into a number of different systems, something like Tamr solution could really help us to accelerate, to create that dataset that we needed and looked for.

Anthony Deighton:
Excellent. This idea that we… There’s a human element to it. It’s not-

Young Kim:
Absolutely.

Anthony Deighton:
… just a machine, but you need the human input, seems to map very nicely to SK Group and the way you think about the problem.

Young Kim:
Absolutely. That’s really helping us already in the SK Constructions. In the networks, we actually have number of consumer-driven business to identify and involving the knowledge workers to drive number of decisions, and that, the social and human aspect of that was brilliant.

Anthony Deighton:
We talked earlier about, how do you get people involved and how do you get people to-

Young Kim:
Exactly.

Anthony Deighton:
… see value? It sounds like one really important dimension of that is actually having them put their energy and put their time in and put their opinion into their system.

Young Kim:
Absolutely. The outcome from that is we’re able to see… we’re seeing accelerated ways we mastered these data for these businesses, and the time reduction is amazing.

Anthony Deighton:
By involving more people, but supporting them with machine learning, you achieve results faster.

Young Kim:
Absolutely.

Anthony Deighton:
Excellent. Well, Young, thank you so much for your time and sharing your insights with the audience. We certainly look forward to many years working successfully together.

Young Kim:
Absolutely, Anthony. Thank you so much for having me, as well as being a great partner to SK Group.