In the quest to become truly data-driven, organizations must bridge the communication gap between non-technical business leaders and technical data experts. Enter Professor Joel Shapiro, a renowned figure in the field working to foster collaboration between these two groups to effectively match business problems with data solutions to achieve shared goals.
Sounds easy? Not for the vast number of businesses overwhelmed by the sheer volume and complexity of data. For a data scientist, deciphering how to translate a business person’s needs, communicated in jargon-filled language, into an insightful model or actionable insights can be quite a challenge. And the big question is: how does all that jargon get converted into a set of data and analysis that makes sense?
Tune into this episode of the Data Masters Podcast with Joel Shapiro, Clinical Associate Professor of Data Analytics at Northwestern University’s Kellogg School of Management and Chief Analytics Officer at Varicent. Joel dedicates part of his time to executive education, working with MBA students and business leaders, helping them translate their business problems into actionable insights through data. Simultaneously, he guides technically-minded graduate students aspiring to become data scientists, equipping them with the skills to become valuable partners in their organizations. The common thread in Joel's work with these two groups is bridging the communication gap between the business and data teams, enabling them to collaborate effectively and achieve success through data-driven problem-solving.
Intro/Outro - 00:00:02:
Data Masters is the go-to place for data enthusiasts. We speak with data leaders from around the world about data, analytics, and the emerging technologies and techniques data-savvy organizations are tapping into to gain a competitive advantage. Our experts also share their opinions and perspectives about the hyped and over-hyped industry trends we may all be geeking out over. Join the Data Masters podcast with your host, Anthony Deighton, Data Products General Manager at Tamr.
Anthony Deighton - 00:00:39:
Welcome back to another episode of Data Masters. Today we have the distinct pleasure of hosting Professor Joel Shapiro, who teaches at the Kellogg Graduate School of Management at Northwestern University. He also serves as the Chief Analytics Officer at Varicent. Joel's career has been nothing short of stellar, seamlessly bridging the realms of academia and the ever-evolving demands of the business world. His insights are widely sought after, and you can frequently find his work featured in publications such as Forbes, Harvard Business Review, and CNBC. Welcome to the podcast, Professor Shapiro.
Joel Shapiro - 00:01:15:
Yeah, thanks for having me. Happy to be here.
Anthony Deighton - 00:01:18:
It's also nice to have another Northwestern person. You're our second Northwestern guest, either graduate or, in this case, academic. Being a Northwestern alum myself, I clearly have a bias in looking for people to join the show.
Joel Shapiro - 00:01:31:
Nice. Actually, I'm an alum too. I went to law school there, even though I no longer practice law. That was a short-lived moment of my career.
Anthony Deighton - 00:01:38:
Excellent. Double whammy. So at Northwestern, you teach both MBA students at Kellogg and master's in analytics students at McCormick. And those are very different populations, but they're also kind of good archetypes of the kinds of people that many of our listeners have inside their organizations. Not to mention, of course, those are exactly the people who are going to join many of our listeners' organizations upon graduation. Maybe you could contrast the experience you have teaching these two different groups, and also the differences you see within the groups and how they approach the challenge of working with data.
Joel Shapiro - 00:02:17:
For sure. So most of my time is spent with the MBA students. It is the business leaders, aspiring leaders, a lot of executive education with business leaders. And with them, their focus is understandably on solving problems. They've got business problems, business opportunities, and that's where their head is. Unfortunately, sometimes when you start talking data with them, they sort of lose sight of the fact that first and foremost, they should be thinking not about the data, but about their business problems. And then the other side of it is the data expertise. So I get the great privilege. It's only one quarter each year, typically winter quarter. I teach in that engineering school program that you mentioned. And it's super smart, very technically minded, really skilled graduate students who are training to be, most of them say data scientists. That's sort of their aspiration. And it's the exact opposite, right? They're not really focused on the business problem. That's not what they're learning. They're focusing on a set of technical skills to analyze data and do data science. And so whereas with the MBA students and the business leaders, I'm trying to help them figure out, like, how can you take your business problem and figure out how data can bring insight to it, help you solve it, with the data experts, I'm all about how can you make sure that you're actually valuable to a business? Because otherwise you get this case of people graduating with these amazing technical skills and then out there just sort of like saying, I got great technical skills, but the business team doesn't really know how to use those technical skills. And so the common thread with the two of them is how do you match up business problem with data solutions, but the way in which they sort of work together and you get that sort of synergy, that's, I guess where all the magic happens, right?
Anthony Deighton - 00:04:07:
Yeah. And interestingly, the two groups we're talking about here, one are, let's broadly call them technical, and the other are, let's broadly call them business. As a math and statistics undergraduate major myself, I'm curious if you've seen people studying, not in McCormick, studying engineering and the technical side, but studying stats or math. Is that a degree that's less relevant, or do you think we should see more of those people?
Joel Shapiro - 00:04:34:
Look, to me, any kind of technical quantitative domain right now is needed and still very much in demand. Sometimes those kinds of skill sets will turn into more basic science kinds of inquiries, and we need lots of really good scientific thinking and research. Sometimes it's much more applied in the data science realm. There are some really good data science tools out there, some that work just at the press of a button, but I'm a big fan of sort of understanding what goes into them in order to not just make good use of them, but also I think we're sort of training people to build the next version of those same tools to make sure they do what we want them to do. So if you're asking me, do I think we need more of the technical domain across all the quant stuff? My answer is always yes.
Anthony Deighton - 00:05:23:
Yeah. No, it's interesting. I think it feels like in many cases, the technical side is focused on how to install and manage the technology over what the actual algorithm is doing under the covers. And in many cases, that leads to really poor outcomes because you don't really understand the math of what's happening. So look, inside every organization of the people listening to the podcast, I think every one of them would agree that the volume of data that they have and have available to them has increased significantly, as have the sources and destinations for those data. And organizations increasingly want to become data-driven, and they have whole strategies. CEOs are talking about becoming data-driven. And something we've talked a lot about on the podcast is how data teams can help organizations achieve this goal. And you have a lot of really practical thoughts on how to help these technical teams, these data teams, engage their organization and become data-driven. And frankly, how to avoid some pitfalls. Maybe you could share some of that.
Joel Shapiro - 00:06:26:
Yeah, I'd be happy to. So I had this funny thing happen. It's probably like seven or eight years ago now. And actually, I got to rewind even before that. So my wife is a sociologist at Northwestern. She's a professor at Northwestern. And she used to study why people did not vaccinate their kids. Now, this was pre-COVID. All the vaccine stuff was very different back then. And she was really interested in how people made decisions around whether to vaccinate or not vaccinate their kids. And she did a lot of research on this. And the story that she heard from a lot of people about why they didn't vaccinate their kids was always, look, this is my baby. This is my child. Nobody tells me what to do with my baby and my child. It wasn't so much about data or evidence of effectiveness. It was just very much sort of this emotional connection to making decisions around their family. She'd get very frustrated then when public service announcements would try and get people to vaccinate their kids by just pushing more data at them. Sort of put that aside. About seven or eight years ago, I'm doing this pretty big project with a financial services company, and I'm getting ready to go on stage and talk to a bunch of people about data. And this one guy, the head of product for this one product line, pulls me aside, and he's like, look, I'm glad you're here, but I got to be honest, the data team is kind of pissing me off. And I'm like, okay, what's going on? Which is kind of a weird thing to tell me as I'm right about to walk out on stage here. And he says, look, I'm an expert in this area. And the data team keeps trying to tell me what to do differently. Like, this is my baby, and I know how to make my baby work or something like that. Right? The language was incredibly similar. Right? And in that moment, I was like, oh, my God, the people who provide the data output have a much bigger burden and responsibility and opportunity than just providing output.
They have to convince and they have to influence and they have to collaborate, because if good information and good models are being seen by a leader as "they're trying to tell me what to do," that's just going to be a disaster for everybody. And so that was sort of the genesis of me sort of thinking about how do you make these functions, the data and the business, how do you make them work better together and not have it seem like one's trying to tell the other one what to do, but really collaborating and driving outcomes. So that was sort of this moment for me. And I've sort of spent some time with a lot of data leaders and I run workshops and I get a lot of feedback on what exactly it is that data teams can do. And yeah, so I've sort of come up with some ideas about what are some of the ways that collaboration and value can be enhanced by the data teams for sure.
Anthony Deighton - 00:09:10:
I love that analogy because the idea there is that no anti-vaxxer is trying to do the wrong thing, quite the opposite. They're trying to do, from their perspective, exactly the right thing, as business leaders are also trying to do the right thing. But they approach the problem emotionally. So yeah, let's get into some of these practical suggestions, how the data team can bridge that communication gap and really help make that business data-driven.
Joel Shapiro - 00:09:36:
For sure. So when I'm helping data teams, I sort of frame this as developing the next generation of data leaders. And I think there's a tendency for data teams to be given a project and then they go heads down and try and solve it, which is not necessarily wrong, but I think it's a little bit of a missed opportunity. When I think about how data can be valuable, I always like to start with empowering data teams to be problem spotters, not just problem solvers. And I think that that's important because oftentimes the people who understand the data in an organization have a unique perspective sort of holistically, and they can see trends and phenomena and missed opportunities in ways that a lot of other people cannot see. And so when the data team isn't just solving the problem from the business team, but is actually bringing and surfacing those problems to them, I think that's like an important first step. It's empowering and it also helps the business team. So that's where I like to start all of this stuff.
Anthony Deighton - 00:10:39:
That's a disruptive thought in its own right, because I think the common wisdom is data teams should go to business teams and ask them, well, what problems do you have that we can go work on? As opposed to coming and presenting problems that they have identified by looking at the data. Is that what you're saying?
Joel Shapiro - 00:10:56:
Yeah. I mean, look, I would prefer that a data team not find the problem and then devote the resources to solving it without first going to the business team. You got to go to the business team and say, this is what we're seeing. Is this a business problem worth solving? Because it is that business leader who makes that call. But when I was first developing some teaching content and speaking notes around this kind of area, I did what everyone does. You Google it and you see what else is out there. And the common thread out there is don't be a problem spotter, be a problem solver. Nobody wants their team to just bring them problems. Or the worst adage ever: the manager who says, I don't want anybody on my team bringing me a problem unless they know how to solve it or they tell me how I can solve it. I think that's horrible because I would love as a manager to have a smart data team saying, we're seeing this. Is this a problem? I think it broadens my view, gives me greater opportunity. So for sure. The second thing, and I would sort of argue the most important one and one of the hardest ones to do well, is what I'm going to call problem scoping. So once there's sort of an identified business problem, the data team has got to put it into some sort of form that is analyzable, or at least one where some of their data work or analysis is actually going to meaningfully get at that business problem, whatever it is. And that, on the one hand, sounds super obvious, of course. But I've experienced this myself. I've seen other people experience this. When a business person tells a data person what they're looking for, it's often jargony. It's hard to figure out how that's going to translate into some sort of insight or model or whatever it is that you're going to do with the data.
I mean, you get people saying things like, hey, I need you to work your data magic to figure out how our go-to-market strategy is going to leverage our unique place in the industry or whatever, something like that. And you're like, okay. I get all of those words, but I don't know.
Anthony Deighton - 00:12:54:
And so how does that translate into a set of data and an analysis I can produce? Yeah.
Joel Shapiro - 00:12:59:
I mean, I like to give people like a little scoping document. And one of the most important questions on that document is: what measure, if it improves, is evidence that we've sort of lessened this problem? And forcing people to sort of pinpoint that metric, it's really hard for them. But until they can do that, look, my statistics brain is thinking about the left-hand side of the equation or whatever it is. But what are we talking about here? What do we need to see evidence of to feel comfortable that we solved the problem? And almost no business leader that I have met speaks in data terms that immediately lend themselves to an obvious answer to that. So it's on the data expert to sort of tease that out of them.
Anthony Deighton - 00:13:43:
Yeah, I love this idea that they're actually, in a way, speaking different languages. And it's not that one language is right and one language is wrong, but you put the onus squarely on the data team to be this translation layer between, I don't mean this pejoratively, although it's going to sound that way, business speak and data speak. Is that fair?
Joel Shapiro - 00:14:06:
Well, look, when I'm talking to business audiences, I try and put the onus on them. And I say, it's not fair to be ill-defined in what you want. And I educate them around data science, and hopefully they get better at it. But I just foresee them not being great at it in a lot of contexts. And so it's the data person who has to leave that conversation and go do the work. And so I want the data person to ask as many questions as necessary before saying, okay, I'm comfortable knowing what I need to do next. And that's hard because sometimes those conversations can get kind of lengthy. You want to ask a lot of what-if questions and the business person is like, why are you asking me all these questions? This is your job. We hired you because you're an expert in this. But it's an important skill to make sure you're really comfortable with what you're about to go do. Because if you go spend two weeks, two months working on a project that you weren't clear on to begin with, that's just no good for anyone.
Anthony Deighton - 00:15:04:
Well, let's talk about that. So great. Now we've scoped the problem. We've put the onus on the data team to understand this business speak and translate it into a project with some data that they can actually do something with. Now, do they finally get to go back to their office for six months and do the work and come back and present the results? Because that's where I'm comfortable as a data person. I prefer to just hang out at my office and do analysis. That's what we should do.
Joel Shapiro - 00:15:28:
Yeah, look, I mean, so you can start for sure, but if you're disappearing for six months, then that's problematic for sure. So it's funny. So of all the different things that I talk about, another one is this notion of problem shepherding. And that's sort of what you're alluding to right now. So problem shepherding is making sure that you don't go close your door for six days or six weeks or six months or the duration of the project, but that you check in periodically and you say, hey, this is what we're learning so far. A lot of what-if questions again: what if we learn that the end result was this? How would you feel about that? How would you act on that? Do you think we're barking up the right tree? Are we going in the right direction? I originally didn't have this as one of the things that I would talk about as an important skill set. And then I was giving a workshop and the chief data officer who was sponsoring the workshop, she said to me, this is all really good, but you're missing something. I got a great team. They're great at solving problems and great at scoping problems and blah, blah, blah. But the problem is they go close their door for a month and then appear with the results, ta-da! Sometimes it's a miss and we could have caught it earlier. And I sort of validated that with some other people who are like, yeah, this is a really important skill. They don't get the luxury of just sitting there undisturbed for the time period of the project. They got to talk to the business team, make sure they're going in the right direction and course correct if necessary.
Anthony Deighton - 00:16:52:
Yeah, I mean, I think the analogy to software development is an apt one in the sense of agile software development. You don't go dark, build what you think is the perfect product and then ship it only to find out that you were wrong. But creating these tight loops of feedback is really valuable.
Joel Shapiro - 00:17:08:
Yeah. I mean, one of the things that always drives me crazy is really good modelers, data modelers, for instance, like really talented data scientists, who will say things like, look, my model speaks for itself. Oh, it really does not. Especially when you're talking to somebody who's not a data expert. Your model never speaks for itself. You have to speak for it. You just do.
Anthony Deighton - 00:17:33:
Speaking of that, and when models speak, and we made fun of business people for speaking in jargon, turnabout is fair play. When models speak, they speak gibberish, at least from the perspective of a business person. And this is something that's come up on this podcast a number of times, but this common tendency for data teams to present results in this way that, to your point, it's self-evident that if the T-score is this, or if my model shows that, and it's like, no, not so much. But maybe talk about how to break that other side of the communication gap.
Joel Shapiro - 00:18:10:
Yeah. So let me sort of back up a little bit. I think the source of the problem is sort of this. A lot of the technical team, the data team, they are hired because they have great technical expertise. And they know that. That's the value that they bring to the table. And so when they have an audience with a senior business leader or they're presenting to a business team, they're like, I'm the technical expert. I'm going to share the technical expertise that I have. But that's so rarely what the business team needs. But it's sort of this mismatch because somebody knows that I'm not a business expert. I'm a data expert. So I'm going to share the data results. And so that problem of miscommunication, where one side says, I'm going to speak tech, and the other says, I don't really know what that means or how it solves my business problem, becomes a problem that, I think, oftentimes gets called, oh, we just don't have a good culture for data. Like I've seen some pretty straightforward breakdowns in communication be blamed on "we don't have a data culture." Okay. You just aren't speaking the right way to each other. When we talk about communication, I think of it as sort of solution translation. I've done some analysis. Now it needs to be translated in some meaningful way. I think there's sort of an unfortunate trend right now that when people talk about communication with data, everybody jumps to data visualization. It's important. I mean, don't get me wrong, but it's one slice of data communication, which is one slice of a broader set of skills that data folks need. And so there's a heavy, heavy emphasis on data visualization right now. Important, but certainly not the whole ball of wax. I am a big fan of forcing people who do data analysis to write memos around their results. And you can include some graphics and there can be a visualization in there. But I force people to do this thing I always call the two-page analytics memo.
And it is super short and it makes people super uncomfortable because number one, they actually have to write words. And number two, two pages, it's not that much space, especially if it's sort of a big project. And we sort of talk about when it's not sufficient. And it's not always the right tool in the workplace, but it forces people to really focus on what matters and to communicate the important things in simple ways.
Anthony Deighton - 00:20:26:
Yeah, I think a common theme in what you're talking about is empathy. So for both sides of that conversation to put themselves in the shoes of the receiving party of the communication, understand what their goals and ambitions are, what language they speak, and then work to find that common ground in the communication. Is that fair?
Joel Shapiro - 00:20:45:
I think that's right. I'm always a little hesitant, though, because if you take the data team and you're like, okay, now you've got to have some sort of empathy bone, I mean, they're gonna be like, really? That's definitely not what I went to school for. Well, it's okay. People have skills, you know, that they're required to embrace that they didn't go to school for. But I get a little nervous about dumping too much responsibility on the data team. In my class, we always start every 10-week term by talking about what the ideal data scientist would look like. And everyone's great. They're coming up with all these adjectives, and they span a lot of different domains. And then at the end, you're left with this totally unrealistic picture of any given person because it's just too many things that they do perfectly, including being empathic, being a good listener, being able to influence. Oh, and by the way, I can also code and build a neural network. Like what? That doesn't always go together. But I think it's important. You just got to be careful what you can actually expect. But the good news is that there are a lot of skills that can be built upon to be a really good and valuable data leader. That's how I see it.
Anthony Deighton - 00:21:53:
All right, so we've talked a lot about the people. Let's talk a little bit about the data. And it feels like a lot of organizations get stuck trying to find the best data to answer a question. And you've got some thoughts about how to avoid spending all of your time sort of digging for data and maybe share a little bit about how to avoid that common data pitfall.
Joel Shapiro - 00:22:17:
Sure. So look, I think that one of the most common sort of points at which a data project can fail is when people think to themselves, I don't have perfect data. So I either need perfect data or I can't go forward. And that's problematic because often you don't have perfect data. Look, there's some really good strategies that allow you to get very good data. So sometimes I'll be working with a company and they're like, yeah, that'd be great. I'd love to be able to, I don't know, build a customer churn model or whatever it is. But we don't have that data. Well, it turns out it's actually not that hard to get it. Sometimes you just have to know where to look in your organization. But other times, if you want to sort of get feedback, you just have to go into the field with some new initiative or some communication. And you can sort of get the data that you've been looking for by actually getting out and trying the stuff that you're thinking about. And pilot tests can give you a rich source of data on which to build models. Frequently, it's okay to use measures that are proxy measures, but you need a good faith assessment of whether your data is actually a reasonable proxy for the things that you care about. So perfect data often doesn't exist. But sometimes you can get it if you sort of look in the right place and if you are willing to sort of think about how proxies are helpful to you. That's sort of, I know, very broad, but the point is, if you're looking for perfection, you're always going to sort of come to this point where you think maybe you shouldn't go forward. And that's kind of a bummer. You can do some valuable stuff with imperfect data for sure.
Anthony Deighton - 00:23:48:
Sure. And I love this idea of manufacturing the data, like running the experiment as a mechanism of generating the data that you need or a proxy for data that you would like. And doing so not necessarily everywhere, but maybe on a subset or in a certain market or something like that. Is that a fair way to say it?
Joel Shapiro - 00:24:09:
Yeah, like I've had people say things like, well, we'd love to build a predictive model on what customer response is gonna be to this new product, but it's new, so we don't have any data. Great, so go out with it to a small group. Look at the reaction, deliberately measure, right? If you're forward looking in how you're gonna measure, things become much easier than if you're just looking backwards at the data that existed. It's always cleaner and easier if you sort of plan for measurement. But you go out and you offer it to a select group with the idea being, I want to collect enough data about response so that it will enable me to build models to deploy it to the broader market to see if it's a go or no go or who should get this offer or who shouldn't, whatever the context is.
Anthony Deighton - 00:24:50:
It feels like a lot of organizations would have benefited from that approach. I think to myself that Coke, with New Coke, would have benefited from that strategy. So one question I've asked, I think, almost every guest on the show, and in fairness, it's a slightly unfair question. But it is of the moment. And so I will ask it again here. Gen AI, ChatGPT, all these generative AI technologies. So first question is, is it over-hyped or under-hyped in your view?
Joel Shapiro - 00:25:19:
I think it's appropriately hyped.
Anthony Deighton - 00:25:21:
Exactly on-target hype.
Joel Shapiro - 00:25:22:
Exactly. Couldn't be any better. I definitely don't think it's over-hyped. I am a big believer that it has transformed, and it will continue to transform, what we do. I'm a perfectly fine coder. I'm fine. I am a much, much better coder with a ChatGPT window open next to my coding environment, like much better. And when I just sort of think about how I'm benefiting from that, it's going to be staggeringly productive for a lot of people. So, I'm not entirely sure how it's going to be used in all sorts of contexts for sure, but it's big.
Anthony Deighton - 00:26:02:
And so let me ask you to apply that insight to a domain, which is analytics and decision making. So do you think Gen AI technologies should, would, will have an impact on data, analytics, decision making? And just to anchor for a second, I've heard people speak, and I think we've even had folks on the podcast who say, in the future, executives, business leaders will hang out on the beach and all of their best decisions can be made by a Gen AI agent that works on their behalf. Like we are really putting the executive class out of a job. Again, I'm trying to anchor on the far extreme here. So what impact does AI have on data and analytics?
Joel Shapiro - 00:26:48:
I think that we got to differentiate between automating processes and analyses versus automating decisions. And I think the automation of decisions is not necessarily what I think of with Gen AI. In my class, I show this really simple example of a robot arm. It's a video that I show. It's a robot arm that's looking at recycling along a conveyor belt, and it's picking out tin cans, aluminum cans, whatever, picking them up and putting them into a pile. That's the automated process. And it looks super automated, and it is, but I always talk about the human involvement. And one of the most important human involvement components is: how sure does the machine have to be that something is a can before it picks it up and treats it like a can? Because if you set a level that requires it to be too sure, you're going to let a lot of cans go by. And if you set a level too low, you're going to pick up a lot of things that aren't cans and put them in the can pile. Right. And the balance of being wrong, the costs of being wrong, like, I don't know which is the more costly error in that automated process, but the people who do know are the experts in that industry. And that kind of decision-making, even in the most automated of cases, to me seems like it's going to be sticking around for quite a while. Yeah. I kind of hope we'll be seeing more time on the beach for everybody, but I'm not sure that's the context in which it'll happen.
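The trade-off Joel describes can be sketched in a few lines of Python. This is only an illustration, not anything shown in the episode: the confidence scores, the labels, the candidate thresholds, and the two cost figures are all made-up assumptions, and the function names are hypothetical. The point is that the "best" confidence threshold changes depending on which error the domain expert says is more costly.

```python
from typing import Sequence


def total_cost(
    scores: Sequence[float],
    is_can: Sequence[bool],
    threshold: float,
    cost_missed_can: float,
    cost_wrong_pick: float,
) -> float:
    """Sum the cost of both error types when picking items scored >= threshold."""
    cost = 0.0
    for score, truly_can in zip(scores, is_can):
        picked = score >= threshold
        if truly_can and not picked:
            cost += cost_missed_can   # a real can went by on the belt
        elif picked and not truly_can:
            cost += cost_wrong_pick   # a non-can landed in the can pile
    return cost


def best_threshold(
    scores: Sequence[float],
    is_can: Sequence[bool],
    candidates: Sequence[float],
    cost_missed_can: float,
    cost_wrong_pick: float,
) -> float:
    """Return the candidate threshold with the lowest total error cost."""
    return min(
        candidates,
        key=lambda t: total_cost(scores, is_can, t, cost_missed_can, cost_wrong_pick),
    )


# Hypothetical model confidences and ground truth for eight items on the belt.
SCORES = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
IS_CAN = [True, True, True, False, True, False, True, False]
CANDIDATES = [0.10, 0.25, 0.50, 0.75, 0.90]

# Assumption A: missing a can is five times worse than a wrong pick.
threshold_if_misses_hurt = best_threshold(SCORES, IS_CAN, CANDIDATES, 5.0, 1.0)
# Assumption B: a wrong pick is five times worse than missing a can.
threshold_if_wrong_picks_hurt = best_threshold(SCORES, IS_CAN, CANDIDATES, 1.0, 5.0)

print(threshold_if_misses_hurt, threshold_if_wrong_picks_hurt)
```

With these made-up numbers, the sketch selects a low threshold (0.25) when missed cans are the expensive error and a high one (0.75) when wrong picks are. The model and the data are identical in both runs; only the human judgment about error costs changes the answer.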
Anthony Deighton - 00:28:20:
I think that's a really insightful point, certainly from Tamr's perspective, this idea of human in the loop and having humans guiding and providing feedback to the AI. We think about this, of course, in the context of looking at data and improving the quality of data. But to your point, how important is it if you match two customer records incorrectly? Is that a catastrophic error? Is that a minor error? If it's two healthcare patients, it could be catastrophic. If it's two people on Facebook, well, maybe not such a problem.
Joel Shapiro - 00:28:49:
Yeah, that's a great point. Great example.
Anthony Deighton - 00:28:51:
So this has been fantastic. I really appreciate you taking the time and joining us. There's so much we didn't talk about. So if people are interested in learning more about you, your writing, classes, how they can get into Kellogg, where can they go?
Joel Shapiro - 00:29:06:
Well, I'm not so sure about the how to get into Kellogg part. For that, I'll direct them to the Kellogg website. But for me, you can find me at joelshapiroanalytics.com. That's just a little site that I keep. And I have some articles and lists of workshops and those kinds of things that I do. So easy to find me there or on the Kellogg faculty page for sure. Or LinkedIn. I like LinkedIn.
Anthony Deighton - 00:29:26:
Awesome. Well, I would certainly encourage readers who found this helpful and interesting to dig in, because I'm sure there's much more detail available. Well, Joel, thanks so much for joining us on Data Masters.
Joel Shapiro - 00:29:38:
Yeah, my pleasure. Thanks for having me.
Data Masters is brought to you by Tamr, the leader in data products. Visit tamr.com to learn how Tamr helps data teams quickly improve the quality and accuracy of their customer and company data. Be sure to click subscribe so you don't miss any future episodes. On behalf of the team here at Tamr, thanks for listening.