Tamr Insights
June 7, 2021

What Blackstone’s Chief Data Architect Has Learned on the Firm’s Cloud and Data Journey

The cloud plays a key role in the Blackstone Group’s digital strategy. The global investment management company maintains a cloud-first mentality and uses the cloud for all IT functions whenever possible. For data management, AWS and Snowflake play important roles. The firm runs Tamr on AWS to master, clean and curate its data, creating golden (complete) records on the companies in its investment portfolio. The data fed into Tamr comes from Snowflake and is stored there again after being cleaned.

In this blog, Blackstone Chief Data Architect Thomas Pelogruto shares the lessons the firm has learned from its cloud and data journeys. His advice includes aiming for small wins when migrating to the cloud, using data mastering as the starting point for digital transformation projects and turning to the cloud instead of setting up a data center.

To hear more on how Blackstone’s cloud-first mentality pays analytic and business dividends, listen to this webinar.

Less noise, more golden records

Blackstone is a very large asset manager with reach into a number of public and private entities spanning the globe. Keeping track of all those relationships, including how to uniquely identify a property, a portfolio company or a third-party vendor, is a daunting task. When you do that inside a master data management system, you’re constantly faced with data governance challenges, duplicate records and all of the associated noise.

That’s really where Tamr came in: to improve our signal-to-noise ratio and actually get that golden copy of uniquely identifiable legal entities. You have data sitting in different caches, and even when we try to move all the data into a single data warehouse, it’s still in different databases and different data domains. Now we have the opportunity to be a lot better about unifying the data set and making it an actual golden copy of our data.
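To make the idea of a golden copy concrete, here is a minimal sketch of merging duplicate entity records. It is not how Tamr or Blackstone does it: the suffix list, field names, sample records and the "first non-empty value wins" rule are all assumptions made for illustration, and real mastering tools rely on far richer matching than exact comparison of normalized names.

```python
import re
from collections import defaultdict

# Hypothetical legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "lp"}


def match_key(name: str) -> str:
    """Normalize a company name into a crude matching key."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def golden_records(records: list[dict]) -> list[dict]:
    """Group records by match key and merge each group into one record."""
    groups = defaultdict(list)
    for rec in records:
        groups[match_key(rec["name"])].append(rec)

    merged = []
    for key, group in groups.items():
        golden = {
            "match_key": key,
            "sources": sorted({r["source"] for r in group}),
        }
        for rec in group:
            for field, value in rec.items():
                if field == "source":
                    continue
                # Keep the first non-empty value seen for each field.
                if value and not golden.get(field):
                    golden[field] = value
        merged.append(golden)
    return merged


# Made-up sample data: two spellings of one entity plus an unrelated one.
records = [
    {"source": "crm", "name": "Acme Holdings, Inc.", "country": "US", "lei": ""},
    {"source": "vendor_db", "name": "ACME Holdings", "country": "", "lei": "HYPOTHETICAL-LEI-1"},
    {"source": "crm", "name": "Globex LLC", "country": "DE", "lei": ""},
]
result = golden_records(records)
```

In this sketch the two Acme rows collapse into a single record that takes its country from one source and its identifier from the other, which is the "less noise" effect described above.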

To the cloud and beyond

In the old days, you’d lay out a five- to seven-year infrastructure plan and build a data center, and you had to stay relatively static with that plan over that period because a server upgrade or network upgrade took a really long time and was a huge capital expenditure. With AWS, you don’t have to worry about maintenance or how many users there are. All of that is built in and readily available. And in a firm as large as ours, the only feasible way to have a single access point for all of the different data sets was to use a cloud-based data warehousing solution like Snowflake.

Get your data warehouse in order

You have to put a material amount of effort into assembling your data. That’s where the combination of Tamr and our cloud-based data warehouse, Snowflake, was very powerful. We got our data organized into a clean cloud architecture. We are also on AWS, so we’re moving everything into the cloud. That was definitely a prerequisite to fully embracing and integrating a tool like Tamr to make that data clean and usable inside that warehouse.

Moving to the cloud? Start small

It’s definitely a divide-and-conquer strategy. Don’t boil the ocean. Getting a few wins and getting some of your data into the cloud is valuable not only for getting buy-in, but also for learning how to use this new set of tools. [The tools] are very different and that’s the more surprising part. It’s usually good to start with a small group of stakeholders and a very well-defined problem and dataset. You’ll learn a lot along the way and be in a much better position to do this at a greater scale.

Don’t try to collect all of your data. That’s going to fail. You’ll never collect all of your data. Go with one data set, one set of questions you want to answer. Then other things will fall into place and you can build from there. We have a very agile methodology at Blackstone and this approach fits very well into that mindset.

Digital transformation starts with mastered data

[Data mastering] is the first step in our whole automation stack. From there, it’s about actually undergoing digital transformation: making very laborious processes streamlined and automated, and joining and reporting on kinds of data that you could never combine before, simply because you didn’t know how to bring together data sets A and B. All of a sudden, it becomes feasible to do. Then you start gearing up to get the rest done. It’s the first step in the journey. It’s definitely not the last step.