datamaster summit 2020

Tamr Wells Mastering for Energy Companies—Demo Days

 

Mingo Sanchez

Senior Sales Engineer

Gain unparalleled insights into your wells data. Learn how you can break down data silos to achieve high-quality, consistent wells data in order to improve operations, lower risk and enhance user experience.

Transcript

DemoDays Energy _ Wells Mastering.mp4

Length: 7min 56sec

Tags: Demo, Wells, Data set, lifecycle, Datum, Information, Source, Academic discipline, Different Recordings, API well number, Oil well, example, Expert, record, Wealth, location, Feedback, pattern, Machine learning, User

1 Mingo on how Tamr can be used to create a consolidated view of a well

Tags: Oil well, Oil, Data set, Machine, Mingo Sanchez

00:01 – 00:53

Mingo Sanchez

Hello, everyone, and thank you so much for joining today’s Tamr Demo Day. My name is Mingo Sanchez, and today we’ll be walking through one of our most interesting use cases at Tamr, which is wells mastering. In today’s demo, we’ll be going over how you can use Tamr to create a consolidated view of an oil well across its lifecycle. Now, before we jump into Tamr itself, I just very quickly want to start with why this is an important problem for us. Now, as you all know, working with oil wells data can be very challenging for a number of different reasons. You might have different formats within your data sets, or different fields present from one source to another. You might even have latitude and longitude coordinates that aren’t at the same level of precision. So if you try to do an exact match on those lat-long coordinates, you’re not going to be able to see that two different locations are actually the same place.
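To illustrate the coordinate-precision problem described above, here is a minimal Python sketch. It is not taken from Tamr. The helper names, the sample coordinates and the 50-meter tolerance are all assumptions chosen for illustration. It shows that an exact comparison fails when two sources record the same location at different precision, while a distance-based comparison still succeeds.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def same_location(p, q, tolerance_m=50.0):
    """Treat two (lat, lon) points as the same place if within tolerance_m.

    The 50 m default is an illustrative assumption, not a recommended value.
    """
    return haversine_m(*p, *q) <= tolerance_m


# Hypothetical coordinates for one well, as recorded by two sources.
source_a = (31.9686, -102.0779)        # four decimal places
source_b = (31.968599, -102.077915)    # six decimal places

exact_match = source_a == source_b              # exact comparison
near_match = same_location(source_a, source_b)  # tolerance-based comparison
```

With these sample values, the exact comparison reports a mismatch even though the two points are only a meter or two apart, which is the situation the tolerance-based check is meant to handle.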

00:54 – 01:24

Mingo Sanchez

Now with Tamr, we take all of that messy data and we’re able to bring it together using human-guided machine learning, so that, just as your data experts would be able to do themselves, you’re able to teach a machine to do the exact same thing. So you’re able to bring together those sources of geological data, drilling and completion data, even external data like permit information. All of this data that you have at your disposal, you’re able to bring together and ultimately create that consolidated view across the lifecycle of an oil well.

2 How does Tamr work?

Tags: Different Recordings, Oil well, Information, Data set, Machine learning, Expert, Academic discipline

01:25 – 02:05

Mingo Sanchez

Now let’s dive into how it actually works in Tamr. So moving over here, you can see that we’ve loaded a number of different data sets into Tamr, and these cover all different types of information about these oil wells. So clicking into one of these data sources, you can see we have a lot of information about these wells over time: stuff like the API number, for example, that you can use to uniquely identify a specific well, where that well is located, and the name of the oilfield that the well’s contained in. And if I scroll over here, you can even see that for some of these records, we have information like the permit number.

02:05 – 02:30

Mingo Sanchez

Now, it’d be awesome if we had all of this information for all of our records, but unfortunately, that’s not the case. It’s often the case that you’re bringing together different pieces of information from different places. So to get that consolidated view and have that trusted record for the oil well across its entire lifecycle, we really need to be able to bring together all of these different sources and consolidate them. And that’s where Tamr comes in.

02:33 – 03:12

Mingo Sanchez

So to show you what that looks like, we can see on this screen, on the right-hand side, all the various sources of data that we’ve loaded into Tamr. So you can see these records are coming from many different places. And you can see that from record to record, we have differences in how these fields are formatted, such as the API number. Whether or not those fields are even present at all is something that we’re not necessarily guaranteed. So as you can see, it’s really difficult to put together all these different records. But with Tamr, we’re able to jump from all of these disparate sources that you see here on the right and group them into these groups of records that you see here on the left.

03:14 – 03:43

Mingo Sanchez

So here we have an example where Tamr has identified nine different records, across multiple sources, that previously were unable to be linked together, and it was able to do that very easily, using just a little bit of help from people up front. So you can see sometimes we don’t even have a key like the API number to link together these records. We don’t have the same name for this well from one record to another, and so on and so forth. But with Tamr, it’s really easy to put all this information together.

03:44 – 04:29

Mingo Sanchez

Now how do we get to this point where Tamr is able to recognize that all these different records belong in the same group? Well, it’s as simple as just collecting a little bit of feedback from the data experts in order to train Tamr to recognize that these records belong together. So the way that we typically do this is by asking the experts really simple yes-or-no questions about individual pairs of records. It’s really difficult for a person to go in and find dozens or hundreds of records that all belong together. But it’s really easy for a person to look at two records and say, are these the same as one another or are they different from one another? Just say, are they a match, yes or no? And then Tamr is going to be able to do the rest and pick up on those patterns.
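As a rough illustration of why pairwise yes-or-no answers are enough to build whole groups, the sketch below merges records that are linked by "yes" answers using a union-find structure. This is not Tamr's method: the demo describes a machine learning model trained on expert feedback, whereas this sketch only shows the final grouping step for a set of pairs already judged to match. The record IDs and the function name are hypothetical.

```python
def group_records(record_ids, matched_pairs):
    """Group records into clusters, given pairs that were judged a match.

    Uses union-find: records connected by any chain of matching pairs
    end up in the same group.
    """
    parent = {r: r for r in record_ids}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for r in record_ids:
        groups.setdefault(find(r), []).append(r)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical record IDs from three sources (A, B, C).
records = ["A1", "A2", "B1", "B2", "C1"]
# Pairs answered "yes, these are the same well".
yes_answers = [("A1", "B1"), ("B1", "C1")]

groups = group_records(records, yes_answers)
```

Here A1 and C1 land in the same group even though nobody compared them directly, because both were matched to B1. Nobody has to find the whole group by hand.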

04:31 – 05:26

Mingo Sanchez

So jumping into one of these examples here, you can see two records. They’re very similar in certain ways. You can see that these names are highly similar. And you can see these locations are roughly in the same place. But there’s some information here to indicate that maybe these aren’t actually the same well as one another. Mainly, these API numbers are completely different. So even though there’s a lot of similarity in these records, Tamr is smart enough to recognize that they should actually be split out from one another. And it’s able to say that it doesn’t think these records are a match. And on the other hand, we can find cases like this where you have records that are actually very similar to one another and probably should be brought together. So here you can see, even though those API numbers aren’t formatted in the exact same way, with a bunch of trailing zeros and punctuation in one of them, for example, Tamr’s still smart enough to tell that these are actually the same oil well as one another.
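The formatting differences mentioned above (punctuation and trailing zeros in an API number) can be illustrated with a small normalization sketch. This is a rule-based illustration only, not how Tamr compares records. It assumes the common layout in which the first 10 digits of an API number identify the well (state, county and unique well code) and any further digits are sidetrack and event codes. The function name and sample values are hypothetical.

```python
import re


def normalize_api(raw):
    """Reduce an API well number to its first 10 digits.

    Assumption: punctuation is cosmetic, and digits after the 10th
    (sidetrack/event codes, often "0000") can be ignored when deciding
    whether two records refer to the same well.
    Returns None if fewer than 10 digits are present.
    """
    digits = re.sub(r"\D", "", raw)  # drop dashes, spaces and other non-digits
    if len(digits) < 10:
        return None
    return digits[:10]


# Hypothetical values as they might appear in different sources.
a = normalize_api("42-329-12345-00-00")  # punctuation plus trailing zeros
b = normalize_api("4232912345")          # bare 10-digit form
c = normalize_api("42-329-54321")        # a different well
```

With this normalization, `a` and `b` compare equal while `c` does not. Truncating to 10 digits is a design choice: it deliberately treats sidetracks of the same well as one well, which may or may not be what a given use case wants.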

05:27 – 06:01

Mingo Sanchez

And something that’s really interesting and unique about Tamr is that you don’t have to just use textual data, like a traditional rules-based solution would. Tamr is the only MDM solution that has geospatial capabilities built right into the platform. So if I click over here, you can see that users are able to look at those points on a globe, just as if they were going to an external map service. And they’re able to compare those different oil well locations and see whether or not they look like they actually should be the same as one another.

06:02 – 06:26

Mingo Sanchez

So in short, with all the same types of information that a person would be using to figure out whether or not two oil wells are the same, you’re going to be able to teach Tamr to recognize the same patterns, and that’s really where the machine learning comes in. Experts who understand the data really well are training Tamr on how to put these wells together, and Tamr, learning those patterns, is able to apply that at scale to hundreds of thousands, millions or even billions of records at a time.

06:28 – 07:21

Mingo Sanchez

Now that’s all well and good, but where do we go from here? Well, Tamr’s identified that all these records should be grouped together. And one thing that Tamr can also do is assign a unique ID that persists over time, so that you’re able to track what’s happening as you ingest new records, as you make changes to existing records, and so on and so forth. Now, not only does that ID link together all these records, but it also allows you to take this a step further and create what we call the golden record. So you’re able to bring together all of those underlying records that represent the same entity, in this case the same well, and Tamr, on a field-by-field basis, allows users to define how they want to choose that most trusted value, so that when you’re creating your downstream reports or analytics, you’re able to have the most up-to-date and trustworthy information for your analytic purposes.
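The field-by-field golden record idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not Tamr's implementation or API. The record layout, the two rules (`most_recent`, `most_common`) and the choice of rule per field are all hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical records already grouped as the same well.
records = [
    {"source": "permits", "updated": date(2019, 3, 1),
     "well_name": "SMITH 1", "api": None, "field": "SPRABERRY"},
    {"source": "drilling", "updated": date(2020, 1, 15),
     "well_name": "Smith #1", "api": "4232912345", "field": None},
    {"source": "geology", "updated": date(2018, 6, 30),
     "well_name": "SMITH UNIT 1", "api": "4232912345", "field": "SPRABERRY TREND"},
]


def most_recent(recs, field):
    """Pick the non-missing value from the most recently updated record."""
    candidates = [r for r in recs if r[field] is not None]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["updated"])[field]


def most_common(recs, field):
    """Pick the non-missing value that appears most often."""
    values = [r[field] for r in recs if r[field] is not None]
    if not values:
        return None
    return Counter(values).most_common(1)[0][0]


# One rule per field: the "field-by-field" choice of the most trusted value.
RULES = {
    "well_name": most_recent,
    "api": most_common,
    "field": most_recent,
}

golden = {name: rule(records, name) for name, rule in RULES.items()}
```

In this sketch the golden record takes the well name from the newest record, the API number that most sources agree on, and the newest non-missing oilfield name, so a gap in one source is filled from another.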

07:23 – 07:54

Mingo Sanchez

In summary, what we’ve shown today is that Tamr allows users to very easily provide feedback about the types of problems that are most important to them, in this case bringing together all of this wells data, and it’s able to take all that information out of a person’s head and train a machine learning model to do the same thing at scale. If you’re interested in learning more, please visit our website at tamr.com and reach out to someone from our team. We’d love to speak with you. Thank you so much for your time and have a great rest of your day.