Updated: September 15, 2014

Three Approaches to Scalable Data Curation: Stonebraker @ Strata/Hadoop

Strata Conference NY: Thursday, October 16 @ 11:00am EDT

Enterprises and organizations have access to a huge variety of diverse data sources today: internal data sources, external public data sources, feeds from the Internet of Things, and more. They want to leverage this data, connect and combine it intelligently and efficiently, and tap into it for new kinds of analytics and applications. But neither traditional top-down data-integration approaches nor some of the newer, bottom-up data scientist tools can scale to meet the demands of Big Data Variety.

Come see Tamr Co-Founder and CTO Michael Stonebraker at Strata Conference NY on October 16 discuss how a scalable data curation platform can help enterprises connect and enrich their data so they can leverage all of it.

In his talk, Mike will describe data curation, which he defines as “the process of turning independently created data sources (structured and semi-structured data) into unified data sets ready for analytics, using domain experts to guide the process. It involves:

Identifying data sources of interest (whether from inside or outside the enterprise)
Verifying the data (to ascertain its composition)
Cleaning the incoming data (for example, 99999 is not a legal ZIP code)
Transforming the data (for example, from European date format to US date format)
Integrating it with other data sources of interest (into a composite whole)
and deduplicating the resulting composite data set.”
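
As a rough illustration of the cleaning, transforming, integrating, and deduplicating steps in that definition, here is a minimal Python sketch. It is a toy under stated assumptions, not Tamr's method or anything from the talk: the record fields, the sample data, and the helper names (`clean_zip`, `to_us_date`, `curate`) are all hypothetical, and the exact-match deduplication stands in for the expert-guided matching a real curation system would need.

```python
from datetime import datetime

# Hypothetical records from two independently created sources.
SOURCE_A = [
    {"name": "Acme Corp", "zip": "02139", "signup": "15/09/2014"},
    {"name": "Globex", "zip": "99999", "signup": "16/10/2014"},
]
SOURCE_B = [
    {"name": "ACME CORP", "zip": "02139", "signup": "15/09/2014"},
]


def clean_zip(zip_code):
    """Return the ZIP code, or None if it fails this sketch's checks.

    Following the example in the text, 99999 is treated as not legal.
    """
    if len(zip_code) != 5 or not zip_code.isdigit() or zip_code == "99999":
        return None
    return zip_code


def to_us_date(european_date):
    """Convert a DD/MM/YYYY date string to MM/DD/YYYY."""
    return datetime.strptime(european_date, "%d/%m/%Y").strftime("%m/%d/%Y")


def curate(*sources):
    """Clean, transform, integrate, and deduplicate records from all sources."""
    seen = set()
    curated = []
    for source in sources:
        for record in source:
            row = {
                "name": record["name"].strip(),
                "zip": clean_zip(record["zip"]),
                "signup": to_us_date(record["signup"]),
            }
            # Assumption: records are duplicates when the case-insensitive
            # name, ZIP code, and date all match exactly.
            key = (row["name"].casefold(), row["zip"], row["signup"])
            if key not in seen:
                seen.add(key)
                curated.append(row)
    return curated


if __name__ == "__main__":
    for row in curate(SOURCE_A, SOURCE_B):
        print(row)
```

Even in this toy, every rule (which ZIP codes count as invalid, which records count as duplicates) is a human decision, which hints at why curation gets harder as the number of sources grows.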

The more data you need to curate for analytics and other business purposes, the more costly and complex curation becomes, mostly because humans (domain experts, or data owners) aren’t scalable. Mike's talk will compare and contrast three approaches to data curation for Big Data Variety:

1. ETL (Extract-Transform-Load) tools
2. Data Science tools
3. Enterprise curation tools

You can also join Mike after his talk (at 11:50 AM) for an informal Office Hour. For more information about Strata NYC scheduling, see the conference schedule.
