Tamr's Patents

Innovation in machine learning and data mastering was at the heart of Tamr’s founding at MIT in 2013. As new standards in scale and efficiency have developed over time, Tamr has continued to enhance its data mastering workflows and machine learning algorithms to meet the needs of the modern data ecosystem. Tamr’s patent portfolio reflects the company’s success in delivering innovation to customers looking to achieve clean, accurate data.

Key Areas of Innovation


Robust, ML-Powered Workflow


Method and System for Large Scale Data Curation

US Patent Nos. 9,542,412 & 10,929,348

Tamr's core approach to large-scale data curation, using a combination of schema mapping, deduplication, and human expertise, all powered by machine learning.


Data Curation System with Version Control

US Patent No. 11,042,523

A data curation system that includes various methods to enable efficient reuse of human and machine effort with version control.


Application at Scale


Scalable Binning for Big Data Deduplication

US Patent Nos. 10,613,785 & 11,204,707

Blocking model that addresses the n² data deduplication problem by considering only plausible pairs.
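As a rough illustration of the general blocking idea (not the patented method itself), the sketch below places records into bins by a key and generates candidate pairs only within each bin, so most of the n² possible pairs are never compared. The `blocking_key` function, the record layout, and the sample data are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    # Hypothetical key: first three letters of the lowercased name.
    return record["name"].strip().lower()[:3]


def candidate_pairs(records):
    """Return pairs of record ids that share a blocking key."""
    bins = defaultdict(list)
    for rec in records:
        bins[blocking_key(rec)].append(rec["id"])
    pairs = set()
    for ids in bins.values():
        # Pairs are formed only inside a bin, never across bins.
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Illustrative sample data (assumed, not from any real dataset).
records = [
    {"id": 1, "name": "Acme Corp"},
    {"id": 2, "name": "ACME Corporation"},
    {"id": 3, "name": "Globex"},
    {"id": 4, "name": "Globex Inc."},
    {"id": 5, "name": "Initech"},
]

pairs = candidate_pairs(records)
all_pairs = len(records) * (len(records) - 1) // 2
```

With these five sample records, exhaustive comparison would need `all_pairs` comparisons, while `pairs` holds only the few that share a bin. A prefix key this simple would miss many true duplicates in practice; it is chosen here only to keep the example short.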


System for Scalable Hierarchical Classification

US Patent Nos. 10,803,105 & 11,232,143

Method for large-scale classification to a taxonomy that addresses the challenges of high record volume, large taxonomies, and sparse training data.



US Patent No. 10,877,948

Method for large-scale deduplication of geospatial data that performs blocking in 3D. This supports geospatial conflation without comparing the footprints of buildings and other objects on the surface of the earth in 2D projections.
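One way to picture blocking in 3D rather than in a 2D projection is sketched below. This is an assumed, simplified illustration and not the patented method: latitude and longitude are converted to Cartesian coordinates on a unit sphere and quantized into grid cells, so two points on either side of the antimeridian, which look far apart in a flat longitude/latitude projection, can land in the same bin. The function name `bin_3d`, the `cell_size` value, and the sample coordinates are hypothetical.

```python
import math


def bin_3d(lat_deg, lon_deg, cell_size=0.01):
    """Map a latitude/longitude to a 3D grid cell on the unit sphere.

    cell_size is in unit-sphere units (hypothetical value for illustration).
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Cartesian coordinates of the point on a unit sphere.
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    # Quantize each coordinate to the nearest grid cell index.
    return (round(x / cell_size), round(y / cell_size), round(z / cell_size))


# Two points straddling the antimeridian: nearly 360 degrees apart in
# longitude, yet physically very close to each other.
east = bin_3d(0.0, 179.9999)
west = bin_3d(0.0, -179.9999)
# A point on the opposite side of the globe.
far = bin_3d(0.0, 0.0)
```

Here `east` and `west` fall in the same cell while `far` does not. A single fixed grid still separates some close points that sit on opposite sides of a cell boundary, so a fuller treatment would also check neighboring cells.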


Efficient, High-Impact User Input


Method of Using Clusters to Train Supervised Entity Resolution in Big Data

US Patent No. 11,049,028

Technique for cluster verification to derive training data and achieve deduplication.
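To make the idea of deriving training data from verified clusters concrete, here is a minimal sketch under stated assumptions, not the patented technique. It assumes a reviewer has confirmed a set of clusters, then labels every pair within a cluster as a match and every pair across two verified clusters as a non-match. The function `pairs_from_clusters` and the input format are hypothetical.

```python
from itertools import combinations


def pairs_from_clusters(clusters):
    """Turn verified clusters into labeled record pairs.

    clusters maps a cluster id to the list of record ids that a reviewer
    confirmed belong together (hypothetical input format).
    """
    labeled = {}
    # Records inside one verified cluster are labeled as matches.
    for members in clusters.values():
        for a, b in combinations(sorted(members), 2):
            labeled[(a, b)] = True
    # Records drawn from two different verified clusters are non-matches.
    for c1, c2 in combinations(sorted(clusters), 2):
        for a in clusters[c1]:
            for b in clusters[c2]:
                labeled[tuple(sorted((a, b)))] = False
    return labeled


# Illustrative verified clusters (assumed sample data).
verified = {"c1": ["r1", "r2", "r3"], "c2": ["r4", "r5"]}
training = pairs_from_clusters(verified)
```

Verifying a handful of clusters yields many labeled pairs at once, which is why cluster-level review can be a more efficient source of training data than labeling pairs one at a time.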


In-situ Data Issue Reporting, Presentation, and Resolution

US Patent No. 10,817,362

Issue tracking and reporting system for locating data issues that scales with data volume and variety.


Transformations for Evolving Schema Mapping

US Patent Nos. 10,860,548 & 11,003,636

Technique for providing, generating, and reusing transformations for schema mapping, accounting for evolving schemas and target schema mappings.