
Tamr's Patents
Innovation in machine learning and data mastering was at the heart of Tamr’s founding at MIT in 2013. As new standards in scale and efficiency have developed, Tamr has continued to enhance its data mastering workflows and machine learning algorithms to meet the needs of the modern data ecosystem. Tamr’s patent portfolio reflects the company’s success in delivering innovation to customers looking to achieve clean, accurate data.
Key Areas of Innovation

Robust, ML-Powered Workflow

Method and System for Large Scale Data Curation
US Patent Nos. 9,542,412 & 10,929,348
Tamr's core approach to large-scale data curation, using a combination of schema mapping, deduplication, and human expertise, all powered by machine learning.

Data Curation System with Version Control
US Patent No. 11,042,523
A data curation system that includes various methods to enable efficient reuse of human and machine effort with version control.

Application at Scale

Scalable Binning for Big Data Deduplication
US Patent Nos. 10,613,785 & 11,204,707
Blocking model that addresses the n² pairwise-comparison problem in data deduplication by considering only plausible pairs.
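
As a rough illustration of the general blocking idea described above (not the patented method itself), the following minimal Python sketch groups records by a cheap key and generates candidate pairs only within each group. The key function, field names, and sample records are all hypothetical and chosen only for the example.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    # Hypothetical key: first three letters of the lowercased name plus postal code.
    name = record["name"].strip().lower()
    return (name[:3], record["zip"])


def candidate_pairs(records):
    # Group record indices by blocking key.
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)

    # Only records that share a block are ever paired for comparison.
    pairs = set()
    for members in blocks.values():
        pairs.update(combinations(members, 2))
    return pairs


# Illustrative data only.
records = [
    {"name": "Acme Corp", "zip": "02139"},
    {"name": "ACME Corporation", "zip": "02139"},
    {"name": "Globex", "zip": "10001"},
    {"name": "Acme Corp", "zip": "94105"},
]

pairs = candidate_pairs(records)
all_pairs = len(records) * (len(records) - 1) // 2  # exhaustive n(n-1)/2 comparisons
print(f"{len(pairs)} candidate pair(s) instead of {all_pairs}")
```

With a key this simple, true duplicates that differ in the keyed fields would land in different blocks and never be compared, so a real system would need a more forgiving notion of "plausible" than this sketch uses.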

System for Scalable Hierarchical Classification
US Patent Nos. 10,803,105 & 11,232,143
Method for large-scale classification to a taxonomy that addresses the challenges of high record volume, large taxonomies, and sparse training data.

Geospatial Binning
US Patent No. 10,877,948
Method for large-scale deduplication of geospatial data that performs blocking in 3D, supporting geospatial conflation without comparing the footprints of buildings and other objects on the surface of the earth in 2D projections.

Efficient, High-Impact User Input

Method of Using Clusters to Train Supervised Entity Resolution in Big Data
US Patent No. 11,049,028
Technique for cluster verification to derive training data and achieve deduplication.
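
One common way to turn verified clusters into training data, sketched here under stated assumptions and not taken from the patent, is to label every pair of records in the same verified cluster as a match and every pair across clusters as a non-match. The function name, cluster ids, and record ids below are hypothetical.

```python
from itertools import combinations


def pairs_from_clusters(verified_clusters):
    """Derive labeled record pairs from verified clusters.

    verified_clusters maps a cluster id to the list of record ids in it.
    Returns (record_a, record_b, is_match) tuples.
    """
    # Invert the mapping so each record id points at its cluster id.
    owner = {rid: cid for cid, rids in verified_clusters.items() for rid in rids}

    labeled = []
    for a, b in combinations(sorted(owner), 2):
        # Same verified cluster means a positive example, otherwise a negative one.
        labeled.append((a, b, owner[a] == owner[b]))
    return labeled


# Illustrative data only.
clusters = {"c1": ["r1", "r2"], "c2": ["r3"]}
training = pairs_from_clusters(clusters)
for row in training:
    print(row)
```

This sketch enumerates every pair, which is only workable for small inputs; at scale the negative pairs would have to be sampled or restricted to plausible candidates.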

In-situ Data Issue Reporting, Presentation, and Resolution
US Patent No. 10,817,362
Issue tracking and reporting system for locating data issues that scales with data volume and variety.

Transformations for Evolving Schema Mapping
US Patent Nos. 10,860,548 & 11,003,636
Technique for providing, generating, and reusing transformations for schema mapping, accounting for evolving schemas and target schema mappings.