Tamr Insights
The Leader in Data Products
September 27, 2022

Master Provider Data with Human-Guided Machine Learning

Healthcare provider data is often fragmented and duplicated across disparate, disconnected systems. For years, government agencies and health organizations have been compiling and sharing healthcare data sources, including electronic medical records (EMRs), electronic health records (EHRs), prescription data, insurance claims, payment data, and more. Every provider has a unique 10-digit National Provider Identifier (NPI), a standard maintained across the US, but the NPIs recorded in these systems are often of poor quality: they may contain typos, fat-finger errors, or appended suffixes. Healthcare organizations need to streamline operations around provider data with improved data entry standards, reduced turnaround times to onboard providers, and a provider 360° view with connections to their payers and insurers.
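As a rough illustration of how typos in an NPI can be caught at data entry, the sketch below checks the identifier's check digit. It relies on the published NPI rule that the tenth digit is a Luhn check digit computed over the nine-digit base with the prefix 80840 prepended. The function names and the clean-up step are assumptions made for this example and are not part of any Tamr product.

```python
import re

# Prefix prepended to the 10-digit NPI before running the Luhn check.
NPI_PREFIX = "80840"


def normalize_npi(raw: str) -> str:
    """Drop every non-digit character (spaces, dashes, stray letters)."""
    return re.sub(r"\D", "", raw)


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def is_valid_npi(raw: str) -> bool:
    """Hypothetical helper: True if the cleaned value is a well-formed NPI."""
    npi = normalize_npi(raw)
    return len(npi) == 10 and luhn_valid(NPI_PREFIX + npi)
```

For example, `is_valid_npi("1234567893")` returns `True`, while the transposed value `"1234567983"` fails the check. A checksum only flags malformed identifiers; it cannot tell whether a well-formed NPI belongs to the right provider, which is where record matching comes in.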

However, provider data poses a unique problem. Providers can have multiple specialties and offer services at multiple facilities across US states, with connections to vendors such as billing providers, insurers, and hospitals. These vendors, in turn, may serve other providers. Imagine a giant graph of many-to-many relationships among multiple entity types. The result is that data mastering needs to go beyond just providers and extend to their many connecting entities.
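To make the many-to-many picture concrete, here is a minimal sketch of such a network as an undirected graph. The `EntityGraph` class, the entity types, and the sample names are all hypothetical and exist only to illustrate the shape of the data.

```python
from collections import defaultdict


class EntityGraph:
    """Undirected graph linking providers, facilities, and vendors."""

    def __init__(self):
        self._adj = defaultdict(set)

    def link(self, a, b):
        """Record a relationship between two entities, in both directions."""
        self._adj[a].add(b)
        self._adj[b].add(a)

    def neighbors(self, node):
        """Entities directly connected to `node`."""
        return set(self._adj.get(node, ()))

    def co_connected(self, node):
        """Entities that share at least one neighbor with `node`."""
        result = set()
        for shared in self._adj.get(node, ()):
            result |= self._adj[shared]
        result.discard(node)
        return result


# Hypothetical sample data: two providers share one billing vendor.
graph = EntityGraph()
dr_a = ("provider", "Dr. A")
dr_b = ("provider", "Dr. B")
billing = ("vendor", "Acme Billing")
hospital = ("facility", "General Hospital")
graph.link(dr_a, billing)
graph.link(dr_b, billing)
graph.link(dr_a, hospital)
```

Here `graph.co_connected(dr_a)` finds `dr_b` through the shared billing vendor. If the vendor appeared under two spellings, that link would be lost, which is why the connecting entities need mastering too.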

Next-Generation Provider Mastering

Tamr is a next-generation data mastering solution that uses machine learning with a human in the loop to master data of all entities. With Tamr, providers can:

  • Master providers and all other connecting entities that might be related to the providers, offering a single source of truth
  • Alleviate some of the most common healthcare data challenges faced today around messy data
  • Drive interoperability across unrelated organizations, disconnected providers, and their vendors

Traditional MDM platforms lean toward hand-written rules for matching attributes, with machine-learning models retrofitted to choose the set of rules most applicable to the data being mastered. This cost-intensive, deterministic method works well enough if it is constantly maintained and integrated with manual inputs. However, it is bound to struggle to scale when new, unfamiliar data is added.

Next-generation data mastering with Tamr, on the other hand, uses a mix of probabilistic and deterministic methods to meet healthcare's data mastering needs. This can dramatically increase the quality of provider data, giving healthcare data consumers accurate, timely reporting that scales both with more data and over time.
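The difference between the two styles of matching can be sketched in a few lines. This is not Tamr's model: the example uses the Python standard library's `difflib.SequenceMatcher` as a simple stand-in for a learned similarity function, and the field names, weights, and sample records are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def rule_match(a, b):
    """Deterministic rule: names and ZIP codes must match exactly."""
    return a["name"].lower() == b["name"].lower() and a["zip"] == b["zip"]


def similarity(x, y):
    """String similarity between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


def match_score(a, b, weights=None):
    """Weighted average of per-field similarities (illustrative weights)."""
    weights = weights or {"name": 0.6, "address": 0.3, "zip": 0.1}
    return sum(w * similarity(a[field], b[field]) for field, w in weights.items())


# Hypothetical records: the first two describe the same provider.
rec_a = {"name": "Jonathan Smith", "address": "12 Main Street", "zip": "02139"}
rec_b = {"name": "Jonathon Smith", "address": "12 Main St", "zip": "02139"}
rec_c = {"name": "Maria Lopez", "address": "400 Oak Avenue", "zip": "94105"}
```

The exact-match rule rejects `rec_a` and `rec_b` because of a one-letter spelling difference, while the weighted score for the same pair is high and the score against `rec_c` is low. In a human-guided workflow, pairs that score near the decision threshold are the ones worth sending to a reviewer, whose answers can then be used to refine the model.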

Enrichment of Provider Data

For transparent and accountable public access to data, the Centers for Medicare & Medicaid Services (CMS) has taken significant steps to disclose financial relationships between providers, medical manufacturers, and other related entities as part of its Open Payments data. Additionally, the CMS National Plan and Provider Enumeration System (NPPES) keeps records of all active NPIs and publishes the parts of the NPI record that have public relevance, including the provider’s name, specialty (taxonomy), and practice address. These external sources offer an opportunity for further enrichment of the data, increasing its overall quality.

Tamr Enrichment for Healthcare Providers is a fully managed service that enhances existing datasets with additional, often external, data sources. With enrichment, data consumers benefit from a hierarchical view of providers, their practices, their medical groups, and the parent hospitals they serve. Unlike traditional data marketplaces, Tamr Enrichment simplifies the deployment process. Our data mastering engine matches the enrichment attributes with internal records automatically and with high accuracy, even without a primary key.

Tamr’s entity-agnostic approach enables providers to simultaneously master and look up multiple entities associated with the providers. Low-latency matching prevents duplication at the source, providing higher-quality provider data. And, using data enrichment, providers can enhance existing, internal datasets with information that is generated from additional, external data sources.

To learn more, schedule a demo.