Streamlining Digitized Patient Data into Analysis-Ready, Real-World Evidence

In the world of digitized healthcare, including electronic health records and claims databases, companies are struggling to keep up with the amount of data gathered about patients. This real-world data (RWD) isn’t just vast. It’s gathered in a number of different ways and formats, all designed to be easy for healthcare professionals to submit. In fact, third-party vendors are eager to share the information they’ve gathered via insurance claims, personal health devices and more.

But there’s one catch: While the data is all valuable, it’s hard to do much with it because differences in how it is stored keep it from being viewed as a single unit. Everyone, from hospitals and insurance companies to connected device companies like Fitbit and Apple, collects data their own way. Unfortunately, this makes it impossible for an analyst to evaluate data to the degree needed to help companies, especially in the competitive life sciences market, turn the data into a measurable return on investment. At least, it’s impossible without a little help.

Corporations need help turning RWD into analysis-ready real-world evidence (RWE) that drives business decisions. They require a tool that harmonizes data into a single output that can be analyzed, recorded and, ultimately, capitalized on.

Tackling the disparate data challenge

Tackling the challenges that life sciences companies face requires a powerful data harmonization process combined with human-guided machine learning. Huge volumes of data need to be streamlined into a single output format, such as the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM). This way, companies can apply the data to their research and development, and not only improve patient outcomes but also receive measurable feedback for pharmacovigilance and regulatory compliance across their products. By solving this challenge, many companies should also be able to gain market insights into their products that had previously been inaccessible due to disparate data. It’s a win-win across the board.

At a glance, life sciences companies require a process that not only turns RWD into RWE but also offers these solutions to their most common data problems:

    1. Integrated data: Many companies in the life sciences space are limited when it comes to moving data efficiently through their data infrastructure. Most ETL solutions lack the ability to easily transform and process disparate data into a single output. Moreover, the work can take weeks that companies simply don’t have.
    2. Streamlined processes: Global businesses are looking for a solution that can integrate with their existing processes. They have the tools they need, and they want to leverage the tools their teams know best when wrangling and transforming data sources into a single data model.
    3. Improved R&D: Companies not only need existing data accounted for, but they also need future datasets cleaned and harmonized so that they can improve drug/trial designs, treatment analyses and predictive analyses. 
    4. Market insights: The life sciences industry is one of the most competitive in the world, and clean, usable data is what boosts a company’s ability to be competitive when it comes to market positioning, targeted marketing and product utilization. 
    5. Data lineage and audit trails: Everything in the life science world requires documentation of data, data transformations and user activity. In order for RWE to be successful, all documentation needs to be gathered throughout the RWD integration project for user collaboration, transparency, and ultimately, auditing purposes.
    6. Team empowerment: Scientists and researchers are often separated into different departments, and in some cases, different countries. What one researcher knows is not necessarily shared with others. With a single form of data, the entire company can utilize data for innovation.

Organizations require an end-to-end approach that integrates and applies RWD to drug and product trials and helps move innovative patient solutions from bench to bedside. They require a robust platform that furthers their business goals of improving patient outcomes and efficiently distributing medical products globally. 

One global life sciences company finds the solution with Tamr

While Tamr Unify, our solution to the RWE challenge, had been around for a long time, it had most frequently been used on projects requiring data harmonization on a scale of tens of millions of records. That was until we were approached by a global life sciences company proposing our largest project: They had over 30 billion records, and they needed to integrate them into a single output format that could be leveraged by their analysts.

Like many in the industry, this life sciences giant had collected health data in real-world settings from third-party vendors, including IBM’s Truven Health, covering patient demographics, diagnoses and prescriptions. As at many companies, the data had been divided on collection into groups like inpatient admissions, outpatient services and lab results. This company, however, wanted to use the Observational Medical Outcomes Partnership’s Common Data Model (OMOP CDM) format, so that data could be organized in a way that matched analytic questions, with tables representing entities such as people, diagnoses and medical procedures.

Using Tamr and its built-in, customizable data models, the team built complex transformations to convert data from its original format into the new format. To do this, the team:

  1. Unified common data model management: Data models of any variety were managed through the data integration process.
  2. Performed data model discovery and maintenance: Using human-guided machine learning and the life science giant’s own experts, we provided automated mapping suggestions to onboard new datasets and correct schema drift. 
  3. Created a powerful data transformation engine for common data models: Once transformation capabilities were established, data was processed using the high-performance, scalable Spark engine.
  4. Converted custom mapping specifications: Projects with data harmonization specifications were uploaded with mapping specs and integrated into unified data models. 
  5. Established a data lineage and audit trail: All data conversion transformations and user interactions were tracked and logged, providing clear data lineage and audit trails.
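
The steps above can be illustrated with a minimal sketch in plain Python. It is not Tamr’s implementation and does not use Spark. The source field names (member_id, dob, patient_key and so on), the simplified target schema and the Harmonizer class are all hypothetical, chosen only to show how a per-source mapping specification can convert differently shaped records into one common shape while logging each transformation for lineage.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical mapping spec: target field -> (source field, transform).
MappingSpec = Dict[str, Tuple[str, Callable[[str], Any]]]


def year_from_iso(date_str: str) -> int:
    """Extract the year from a YYYY-MM-DD string."""
    return int(date_str[:4])


def gender_code(text: str) -> str:
    """Reduce free-text gender to its first letter, upper-cased."""
    return text.strip().upper()[:1]


# Assumed layout of a claims-style source (illustrative names only).
CLAIMS_SPEC: MappingSpec = {
    "person_id": ("member_id", str),
    "year_of_birth": ("dob", year_from_iso),
    "gender": ("sex", gender_code),
}

# Assumed layout of an EHR-style source (illustrative names only).
EHR_SPEC: MappingSpec = {
    "person_id": ("patient_key", str),
    "year_of_birth": ("birth_year", int),
    "gender": ("gender_text", gender_code),
}


@dataclass
class Harmonizer:
    """Applies mapping specs and records one lineage entry per field."""

    lineage: List[Dict[str, Any]] = field(default_factory=list)

    def convert(self, source: str, record: Dict[str, str],
                spec: MappingSpec) -> Dict[str, Any]:
        target: Dict[str, Any] = {}
        for target_field, (source_field, transform) in spec.items():
            value = transform(record[source_field])
            target[target_field] = value
            # Audit trail: where each output value came from.
            self.lineage.append({
                "source": source,
                "source_field": source_field,
                "target_field": target_field,
                "transform": transform.__name__,
            })
        return target


harmonizer = Harmonizer()
claims_row = harmonizer.convert(
    "claims", {"member_id": "A1", "dob": "1980-05-17", "sex": "f"}, CLAIMS_SPEC)
ehr_row = harmonizer.convert(
    "ehr", {"patient_key": "A1", "birth_year": "1980", "gender_text": " Female"},
    EHR_SPEC)
```

Both calls yield the same target record, and harmonizer.lineage holds one entry per mapped field. Keeping the mapping as data rather than code is one way to let domain experts review or correct it without touching the conversion logic.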

By the completion of the project, Tamr had not only integrated disparate data sources but also proved its ability to process 30 billion records in a single week; improve R&D designs and analyses; deliver competitive market insights; automate and streamline data analyses; confirm both drug safety and efficacy to regulators; consolidate data conversion specs; and ultimately, enable scientists and researchers to innovate faster.

The future of real-world evidence 

Healthcare data is only becoming more robust as we figure out ways to better access and store patient information. Add the complication that there’s a lack of consensus around standard data models in the community, and it’s obvious why data is inconsistent in quality, purpose, and design. Now, companies in the life sciences industry have a solution that utilizes pre-defined templates of common data models—both customized and standard. Companies have a framework to create and modify data models quickly without the risk of time wasted in the discovery process. All this is to say that now, organizations can achieve results that were once considered impossible.

Life sciences companies are now able to leverage RWD as the invaluable asset that it is to further their business goals. Patients are expecting better, more efficient treatments and stronger outcomes. By adopting RWE, organizations can now deliver transformational changes to help meet patient expectations.

But life science companies aren’t the only ones that can benefit from streamlining disparate data and processes. Consider companies such as auto manufacturers with thousands of locations and very little data on their customers across those branches. Datasets can be gathered and records that are similar to one another (like transactional records) can be identified. Using this information, companies can do anything from removing costly overlaps to negotiating better deals by avoiding duplication.
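
As a rough illustration of identifying records that are similar to one another, the sketch below compares every pair of records using simple string similarity from Python’s standard library. This is a stand-in technique, not the human-guided machine learning described in this article, and the record fields, sample data and 0.85 threshold are assumptions made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, List, Tuple

Record = Dict[str, str]


def normalize(record: Record) -> str:
    """Build a lower-cased comparison key from the assumed fields."""
    return " ".join(record[key].lower().strip() for key in ("name", "city"))


def similarity(a: Record, b: Record) -> float:
    """Return a 0..1 similarity ratio between two records."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_similar_pairs(records: List[Record],
                       threshold: float = 0.85) -> List[Tuple[int, int]]:
    """Return index pairs whose similarity meets the threshold.

    Compares all pairs, so it only suits small datasets.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        if similarity(a, b) >= threshold:
            pairs.append((i, j))
    return pairs


# Made-up supplier records from different branches.
records = [
    {"name": "Acme Fasteners Inc.", "city": "Detroit"},
    {"name": "ACME Fasteners Inc", "city": "Detroit"},
    {"name": "Great Lakes Glass", "city": "Toledo"},
]

similar = find_similar_pairs(records)
```

Here the first two records are flagged as likely duplicates while the third is left alone. An all-pairs comparison grows quadratically with the number of records, so a real project at the scale discussed above would need a far more scalable matching strategy.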

To learn more about how Tamr helps corporations leverage analysis-ready real world evidence, download our overview here.

Tamr Real World Evidence (RWE) Integration

Whitepaper

Tamr’s Real World Evidence Integration Solution tackles the data challenge with a powerful data harmonization process that streamlines real world evidence integration for large volumes of disparate data to deliver transformational outcomes.

Keziah is a Field Engineer at Tamr, responsible for ensuring the full automation of data pipelines for Fortune 500 companies, customizing the core product to individual customer needs using Java, and engineering features for use in machine learning using SQL and Python. Keziah has a PhD in Ecology from Colorado State University.