Written by Michael Collins
A recent Forbes article by Randy Bean and Grant Stephen outlined 5 ways that Big Data and AI will impact Life Sciences firms in 2018. While some of the predictions were oriented around the use of new technologies (like blockchain) to improve business operations, most reflected the tremendous business value of solving age-old data management problems.
One prediction stated that in our current political environment, there is a hostility towards high drug prices and, as a result, it will be essential for firms to “defend their research budgets and their profit margins by utilizing robust data that clearly demonstrate the value of their products”. In doing so, there is a need to integrate data from the real world (e.g. medical insurance claims), genomic research and clinical trials to “unlock answers to a host of high-value questions such as the true effectiveness of treatments which can then be used to defend pricing positions.”
At its core, this prediction is about the value of mastering scalable data integration, a competency that enterprises across all industries have struggled with for quite some time. The ambition of Life Sciences firms to integrate their vast array of disparate datasets to answer larger, more complex business and clinical questions has been hampered by heavily manual, top-down approaches. Most often, these companies attempt to solve their integration challenges with large teams of programmers or with traditional ETL tools. Unfortunately, these techniques depend heavily on manually scripted rules, squashing any hope of achieving scalability. See Dr. Michael Stonebraker’s white paper on this topic.
Most Life Sciences organizations have not yet realized the benefits of applying artificial intelligence, specifically machine learning, to their data integration efforts, and we believe that will change in 2018. We expect more of them to recognize the transformative business impact of the complete, accurate and timely insight generated by applying machine learning to the task of integrating datasets. A great example is the work we’ve done with Amgen in its quest to build a translational data platform at scale. Amgen used Tamr to pull together data from across the organization, including research data, biospecimen data and clinical data, so that its scientists could analyze a multitude of studies, treatments and specimens. Scientists can now discover new therapies or new applications of existing therapies, and have their questions answered and hypotheses tested in a timely manner. Tamr’s machine learning-based approach processed over 200 legacy clinical studies (~4,000 SAS files) within a year using an average of 2 FTEs, an impressive feat that gave Amgen the scalable data integration capabilities it needed.
Finally, we believe that over the past year, best practices for building scalable R&D data platforms have started to emerge that will help firms “see around the corner” when using their data to unlock answers to critical questions, whether to defend drug pricing decisions or beyond. In any case, 2018 will be a tipping point, when Life Sciences firms start to see outsized gains in both growth prospects and patient outcomes as a result of applying AI to their most basic, yet most challenging, data issues.