(Official White House Photo by David Lienemann)
Subject: Fueling the Moonshot
Dear Mr. Vice President:
We at Tamr have great individual and institutional passion for what you’re attempting to do with the White House Cancer Moonshot Task Force: “double the rate of progress in the fight against cancer.” And how you’re proposing to do it: by systematically clearing out “the bureaucratic hurdles — and, quite frankly, let[ting] science happen.”
Clearing out hurdles to progress. Nothing energizes good entrepreneurs and engineers more than that aspiration. That you’re hell-bent on doing it within the federal government and cancer research ecosystem? Even better.
So count us in. We’re game to help on the data side, where the volume and variety of “big cancer data” may be the Moonshot’s most valuable asset and its greatest challenge. Namely, organizing all of this shared data, preparing it and then unifying it for analysis. This is what we do at Tamr; as we’ve announced today via a free software license offer, we’d love to apply our technology and expertise to the objective you raised in your recent Medium post:
… [W]e have the potential to take advantage of big data and advances in supercomputing with greater data sharing. Almost every cancer center keeps a database of information — genetic history, medical records, and tissue banks — that might hold the key to improving certain cancer therapies. Allowing researchers and oncologists to tap into this treasure trove of information is absolutely vital to speeding up the pace of progress toward a cure. If we ensure this data is interoperable and accessible for scientists, researchers and physicians, the consensus is that we can absolutely speed up research advances, improve patient care and get ourselves closer to a cure.
As you note, data sharing and access are no doubt pivotal steps in accelerating progress. This also tracks with a recent proposal by the International Committee of Medical Journal Editors (ICMJE) that would require authors, as a condition for clinical trial report publication, to share “the de-identified individual-patient data (IPD) underlying the results presented in the article.”
Candidly, though, increased sharing and access are only going to get the Moonshot into orbit. Landing the ship and planting the flag will take a deep, sustained commitment to the other requirement you highlighted: interoperability. As in integrating, curating and unifying the many diverse data sources that will flood into this suddenly opened system.
Think about analyzing how lung cancer patients respond to different therapies currently being administered. Thousands of clinics, using slightly different names for the same drugs, recording results idiosyncratically, saving files in different formats. You get the picture.
As we’ve seen from our work with biopharmas over the years, the knowledge to be gained in the aggregation and integration of the work of thousands of individual scientists can be transformational. But this diversity comes with a cost: the sheer variety of sources and formats … and the standardization and integration required before the aggregated data becomes useful.
Two current federal examples suggest data standardization and integration could pose a considerable challenge for the Moonshot:
- By the end of this year, the FDA will require that all electronic data for clinical trial submissions conform to Clinical Data Interchange Standards Consortium (CDISC) standards, which support the acquisition, exchange, submission and archiving of clinical research data and metadata. Historically, converting this clinical data to approved CDISC formats manually has proven time-consuming and expensive — creating a human bottleneck that dramatically slows data analysis and understanding.
- The Office of the National Coordinator for Health IT (ONC) is in the 12th year of its mission to build “an interoperable, private and secure nationwide health information system.” ONC recently acknowledged that its goal remains “a work in progress” – which would seem to reflect in particular the complexities of electronic health record standardization.
These standardization and integration challenges, taken together, lead us to suggest a few ideas that could help you launch the Moonshot with the right trajectory, enough boost and plenty of fuel to reach your destination – fast.
- Think outside the (federal) box for data “command and control.” Vesting a federal agency like the FDA or HHS with the task of designing and implementing a data access and interoperability system would be an old, tired approach to a problem that will take innovation and energy to solve. So why not take a “SpaceX” approach with a public/private partnership among government entities, cancer centers and the pharma and tech industries? The benefits of the Moonshot will accrue to all parties, creating an opportunity to align data interests in a framework optimized for speed and innovation.
- Solve for interoperability and unification from the start. Most cancer research organizations can’t even look at all the data they already have on studies they’ve run, much less effectively incorporate data from the outside. Your mission demands breaking through enterprise silos and crossing the boundaries of hundreds – maybe even thousands – of organizations. Without a plan for unifying this data for analysis quickly, it will be extremely difficult to make sense of what’s in front of you.
- Invest today in the tech of tomorrow. In the case of interoperability, this means machine learning technology like Tamr’s that automates the vast majority of data preparation and unification across thousands of sources. When the machine can’t solve the problem on its own, data experts can weigh in with human guidance that gets fed back into the system – to make it even smarter and faster the next time around. At Tamr, we’ve been using this very approach for several large pharmas to catalog and unify data from hundreds or even thousands of research scientists and groups – helping them make sense of their own “treasure troves” of information.
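To make the last suggestion concrete, here is a minimal sketch of the "machine first, expert when needed" loop, applied to the drug-name problem described earlier. It is an illustration only and not Tamr's actual method: simple string similarity from Python's standard library stands in for a trained model, and the alias table, the 0.85 threshold, and the function names `match_drug` and `record_expert_answer` are all assumptions made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical alias table (illustrative only): spelling seen in a source -> canonical name.
KNOWN_ALIASES = {
    "carboplatin": "carboplatin",
    "paraplatin": "carboplatin",
    "pembrolizumab": "pembrolizumab",
    "keytruda": "pembrolizumab",
}

def normalize(raw):
    """Lowercase, treat hyphens as spaces, and collapse repeated whitespace."""
    return " ".join(raw.lower().replace("-", " ").split())

def match_drug(raw, aliases, threshold=0.85):
    """Return (canonical_name, score); canonical_name is None when the
    best similarity score falls below the threshold (needs expert review)."""
    name = normalize(raw)
    if name in aliases:
        return aliases[name], 1.0
    best_alias, best_score = None, 0.0
    for alias in aliases:
        score = SequenceMatcher(None, name, alias).ratio()
        if score > best_score:
            best_alias, best_score = alias, score
    if best_score >= threshold:
        return aliases[best_alias], best_score
    return None, best_score

def record_expert_answer(raw, canonical, aliases):
    """Feed an expert's decision back in so the same spelling resolves automatically next time."""
    aliases[normalize(raw)] = canonical

if __name__ == "__main__":
    aliases = dict(KNOWN_ALIASES)
    print(match_drug("Carboplatine", aliases))  # near-miss spelling, matched automatically
    print(match_drug("MK-3475", aliases))       # no close match, so it goes to an expert
    record_expert_answer("MK-3475", "pembrolizumab", aliases)
    print(match_drug("mk 3475", aliases))       # now resolved from the expert's answer
```

The design point is the feedback step: each expert answer is stored, so the share of records needing human review should shrink as more sources are processed. A production system would replace the string-similarity heuristic with a learned model and track who approved each mapping.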
Beyond these suggestions, we’d also like to offer our help: Tamr expertise and technology to apply this next-generation approach to data unification. Our staff has decades of experience solving biopharma data management and integration problems — with many lessons learned along the way. Our data unification solutions are grounded in these lessons as well as in advanced database and machine learning technologies pioneered by some of the best minds and institutions in the business. We’d be honored to lend Tamr’s resources to your noblest purpose: “to end cancer as we know it.”
Nidhi Aggarwal, Product and Strategy Lead, Tamr, Inc.
Moonshot organizations: Click here to apply for a complimentary Tamr software license.