Written by Tamr
An international pharmaceutical company has a massive problem: critical research data from thousands of scientists in labs across the globe is trapped in tens of thousands of spreadsheets. Associates need to find data on compounds, assays and reactions quickly in order to increase research efficiency and decrease drug-development costs. But the information is locked in spreadsheets with more than 100,000 different attribute names, millions of rows, and inconsistencies in labeling, measurement units and even language, all managed by 8,000 scientists with little ability or incentive to clean and share the information.

For the initial project, Tamr analyzed thousands of spreadsheets from more than 400 scientists. The system automatically matched 86% of attributes. Then, using Tamr’s expert sourcing functions, a team of 40 people improved the mapping by a further 10%. While the company found the matching results impressive on their own, it saw the true benefit of the project in the cultural shift it inspired: data source owners began to proactively improve the organization’s data quality.