Allie Gilland
Analytics Engineer
October 14, 2021

Three Essential Steps that Improve Data Quality and Usability Through Enrichment

Introduction:

Every business today wants to be data-driven and to base its decisions on up-to-date, accurate information. However, incomplete, incorrect, and missing data lowers trust in operational and analytical decision-making about key business relationships, from customers to suppliers.

In an effort to improve data quality, data enrichment services have emerged as a way to transform existing bad data into a reliable and actionable source. As the number of data providers grows, the data enrichment landscape is becoming more complicated. In this post, we highlight the key features needed for a robust and successful enrichment service.

What is Data Enrichment?

Data enrichment is the process of increasing the value of a customer’s data assets by integrating internal data assets with external data.

Overview:

Data enrichment improves the quality of your company’s existing data and adds relevant or missing information to make that data more usable. For existing data, enrichment can ensure that the captured fields are valid, have correct syntax, and are ready for use. The information that enrichment adds may also open the data up to more uses and applications.
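To make "valid, correct syntax, and ready for use" a little more concrete, here is a minimal sketch of a standardize-then-validate pass over a single record. The field names (`name`, `email`, `country`) and the rules themselves are assumptions chosen for illustration; a real enrichment service will apply far richer checks.

```python
import re

# Deliberately loose email pattern: one "@", no whitespace, a dot in the domain.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def standardize(record):
    """Trim whitespace and normalize casing on a few illustrative fields."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    if cleaned.get("email"):
        cleaned["email"] = cleaned["email"].lower()
    if cleaned.get("country"):
        cleaned["country"] = cleaned["country"].upper()
    return cleaned

def validate(record):
    """Return a list of human-readable issues; an empty list means the record passed."""
    issues = []
    if not record.get("name"):
        issues.append("name is missing")
    email = record.get("email")
    if email and not EMAIL_RE.match(email):
        issues.append("email has invalid syntax")
    return issues

raw = {"name": " Acme ", "email": "Info@Example.COM ", "country": "us"}
clean = standardize(raw)
problems = validate(clean)
```

Records that come back with issues can be routed for review before any external data is added on top of them.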

Common providers of external enrichment data for corporate and legal entity information include D&B, Companies House, and GLEIF. However, data providers vary widely by industry. For example, IQVIA and Verinovum specialize in healthcare data enrichment, from patient medical records to clinical data surrounding specific chronic conditions. By adding information and cleaning up existing data, enrichment improves data quality and, ultimately, usability.

Example of Data Enrichment:

In the image above, the existing fields were validated and standardized, and the records were mastered and enriched with firmographic information. Before mastering, two duplicate records represented the same entity, each with dirty and differing information. Once mastered and enriched, there is one unique record with correct and clean information, plus additional firmographic data that can now be used to construct a corporate hierarchy for this company.

Recently we put this strategy into practice using Tamr on a list of 30,000+ attendees from Dreamforce, Salesforce’s annual conference. We knew that the attendee list was full of missing or incorrect data. These data quality issues would slow marketing’s time to outreach – an important metric for successful outreach – and impact our chances of turning attendees into leads. So, with Tamr’s integrated data mastering and enrichment solution, we were able to obtain complete and correct golden records of attendees and match them into our Salesforce system.

Below is an example of the enrichment that Tamr was able to provide for the company attendee records from Dreamforce:

Before Tamr Enrichment:

After Tamr Enrichment:

The real example above shows the type of value that enrichment provides by improving existing data sets with new and complete information. Using data enrichment, our marketing team now has complete, golden records for their outreach campaign. You can read more about our internal usage of Tamr in the blog post.

Three Steps for Successful Data Enrichment:

Ensuring that data is accurately and consistently enriched is easier said than done. Successful data enrichment therefore depends on a robust, DataOps-driven process. An agile and effective enrichment service involves the following components:

Data Mastering: Before adding additional data sources or new fields to enhance the quality of the data, it is important to ensure that the existing data foundation is in the best shape possible. If the original data is extremely messy to begin with, it’s likely you won’t achieve the intended results. If a critical record attribute such as ‘name’ is misspelled, enrichment services will likely return a null answer or add the wrong information. For example, a one-letter difference, “Sunoco” instead of “Sonoco”, is the difference between adding information about a Texas-based oil and gas distribution company and adding information about a South Carolina industrials company. Enriching data before cleaning and curation take place can also be expensive. Many services charge on a per-record basis, so duplicate records mean paying to enrich the same entity more than once. Data mastering deduplicates and validates information, creating a single view of each record. The combined efforts of data mastering plus enrichment provide clean, up-to-date records that allow you to extract the maximum value from your data.
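The cost argument above can be sketched in a few lines. This is not how Tamr masters data; it is a deliberately simple stand-in that groups records on a normalized name key and merges each group, just to show why mastering first reduces the number of records sent to a per-record enrichment service. The suffix list, the sample records, and the phone number are all made up for illustration.

```python
import re
from collections import defaultdict

# Illustrative list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co"}

def normalize_name(name):
    """Lowercase, drop punctuation, and drop common legal suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(t for t in cleaned.split() if t not in LEGAL_SUFFIXES)

def master(records):
    """Merge records that share a normalized name, keeping the first
    non-empty value seen for each field."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize_name(rec["name"])].append(rec)
    golden = []
    for group in groups.values():
        merged = {}
        for rec in group:
            for field, value in rec.items():
                if value and not merged.get(field):
                    merged[field] = value
        golden.append(merged)
    return golden

records = [
    {"name": "Sonoco Products Co.", "city": "Hartsville", "phone": ""},
    {"name": "SONOCO PRODUCTS CO", "city": "", "phone": "555-0100"},
    {"name": "Sunoco LP", "city": "Dallas", "phone": ""},
]
golden = master(records)
lookups_saved = len(records) - len(golden)  # per-record calls avoided by mastering first
```

Note that an exact match on a normalized key keeps “Sunoco” and “Sonoco” apart, but it would not catch a genuine typo; that is where a proper mastering solution earns its keep.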

Data pipeline integration: Data becomes stale, meaning enrichment can’t be treated as a one-off process. It is something that needs to be executed continually, to ensure that your data is always up to date and of the highest quality. Embedding enrichment services within the data pipeline allows for seamless and consistent enrichment integration and execution, no matter how sources or information change. By integrating enrichment with your data pipeline, you can create a single solution for data processing and implementation, while avoiding the extremely time-consuming and manual work that goes into managing processes outside of a data pipeline. A consistent, robust, and repeatable enrichment solution built into the flow of operations improves efficiency as well as the time-to-value of applying data to gather insights.
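One way to picture enrichment as a repeatable pipeline step, rather than a one-off job, is a function that runs on every pipeline execution and refreshes only the records that are new or stale. The sketch below assumes a hypothetical `enriched_at` timestamp on each record, a 90-day refresh window, and a placeholder `fake_lookup` function standing in for a real enrichment call.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=90)  # assumed refresh window, tune to your data

def is_stale(record, now):
    """A record is stale if it was never enriched or was enriched too long ago."""
    last = record.get("enriched_at")
    return last is None or now - last > MAX_AGE

def run_enrichment_step(records, lookup, now=None):
    """Enrich stale records in place and return how many were refreshed."""
    now = now or datetime.now(timezone.utc)
    refreshed = 0
    for rec in records:
        if is_stale(rec, now):
            rec.update(lookup(rec["name"]))
            rec["enriched_at"] = now
            refreshed += 1
    return refreshed

def fake_lookup(name):
    """Placeholder for a call to an enrichment service (hypothetical)."""
    return {"industry": "unknown"}

now = datetime(2021, 10, 14, tzinfo=timezone.utc)
records = [
    {"name": "Sonoco", "enriched_at": None},
    {"name": "Sunoco", "enriched_at": now - timedelta(days=10)},
    {"name": "Tamr", "enriched_at": now - timedelta(days=200)},
]
refreshed = run_enrichment_step(records, fake_lookup, now)
```

Because the step decides for itself which records need work, it can be scheduled alongside the rest of the pipeline without manual bookkeeping.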

Avoiding reliance on just one data provider: No external data provider offers all information about people or organizations. The breadth of useful external data sources continues to grow, which opens opportunities for companies to enrich their data with new attributes that were once unattainable. For example, healthcare companies could use claims information from healthcare providers or health insurers with IQVIA, or chronic condition management data with Verinovum. Often, it’s necessary to combine several external data sources to gain the holistic data view needed. It’s very important to ensure that enrichment services, and any external reference data they use, draw on a variety of robust sources that are constantly being updated. By enriching with relevant and curated reference data, you can ensure that your data is complete and contains the important information you need, adding value and practical uses.
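Combining several providers usually comes down to a precedence rule: ask each source in priority order and let a later source fill only the fields that are still empty. Here is a minimal sketch of that idea. The provider names (`registry`, `firmographics`), their lookup functions, and every returned value are hypothetical stand-ins, not real provider APIs or real data.

```python
def enrich_from_providers(record, providers):
    """Fill empty fields from providers in priority order.

    providers is a list of (provider_name, lookup) pairs, highest priority first.
    Returns the enriched record and a map of field -> provider that supplied it.
    """
    enriched = dict(record)
    provenance = {}
    for provider_name, lookup in providers:
        for field, value in lookup(record).items():
            if value is not None and enriched.get(field) in (None, ""):
                enriched[field] = value
                provenance[field] = provider_name
    return enriched, provenance

def registry_lookup(record):
    """Hypothetical legal-entity source: knows the legal name, not headcount."""
    return {"legal_name": "Example Corporation", "employees": None}

def firmographics_lookup(record):
    """Hypothetical firmographic source with made-up sample values."""
    return {"legal_name": "Example Corp.", "employees": 250, "industry": "Packaging"}

record = {"name": "Example Corp", "industry": ""}
enriched, provenance = enrich_from_providers(
    record,
    [("registry", registry_lookup), ("firmographics", firmographics_lookup)],
)
```

Keeping the provenance map alongside the enriched record makes it easier to audit where each value came from when sources disagree.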

Conclusion:

Enrichment is an important way to improve the quality, and thus the value, of data. In addition to using enrichment to support information about both people and organizations, it’s critical to consider the entire enrichment process: use it in conjunction with mastering, integrate it within a data pipeline, and base it on multiple sources and data providers. Combining these features gives you the most complete and accurate data, which you can then apply to whatever business need exists for a greater return on investment.