Tamr Insights
October 11, 2021

21% of Dreamforce registrants used personal email: Fixing the Salesforce data quality problem

Getting your hands on Dreamforce’s list of attendees is, as the name suggests, quite literally a marketer’s dream. Salesforce’s annual event is one of the best-known and best-attended technology conferences of the year. It has truly become a global event: this year it took place not only in the company’s hometown of San Francisco but also in cities around the world, including New York City, London, and Paris, as well as online. With over 100,000 technology enthusiasts, employees from all types of businesses, and a list of celebrities and thought leaders deserving of a Hollywood red carpet, the event is famous for its festival-like atmosphere.

This year Tamr gave a presentation on how to apply machine learning to build accurate account hierarchies. As a presenter and exhibitor, we received the contact list of 39,100 Dreamforce attendees interested in post-event follow-up. Our sales and marketing team was eagerly awaiting the list and ready to get to work. But as we scrolled through the Excel sheet of contacts, our team had questions:

  • What companies are in our target market and already in our Salesforce?
  • Which contacts have we already built a relationship with?
  • What do we know about the account hierarchies of the attendees?
  • And, given the blank fields, spelling errors, and made-up contact information we noticed, can we trust the accuracy of the data?

We needed accurate, holistic account data and we needed it fast. It was time to get to work and convert contacts to leads.

The state of Dreamforce data

Our first step was to understand the state of the data. Bad data plagues most systems, from CRMs to ERPs to analytics and BI tools, limiting customer insights and frustrating business users. It’s important to assess the validity of new data sources before the data can be put to work.
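A first-pass assessment like this can be sketched in a few lines of Python. This is a minimal illustration, not the process we ran: the field names (`name`, `email`, `company`) and the short list of personal email domains are assumptions, and a real list of consumer domains would be much longer.

```python
from collections import Counter

# Assumption: a small, illustrative set of consumer email domains.
PERSONAL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com", "outlook.com"}


def profile_contacts(contacts, fields=("name", "email", "company")):
    """Return simple data quality counts for a list of contact dicts."""
    blanks = Counter()
    personal = 0
    for row in contacts:
        # Count fields that are missing, None, or whitespace only.
        for field in fields:
            if not (row.get(field) or "").strip():
                blanks[field] += 1
        # Flag addresses whose domain is on the personal-domain list.
        email = (row.get("email") or "").strip().lower()
        domain = email.rpartition("@")[2]
        if "@" in email and domain in PERSONAL_DOMAINS:
            personal += 1
    total = len(contacts)
    return {
        "total": total,
        "blank_counts": dict(blanks),
        "personal_email_share": personal / total if total else 0.0,
    }
```

Calling `profile_contacts(rows)` on a list of dictionaries read from the spreadsheet gives a quick view of how many records have blank fields and what share of addresses use a personal domain, which helps decide how much cleanup the list needs before outreach.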

Here are a few of the challenges we noted in the data we received from Salesforce on attendees:


Employees also entered their company’s name in lots of different ways. Given Salesforce’s mammoth scale, with 50,000+ employees, it’s no surprise that Salesforce’s own employees were the largest group of attendees at the conference. Here are just some of the ways employees listed ‘Salesforce’ under the company field:



While the data was far from perfect, these were data quality challenges we could handle. The cleaned list of company contacts showed high potential, so the next step was to integrate the list with our internal data in Salesforce to gain a more complete view of contact information and target accounts.

656 Dreamforce company leads are target accounts in our Salesforce

For Tamr, the ultimate result was turning a lead list filled with errors and incomplete information into 656 matched and targeted account follow-ups, integrated with our existing Salesforce accounts. Below is an example of the before and after view on some key company account attributes:



Not much for our marketing team to work with! We (sadly) know that we’re not the only company reaching out to contacts after Dreamforce, so time is of the essence and differentiated account follow-up is a must. This data won’t cut it.




By enriching the data and matching leads to our existing Salesforce accounts, we can not only prioritize leads for outreach but also make more informed, tailored decisions on how to interact with our prospects and customers. While we won’t provide the full overview of our internal golden record view of accounts at Tamr, the example shows some key attributes we added. Ultimately, we can give our sales reps and marketing team a richer view of contacts, complete with key account information and a holistic record of our past engagement with the lead.

Using data mastering to solve issues of bad data

How did we do it? Well, we used Tamr, of course!

Before an inbound contact list can be converted to sales leads, it’s vital to get the data in order – the goal is to make the data as clean, curated, consistent, complete, and consumable as possible.

Here are the steps we took with Tamr to master the Dreamforce dataset:

  1. Deduplicate and fix errors in the data
     Ensuring the data is accurate is key to speeding up outreach and lead conversion. Base checks include deduplicating records and ensuring key attributes like address and contact information are valid and up-to-date. The most common method for tackling data quality and record deduplication is a rules-based approach, typically lots of nested if-then statements. A rules-based approach works reasonably well for low-volume, low-variety, one-off requests. A machine learning approach is more accurate and efficient (70-90% less effort) for variable data sources and structures and large data volumes. By learning from examples and utilizing feedback for fuzzy match models, our machine learning approach can deal with complex data problems, such as recognizing that the 40 versions of the company name entered all belong to the Salesforce.com hierarchy.
  2. Enrich the data with other data sources
     Reducing friction in sign-up and data capture is a priority for most marketing teams. The trade-off is that limiting the information requested from customers often provides little insight into contacts to act on. Enriching the data with external sources provides more relevant, robust data on accounts to customize touchpoints. For our sales team, it’s critical to understand the corporate account hierarchy across regions, so we integrated data sources such as Companies House and LEI.
  3. Match to existing Salesforce accounts and feed the data to systems
     The data is only helpful if sales and marketing can put it to work. Through Tamr, each sales account has a unique ID that connects customer records within and across systems (Salesforce, HubSpot, SAP) and enables a golden record view of customers. We matched the lead list to our existing contact base using machine learning models to ensure a holistic view of accounts and better understand which contacts sit within our target market.
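To make the deduplication and account-matching ideas above more concrete, here is a minimal sketch that normalizes company names with a few rules and then fuzzy-matches each lead to an existing account ID. It is not Tamr’s machine learning model: it swaps in plain string similarity from Python’s standard `difflib` module, and the suffix list, the `0.85` threshold, and the account IDs are all hypothetical choices for illustration.

```python
import re
from difflib import SequenceMatcher

# Assumption: tokens to strip from names; a real rule set would be much longer.
SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "com"}


def normalize(name):
    """Lowercase, drop punctuation, and strip common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    kept = [t for t in tokens if t not in SUFFIXES]
    # Fall back to all tokens if stripping suffixes removed everything.
    return " ".join(kept or tokens)


def match_account(lead_company, accounts, threshold=0.85):
    """Return (account_id, score) for the best match, or (None, score).

    `accounts` maps a hypothetical unique account ID to an account name.
    """
    target = normalize(lead_company)
    if not target:
        return None, 0.0
    best_id, best_score = None, 0.0
    for account_id, account_name in accounts.items():
        score = SequenceMatcher(None, target, normalize(account_name)).ratio()
        if score > best_score:
            best_id, best_score = account_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score
```

With `accounts = {"ACC-001": "Salesforce.com, Inc."}`, both `match_account("SalesForce", accounts)` and `match_account("Sales force", accounts)` resolve to `"ACC-001"`, while an unrelated name falls below the threshold and returns `None`. A fixed threshold like this is exactly the kind of hand-tuned rule that becomes hard to maintain as data variety grows, which is where a model trained on labeled match examples has the advantage.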

Bad customer data is an ongoing struggle for most sales and marketing teams. The challenge is not new. In this instance, we solved many of the bad data issues by using Tamr’s data mastering capabilities and enriching the Dreamforce contact list with reliable third-party data. Possessing a more accurate and reliable list of Dreamforce attendees should help us be more effective with our marketing and sales outreach.