A Data-Driven Approach to Taxonomy Design
Organizations classify and organize data around taxonomies to manage large varieties of business assets. Accurate, comprehensive taxonomies help organizations drive vital strategic insights and efficient operations around processes involving their products, asset inventories, spending, and more.
However, organizations often fail to create effective taxonomies of key business assets because data volumes grow so large that organizing millions of products and transactions becomes too complex to reconcile. In large organizations that classify products into a wide range of categories, individual category managers are typically responsible for defining only subsets of the overall taxonomy, which creates a fragmented, inconsistent view of the data that yields few or faulty insights. Adding to the challenge, existing taxonomies are often outdated and rigid. All of these factors hinder the organization's ability to produce trusted, transformative data analytics.
Is your taxonomy effective?
So, how can you determine if a taxonomy is effective? One quick test is to observe how consistently assets and transactions are classified across the entire taxonomy. If you start digging into taxonomies, you might find surprises and unexpected complications.
Imagine that you’re a manufacturer with a spend taxonomy to categorize procured products such as industrial apparel, work gloves, and wiring. People may make subjective decisions about how to categorize these items. For instance, some may have put rubber gloves into a sub-category under Electrical Parts and Equipment because rubber gloves are protective equipment for working with electrical parts. Others may have put rubber gloves under Factory Supplies as common supplies found in the factory. Moreover, an item as generic as industrial apparel could reasonably fit in either sub-category.
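The consistency test described above can be sketched in a few lines of Python. This is a minimal illustration, not Tamr's method: the function name `find_inconsistent_items`, the record layout of (description, category) pairs, and the sample rows are all assumptions made for the example, and matching items by lowercased description is a deliberately naive stand-in for real record matching.

```python
from collections import defaultdict


def find_inconsistent_items(records):
    """Map each item description to its categories when it has more than one.

    `records` is an iterable of (description, category) pairs (hypothetical layout).
    """
    categories_by_item = defaultdict(set)
    for description, category in records:
        # Naive normalization: lowercase and collapse whitespace.
        key = " ".join(description.lower().split())
        categories_by_item[key].add(category)
    return {
        item: sorted(categories)
        for item, categories in categories_by_item.items()
        if len(categories) > 1
    }


# Hypothetical sample rows based on the rubber gloves example.
records = [
    ("Rubber Gloves", "Electrical Parts and Equipment"),
    ("rubber gloves", "Factory Supplies"),
    ("Wiring", "Electrical Parts and Equipment"),
    ("Industrial apparel", "Factory Supplies"),
]

conflicts = find_inconsistent_items(records)
for item, categories in conflicts.items():
    print(f"{item}: classified under {', '.join(categories)}")
```

Any item that appears in the result has been filed under at least two categories and is a candidate for review.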
Another test for effectiveness is the relative size of each category. Some categories may be overly broad, tempting managers to use them as catch-all buckets. As a result, the organization loses the product transparency and insights that could come from breaking the taxonomy down further. On the flip side, organizations may create categories that are too narrow or specific, which reduces the ability to gain broad insights into similar types of products. When taxonomy categories aren’t comprehensively and consistently defined, organizations may be forced to put a variety of supplies into the dreaded “Other” bucket, making everything in that vague category essentially unclassified, unmanageable, and unhelpful for tracking and analysis.
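The relative-size test can also be approximated with a simple script. The sketch below assumes transactions arrive as (category, amount) pairs and uses illustrative thresholds of 50% and 1% of total spend. Those cutoffs, the function names, and the sample figures are hypothetical and would need tuning for a real taxonomy.

```python
def category_spend_shares(transactions):
    """Return each category's share of total spend as a fraction between 0 and 1."""
    totals = {}
    for category, amount in transactions:
        totals[category] = totals.get(category, 0.0) + amount
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {category: total / grand_total for category, total in totals.items()}


def flag_categories(shares, broad_share=0.50, narrow_share=0.01):
    """Label categories that look like catch-alls, slivers, or an 'Other' bucket.

    The thresholds are illustrative assumptions, not recommended values.
    """
    flags = {}
    for category, share in shares.items():
        if category.strip().lower() == "other":
            flags[category] = "unclassified bucket"
        elif share >= broad_share:
            flags[category] = "possibly too broad"
        elif share <= narrow_share:
            flags[category] = "possibly too narrow"
    return flags


# Hypothetical spend data for illustration only.
transactions = [
    ("Factory Supplies", 600.0),
    ("Electrical Parts and Equipment", 250.0),
    ("Other", 145.0),
    ("Left-Handed Work Gloves", 5.0),
]

shares = category_spend_shares(transactions)
flags = flag_categories(shares)
for category, reason in flags.items():
    print(f"{category}: {shares[category]:.1%} of spend ({reason})")
```

Measuring share by spend rather than by item count is a design choice; counting items or transactions instead would surface a different set of oversized and undersized categories.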
In addition to issues of accuracy and specificity, poorly designed taxonomies can cause unnecessary future disruptions to the business. Product taxonomies often take months to create and are developed with a forward-looking mindset to encapsulate the organization’s view of its business. However, once implemented, the taxonomy may not complement the actual product data being collected by the business, prompting a redesign of the taxonomy structure. Obtaining comprehensive cluster views of similar products is the key to data-driven taxonomies.
To avoid taxonomy issues and align business vision with actual product data, Tamr’s unification platform provides clustered views of products based on qualitative attributes, such as data descriptions, and on measures of business value, such as associated dollar amounts or volume.
Tamr product clusters give the organization an opportunity to explore and better understand the landscape of its product data before committing to a specific classification taxonomy. Organizations can then adjust their taxonomy so that it conforms to the reality of the business as reflected in the data. Organizations using this approach may explore how different groupings of product clusters interact with their current or planned taxonomy structure.
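One way to explore how clusters line up with an existing taxonomy is to check, for each cluster, how concentrated its members are in a single category. The sketch below is an assumption-laden illustration and does not use or represent Tamr's API. The cluster IDs are presumed to come from some upstream clustering step, and `cluster_purity` is a hypothetical helper name.

```python
from collections import Counter, defaultdict


def cluster_purity(assignments):
    """For each cluster, return (dominant category, fraction of members in it).

    `assignments` is an iterable of (cluster_id, category) pairs (hypothetical layout).
    """
    counts = defaultdict(Counter)
    for cluster_id, category in assignments:
        counts[cluster_id][category] += 1
    report = {}
    for cluster_id, counter in counts.items():
        category, hits = counter.most_common(1)[0]
        report[cluster_id] = (category, hits / sum(counter.values()))
    return report


# Hypothetical cluster assignments for illustration only.
assignments = [
    ("gloves", "Factory Supplies"),
    ("gloves", "Electrical Parts and Equipment"),
    ("gloves", "Factory Supplies"),
    ("gloves", "Factory Supplies"),
    ("wiring", "Electrical Parts and Equipment"),
]

purity = cluster_purity(assignments)
for cluster_id, (category, fraction) in purity.items():
    print(f"{cluster_id}: {fraction:.0%} in {category}")
```

A cluster with low purity suggests that similar products are being split across categories, which is a signal to revisit where the category boundaries are drawn.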
Instead of relying solely on subjective categorization decisions, Tamr’s data-driven taxonomy approach provides organizations with objective data points to adjust specific categories in an agile cadence over time. Thus, Tamr’s data unification platform can be used to continually maintain a taxonomy’s effectiveness and even support long-term projects like overhauling an entire taxonomy. Any changes to the taxonomy structure can be implemented automatically with Tamr’s machine learning engine.
The potential for tens of millions in savings
Using a machine-learning, data-driven approach, Tamr customers are achieving better data quality, improving data asset management, and giving data stewards more control over data curation. Better still, customers can easily use human-guided machine learning and data experiences to identify trends and patterns in their business.
The results for our customers are impressive. One $18 billion global industrial firm used Tamr to increase the granularity of its taxonomy with 15% more categories and to completely eliminate its “Other” bucket, which had accounted for over 10% of its spend. Through this engagement, the firm realized more than $10 million in savings.
People in data analytics agree that taxonomies are complex and sometimes painful to get right. Tamr’s goal is to provide an approach to reduce the complexity, time, and subjectivity of creating and maintaining them.
Get the details you need to start taking a programmatic approach to consistently monitor and evaluate the effectiveness of different classification taxonomies in our white paper below.