Written by Sohaiyla Khalili
Before we get into how Tamr is tackling data transformation, let’s start with a definition. Data transformation is the process of converting data from one format or structure into another. Tamr improves data pipeline management by letting users perform transformations at scale within the Unify UI. Along with saving time and cost, transformations improve the accuracy and usability of the unified datasets, making the data more valuable to all downstream applications.
Tamr transformations are Spark-powered for speed and scalability. Users can apply transformations to all records in the unified dataset or only to records from specific source datasets. Common transformations such as fill can be added from easy dropdowns, or you can code your own for more advanced cases. If at any point while coding a transformation you would like to learn more about a function, simply command-click it to open our documentation. Further benefits are listed below:
- Color-blind friendly
- Simplify maintenance of your data pipeline by being able to easily modify transforms
- Reduce maintenance risk by using a SQL-like syntax that is understood by many, allowing for a seamless transfer of ownership
- Increase trust in the data with the ability to collaborate and iterate with multiple users
- Easily perform tasks such as feature extraction that enrich mastering and classification pipelines
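To make the idea of a scoped fill transformation concrete, here is a minimal sketch in plain Python. It is not Tamr's transformation syntax or API. The `fill` helper, the `origin_source` field, and the sample records are hypothetical names chosen for illustration. The sketch only shows the logic described above: replace missing values in a column, either for every record or only for records from selected source datasets.

```python
from typing import Any, Dict, List, Optional, Set

Record = Dict[str, Any]


def fill(
    records: List[Record],
    column: str,
    default: Any,
    source_datasets: Optional[Set[str]] = None,
) -> List[Record]:
    """Return copies of `records` with empty values in `column` replaced.

    If `source_datasets` is given, only records whose (hypothetical)
    `origin_source` field is in that set are eligible for filling.
    """
    filled = []
    for record in records:
        in_scope = (
            source_datasets is None
            or record.get("origin_source") in source_datasets
        )
        value = record.get(column)
        is_empty = value is None or value == ""
        if in_scope and is_empty:
            # Build a new dict so the input records are left untouched.
            record = {**record, column: default}
        filled.append(record)
    return filled


# Hypothetical unified dataset drawn from two source datasets.
records = [
    {"origin_source": "crm", "country": None},
    {"origin_source": "erp", "country": ""},
    {"origin_source": "erp", "country": "US"},
]

# Fill across all records in the unified dataset.
all_filled = fill(records, "country", "Unknown")

# Fill only records that came from the "erp" source dataset.
erp_only = fill(records, "country", "Unknown", source_datasets={"erp"})
```

In this sketch the scoping is just a set-membership check on each record, which is one simple way to express "apply this transform only to certain sources". A Spark-backed engine would apply the same rule in parallel across the dataset.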
We are very excited by these powerful transformation capabilities and the improvements they will bring to the unified datasets.
The demo video below shows the transformation capabilities and highlights the ease of use for the user.