How to Avoid the 10 Big Data Analytics Blunders
Leading organizations are leveraging an analytics-driven approach—fueled and informed by data—to achieve marketplace advantages and create entirely new business models. However, even the savviest companies are repeating common missteps. I recently gave a presentation on this very topic, which you can watch here.
Here are the top 10 blunders we see in working with our customers—plus, insights into how you can work to overcome them.
Blunder #1: Not planning to become cloud-exclusive.
If your organization isn’t planning to become cloud-exclusive, you could be backing losing technology. The cloud is more elastic than your in-house infrastructure and more cost-effective in the long run.
The cloud will save your organization a raft of money, let you take advantage of new technologies through elastic compute, and open your business to the world. Tamr is the only cloud-native data mastering solution for Google Cloud, AWS, and Microsoft Azure.
Blunder #3: Assuming data scientists alone have big data analytics covered.
You’ve hired data scientists, so you think you’ve got big data analytics covered. Wrong.
Without clean data, your data science initiatives will fail!
Don’t just take my word for it: hear DataRobot’s co-founder and CEO explain just how critical clean data is for any transformational data science.
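To make the point concrete, here is a minimal, hypothetical sketch (the customer names, amounts, and the one cleaning rule are all invented for illustration) of how a single unmastered entity quietly distorts a revenue rollup:

```python
# One customer appears under three spellings, so a naive rollup
# splits a single $2,500 account into three small ones.
from collections import defaultdict

orders = [
    {"customer": "Acme Corp", "amount": 1200},
    {"customer": "ACME Corp.", "amount": 800},
    {"customer": "Acme Corporation", "amount": 500},
    {"customer": "Globex", "amount": 900},
]

revenue = defaultdict(int)
for order in orders:
    revenue[order["customer"]] += order["amount"]
print(dict(revenue))  # "Acme" shows up as three separate accounts

# A toy normalization rule; real mastering needs far more than this.
def normalize(name: str) -> str:
    key = name.lower().rstrip(".")
    return "acme corp" if key.startswith("acme") else key

clean_revenue = defaultdict(int)
for order in orders:
    clean_revenue[normalize(order["customer"])] += order["amount"]
print(dict(clean_revenue))  # the Acme account is whole again
```

Every analysis downstream of the dirty version, from forecasting to churn models, inherits the same distortion.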
Blunder #5: Shoehorning unstructured data into your data warehouse.
Data warehouses are great for structured data from around 10 data sources, but they don’t work for unstructured data like text, images, and video. Many companies have bought into traditional data warehouse technology that costs up to seven figures a year, yet is only useful in a limited way. If you have a data warehouse, don’t try to shoehorn unstructured data into it.
Blunder #7: Believing that data lakes will solve all your problems.
Many people assume that if a company loads all its data into a data lake, a centralized repository for all data, it will be able to correlate all its data sets. But companies often end up with data swamps, not data lakes. Just consolidating data into one environment isn’t a solution by itself.
Companies need to clean the data in their lakes with a data curation system. Tamr, from its academic roots at MIT to today, has worked through the incredibly hard math and technical hurdles to deliver clean, curated data for our customers.
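As a sketch of the symptoms a curation effort has to confront, a quick stdlib-only profile (the source names and fields below are invented) shows why raw lake data doesn’t correlate on its own: the same kind of entity arrives under different schemas, with missing values:

```python
# Hypothetical lake contents: two sources describe customers,
# but with different field names and gaps -- classic swamp symptoms.
records_by_source = {
    "crm_export": [
        {"id": 1, "email": "a@example.com", "phone": None},
        {"id": 2, "email": None, "phone": "555-0101"},
    ],
    "web_signups": [
        {"user_id": "u7", "mail": "b@example.com"},
    ],
}

# Profile each source: which fields appear, and how many values are null.
for source, records in records_by_source.items():
    fields = sorted({field for record in records for field in record})
    nulls = sum(1 for r in records for v in r.values() if v is None)
    print(f"{source}: fields={fields}, nulls={nulls}")
```

No join key lines up across the two sources ("id" vs. "user_id", "email" vs. "mail"), which is exactly the gap a curation system has to close before the lake becomes queryable.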
Blunder #9: Succumbing to the Innovator’s Dilemma.
In his classic book The Innovator’s Dilemma, Harvard Business School professor Clayton Christensen argues that when technology changes, a vendor selling the “old stuff” finds it very difficult to pivot to the new stuff without losing significant market share in the process.
As a business, you have to be willing to change and evolve when it is needed. It’s possible—and even likely—that a reinvention will hurt your business in the short term, but it’s absolutely necessary to stay in business for the long run.
If you work for a company that’s falling into any of the above blunders, figure out how to fix it—or start looking for a new job.
Blunder #2: Not planning for AI/ML to be disruptive.
Make no mistake: AI and machine learning will displace some of your workers and have the potential to upend how you handle your operations. There are only two choices: you can be a disruptor, or you can be disrupted.
Tamr is disrupting traditional rules-based MDM with our machine learning-first approach, closing the gap between data and analytic outcomes. Learn more about our modern approach by reading our “Essential Buyer’s Guide to Cloud MDM”.
Blunder #4: Believing that traditional data integration techniques will solve issue #3.
Clean, integrated data at scale has become nearly impossible to achieve with traditional techniques and technologies. Extract, transform, load (ETL) processes require intensive human effort and take a lot of time. Conventional, rules-based Master Data Management (MDM) systems don’t scale.
The platform shift of data and compute to the cloud has opened new possibilities. A cloud-based, agile, machine learning-first approach lets modern enterprises master their data at hyper-scale to finally solve their toughest data challenges and drive radically better business decisions.
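As an illustration of the difference in approach, here is a toy sketch of matching records by similarity score rather than by hand-written rules. (The names and the 0.8 threshold are invented, and production systems like Tamr use trained models at scale, not `difflib`.)

```python
# Rules-based matching means enumerating every spelling variant by hand.
# Scoring candidate pairs generalizes instead: keep pairs whose string
# similarity clears a threshold, no per-variant rule required.
from difflib import SequenceMatcher
from itertools import combinations

names = [
    "General Electric Co.",
    "General Electric Company",
    "GE Healthcare",
    "Acme Corp",
]

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Pairs judged likely to be the same entity (0.8 is an arbitrary cutoff).
matches = [
    (a, b) for a, b in combinations(names, 2) if similarity(a, b) >= 0.8
]
print(matches)
```

In a real ML-first system, the threshold and the similarity function are learned from human feedback rather than fixed in code, which is what lets the approach keep working as sources and spellings multiply.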
Blunder #6: Believing that Hadoop/Spark will solve all your problems.
Many companies have invested in Hadoop, the open-source software collection from Apache, or Spark, Apache’s analytics engine for big data processing. Both have their place, but they are not the answer to everything. Would you use a “lowest common denominator” solution for your company’s “secret sauce”, or the best the industry has to offer?
Also, keep in mind that Hadoop and Spark won’t solve your data integration problems, which is where data scientists spend the bulk of their time (see Blunder #3).
Blunder #8: Outsourcing your new stuff to big data analytics services firms.
As I said, this is a likely “company-ending blunder.” The typical enterprise spends about 95% of its IT budget running legacy code, and often has its best people doing maintenance. The most exciting work gets outsourced, either because there is no appropriate talent internally or because the best people are stuck keeping existing systems running.
Building great software takes time and teams with diverse expertise. By using purpose-built, interoperable solutions like Tamr, you are free to start solving your big problems right away.
Blunder #10: Not paying up for a few “rocket scientists.”
To address all of the above issues, and the hundreds of others you will inevitably face, companies need to invest in a few highly skilled employees. The new hires are not going to wear suits, but they will be your guiding lights.
If you enjoyed this presentation about the ten common big data blunders, you might like my presentation on Data Mastering at scale.
Schedule a consultation with our data experts to learn more about the machine learning approach to data mastering. We will discuss your business objectives and share success stories from organizations that have leveraged ML to tackle similar challenges.
Learn more about Tamr.