Stonebraker on … Hadoop in Managing Big Data


Forrester sat down recently with Tamr Co-Founder and CEO — and 2014 A.M. Turing Award winner — Mike Stonebraker for a wide-ranging conversation covering “data and innovation issues confronting today’s enterprises.”

Today’s topic: The role of the Hadoop stack in managing big data in the enterprise.

As Mike told Forrester,

Historically, Hadoop was the open-source version of MapReduce running on top of HDFS, with Hive or Pig above that. … In 2009, we wrote a paper saying MapReduce is ridiculous for two reasons.

Number one, Hive equals SQL unless you squint. … If you’re doing SQL, the last thing you want is MapReduce as an interface.

[Number two,] unless you have something that’s embarrassingly parallel, you don’t need MapReduce. … But only about five percent of the problems that anybody’s interested in are embarrassingly parallel. So, basically, MapReduce is this insignificant little corner case and is a terrible internal interface for a higher-level system.
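
To make Mike's first point concrete, here is a minimal sketch of one aggregation written two ways: as hand-rolled map, shuffle, and reduce steps, and as a single SQL statement. It is an illustration only, not something from the Forrester interview. The sample rows, the `emp` table, and all function names are hypothetical, and it runs in memory using Python's built-in `sqlite3` module rather than on Hadoop or Hive.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (department, salary) rows.
ROWS = [("eng", 100), ("ops", 80), ("eng", 120), ("ops", 60), ("sales", 90)]


def map_phase(rows):
    """Emit one (key, value) pair per input row."""
    for dept, salary in rows:
        yield dept, salary


def shuffle_phase(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def total_by_dept_mapreduce(rows):
    """Total salary per department, spelled out as map -> shuffle -> reduce."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


def total_by_dept_sql(rows):
    """The same aggregation stated declaratively in one SQL query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (dept TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO emp VALUES (?, ?)", rows)
    result = dict(conn.execute("SELECT dept, SUM(salary) FROM emp GROUP BY dept"))
    conn.close()
    return result


if __name__ == "__main__":
    print(total_by_dept_mapreduce(ROWS))
    print(total_by_dept_sql(ROWS))
```

Both functions return the same totals. The difference is that the SQL version says only what result is wanted, while the map/reduce version spells out how to compute it, which is roughly the gap Mike is describing when SQL is layered on a MapReduce interface.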

For Mike’s complete Q&A with Forrester, download the report here (subscription or fee required).