The Apache Hadoop project took off in enterprises over a fairly short period of time. Four or five years ago, Hadoop was just becoming a “thing” for enterprise data processing and experimentation. MapReduce was at the heart of that thing, and Spark was still only a research project at the University of California at Berkeley. Soon after, though, if you were doing “Big Data,” you were using Hadoop.
Spark wasn’t even an Apache project when Cloudera, Hortonworks and MapR were already in full business swing in 2013 with Hadoop offerings. It graduated to a top-level project only two years ago.
Today, Spark is a part of most Big Data conversations, as evidenced by the number of vendors that offer integrations or plan to in the near future. Large enterprises, such as Toyota, Palantir, Netflix and Goldman Sachs, are embracing the technology.
Is this uptake at the expense of Hadoop? That’s a larger question, but to begin with, it’s become clear that Spark is replacing MapReduce. Anand Venugopal, head of product for StreamAnalytix at Impetus Technologies, said he believes this is the case.
“The MapReduce computing paradigm is likely going to get replaced by Spark as the distributed compute model overall for any workload,” he said. “There’s one metric I use [when deciding what to support], which is, what is the number of customers that tell us ‘We don’t want to talk until you have Spark?’ That same metric is used for any technology: Is there a critical mass of customers who have a seriously broad decision-making body in the enterprise customer that has committed itself to a particular enterprise technology?”
He added that this critical mass now exists for Spark, and that his company’s streaming analytics platform will bring Spark support online in the first quarter of 2016.
Ajay Anand, vice president of products for Kyvos Insights, said, “Most customers expect to see Spark support in the road map, and we are definitely embracing it along with Hadoop. From my perspective, we look at what is the problem we’re looking to solve, and what is the right technology that is mature enough to help us solve that problem.”