Typesafe, provider of the world’s leading Reactive platform and the company behind Play Framework, Akka, and Scala, today released the findings of a survey of more than 2,100 enterprise developers, data scientists, executives and system architects, analyzing adoption patterns around the Big Data processing engine, Apache Spark.
Key findings include (download the full report here):
- Spark awareness and adoption are seeing hockey-stick-like growth. Google Trends confirms this finding, and the survey shows that 71 percent of respondents have at least evaluation or research experience with Spark; 35 percent are using it or plan to adopt it soon. Of the survey respondents running Big Data applications in production, 82 percent indicated that they are eager to replace MapReduce with Spark as the core processing engine.
- Faster data processing and event streaming are the focus for enterprises. By far the most desirable features are Spark’s vastly improved processing power over MapReduce (over 78 percent mention this) and the ability to process event streams (over 66 percent mention this), which MapReduce cannot do.
- Perceived barriers to adoption are not major blockers. When asked, respondents mentioned a lack of in-house experience and the perceived immaturity of some Spark components and of integrations with other middleware and management tools. Also cited were the need for better commercial support options and for more comprehensive documentation and advanced examples. Some respondents mentioned that their organizations do not currently need “big” data solutions.
“The need to process Big Data faster has largely fueled the intense developer interest in Spark,” according to Dr. Dean Wampler, Big Data Architect at Typesafe. “Hadoop’s historic focus on batch processing of data was well supported by MapReduce, but there is an appetite for more flexible developer tools to support the larger market of ‘mid-size’ datasets and use cases that call for real-time processing.”
“Coming directly from developers, this survey reiterated the rapid adoption of Spark for large-scale data processing. I’m especially excited by the breadth of use cases seen, which range from batch jobs to streaming and machine learning,” said Matei Zaharia, CTO at Databricks and Vice President of Apache Spark. “It’s this type of direct feedback and dialogue with our community that enables us to continue to improve the usability, performance and built-in libraries of Spark.”
Developers across all industries have been turning to Typesafe to build Reactive applications, of which Big Data is a core component. Because Spark is built with Scala, it was a logical choice for Typesafe to add full lifecycle support for Apache Spark to the Typesafe Together Project Success Subscription program, with the aim of accelerating developer adoption and success in building Reactive Big Data applications.
“This survey further validates Databricks’ partnership and shared vision with Typesafe to bring a comprehensive suite of application development tools for developers that enable enterprises to operate with more agility and speed,” said Kavitha Mariappan, vice president, marketing at Databricks. “We look forward to collectively utilizing this feedback to make the Spark developer experience not only richer but also as seamless as possible.”
Other key findings addressed in the report include:
- Big Data adoption figures across specific industries
- Increasing adoption figures for other emergent technologies like CoreOS, Docker and Mesos in Big Data environments
- Most common use cases being solved with Spark
- Preferred methods for loading data into Spark
- Barriers to entry for Spark