Cloudera has integrated eight open-source offerings with its Hadoop distribution that it says will ease enterprise-grade adoption of the MapReduce framework.
The third commercial release of Cloudera’s Distribution for Hadoop shipped on Tuesday. It includes deployment and management tools that address data integration, high-level languages, remote procedure calls, serialization, scheduling and workflow.
Software is bundled from open-source projects, including Flume, HBase, Hive, Hadoop User Experience, Oozie, Pig, Sqoop and ZooKeeper. “In places where the community has provided a decent solution, we’ve bundled. Where there are no enterprise-ready solutions, we build our own,” said Jeff Hammerbacher, cofounder and chief scientist at Cloudera. “We are expanding the notion of what the platform surrounding Hadoop should look like.”
“As organizations increasingly struggle to extract value from an ever-expanding sea of data, more and more of them are turning to Hadoop,” said RedMonk analyst Stephen O’Grady. “Cloudera’s new offerings lower the barrier to entry for enterprises looking to deploy Hadoop in production environments.”
Cloudera is a major contributor to the Apache Software Foundation’s open-source Hadoop project, and it employs the project’s creator and lead architect, Doug Cutting. Cloudera views itself as a platform vendor that creates the main Hadoop distribution but is not focused on cross-version compatibility, Hammerbacher said.
“Tool vendors can build on top of the platform to enable analysts and developers to do work with it,” he added. Some of its partners include Karmasphere, Quest Software and Talend.
Hadoop without services is like the Linux kernel by itself without the GNU operating system, Hammerbacher explained. “You add a suite of services, and all of a sudden you get real work done.”