Hortonworks, a leading contributor to and enabler of enterprise Apache Hadoop, and Red Hat, Inc., the world’s leading provider of open source solutions, today announced an engineering collaboration to advance open source big data community projects. Together with the Apache Hadoop community, the Hortonworks and Red Hat engineering teams will accelerate work to enable a broader ecosystem of file systems for use with Apache Hadoop. The companies also announced the integration and support of Hortonworks Data Platform with Red Hat Storage, which can reduce the cost of a Hadoop cluster by up to 50 percent because customers can now run Hadoop directly on a POSIX-compliant storage node.

The Hortonworks and Red Hat engineering effort has three main focus areas and is expected to increase the breadth of storage offerings that integrate and interoperate with Hadoop, making it possible to analyze data in place anywhere within the enterprise.

The first focus area is to enhance Apache Ambari, the open source project for monitoring and managing Apache Hadoop, so that it supports the management of Hadoop-compatible file systems such as GlusterFS. With this integration, users will be able to provision, deploy, monitor and manage alternative file systems with Ambari. The source code is 100 percent open and available to the entire Apache community, so participants can use these features to add support for many of the leading file systems and object stores available today.

The second focus area is the creation of generic test suites to validate compatibility between Hadoop and alternative file systems. Hortonworks and Red Hat will contribute these extensive testing blueprints to the open source community for use by any developer looking to test file system compatibility with Hadoop.

The third focus area is the integration of Hortonworks Data Platform with Red Hat Storage, so that enterprise customers will be able to process data stored on Red Hat Storage. Because Red Hat Storage is POSIX-compliant, enterprise applications can connect to it easily and Hadoop analytics can run directly on enterprise data, reducing data duplication and saving costs.