The Apache Apex project was promoted today from the Apache Incubator to a top-level project. This open-source stream- and batch-processing platform works with YARN and HDFS, runs in memory, and provides event processing and fault tolerance.
Apex started out as the real-time streaming core of DataTorrent. The company contributed the platform to Apache last fall and has expanded it during incubation. Today, Apex is composed of the platform itself, which runs on Hadoop, and Apex Malhar, a library of operators that implement common business logic.
The Malhar library includes support for several file transfer protocols, messaging queues and databases. These range from FTP, NFS and JMS to Kafka, RabbitMQ and a host of popular NoSQL databases.
Thomas Weise, Apache Apex PMC member, said, “It is very exciting to see Apex after nearly four years since inception becoming an ASF top-level project. It opens the strong capabilities and potential of the platform to a wider audience, and we’re looking forward to a growing community to continue driving innovation in the stream-processing space.”
Parag Goradia, executive director of Predix Data Services, said, “We at GE Predix data services have used Apex for our data pipeline product and look forward to our continued usage and contribution. We had great experience with Apache Apex and its capabilities. We believe Apex has a bright future as it will continue to solve big problems in the Big Data industry. We are proud to be associated with this project and excited that it is now in top-level status.”
The Apex project is available on GitHub and currently has 29 contributors.