Hazelcast today released the first version of Hazelcast Jet, a distributed processing engine for Big Data streams. Jet integrates with the Hazelcast in-memory data grid to process data in parallel across nodes in near real time.
Hazelcast Jet uses directed acyclic graphs (DAGs) to model relationships between tasks in the data-processing pipeline. The system is built on a one-record-at-a-time architecture, which allows Jet to process each record as soon as it arrives, rather than in batches.
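To make the idea concrete, here is a minimal sketch in plain Java of a task graph that handles each record as it arrives. It does not use the Hazelcast Jet API; the `RecordAtATimeDag` and `Vertex` classes and the `countWords` method are hypothetical names chosen only for illustration, and the sketch runs on a single thread rather than across nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Illustrative only: NOT the Hazelcast Jet API. All names here are hypothetical.
public class RecordAtATimeDag {

    /** A vertex: one processing step plus directed edges to downstream vertices. */
    static final class Vertex<I, O> {
        private final BiConsumer<I, Consumer<O>> step;
        private final List<Vertex<O, ?>> downstream = new ArrayList<>();

        Vertex(BiConsumer<I, Consumer<O>> step) {
            this.step = step;
        }

        /** Adds a directed edge from this vertex to {@code next} and returns {@code next}. */
        <R> Vertex<O, R> to(Vertex<O, R> next) {
            downstream.add(next);
            return next;
        }

        /** Processes a single record and forwards every output downstream immediately. */
        void accept(I record) {
            step.accept(record, out -> {
                for (Vertex<O, ?> v : downstream) {
                    v.accept(out);
                }
            });
        }
    }

    /** Builds the graph source -> tokenize -> count and pushes lines through one by one. */
    public static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();

        Vertex<String, String> source = new Vertex<>((line, emit) -> emit.accept(line));
        Vertex<String, String> tokenize = new Vertex<>((line, emit) -> {
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    emit.accept(word);
                }
            }
        });
        // Terminal vertex: updates the running counts and emits nothing.
        Vertex<String, Void> count =
                new Vertex<>((word, emit) -> counts.merge(word, 1, Integer::sum));

        source.to(tokenize).to(count);

        // No batching: each line travels through the whole graph before the next is read.
        for (String line : lines) {
            source.accept(line);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords(Arrays.asList("jet streams data", "data streams")));
    }
}
```

Because every vertex forwards its output as soon as it is produced, the counts are up to date after each input line, which is the property that distinguishes this style from batch processing.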
For users with high-volume data streams, Jet ingests information from sockets, files, HDFS or Kafka. Once the data is inside, developers can use Jet's event-based architecture to process it as needed through a high-level Java API.
Greg Luck, CEO of Hazelcast, said, “Hazelcast Jet is a super-fast, low-latency, next-generation DAG Engine for Big Data processing. We believe that the Hadoop and Spark ecosystems are too complex to program and to deploy, and have set out to bring Hazelcast’s legendary simplicity to Big Data. We have designed it as a general-purpose engine for the intersect of Big Data programmers and Java programmers. But if you are already a Hazelcast user or have data in Hazelcast, it will be the easiest way to solve your Big Data problems.”
Hazelcast Jet can be embedded into existing applications to provide an end-to-end stream-processing workflow. The system also includes a lower-level core API that offers maximum flexibility and allows direct manipulation of the data readers.