The Apache Beam team has announced the first stable release of the project, version 2.0. Apache Beam is an advanced unified programming model designed for batch and streaming data processing.
In addition to being the project's first stable release, the team says this is the third important milestone for the community. Apache Beam entered the Apache Incubator in February 2016 and became a top-level project in December of that year.
The release features improved user experience, seamless portability across execution environments, API stability, stateful data processing paradigms, support for user-extensible file systems, and a metrics subsystem.
“The first stable release is an important milestone for the Apache Beam community,” said Davor Bonaci, vice president of Apache Beam. “This is a statement from the community that it intends to maintain API stability with all releases for the foreseeable future, making Beam suitable for enterprise deployment.”
Beam’s data processing pipelines can be executed on Apache Apex, Flink, Spark and Google Cloud Dataflow, as well as other execution engines. Other features include a single programming model for batch and streaming use cases, the ability to execute pipelines on multiple environments, and the ability to write and share new SDKs, IO connectors and transformation libraries.
Apache Beam is already being used by companies like Google Cloud, PayPal and Talend.
“We congratulate the Apache Beam community for reaching the key milestone of a first stable release,” said William Vambenepe, lead product manager for Big Data at Google Cloud. “We look forward to our Google Cloud Dataflow customers taking full advantage of Beam’s powerful programming model and newest features to run their data processing pipelines on Google Cloud.”