The Apache Arrow team has announced the release of Apache Arrow 1.0.0. Apache Arrow is a development platform for in-memory analytics.
Version 1.0.0 is the 18th release of the platform. It features 810 resolved issues from 100 contributors.
According to the team, this release marks a transition to binary stability of the columnar format and a transition to Semantic Versioning for the Arrow software libraries.
The columnar format has received a number of changes in this release:
- The metadata version was bumped to V5.
- Dictionary indices can now be unsigned integers, not only signed ones.
- A new “Feature” enum was added.
- Optional buffer compression using LZ4 or Zstandard (ZSTD) was added to the IPC format.
- Decimal types gained an optional “bitWidth” field that defaults to 128. According to the team, this will allow them to support other decimal widths in the future, such as 32- and 64-bit.
- The validity bitmap buffer was removed from Union types.
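To make two of these format changes concrete, here is a minimal sketch in plain Python. It uses only the standard library and is not the Arrow API; the function names `dictionary_encode` and `max_decimal_precision` are hypothetical and exist only for illustration. The first function shows why unsigned indices matter: an 8-bit unsigned index can address twice as many dictionary entries as an 8-bit signed one. The second computes how many decimal digits fit in a signed integer of a given bit width, assuming a decimal value is stored as a two's-complement integer of that width.

```python
from array import array


def dictionary_encode(values, typecode="B"):
    """Return (dictionary, indices) for values.

    typecode follows the stdlib array module: "b" is a signed 8-bit
    integer, "B" is an unsigned 8-bit integer.
    """
    dictionary = []
    positions = {}
    indices = array(typecode)
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        # array raises OverflowError if the index does not fit the type.
        indices.append(positions[value])
    return dictionary, indices


def max_decimal_precision(bit_width):
    """Largest digit count whose every value fits in a signed integer
    of bit_width bits (assumed two's-complement storage)."""
    largest = 2 ** (bit_width - 1) - 1
    digits = 0
    while 10 ** (digits + 1) - 1 <= largest:
        digits += 1
    return digits


if __name__ == "__main__":
    values = [f"key-{i}" for i in range(200)]

    # 200 distinct values fit in unsigned 8-bit indices (0..255).
    dictionary, indices = dictionary_encode(values, "B")
    print(len(dictionary), max(indices))

    # The same data overflows signed 8-bit indices (maximum 127).
    try:
        dictionary_encode(values, "b")
    except OverflowError:
        print("signed 8-bit indices cannot address 200 entries")

    for width in (32, 64, 128):
        print(width, max_decimal_precision(width))
```

Under these assumptions, a 128-bit decimal holds up to 38 digits, while the 64- and 32-bit widths the team mentions would hold 18 and 9 digits respectively.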
In addition, the team has expanded integration testing to test for extension types and nested dictionaries.
Apache Arrow Flight, a framework for high-performance data services, also received a few updates. It now offers DoExchange, a bidirectional data endpoint, in addition to DoGet and DoPut. Servers and clients can also now set read and write options in all languages, which the team said will make it easier to work with earlier versions of Arrow Flight.
The team also updated support for C++, Java, Python, R, Ruby, C GLib, and Rust. More information is available in the release announcement.