Data-as-a-Service platform provider Dremio announced a new open-source initiative for Apache Arrow this week. The Gandiva Initiative for Apache Arrow aims to improve the performance of in-memory analytics built on Apache Arrow.
The project will leverage the open-source compiler infrastructure LLVM, and will apply its changes to language libraries starting with C++ and Java, with Python, Ruby, Go, Rust and JavaScript to follow. With LLVM, Dremio says it will be able to optimize Arrow’s libraries and low-level operations for specific runtime environments, improve resource utilization and lower the cost of operations.
“Apache Arrow was created to provide an industry-standard, columnar, in-memory data representation,” said Jacques Nadeau, co-founder and CTO of Dremio, and PMC Chair of Apache Arrow. “Dozens of open source and commercial technologies have since embraced Arrow as their standard for high-performance analytics. The Gandiva Initiative introduces a cross-platform data processing engine for Arrow, representing a quantum leap forward for processing data. Users will experience speed and efficiency gains of up to 100x in the coming months.”
The main goals of the initiative are to shorten the time it takes to gain insights from analytics, machine learning and data science, and to lower the cost of cloud infrastructure operations.