“The Big Data community has recognized an opportunity to develop a shared technology to address columnar in-memory analytics, and has joined forces to create Apache Arrow,” wrote Nadeau. “The Apache Drill community is seeding the project with the Java library, based on Drill’s existing ValueVectors technology, and Wes McKinney, creator of Pandas and Ibis, is contributing the initial C/C++ library. Given the credentials of those involved as well as code provenance, the Apache Software Foundation decided to make Apache Arrow a Top-Level Project, highlighting the importance of the project and community behind it.”
According to Nadeau, Arrow is decidedly the future of in-memory columnar storage, at least within the Apache Software Foundation. “Drill, Impala, Kudu, Ibis and Spark will become Arrow-enabled this year, and I anticipate that many other projects will embrace Arrow in the near future as well. Arrow community members (including myself) will [be] speaking at upcoming conferences, including Strata San Jose, Strata London and numerous meetups,” he wrote.
Apache Arrow is already available from the Apache GitHub repository.