The Apache Software Foundation has announced the release of Apache Drill 1.0.
Drill is a schema-free SQL query engine for Hadoop, NoSQL and cloud storage. It uses columnar execution, data-driven query compilation, and a JSON document model to query data stored in various formats for Big Data analytics and BI. Drill graduated to a Top-Level Project within the ASF in December 2014.
Tomer Shiran, member of the Apache Drill Project Management Committee, said Drill is unique in that it gives developers and enterprises more agility in data querying across datastores without requiring transformation.
“Traditional query engines demand significant IT intervention before data can be queried,” said Shiran. “Drill gets rid of all that overhead so that users can just query the raw data in situ. There’s no need to load the data, create and maintain schemas, or transform the data before it can be processed. Instead, simply include the path to a Hadoop directory, MongoDB collection or S3 bucket in the SQL query.”
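To make Shiran's point concrete, here is a minimal Python sketch of what "include the path in the SQL query" can look like. It rests on several assumptions that are not in the announcement: a Drill instance exposing its REST interface at http://localhost:8047/query.json, a storage plugin registered under the name `dfs`, and a hypothetical file path. The helper names `build_query`, `build_request` and `run_query` are made up for illustration.

```python
import json
import urllib.request

# Assumption: a local Drill instance with its REST interface on port 8047.
DRILL_URL = "http://localhost:8047/query.json"


def build_query(storage_plugin, path, limit=10):
    """Build a SELECT that points straight at a raw file or directory.

    No table definition or load step is involved; the path is the "table".
    """
    # Drill quotes identifiers with backticks, so this sketch simply refuses
    # inputs that contain one rather than attempting to escape them.
    if "`" in storage_plugin or "`" in path:
        raise ValueError("backticks are not allowed in this sketch")
    return f"SELECT * FROM {storage_plugin}.`{path}` LIMIT {int(limit)}"


def build_request(sql, url=DRILL_URL):
    """Wrap the SQL text in the JSON payload expected by the REST endpoint."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL):
    """Send the query and return the decoded JSON response.

    Requires a reachable Drill instance; not exercised in this sketch.
    """
    with urllib.request.urlopen(build_request(sql, url)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Hypothetical path; substitute a file or directory your plugin can see.
    print(build_query("dfs", "/data/logs/2015/events.json", limit=5))
```

Pointing the same query at a different source would, under these assumptions, only mean swapping the plugin name and path, for example a plugin configured for MongoDB or S3 instead of `dfs`.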
Apache Drill is now considered production-ready at version 1.0 after almost three years in development, and it currently adheres to a four- to six-week release cycle. Drill also integrates with other Hadoop technologies, including the columnar storage format of Apache Parquet, itself a newly minted Top-Level Project, as well as various data virtualization, visualization and BI tools.
Shiran said the next release will include additional analytical functions and performance-related enhancements.
“Drill currently supports a variety of datastores, such as Hadoop, HBase and MongoDB, and support for additional datastores is in the pipeline,” said Shiran. “Drill’s mission is to provide a high-performance, schema-free SQL layer for all non-relational datastores so that business users, analysts and data scientists can easily access these increasingly popular systems.”
More details about Apache Drill 1.0 are available here.