Apache Impala, the open-source, native analytic database for Apache Hadoop, is graduating today from the Apache Software Foundation’s Incubator to become a Top-Level Project (TLP). This is an important milestone in the project’s development.
“In 2011, we started development of Impala in order to make state-of-the-art SQL analytics available to the user community as open-source technology,” said Marcel Kornacker, founder of the Impala project. “The graduation to an Apache Top-Level Project is a recognition of the exceptional developer community that stands behind this project.”
Impala shares a unified storage platform with other Hadoop components and uses the same metadata, SQL syntax, ODBC driver, and user interface as Apache Hive, giving users a familiar and unified platform. Other features include the ability to query large volumes of data in Hadoop, run distributed queries across a cluster, and share data files between different components.
“Impala raises the bar for SQL query performance on Apache Hadoop while retaining a familiar user experience. With Impala, you can query data, whether stored in HDFS or Apache HBase – including SELECT, JOIN, and aggregate functions – in real time. Furthermore, Impala uses the same metadata, SQL syntax (Hive SQL), ODBC driver, and user interface (Hue Beeswax) as Apache Hive, providing a familiar and unified platform for batch-oriented or real-time queries. (For that reason, Hive users can utilize Impala with little setup overhead.)” according to the project’s website.
Impala is used in industries such as financial services, healthcare, and telecommunications, and by companies including Caterpillar, Cox Automotive, Jobrapido, Marketing Associates, the New York Stock Exchange, phData, and Quest Diagnostics. It is also shipped by Cloudera, MapR, and Oracle.
“Apache Impala is our interactive SQL tool of choice. Over 30 phData customers have it deployed to production,” said Brock Noland, chief architect at phData. “Combined with Apache Kudu for real-time storage, Impala has made architecting IoT and Data Warehousing use-cases dead simple. We can deploy more production use-cases with fewer people, delivering increased value to our customers. We’re excited to see Impala graduate to a top-level project and look forward to contributing to its success.”