After eight years of development, Apache HBase officially reached version 1.0 yesterday. This stable release includes more than 1,500 bug fixes and changes, and has fully revised documentation. HBase is the distributed, non-relational database that runs on top of Hadoop, and it can be used to store data from traditional relational data stores on the Hadoop Distributed File System (HDFS).
Michael Stack, vice president of Apache HBase, said that version 1.0 “marks a major milestone in the project’s development. It is a monumental moment that the army of contributors who have made this possible should all be proud of. The result is a thing of collaborative beauty that also happens to power key, large-scale Internet platforms.”
Performance improvements topped the list of features for version 1.0. On the client side, old APIs are being replaced: HTableInterface, HTable and HBaseAdmin are all deprecated in favor of the new Connection, Table and Admin interfaces, and are slated for removal in the 2.x releases. (No date has yet been set for those releases.)
Facebook’s contributions to HBase had been left behind on a branch of version 0.89, but those changes were finally merged into the 1.0 release. They include the ability to reload a subset of the server configuration without restarting the region servers.
Mike Hoskins, CTO of Actian, said that his company uses HBase internally. “I love it. It’s one level up from HDFS, it’s columnar, has a flexible schema, and it’s time-based,” he said.
“We use it as an event historian. It’s an infinitely large historian, where we can look through all the timestamps coming through our pipeline. These pipelines send time-stamped metrics that are generated from events, and we have to eat them, and process them, and derive big insights and big models from these streams. Time is a first-class dimension [in HBase], which I am a big believer in.”
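To make the "time as a first-class dimension" idea concrete: HBase addresses each cell by row, column and timestamp, so a reader can ask for a value as of a given moment or for everything in a time window. The sketch below is a toy in-memory model of that idea, written with plain Java collections. It is not the HBase client API and not Actian's pipeline; the class name EventHistorian, its methods and the metric names are hypothetical, chosen only for illustration.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model (an assumption, not HBase code): each metric keeps its values
// sorted by timestamp, loosely mirroring timestamped cell versions.
final class EventHistorian {
    // metric name -> (timestamp in millis -> value), timestamps ascending
    private final Map<String, NavigableMap<Long, Double>> cells = new TreeMap<>();

    // Store one time-stamped reading for a metric.
    void record(String metric, long timestampMillis, double value) {
        cells.computeIfAbsent(metric, k -> new TreeMap<>()).put(timestampMillis, value);
    }

    // Latest value recorded at or before the given time, or null if none exists.
    Double valueAsOf(String metric, long timestampMillis) {
        NavigableMap<Long, Double> versions = cells.get(metric);
        if (versions == null) {
            return null;
        }
        Map.Entry<Long, Double> entry = versions.floorEntry(timestampMillis);
        return entry == null ? null : entry.getValue();
    }

    // All readings in the half-open window [fromMillis, toMillis).
    NavigableMap<Long, Double> range(String metric, long fromMillis, long toMillis) {
        NavigableMap<Long, Double> versions = cells.get(metric);
        if (versions == null) {
            return new TreeMap<>();
        }
        return versions.subMap(fromMillis, true, toMillis, false);
    }

    public static void main(String[] args) {
        EventHistorian historian = new EventHistorian();
        historian.record("cpu.load", 1000L, 0.25);
        historian.record("cpu.load", 2000L, 0.75);
        historian.record("cpu.load", 3000L, 0.5);

        // Value as of t=2500 is the reading taken at t=2000.
        System.out.println(historian.valueAsOf("cpu.load", 2500L));
        // Readings with 1000 <= t < 3000.
        System.out.println(historian.range("cpu.load", 1000L, 3000L));
    }
}
```

In a real deployment the same two questions would be put to the database rather than to an in-memory map; the sketch only shows why keeping data sorted by time makes "as of" and window queries cheap.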