Elastic is implementing a new approach for storing vectorized data that will require 95% less memory.
Better Binary Quantization, or BBQ, is based on a technique called RaBitQ, which was developed earlier this year by researchers at Nanyang Technological University, Singapore.
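To see where a saving of that size can come from, the arithmetic below compares a vector stored as 32-bit floats with the same vector stored at one bit per dimension plus a small per-vector overhead. This is a back-of-the-envelope sketch, not Elastic's accounting: the float32 baseline, the dimension count, and the overhead size are all assumptions chosen for illustration.

```python
import math


def bytes_per_vector(dims, bits_per_dim, overhead_bytes=0):
    """Storage for one vector: packed components plus any per-vector extras."""
    return math.ceil(dims * bits_per_dim / 8) + overhead_bytes


DIMS = 1024          # hypothetical embedding size
OVERHEAD_BYTES = 16  # hypothetical per-vector overhead, not a published figure

full = bytes_per_vector(DIMS, 32)                   # assumed float32 baseline
binary = bytes_per_vector(DIMS, 1, OVERHEAD_BYTES)  # 1 bit per dimension
saving = 1 - binary / full

print(f"float32: {full} B, 1-bit: {binary} B, saving: {saving:.1%}")
```

Under these assumptions the binary form is a few percent of the original size, which is in line with the roughly 95% reduction Elastic cites. The exact figure depends on the dimension count and on how much extra data is kept per vector.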
According to Elastic, the biggest differences between BBQ and naive binary quantization are that:
- All vectors get normalized around a centroid
- Multiple error correction values are stored
- Asymmetric quantization increases search quality without increasing storage costs
- The way that query vectors are quantized and transformed enables more efficient bit-wise operations
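
The four points above can be illustrated with a toy example. This is a simplified sketch of the general idea, not Elastic's or RaBitQ's actual algorithm: stored vectors are centered on a centroid and reduced to one bit per dimension, a single norm stands in for the stored error correction values, and the query is quantized more finely (4 bits per dimension here, an assumption) so that search quality can improve while stored vectors stay at one bit. Splitting the query codes into bit planes lets the comparison run as AND plus popcount. All function names are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def quantize_doc(vector, center):
    """Reduce a stored vector to 1 bit per dimension plus one correction value.

    Bit i is set when component i lies above the centroid. The norm of the
    centered vector is a stand-in for real error correction values (assumption).
    """
    centered = [x - c for x, c in zip(vector, center)]
    bits = 0
    for i, x in enumerate(centered):
        if x > 0:
            bits |= 1 << i
    norm = math.sqrt(sum(x * x for x in centered))
    return bits, norm


def quantize_query(vector, center, levels=15):
    """Quantize a query to 4-bit codes (0..15) and split them into bit planes.

    planes[b] holds bit b of every dimension's code, so each plane can be
    ANDed against a 1-bit document vector in a single operation.
    """
    centered = [x - c for x, c in zip(vector, center)]
    low = min(centered)
    step = (max(centered) - low) / levels or 1.0  # avoid a zero step
    planes = [0, 0, 0, 0]
    for i, x in enumerate(centered):
        code = min(levels, round((x - low) / step))
        for b in range(4):
            if (code >> b) & 1:
                planes[b] |= 1 << i
    return planes, low, step


def bitwise_dot(doc_bits, planes):
    """Sum of the query codes over dimensions whose document bit is set."""
    return sum(bin(doc_bits & plane).count("1") << b
               for b, plane in enumerate(planes))


docs = [[0.1, 0.9, -0.3, 0.5], [0.7, -0.2, 0.4, 0.0], [-0.5, 0.3, 0.8, -0.6]]
center = centroid(docs)
query_planes, low, step = quantize_query([0.2, 0.8, -0.1, 0.4], center)
for doc in docs:
    bits, norm = quantize_doc(doc, center)
    score = bitwise_dot(bits, query_planes)
    print(f"bits={bits:04b} norm={norm:.3f} raw_score={score}")
```

The raw score is only an integer proxy. A real implementation would rescale it using the query's offset and step together with the stored correction values to estimate the true similarity.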
“Elasticsearch is evolving to become one of the best vector databases in the world, and we see our users wanting to put more and more vectorized data in it,” said Ajay Nair, general manager of Platform at Elastic. “Better Binary Quantization is our latest innovation to reduce the resources needed to store vectorized data and provide freedom to our users to vectorize all the things.”
BBQ is currently available as a technical preview for self-managed and cloud Elasticsearch users. In order to use BBQ, users can set dense_vector.index_type as bbq_hnsw or bbq_flat. The company will also be contributing the technique to Apache Lucene.
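
As a rough illustration of that setting, the sketch below builds an index-creation request body in Python. It assumes that the index type is expressed through the index_options.type property of a dense_vector field mapping, which is how Elasticsearch's mapping syntax is understood to work; the index name, field name, and dimension count are hypothetical, and the exact syntax should be checked against the documentation for the Elasticsearch version in use.

```python
import json

# Hypothetical names, for illustration only.
INDEX_NAME = "my-bbq-index"
VECTOR_FIELD = "embedding"

BBQ_INDEX_TYPES = ("bbq_hnsw", "bbq_flat")


def bbq_mapping(dims, index_type="bbq_hnsw"):
    """Build a create-index body for a dense_vector field that uses BBQ.

    Assumes the index type lives under index_options.type in the mapping.
    """
    if index_type not in BBQ_INDEX_TYPES:
        raise ValueError(f"index_type must be one of {BBQ_INDEX_TYPES}")
    return {
        "mappings": {
            "properties": {
                VECTOR_FIELD: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "index_options": {"type": index_type},
                }
            }
        }
    }


# Print the request as it might be sent to the create-index endpoint.
print(f"PUT /{INDEX_NAME}")
print(json.dumps(bbq_mapping(1024), indent=2))
```

The resulting body could be sent with any HTTP client or an Elasticsearch client library. Choosing bbq_flat instead of bbq_hnsw only changes the index_type argument.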
More information on this new technique, including benchmarking data, can be found in Elastic’s blog post about BBQ.