Apache Ignite has reached a new milestone with the release of version 2.5. The Apache Software Foundation has announced that Apache Ignite 2.5 can now scale to clusters of thousands of nodes as well as it scales to clusters of hundreds.
Apache Ignite is a “memory-centric distributed database, caching and processing platform for transactional, analytical and streaming workloads.”
“Apache Ignite was always appreciated by its users for two primary things it delivers – scalability and performance. Throughout the lifetime many distributed systems tend to do performance optimizations from a release to release while making scalability related improvements just a couple of times. It’s not because the scalability is of no interest. Usually, scalability requirements are set and solved once by a distributed system and don’t require significant additional interventions by engineers,” the ASF wrote in a post.
To improve scalability, the Ignite team turned to Apache ZooKeeper, a “centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.” According to the team, Ignite's default TCP/IP discovery solution hurt overall cluster responsiveness and performance when processing discovery events in large clusters.
“The new ZooKeeper Discovery uses ZooKeeper as a single point of synchronization where Ignite nodes are exchanging discovery events through it. It solved the issue with long-to-be-processed discovery messages and, as a result, allowed Ignite scaling to large cluster topologies,” the ASF wrote.
The Ignite team suggests that users keep the default TCP/IP discovery solution if their cluster is unlikely to grow beyond roughly 300 nodes.
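For readers who want to see what switching discovery mechanisms looks like, here is a minimal configuration sketch in Java. It assumes the optional ignite-zookeeper module is on the classpath and that a ZooKeeper ensemble is already running; the host names, root path and timeout values are placeholders chosen for illustration, not recommendations from the Ignite team.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.zk.ZookeeperDiscoverySpi;

public class ZooKeeperDiscoveryExample {
    public static void main(String[] args) {
        ZookeeperDiscoverySpi zkDiscovery = new ZookeeperDiscoverySpi();

        // Placeholder addresses: replace with your own ZooKeeper ensemble.
        zkDiscovery.setZkConnectionString("zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181");

        // Illustrative values only; tune them for your environment.
        zkDiscovery.setSessionTimeout(30_000); // ZooKeeper session timeout, in milliseconds
        zkDiscovery.setZkRootPath("/ignite");  // znode under which Ignite keeps its discovery data
        zkDiscovery.setJoinTimeout(10_000);    // how long a node waits to join, in milliseconds

        IgniteConfiguration cfg = new IgniteConfiguration();

        // Replaces the default TCP/IP discovery with ZooKeeper Discovery.
        cfg.setDiscoverySpi(zkDiscovery);

        // Starts a node that joins the cluster through ZooKeeper.
        Ignite ignite = Ignition.start(cfg);
        System.out.println("Nodes in topology: " + ignite.cluster().nodes().size());
    }
}
```

Every node in the cluster would need the same discovery configuration. Clusters that stay with the default TCP/IP discovery need none of this.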
The team also improved the way users train machine learning models over terabytes and petabytes of data. “The partition-based datasets moved us closer to the implementation of Zero-ETL concept which implies that Ignite can be used as a single storage where ML models and algorithms are being improved iteratively and online without ETLing data back and forth between Ignite and another storage,” the ASF wrote.
Other improvements include genetic algorithms, continuous self-healing and consistency checks, security and fast data loading, in-place execution of Spark DataFrame queries, and DEB and RPM packages.
The full release notes are available here.