The open-source distributed pub-sub messaging system Apache Pulsar is graduating from the Apache Incubator and becoming a Top-Level project. Top-level status at the Apache Software Foundation signifies a milestone in the project’s development, progress and future.
“The fact that Apache Pulsar has gone from incubator project to top-level in two short years is a testament to the community growth around the project,” said Matteo Merli, vice president of Apache Pulsar and co-founder of the data platform provider Streamlio. “Organizations are rapidly adopting Pulsar and it has become instrumental in a broad range of modern data-driven applications.”
Pulsar started out in 2015 at Yahoo, which is now a part of Oath, as a result of the company’s need to scale, manage and protect its growing amounts of data. According to the company, existing technologies either fell short in maintaining data and ensuring it was never lost, or turned out to be too expensive. Pulsar was built with scalability, performance and resiliency in mind. In June of 2017, the project was submitted to the Apache Incubator.
“After deploying and operating Pulsar at Yahoo! for multiple years, we realized that there was a broader community of people who needed the same things that Yahoo! needed. That was what led us to the decision to contribute Apache Pulsar to the Apache Incubator, the start of the process that led to Pulsar’s graduation to a top-level Apache project today. We saw that making Pulsar an open source project would not only enable broader adoption, but also accelerate innovation based on Pulsar’s core architecture,” Merli and Streamlio co-founder Karthik Ramasamy wrote in a post.
The project’s initial goal was to provide a multi-tenant scalable messaging system for a wide variety of use cases. Today, the project provides lightweight compute and connect frameworks for processing data and integrating with external systems. Apache Pulsar 2.0 was announced in June with performance, scalability and durability improvements. Last month, the team announced Pulsar 2.1 with the Pulsar IO connector framework, new built-in connectors, tiered storage, and stateful functions.
“In Pulsar 2.1, we continued following this ‘simplicity first’ principle on developing Pulsar. We developed this IO (input/output) connector framework on top of Pulsar Functions, to simplify getting data in and out of Apache Pulsar. You don’t need to write any single line of code. All you need is prepare a configuration file of the system your want to connect to, and use Pulsar admin CLI to submit a connector to Pulsar. Pulsar will take care of all the other stuffs, such as fault-tolerance, rebalancing and etc,” the Pulsar team wrote in a blog post.
Other features include developer-friendly APIs for deploying lightweight compute logic, the ability to horizontally scale, low publish latency, geo-replication and persistent storage. The project is currently being used at MercadoLibre, Oath, One Click Retail, STICorp, TaxiStartup, Yahoo Japan Corporation and Zhaopin.com.
“Having that set of capabilities in a single, scalable and high-performance solution opens up a wide array of possibilities. From building a data fabric that connects data from the edge to the cloud to the datacenter on a common platform, to enabling real-time interactions with customers and partners, to handling demanding low-latency data processing and analytics on market data and transactions, Apache Pulsar is proving itself in companies both big and small. We’re excited to continue to work with the Apache community to continue to innovate and drive Pulsar forward,” Merli and Ramasamy wrote.