Apache MADlib, an open-source Big Data machine learning library used for scalable in-database analytics, graduated from the Apache Incubator to become a Top-Level Project (TLP) today. The project has been well-governed under the Apache Software Foundation’s processes and principles, and its graduation to TLP signifies an important milestone for the Apache MADlib community.
“During the incubation process, the MADlib community worked very hard to develop high quality software for in-database analytics, in an open and inclusive manner in accordance with the Apache Way,” said Aaron Feng, vice president of Apache MADlib.
Apache MADlib is a comprehensive library for scalable in-database analytics, providing users with parallel implementations of machine learning, graph, mathematical and statistical methods for structured and unstructured data. It became an open-source project after database engine developers, data scientists, IT architects and academics grew interested in new approaches to sophisticated and scalable in-database analytics, according to the Apache announcement.
“MADlib was conceived from the outset as an open-source meeting ground for software developers, computing researchers and data scientists to collaborate on scalable, in-database machine learning and statistics,” said Joe Hellerstein, professor of computer science at UC Berkeley, cofounder and Chief Strategy Officer at Trifacta, and one of the original authors of MADlib. “It has been great to witness the growth of the MADlib community and codebase as an ASF incubating project, and I look forward to this continuing as a Top-Level Project.”
MADlib is already deployed in many academic projects and in industry. For instance, Pivotal has seen its customers successfully deploy MADlib on large-scale data science projects, according to Elisabeth Hendrickson, vice president of R&D for data at Pivotal.
“As MADlib graduates to a Top-Level Project at the ASF, we anticipate increased adoption in the enterprise given the mature level of the codebase and the active developer community,” she said.
New participants are more than welcome to join the project, according to Feng, and the team looks forward to working with more contributors as Apache MADlib establishes itself as a fully fledged project at Apache.