Cloudera believes that community participation in developing these standards will ensure that companies can leverage their machine learning investments and pave a path for the future.
“Machine learning models are already part of almost every aspect of our lives from automating internal processes to optimizing the design, creation, and marketing behind virtually every product consumed,” said Nick Patience, founder and research vice president, software at 451 Research. “As ML proliferates, the management of those models becomes challenging, as they have to deal with issues such as model drift and repeatability that affect productivity, security and governance. The solution is to create a set of universal, open standards so that machine learning metadata definitions, monitoring, and operations become normalized, the way metadata and data governance are standardized for data pipelines.”
Doug Cutting, Cloudera’s chief architect, further explained that the company did not want to solve the challenges of managing machine learning models only for its own customers; the problem should be addressed at the industry-wide level.
He also added that Apache Atlas, a data governance and metadata framework for Hadoop, is well positioned to combine data management with explainable, interoperable, and reproducible MLOps workflows. “The Apache Atlas (Project) fits all the needs for defining ML metadata objects and governance standards,” said Cutting. “It is open-source, extensible, and has pre-built governance features.”
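To illustrate the kind of extensibility Cutting describes, the sketch below builds a custom entity type definition of the sort that could be registered with Atlas’s v2 typedefs REST API. The type name `ml_model` and its attributes (`framework`, `version`, `trainingDataset`) are hypothetical examples, not part of any published standard; only the general payload shape and the `/api/atlas/v2/types/typedefs` endpoint follow Atlas’s documented API.

```python
import json

def build_ml_model_typedef():
    """Build an Atlas v2 typedef payload describing a hypothetical
    ML model entity type. All names here are illustrative only."""
    return {
        "entityDefs": [
            {
                "name": "ml_model",          # hypothetical type name
                "superTypes": ["DataSet"],   # extend Atlas's built-in DataSet type
                "attributeDefs": [
                    {"name": "framework", "typeName": "string",
                     "isOptional": True, "cardinality": "SINGLE"},
                    {"name": "version", "typeName": "string",
                     "isOptional": False, "cardinality": "SINGLE"},
                    {"name": "trainingDataset", "typeName": "string",
                     "isOptional": True, "cardinality": "SINGLE"},
                ],
            }
        ]
    }

if __name__ == "__main__":
    payload = build_ml_model_typedef()
    # In practice this payload would be POSTed to a running Atlas
    # instance, e.g. POST http://<atlas-host>:21000/api/atlas/v2/types/typedefs
    print(json.dumps(payload, indent=2))
```

Because the new type extends Atlas’s existing `DataSet` supertype, entities of this type would inherit Atlas’s built-in lineage and governance features, which is the reuse Cutting’s quote points to.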