Google has launched the beta of Cloud AI Platform Pipelines, which combines repeatable machine learning pipelines with monitoring, auditing, version tracking, and reproducibility. The service aims to deliver an enterprise-ready, easy-to-install, secure execution environment for ML workflows.
AI Platform Pipelines was created because machine learning workflows can involve many interdependent steps, which makes the processes difficult to compose and track in an ad hoc manner, according to Google.
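To illustrate the dependency problem, here is a minimal sketch in plain Python. It does not use either pipeline SDK. The step names and the `execution_order` helper are hypothetical, chosen only to show how a declared dependency graph can be resolved into a valid run order, a job that pipeline tooling typically handles on the developer's behalf.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each step maps to the set of steps it depends on.
steps = {
    "ingest": set(),
    "validate": {"ingest"},
    "transform": {"validate"},
    "train": {"transform"},
    "evaluate": {"train", "validate"},
    "deploy": {"evaluate"},
}


def execution_order(dependencies):
    """Return one valid order in which the steps can run.

    Raises graphlib.CycleError if the steps depend on each other in a loop.
    """
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(steps)
print(order)
```

Even in this six-step example, tracking the order by hand becomes error-prone as soon as a step is added or a dependency changes, which is the kind of bookkeeping a managed pipeline is meant to remove.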
Cloud AI Platform Pipelines supports two SDKs for creating ML pipelines. The Kubeflow Pipelines SDK, part of the open source Kubeflow project, enables direct Kubernetes resource control and simple sharing of containerized components. The TensorFlow Extended (TFX) SDK provides a higher-level abstraction, with prescriptive but customizable components and predefined ML types.
The Kubeflow Pipelines SDK is a lower-level, ML-framework-neutral SDK. It supports fully custom pipelines as well as pipelines built from prebuilt Kubeflow Pipelines (KFP) components.
In the TFX SDK, the automatic metadata tracking logs the artifacts used in each pipeline step, pipeline parameters, and the linkage across the input/output artifacts, as well as the pipeline steps that created and consumed them.
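The idea behind this kind of lineage tracking can be sketched in a few lines of plain Python. This is an illustration only, not the TFX or ML Metadata API: the `Execution` and `MetadataStore` classes, their methods, and the artifact names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Execution:
    """One recorded pipeline step run (hypothetical record shape)."""
    step: str
    parameters: dict
    inputs: list
    outputs: list


@dataclass
class MetadataStore:
    """Toy in-memory store; a real system would persist these records."""
    executions: list = field(default_factory=list)

    def log(self, step, parameters, inputs, outputs):
        # Copy the arguments so later mutation by the caller cannot alter history.
        self.executions.append(
            Execution(step, dict(parameters), list(inputs), list(outputs))
        )

    def producer(self, artifact):
        """Return the step that created the artifact, or None if unknown."""
        for execution in self.executions:
            if artifact in execution.outputs:
                return execution.step
        return None

    def consumers(self, artifact):
        """Return the steps that used the artifact as an input, in log order."""
        return [e.step for e in self.executions if artifact in e.inputs]

    def lineage(self, artifact):
        """Return every upstream artifact that contributed to this one."""
        upstream = []
        stack = [artifact]
        while stack:
            current = stack.pop()
            for execution in self.executions:
                if current in execution.outputs:
                    for parent in execution.inputs:
                        if parent not in upstream:
                            upstream.append(parent)
                            stack.append(parent)
        return upstream


store = MetadataStore()
store.log("transform", {"vocab_size": 1000}, ["raw_data"], ["features"])
store.log("train", {"epochs": 5}, ["features"], ["model"])
store.log("evaluate", {}, ["model", "features"], ["metrics"])

print(store.producer("model"))
print(store.consumers("features"))
print(sorted(store.lineage("metrics")))
```

Because each record links a step to its parameters, inputs, and outputs, questions such as "which step produced this model?" or "which data fed these metrics?" reduce to simple lookups over the log.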
“ML workflows typically involve creating and tracking multiple types of artifacts—things like models, data statistics, model evaluation metrics, and many more. With AI Platform Pipelines UI, it’s easy to keep track of artifacts for an ML pipeline,” Anush Ramesh, product manager and Amy Unruh, staff developer advocate at Google, wrote in a blog post.
To make it easier for developers to get started with ML pipeline code, the TFX SDK provides templates, or scaffolds, with step-by-step guidance on building a production ML pipeline. Developers can then incrementally add different components to the pipeline and iterate on them, the team explained.
According to Google, AI Platform Pipelines offers:
- Push-button installation via the Google Cloud Console
- Enterprise features for running ML workloads, including pipeline versioning, automatic metadata tracking of artifacts and executions, Cloud Logging, visualization tools, and more
- Seamless integration with Google Cloud managed services like BigQuery, Dataflow, AI Platform Training and Serving, Cloud Functions, and many others
- Many prebuilt pipeline components (pipeline steps) for ML workflows, with easy construction of your own custom components
The beta launch of AI Platform Pipelines also adds several new features, including support for template-based pipeline construction, versioning, and automatic artifact and lineage tracking.