In July 2018 I wrote here about the next evolution of application life-cycle management (ALM), which is moving into DevOps continuous delivery management and so extending full traceability from requirements to deployed code. ALM tools, and the art and science of software engineering that support them, have much to teach the emerging space of machine learning (ML) life-cycle management.
Artificial intelligence (AI) has emerged in recent years from a period of reduced research funding, called an AI winter, that started in the late 1990s. The technology most responsible for this healthy resurgence is deep learning (DL), a branch of ML, which is itself a branch of AI. To be clear, we are still talking about advanced “signal processing” (terminology that electrical engineers will recognize) with a degree of intelligence. Ovum calls this machine intelligence: a phase in the evolution of AI that is not quite narrow AI and still a long way from general AI (the point where intelligent machines can match human intelligence).
Despite this limited scope, DL systems are useful: they can outperform humans in a range of activities, which makes them ready for real-world applications. The car industry, for example, is spending millions on autonomous driving research based on machine intelligence, and the fruits of this research are already feeding into advanced driver-assistance systems.
Some of the industries most impacted by ML today include finance, where algorithmic trading has transformed the investment industry; health, with doctor assistants for image analysis and research mining, and drug discovery in the pharmaceutical industry; customer service, with AI-powered virtual assistants and front-line customer support; and telecommunications, in both the customer care business and network engineering.
ML applications are also set to expand, driven by the rollout of mobile 5G and next-generation technologies, from hyperconverged infrastructure to cloud-native computing. These in turn will grow edge computing and the Internet of Things, creating opportunities for ML applications at the edge as compute power increases and costs fall.
Against this surge of ML activity, enterprises looking to deploy such applications are finding a serious cultural gap in how deployment is managed. Data science and data engineering are relatively young fields and for many years have operated largely in research mode. What has changed is that enterprises are now releasing multiple ML applications into production and finding that, while traditional software applications have ALM tools to support development and deployment, the need for equivalent support for ML applications is only beginning to be appreciated.
From a life-cycle management viewpoint, ML applications most closely resemble software applications that have complex database activities. The data dimension in ML applications is as important as the algorithmic dimension. Managing data is a hugely complex task that must be supported during training and then at scale in production (inference mode).
Enterprises serious about deploying ML applications will need ML life-cycle management tools, and this will become a hot space. Some of the players and products include CognitiveScale, DataKitchen, DataRobot, MLflow, ParallelM MLOps, and Valohai. In addition, Google’s Kubeflow open-source project is complementary to this community, as it focuses on creating a platform for running containerized ML components managed by Kubernetes.
The key challenge for the data science and data engineering community is a cultural one. Adopting ML life-cycle concepts is a matter of process maturity, and the community is discovering this need the hard way, through mistakes. But we expect this will change: running data science projects in silos with small-scale production requirements is rather different from embedding ML components in business-critical systems at scale.
Take one aspect as an example: version control, a discipline well drilled into software engineers. Data scientists and engineers need to version-control the data sets used in training, testing, and validation, plus configuration files, hyperparameter sets, and algorithm versions. To reproduce results, everything needs to be version controlled.
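To make the version-control point concrete, here is a minimal Python sketch of recording everything a training run depended on in a single manifest. It is an illustration only and is not how any of the products named above work: the function names `file_sha256` and `build_manifest`, the manifest layout, and the use of a Git commit hash as the code version are all assumptions made for this example.

```python
import hashlib
import json


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_files, hyperparameters, code_version):
    """Record what a training run depended on (hypothetical layout).

    data_files:      mapping of role (e.g. "train", "test") to file path
    hyperparameters: JSON-serializable dict of hyperparameter values
    code_version:    string identifying the algorithm version,
                     assumed here to be something like a Git commit hash
    """
    manifest = {
        "data": {role: file_sha256(path)
                 for role, path in sorted(data_files.items())},
        "hyperparameters": hyperparameters,
        "code_version": code_version,
    }
    # Canonical JSON (sorted keys) so identical inputs always
    # produce the same fingerprint.
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    manifest["fingerprint"] = hashlib.sha256(canonical).hexdigest()
    return manifest
```

Stored alongside the trained model, a manifest like this lets a team check later whether two runs used the same data, hyperparameters, and code: any change to any input changes the fingerprint. It does not store the data itself, so the underlying files still need to be kept in versioned storage.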
As data science and engineering mature, recognition of the role of ML life-cycle management will also grow.