Machine learning is quickly becoming an important tool for automation, but failing models and inadequate background knowledge are creating more problems than they solve.
“I think to build a good machine learning model… if you’re trying to do it repeatedly, you need great talent, you need an outstanding research process, and then finally you need technology and tooling that’s kind of up to date and modern,” said Matthew Granade, co-founder of machine learning platform provider Domino Data Lab. He explained that all three of these elements have to come together and work in unison to create the best possible model, though Granade placed special emphasis on the second. “The research process determines how you’re going to identify problems to work on, find data, work with other parts of the business, test your results, and deliver those results to the business,” he explained.
According to Granade, the absence of that essential combination is why so many organizations are faced with failing models. “Companies have really high expectations for what data science can do but they’re struggling to bring those three different ingredients together,” he said. This raises the question: why are organizations investing so much in machine learning models but failing to invest in the things that will actually make those models succeed? According to a study conducted by Domino Data Lab, 97% of those polled say data science is crucial to long-term success; however, nearly as many say that organizations lack the staff, skills, and tools needed to sustain that success.
Granade traces this problem back to the tendency to look for shortcuts. “I think the mistake a lot of companies make is that they kind of look for a quick fix,” he began. “They look for a point solution, or this idea of ‘I’m going to hire three or four really smart PhDs and that’s going to solve my problem.’” According to Granade, these types of quick fixes never work long-term because the issues run deeper. It will always be important to have the best minds on your team, but they cannot work in isolation. Without the best processes and technology to back them up, any attempt to use data science becomes futile.
Domino Data Lab’s study also revealed that 82% of executives polled said leadership needs to be concerned about bad or failing models because the consequences of those models could be severe. “Those models could lead to bad decisions that produce lost revenue, it could lead to bad key performance indicators, and security risks,” Granade explained.
Granade predicts that companies that find themselves behind the curve on data science and machine learning practices will work quickly to correct their mistakes. Organizations that have tried and failed to implement this kind of technology will keep an eye on those that have succeeded and take tips where they can get them. Failing to adopt these practices is not an option in most industries, as it will inevitably lead those companies to fall behind as businesses. Granade returns to a comprehensive approach as the key to remedying the mistakes he has seen. “I think you can say ‘we’re going to invest as a company to build out this capability holistically. We’re going to hire the right people, we’re going to put a data science process in place, and the right tooling to support that process and those people,’ and I think if you do that you can see great results,” he said.
Jason Knight, co-founder and CPO at OctoML, believes that another aspect of building a successful machine learning model is a firm understanding of the data you’re working with. “You can think you have the right data but because of underlying issues with how it’s collected or annotated or generated in the first place, it can kind of create problems where you can’t generate a model out of it,” he explained. When there is a problem with the data that goes into a model, the model will not work as intended, no matter what technique an organization uses. This is why it is so important not to skip steps when working with this kind of technology: assuming that the source data will work without properly understanding its details will cause issues down the line.
Vaibhav Nivargi, founder and CTO of the cloud-based AI platform Moveworks, also emphasized good data as an essential part of creating a successful model. “It requires everything from the right data to represent the real world, to the right understanding of this data for a given domain, to the right algorithm for making predictions,” he said. Together, these help ensure that the data going into the model is dependable and that the model produces the desired results.
OctoML’s Knight also said that while some organizations have not seen the success they had originally hoped for with data science and machine learning, he thinks the future is bright. “In terms of the future, I remain optimistic that people are pushing forward the improvements needed to give solutions for the problems we have seen,” he said.