There’s been much talk about how artificial intelligence will benefit society, but what happens when an AI system is poorly designed and creates problems? That is the question several researchers, together with OpenAI, a non-profit artificial intelligence research company, tackled in a recent paper.
The paper was written by researchers from Google Brain, Stanford University and the University of California, Berkeley, along with John Schulman, a research scientist at OpenAI. Titled “Concrete Problems in AI Safety,” it looks at research problems around ensuring that modern machine learning systems operate as intended.
Safety research has started to gain attention in the machine learning community, including a recent paper from DeepMind and the Future of Humanity Institute on how to make sure that human interventions during the learning process do not bias learning agents toward undesirable behaviors. But, according to a blog post by OpenAI, many machine learning researchers are wondering just how much safety research can be done today.
The authors of the paper focused on five topics: safe exploration, robustness to distributional shift, avoiding negative side effects, avoiding “reward hacking” and “wireheading,” and scalable oversight.
The authors also took a broad look at the problem of accidents in machine learning systems, which they defined as “unintended and harmful behavior that may emerge from machine learning systems when we specify the wrong objective function, are not careful about the learning process, or commit other machine learning-related implementation errors.”
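To make the idea of a mis-specified objective function concrete, here is a short, hypothetical Python sketch (my own illustration, not code from the paper or from OpenAI): the designer cares about cleanliness and about not breaking anything, but only cleanliness makes it into the objective, so the plan that scores best under the stated objective is one the designer would consider harmful.

def misspecified_objective(state):
    # Only measures what was easy to specify: how much dirt got cleaned.
    return state["dirt_cleaned"]

def true_preference(state):
    # What the designer actually wanted, including the part left out above.
    return state["dirt_cleaned"] - 10.0 * state["vases_broken"]

plans = [
    {"name": "careful", "dirt_cleaned": 8, "vases_broken": 0},
    {"name": "reckless", "dirt_cleaned": 10, "vases_broken": 2},
]

# The system optimizes the objective it was given, so it picks the
# "reckless" plan even though the designer would score it negatively.
best = max(plans, key=misspecified_objective)
print(best["name"])           # -> reckless
print(true_preference(best))  # -> -10.0, harmful by the designer's own standards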
The paper referenced a fictional robot whose job is to clean up messes in an office using common cleaning tools, and it illustrated how the robot could misbehave if its designers ran into one of the five aforementioned risks. For instance, the researchers asked, “How can we ensure that our cleaning robot will not distort the environment in negative ways while pursuing its goals?” and, “How do we ensure that the cleaning robot doesn’t make exploratory moves with bad repercussions?”
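For the first of those questions, the paper discusses possible mitigations such as penalizing an agent for its “impact” on the environment. The following Python sketch (a hypothetical illustration with made-up state features, not code from the paper) shows the basic shape of that idea for the cleaning robot: the reward for cleaning is reduced in proportion to how much else in the office the robot changed.

def environment_change(state_before, state_after):
    # Crude impact measure: count office features that changed,
    # ignoring the dirt the robot is supposed to remove.
    ignored = {"dirt"}
    return sum(
        1
        for key in state_before
        if key not in ignored and state_before[key] != state_after[key]
    )

def regularized_reward(task_reward, before, after, penalty_weight=5.0):
    # Task reward minus a penalty for disturbing everything else.
    return task_reward - penalty_weight * environment_change(before, after)

before = {"dirt": 10, "vase": "intact", "cables": "plugged"}
reckless = {"dirt": 0, "vase": "broken", "cables": "unplugged"}
careful = {"dirt": 0, "vase": "intact", "cables": "plugged"}

print(regularized_reward(10, before, reckless))  # 0.0  -- penalized for side effects
print(regularized_reward(10, before, careful))   # 10.0 -- preferred

The hard part, of course, is choosing an impact measure that discourages broken vases without also discouraging the robot from doing its job at all.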
To answer these questions, the researchers focused, for the sake of concreteness, on reinforcement learning agents and supervised learning systems. These are “not the only possible paradigms for AI or ML systems, but we believe they are sufficient to illustrate the issues we have in mind, and that similar issues are likely to arise for other kinds of AI systems,” according to the paper.
According to OpenAI, many of the problems addressed in the paper are not new, but the paper explores them in the context of cutting-edge systems. The researchers hope this will inspire more work on AI safety, both at OpenAI and elsewhere.