OpenAI says it is backlogged with a waitlist of prospective testers seeking to assess whether the first private beta of its GPT-3 natural language processing (NLP) tool really can push the boundaries of artificial intelligence (AI).
Since OpenAI made the GPT-3 beta available in June as an API to those who pass its vetting process, the tool has generated considerable buzz on social media. GPT-3 is the latest iteration of OpenAI’s neural-network-based language model. Among the first to evaluate the beta, according to OpenAI, are Algolia, Quizlet and Reddit, along with researchers at the Middlebury Institute.
Although GPT-3 is based on the same technology as its predecessor GPT-2, released last year, the new version is a vastly larger model. With 175 billion trainable parameters, GPT-3 is more than 100 times the size of GPT-2, and roughly 10 times the size of its closest rival, Microsoft’s Turing NLG, which has 17 billion.
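The scale comparisons above are easy to check with a little arithmetic. GPT-2’s widely reported parameter count of 1.5 billion is not stated in this article and is supplied here as background:

```python
# Rough scale comparison of the parameter counts discussed above.
# GPT-2's ~1.5 billion parameters is a commonly reported figure,
# not a number taken from this article.
GPT3_PARAMS = 175e9        # ~175 billion
GPT2_PARAMS = 1.5e9        # ~1.5 billion
TURING_NLG_PARAMS = 17e9   # ~17 billion

gpt2_ratio = GPT3_PARAMS / GPT2_PARAMS
turing_ratio = GPT3_PARAMS / TURING_NLG_PARAMS

print(f"GPT-3 vs GPT-2:      ~{gpt2_ratio:.0f}x")   # ~117x
print(f"GPT-3 vs Turing NLG: ~{turing_ratio:.1f}x") # ~10.3x
```

The exact GPT-2 ratio works out to about 117, which is why "more than 100 times" is the accurate phrasing.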
Experts have described GPT-3 as the most capable language model created to date. Among them is David Chalmers, professor of Philosophy and Neural Science at New York University and co-director of NYU’s Center for Mind, Brain, and Consciousness. Chalmers noted in a recent post that GPT-3 is trained on massive datasets such as Common Crawl, an open repository of crawled web data, along with a huge library of books and all of Wikipedia. Beyond its scale, GPT-3 is raising eyebrows with its ability to automatically generate text rivaling what a human can write.
“GPT-3 is instantly one of the most interesting and important AI systems ever produced,” Chalmers wrote. “This is not just because of its impressive conversational and writing abilities. It was certainly disconcerting to have GPT-3 produce a plausible-looking interview with me. GPT-3 seems to be closer to passing the Turing test than any other system to date (although “closer” does not mean “close”).”
Another early tester of GPT-3, Arram Sabeti, was also impressed. Sabeti, an investor and chairman of ZeroCater, was among the first to get his hands on the GPT-3 API in July. “I have to say I’m blown away. It’s far more coherent than any AI language system I’ve ever tried,” Sabeti noted in a post where he shared his findings.
“All you have to do is write a prompt and it’ll add text it thinks would plausibly follow,” he added. “I’ve gotten it to write songs, stories, press releases, guitar tabs, interviews, essays, technical manuals. It’s hilarious and frightening. I feel like I’ve seen the future and that full AGI [artificial general intelligence] might not be too far away.”
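The prompt-then-continue behavior Sabeti describes rests on next-token prediction: given the text so far, the model repeatedly picks a plausible next word and appends it. The toy bigram model below is purely illustrative of that loop; it bears no resemblance to GPT-3’s architecture or training data:

```python
import random
from collections import defaultdict

# Toy illustration of "write a prompt and it'll add text it thinks
# would plausibly follow": a bigram model over a tiny made-up corpus.
# GPT-3 follows the same next-token-prediction loop, but with 175
# billion parameters trained on a large slice of the internet.
corpus = (
    "the model reads a prompt and predicts the next word "
    "the model then appends that word and predicts again"
).split()

# Record which words follow which in the corpus.
followers = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev].append(nxt)

def continue_prompt(prompt, n_words=5, seed=0):
    """Extend a prompt by repeatedly sampling a plausible next word."""
    rng = random.Random(seed)
    words = prompt.split()
    for _ in range(n_words):
        options = followers.get(words[-1])
        if not options:
            break  # dead end: no word ever followed this one
        words.append(rng.choice(options))
    return " ".join(words)

print(continue_prompt("the model"))
```

The output always begins with the prompt and continues with words the model has seen follow one another, which is the same basic contract GPT-3’s API offers at incomparably greater scale.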
It is the “frightening” aspect that OpenAI is not taking lightly, which is why the company is taking a selective stance in vetting who can test the GPT-3 beta. In the wrong hands, GPT-3 could be a recipe for misuse. Among other things, one could use GPT-3 to generate and spread disinformation, commonly called “fake news,” on social media.
OpenAI’s Plan to Commercialize GPT-3
The potential for misuse is why OpenAI chose to release GPT-3 as an API rather than open sourcing the technology, the company said in a FAQ. “The API model allows us to more easily respond to misuse of the technology,” the company explained. “Since it is hard to predict the downstream use cases of our models, it feels inherently safer to release them via an API and broaden access over time, rather than release an open source model where access cannot be adjusted if it turns out to have harmful applications.”
OpenAI had other motives for going the API route as well. Notably, because the models are so large, they take significant expertise to develop and deploy and are expensive to run. By hosting them behind an API, the company aims to make the technology accessible to smaller organizations as well as larger ones.
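A back-of-envelope calculation shows why deploying a model of this size is out of reach for most organizations. The 2-bytes-per-parameter (half-precision) figure below is a common rule of thumb, not a number from OpenAI:

```python
# Rough estimate of the memory needed just to hold GPT-3's weights.
# Assumes 2 bytes per parameter (fp16), a common rule of thumb;
# OpenAI has not published deployment details.
params = 175e9             # 175 billion parameters
bytes_per_param = 2        # fp16 storage

weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: ~{weights_gb:.0f} GB")

# A single 2020-era GPU offered roughly 16-32 GB of memory, so merely
# loading the weights requires sharding the model across dozens of
# accelerators, before accounting for activations or serving traffic.
```

At roughly 350 GB for the weights alone, a hosted API is the only practical way most developers could touch a model of this size.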
Not surprisingly, commercializing GPT-3 also lets OpenAI fund its ongoing AI research, its efforts to ensure the technology is used safely, and its policy advocacy as issues arise.
Ultimately, OpenAI will release a commercial version of GPT-3, although the company hasn’t announced when, or how much it will cost. The latter could be significant in determining how accessible it becomes. The company says one goal of the private beta is to determine what type of licensing model it will offer.
OpenAI, founded as a non-profit research organization in late 2015 with help from deep-pocketed backers including Elon Musk, became a for-profit business last year on the strength of a $1 billion investment from Microsoft. As part of that deal, OpenAI runs its workloads in the Microsoft Azure cloud.
One year later, the two companies shared the fruits of that partnership. At this year’s Microsoft Build conference, held as a virtual event in May, Microsoft CTO Kevin Scott said the company has created one of the world’s largest supercomputers, running in Azure.
OpenAI Seeds Microsoft’s AI Supercomputer in Azure
Speaking during a keynote session at the Build conference, Scott said Microsoft completed the supercomputer in Azure at the end of last year, after just six months of work. The effort, he said, will help bring these large models within reach of all software developers.
Scott likened the effort to the automotive industry, where the niche, high-end racing use case drove the development of technologies such as hybrid powertrains, all-wheel drive and anti-lock brakes. The supercomputing capabilities, and the large ML models they enable in Azure, offer similarly significant benefits to developers, Scott said.
“This new kind of computing power is going to drive amazing benefits for the developer community, empowering previously unbelievable AI software platforms that will accelerate your projects large and small,” he said. “Just like the ubiquity of sensors and smartphones, multi-touch, location, high-quality cameras, accelerometers enabled an entirely new set of experiences, the output of this work is going to give developers a new platform to build new products and services.”
Scott said OpenAI is conducting the most ambitious work in AI today, indicating work like GPT-3 will give developers access to very large models that were out of their reach until now. Sam Altman, OpenAI’s CEO, joined Scott in his Build keynote to explain some of the implications.
Altman said OpenAI wants to build large-scale systems and see how far the company can push them. “As we do more and more advanced research and scale it up into bigger and bigger systems, we begin to make this whole new wave of tools and systems that can do things that were in the realm of science fiction only a few years ago,” Altman said.
“People have been thinking for a long time about computers that can understand the world and sort of do something like thinking,” Altman added. “But now that we have those systems beginning to come to fruition, I think what we’re going to see from developers, the new products and services that can be imagined and created are going to be incredible. I think it’s like a fundamental new piece of computing infrastructure.”
Beyond Natural Language
As the models become a platform, Altman said OpenAI is already looking beyond just natural language. “We’re interested in trying to understand all the data in the world, so language, images, audio, and more,” he said. “The fact that the same technology can solve this very broad array of problems and understand different things in different ways, that’s the promise of these more generalized systems that can do a broad variety of tasks for a long time. And as we work with the supercomputer to scale up these models, we keep finding new tasks that the models are capable of.”
For all its promise, GPT-3 and the vast ML models behind it don’t close every gap in what AI still can’t do.
Boris Paskalev, co-founder and CEO of DeepCode, said GPT-3 provides models that are an order of magnitude larger than GPT-2. But he warned that developers should beware of drawing any conclusions that GPT-3 will help them automate code creation.
“Using NLP to generate software code does not work for the very simple reason that software code is semantically complex,” Paskalev told SD Times. “There is absolutely no actual use for it for code synthesis or for finding issues or fixing issues. Because it’s missing that logical step that is actually embedded, or the art of software development that the engineers use when they create code, like the intent. There’s no way you can do that.”
Moiz Saifee, a principal on the analytics team of Correlation Ventures, posted a similar assessment. “While GPT-3 delivers great performance on a lot of NLP tasks (word prediction, common sense reasoning), it doesn’t do equally well on everything. For instance, it doesn’t do great on things like text synthesis, some reading comprehension tasks, etc. In addition to this, it also suffers from bias in the data, which may lead the model to generate stereotyped or prejudiced content. So, there is more work to be done.”