GPT-3: Advancing the understanding of cues for coding, writing

Published: September 25th, 2020

OpenAI says it has a backlog of prospective testers on its waitlist, all seeking to assess whether the first private beta of its GPT-3 natural language processing (NLP) tool really can push the boundaries of artificial intelligence (AI).

Since OpenAI made the GPT-3 beta available in June as an API to those who pass its vetting process, the tool has generated considerable buzz on social media. GPT-3 is the latest iteration of OpenAI’s neural-network-based language model. The first to evaluate the beta, according to OpenAI, include Algolia, Quizlet and Reddit, and researchers at the Middlebury Institute.

Although GPT-3 is based on the same technology as its predecessor GPT-2, released last year, the new version is a vastly larger model. With 175 billion trainable parameters, GPT-3 is more than 100 times larger than GPT-2. It also has roughly 10 times as many parameters as its closest rival, Microsoft’s Turing NLG, which has 17 billion.
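The size comparisons above can be checked with a few lines of arithmetic. The following is a minimal sketch: the GPT-3 and Turing NLG parameter counts are the ones cited in this article, while the GPT-2 figure of 1.5 billion is an assumption taken from the publicly reported size of the largest GPT-2 model and is not stated in the article. The names `PARAMS_BILLIONS` and `size_ratio` are made up for illustration.

```python
# Parameter counts in billions. The GPT-3 and Turing NLG figures come from the
# article; the GPT-2 figure is an assumption that the article does not state.
PARAMS_BILLIONS = {
    "GPT-3": 175.0,
    "GPT-2": 1.5,  # assumption: publicly reported size of the largest GPT-2
    "Turing NLG": 17.0,
}


def size_ratio(larger: str, smaller: str) -> float:
    """Return how many times more parameters `larger` has than `smaller`."""
    return PARAMS_BILLIONS[larger] / PARAMS_BILLIONS[smaller]


if __name__ == "__main__":
    for other in ("GPT-2", "Turing NLG"):
        print(f"GPT-3 vs {other}: about {size_ratio('GPT-3', other):.0f}x")
```

Under these assumptions the ratios come out to roughly 117 and 10, which is consistent with the "100 times" and "10 times" comparisons when they are read as round numbers.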

Experts have described GPT-3 as the most capable language model created to date. Among them is David Chalmers, professor of Philosophy and Neural Science at New York University and co-director of NYU’s Center for Mind, Brain, and Consciousness. Chalmers underscored in a recent post that GPT-3 is trained on key data sets such as Common Crawl, an open repository of searchable internet data, along with a huge library of books and all of Wikipedia. Besides its scale, GPT-3 is raising eyebrows for its ability to automatically generate text rivaling what a human can write.

“GPT-3 is instantly one of the most interesting and important AI systems ever produced,” Chalmers wrote. “This is not just because of its impressive conversational and writing abilities. It was certainly disconcerting to have GPT-3 produce a plausible-looking interview with me. GPT-3 seems to be closer to passing the Turing test than any other system to date (although ‘closer’ does not mean ‘close’).”

Another early tester of GPT-3, Arram Sabeti, was also impressed. Sabeti, an investor who remains chairman of ZeroCater, was among the first to get his hands on the GPT-3 API in July. “I have to say I’m blown away. It’s far more coherent than any AI language system I’ve ever tried,” Sabeti noted in a post where he shared his findings.

“All you have to do is write a prompt and it’ll add text it thinks would plausibly follow,” he added. “I’ve gotten it to write songs, stories, press releases, guitar tabs, interviews, essays, technical manuals. It’s hilarious and frightening. I feel like I’ve seen the future and that full AGI [artificial general intelligence] might not be too far away.”
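The idea of a model adding "text it thinks would plausibly follow" a prompt can be illustrated with a deliberately tiny example. The sketch below is not how GPT-3 works internally (GPT-3 is a very large neural network, and this is a simple word-pair counter); it only shows the prompt-in, continuation-out pattern. The function names `train_bigrams` and `complete` and the sample corpus are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigrams(text: str) -> dict:
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def complete(prompt: str, follows: dict, max_words: int = 5) -> str:
    """Extend the prompt by repeatedly appending the most frequent next word."""
    words = prompt.lower().split()
    if not words:
        return prompt  # nothing to continue from
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # the last word was never seen with a successor
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


if __name__ == "__main__":
    # Toy training text, invented for this example.
    corpus = "the model reads the prompt and the model reads the next word"
    table = train_bigrams(corpus)
    print(complete("the", table, max_words=3))
```

A counter like this only looks one word back and knows only its tiny corpus, so its output quickly loops or stalls. The coherence testers describe in GPT-3 comes from a far larger model trained on far more text.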

It is the “frightening” aspect that OpenAI is not taking lightly, which is why the company is selective in vetting who can test the GPT-3 beta. In the wrong hands, GPT-3 could be misused. Among other things, one could use it to create and spread propaganda on social media, now commonly called “fake news.”

OpenAI’s Plan to Commercialize GPT-3
The potential for misuse is why OpenAI chose to release it as an API rather than open sourcing the technology, the company said in a FAQ. “The API model allows us to more easily respond to misuse of the technology,” the company explained. “Since it is hard to predict the downstream use cases of our models, it feels inherently safer to release them via an API and broaden access over time, rather than release an open source model where access cannot be adjusted if it turns out to have harmful applications.”

OpenAI had other motives for going the API route as well. Notably, because the NLP models are so large, they take significant expertise to develop and deploy and are expensive to run. By offering them through an API, the company is looking to make them accessible to smaller organizations as well as larger ones.

Not surprisingly, commercializing GPT-3 also lets OpenAI fund ongoing AI research, continued efforts to ensure the technology is used safely, and resources to lobby on policy issues as they arise.

Ultimately, OpenAI will release a commercial version of GPT-3, although the company hasn’t announced when, or how much it will cost. The latter could be significant in determining how accessible it becomes. The company says part of the private beta aims to determine what type of licensing model it will offer.

OpenAI, which started as a non-profit research organization in late 2015 with help from deep-pocketed founders including Elon Musk, last year became a for-profit business with a $1 billion investment from Microsoft. As part of that investment, OpenAI runs in the Microsoft Azure cloud.

One year into their partnership, the two companies recently shared some of its fruits. At this year’s Microsoft Build conference, held as a virtual event in May, Microsoft CTO Kevin Scott said the company has created one of the world’s largest supercomputers, running in Azure.

OpenAI Seeds Microsoft’s AI Supercomputer in Azure
Speaking during a keynote session at the Build conference, Scott said Microsoft completed its supercomputer in Azure at the end of last year, taking just six months, according to the company. Scott said the effort will help bring these large models within reach of all software developers.

Scott likened it to the automotive industry, which has used the niche high-end racing use case to develop technologies such as hybrid powertrains, all-wheel drive and antilock brakes. In the same way, Scott said, the supercomputing capabilities in Azure and the large ML models they make possible will bring significant benefits to developers.

“This new kind of computing power is going to drive amazing benefits for the developer community, empowering previously unbelievable AI software platforms that will accelerate your projects large and small,” he said. “Just like the ubiquity of sensors and smartphones, multi-touch location, high-quality cameras, accelerometers enabled an entirely new set of experiences, the output of this work is going to give developers a new platform to build new products and services.”

Scott said OpenAI is conducting the most ambitious work in AI today, indicating work like GPT-3 will give developers access to very large models that were out of their reach until now. Sam Altman, OpenAI’s CEO, joined Scott in his Build keynote to explain some of the implications.

Altman said OpenAI wants to build large-scale systems and see how far the company can push them. “As we do more and more advanced research and scale it up into bigger and bigger systems, we begin to make this whole new wave of tools and systems that can do things that were in the realm of science fiction only a few years ago,” Altman said.

“People have been thinking for a long time about computers that can understand the world and sort of do something like thinking,” Altman added. “But now that we have those systems beginning to come to fruition, I think what we’re going to see from developers, the new products and services that can be imagined and created are going to be incredible. I think it’s like a fundamental new piece of computing infrastructure.”

Beyond Natural Language
As the models become a platform, Altman said OpenAI is already looking beyond just natural language. “We’re interested in trying to understand all the data in the world, so language, images, audio, and more,” he said. “The fact that the same technology can solve this very broad array of problems and understand different things in different ways, that’s the promise of these more generalized systems that can do a broad variety of tasks for a long time. And as we work with the supercomputer to scale up these models, we keep finding new tasks that the models are capable of.”

Despite their promise, OpenAI’s large ML models don’t close every gap that remains in AI.

Boris Paskalev, co-founder and CEO of DeepCode, said GPT-3 provides models that are an order of magnitude larger than GPT-2. But he warned that developers should beware of drawing any conclusions that GPT-3 will help them automate code creation.

“Using NLP to generate software code does not work for the very simple reason that software code is semantically complex,” Paskalev told SD Times. “There is absolutely no actual use for it for code synthesis or for finding issues or fixing issues. Because it’s missing that logical step that is actually embedded, or the art of software development that the engineers use when they create code, like the intent. There’s no way you can do that.”

Moiz Saifee, a principal on the analytics team of Correlation Ventures, posted a similar assessment. “While GPT-3 delivers great performance on a lot of NLP tasks (word prediction, common sense reasoning), it doesn’t do equally well on everything. For instance, it doesn’t do great on things like text synthesis, some reading comprehension tasks, etc. In addition to this, it also suffers from bias in the data, which may lead the model to generate stereotyped or prejudiced content. So, there is more work to be done.”

Article Tags

AI, GPT-3, natural language processing, NLP

About Jeffrey Schwartz

Jeffrey Schwartz has covered all aspects of IT, from datacenter, networking, storage, cloud and end-user client computing infrastructure, to software development, collaboration and services for nearly three decades. Most recently he was editor-in-chief of Redmond magazine, where he also had roles with sister publications Virtualization Review, Application Development Trends and Visual Studio Magazine, among others.

