Computers’ understanding of human language has progressed to the point where natural language processing (NLP) is becoming good enough for a broad swath of applications, according to speakers at the Index AI Summit.
One example is OpenAI’s Codex, a model that powers the GitHub Copilot AI tool that offers code suggestions based on user comments. You can write text documentation in English, and the model suggests options for the corresponding code you intended to write next.
The model is transforming software development by providing programmers with a natural language API to billions of lines of source code from GitHub and other public sources. The technology is expected to help make programmers far more efficient by automatically generating code from simple natural language prompts.
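To make the comment-to-code workflow concrete, here is a small hand-written illustration. It is not actual Codex or Copilot output. The idea is that the programmer types the English comment and the function signature, and a suggestion tool might then propose a body along these lines. The function name `most_common_words` and the task itself are assumptions chosen for this example.

```python
from collections import Counter


# Programmer writes this comment and signature:
# return the n most common words in a text, ignoring case.
def most_common_words(text: str, n: int = 3) -> list[tuple[str, int]]:
    # A body like this is the kind of suggestion a tool might offer.
    words = text.lower().split()
    return Counter(words).most_common(n)


print(most_common_words("the cat and the hat and the bat", 2))
```

The programmer still has to read and test any suggestion; the comment only describes intent, and a proposed body may or may not match it.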
Codex is one of the most successful use cases so far for Generative Pre-trained Transformer 3 (GPT-3), an OpenAI-developed natural language model that was trained on a very large corpus of written text drawn from the Internet.
Expect to see similar use cases for GPT-3 emerge around Internet search, copy generation, graphic design, customer service, market testing, and a wide range of other applications, said Sam Altman, CEO of OpenAI, during a panel discussion at the AI Summit earlier this year.
A lot of people who are deploying GPT-3 “very effectively … actually want some sort of version of the Star Trek computer,” Altman said. Many different applications are allowing people to just “talk to their computers, tell them what they want, and the computer has enough real intelligence and understanding to go off and do that.”
Kevin Scott, CTO and executive vice president of technology and research at Microsoft, described Codex and GitHub Copilot as early examples of how natural language models such as GPT-3 can serve as platforms upon which to build multiple applications.
“For a while, we’ve been both hoping and expecting that these large models would start behaving like proper platforms,” Scott said. The hope has been to have large NLP models that can be trained once and then used broadly across a wide range of applications and use cases. “We saw that more in 2021 than we ever have before, and I’m really excited to see that trend continue in 2022.”
Software development has been a good proving ground for natural language models because training data is abundant and it is relatively easy to check whether generated code works as intended. The goal now should be to replicate that success in other areas, Altman and Scott said.
While Codex and Copilot have demonstrated the promise of NLP, OpenAI’s Altman said, the GPT-3 model remains a work in progress. When the models do a better job of following human instructions, preferences, and intent, the applications for them are going to be very broad.
That means instead of getting a great result one out of 100 times, the models need to return the right one every time. “We think [GPT-3] shows the promise of what’s going to happen here, but it’s still extremely early days,” Altman said.
The long-term goal for AI is to build models that can learn not just from textual domains, but from visual ones as well. Examples of these models include OpenAI’s DALL-E, for creating images from textual prompts, and CLIP, for connecting images and text—for instance, classifying images based on natural language prompts.
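Classifying an image from natural language prompts can be sketched in a few lines. In a CLIP-style setup, an image and several candidate text prompts are mapped into the same vector space, and the prompt whose vector is most similar to the image vector is chosen as the label. The sketch below is only an illustration under stated assumptions: the three-dimensional vectors are made up for the example, whereas a real system would obtain them from trained image and text encoders, and the helper names `cosine` and `classify` are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(image_vec: list[float], prompt_vecs: dict[str, list[float]]) -> str:
    """Return the prompt whose embedding is closest to the image embedding."""
    scores = {prompt: cosine(image_vec, vec) for prompt, vec in prompt_vecs.items()}
    return max(scores, key=scores.get)


# Made-up embeddings standing in for the output of trained encoders.
prompt_vecs = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
image_vec = [0.8, 0.3, 0.1]

print(classify(image_vec, prompt_vecs))
```

Because the labels are ordinary text prompts, new categories can be added by writing new prompts rather than retraining a classifier, which is part of what makes this kind of model useful across applications.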
Such models are going to be vital for AI systems to be maximally useful to people, Altman said. “I think it’s important that we push to [make] multimodal models as good as the text-only models can get,” he said.
Machine learning models for mathematical and scientific domains have also emerged over the past two years, Microsoft’s Scott said. Some of the use cases for these models include simulating computational fluid dynamics systems, doing finite element analysis, and working on especially complex equations.
“We’re beginning to see these models getting built for some of these domains,” Scott said. “And that to me is also really, really quite exciting.”
Both Altman and Scott expect dramatic improvements in NLP over the next several years.
By 2032, Altman predicted, models will have advanced to the point where people will not be able to tell if they are interacting with a model or a human. Natural language interactions with computers will be akin to speaking with thousands of domain experts anytime, anywhere, and at blinding speed, he said.
Scott predicted that language-based technology agents will help people with very complicated tasks in a much more fluid manner than is possible today. He expects there will be a robust NLP platform on top of which models from Microsoft, OpenAI, and numerous others will be built.
In paving the way to true artificial general intelligence (AGI), all stakeholders need to ensure that the technology is not misused and does not do unexpected or incorrect things, both Altman and Scott cautioned. GitHub Copilot, for instance, has a layer intended to prevent the system from copying code verbatim from the Internet, because doing so could amount to a copyright violation.
Similar controls need to be developed across every facet of AGI to ensure natural language models serve the needs of the user while also operating within the norms that society expects, Scott said.