Will GPT-4 change the world?
What's different about this latest iteration, and what it could mean for the future of AI
Microsoft-backed startup OpenAI this week started rolling out its latest "large language model" artificial intelligence software, GPT-4. Like GPT-3.5, which powered the company's groundbreaking ChatGPT chatbot, GPT-4 draws on vast amounts of data available online to generate complex textual responses to users' queries. It can handle more words — more than 64,000, compared to GPT-3.5's maximum of 8,000 — making it less likely to "go off the guardrails," OpenAI says. But The New York Times says it still has what the company calls "hallucinations," the made-up responses that have plagued all leading chatbots. "It is more creative than previous models, it hallucinates significantly less, and it is less biased," Sam Altman, CEO of OpenAI, wrote in a series of tweets announcing the update.
Unlike its predecessors, GPT-4 is "multi-modal" — it can analyze both images and text. Input a picture of the contents of your fridge, and GPT-4 will suggest meals to make with what you have on hand. Enter a hand-drawn mock-up of a website, and it can use its mastery of coding languages to spit out a basic but functioning site. "If you show it a meme, it can tell you why it's funny," said OpenAI's chief scientist, Ilya Sutskever. The public can give it a test drive. It's incorporated into the latest version of Microsoft's Bing Chat.
Reviewers say GPT-4 cranks out answers with fewer errors than previous GPT (Generative Pre-trained Transformer) versions. It's also better at taking standardized tests, scoring in the top 10 percent of test takers in a simulated law school bar exam. The older version scored in the bottom 10 percent. "The continued improvements along many dimensions are remarkable," says Oren Etzioni at the Allen Institute for AI. "GPT-4 is now the standard by which all foundation models will be evaluated."
What are the commentators saying?
The improvement since last year's ChatGPT launch "is incredibly impressive," said Kristina Terech in Tech Radar. GPT-4 is "40 percent more likely to provide factual responses," which matters because Microsoft and others will build it into the search engines people rely on for facts. It's also 82 percent less likely to give responses for "disallowed" content, things that are illegal or objectionable. OpenAI spent months using "an improved monitoring framework" and "working with experts in a variety of sensitive fields, such as medicine and geopolitics, to ensure the replies it gives are accurate and safe." It's "far from perfect, as OpenAI admits," but it's a huge improvement.
No matter how good it is, GPT-4 "is not in a league of its own, as GPT-3 was when it first appeared in 2020," said Will Douglas Heaven in MIT Technology Review. AI has taken off in the last three years, and GPT will have to duke it out with "other multimodal models, including Flamingo from DeepMind." And now AI startup Hugging Face is developing "an open-source multimodal model that will be free for others to use and adapt," according to co-founder Thomas Wolf. But all "large language models remain fundamentally flawed. GPT-4 can still generate biased, false, and hateful text; it can also still be hacked to bypass its guardrails. Though OpenAI has improved this technology, it has not fixed it by a long shot."
It would have been impossible to live up to the "near-messianic fanfare" that preceded GPT-4's unveiling, said Kevin Roose in The New York Times, with people saying they'd heard it could handle trillions of parameters, or get a perfect 1600 on the SAT (it really gets a 1410). The rumors weren't true, but "they hinted at how jarring the technology's abilities can feel." One early GPT-4 tester said the experience could provoke an "'existential crisis,' because it revealed how powerful and creative the A.I. was compared with the tester's own puny brain." It's enough to make you "wonder whether we're going to be experiencing 'future shock' — the term coined by the writer Alvin Toffler for the feeling that too much is changing, too quickly — for the rest of our lives."
What's next for AI?
The technology isn't just going to revolutionize search. It's already being used to enhance everything from a Duolingo subscription tier, where it provides a virtual language tutor, to customer service questions at payment processor Stripe, with many other companies working on incorporating the AI technology into their services. "Artificial intelligence has the awesome power to change the way we live our lives, in both good and dangerous ways," said Anthony Zurcher in BBC News. "Experts have little confidence that those in power are prepared for what's coming." The world has reached an "inflection point," Arati Prabhakar, director of the White House's Office of Science and Technology Policy, said this week at the South by Southwest Interactive conference in Austin, Texas. "All of history shows that these kinds of powerful new technologies can and will be used for good and for ill," she said. Experts say the technology could make everyone's life easier, or it could obliterate data privacy and consolidate power in companies that manage to harness the technology. "If in six months you are not completely freaked the [expletive] out, then I will buy you dinner," said another panelist at the conference, advisory group SeedAI founder Austin Carson.
Harold Maass is a contributing editor at The Week. He has been writing for The Week since the 2001 debut of the U.S. print edition and served as editor of TheWeek.com when it launched in 2008. Harold started his career as a newspaper reporter in South Florida and Haiti. He has previously worked for a variety of news outlets, including The Miami Herald, ABC News and Fox News, and for several years wrote a daily roundup of financial news for The Week and Yahoo Finance.