Microsoft-backed startup OpenAI this week started rolling out its latest "large language model" artificial intelligence software, GPT-4. Like GPT-3.5, which powered the company's groundbreaking ChatGPT chatbot, GPT-4 draws on vast amounts of data available online to generate complex textual responses to users' queries. It can handle more words — more than 64,000, compared to GPT-3.5's maximum of 8,000 — making it less likely to "go off the guardrails," OpenAI says. But The New York Times says it still has what the company calls "hallucinations," the made-up responses that have plagued all leading chatbots. "It is more creative than previous models, it hallucinates significantly less, and it is less biased," Sam Altman, CEO of OpenAI, wrote in a series of tweets announcing the update.

Unlike its predecessors, GPT-4 is "multi-modal" — it can analyze both images and text. Input a picture of the contents of your fridge, and GPT-4 will suggest meals to make with what you have on hand. Enter a hand-drawn mock-up of a website, and it can use its mastery of coding languages to spit out a basic but functioning site. "If you show it a meme, it can tell you why it's funny," said OpenAI's chief scientist, Ilya Sutskever. The public can give it a test drive. It's incorporated into the latest version of Microsoft's Bing Chat.

Reviewers say GPT-4 cranks out answers with fewer errors than previous GPT (Generative Pre-trained Transformer) versions. It's also better at taking standardized tests, scoring in the top 10 percent of test takers on a simulated bar exam, while GPT-3.5 scored in the bottom 10 percent. "The continued improvements along many dimensions are remarkable," says Oren Etzioni at the Allen Institute for AI. "GPT-4 is now the standard by which all foundation models will be evaluated."

What are the commentators saying?

The improvement since last year's ChatGPT launch "is incredibly impressive," said Kristina Terech in TechRadar. GPT-4 is "40 percent more likely to provide factual responses," which matters because Microsoft and others are building it into the search engines people rely on for facts. It's also 82 percent less likely to respond to requests for "disallowed" content, meaning things that are illegal or objectionable. OpenAI spent months using "an improved monitoring framework" and "working with experts in a variety of sensitive fields, such as medicine and geopolitics, to ensure the replies it gives are accurate and safe." It's "far from perfect, as OpenAI admits," but it's a huge improvement.