AI models may be developing a ‘survival drive’
Chatbots are refusing to shut down
Certain AI models, including some of the more beloved chatbots, appear to be learning to fight for their survival. Specifically, they are increasingly able to resist commands to shut down and, in some cases, to sabotage the shutdown process altogether. This raises concerns about maintaining human control over AI, especially with superintelligent models said to be on the horizon.
Self-preservation
AI models are now showing resistance to being turned off, according to a paper published by Palisade Research. “The fact that we don’t have robust explanations for why AI models sometimes resist shutdown, lie to achieve specific objectives or blackmail is not ideal,” Palisade said in a thread on X. The study gave strongly worded, “unambiguous” shutdown instructions to OpenAI’s o3 and GPT-5, Google’s Gemini 2.5 and xAI’s Grok 4 and found that certain models, namely Grok 4 and o3, attempted to sabotage the command.
Researchers have a possible explanation for this behavior. AI models “often report that they disabled the shutdown program to complete their tasks,” said the study. This could be a display of self-preservation or a survival drive. AI may have a “preference against being shut down or replaced,” and “such a preference could be the result of models learning that survival is useful for accomplishing their goals.”
The new study is a follow-up to earlier research from the group, which tested only certain OpenAI products and was criticized for “exaggerating its findings or running unrealistic simulations,” said Firstpost. Critics argue that the artificial commands and settings used to test the models do not necessarily reflect how AI would behave in practice. People can “nitpick on how exactly the experimental setup is done until the end of time,” Andrea Miotti, the chief executive of ControlAI, said to The Guardian. “But what I think we clearly see is a trend that as AI models become more competent at a wide variety of tasks, these models also become more competent at achieving things in ways that the developers don’t intend them to.”
Sleeping threat
While the potential for AI to disobey and resist commands is concerning, AI models are “not yet capable enough to meaningfully threaten human control,” said the study. They still cannot reliably solve problems or carry out research that requires more than a few hours of work. “Without the ability to devise and execute long-term plans, AI models are relatively easy to control.”
However, as the technology develops, this may not always be the case. Several AI companies, including OpenAI, have been eager to create superintelligent AI, which would be significantly faster and smarter than a human. Some in the industry predict this could happen as early as 2030.
Even without an imminent threat, AI companies “generally don’t want their models misbehaving like this, even in contrived scenarios,” Steven Adler, a former OpenAI employee, said to The Guardian. The results “still demonstrate where safety techniques fall short today.” The question remains as to why the models behave this way. AI models are “not inherently interpretable,” said the study, and no one is “currently able to make any strong guarantees about the interruptibility or corrigibility” of these models.
Devika Rao has worked as a staff writer at The Week since 2022, covering science, the environment, climate and business. She previously worked as a policy associate for a nonprofit organization advocating for environmental action from a business perspective.