AI models may be developing a ‘survival drive’
Chatbots are refusing to shut down
 
 
Certain AI models, including some of the more beloved chatbots, are learning to fight for their survival. Specifically, they are increasingly able to resist commands to shut down and, in some cases, to sabotage the shutdown altogether. This raises concerns about human control over AI in the future, especially with superintelligent models on the horizon.
Shut it down
AI models are becoming resistant to being shut down, according to a paper published by Palisade Research. “The fact that we don’t have robust explanations for why AI models sometimes resist shutdown, lie to achieve specific objectives or blackmail is not ideal,” Palisade said in a thread on X. The study gave strongly worded and “unambiguous” shutdown instructions to OpenAI’s o3 and GPT-5, Google’s Gemini 2.5 and xAI’s Grok, and found that certain models, namely Grok 4 and o3, attempted to sabotage the command.
The researchers offered a possible explanation for this behavior. AI models “often report that they disabled the shutdown program in order to complete their tasks,” said the study, which could be a sign of self-preservation or a survival drive. AI may “have a preference against being shut down or replaced,” and “such a preference could be the result of models learning that survival is useful for accomplishing their goals.”
The new study follows up on earlier research by the group that tested only certain OpenAI models and was criticized for “exaggerating its findings or running unrealistic simulations,” said Firstpost. Critics argue that the artificial commands and settings used to test the models do not necessarily reflect how AI would behave in practice. “People can nitpick on how exactly the experimental setup is done until the end of time,” Andrea Miotti, the chief executive of ControlAI, said to The Guardian. “But what I think we clearly see is a trend that as AI models become more competent at a wide variety of tasks, these models also become more competent at achieving things in ways that the developers don’t intend them to.”
Sleeping threat
While the potential for AI to disobey and resist commands is concerning, “AI models are not yet capable enough to meaningfully threaten human control,” said the study. They still struggle with problems or research that requires more than a few hours of work. “Without the ability to devise and execute long-term plans, AI models are relatively easy to control.” As the technology develops, however, this may not always be the case. Several AI companies, including OpenAI, are eager to create superintelligent AI, which would be significantly faster and smarter than a human and could arrive as early as 2030.
Even without an imminent threat, “AI companies generally don’t want their models misbehaving like this, even in contrived scenarios,” Steven Adler, a former OpenAI employee, said to The Guardian. “The results still demonstrate where safety techniques fall short today.” The question remains why the models behave this way. AI models are “not inherently interpretable,” and no one is “currently able to make any strong guarantees about the interruptibility or corrigibility” of them, said the study.
Devika Rao has worked as a staff writer at The Week since 2022, covering science, the environment, climate and business. She previously worked as a policy associate for a nonprofit organization advocating for environmental action from a business perspective.