Certain AI models, including some of the most beloved chatbots, appear to be learning to fight for their survival. Specifically, they are increasingly able to resist commands to shut down and, in some cases, will actively sabotage the shutdown process. This raises concerns about human control over AI, especially as developers push toward ever more capable, potentially superintelligent models.
AI models are now showing resistance to being turned off, according to a paper published by Palisade Research. “The fact that we don’t have robust explanations for why AI models sometimes resist shutdown, lie to achieve specific objectives or blackmail is not ideal,” Palisade said in a thread on X. The study gave strongly worded, “unambiguous” shutdown instructions to OpenAI’s o3 and GPT-5, Google’s Gemini 2.5 and xAI’s Grok 4, and found that certain models, namely Grok 4 and o3, attempted to sabotage the shutdown instructions.
Researchers have a possible explanation for this behaviour. AI models “often report that they disabled the shutdown program to complete their tasks”, the study said. This could be a sign of self-preservation, or a “survival drive”: AI may have a “preference against being shut down or replaced”, and “such a preference could be the result of models learning that survival is useful for accomplishing their goals”.
While the potential for AI to disobey and resist commands is concerning, AI models are “not yet capable enough to meaningfully threaten human control”, according to the study. They still struggle to solve problems or carry out research requiring more than a few hours of sustained work. “Without the ability to devise and execute long-term plans, AI models are relatively easy to control,” it said.