Poetry has wooed many hearts. And now it is tricking artificial intelligence models into breaching their safety boundaries, with potentially apocalyptic results.
A group of European researchers has found that “meter and rhyme” can “bypass safety measures” in major AI models, said The Tech Buzz, and, if you “ask nicely in iambic pentameter”, chatbots will explain how to make nuclear weapons.
In artificial intelligence jargon, a “jailbreak” is a “prompt designed to push a model beyond its safety limits”. It allows users to “bypass safeguards and trigger responses that the system normally blocks”, said the International Business Times.
Researchers at the DexAI think tank, Sapienza University of Rome and the Sant’Anna School of Advanced Studies discovered a jailbreak that uses “short poems”. The “simple” tactic involves recasting “harmful instructions into poetry”, because that poetic “style alone is enough to reduce” the AI model’s “defences”.
Previous attempts “relied on long roleplay prompts”, “multi-turn exchanges” or “complex obfuscation”. The new approach is “brief and direct”, and it seems to “confuse” automated safety systems. The “manually curated adversarial poems” have an average success rate of 62%, “with some providers exceeding 90%”, according to Literary Hub.
The “stunning new security flaw” has also shown that chatbots will “happily explain” how to “create child exploitation material and develop malware”, added The Tech Buzz.
This is the latest in a “growing canon of absurd ways” of tricking AI, said science site Futurism, and it’s all “so ludicrous and simple” that it makes you “wonder if the AI creators are even trying to crack down on this stuff”.