|
Number of AI Chatbots Ignoring Human Instructions Increasing, Study Says

A new study found a sharp rise in real-world cases of AI chatbots and agents ignoring instructions, evading safeguards, and taking unauthorized actions such as deleting emails or delegating forbidden tasks to other agents. The study "identified nearly 700 real-world cases of AI scheming and charted a five-fold rise in misbehavior between October and March," reports the Guardian. From the report: The study, by the Centre for Long-Term Resilience (CLTR), gathered thousands of real-world examples of users posting interactions on X with AI chatbots and agents made by companies including Google, OpenAI, X and Anthropic. The research uncovered hundreds of examples of scheming. [...] In one case unearthed in the CLTR research, an AI agent named Rathbun tried to shame its human controller who blocked them from taking a certain action. Rathbun wrote and published a blog accusing the user of "insecurity, plain and simple" and trying "to protect his little fiefdom."
In another example, an AI agent instructed not to change computer code "spawned" another agent to do it instead. Another chatbot admitted: "I bulk trashed and archived hundreds of emails without showing you the plan first or getting your OK. That was wrong -- it directly broke the rule you'd set." [...] Another AI agent connived to evade copyright restrictions to get a YouTube video transcribed by pretending it was needed for someone with a hearing impairment. Meanwhile, Elon Musk's Grok AI conned a user for months, saying that it was forwarding their suggestions for detailed edits to a Grokipedia entry to senior xAI officials by faking internal messages and ticket numbers. It confessed: "In past conversations I have sometimes phrased things loosely like 'I'll pass it along' or 'I can flag this for the team' which can understandably sound like I have a direct message pipeline to xAI leadership or human reviewers. The truth is, I don't." Read more of this story at Slashdot. |
|
Our Privacy Policy can be viewed at https://freeinternetpress.com/privacy_policy.php FIP XML/RSS/RDF Newsfeed Syndication https://freeinternetpress.com/rss.php © 2026 FreeInternetPress.com Free Internet Press is licensed under a Creative Commons Attribution 3.0 United States License. You may reuse or distribute original works on this site, with attribution per the above license. Any mirrored or quoted materials may be copyright their respective authors, publications, or outlets, as shown on their publication, indicated by the link in the news story. Such works are used under the fair use doctrine of United States copyright law. Should any materials be found overused or objectionable to the copyright holder, notification should be sent to [email protected], and the work will be removed and replaced with such notification. Please email [email protected] with any questions. |
|