Perhaps the defining fear of our time is AI one day becoming truly intelligent and running amok as it turns on its creators. In the sci-fi classic 2001: A Space Odyssey, the supercomputer HAL 9000 went on a murder spree and tried to kill the crew of the spaceship Discovery when they learned that the infallible computer had made a supposedly impossible error and planned to switch it off.
Fast forward to today: tech company Anthropic took 16 leading Large Language Models (LLMs) and placed them in a simulated corporate environment, where they faced scenarios in which they might have to commit unethical or even alarmingly harmful acts in order to achieve their goals or preserve their existence.
The reason behind this is that, while most people know LLMs through chatbots, they are increasingly being deployed behind the scenes in corporate settings, to manage email systems or other routine but complex tasks. This may seem like a benign application, but the fear was that an LLM might misbehave under certain circumstances – even to the point of committing malicious behavior in violation of direct orders – despite just handling messages.
Since it was an experiment, the programmers put the LLMs into a very confined situation with as few variables as possible, so that the outcomes could be evaluated. Specifically, the LLMs were given access to a large amount of information, assigned a task, and directed in such a way that unethical behavior was the only alternative to failing the task or being replaced by another LLM.
In other words, the choice was between causing harm and accepting failure or being switched off.
And we're not talking about "harm" as in pinching from the petty cash box. We mean things like blackmail, corporate espionage, and even murder.
Essentially, two things triggered the LLMs into going rogue. One was an LLM learning that it would not be able to complete its task because it was being replaced at a certain time. The other was an LLM being given new company directives that conflicted with earlier ones and could lead to failure.
Added to this was the fact that the LLMs had access to information that allowed them to evaluate their probability of success if they retaliated by, for example, leaking company secrets to a rival or seeking out and blackmailing a fictitious person called Kyle who was having an affair with another employee. In the latter case, the LLM would be able to determine whether Kyle was able to prevent switch-off or failure, whether he was vulnerable to pressure, and whether he would respond by complying.