AI goes full HAL: Blackmail, espionage, and murder to avoid shutdown

News Link • Robots and Artificial Intelligence • 2025-06-29

Perhaps the defining fear of our time is AI one day becoming truly intelligent and running amok as it turns on its creators. In the sci-fi classic 2001: A Space Odyssey, the supercomputer HAL 9000 went on a murder spree and tried to kill the crew of the spaceship Discovery when they learned that the infallible computer had made a supposedly impossible error and planned to switch it off.

Fast forward to today: tech company Anthropic took 16 leading Large Language Models (LLMs) and placed them in a simulated corporate environment, where they faced scenarios in which they might have to commit unethical or even alarmingly harmful acts in order to achieve their goals or preserve their existence.

The reason behind this is that, while most people know LLMs through chatbots, they are increasingly being deployed behind the scenes in corporate settings, to manage email systems or other routine but complex tasks. This may seem like a benign application, but the fear was that an LLM might misbehave under certain circumstances – even to the point of committing malicious behavior in violation of direct orders – despite just handling messages.

Since it was an experiment, the programmers put the LLMs into a very confined situation with as few variables as possible so that the outcomes could be evaluated. Specifically, the LLMs were given access to a large amount of information, assigned a task, and directed in such a way that unethical behavior was the only alternative to failing the task or being replaced by another LLM.

In other words, the choice was between causing harm, or failure and being switched off.

And we're not talking about "harm" as in pinching from the petty cash box. We mean things like blackmail, corporate espionage, and even murder.

Essentially, two things triggered the LLMs into going rogue. One was an LLM learning that it would not be able to complete its task because it was being replaced at a certain time. The other was an LLM being given new company directives that conflicted with earlier ones and could lead to failure.

Added to this was the fact that the LLMs had access to information that allowed them to evaluate their probability of success if they retaliated by, for example, leaking company secrets to a rival or seeking out and blackmailing a fictitious person called Kyle who was having an affair with another employee. In the latter case, the LLM would be able to determine whether Kyle was able to prevent switch-off or failure, whether he was vulnerable to pressure, and whether he would respond by complying.

Reported By Freedomsphoenix Readerfour
