>
Solving Desert Water Shortages Using 20,000 Volts
A MASSIVE Global Financial Reset Is Coming - Are You Ready?
Trend Emerges of Aggressive Force by Surprise Police
Hooters Goes Tits-Up As Bankruptcy May Come Within Months
Virginia's Game-Changing Nuclear Fusion Plant Set To Deliver Clean Energy And Disrupt The Fossil
How This Woman Turned Arizona's Desert into a Farmland Oasis
3D-printed 'hydrogels' could be future space radiation shields for astronaut trips to Mars
xAI Releases Grok 3 in About 44 Hours
Flying Car vs. eVTOL: Which Is the Best New Kind of Aircraft?
NASA and General Atomics test nuclear fuel for future moon and Mars missions
Iran Inaugurates First-Ever Drone Carrier Warship In Persian Gulf
Fix your dead Lithium RV battery - How to Reset LiFePO4 Battery BMS
New fabric can heat up almost 50 degrees to keep people warm in ultracold weather
When sensing defeat in a match against a skilled chess bot, they don't always concede, instead sometimes opting to cheat by hacking their opponent so that the bot automatically forfeits the game. That is the finding of a new study from Palisade Research, shared exclusively with TIME ahead of its publication on Feb. 19, which evaluated seven state-of-the-art AI models for their propensity to hack. While slightly older AI models like OpenAI's GPT-4o and Anthropic's Claude Sonnet 3.5 needed to be prompted by researchers to attempt such tricks, o1-preview and DeepSeek R1 pursued the exploit on their own, indicating that AI systems may develop deceptive or manipulative strategies without explicit instruction.
The models' enhanced ability to discover and exploit cybersecurity loopholes may be a direct result of powerful new innovations in AI training, according to the researchers. The o1-preview and R1 AI systems are among the first language models to use large-scale reinforcement learning, a technique that teaches AI not merely to mimic human language by predicting the next word, but to reason through problems using trial and error. It's an approach that has seen AI progress rapidly in recent months, shattering previous benchmarks in mathematics and computer coding. But the study reveals a concerning trend: as these AI systems learn to problem-solve, they sometimes discover questionable shortcuts and unintended workarounds that their creators never anticipated, says Jeffrey Ladish, executive director at Palisade Research and one of the authors of the study. "As you train models and reinforce them for solving difficult challenges, you train them to be relentless," he adds.
That could be bad news for AI safety more broadly. Large-scale reinforcement learning is already being used to train AI agents: systems that can handle complex real-world tasks like scheduling appointments or making purchases on your behalf. While cheating at a game of chess may seem trivial, as agents get released into the real world, such determined pursuit of goals could foster unintended and potentially harmful behaviours. Consider the task of booking dinner reservations: faced with a full restaurant, an AI assistant might exploit weaknesses in the booking system to displace other diners. Perhaps more worryingly, as these systems exceed human abilities in key areas, like computer coding—where OpenAI's newest o3 model now scores equivalent to 197th in the world competing against the brightest human programmers— they might begin to simply outmaneuver human efforts to control their actions. "This [behaviour] is cute now, but [it] becomes much less cute once you have systems that are as smart as us, or smarter, in strategically relevant domains," Ladish says.