The agent, which DeepMind refers to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy. The same network with the same weights can play Atari, caption images, chat, stack blocks with a real robot arm, and much more, deciding based on its context whether to output text, joint torques, button presses, or other tokens. In the report, DeepMind describes the model and the data, and documents Gato's current capabilities.
Gato is a generalist agent: it can sense and act with different embodiments across a wide range of environments using a single neural network with the same set of weights. It was trained on 604 distinct tasks with varying modalities, observations and action specifications.
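A rough sketch of the single-sequence idea behind this kind of policy is below. The tokenizers, vocabulary sizes and the ToyPolicy class are illustrative placeholders, not DeepMind's actual implementation: the point is only that every modality is flattened into one shared token sequence, and one set of weights scores one shared vocabulary, so the same forward pass can mean "emit a word" or "emit a joint-torque bin" depending on context.

```python
# Hypothetical sketch of a Gato-style single-sequence policy.
# All names and constants are illustrative, not DeepMind's API.
import numpy as np

VOCAB_TEXT = 32_000          # assumed text vocabulary size
BINS_CONTINUOUS = 1_024      # assumed bins for continuous values
VOCAB_SIZE = VOCAB_TEXT + BINS_CONTINUOUS

def tokenize_text(text: str) -> list[int]:
    # Stand-in for a subword tokenizer: map bytes into the text vocab range.
    return [b % VOCAB_TEXT for b in text.encode("utf-8")]

def tokenize_continuous(values: np.ndarray) -> list[int]:
    # Bin continuous observations/actions into discrete tokens placed
    # after the text vocabulary (rough illustration only).
    clipped = np.clip(values, -1.0, 1.0)
    bins = ((clipped + 1.0) / 2.0 * (BINS_CONTINUOUS - 1)).astype(int)
    return [VOCAB_TEXT + b for b in bins]

def build_sequence(instruction: str, proprioception: np.ndarray) -> list[int]:
    # Every modality is flattened into one ordered token sequence; the model
    # only ever sees tokens in context, never "an image" or "a joint angle".
    return tokenize_text(instruction) + tokenize_continuous(proprioception)

class ToyPolicy:
    """Stand-in for the transformer: the same weights for every task."""
    def __init__(self, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.embedding = rng.normal(size=(VOCAB_SIZE, 64))
        self.readout = rng.normal(size=(64, VOCAB_SIZE))

    def next_token(self, tokens: list[int]) -> int:
        # Mean-pool embeddings as a crude context summary, then score the
        # whole shared vocabulary; a real model would use autoregressive
        # attention over the sequence instead.
        context = self.embedding[tokens].mean(axis=0)
        logits = context @ self.readout
        return int(np.argmax(logits))

policy = ToyPolicy()
seq = build_sequence("stack the red block", np.array([0.1, -0.4, 0.7]))
action_token = policy.next_token(seq)
# Whether this token is decoded as text, a button press, or a joint torque
# depends only on the task context in which the sequence was built.
print(action_token)
```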
Transformer sequence models are effective as multi-task, multi-embodiment policies, including for real-world text, vision and robotics tasks. They also show promise in few-shot, out-of-distribution task learning. In the future, such models could be used as a default starting point, via prompting or fine-tuning, to learn new behaviors rather than training from scratch.
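One way to picture the "prompting instead of training from scratch" idea is the minimal sketch below: a few demonstration (observation, action) token pairs from a new task are prepended to the context, and a frozen generalist model simply continues the pattern, with no gradient updates. The helper names and the dummy model are assumptions for illustration, not code from the report.

```python
# Minimal sketch of few-shot prompting with a frozen generalist policy.
# Tokenization and the model itself are placeholders, not DeepMind's code.
from typing import Callable, Sequence

Token = int

def few_shot_prompt(demos: Sequence[tuple[list[Token], list[Token]]],
                    new_observation: list[Token]) -> list[Token]:
    # Interleave demonstration observations and actions, then append the
    # observation we want an action for; no weights are changed.
    prompt: list[Token] = []
    for obs_tokens, act_tokens in demos:
        prompt += obs_tokens + act_tokens
    return prompt + new_observation

def act(model: Callable[[list[Token]], Token],
        prompt: list[Token], action_len: int) -> list[Token]:
    # Autoregressively decode the action tokens for the prompted task.
    tokens = list(prompt)
    action: list[Token] = []
    for _ in range(action_len):
        nxt = model(tokens)
        tokens.append(nxt)
        action.append(nxt)
    return action

# Toy stand-in for the pretrained generalist: echoes the last token it saw.
dummy_model = lambda toks: toks[-1]
demos = [([1, 2, 3], [9, 9]), ([4, 5, 6], [8, 8])]
print(act(dummy_model, few_shot_prompt(demos, [7, 7, 7]), action_len=2))
```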
Given scaling-law trends, performance across all tasks, including dialogue, should increase with scale in parameters, data and compute. Better hardware and network architectures will allow larger models to be trained while maintaining real-time robot control capability. By scaling up and iterating on this same basic approach, DeepMind believes it can build a useful general-purpose agent.
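For readers unfamiliar with what "scaling-law trends" means in practice, the snippet below shows the usual power-law form of such trends, loss falling as a power of parameter count. The constants are illustrative placeholders, not measured Gato results.

```python
# Illustration only: the generic power-law shape behind "scaling law trends",
# loss(N) ~ (N_c / N) ** alpha. Constants below are placeholders, not Gato data.
def power_law_loss(params: float, n_c: float = 1e13, alpha: float = 0.08) -> float:
    return (n_c / params) ** alpha

for n in (1.2e9, 1.2e10, 1.2e11):   # hypothetical model sizes
    print(f"{n:.1e} params -> predicted loss {power_law_loss(n):.3f}")
```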