
Google's Gemini Robotics On-Device: AI Model Powers Robots Locally
Hey everyone! I've got some exciting news out of Google DeepMind. They've just unveiled Gemini Robotics On-Device, a new vision-language-action (VLA) model designed to let robots operate independently, without needing a constant internet connection. Pretty cool, right?
Think about it: this model can directly control a robot's movements, and developers can steer it with plain natural-language instructions. Imagine the possibilities! We're talking about robots that can unzip bags and fold clothes. Google says it performs nearly as well as its cloud-based Gemini Robotics model, and better than other on-device models, though it didn't say which ones. That's still a bold claim!
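To make the "control via natural language" idea concrete, here's a minimal sketch of what an instruction-conditioned inference call could look like. Everything in it is a stand-in: Google hasn't published this API, so the class name, method, and shapes below are assumptions for illustration only.

```python
# Hypothetical sketch only: the names below are invented for illustration;
# the article does not describe Google's actual on-device API.
import numpy as np

class OnDeviceVLAModel:
    """Stand-in for a vision-language-action model running locally on the robot."""

    def predict_actions(self, camera_image: np.ndarray, instruction: str) -> np.ndarray:
        # A real VLA model would return a short chunk of joint-space actions
        # conditioned on the camera image and the instruction. We return zeros.
        return np.zeros((10, 14))  # e.g., 10 timesteps for a 14-DoF bi-arm robot

model = OnDeviceVLAModel()  # runs on the robot itself; no network round-trip
frame = np.zeros((480, 640, 3), dtype=np.uint8)  # placeholder camera frame
actions = model.predict_actions(frame, "fold the shirt on the table")
```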
What's even more interesting: although the model was initially trained for ALOHA robots, it has been adapted to work on other robots, like the bi-arm Franka FR3 and Apptronik's Apollo humanoid. The Franka FR3 even handled tasks and objects it hadn't encountered before, such as assembly on an industrial belt.
To make things even easier for developers, Google DeepMind is also launching a Gemini Robotics SDK. With it, developers can train robots on new tasks by showing them as few as 50 to 100 demonstrations, and test everything in the MuJoCo physics simulator. In other words, you can teach the robot a new task just by demonstrating it; a rough sketch of what that workflow might look like follows below.
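In this sketch, the MuJoCo calls are real (from the open-source `mujoco` Python package), but the policy class and its fine-tuning hook are hypothetical placeholders, since the actual Gemini Robotics SDK interface isn't described in the article.

```python
import numpy as np
import mujoco

# A toy one-joint scene so the sketch is self-contained; a real setup
# would load a full bi-arm robot model instead.
SCENE_XML = """
<mujoco>
  <worldbody>
    <body name="arm">
      <joint name="j1" type="hinge"/>
      <geom type="capsule" size="0.02" fromto="0 0 0 0 0 0.3"/>
    </body>
  </worldbody>
  <actuator>
    <motor joint="j1"/>
  </actuator>
</mujoco>
"""

class FineTunablePolicy:
    """Hypothetical stand-in for a fine-tunable on-device policy."""

    def fine_tune(self, demonstrations):
        # Per the article, 50 to 100 task demonstrations are enough;
        # a real SDK would update the model's weights here.
        self.demos = demonstrations

    def act(self, qpos, qvel, instruction, n_actuators):
        # A real policy would condition on camera images and the
        # instruction; here we just output zero torques.
        return np.zeros(n_actuators)

model = mujoco.MjModel.from_xml_string(SCENE_XML)
data = mujoco.MjData(model)

policy = FineTunablePolicy()
policy.fine_tune(demonstrations=[])  # imagine 50-100 recorded demos here

# Roll the "fine-tuned" policy out for five seconds of simulated time.
while data.time < 5.0:
    data.ctrl[:] = policy.act(data.qpos, data.qvel, "fold the shirt", model.nu)
    mujoco.mj_step(model, data)
print(f"simulated {data.time:.1f}s of robot time")
```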
It seems like everyone is getting in on the robotics game. Nvidia is building a platform for creating foundation models for humanoids, Hugging Face is developing open models and datasets for robotics and even building robots itself, and the startup RLWRLD is working on foundational models for robots, too. This is an exciting time for AI and robotics, and I can't wait to see what the future holds!
[Image: Gemini Robotics On-Device]

Source: TechCrunch