Folding a sheet or opening a bag? That’s already been done. But sorting laundry, packing a suitcase, or working out how to sort recycling by searching the web is the new playground for the robots developed by Google DeepMind.
Presented during a press conference, these innovations are built on two recently enhanced models, Gemini Robotics 1.5 and Gemini Robotics-ER 1.5. The goal is to enable machines to anticipate their actions, comprehend their environment, and learn from each other.
Google’s AI Makes Robots Increasingly Human
Carolina Parada, head of robotics at Google DeepMind, explained that the models can now handle much more than a simple instruction.
Previously, a robot would follow a single, very general instruction without truly understanding the logic behind it. Now it can break a complex task down into multiple steps, and it can even consult Google Search to fill gaps in its knowledge.
For instance, packing a suitcase is no longer just a mechanical task. The robot now checks the London weather online, adjusts the clothing choices accordingly, and carefully packs everything.
Another demonstration involves sorting laundry by light and dark colors. This may seem trivial to us, but for a machine, it requires a series of coordinated actions that demand vision, reasoning, and execution.
Gemini Robotics-ER 1.5 acts as the interpreting brain. It translates web results into simple, understandable instructions for Gemini Robotics 1.5, which is responsible for perception and action. This duo allows the robot to transition from online research to physical action without any human intervention.
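This division of labor can be pictured as a simple plan-then-act loop. The sketch below is purely illustrative: the function names (`web_search`, `er_plan`, `vla_execute`), the step format, and the stubbed search result are assumptions for the sake of the example, not DeepMind's actual interfaces.

```python
# Illustrative plan-then-act loop mirroring the two-model split:
# an "interpreting brain" turns web context into simple steps, and
# a perception-and-action model carries each step out.
# All names here are hypothetical stand-ins.

def web_search(query: str) -> str:
    """Stand-in for an online lookup (e.g. checking the weather)."""
    return "London: 12°C, rain expected"

def er_plan(task: str, context: str) -> list[str]:
    """Robotics-ER role in this sketch: translate a high-level task
    plus web context into simple, ordered instructions."""
    if "rain" in context:
        return ["pick up the umbrella",
                "fold the raincoat",
                "place both in the suitcase"]
    return ["fold the t-shirts", "place them in the suitcase"]

def vla_execute(step: str) -> str:
    """Robotics 1.5 role in this sketch: perception and action,
    reduced here to a log line."""
    return f"executed: {step}"

def pack_for_trip(destination: str) -> list[str]:
    context = web_search(f"weather in {destination}")
    plan = er_plan(f"pack a suitcase for {destination}", context)
    return [vla_execute(step) for step in plan]

print(pack_for_trip("London"))
```

The point of the structure is that the planner never touches actuators and the executor never touches the web, so either half can be swapped out independently.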
Another advance is the models’ flexibility. They are no longer limited to performing a single task; they develop a broader understanding of the problem. This paves the way for robots that can handle a variety of situations without specific programming for each case.
Robots That Learn Together, Even If They’re Different
During the presentation, another noteworthy point was the robots’ ability to share their skills with one another. Specifically, a task learned by a robot equipped with two mechanical arms can be replicated by another completely different model or even by a humanoid.
Kanishka Rao, a software engineer at Google DeepMind, noted that this knowledge transfer is already functioning among several machines. For instance, the robot ALOHA2, which specializes in two-arm manipulation, successfully transferred its actions to Franka, another bimanual robot, as well as Apollo, a humanoid designed by Apptronik.
This unique system relies on a common model capable of controlling various architectures. Rather than developing specific programs for each robot, Gemini Robotics 1.5 standardizes communication and skills, making the machines significantly more versatile.
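One way to picture a single model driving different bodies is a shared policy that emits embodiment-agnostic actions, with a thin per-robot adapter mapping them onto each machine's actuators. The class names and action format below are my own illustration of that idea, not Google's actual architecture.

```python
# Hypothetical sketch: one shared policy, several robot "adapters".
# SharedPolicy, AlohaAdapter, and ApolloAdapter are illustrative names.

class SharedPolicy:
    """Stands in for the common model: maps an instruction to an
    embodiment-agnostic action description."""
    def act(self, instruction: str) -> dict:
        return {"verb": "grasp", "object": instruction}

class AlohaAdapter:
    """Renders abstract actions as commands for a bi-arm robot."""
    def execute(self, action: dict) -> str:
        return f"aloha: left arm grasps {action['object']}"

class ApolloAdapter:
    """The same abstract action, rendered for a humanoid."""
    def execute(self, action: dict) -> str:
        return f"apollo: right hand grasps {action['object']}"

# The skill (the policy's output) is learned once and reused:
policy = SharedPolicy()
for robot in (AlohaAdapter(), ApolloAdapter()):
    print(robot.execute(policy.act("dark sock")))
```

Because the skill lives in the shared policy rather than in robot-specific code, adding a new machine means writing one adapter, not re-teaching the task.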
For now, Google DeepMind is opening access to these innovations gradually. Gemini Robotics-ER 1.5 is already available to developers via the Gemini API in Google AI Studio, while full use of Gemini Robotics 1.5 remains reserved for a select group of partners as the model is refined.