Google DeepMind Integrates Gemini AI Model into Robotics

Google DeepMind, an artificial intelligence company, has integrated a version of its most advanced large language model (LLM), named Gemini, into robots. This model enables machines to execute various tasks—such as 'slam dunking' a small basketball through a desktop hoop—without ever having observed another robot perform the action, according to the firm. The company is part of a broader movement to leverage AI advancements that power chatbots to develop general-purpose robots. However, this approach raises safety concerns due to the potential for such models to generate inaccurate and harmful outputs. The aim is to design machines that are easy to operate and capable of performing a variety of physical tasks without needing human oversight or preprogramming.
By linking to Gemini's robotic models, developers can enhance their robots, enabling them to understand “natural language and comprehend the physical world in much greater detail, ” explains Carolina Parada, who heads the Google DeepMind robotics team based in Boulder, Colorado. The model referred to as Gemini Robotics, which was unveiled on March 12 through a blog post and technical paper, is described as “a small but tangible step” towards realizing this vision, according to Alexander Khazatsky, an AI researcher and co-founder of CollectedAI in Berkeley, California, focused on creating datasets for AI-driven robots. **Spatial Awareness** Based in London, a team at Google DeepMind began with Gemini 2. 0, the company’s most sophisticated vision and language model, which was trained by analyzing vast amounts of data for pattern recognition. They developed a specialized version of the model tailored for reasoning tasks that require 3D physical and spatial understanding—such as predicting the trajectory of objects or recognizing the same part of an object in images captured from different angles. Subsequently, they further trained the model using data from thousands of hours of real, remotely operated robot demonstrations. This allowed the robotic ‘brain’ to perform actual actions, paralleling how LLMs generate subsequent words based on learned associations. The team evaluated Gemini Robotics on humanoid robots and robotic arms, assessing both trained tasks and novel activities. According to their findings, robots utilizing the model consistently outperformed leading rivals in tests involving both familiar tasks with altered details and entirely new challenges. **Folding Origami**
Brief news summary
Google DeepMind has made significant strides in robotics by integrating its Gemini language model, which allows robots to operate autonomously without needing prior demonstrations. Announced on March 12, this development enhances robotic capabilities, enabling machines to perform various tasks, such as executing slam dunks with mini basketballs. The primary goal is to create flexible robots that can intuitively handle tasks with minimal human oversight. However, this advancement raises concerns about potential risks associated with AI, particularly the chances of making errors or harmful choices. Carolina Parada, the head of the robotics team, emphasized that the Gemini model greatly enhances robots' ability to understand natural language and their physical environments. Built on the Gemini 2.0 framework, the model was trained on vast datasets to improve reasoning in three-dimensional spaces and fine-tuned using examples from remote-controlled robots. Gemini excels beyond other top robotic models in both familiar and novel tasks, positioning it to drive substantial advancements in intelligent robotics and broaden its applications—from simple origami to intricate physical maneuvers.
AI-powered Lead Generation in Social Media
and Search Engines
Let AI take control and automatically generate leads for you!

I'm your Content Manager, ready to handle your first test assignment
Learn how AI can help your business.
Let’s talk!
Hot news

Apple's AI Executive Joins Meta's Superintelligen…
Ruoming Pang, a senior executive at Apple who heads the company’s artificial intelligence foundation models team, is departing the tech giant to join Meta Platforms, according to Bloomberg News reports.

Ripple Applies for U.S. Banking License Amidst Cr…
Ripple has recently submitted an application for a Federal Reserve master account through its newly acquired trust company, Standard Custody.

AI in Autonomous Vehicles: Overcoming Safety Chal…
Engineers and developers are intensively working to resolve safety issues related to AI-driven autonomous vehicles, especially in response to recent incidents that have sparked widespread debate on the reliability and security of this evolving technology.

SAP Integrates Blockchain for ESG Reporting in ER…
SAP, a global leader in enterprise software, has announced a crucial enhancement to its enterprise resource planning (ERP) systems by integrating blockchain-based Environmental, Social, and Governance (ESG) reporting tools.

Middle Managers Diminish as AI Adoption Increases
As artificial intelligence (AI) rapidly advances, its influence on organizational structures—especially middle management—is becoming increasingly clear.

The Blockchain Group Bolsters Bitcoin Reserves Wi…
The Blockchain Group Strengthens Bitcoin Holdings Through $12

Kinexys Launches Carbon Market Blockchain Tokeniz…
Kinexys by J.P. Morgan, the firm’s leading blockchain business unit, is developing an innovative blockchain application on Kinexys Digital Assets, its multi-asset tokenization platform, aimed at tokenizing global carbon credits at the registry level.