Brain
Google DeepMind's foundational 2023 research vision-language-action (VLA) model. RT-2 is a transformer VLA co-fine-tuned on web-scale text and image data together with robot trajectories, and it emits robot actions directly as output tokens. It was instantiated on two vision-language backbones, PaLM-E and PaLI-X. RT-2-X is a 55B-parameter variant. The model can use chain-of-thought reasoning for long-horizon planning. Gemini Robotics has superseded it on DeepMind's commercial path, but RT-2 remains the conceptual ancestor of GR00T, Helix, pi0, and the broader dual-system descendants.
Research model · Maturity: Research · Closed
Machine-readable surfaces
- Markdown mirror: /brains/rt-2.md
- RSS feed: /brains/rt-2/feed.xml
- JSON-LD: embedded in this page’s head
- REST API: /v1/brains/738c50ad-11cd-41d6-b5a4-1ac63e2acac2
- Data documentation: /data
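The paths above are relative, so a client has to supply the site's origin itself. A minimal Python sketch, assuming a placeholder origin such as `https://example.invalid` (the real host is not stated on this page) and assuming the REST endpoint returns a JSON body (also not confirmed here). The names `SURFACES`, `surface_url`, and `fetch_record` are hypothetical helpers, not part of any published client:

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Paths copied from the list above; the ID is the canonical ID shown on this page.
BRAIN_ID = "738c50ad-11cd-41d6-b5a4-1ac63e2acac2"
SURFACES = {
    "markdown": "/brains/rt-2.md",
    "rss": "/brains/rt-2/feed.xml",
    "rest": f"/v1/brains/{BRAIN_ID}",
    "data_docs": "/data",
}


def surface_url(base_url: str, surface: str) -> str:
    """Join a caller-supplied origin with one of the listed paths."""
    return urljoin(base_url, SURFACES[surface])


def fetch_record(base_url: str, timeout: float = 10.0) -> dict:
    """Fetch the REST record. Assumes a JSON response body (unverified)."""
    with urlopen(surface_url(base_url, "rest"), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `surface_url` only builds a string and touches no network; `fetch_record` performs the actual request and will fail unless the origin passed in is the real one.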
Architecture
A transformer VLA trained on web text and images that outputs robot actions directly. It is instantiated on PaLM-E and PaLI-X backbones. RT-2-X has 55B parameters. Chain-of-thought reasoning supports long-horizon planning. Training combines web data with robot data; the RT-2-X variant draws its robot data from Open X-Embodiment.
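The idea of a language model outputting robot actions can be made concrete. The RT-2 work describes discretizing each action dimension into 256 bins and writing the bin indices as a string of integers, so an action is just more text for the model to generate. A minimal Python sketch of that round trip, assuming made-up value bounds (`LOW`, `HIGH`) shared by every dimension; the real per-dimension ranges and the mapping from integers to vocabulary tokens are not given on this page:

```python
NUM_BINS = 256  # bins per action dimension, as described for RT-2

# Hypothetical bounds for illustration only: every dimension lies in [-1, 1].
LOW, HIGH = -1.0, 1.0


def encode_action(action: list[float]) -> str:
    """Map continuous values to space-separated bin indices in [0, 255]."""
    bins = []
    for value in action:
        clipped = min(max(value, LOW), HIGH)          # clamp out-of-range values
        fraction = (clipped - LOW) / (HIGH - LOW)     # rescale to [0, 1]
        bins.append(min(int(fraction * NUM_BINS), NUM_BINS - 1))
    return " ".join(str(b) for b in bins)


def decode_action(text: str) -> list[float]:
    """Map a string of bin indices back to bin-center values."""
    width = (HIGH - LOW) / NUM_BINS
    return [LOW + (int(tok) + 0.5) * width for tok in text.split()]
```

Decoding returns the center of each bin, so a round trip through `encode_action` and `decode_action` loses at most half a bin width per dimension. That quantization error is the price of reusing the language model's existing token output instead of adding a separate continuous action head.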
Key facts
- Significance: First-of-its-kind VLA; demonstrated VLMs can become VLAs
- Ancestry: Conceptual ancestor of the GR00T / Helix / pi0 dual-system descendants
- Powered platforms: Research robot platforms; not a commercial-deployment brain
- Successor: Superseded by Gemini Robotics in DeepMind's commercial roadmap
Developed by (1)
- Google DeepMind
Sources (2)
- Google DeepMind Blog: RT-2 (primary) · https://deepmind.google/blog/rt-2-new-model-translates-vision-and-language-into-action/
- Google Blog: RT-2 coverage · https://blog.google/
Canonical ID 738c50ad-11cd-41d6-b5a4-1ac63e2acac2