Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a generalist X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. More details can be found on the project website https://robotics-transformer-x.github.io.
Further reading
- Access Paper in arXiv.org