---
license: mit
pipeline_tag: robotics
library_name: lerobot
tags:
  - action-chunking-with-transformers
---

# Robot Learning: A Tutorial - ACT Model

This repository contains a model checkpoint (`act-resnet18-upside-down-side-v0.1`) associated with the paper "Robot Learning: A Tutorial". This tutorial navigates the landscape of modern robot learning, providing practical examples using the `lerobot` library developed by Hugging Face. The model provided here is an instance of the Action Chunking with Transformers (ACT) framework.

## Abstract

Robot learning is at an inflection point, driven by rapid advancements in machine learning and the growing availability of large-scale robotics data. This shift from classical, model-based methods to data-driven, learning-based paradigms is unlocking unprecedented capabilities in autonomous systems. This tutorial navigates the landscape of modern robot learning, charting a course from the foundational principles of Reinforcement Learning and Behavioral Cloning to generalist, language-conditioned models capable of operating across diverse tasks and even robot embodiments. This work is intended as a guide for researchers and practitioners, and our goal is to equip the reader with the conceptual understanding and practical tools necessary to contribute to developments in robot learning, with ready-to-use examples implemented in $\texttt{lerobot}$.

## Model Details

This model checkpoint is configured as an Action Chunking with Transformers (ACT) policy, as defined in its `config.json`. It uses a `resnet18` vision backbone to process visual observations and is designed for robotic action prediction. The feature shapes listed below are reflected in the inference sketch that follows the list.

- **Model Type:** `act` (Action Chunking with Transformers)
- **Vision Backbone:** `resnet18`
- **Input Features:**
  - `observation.state`: state vector of shape `[6]`
  - `observation.images.up`: visual input from an "up" camera, shape `[3, 480, 640]`
  - `observation.images.side`: visual input from a "side" camera, shape `[3, 480, 640]`
- **Output Features:**
  - `action`: action vector of shape `[6]`
- **Uses VAE:** `true`
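
As a rough illustration of how the features above map to tensors at inference time, the sketch below loads the policy and queries it with a dummy observation batch. The import path and the repo id are assumptions: policy classes have moved between `lerobot` releases, and this card does not state the checkpoint's Hub id, so both may need adjusting.

```python
import torch

# NOTE: this import path follows recent lerobot releases; older releases used
# lerobot.common.policies.act.modeling_act, so adjust to your installed version.
from lerobot.policies.act.modeling_act import ACTPolicy

# Placeholder repo id -- replace with the actual Hub id of this checkpoint.
policy = ACTPolicy.from_pretrained("<org>/act-resnet18-upside-down-side-v0.1")
policy.eval()
policy.reset()  # clears the internal action-chunk queue before a new episode

# Dummy observation batch matching the shapes listed above:
# batch size 1, channel-first float images in [0, 1].
batch = {
    "observation.state": torch.zeros(1, 6),
    "observation.images.up": torch.zeros(1, 3, 480, 640),
    "observation.images.side": torch.zeros(1, 3, 480, 640),
}

with torch.inference_mode():
    action = policy.select_action(batch)

print(action.shape)  # expected: torch.Size([1, 6])
```

In a control loop, `select_action` is typically called once per control step: the ACT policy serves actions from its most recently predicted chunk and only runs a new forward pass once that queue is exhausted.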

## Usage

For detailed instructions on installation, training, and using robot learning models within the `lerobot` ecosystem, please refer to the GitHub repository accompanying "Robot Learning: A Tutorial", which provides code examples and guidance for implementing and experimenting with such models.
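
If you only need the raw files from this repository (e.g. the `config.json` and model weights) rather than an instantiated policy, a download along the lines of the snippet below should suffice; the repo id is again a placeholder, as this card does not state it.

```python
from huggingface_hub import snapshot_download

# Placeholder repo id -- substitute the actual Hub id of this checkpoint.
local_dir = snapshot_download(repo_id="<org>/act-resnet18-upside-down-side-v0.1")
print(local_dir)  # local folder containing config.json and the model weights
```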

## License

The code examples in the "Robot Learning: A Tutorial" GitHub repository are licensed under the MIT License. This model artifact is associated with that codebase and is likewise distributed under the MIT License (see the `license` field in the metadata above).