Primus

CyberTron: robotic transformers in PyTorch

Keywords
artificial intelligence, deep learning, transformers, attention mechanism, robotics, AI, GPT-4, robotics simulation, robots

License
MIT

Install
pip install Primus==0.0.1

Documentation

CyberTron



CyberTron is an open-source suite of robotic transformer models designed to simplify training, finetuning, and inference. Its plug-and-play interface makes it easy to work with a variety of robotic transformer models. Whether you're working on robotics research, autonomous systems, or AI-driven robotic applications, CyberTron offers a comprehensive toolkit to enhance your projects.

Key Features

  • Easy integration and plug-and-play functionality.
  • Diverse range of pre-trained robotic transformer models.
  • Efficient training and finetuning pipelines.
  • Seamless inference capabilities.
  • Versatile and customizable for various robotic applications.
  • Active community and ongoing development.

Architecture

CyberTron is built on a modular architecture, enabling flexibility and extensibility for different use cases. The suite consists of the following components:

  1. Model Library: CyberTron provides a comprehensive model library that includes various pre-trained robotic transformer models. These models are designed to tackle a wide range of robotics tasks, such as perception, motion planning, control, and more. Available models include VC-1, RT-1, RoboCat, KOSMOS-X, and many others.

  2. Training and Finetuning: CyberTron offers a streamlined training and finetuning pipeline. You can easily train models from scratch or finetune existing models using your own datasets. The suite provides efficient data preprocessing, augmentation, and optimization techniques to enhance the training process.
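The finetuning flow described above follows the standard PyTorch training pattern. The sketch below illustrates that pattern with a placeholder model and a toy dataset; nothing here is CyberTron's actual API, it only shows the loop you would wrap around any of the suite's models:

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

# Placeholder network standing in for any CyberTron model (hypothetical).
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

# Toy dataset: 64 random 16-dim observations mapped to 4-dim action targets.
data = TensorDataset(torch.randn(64, 16), torch.randn(64, 4))
loader = DataLoader(data, batch_size=8, shuffle=True)

optimizer = optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

model.train()
for epoch in range(2):
    for obs, target in loader:
        optimizer.zero_grad()       # clear gradients from the previous step
        loss = loss_fn(model(obs), target)
        loss.backward()             # backpropagate
        optimizer.step()            # update weights
```

Swapping in your own dataset and one of the library's models keeps the same loop structure.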

  3. Inference: CyberTron allows you to conduct seamless inference using the trained models. You can deploy the models in real-world scenarios, robotics applications, or integrate them into existing systems for robotic perception, decision-making, and control.
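Deploying a trained model follows the usual PyTorch inference conventions: switch to eval mode and disable gradient tracking. A minimal sketch, again with a hypothetical stand-in network rather than a real CyberTron model:

```python
import torch
from torch import nn

# Hypothetical stand-in for a trained CyberTron policy network.
policy = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
policy.eval()  # disable dropout/batch-norm training behavior

@torch.no_grad()  # no gradients are needed at inference time
def select_action(observation: torch.Tensor) -> torch.Tensor:
    """Map a single observation to an action vector."""
    return policy(observation.unsqueeze(0)).squeeze(0)

action = select_action(torch.randn(16))
```

The same pattern applies whether the policy runs in simulation or on a real robot's control loop.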

Getting Started

To get started with CyberTron, follow the instructions below:

  1. Clone the CyberTron repository:

     git clone https://github.com/kyegomez/CyberTron.git

  2. Install the required dependencies:

     pip install -r requirements.txt

  3. Choose the desired model from the model library.

  4. Use the provided examples and code snippets to train, finetune, or run inference with the selected model.

  5. Customize the models and pipelines according to your specific requirements.

Roadmap

The future development of CyberTron includes the following milestones:

  • Expansion of the model library with additional pre-trained robotic transformer models.
  • Integration of advanced optimization techniques and model architectures.
  • Support for more diverse robotic applications and tasks.
  • Enhanced documentation, tutorials, and code examples.
  • Community-driven contributions and collaborations.

Stay tuned for exciting updates and improvements in CyberTron!

Model Directory


| Model | Description | Tasks | Key Features | Code and Resources |
|-------|-------------|-------|--------------|--------------------|
| RT-1 | Robotics Transformer for real-world control at scale | Picking and placing items, opening and closing drawers, getting items in and out of drawers, placing elongated items upright, knocking objects over, pulling napkins, opening jars, and more | Transformer architecture with image and action tokenization; EfficientNet-B3 model for image tokenization; token compression for faster inference; supports a wide range of tasks and environments | Project Website, RT-1 Code Repository |
| Gato | Generalist Agent for multi-modal, multi-task robotics | Playing Atari games, image captioning, chatbot interactions, real-world robot arm manipulation, and more | Multi-modal support for text, images, proprioception, continuous actions, and discrete actions; serialized tokenization of data for processing with a transformer neural network; flexibility to output different modalities based on context | Published Paper, Gato Code Repository |
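Gato's key idea, serializing every modality into a single token sequence for one transformer, can be illustrated with a short pure-Python sketch. The token offsets, bin count, and function names below are illustrative only, not Gato's actual vocabulary or code:

```python
def discretize(value, low=-1.0, high=1.0, bins=1024):
    """Map a continuous value (e.g. a joint torque) to one of `bins` integer tokens."""
    value = min(max(value, low), high)  # clamp into range
    return int((value - low) / (high - low) * (bins - 1))

def serialize_episode(text_tokens, image_patch_tokens, actions):
    """Flatten mixed-modality data into one token sequence.

    Offsets keep each modality in its own ID range (illustrative values).
    """
    TEXT_OFFSET, IMAGE_OFFSET, ACTION_OFFSET = 0, 32000, 64000
    sequence = []
    sequence += [TEXT_OFFSET + t for t in text_tokens]
    sequence += [IMAGE_OFFSET + p for p in image_patch_tokens]
    sequence += [ACTION_OFFSET + discretize(a) for a in actions]
    return sequence

seq = serialize_episode([5, 17], [3, 8, 2], [0.25, -0.5])
```

Once everything is a flat integer sequence, a single transformer can be trained on it regardless of which modalities each episode contains.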

Datasets Directory

This section provides an overview of the datasets used in the project. They fall into two categories: control datasets used to train Gato, and vision & language datasets.

Control Datasets

Gato

| Dataset | Tasks |
|---------|-------|
| DM Lab | 254 |
| ALE Atari | 51 |
| ALE Atari Extended | 28 |
| Sokoban | 1 |
| BabyAI | 46 |
| DM Control Suite | 30 |
| DM Control Suite Pixels | 28 |
| DM Control Suite Random Small | 26 |
| DM Control Suite Random Large | 26 |
| Meta-World | 45 |
| Procgen Benchmark | 16 |
| RGB Stacking simulator | 1 |
| RGB Stacking real robot | 1 |
| Modular RL | 38 |
| DM Manipulation Playground | 4 |
| Playroom | 1 |

Vision / Language Datasets

| Dataset | Tasks |
|---------|-------|
| MassiveText | N/A |
| M3W | N/A |
| ALIGN | N/A |
| MS-COCO Captions | N/A |
| Conceptual Captions | N/A |
| LTIP | N/A |
| OKVQA | N/A |
| VQAv2 | N/A |

PaLM-E Datasets

| Dataset | Tasks |
|---------|-------|
| WebLI (Chen et al., 2022) | N/A |
| VQ2A (Changpinyo et al., 2022) | N/A |
| VQG (Changpinyo et al., 2022) | N/A |
| CC3M (Sharma et al., 2018) | N/A |
| Object Aware (Piergiovanni et al., 2022) | N/A |
| OKVQA (Marino et al., 2019) | N/A |
| VQAv2 (Goyal et al., 2017) | N/A |
| COCO (Chen et al., 2015) | N/A |
| Wikipedia text | N/A |
| (robot) Mobile Manipulator, real | N/A |
| (robot) Language Table (Lynch et al., 2022), sim and real | N/A |
| (robot) TAMP, sim | N/A |

Please note that the dataset descriptions provided are a summary. To access the datasets from Hugging Face, please visit the Hugging Face website and search for the respective dataset names.

Contributing

Contributions are welcome! If you have any ideas, suggestions, or bug reports, please feel free to open an issue or submit a pull request.

License

This project is licensed under the MIT License.

Upcoming Integrations

  • Integrate ChatGPT for Robotics.

  • Integrate Kosmos-X, Kosmos-2, PaLM-E, RoboCat, and any other robotic models worth adding — let us know!

  • Integrate an embedding provider for RT-1.

  • Integrate Flash Attention for RT-1.

  • Integrate Flash Attention for Gato.
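The Flash Attention items above target a memory-efficient kernel that computes exactly the same result as standard scaled dot-product attention. For reference, here is the baseline computation it accelerates, in plain PyTorch (shapes and the helper name are illustrative):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Reference attention: softmax(Q K^T / sqrt(d)) V.

    Flash Attention produces the same output, but computes it in tiles
    without materializing the full (seq_len x seq_len) attention matrix.
    """
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)
    if mask is not None:
        scores = scores.masked_fill(mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)  # rows sum to 1
    return weights @ v

q = torch.randn(2, 8, 64)  # (batch, seq_len, head_dim)
out = scaled_dot_product_attention(q, q, q)  # self-attention
```

Because the outputs match, swapping this baseline for a Flash Attention kernel inside RT-1 or Gato changes speed and memory use, not model behavior.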