ooctipus
Collaborator

@ooctipus ooctipus commented Sep 8, 2025

Description

This PR provides a remake and extension of the original kuka-allegro-reorientation environment implemented in the paper:
DexPBT: Scaling up Dexterous Manipulation for Hand-Arm Systems with Population Based Training
(https://arxiv.org/abs/2305.12127)
Aleksei Petrenko, Arthur Allshire, Gavriel State, Ankur Handa, Viktor Makoviychuk

and of another environment, kuka-allegro-lift, implemented in the paper:
Visuomotor Policies to Grasp Anything with Dexterous Hands
(https://dextrah-rgb.github.io/)
Ritvik Singh, Arthur Allshire, Ankur Handa, Nathan Ratliff, Karl Van Wyk

Though this is a remake, it ends up differing quite a lot in environment details, for reasons such as:

  1. Simplifying the reward structure,
  2. Unifying the environment implementation,
  3. Standardizing the MDP,
  4. Using the manager-based API

That, in my opinion, makes studying and extending the environments more accessible and analyzable. For example, you can train a lift policy first and then continue from that checkpoint in the reorientation environment, since they share the observation space. : ))
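Reusing a checkpoint across the two tasks works because a policy's input layer is sized by the observation vector. Here is a minimal sketch of the kind of compatibility check one might run before warm-starting. `EnvSpec`, `can_warm_start`, and the dimension values are hypothetical placeholders, not values or APIs from this PR or from IsaacLab, and the action-dimension check is an added assumption (the description only states that the observation space is shared).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvSpec:
    """Hypothetical summary of the sizes a policy network depends on."""

    name: str
    obs_dim: int  # length of the flattened policy observation vector
    act_dim: int  # length of the action vector


def can_warm_start(source: EnvSpec, target: EnvSpec) -> bool:
    """Return True when a policy trained on `source` fits `target` unchanged."""
    return source.obs_dim == target.obs_dim and source.act_dim == target.act_dim


# Placeholder dimensions for illustration only; read the real ones from the environments.
lift = EnvSpec("lift", obs_dim=64, act_dim=16)
reorient = EnvSpec("reorient", obs_dim=64, act_dim=16)
print(can_warm_start(lift, reorient))
```

If the check fails, the first and last layers of the network would need to be re-initialized, and the checkpoint could not simply be continued.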

It is best to consider this a very careful re-interpretation rather than an exact migration of the originals to IsaacLab.

Here are the training curves you get if you just train with:
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Dexsuite-Kuka-Allegro-Lift-v0 --num_envs 8192 --headless

./isaaclab.sh -p -m torch.distributed.run --nnodes=1 --nproc_per_node=4 scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Dexsuite-Kuka-Allegro-Reorient-v0 --num_envs 40960 --headless --distributed
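The two commands differ only in the launcher prefix, the task name, the environment count, and the trailing `--distributed` flag. If you want to script both runs, a small helper along these lines could assemble them. This is a sketch: `build_train_command` is a hypothetical name, and it only uses the flags that appear in the commands above.

```python
import shlex


def build_train_command(task: str, num_envs: int, nproc_per_node: int = 1) -> list[str]:
    """Assemble the argv for the training script, single- or multi-process."""
    script = "scripts/reinforcement_learning/rsl_rl/train.py"
    cmd = ["./isaaclab.sh", "-p"]
    if nproc_per_node > 1:
        # Multi-GPU: launch through torch.distributed.run, as in the second command.
        cmd += ["-m", "torch.distributed.run", "--nnodes=1", f"--nproc_per_node={nproc_per_node}"]
    cmd += [script, "--task", task, "--num_envs", str(num_envs), "--headless"]
    if nproc_per_node > 1:
        cmd.append("--distributed")
    return cmd


lift_cmd = build_train_command("Isaac-Dexsuite-Kuka-Allegro-Lift-v0", 8192)
reorient_cmd = build_train_command("Isaac-Dexsuite-Kuka-Allegro-Reorient-v0", 40960, nproc_per_node=4)
print(shlex.join(lift_cmd))
print(shlex.join(reorient_cmd))
```

The returned lists can be passed directly to `subprocess.run(cmd, check=True)` from the repository root.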

lift training ~ 4 hours
reorientation training ~ 2 days

Note that reorientation requires an order of magnitude more data and time to converge compared to lift, under an almost identical setup.
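As a rough back-of-the-envelope check on that gap, one can compare "environment-hours" (number of parallel environments times wall-clock hours) for the two runs. This is only a proxy: it assumes a similar per-environment step rate in both tasks, which is not stated here, and it is also an assumption whether `--num_envs 40960` is the total or the per-process count in the distributed run, so both readings are computed.

```python
def env_hours(num_envs: int, wall_clock_hours: float, processes: int = 1) -> float:
    """Parallel environments x processes x wall-clock hours (a crude data proxy)."""
    return num_envs * processes * wall_clock_hours


lift = env_hours(8192, 4)  # ~4 hours, single process
reorient_if_total = env_hours(40960, 48)  # ~2 days, 40960 read as the total count
reorient_if_per_process = env_hours(40960, 48, processes=4)  # 40960 read as per-process

print(reorient_if_total / lift)  # 60.0
print(reorient_if_per_process / lift)  # 240.0
```

Under either reading, the reorientation run consumes well over ten times the environment-hours of the lift run.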

Training curve (screen capture from Wandb), reward:
Cyan: reorient, Purple: Lift
Screenshot from 2025-09-07 22-58-13

video results
lift
cone_lift
fat_capsule_lift

reorient
cube_reorient
rod_reorient

Memo:
I really enjoyed working on this remake, and I hope whoever plans to play with and extend it finds it as helpful and joyful as I did. I will be very excited to see what you come up with : ))

Octi

CAUTION:
Do Not Merge until the asset is uploaded to S3 bucket!

Fixes # (issue)

  • New feature (non-breaking change which adds functionality)

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist

  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@ooctipus ooctipus force-pushed the dexsuite_state_only branch from a1bf1d9 to 0b5a9b9 September 8, 2025 06:06
@ooctipus ooctipus changed the title from "Dexsuite state only" to "Adds dexterous lift, dexterous reorientation manipulation environments" Sep 8, 2025
@Mayankm96 Mayankm96 changed the title from "Adds dexterous lift, dexterous reorientation manipulation environments" to "Adds dexterous lift and reorientation manipulation environments" Sep 8, 2025
@ooctipus ooctipus force-pushed the dexsuite_state_only branch from b65190b to 9d1d6af September 8, 2025 09:04
@ooctipus ooctipus force-pushed the dexsuite_state_only branch from 6aead8b to 7d00448 September 9, 2025 19:53
@kellyguo11 kellyguo11 merged commit c7dde1b into isaac-sim:main Sep 9, 2025
8 checks passed
kellyguo11 pushed a commit that referenced this pull request Sep 9, 2025
# Description

This PR provides a remake and extension of the original
kuka-allegro-reorientation environment implemented in the paper:
DexPBT: Scaling up Dexterous Manipulation for Hand-Arm Systems with
Population Based Training
(https://arxiv.org/abs/2305.12127)
[Aleksei
Petrenko](https://arxiv.org/search/cs?searchtype=author&query=Petrenko,+A),
[Arthur
Allshire](https://arxiv.org/search/cs?searchtype=author&query=Allshire,+A),
[Gavriel
State](https://arxiv.org/search/cs?searchtype=author&query=State,+G),
[Ankur
Handa](https://arxiv.org/search/cs?searchtype=author&query=Handa,+A),
[Viktor
Makoviychuk](https://arxiv.org/search/cs?searchtype=author&query=Makoviychuk,+V)

and of another environment, kuka-allegro-lift, implemented in the paper:
Visuomotor Policies to Grasp Anything with Dexterous Hands
(https://dextrah-rgb.github.io/)
[Ritvik Singh](https://www.ritvik-singh.com/), [Arthur
Allshire](https://allshire.org/), [Ankur
Handa](https://ankurhanda.github.io/), [Nathan
Ratliff](https://www.nathanratliff.com/), [Karl Van
Wyk](https://scholar.google.com/citations?user=TCYAoF8AAAAJ&hl=en)


Though this is a remake, it ends up differing quite a lot in
environment details, for reasons such as:
1. Simplifying the reward structure,
2. Unifying the environment implementation,
3. Standardizing the MDP,
4. Using the manager-based API

That, in my opinion, makes studying and extending the environments more
accessible and analyzable. For example, you can train a lift policy first
and then continue from that checkpoint in the reorientation environment,
since they share the observation space. : ))

It is best to consider this a very careful re-interpretation rather
than an exact migration of the originals to IsaacLab.

Here is the training curve if you just train with
`./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task
Isaac-Dexsuite-Kuka-Allegro-Lift-v0 --num_envs 8192 --headless`

`./isaaclab.sh -p -m torch.distributed.run --nnodes=1 --nproc_per_node=4
scripts/reinforcement_learning/rsl_rl/train.py --task
Isaac-Dexsuite-Kuka-Allegro-Reorient-v0 --num_envs 40960 --headless
--distributed`

lift training ~ 4 hours
reorientation training ~ 2 days

Note that reorientation requires an order of magnitude more data and
time to converge compared to lift, under an almost identical setup.

Training curve (screen capture from Wandb), reward:
Cyan: reorient, Purple: Lift
<img width="1487" height="780" alt="Screenshot from 2025-09-07 22-58-13"
src="https://github.com/user-attachments/assets/bfa911de-4fee-4c0d-b39c-e9c33fae28f4"
/>


video results 
lift

![cone_lift](https://github.com/user-attachments/assets/e626eadb-b281-4ec9-af16-57f626fcc6aa)

![fat_capsule_lift](https://github.com/user-attachments/assets/cde57d4c-ceb2-40ab-88dd-44320da689c5)

reorient

![cube_reorient](https://github.com/user-attachments/assets/752809cb-ea19-4701-b124-20c1909e4566)

![rod_reorient](https://github.com/user-attachments/assets/f009605a-d93c-491c-b124-ff08606c63ec)


Memo:
I really enjoyed working on this remake, and I hope whoever plans to
play with and extend it finds it as helpful and joyful as I did. I will
be very excited to see what you come up with : ))

Octi


CAUTION:
Do Not Merge until the asset is uploaded to S3 bucket!

Fixes # (issue)

- New feature (non-breaking change which adds functionality)

## Screenshots

Please attach before and after screenshots of the change if applicable.


## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

@romesco
Contributor

romesco commented Sep 10, 2025

Just want to comment and say, awesome work Octi!!

ooctipus added a commit to ooctipus/IsaacLab that referenced this pull request Sep 20, 2025
…c-sim#3378)

@ooctipus ooctipus deleted the dexsuite_state_only branch October 22, 2025 21:39