Releases: leggedrobotics/rsl_rl
v3.1.0
Overview
Full Changelog: v3.0.1...v3.1.0
Added
- Adds state-dependent standard deviation for the PPO actor by @iakinola23 in #112
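To illustrate the difference between a state-independent and a state-dependent standard deviation, here is a minimal pure-Python sketch. It is not the rsl_rl implementation: the `linear` helper, the weights, and the choice to predict a log-std with a linear head are all assumptions made for illustration.

```python
import math


def linear(x, weight, bias):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


# State-independent: one learned log-std per action dimension, shared by all states.
log_std_param = [-0.5]  # hypothetical learned parameter


def state_independent_std(obs):
    # The observation is ignored; every state gets the same std.
    return [math.exp(v) for v in log_std_param]


# State-dependent: a hypothetical linear head maps the observation to a log-std.
W_std = [[0.5, -1.0]]  # hypothetical weights (1 action dim, 2 obs dims)
b_std = [0.0]


def state_dependent_std(obs):
    # exp() keeps the predicted std strictly positive.
    return [math.exp(v) for v in linear(obs, W_std, b_std)]


obs_a, obs_b = [1.0, 0.0], [0.0, 1.0]
same = state_independent_std(obs_a) == state_independent_std(obs_b)
differs = state_dependent_std(obs_a) != state_dependent_std(obs_b)
```

With a state-dependent head, the exploration noise can shrink in states where the policy is confident and grow elsewhere, at the cost of extra parameters.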
Fixed
- Allows torch device type to be a string in VecEnv by @kevinzakka in #109
New Contributors
- @iakinola23 made their first contribution in #112
v3.0.1
Overview
Full Changelog: v3.0.0...v3.0.1
Fixed
- Removes hardcoded policy obs group for symmetry by @ClemensSchwarke in #111
v3.0.0
Overview
RSL RL now supports observation dictionaries using the TensorDict library. Different observation groups with different shapes can thus be handled seamlessly, e.g., vision inputs. To assign different observation groups to the correct part of the policy, a dictionary maps
Additionally, the code has been refactored to be more modular and flexible. The main changes are:
- An additional runner class for student-teacher distillation
- An MLP class that can be used to build custom policies
- Normalization is now part of the policy and can be set for different parts, e.g., actor and critic, separately.
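As a rough illustration of the observation-group mapping described in the overview above, the sketch below uses plain Python dicts and lists in place of a TensorDict and tensors. The group names (`proprio`, `camera`, `privileged`), the `obs_groups` mapping, and the `select_obs` helper are hypothetical and are not taken from the rsl_rl API.

```python
# Hypothetical observation dictionary for one environment step.
# Lists stand in for tensors; a real setup would use a TensorDict.
obs = {
    "proprio": [0.1, -0.2, 0.3],
    "camera": [0.5, 0.6],
    "privileged": [1.0],
}

# Hypothetical mapping from a part of the policy to the groups it consumes.
obs_groups = {
    "actor": ["proprio", "camera"],
    "critic": ["proprio", "privileged"],
}


def select_obs(obs, obs_groups, part):
    """Concatenate the observation groups assigned to one part of the policy."""
    flat = []
    for group in obs_groups[part]:
        flat.extend(obs[group])
    return flat


actor_input = select_obs(obs, obs_groups, "actor")
critic_input = select_obs(obs, obs_groups, "critic")
```

Flat concatenation is used here only for brevity. Groups with different shapes, such as images, would typically be routed to separate encoders rather than concatenated.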
Full Changelog: v2.3.3...v3.0.0
Added
- Adds support for observation dictionaries and refactors code for better modularity by @ClemensSchwarke in 6983041
- Renames observation types to observation sets by @Mayankm96 in 830fa98
- Allows the policy to be loaded on CPU by @kevinzakka in #98
Breaking Changes
- Isaac Lab does not yet support the new observation handling. There is an open PR (isaac-sim/IsaacLab#2962) that can be used until the changes are merged.
New Contributors
- @kevinzakka made their first contribution in #98
v2.3.3
Overview
Full Changelog: v2.3.2...v2.3.3
Fixed
- Adds the RNN parameters to the optimizer for Distillation by @ClemensSchwarke in #94
v2.3.2
Overview
Full Changelog: v2.3.1...v2.3.2
Added
- Adds gradient cap for teacher student distillation by @alessandroassirelli98 in #91
Fixed
- Fixes unexpected keyword argument learning_rate within RandomNetworkDistillation by @ozhanozen in #87
New Contributors
- @alessandroassirelli98 made their first contribution in #91
- @ozhanozen made their first contribution in #87
v2.3.1
Overview
Full Changelog: v2.3.0...v2.3.1
Added
- Changes ETA to hh:mm:ss format by @renezurbruegg in #75
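A minimal sketch of hh:mm:ss formatting for an ETA given in seconds follows. The function name `format_eta` is hypothetical, and the runner's actual formatting code may differ.

```python
def format_eta(seconds):
    """Format a duration in seconds as hh:mm:ss (hours may exceed 24)."""
    total = max(0, int(seconds))  # truncate fractions, clamp negatives to zero
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


eta_text = format_eta(3725)  # 1 hour, 2 minutes, 5 seconds
```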
Fixed
- Fixes git repository code storage function by @Mayankm96 in #83
- Fixes padding shape in split_and_pad_trajectories to support arbitrary additional dimensions by @bikcrum in #77
- Disables distribution mean gradient propagation into action noise std for StudentTeacher by @flferretti in #82
New Contributors
- @renezurbruegg made their first contribution in #75
- @flferretti made their first contribution in #82
v2.3.0
Overview
RSL RL now supports distributed training. Additionally, a new distillation algorithm allows for student-teacher training.
Full Changelog: v2.2.4...v2.3.0
Added
- Adds Student-Teacher Distillation by @ClemensSchwarke in bbce4ef
- Adds Distillation for recurrent networks by @ClemensSchwarke in d3dbcc3
- Adds Multi-GPU training for PPO and Distillation by @Mayankm96 in 6f8460a
Fixed
- Changes WandB runner name to the log directory name by @Mayankm96 in b9f9e69
- Renames rnn_hidden_size to rnn_hidden_dim for naming consistency by @ClemensSchwarke
Breaking Changes
- Renamed actor_critic to policy to be more general and align with other architectures and algorithms by @ClemensSchwarke in bbce4ef
v2.2.4
Overview
Full Changelog: v2.2.3...v2.2.4
Fixed
- Accounts for start_iter when computing ETA by @PeterMitrano in #29
- Fixes parsing if rnd and symmetry configs are not available by @pascal-roth in #72
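The start_iter fix listed above matters when training resumes from a checkpoint: the time elapsed in the current run should be divided by the iterations completed in this run, not by the absolute iteration count. A minimal sketch, assuming a hypothetical `estimate_eta` helper whose argument names are not taken from the library:

```python
def estimate_eta(elapsed_s, iters_done, start_iter, total_iters):
    """Estimate remaining seconds for a run that resumed at start_iter.

    elapsed_s:   wall-clock seconds spent in the current run
    iters_done:  absolute number of iterations finished so far
    start_iter:  absolute iteration at which the current run started
    total_iters: absolute iteration at which training stops
    """
    done_this_run = iters_done - start_iter
    if done_this_run <= 0:
        raise ValueError("no iterations completed in this run yet")
    seconds_per_iter = elapsed_s / done_this_run
    return seconds_per_iter * max(0, total_iters - iters_done)


# Resumed at iteration 1000, now at 1050 after 100 s, stopping at 1500.
eta_s = estimate_eta(100.0, 1050, 1000, 1500)

# Ignoring start_iter would divide by 1050 and underestimate the ETA.
naive_eta_s = 100.0 / 1050 * (1500 - 1050)
```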
New Contributors
- @PeterMitrano made their first contribution in #29
- @pascal-roth made their first contribution in #72
v2.2.3
Overview
This release adds some new parameters to PPO which help make the training more stable.
Full Changelog: v2.2.2...v2.2.3
Added
- Adds flag for per-batch advantage normalization by @Mayankm96 in #68
- Adds support for log-std parameter in ActorCritic by @Mayankm96 in #67
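The per-batch advantage normalization flag listed above can be pictured with the following pure-Python sketch. It contrasts normalizing over the whole rollout with normalizing inside each mini-batch. The function names, the `per_batch` argument, the use of the population standard deviation, and the `eps` value are assumptions for illustration, not the library's code.

```python
import math


def normalize(values, eps=1e-8):
    """Shift to zero mean and scale to (approximately) unit std."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # population std
    return [(v - mean) / (std + eps) for v in values]


def minibatch_advantages(advantages, batch_size, per_batch):
    """Split advantages into mini-batches, normalizing globally or per batch."""
    # Global mode: normalize once over the full rollout, then split.
    source = advantages if per_batch else normalize(advantages)
    batches = []
    for i in range(0, len(source), batch_size):
        batch = source[i:i + batch_size]
        # Per-batch mode: each mini-batch gets its own mean and std.
        batches.append(normalize(batch) if per_batch else batch)
    return batches


advantages = [1.0, 3.0, 10.0, 20.0]
per_batch = minibatch_advantages(advantages, batch_size=2, per_batch=True)
global_norm = minibatch_advantages(advantages, batch_size=2, per_batch=False)
```

In per-batch mode both mini-batches end up close to `[-1, 1]`, whereas in global mode the second mini-batch keeps its larger values relative to the first.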
Fixed
- Fixes mean_entropy logging by dividing by num_updates by @bikcrum in #65
- Corrects disabling of arguments when creating Normal distribution by @Mayankm96 in #69
v2.2.2
Overview
Full Changelog: v2.2.1...v2.2.2
Fixed
- Fixes bug in ActorCriticRecurrent hidden state reset by @jnskkmhr in #50
- Stops gradient propagation through ActorCritic std-dev by @Mayankm96 in #66
- Removes unused attributes from VecEnv in 8818338
- Fixes weight schedule dict for RND in 6909a47