This issue is intended to centralize the discussion and documentation of the RAD training process, including pipeline details, loss design, and common debugging tricks.
Recommendation: freeze the normalization layers during RL fine-tuning.
See the reference implementation in rad/utils.py.
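Below is a minimal sketch of one way to freeze normalization layers in PyTorch. It is not the actual rad/utils.py implementation; the helper name `freeze_norm_layers` and the set of layer types are assumptions for illustration.

```python
import torch.nn as nn

# Normalization layer types to freeze (illustrative selection).
_NORM_TYPES = (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d,
               nn.LayerNorm, nn.GroupNorm)

def freeze_norm_layers(model: nn.Module) -> None:
    """Put normalization layers into eval mode and stop their gradients."""
    for module in model.modules():
        if isinstance(module, _NORM_TYPES):
            module.eval()                      # stop updating running mean/var (BatchNorm)
            for param in module.parameters():  # freeze affine weight/bias
                param.requires_grad_(False)

# Usage: call after loading the pre-trained policy, and again after every
# model.train() call, since train() flips normalization layers back to training mode.
# freeze_norm_layers(policy)
```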
Learning rate is crucial for stable training.
A well-chosen learning rate plays a key role in ensuring smooth convergence during RL fine-tuning. If you are confident that the algorithm and implementation are correct, tuning the learning rate is often effective when the model fails to converge or trains unstably. Either an excessively large or an excessively small learning rate can significantly degrade convergence quality. A short sweep over nearby values, as sketched below, is usually enough to find a stable setting.
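The sketch below illustrates this tuning workflow. The function name `build_optimizer`, the optimizer choice, and the learning-rate values are assumptions for illustration, not settings from the RAD repository.

```python
import torch

def build_optimizer(policy: torch.nn.Module, lr: float = 1e-5) -> torch.optim.Optimizer:
    # Only optimize parameters that were not frozen (e.g., the normalization layers above).
    trainable = [p for p in policy.parameters() if p.requires_grad]
    return torch.optim.AdamW(trainable, lr=lr)

# A simple sweep: try a few learning rates spaced roughly 3x apart and keep the
# one with the most stable loss/return curve. Sudden spikes in the loss usually
# indicate the learning rate is too high; flat curves suggest it is too low.
# for lr in (3e-6, 1e-5, 3e-5):
#     optimizer = build_optimizer(policy, lr)
#     ...  # run a short fine-tuning trial and compare training curves
```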