Open
Milestone
Description
- Support generic multi-step RL scenarios (infra & algorithm)
- Further optimize utilization of computational resources
- Support automatic load balancing
- Improve robustness of `WorkflowRunner` --> Upgraded to `Scheduler`
- Support training with Megatron-LM
- Implement more algorithms for off-policy / asynchronous RL
- Implement more advanced sampling strategies for task / experience buffer
- Further integrate RL process with advanced data processing functionalities
- Continue refining Trinity-Studio