@tohtana tohtana commented Sep 28, 2025

This PR improves the usability of the leaf module feature.

Here are the changes:

  • Allow enabling the leaf module via both the DeepSpeed config and APIs.
  • Relax matching criteria to support class-based matching.
  • Support multiple ways of specifying the target module: class, class name (with or without package name), module name, or suffix.
  • Add documentation to the training guide, including config snippets and explanations of default behavior.
  • Add default classes (e.g., Mixtral, Qwen2/Qwen3) that automatically enable the leaf module feature. Requests to add more classes are welcome.
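
To make the matching rules listed above concrete, here is a minimal, self-contained sketch of how a target spec (a class, a class name with or without its package, a module name, or a name suffix) could be resolved against a module tree. It is an illustration only, not DeepSpeed's implementation: `Module`, `MoEBlock`, `Expert`, `Attention`, `matches`, `find_leaf_modules`, and `build_toy_model` are all hypothetical names, and the rule that a suffix must align with a dotted name boundary is an assumption.

```python
from typing import Dict, List, Union


class Module:
    """Tiny stand-in for torch.nn.Module so the sketch runs without PyTorch."""

    def __init__(self) -> None:
        self.children: Dict[str, "Module"] = {}

    def add(self, name: str, child: "Module") -> "Module":
        self.children[name] = child
        return child


# Hypothetical module classes used only for this illustration.
class MoEBlock(Module):
    pass


class Expert(Module):
    pass


class Attention(Module):
    pass


Spec = Union[type, str]


def matches(name: str, module: Module, spec: Spec) -> bool:
    """Return True if `module` (registered under dotted `name`) matches `spec`."""
    if isinstance(spec, type):
        # Class-based matching; subclasses match as well.
        return isinstance(module, spec)
    cls = type(module)
    qualified = f"{cls.__module__}.{cls.__qualname__}"
    return (
        spec == cls.__name__              # class name without package
        or spec == qualified              # class name with package
        or spec == name                   # full module name
        or name.endswith("." + spec)      # suffix (assumed dotted boundary)
    )


def find_leaf_modules(root: Module, specs: List[Spec], prefix: str = "") -> List[str]:
    """Collect names of modules matching any spec.

    A matched module is treated as a leaf, so its children are not visited.
    """
    found: List[str] = []
    for name, child in root.children.items():
        full = f"{prefix}.{name}" if prefix else name
        if any(matches(full, child, s) for s in specs):
            found.append(full)
        else:
            found.extend(find_leaf_modules(child, specs, full))
    return found


def build_toy_model() -> Module:
    """Build layers.0.{attn, moe.experts.0} as a small example tree."""
    root = Module()
    layer = root.add("layers", Module()).add("0", Module())
    layer.add("attn", Attention())
    moe = layer.add("moe", MoEBlock())
    moe.add("experts", Module()).add("0", Expert())
    return root


root = build_toy_model()
print(find_leaf_modules(root, [MoEBlock]))         # match by class
print(find_leaf_modules(root, ["MoEBlock"]))       # match by class name
print(find_leaf_modules(root, ["layers.0.attn"]))  # match by module name
print(find_leaf_modules(root, ["moe"]))            # match by suffix
```

In this sketch, passing both `MoEBlock` and `Expert` returns only `layers.0.moe`, because traversal stops at the first matched module. That mirrors the idea of a leaf module being handled as a single unit, though the real feature may resolve overlapping specs differently.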
