Feature request
Port AdamW8bit support for CPU from the multi-backend-refactor branch to the main branch.
Motivation
GPU machines from public cloud providers are usually expensive, while datacenter-grade CPUs are more readily available at lower prices. Towards the goal of making deep learning more accessible to developers and learners, the ability to fine-tune with AdamW8bit on CPU seems like a good milestone. TorchTune currently cannot offer full fine-tuning on CPU with AdamW8bit because it relies on bitsandbytes' AdamW8bit optimizer, which does not support CPU on the main branch.
#898 enabled AdamW8bit for CPU in the multi-backend-refactor branch, but the main branch does not have it yet.
It would be great to enable AdamW8bit for CPU in the bitsandbytes main branch before TorchTune's next release (provided there is a bitsandbytes release before then), so that users installing TorchTune would automatically get a bitsandbytes version that supports AdamW8bit on CPU.
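
For concreteness, here is a minimal sketch of the workflow this would unlock, assuming the multi-backend-refactor behavior carries over unchanged; the tiny linear model is a hypothetical stand-in for a TorchTune recipe:

```python
# Hypothetical minimal repro: on multi-backend-refactor this runs on a
# CPU-only machine; on the current main branch the 8-bit optimizer step
# requires a CUDA device.
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(128, 128)  # stand-in for a TorchTune model, kept on CPU
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=1e-4)

x = torch.randn(4, 128)
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()       # the step that #898 enabled on CPU
optimizer.zero_grad()
```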
Thanks!
Your contribution
@jianan-gu could port his code over from the multi-backend-refactor branch to the main branch.