Feature request
We would like to make the testing suite in this repository more device agnostic. It seems there has already been some work towards this; however, the majority of tests will still only run on either GPU or CPU. This would require changes to a large number of tests in the library, but it would not alter the behaviour of Hugging Face's CI runners.
A non-exhaustive list of changes would be:
- Add a new test decorator `@require_torch_with_accelerator` that largely supersedes (but does not replace) `@require_torch_gpu`. This new decorator can be used for any device agnostic test that we would like to accelerate. We would keep `@require_torch_gpu` for tests that truly require CUDA features, such as ones that check device memory utilisation (e.g. in model parallelism or lower precision tests) or use custom CUDA kernels (such as Flash Attention).
- Certain tests could be made device agnostic quite easily, such as tests that only check for CUDA devices to enable fp16, tests that use backend specific PRNG initialisation, or tests that clear the cache before executing. This could be done by adding device agnostic variants to `testing_utils.py` that compare the current device in use and dispatch to the appropriate backend specific function if available.
  - For example, rather than the comparison `torch_device == 'cuda'` to check whether we can run with fp16, we could call a function `testing_utils.accelerator_is_fp16_available(torch_device)` or similar. Similar functions already exist to check for tf32 or bf16 support. A sketch of what these helpers might look like follows this list.
- Crucially, in upstream we would only have settings for CUDA and CPU devices – as well as any other backends you support. However, we would expose functions to register your own device in user code, so third parties can test custom backends without upstreaming changes (see the registration sketch below).
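To make this concrete, here is a minimal sketch of what the new helpers in `testing_utils.py` might look like. Only the names `require_torch_with_accelerator` and `accelerator_is_fp16_available` come from the list above; the dispatch-table layout and everything else is an illustrative assumption rather than a finalised design. Existing decorators such as `@require_torch_gpu` already follow a similar thin `unittest.skipUnless` pattern:

```python
import unittest

import torch

# Assumption: testing_utils.py already exposes a `torch_device` string; this
# standalone sketch recomputes a simplified version of it.
torch_device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical per-backend dispatch table. Upstream would pre-populate the
# CUDA and CPU entries; third parties could add their own (see next sketch).
_fp16_support = {
    "cuda": lambda: True,   # assume fp16 works on any CUDA device
    "cpu": lambda: False,
}


def accelerator_is_fp16_available(device):
    """Device agnostic replacement for `torch_device == 'cuda'` fp16 checks."""
    check = _fp16_support.get(device)
    return check is not None and check()


def require_torch_with_accelerator(test_case):
    """Skip a test unless *some* accelerator is available, not only CUDA."""
    return unittest.skipUnless(
        torch_device != "cpu", "test requires an accelerator"
    )(test_case)


# Usage: the decorator gates the whole test, and the helper replaces the
# hard-coded CUDA string comparison inside it.
@require_torch_with_accelerator
class ExampleTest(unittest.TestCase):
    def test_fp16(self):
        if not accelerator_is_fp16_available(torch_device):
            self.skipTest("fp16 not supported on this backend")
```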
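Likewise, a sketch of the user-code registration hook from the last bullet, showing how a third party might plug in a custom backend without forking. Every name here (`register_backend_function`, `backend_dispatch`, the fictional `"npu"` device) is hypothetical:

```python
import torch

# Hypothetical registry mapping (device, function_name) -> callable. Upstream
# would fill in CUDA/CPU defaults; user code adds entries for custom backends.
_BACKEND_FUNCTIONS = {}


def register_backend_function(device, name, fn):
    """Expose backend specific behaviour to the test suite from user code."""
    _BACKEND_FUNCTIONS[(device, name)] = fn


def backend_dispatch(device, name, default=None):
    """Look up a backend specific function, falling back to `default`."""
    return _BACKEND_FUNCTIONS.get((device, name), default)


# Example: a vendor registers a fictional "npu" backend from their own
# conftest.py, without touching the upstream repository.
register_backend_function("npu", "manual_seed", torch.manual_seed)
register_backend_function("npu", "empty_cache", lambda: None)  # no-op cache clear
register_backend_function("npu", "is_fp16_available", lambda: True)

# The test suite would then dispatch on the current device:
seed_fn = backend_dispatch("npu", "manual_seed", default=torch.manual_seed)
seed_fn(42)
```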
Motivation
As Hugging Face libraries and models make up a significant part of the current ML community, it makes sense, when developing custom PyTorch backends, to test against these model libraries, as they cover a large proportion of most users' use cases.
However, the current testing suite does not easily allow for custom devices – not without maintaining a private fork that must be continuously kept up to date with the upstream repository. For this reason, and because the number of changes required is not especially significant, we are making this proposal.
Your contribution
We would write and submit a PR to implement these changes, following discussion with and approval from the 🤗 Transformers maintainers.