
Conversation

@younesbelkada (Contributor) commented Oct 16, 2023

What does this PR do?

Replaces #26560
Fixes #26451

Proposes a simpler fix for FA-2 + PEFT + quantization fine-tuning, where users usually cast all other modules (e.g. LayerNorms) to fp32 for training stability.

With #26761 introduced, it is now much simpler to retrieve the model's original dtype. Note also that self.config._pre_quantization_dtype remains the single source of truth, since the original dtype cannot be read back from the weights of quantized models.
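
To illustrate the idea, a minimal sketch (the helper names below are hypothetical and not the actual code in the diff):

    import torch

    def get_target_dtype(module, config):
        # For quantized models the weights no longer expose the original dtype,
        # so config._pre_quantization_dtype (recorded at quantization time) is
        # the single source of truth; otherwise read it from a projection weight.
        if hasattr(config, "_pre_quantization_dtype"):
            return config._pre_quantization_dtype
        return module.q_proj.weight.dtype

    def cast_back_if_needed(hidden_states, target_dtype):
        # Flash Attention 2 only supports fp16/bf16 inputs; if LayerNorms or
        # embeddings were upcast to fp32 for training stability, the hidden
        # states arrive in fp32 and must be cast back before the FA-2 kernel.
        if hidden_states.dtype == torch.float32:
            hidden_states = hidden_states.to(target_dtype)
        return hidden_states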

cc @ArthurZucker @pacman100

Also added a test for this.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@ArthurZucker (Collaborator) left a comment

Thanks, I think we can simplify a bit and remove the warning?

Comment on lines 417 to 421
                logger.warning_once(
-                   "The input hidden states seems to be silently casted in float32, this might be related to"
-                   " the fact you have upcasted embedding or layer norm layers in float32. We will cast back the input in"
-                   " float16."
+                   f"The input hidden states seems to be silently casted in float32, this might be related to"
+                   f" the fact you have upcasted embedding or layer norm layers in float32. We will cast back the input in"
+                   f" {target_dtype}."
                )
Collaborator:
I think we can remove this now no?

Contributor (Author):
Hmm, I think we need to keep it to inform users about that.
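
For context on why the warning stays: the typical setup it targets looks roughly like this (a sketch assuming a bitsandbytes 4-bit + PEFT workflow; the model name is only an example and exact loading flags may differ across versions):

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import prepare_model_for_kbit_training

    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf",
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),
        torch_dtype=torch.float16,
        use_flash_attention_2=True,
    )
    # Upcasts the non-quantized modules (e.g. LayerNorms) to fp32 for training
    # stability; the fp32 hidden states then silently reach the FA-2 layers,
    # which is exactly the case the warning (and the cast-back) covers.
    model = prepare_model_for_kbit_training(model)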

@younesbelkada merged commit 5a73316 into huggingface:main on Oct 18, 2023
@younesbelkada deleted the fa-2-final-fix branch on Oct 18, 2023, 21:13
EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 19, 2023
* final fix for FA2 dtype

* try

* oops

* Update src/transformers/models/falcon/modeling_falcon.py

Co-authored-by: Arthur <[email protected]>

* apply fix everywhere

---------

Co-authored-by: Arthur <[email protected]>

Successfully merging this pull request may close these issues.

The hidden states in LlamaFlashAttention2 are cast in fp16 unexpectedly
