🔴 Remove head mask in generative models #35786

zucchini-nlp · 2025-01-20T10:11:30Z

What does this PR do?

Unblocks me on #35314 (comment). This PR raises an error whenever one tries to use head_mask with SDPA/FA2, instead of raising a warning and redirecting to eager.

Note that we didn't change vision models like ViT in this PR. Those models will simply fall back to eager and have no problems because they are not generative (no cache/no attn mask).

See the linked comment for the reasons why we need this.
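To make the behavior change described above concrete, here is a minimal sketch. It is not the code from this PR: `select_backend_old` and `select_backend_new` are hypothetical helper names, and only the backend strings (`"eager"`, `"sdpa"`, `"flash_attention_2"`) correspond to values the library accepts for `attn_implementation`.

```python
import warnings


def select_backend_old(attn_implementation, head_mask=None):
    """Hypothetical sketch of the previous behavior: warn, then fall back to eager."""
    if head_mask is not None and attn_implementation != "eager":
        warnings.warn(
            f"head_mask is not supported with {attn_implementation!r}; "
            "falling back to eager attention."
        )
        return "eager"
    return attn_implementation


def select_backend_new(attn_implementation, head_mask=None):
    """Hypothetical sketch of the behavior proposed here: fail loudly instead."""
    if head_mask is not None and attn_implementation != "eager":
        raise ValueError(
            f"head_mask is not supported with {attn_implementation!r}; "
            "load the model with attn_implementation='eager' to use it."
        )
    return attn_implementation
```

The second variant trades convenience for explicitness: the caller finds out immediately that the requested backend cannot honor `head_mask`, rather than being switched to a slower path behind a warning.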

gante

🙏

(I think raising an exception is indeed the right thing to do, as we were not respecting the head_mask at all)

ArthurZucker

Actually happy to have this, mostly given that you are refactoring as well! TBH I don't mind ignoring the head mask if the docs explicitly mention that only eager uses it. That might be less work for us, fewer warnings, and aligned with the kwargs that can be passed to the attn integration functions.

zucchini-nlp · 2025-02-13T17:41:16Z

@ArthurZucker yeah, just removing it also works since no one seems to use it anyway. I updated each model's docs with a small note about the head mask, and the models now silently ignore it.

Maybe we can simply remove it from valid args in v5.0?
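As a rough illustration of what "only eager uses it" means in practice, the sketch below uses plain Python lists instead of tensors. Everything here is an assumption made for illustration: `eager_attention` and `attention` are hypothetical names, each value is a single scalar per key, and the non-eager branch reuses the eager math as a stand-in for a fused kernel. The one detail taken from eager implementations is that the head mask scales each head's attention weights after the softmax.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def eager_attention(scores, values, head_mask=None):
    """scores[h][q][k] are raw attention scores; values[h][k] is one scalar per key."""
    outputs = []
    for h, head_scores in enumerate(scores):
        head_out = []
        for q_scores in head_scores:
            weights = softmax(q_scores)
            if head_mask is not None:
                # A mask entry of 0.0 zeroes out the whole head, 1.0 keeps it.
                weights = [w * head_mask[h] for w in weights]
            head_out.append(sum(w * v for w, v in zip(weights, values[h])))
        outputs.append(head_out)
    return outputs


def attention(scores, values, head_mask=None, attn_implementation="eager"):
    """Hypothetical dispatch: non-eager backends silently drop head_mask."""
    if attn_implementation != "eager":
        # Fused kernels never expose the attention weights, so there is
        # nothing to multiply the mask into; the argument is ignored.
        head_mask = None
    return eager_attention(scores, values, head_mask)
```

With two heads and `head_mask=[1.0, 0.0]`, the eager path returns zeros for the second head, while the same call with `attn_implementation="sdpa"` returns the unmasked output for both heads.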

HuggingFaceDocBuilderDev · 2025-02-13T18:29:09Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ArthurZucker

LGTM, maybe add a red light to the PR name!

zucchini-nlp requested review from ArthurZucker, Rocketknight1 and eustlb as code owners January 20, 2025 10:11

zucchini-nlp mentioned this pull request Jan 20, 2025

Bart: new cache format #35314

Merged

gante approved these changes Jan 29, 2025

ArthurZucker reviewed Feb 12, 2025

zucchini-nlp mentioned this pull request Feb 14, 2025

[CI] Check test if the GenerationTesterMixin inheritance is correct 🐛 🔫 #36180

Merged

zucchini-nlp mentioned this pull request Apr 18, 2025

[VLMs] support attention backends #37576

Merged

ArthurZucker approved these changes May 14, 2025

just squash into one commit

7ae5ef1

zucchini-nlp force-pushed the remove-head-mask branch from ca2fd1e to 7ae5ef1 Compare May 14, 2025 16:45

delete print

c500cbb

zucchini-nlp merged commit 955e61b into huggingface:main May 15, 2025
20 checks passed

zucchini-nlp changed the title ~~Remove head mask in generative models~~ 🔴 Remove head mask in generative models May 15, 2025

zucchini-nlp mentioned this pull request Jun 5, 2025

[generation] bring back tests on vision models #38603

Merged
