Conversation

@zucchini-nlp (Member) commented Oct 17, 2025

What does this PR do?

Deletes deprecated code for the v5 release. cc @yonigozlan and @eustlb to verify that we can really delete these, since it is mostly deprecated stuff from processors.

For core maintainers:

  1. The `max_size` and pad-pixels args can't be deleted because many models on the Hub still use them. Instead, I decided to stop raising a warning and to silently load `max_size` when present. However, users can no longer pass these args when calling the processor (see the sketch after this list).
  2. For SDPA attention weights, the warning is reworded to push us to update those models. Usually, if a model has the new attention API, the warning is raised internally in `sdpa_attention`, but for old models we have to manually duplicate the warning message.
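
For illustration, a minimal sketch of the behavior described in point 1; the class and attribute names here are hypothetical, not the actual processor code:

```python
# Hypothetical sketch of the new `max_size` handling: configs saved on the
# Hub with a legacy `max_size` still load silently, but passing the arg at
# call time is rejected.
from typing import Optional


class ExampleImageProcessor:
    def __init__(self, size: Optional[dict] = None, **kwargs):
        # Fold a legacy `max_size` from a saved config into `size` without
        # emitting a deprecation warning.
        max_size = kwargs.pop("max_size", None)
        if max_size is not None:
            size = {**(size or {}), "longest_edge": max_size}
        self.size = size

    def __call__(self, images, **kwargs):
        # Call-time usage of the removed arg is no longer supported at all.
        if "max_size" in kwargs:
            raise ValueError(
                "`max_size` cannot be passed when calling the processor; set it in `size` instead."
            )
        return images  # placeholder for the real preprocessing
```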

@yonigozlan (Member) left a comment

Love this so much 🥲, thanks for taking the time to do this @zucchini-nlp!
My only reservation is `max_size`; I'll try to scan the Hub to see how bad the damage would be right now, and open PRs where necessary.

@eustlb (Contributor) left a comment

Thanks a lot for taking care of that!
General comment:

For every model that had something like

```python
logger.warning_once(
    "MimiModel is using MimiSdpaAttention, but `torch.nn.functional.scaled_dot_product_attention` does not support `output_attentions=True`. Falling back to the manual attention implementation, "
    'but specifying the manual implementation will be required from Transformers version v5.0.0 onwards. This warning can be removed using the argument `attn_implementation="eager"` when loading the model.'
)
```

because it was copied from Gemma, I think we should keep a warning. I've seen your answer above about that, but doesn't it only apply when `output_attentions` is set in the config, not in the forward? Therefore, until those implementations rely on ALL_ATTENTION_FUNCTIONS, we still have to handle the warning here to ensure we don't silently skip outputting attentions (see the sketch below).
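
As a sketch, the manual fallback in question looks roughly like this; the class and helper-method names are illustrative, not the actual Mimi code:

```python
# Illustrative sketch: SDPA cannot return attention weights, so the module
# itself has to warn and fall back to the eager implementation, instead of
# silently returning `None` for the weights.
from transformers.utils import logging

logger = logging.get_logger(__name__)


class ExampleSdpaAttention:
    def forward(self, hidden_states, attention_mask=None, output_attentions=False):
        if output_attentions:
            logger.warning_once(
                "`torch.nn.functional.scaled_dot_product_attention` does not support "
                "`output_attentions=True`. Falling back to the manual attention implementation."
            )
            return self._eager_forward(hidden_states, attention_mask, output_attentions)
        return self._sdpa_forward(hidden_states, attention_mask)

    def _eager_forward(self, hidden_states, attention_mask, output_attentions):
        ...  # manual attention that can materialize the weights

    def _sdpa_forward(self, hidden_states, attention_mask):
        ...  # fast path via torch.nn.functional.scaled_dot_product_attention
```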

Review from my side:
core models (meaning ones that do not rely on modular):

  • moshi
  • mimi
  • clap
  • seamless_m4t
  • unispeech
  • unispeech_sat
  • wav2vec2
  • wav2vec2_bert
  • whisper

the rest are impacted by modular directly:

  • hubert (wav2vec2)
  • kyutai_speech_to_text (moshi)
  • sew (wav2vec2)
  • sew_d (wav2vec2)
  • wavlm (wav2vec2)
  • data2vec_audio (wav2vec2)

@zucchini-nlp (Member, Author) commented

> Therefore, until those implementations rely on ALL_ATTENTION_FUNCTIONS

Ah, after looking, I see that it's not going to raise any warning unless we use ALL_ATTENTION_FUNCTIONS. In that case, I think we can reword the warning to say that attention weights will not be returned, and remove the workaround. Users have been warned about SDPA, so I think it will not be a big problem (a sketch of the reworded message is below).

It will also nudge us to incorporate ALL_ATTENTION_FUNCTIONS. WDYT?
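
For concreteness, the rewording could look something like this; the exact message is a proposal, not merged text:

```python
# Proposed shape of the reworded warning: no eager fallback, just an
# explicit note that the weights are skipped.
from transformers.utils import logging

logger = logging.get_logger(__name__)

logger.warning_once(
    "`torch.nn.functional.scaled_dot_product_attention` does not support "
    "`output_attentions=True`. Attention weights will not be returned; load the "
    'model with `attn_implementation="eager"` to get them.'
)
```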

@eustlb (Contributor) commented Oct 17, 2025

Yep, totally agree!

```python
self,
input_ids: Optional[torch.LongTensor] = None,
attention_mask: Optional[torch.FloatTensor] = None,
position_ids: Optional[torch.LongTensor] = None,
```
@zucchini-nlp (Member, Author) commented on this diff:

The model doesn't use positions; I believe this was simply copied from other models without checking. The unused argument can simply be dropped, as sketched below.
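
A sketch of the trimmed signature (a fragment for illustration, not the exact diff):

```python
# Sketch: drop the unused `position_ids` argument from the signature.
from typing import Optional

import torch


def forward(
    self,
    input_ids: Optional[torch.LongTensor] = None,
    attention_mask: Optional[torch.FloatTensor] = None,
):
    ...
```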

@zucchini-nlp (Member, Author) commented

Hub down? 😢

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions (Contributor) commented

[For maintainers] Suggested jobs to run (before merge)

run-slow: aimv2, altclip, beit, blip_2, bloom, bridgetower, clap, conditional_detr, data2vec, deepseek_v2, deformable_detr, detr, grounding_dino

@zucchini-nlp (Member, Author) commented

OK, I think it is done now. @yonigozlan @eustlb, feel free to take another look; I will also request a core maintainer's review.
