Conversation

@amyeroberts
Contributor

@amyeroberts amyeroberts commented May 10, 2024

What does this PR do?

Kills a bunch of deprecated code

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@amyeroberts amyeroberts requested a review from ArthurZucker May 13, 2024 16:48
Collaborator

@ArthurZucker ArthurZucker left a comment

Thanks for the cleanup! A few nits here and there and we'll be good to go

Collaborator

Suggested change: remove the **kwargs, line

Collaborator

Suggested change: remove the **kwargs, line

Collaborator

Suggested change: remove the **kwargs, line

Collaborator

Same here: kwargs needs to be removed

Comment on lines 115 to 130
Collaborator

the buffers that are registered need to be deleted as well
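
A minimal sketch of what the registered buffers refer to (a throwaway nn.Module, not the actual transformers code): tensors attached with register_buffer stay on the module, and keep appearing in named_buffers(), until the registering call is dropped or the attribute is deleted.

```python
import torch
import torch.nn as nn

m = nn.Module()
# Deprecated-style cache registered as a buffer on the module
m.register_buffer("_cos_cached", torch.zeros(4))
print([name for name, _ in m.named_buffers()])  # ['_cos_cached']

# Removing the register_buffer call (or deleting the attribute) is what
# deleting the registered buffers amounts to here.
del m._cos_cached
print([name for name, _ in m.named_buffers()])  # []
```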

Collaborator

same comments as for llama!

Collaborator

same comment as llama

@amyeroberts
Contributor Author

@ArthurZucker Thanks for the review! I've removed all of the kwargs being passed now, as well as the _cos_cached and _sin_cached buffers
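
For context, a minimal sketch (an illustrative assumption, not the actual transformers source) of the shape this cleanup leaves behind: a rotary embedding whose forward no longer accepts a trailing **kwargs and computes cos/sin directly from position_ids instead of keeping deprecated _cos_cached / _sin_cached buffers.

```python
import torch
import torch.nn as nn


class RotaryEmbedding(nn.Module):
    """Illustrative rotary embedding without the deprecated cached buffers."""

    def __init__(self, dim: int, base: float = 10000.0):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)
        # No _cos_cached / _sin_cached buffers are registered here.

    @torch.no_grad()
    def forward(self, x: torch.Tensor, position_ids: torch.Tensor):
        # Build cos/sin on the fly for the requested positions; note the
        # signature carries no trailing **kwargs.
        inv_freq = self.inv_freq[None, :, None].expand(position_ids.shape[0], -1, 1)
        positions = position_ids[:, None, :].float()
        freqs = (inv_freq @ positions).transpose(1, 2)
        emb = torch.cat((freqs, freqs), dim=-1)
        return emb.cos().to(x.dtype), emb.sin().to(x.dtype)
```

Callers would then do cos, sin = rope(hidden_states, position_ids), with no extra keyword arguments threaded through.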

@amyeroberts amyeroberts force-pushed the remove-old-image-processor-warnings branch from fe89540 to 76cfebe May 16, 2024 10:05
@amyeroberts amyeroberts requested a review from ArthurZucker May 16, 2024 11:51
Collaborator

@ArthurZucker ArthurZucker left a comment

Thanks 🤗

@amyeroberts amyeroberts merged commit 57c965a into huggingface:main May 17, 2024
@amyeroberts amyeroberts deleted the remove-old-image-processor-warnings branch May 17, 2024 11:16
miguelm-almeida pushed a commit to mig-mfreitas/transformers-pic that referenced this pull request Jul 10, 2024
- Comply with the tensor building logic introduced in huggingface#30743
- Add a reference to the optimized Attention Factor equation
- Remove Dynamic YaRN for a more agile deployment

Co-authored-by: mig-mfreitas <[email protected]>
gante pushed a commit to gante/transformers that referenced this pull request Jul 23, 2024
- Comply with the tensor building logic introduced in huggingface#30743
- Add a reference to the optimized Attention Factor equation
- Remove Dynamic YaRN for a more agile deployment

Co-authored-by: mig-mfreitas <[email protected]>
gante added a commit that referenced this pull request Jul 23, 2024
* Add YaRN and Dynamic-YaRN RoPE Scaling Methods

YaRN (Yet another RoPE extension method) combines the NTK-By-Parts
Interpolation and Attention Scaling methods, improving upon existing
RoPE interpolation methods for longer context window sizes.

Fine-tuned models maintain their original performance across benchmarks
while enabling efficient extrapolation and transfer learning for
quicker convergence, especially in compute-limited environments.

We implement YaRN and Dynamic-YaRN for the following list of models:

 - LLaMA
 - Falcon
 - GPT-NeoX
 - Olmo
 - Persimmon
 - Phi
 - StableLM
 - OpenLLaMA

New unit tests are added to assert YaRN's correct behavior on both
short and long sequence inputs.

For more details, please refer to https://arxiv.org/abs/2309.00071.

Co-authored-by: Miguel Almeida <[email protected]>

* Refactor YaRN implementation for LLaMA

Iterate on YaRN implementation for LLaMA and remove diff from remaining
models for increased PR modularity.

This commit includes the following changes:
- Merge 'yarn_rope_scaling' and 'rope_scaling' dictionaries
- Remove unnecessary attributes ('extrapolation_factor' and 'finetuned')
  from YaRN classes
- Inherit 'forward' method in YaRN classes from superclass
- Rename 'yarn' method to 'compute_yarn_scaling'
- Extend YaRN tests with further assertions
- Fix style inconsistencies

Co-authored-by: Miguel Monte e Freitas <[email protected]>

* Refactor Tensor Building Logic for YaRN

- Comply with the tensor building logic introduced in #30743
- Add a reference to the optimized Attention Factor equation
- Remove Dynamic YaRN for a more agile deployment

Co-authored-by: mig-mfreitas <[email protected]>

* remove unwanted file

---------

Co-authored-by: Miguel Almeida <[email protected]>
Co-authored-by: mig-mfreitas <[email protected]>
Co-authored-by: Joao Gante <[email protected]>
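
To make the YaRN description in the commit message above more concrete, here is a rough sketch of the frequency computation it refers to: NTK-by-parts interpolation of the RoPE inverse frequencies plus the attention (softmax temperature) scaling factor. Parameter names such as beta_fast / beta_slow and their defaults are assumptions taken from the YaRN paper (https://arxiv.org/abs/2309.00071), not necessarily the exact API that landed in transformers.

```python
import math

import torch


def yarn_inv_freq(dim, base=10000.0, factor=8.0,
                  original_max_position=4096, beta_fast=32, beta_slow=1):
    """Sketch of YaRN's NTK-by-parts interpolation of RoPE inverse frequencies."""
    pos_freqs = base ** (torch.arange(0, dim, 2).float() / dim)
    inv_freq_extrapolation = 1.0 / pos_freqs              # original frequencies
    inv_freq_interpolation = 1.0 / (factor * pos_freqs)   # position-interpolated

    # Pair index where the rotary frequency completes `num_rotations` full
    # rotations over the original context (higher indices rotate fewer times).
    def correction_dim(num_rotations):
        return (dim * math.log(original_max_position / (num_rotations * 2 * math.pi))) / (
            2 * math.log(base)
        )

    low = max(math.floor(correction_dim(beta_fast)), 0)
    high = min(math.ceil(correction_dim(beta_slow)), dim // 2 - 1)
    ramp = torch.clamp(
        (torch.arange(dim // 2).float() - low) / max(high - low, 1e-3), 0.0, 1.0
    )
    extrapolation_factor = 1.0 - ramp

    # Short-wavelength dims keep their original frequencies (extrapolation);
    # long-wavelength dims are position-interpolated; dims in between are blended.
    inv_freq = (
        inv_freq_interpolation * (1.0 - extrapolation_factor)
        + inv_freq_extrapolation * extrapolation_factor
    )
    # Attention scaling factor ("mscale") from the paper: 0.1 * ln(factor) + 1.
    attention_factor = 0.1 * math.log(factor) + 1.0
    return inv_freq, attention_factor
```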