Conversation

@gante
Contributor

@gante gante commented Jul 22, 2024

What does this PR do?

Same as #31999, but with llama being the only changed model.


Confirmed: slow tests are "passing" (same failures as main)
👉 RUN_SLOW=1 py.test -vv tests/models/llama/test_modeling_llama.py
👉 RUN_SLOW=1 py.test -vv tests/utils/test_cache_utils.py
👉 RUN_SLOW=1 py.test -vv tests/utils/test_modeling_rope_utils.py (new tests)


Throughput benchmarks: No changes vs previous main 💔

Comment on lines +83 to +84
Contributor Author

#31999, which propagates the changes to all models, will fix this.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

@amyeroberts amyeroberts left a comment

Thanks for all the work consolidating the rope logic!

Mostly some small questions and nits. My main comment is about the testing for all the compute functions.

Contributor

Are all of the arguments expected, even if optional?

Contributor Author

No, not at all :) The validation function exists to (among other things) detect incorrect parameter configurations.
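For illustration, a minimal sketch of what such a validation step can look like; the helper name, the accepted-key table, and the key sets below are assumptions for this example, not the PR's exact API:

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical table of the keys each rope type accepts (illustrative values only).
EXPECTED_ROPE_KEYS = {
    "default": {"rope_type"},
    "linear": {"rope_type", "factor"},
    "dynamic": {"rope_type", "factor"},
}


def validate_rope_parameters(rope_scaling: dict) -> None:
    """Warn on unknown keys and fail on unknown rope types instead of silently ignoring them."""
    rope_type = rope_scaling.get("rope_type", "default")
    allowed_keys = EXPECTED_ROPE_KEYS.get(rope_type)
    if allowed_keys is None:
        raise ValueError(f"Unknown rope_type: {rope_type!r}")
    unexpected_keys = set(rope_scaling) - allowed_keys
    if unexpected_keys:
        logger.warning(f"Unrecognized keys for rope_type={rope_type!r}: {sorted(unexpected_keys)}")
```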

Comment on lines +323 to +304
Contributor

All of these should be tested in a rope utils test module, including checks that they accept both rope_kwargs and a config, and that the two are equivalent.

Contributor Author

Added "rope_kwargs and config and their equivalence" ✅

Numerical checks will be a todo for the post-release follow-up PR (#31999)
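As a toy illustration of the equivalence check mentioned above (the init function, config fields, and values below are made up for this sketch and are not the actual test code):

```python
import torch


def _toy_linear_rope_init(base: float, dim: int, factor: float) -> torch.Tensor:
    # Stand-in for a rope init function: inverse frequencies divided by the scaling factor.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    return inv_freq / factor


def test_config_and_kwargs_equivalence():
    config = {"rope_theta": 10000.0, "head_dim": 64, "factor": 2.0}
    rope_kwargs = {"base": 10000.0, "dim": 64, "factor": 2.0}
    from_config = _toy_linear_rope_init(config["rope_theta"], config["head_dim"], config["factor"])
    from_kwargs = _toy_linear_rope_init(**rope_kwargs)
    # Building from a config or from equivalent rope_kwargs must give identical results.
    torch.testing.assert_close(from_config, from_kwargs)
```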

Comment on lines 477 to 490
Contributor

This works and is consistent with the other checks above. We should really make sure to check the rescaling against specific numerical values in tests for the compute methods as well. This test tells us that things have changed, but not whether the change is in the right direction or of the right magnitude.

Contributor Author

Fair, but that is a test that requires some numerical diving. Given our release goals -- would it be okay for me to add a todo/open an issue?

Contributor

As long as it's actually done, then yes ;)
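For the record, the kind of numerical check being deferred might look roughly like this; the helper is a hypothetical stand-in, and a real test would pin down independently computed expected values for each compute function:

```python
import torch


def _toy_linear_scaling(inv_freq: torch.Tensor, factor: float) -> torch.Tensor:
    # Hypothetical stand-in for the compute function under test.
    return inv_freq / factor


def test_linear_scaling_direction_and_magnitude():
    base, dim, factor = 10000.0, 64, 4.0
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    scaled = _toy_linear_scaling(inv_freq, factor)
    # Direction: positional interpolation must lower every frequency...
    assert torch.all(scaled < inv_freq)
    # ...and magnitude: each frequency shrinks by exactly `factor`.
    torch.testing.assert_close(scaled, inv_freq / factor)
```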

Collaborator

@ArthurZucker ArthurZucker left a comment

LGTM

Collaborator

Suggested change
self.rope_init_fn = ROPE_INIT_FUNCTIONS[self.rope_type]

Should it be rope scaling rather than rope init? Nit!

Contributor Author

I'd rather go with init -- the default rope (i.e. not scaled) uses this path as well
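A rough sketch of why "init" fits: both the unscaled default and the scaled variants resolve through the same dispatch table. The entries and signatures below are illustrative, not the merged code:

```python
import torch


def _default_inv_freq(base: float, dim: int) -> torch.Tensor:
    return 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))


def _linear_inv_freq(base: float, dim: int, factor: float) -> torch.Tensor:
    return _default_inv_freq(base, dim) / factor


ROPE_INIT_FUNCTIONS = {
    "default": _default_inv_freq,  # plain, unscaled RoPE also goes through this lookup
    "linear": _linear_inv_freq,
}

rope_type = "default"
rope_init_fn = ROPE_INIT_FUNCTIONS[rope_type]
inv_freq = rope_init_fn(base=10000.0, dim=64)
```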

Comment on lines 84 to 112
Collaborator

OK, this should leave enough freedom.

Collaborator

Though the fact that we don't have a nested config makes it simpler; the checks are run somewhere else, so it's pretty much equivalent.

Comment on lines 173 to 202
Collaborator

Nice to see that go away!

Contributor

@amyeroberts amyeroberts left a comment

Beautiful - thanks for adding and iterating!

Contributor

🤗

mig-mfreitas and others added 21 commits July 23, 2024 08:58
YaRN (Yet another RoPE extension method) combines the NTK-By-Parts
Interpolation and Attention Scaling methods, improving upon existing
RoPE interpolation methods for longer context window sizes.

Fine-tuned models maintain their original performance across benchmarks
while enabling efficient extrapolation and transfer learning for
quicker convergence, especially in compute-limited environments.

We implement YaRN and Dynamic-YaRN for the following list of models:

 - LLaMA
 - Falcon
 - GPT-NeoX
 - Olmo
 - Persimmon
 - Phi
 - StableLM
 - OpenLLaMA

New unit tests are added to assert YaRN's correct behavior on both
short and long sequence inputs.

For more details, please refer to https://arxiv.org/abs/2309.00071.

Co-authored-by: Miguel Almeida <[email protected]>
Iterate on YaRN implementation for LLaMA and remove diff from remaining
models for increased PR modularity.

This commit includes the following changes:
- Merge 'yarn_rope_scaling' and 'rope_scaling' dictionaries
- Remove unnecessary attributes ('extrapolation_factor' and 'finetuned')
  from YaRN classes
- Inherit 'forward' method in YaRN classes from superclass
- Rename 'yarn' method to 'compute_yarn_scaling'
- Extend YaRN tests with further assertions
- Fix style inconsistencies

Co-authored-by: Miguel Monte e Freitas <[email protected]>
- Comply with the tensor building logic introduced in huggingface#30743
- Add referencing to the optimized Attention Factor equation
- Remove Dynamic YaRN for a more agile deployment

Co-authored-by: mig-mfreitas <[email protected]>
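
For context, a simplified sketch of the YaRN recipe described in the commit message above: NTK-by-parts blends interpolated and unscaled inverse frequencies per dimension, and an attention ("temperature") scaling factor of 0.1 * ln(s) + 1 from the paper is applied on top. Names, defaults, and the exact blending below are illustrative and follow the paper rather than the merged implementation:

```python
import math

import torch


def yarn_inv_freq_sketch(base=10000.0, dim=128, factor=8.0,
                         original_max_position=4096, beta_fast=32, beta_slow=1):
    # Unscaled RoPE frequencies (extrapolation) and their fully interpolated counterpart.
    inv_freq_extrapolation = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    inv_freq_interpolation = inv_freq_extrapolation / factor

    # NTK-by-parts: find the dimension range whose wavelengths correspond to the
    # beta_fast / beta_slow rotation counts over the original context window.
    def correction_dim(num_rotations):
        return (dim * math.log(original_max_position / (num_rotations * 2 * math.pi))) / (2 * math.log(base))

    low = max(math.floor(correction_dim(beta_fast)), 0)
    high = min(math.ceil(correction_dim(beta_slow)), dim // 2 - 1)

    # Ramp from 0 (high-frequency dims, left untouched) to 1 (low-frequency dims, interpolated).
    ramp = torch.clamp((torch.arange(dim // 2).float() - low) / max(high - low, 1), 0, 1)
    inv_freq = inv_freq_interpolation * ramp + inv_freq_extrapolation * (1 - ramp)

    # Attention scaling from the YaRN paper: 0.1 * ln(s) + 1.
    attention_scaling = 0.1 * math.log(factor) + 1.0
    return inv_freq, attention_scaling
```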
@gante gante force-pushed the llama_rope_refactor branch from 1416972 to c824be0 on July 23, 2024 at 08:58
@gante
Contributor Author

gante commented Jul 23, 2024

Merged the YaRN PR (precursor); now merging this one as soon as CI goes green.

@amyeroberts
Contributor

The YaRN PR is failing code quality checks on main. Could you make sure to rebase and then run make fix-copies etc. here before merging?
