Make default RoPE path explicit in RotaryEmbedding classes #41342
🧩 Pull Request: Make default RoPE path explicit in Llama RotaryEmbedding initialization
📜 Summary
This PR addresses issue #39753 by making the default Rotary Positional Embedding (RoPE) initialization path explicit in the Llama model.
Previously, `inv_freq` was implicitly assigned, which went against the library's philosophy of explicit initialization for reproducibility and clarity.

This change:

- Makes the `"default"` RoPE path explicit by selecting `ROPE_INIT_FUNCTIONS["default"]` when `self.rope_type == "default"`.
- Uses `ROPE_INIT_FUNCTIONS[self.rope_type]` for non-default cases.
- Keeps initialization of the outputs (`inv_freq`, `attention_scaling`) delegated to the RoPE function; no manual computation is done in the constructor.

The modification is minimal and isolated, and it maintains backward compatibility.
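For illustration, the selection described above roughly takes the following shape. This is a sketch, not the exact diff: the class name is a placeholder, and it assumes the existing `ROPE_INIT_FUNCTIONS` registry from `transformers.modeling_rope_utils`.

```python
from torch import nn
from transformers.modeling_rope_utils import ROPE_INIT_FUNCTIONS


class RotaryEmbeddingSketch(nn.Module):  # placeholder name, for illustration only
    def __init__(self, config, device=None):
        super().__init__()
        self.config = config
        rope_scaling = getattr(config, "rope_scaling", None) or {}
        self.rope_type = rope_scaling.get("rope_type", "default")

        # Explicitly select the default initializer instead of relying on an
        # implicit fall-through, per the discussion in #39753.
        if self.rope_type == "default":
            self.rope_init_fn = ROPE_INIT_FUNCTIONS["default"]
        else:
            self.rope_init_fn = ROPE_INIT_FUNCTIONS[self.rope_type]

        # inv_freq and attention_scaling stay fully delegated to the RoPE init
        # function; nothing is computed by hand in the constructor.
        inv_freq, self.attention_scaling = self.rope_init_fn(self.config, device)
        self.register_buffer("inv_freq", inv_freq, persistent=False)
```

Either branch resolves to the same registry, so runtime behavior is unchanged; the explicit branch only documents the default path at the call site.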
🧠 Motivation
Following the discussion in #39753, the goal is to make RoPE initialization behavior transparent and avoid implicit defaults.
This aligns with 🤗 Transformers’ philosophy of clarity and explicit model configuration.
🛠️ Changes Made
In `transformers/src/transformers/models/llama/modeling_llama.py` and more than 20 other models:

- Made `inv_freq` and `attention_scaling` initialization consistent with the rest of the library.

✅ Checklist
🧪 Testing

- Verified with `"default"` and non-default RoPE types; a minimal check is sketched below.
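As an example, a quick manual check along these lines exercises both paths. This is a hypothetical snippet, not part of the PR's test suite, and it assumes the config-based `LlamaRotaryEmbedding` constructor currently on `main`.

```python
from transformers import LlamaConfig
from transformers.models.llama.modeling_llama import LlamaRotaryEmbedding

# One config using the implicit default RoPE, one using an explicit scaling type.
default_cfg = LlamaConfig(hidden_size=128, num_attention_heads=4)
linear_cfg = LlamaConfig(
    hidden_size=128,
    num_attention_heads=4,
    rope_scaling={"rope_type": "linear", "factor": 2.0},
)

for cfg in (default_cfg, linear_cfg):
    rope = LlamaRotaryEmbedding(config=cfg)
    # rope_type, inv_freq, and attention_scaling are all set by the RoPE init
    # function selected in the constructor, for default and non-default types alike.
    print(rope.rope_type, rope.inv_freq.shape, rope.attention_scaling)
```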
💬 Additional Notes

This is a small but meaningful cleanup that improves code readability and aligns initialization logic with the library's explicit philosophy.
No other files or RoPE logic were touched.