Fix RMSNormGated in Zamba2 #35943
Conversation
Rebase zamba2
rebase on upstream
This reverts commit 9007a52.
Co-authored-by: Arthur <[email protected]>
```diff
-class Zamba2RMSNormGated(MambaRMSNormGated):
-    pass
+class Zamba2RMSNormGated(torch.nn.Module):
```
I think this will also affect the mamba2 code then (as codestral mamba also uses ngroups > 1), so I'd be in favor of implementing this in the mamba2 code and then using modular here.
cc @molbap
I think so, but as I'm not a maintainer I'll leave the decision to the others 👀
cc @molbap I think!
Hey! Just want to make sure: this is a fix and not a new feature specific to a model, right?
@vasqu I know it is tempting to also add this for mamba2, but AFAIK this was not the original RMSNorm used there, no?
TL;DR: we don't make modeling changes unless they are bug fixes in general. If you have a new RMS norm, it's a new model for us 😉
@ArthurZucker I think it was an oversight in the original implementation for Mamba2 over here - @pglorio shows the relevant code snippets. I can't estimate how it changes the model, though, or whether the slow tests would need to be adjusted accordingly.
Interesting. We have 1-1 matching results for the codestral model, so my intuition would say we removed it because the input args made it a no-op, but I might be wrong.
I'll probably check some other time whether the logits match - maybe they are indeed equivalent 👀
Okay, let's only apply it to zamba for now!
* First commit
* Finish model implementation
* First commit
* Finish model implementation
* Register zamba2
* generated modeling and configuration
* generated modeling and configuration
* added hybrid cache
* fix attention_mask in mamba
* dropped unused loras
* fix flash2
* config docstrings
* fix config and fwd pass
* make fixup fixes
* text_modeling_zamba2
* small fixes
* make fixup fixes
* Fix modular model converter
* added inheritances in modular, renamed zamba cache
* modular rebase
* new modular conversion
* fix generated modeling file
* fixed import for Zamba2RMSNormGated
* modular file cleanup
* make fixup and model tests
* dropped inheritance for Zamba2PreTrainedModel
* make fixup and unit tests
* Add inheritance of rope from GemmaRotaryEmbedding
* moved rope to model init
* drop del self.self_attn and del self.feed_forward
* fix tests
* renamed lora -> adapter
* rewrote adapter implementation
* fixed tests
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Dropped adapter in-place sum
* removed rope from attention init
* updated rope
* created get_layers method
* make fixup fix
* make fixup fixes
* make fixup fixes
* update to new attention standard
* update to new attention standard
* make fixup fixes
* minor fixes
* cache_position
* removed cache_position postion_ids use_cache
* remove config from modular
* removed config from modular (2)
* import apply_rotary_pos_emb from llama
* fixed rope_kwargs
* Instantiate cache in Zamba2Model
* fix cache
* fix @slow decorator
* small fix in modular file
* Update docs/source/en/model_doc/zamba2.md (Co-authored-by: Arthur <[email protected]>)
* several minor fixes
* inherit mamba2decoder fwd and drop position_ids in mamba
* removed docstrings from modular
* reinstate zamba2 attention decoder fwd
* use regex for tied keys
* Revert "use regex for tied keys" (This reverts commit 9007a52.)
* use regex for tied keys
* add cpu to slow forward tests
* dropped config.use_shared_mlp_adapter
* Update docs/source/en/model_doc/zamba2.md (Co-authored-by: Arthur <[email protected]>)
* re-convert from modular
* extended Zamba2RMSNormGated to n_groups>1
* removed einops import
* set _supports_sdpa = True
* add use_mem_eff_path flag for fused mamba2 fwd
* added docstring for use_mem_eff_ath flag

---------

Co-authored-by: root <[email protected]>
Co-authored-by: Arthur <[email protected]>
What does this PR do?
This PR fixes `Zamba2RMSNormGated` to allow for `config.mamba_ngroups > 1`. The Zamba2 7B checkpoints have `config.mamba_ngroups = 2`, so this change is necessary for a correct forward pass.

I defined `Zamba2RMSNormGated` inside modular_zamba2.py instead of importing it, as this definition differs from the one in modeling_mamba2.py. The implementation in this PR is the torch version of the mamba-ssm implementation of the original mamba2 (used here, with the torch implementation given here).
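For readers who want a concrete picture, below is a minimal PyTorch sketch of a gated RMSNorm with group-wise normalization in the spirit of the mamba-ssm gated norm referenced above. The class name, argument names, and exact structure are illustrative assumptions and not necessarily identical to the code merged in this PR.

```python
from typing import Optional

import torch
from torch import nn


class GatedGroupRMSNorm(nn.Module):
    """Illustrative gated RMSNorm that normalizes each of `n_groups` chunks of
    the hidden dimension separately (hypothetical name and signature)."""

    def __init__(self, hidden_size: int, n_groups: int = 1, eps: float = 1e-6):
        super().__init__()
        if hidden_size % n_groups != 0:
            raise ValueError("hidden_size must be divisible by n_groups")
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.n_groups = n_groups
        self.eps = eps

    def forward(self, hidden_states: torch.Tensor, gate: Optional[torch.Tensor] = None) -> torch.Tensor:
        input_dtype = hidden_states.dtype
        hidden_states = hidden_states.to(torch.float32)
        if gate is not None:
            # Apply the SiLU-activated gate before normalizing, as in the mamba-ssm gated norm.
            hidden_states = hidden_states * nn.functional.silu(gate.to(torch.float32))
        # Split the hidden dimension into n_groups chunks and compute RMS statistics per chunk.
        *batch_dims, hidden_size = hidden_states.shape
        grouped = hidden_states.reshape(*batch_dims, self.n_groups, hidden_size // self.n_groups)
        variance = grouped.pow(2).mean(-1, keepdim=True)
        grouped = grouped * torch.rsqrt(variance + self.eps)
        hidden_states = grouped.reshape(*batch_dims, hidden_size)
        return self.weight * hidden_states.to(input_dtype)
```

With `n_groups = 1` this reduces to the usual gated RMSNorm over the whole hidden dimension; with two groups, each half of the hidden states gets its own RMS statistic, which is the behaviour the description above says the Zamba2 7B checkpoints (`config.mamba_ngroups = 2`) require.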
Who can review?
@ArthurZucker @Cyrilvallez