
Conversation

@matthewdouglas (Member) commented Aug 1, 2025

What does this PR do?

This PR adds a new option to `BitsAndBytesConfig` called `target_parameters`, in the same spirit as `target_parameters` in huggingface/peft#2638. The intent is to allow quantization of `nn.Parameter`s that are not within an `nn.Linear`, e.g. those commonly found in certain MoE model implementations.

Requires bitsandbytes-foundation/bitsandbytes#1720, which is released in bitsandbytes v0.48.0.

Example usage with a Granite MoE:

```python
import torch
from transformers import BitsAndBytesConfig, GraniteMoeForCausalLM

model = GraniteMoeForCausalLM.from_pretrained(
    "ibm-granite/granite-3.1-3b-a800m-base",
    torch_dtype=torch.bfloat16,
    device_map="cuda:0",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
        bnb_4bit_use_double_quant=False,
        target_parameters=[
            "block_sparse_moe.input_linear.weight",
            "block_sparse_moe.output_linear.weight",
        ],
        llm_int8_skip_modules=["lm_head", "block_sparse_moe.router"],
    ),
)
```
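As a quick sanity check (a minimal sketch, not from the PR; the exact attribute layout after quantization is an assumption, since the 4-bit storage may sit behind a parametrization), one can compare the model's memory footprint against the bf16 numbers below and inspect a targeted parameter:

```python
# Rough footprint check: 4-bit MoE weights should land well below the
# ~6.3 GiB bf16 footprint shown in the tables below.
print(f"{model.get_memory_footprint() / 2**30:.2f} GiB")

# Inspect one targeted MoE parameter; the module path mirrors the
# target_parameters entries above (Granite MoE layout, an assumption).
p = model.model.layers[0].block_sparse_moe.input_linear.weight
print(type(p), p.dtype, p.shape)
```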

Memory Usage - BF16

| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |
|---|---|---|---|---|
| Allocated memory | 6291 MiB | 6292 MiB | 12583 MiB | 6292 MiB |
| Active memory | 6291 MiB | 6292 MiB | 12583 MiB | 6292 MiB |
| Requested memory | 6291 MiB | 6291 MiB | 12583 MiB | 6291 MiB |

Memory Usage - Before PR

| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |
|---|---|---|---|---|
| Allocated memory | 6019 MiB | 6027 MiB | 9935 MiB | 3916 MiB |
| Active memory | 6019 MiB | 6027 MiB | 9935 MiB | 3916 MiB |
| Requested memory | 6015 MiB | 6024 MiB | 9929 MiB | 3913 MiB |

Memory Usage - After PR

| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |
|---|---|---|---|---|
| Allocated memory | 1894 MiB | 2054 MiB | 9424 MiB | 7530 MiB |
| Active memory | 1894 MiB | 2054 MiB | 9424 MiB | 7530 MiB |
| Requested memory | 1875 MiB | 2035 MiB | 9389 MiB | 7513 MiB |
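
For context, the Cur Usage / Peak Usage / Tot Alloc / Tot Freed columns follow the layout of `torch.cuda.memory_summary()`. A minimal sketch of how comparable numbers can be collected (the helper below is hypothetical, not from the PR):

```python
import torch

def report_cuda_memory(device="cuda:0"):
    # Hypothetical helper: dump the CUDA caching allocator statistics that
    # the tables above are excerpted from.
    print(torch.cuda.memory_summary(device=device, abbreviated=True))
    peak_mib = torch.cuda.max_memory_allocated(device) / 2**20
    print(f"Peak allocated: {peak_mib:.0f} MiB")

torch.cuda.reset_peak_memory_stats("cuda:0")
# ... load the model as in the example above ...
report_cuda_memory("cuda:0")
```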

Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case. (See Slack discussion)
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [ ] Did you write any new necessary tests?

Who can review?

@SunMarc @MekkCyber @BenjaminBossan

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Rocketknight1 (Member) commented:

cc @MekkCyber

@SunMarc (Member) left a comment:

Nice! It would be great to add some tests (inference / saving) with the gpt-oss model!

@matthewdouglas force-pushed the bnb-parametrize-4bit branch 2 times, most recently from 78d55b3 to 61fdac5 on August 14, 2025.