
Conversation

@Cyrilvallez
Member

What does this PR do?

Adds the GLM model.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker (Collaborator) left a comment


Super nice! You are missing the test files, integration tests, etc. (and the README, etc.)

initializer_range=0.02,
rms_norm_eps=0.00000015625,
use_rms_norm=True,
apply_residual_connection_post_layernorm=False,

is this false for all models? If so, to delete!
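As an editorial illustration of the suggestion (not part of the PR diff; the default values below are hypothetical): if no released GLM checkpoint sets the flag, it can simply be dropped from the config signature instead of being carried as a dead option.

```python
from transformers import PretrainedConfig


class GlmConfig(PretrainedConfig):
    """Hypothetical trimmed config: apply_residual_connection_post_layernorm is removed
    rather than kept as an always-False switch."""

    model_type = "glm"

    def __init__(self, hidden_size=4096, initializer_range=0.02, rms_norm_eps=0.00000015625, **kwargs):
        self.hidden_size = hidden_size
        self.initializer_range = initializer_range
        self.rms_norm_eps = rms_norm_eps
        super().__init__(**kwargs)
```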

        self.mlp = GlmMLP(config)
        self.input_layernorm = (
            GlmRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
            if config.use_rms_norm

check what config uses, but we avoid that in general as well! (code path)
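A minimal sketch of that point (editorial, not the PR code; it assumes released GLM checkpoints always use RMSNorm, and that GlmMLP and GlmRMSNorm are defined in the same modeling file): instantiate the norm directly rather than selecting it through a config flag.

```python
import torch.nn as nn


class GlmDecoderLayer(nn.Module):
    def __init__(self, config, layer_idx: int):
        super().__init__()
        self.mlp = GlmMLP(config)
        # no `if config.use_rms_norm` code path: the norm type is fixed by the architecture
        self.input_layernorm = GlmRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
        self.post_attention_layernorm = GlmRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
```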

"""

hidden_states_after_norm = self.input_layernorm(hidden_states)
residual = hidden_states_after_norm if self.apply_residual_connection_post_layernorm else hidden_states

same here! check if any released models have both
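For reference, a fragment-level sketch (editorial, attention call simplified) of what the block collapses to under the standard pre-norm convention, i.e. if no released checkpoint ever sets apply_residual_connection_post_layernorm:

```python
# sketch only (inside GlmDecoderLayer.forward): standard pre-norm residual path
residual = hidden_states
hidden_states = self.input_layernorm(hidden_states)
hidden_states = self.self_attn(hidden_states, attention_mask=attention_mask)  # attention sub-block, simplified
hidden_states = residual + hidden_states
```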

        self.layers = nn.ModuleList(
            [GlmDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
        )
        if config.post_layer_norm:

same here
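Sketch of the same simplification at the model level (editorial, not the diff; `self.norm` is an assumed attribute name): build the final norm unconditionally instead of gating it on `config.post_layer_norm`.

```python
        self.layers = nn.ModuleList(
            [GlmDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
        )
        # no `if config.post_layer_norm` gate: the final norm always exists
        self.norm = GlmRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
```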


ArthurZucker and others added 26 commits September 30, 2024 16:03
* HQQ model serialization attempt

* fix hqq dispatch and unexpected keys

* style

* remove check_old_param

* revert to check HQQLinear in quantizer_hqq.py

* revert to check HQQLinear in quantizer_hqq.py

* update HqqConfig default params

* make ci happy

* make ci happy

* revert to HQQLinear check in quantizer_hqq.py

* check hqq_min version 0.2.0

* set axis=1 as default in quantization_config.py

* validate_env with hqq>=0.2.0 version message

* deprecated hqq kwargs message

* make ci happy

* remove run_expected_keys_check hack + bump to 0.2.1 min hqq version

* fix unexpected_keys hqq update

* add pre_quantized check

* add update_expected_keys to base quantizerr

* ci base.py fix?

* ci base.py fix?

* fix "quantization typo" src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <[email protected]>

* fix post merge

---------

Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Arthur <[email protected]>
@ArthurZucker (Collaborator) left a comment

Something went wrong with the rebasing/merging, as you have unrelated changes!

}


class GlmDecoderLayer(nn.Module):

This one looks fairly classic, so I would have assumed you don't need the forward (unless the issue is with the names of the layers?)
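For context, a rough sketch of what the modular format makes possible here (editorial; the Llama parent class is an assumption, not something stated in this PR): a classic decoder layer can usually inherit the parent's forward and only replace the sub-modules that differ.

```python
from transformers.models.llama.modeling_llama import LlamaDecoderLayer


class GlmDecoderLayer(LlamaDecoderLayer):
    def __init__(self, config, layer_idx: int):
        super().__init__(config, layer_idx)
        # only swap in the GLM-specific MLP; the inherited forward is reused as-is
        self.mlp = GlmMLP(config)
```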

@Cyrilvallez
Member Author

> Something went wrong with the rebasing/merging, as you have unrelated changes!

Yes, currently looking at it

@ArthurZucker
Collaborator

Superseded by #33823
