Quantization in glm4-9b failed #489

@Orion-zhen

Description


When I run convert.py with the command CUDA_VISIBLE_DEVICES=1 python convert.py -i /home/orion/ai/Models/glm4-9b -o ./tmp-file -cf /home/orion/ai/Models/glm4-9b-4-exl2 -r 256, it fails with TypeError: Value for eos_token_id is not of expected type <class 'int'>.

It seems the glm4 architecture isn't supported yet.

Steps to reproduce: download the glm4-9b model and run convert.py as the README describes.

Full console log:

 !! Warning, unknown architecture: ChatGLMModel
 !! Loading as LlamaForCausalLM
Traceback (most recent call last):
  File "/home/orion/repo/exllamav2/convert.py", line 71, in <module>
    config.prepare()
  File "/home/orion/repo/exllamav2/exllamav2/config.py", line 187, in prepare
    self.eos_token_id = read(read_config, int, "eos_token_id", None)  # 2
  File "/home/orion/repo/exllamav2/exllamav2/config.py", line 40, in read
    raise TypeError(f"Value for {key} is not of expected type {expected_type}")
TypeError: Value for eos_token_id is not of expected type <class 'int'>
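The traceback suggests that config.py's read() helper expects eos_token_id to be a single int, while this model's config.json apparently stores it as a list of token IDs. A minimal sketch of a tolerant reader (this is an illustration with made-up values, not the exllamav2 implementation) could accept either form:

```python
# Hypothetical tolerant reader for config.json token-ID fields.
# Mirrors the behavior of a read(config, int, key, default) helper,
# but also accepts a list of ints, as some architectures provide.
def read_token_id(config: dict, key: str, default=None):
    value = config.get(key, default)
    if value is None or isinstance(value, int):
        return value
    # A list of stop tokens: fall back to the first entry so
    # downstream code that expects a single int keeps working.
    if isinstance(value, list) and value and all(isinstance(v, int) for v in value):
        return value[0]
    raise TypeError(f"Value for {key} is not of expected type int")

# Illustrative values only, not glm4's actual token IDs:
print(read_token_id({"eos_token_id": [2, 3]}, "eos_token_id"))  # → 2
print(read_token_id({"eos_token_id": 2}, "eos_token_id"))       # → 2
```

A fuller fix would keep the whole list so generation can stop on any of the tokens, but returning the first entry is enough to get past this TypeError.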
