Quantization in glm4-9b failed #489

@Orion-zhen

Description


When I run convert.py with the command CUDA_VISIBLE_DEVICES=1 python convert.py -i /home/orion/ai/Models/glm4-9b -o ./tmp-file -cf /home/orion/ai/Models/glm4-9b-4-exl2 -r 256, it fails with TypeError: Value for eos_token_id is not of expected type <class 'int'>.

It seems the glm4 architecture isn't supported yet.

Steps to reproduce: download the glm4-9b model and run convert.py as the README describes.

Full console log:

 !! Warning, unknown architecture: ChatGLMModel
 !! Loading as LlamaForCausalLM
Traceback (most recent call last):
  File "/home/orion/repo/exllamav2/convert.py", line 71, in <module>
    config.prepare()
  File "/home/orion/repo/exllamav2/exllamav2/config.py", line 187, in prepare
    self.eos_token_id = read(read_config, int, "eos_token_id", None)  # 2
  File "/home/orion/repo/exllamav2/exllamav2/config.py", line 40, in read
    raise TypeError(f"Value for {key} is not of expected type {expected_type}")
TypeError: Value for eos_token_id is not of expected type <class 'int'>
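The traceback suggests that config.py's read() helper expects eos_token_id to be a single int, while this model's config.json apparently stores it as a list of token IDs. A minimal sketch of a tolerant reader (this is an illustration with made-up values, not the exllamav2 implementation) could accept either form:

```python
# Hypothetical tolerant reader for config.json token-ID fields.
# Mirrors the behavior of a read(config, int, key, default) helper,
# but also accepts a list of ints, as some architectures provide.
def read_token_id(config: dict, key: str, default=None):
    value = config.get(key, default)
    if value is None or isinstance(value, int):
        return value
    # A list of stop tokens: fall back to the first entry so
    # downstream code that expects a single int keeps working.
    if isinstance(value, list) and value and all(isinstance(v, int) for v in value):
        return value[0]
    raise TypeError(f"Value for {key} is not of expected type int")

# Illustrative values only, not glm4's actual token IDs:
print(read_token_id({"eos_token_id": [2, 3]}, "eos_token_id"))  # → 2
print(read_token_id({"eos_token_id": 2}, "eos_token_id"))       # → 2
```

A fuller fix would keep the whole list so generation can stop on any of the tokens, but returning the first entry is enough to get past this TypeError.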
