System Info
- transformers version: 4.31.0
- Platform: Linux-5.15.109+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.16.4
- Safetensors version: 0.3.1
- Accelerate version: not installed
- Accelerate config: not found
- PyTorch version (GPU?): 2.0.1+cu118 (False)
- Tensorflow version (GPU?): 2.12.0 (False)
- Flax version (CPU?/GPU?/TPU?): 0.7.0 (cpu)
- Jax version: 0.4.13
- JaxLib version: 0.4.13
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
Who can help?
@sanchit-gandhi @patrickvonplaten
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Put together a quick Colab to run the model as mentioned in our documentation: colab notebook.

Code snippets:
Pipeline
```python
from transformers import pipeline

model_id = "facebook/mms-1b-all"
target_lang = "fra"

pipe = pipeline(model=model_id, model_kwargs={"target_lang": target_lang, "ignore_mismatched_sizes": True})
```

Error (full traceback in the colab notebook):

```
RuntimeError: Error(s) in loading state_dict for Wav2Vec2ForCTC:
	size mismatch for lm_head.weight: copying a param with shape torch.Size([154, 1280]) from checkpoint, the shape in current model is torch.Size([314, 1280]).
	size mismatch for lm_head.bias: copying a param with shape torch.Size([154]) from checkpoint, the shape in current model is torch.Size([314]).
You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
```
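Not a confirmed fix, but a workaround sketch that sidesteps the mismatch entirely: load the checkpoint with its default vocabulary (so every tensor matches the stored shapes), switch languages afterwards via the documented MMS adapter API (`set_target_lang` / `load_adapter`), and hand the preloaded components to the pipeline instead of a model id:

```python
from transformers import pipeline, Wav2Vec2ForCTC, AutoProcessor

model_id = "facebook/mms-1b-all"
target_lang = "fra"

# Load with the default adapter so all shapes match the checkpoint.
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Swap the tokenizer vocabulary and the adapter weights to the target language.
processor.tokenizer.set_target_lang(target_lang)
model.load_adapter(target_lang)

# Pass the preloaded components, so the pipeline performs no resizing itself.
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
)
```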
Processor + Model
```python
from transformers import Wav2Vec2ForCTC, AutoProcessor

model_id = "facebook/mms-1b-all"
target_lang = "fra"

processor = AutoProcessor.from_pretrained(model_id, target_lang=target_lang)
model = Wav2Vec2ForCTC.from_pretrained(model_id, target_lang=target_lang, ignore_mismatched_sizes=True)
```

Error (full traceback in the colab notebook):

```
RuntimeError: Error(s) in loading state_dict for Wav2Vec2ForCTC:
	size mismatch for lm_head.weight: copying a param with shape torch.Size([154, 1280]) from checkpoint, the shape in current model is torch.Size([314, 1280]).
	size mismatch for lm_head.bias: copying a param with shape torch.Size([154]) from checkpoint, the shape in current model is torch.Size([314]).
You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
```
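For context on where the two shapes come from: the checkpoint stores the `lm_head` for the default adapter language, while passing `target_lang="fra"` builds the model with the French vocabulary size. A rough way to check this, assuming the default adapter language is `eng` and that the tokenizer's `vocab_size` tracks the active language:

```python
from transformers import Wav2Vec2CTCTokenizer

model_id = "facebook/mms-1b-all"

# Assumption: "eng" is the adapter language the checkpoint ships with.
tok = Wav2Vec2CTCTokenizer.from_pretrained(model_id, target_lang="eng")
print(tok.vocab_size)  # rows of lm_head stored in the checkpoint (154 per the traceback)

tok.set_target_lang("fra")
print(tok.vocab_size)  # rows the fra model expects (314 per the traceback)
```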
A similar issue was reported by @xenova here: #24223 (comment)
Expected behavior
The expected behaviour would be that, despite the mismatch, the model weights are loaded and the mismatch is rectified via `load_adapter` for the pipeline (as mentioned here: #24223 (comment)); a sketch of that flow follows below.
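In other words, when `target_lang` is set, `from_pretrained` would end up with fully initialised weights, equivalent to this two-step sketch using the documented `load_adapter` API (illustrative, not the actual loader code):

```python
from transformers import Wav2Vec2ForCTC

model_id = "facebook/mms-1b-all"
target_lang = "fra"

# Load the checkpoint with its default head, which matches all stored shapes...
model = Wav2Vec2ForCTC.from_pretrained(model_id)
# ...then overwrite the adapter layers and lm_head with the fra weights,
# so nothing is left randomly initialised.
model.load_adapter(target_lang)
```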