Closed
Description
System Info
@sanchit-gandhi - Can you please take a look at the below?
The documentation example here does not execute. It calls loss = model(**input_features).loss, but input_features is never defined anywhere in the snippet.
from transformers import AutoTokenizer, AutoFeatureExtractor, SpeechEncoderDecoderModel
from datasets import load_dataset
encoder_id = "facebook/wav2vec2-base-960h" # acoustic model encoder
decoder_id = "bert-base-uncased" # text decoder
feature_extractor = AutoFeatureExtractor.from_pretrained(encoder_id)
tokenizer = AutoTokenizer.from_pretrained(decoder_id)
# Combine pre-trained encoder and pre-trained decoder to form a Seq2Seq model
model = SpeechEncoderDecoderModel.from_encoder_decoder_pretrained(encoder_id, decoder_id)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
# load an audio input and pre-process (normalise mean/std to 0/1)
ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
input_values = feature_extractor(ds[0]["audio"]["array"], return_tensors="pt").input_values
# load its corresponding transcription and tokenize to generate labels
labels = tokenizer(ds[0]["text"], return_tensors="pt").input_ids
# the forward function automatically creates the correct decoder_input_ids
loss = model(**input_features).loss
loss.backward()
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Execute the code anywhere. It is an official example.
Expected behavior
The example should run end to end and backpropagate the loss. Instead, it fails with NameError: name 'input_features' is not defined
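A possible fix, as far as I can tell, is to pass the two tensors the snippet already builds (input_values and labels) as keyword arguments instead of unpacking the undefined input_features. The sketch below checks that call pattern offline. It is not the documented example: the tiny, randomly initialised configs, the random waveform, the label ids, and the token ids 0 and 1 are all assumptions made here so that nothing has to be downloaded.

```python
import torch
from transformers import (
    BertConfig,
    SpeechEncoderDecoderConfig,
    SpeechEncoderDecoderModel,
    Wav2Vec2Config,
)

# Hypothetical tiny configs (NOT the checkpoints from the docs) so this runs offline.
encoder_config = Wav2Vec2Config(
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    conv_dim=(32, 32),
    conv_stride=(5, 2),
    conv_kernel=(10, 3),
    num_conv_pos_embeddings=16,
    num_conv_pos_embedding_groups=2,
    apply_spec_augment=False,  # keep the forward pass deterministic in shape
)
decoder_config = BertConfig(
    vocab_size=100,
    hidden_size=32,  # matches the encoder so no projection layer is needed
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    is_decoder=True,
    add_cross_attention=True,
)

config = SpeechEncoderDecoderConfig.from_encoder_decoder_configs(
    encoder_config, decoder_config
)
model = SpeechEncoderDecoderModel(config=config)

# Assumed special-token ids; the original snippet takes these from the tokenizer.
model.config.decoder_start_token_id = 1
model.config.pad_token_id = 0

# Stand-ins for the feature extractor and tokenizer outputs.
input_values = torch.randn(1, 16000)       # one second of fake 16 kHz audio
labels = torch.tensor([[5, 6, 7, 8]])      # fake token ids below vocab_size

# The fix: pass the tensors that exist, by name, instead of **input_features.
loss = model(input_values=input_values, labels=labels).loss
loss.backward()
```

If this is right, the only change needed in the documented snippet would be replacing its last forward call with loss = model(input_values=input_values, labels=labels).loss, keeping the pretrained encoder, decoder, and dataset as they are.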