-
Notifications
You must be signed in to change notification settings - Fork 30.9k
Fix failing MusicgenTest .test_pipeline_text_to_audio
#26586
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The documentation is not available anymore as the PR was closed or merged. |
tests/test_pipeline_mixin.py
Outdated
elif config_vocab_size is None and hasattr(model.config, "text_encoder"): | ||
config_vocab_size = getattr(model.config.text_encoder, "vocab_size", None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try to get the vocab size for musicgen (tiny) model
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that for MusicGen, the input/output vocab sizes are different:
- Input vocab size is the vocab size of the text encoder model (
text_encoder.vocab_size
) - Output vocab size is the number of vq-codes the audio decoder can generate (
decoder.vocab_size
). These vq-codes then go through the codec model to recover the final waveform - Thus, the actual output is continuous (a 1-d audio waveform)
Either way, I don't think the config vocab size is required for the TTA test:
def run_pipeline_test(self, speech_generator, _): |
# Get `IndexError: index out of range in self` in `torch.embedding`. | ||
# TODO: check tiny model creation for this model. | ||
def is_pipeline_test_to_skip( | ||
self, pipeline_test_casse_name, config_class, model_architecture, tokenizer_name, processor_name | ||
): | ||
if pipeline_test_casse_name == "TextToAudioPipelineTests": | ||
return True | ||
|
||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still failing due to some index issue in embedding. We probably need to check how the tiny model for this architecture is created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because some of the input text contains token id's that are outside of the model's vocab size:
outputs = speech_generator("This is a test") |
=> the word test
is probably out-of-vocabulary, so is giving this index error in the embeddings
We can either update the tokenizer to include all the ids in the model test and expand the model's vocab size accordingly, or update the tests to only use tokens that are in the tokenizer/model vocab
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the model config might have small vocab_size than the tokenizer one. This is more from the tiny model creation script that can't handle something special in musicgen's config.
Manually modifying the tiny model's config is not recommended, as it fixes the test quickly but the underlying issue (tiny model creation) is still there.
I will check what I can do with the script, but in the meantime let's not fail main anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The config file shows
both have vocab_size 99: which indeed means the config file is not updated with the (possibly converted) tokenizer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! Agree that we should't change the model's config on-the-fly, but instead we could change text inputs in the test to be valid tokens in the model's vocabulary?
This means we have two options: we could either update the model's vocab size in the tiny model on the Hub, or update the text inputs in the test to be part of the current model's vocabulary. Either one is good for me in a follow-up PR!
I have updated the tiny models on the Hub (config/model file) manually. I have to update the commit sha info. so the pipeline tests will take this new versions. Failing test passes now. I will rework the tiny model creation script in a separate PR. |
e4d314e
to
9350590
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this @ydshieh! 🎵
@LysandreJik Ready for a review and we might have a green CI during the weekend :-) 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks both! Onto green CI 💚
What does this PR do?
Fix failing
MusicgenTest .test_pipeline_text_to_audio