-
Notifications
You must be signed in to change notification settings - Fork 31.2k
TF: use the correct config with (...)EncoderDecoder models
#18097
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
(...)EncoderDecoder models
|
The documentation is not available anymore as the PR was closed or merged. |
07c1575 to
0dc94b8
Compare
sgugger
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing! As for your question, it depends if they are all exact copies and can easily be refactored in one mixin or not.
|
I believe they are -- going to give it a go afterwards if @ydshieh also agrees :) |
|
I have limited connection at this moment in the mountain, so feel free to merge if you prefer. Regarding the common mixin, good for me. I see there are a few little things to address, like the input names (input_ids, pixel values etc). Thank you for the fix, @gante |
|
@ydshieh can I have a review plz 🙏 |
ydshieh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the fix, @gante . Would you mind to rebase on main, and run test_pt_tf_model_equivalence for TF (Vision)EncoderDecoderModel? Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nit, no need to feel strong:
I would personally prefer to use self.__name__ instead of str and test against EncoderDecoder. If I understand correctly, str gives the full path, which includes the module name like encoder_decoder or vision_encoder_decoder, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying self.__name__ on this line, and it seems like it isn't defined in some cases -- e.g. in tests/models/vision_encoder_decoder/test_modeling_tf_vision_encoder_decoder.py::TFViT2GPT2EncoderDecoderModelTest::test_encoder_decoder_model, it throws *** AttributeError: 'TFVisionEncoderDecoderModel' object has no attribute '__name__'
str(self) here includes the full import path for the object, like you wrote, which contains the class name -- '<transformers.models.vision_encoder_decoder.modeling_tf_vision_encoder_decoder.TFVisionEncoderDecoderModel object at 0x7fc7783b8a60>'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can update in the future, but since it is working for now I'm going with it :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks for the info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the improvement / cleanup!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice addition!
0dc94b8 to
250d5fd
Compare
|
@ydshieh rebased with main and reran tests -- all working 👍 |
250d5fd to
f59bbb7
Compare
What does this PR do?
Fixes #18071
Modifies
unpack_inputsto ignore the config file for(...)EncoderDecodermodels, mimicking the behavior in PT. If we don't ignore it, then unset options will get set with the config's default (Falsefor most of them), causing the inner models to ignore their own config files.EncoderDecodermodels. I then noticed that other(...)EncoderFecodertests have copy/pasted their ownEncoderDecoderMixin, so I've left the other classes for a follow-up PR with the following question: should a commonEncoderDecoderMixinbe defined and shared across(...)EncoderDecodertests, or should I add a similar test to all other classes individually?