
Conversation

@ethanjperez
Contributor

Without this fix, I'm getting near-random validation performance for a trained model, and the validation performance differs per validation run. I think this happens since the `model` variable isn't set with the loaded checkpoint, so I'm using a randomly initialized model. Looking at the model activations, they differ each time I run evaluation (but they don't with this fix).
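
(A minimal sketch of the failure mode described above; the class and variable names are assumed from the GLUE example, not quoted from the PR. Lightning's `load_from_checkpoint` is a classmethod that returns a new, checkpoint-initialized model rather than mutating the instance it is called on, so discarding its return value leaves `model` randomly initialized.)

# Broken: the return value is discarded, so `model` keeps its random init
model = GLUETransformer(args)
model.load_from_checkpoint(checkpoint_path)

# Fixed: bind the checkpoint-initialized model that the classmethod returns
model = GLUETransformer.load_from_checkpoint(checkpoint_path)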
@ethanjperez
Contributor Author

Tagging @srush @nateraw from the original Lightning GLUE PR to check I'm not missing something?

@codecov-io

codecov-io commented Mar 25, 2020

Codecov Report

Merging #3437 into master will increase coverage by 0.04%.
The diff coverage is 88.88%.


@@            Coverage Diff             @@
##           master    #3437      +/-   ##
==========================================
+ Coverage   77.56%   77.60%   +0.04%     
==========================================
  Files         100      100              
  Lines       16970    16967       -3     
==========================================
+ Hits        13162    13167       +5     
+ Misses       3808     3800       -8     
Impacted Files Coverage Δ
src/transformers/data/processors/utils.py 24.68% <88.88%> (+2.94%) ⬆️
src/transformers/modeling_utils.py 91.85% <0.00%> (+0.13%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 83272a3...f12d585.

@nateraw
Contributor

nateraw commented Mar 25, 2020

I'll check this out later tonight! I'm on mobile so I've just looked at your commit quickly... looks like you're right. I know in the past I've instantiated the model and then called model.load_from_checkpoint(loaded_ckpt), so what you've got probably gets the same job done. The benefit of doing it the way I just mentioned is that if you already have a model object available from training, you can just load the best checkpoint into that. Either way works though!

@nateraw
Contributor

nateraw commented Mar 25, 2020

That was fast 😄 Looks good to me!

@ethanjperez
Contributor Author

Thanks for checking :) I'm still not able to reproduce my in-training validation performance with the --do_predict flag, though. Any ideas? I'm getting identical validation accuracy across runs now, but the accuracy is still near random.

@nateraw
Contributor

nateraw commented Mar 27, 2020

@ethanjperez I just checked the docs, and it looks like the way we were doing it originally was correct.

model = MyLightningModule.load_from_checkpoint(PATH)
model.eval()
y_hat = model(x)

The way that I was explaining to do it would require you to use torch.load on the checkpoint path, which you would then pass to model.load_state_dict. The above method (what we had originally) is probably supposed to do that for you.

I haven't had the chance to recreate the issue, so I'll have to take a look.
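
(A minimal sketch of that alternative, assuming a standard Lightning checkpoint, which stores the model weights under a `state_dict` key, and an existing `model` object:)

import torch

# Load the raw checkpoint from disk and copy its weights into an
# already-instantiated model, e.g. the one left over from training.
ckpt = torch.load(checkpoint_path, map_location="cpu")
model.load_state_dict(ckpt["state_dict"])
model.eval()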

@ethanjperez
Contributor Author

Cool, thanks! Even with the original way, I was still not able to reproduce my in-training validation performance (just something to look out for when you try). In particular, I'm loading and running an already-trained model with the --do_predict flag and without the --do_train flag (I don't think you'd see the issue if you use both --do_predict and --do_train).
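
(For illustration, a hypothetical sketch of why the eval-only path is the one that exposes the bug; the flag handling is simplified and not the actual script code:)

if args.do_train:
    trainer.fit(model)  # after this, `model` holds trained weights in memory
if args.do_predict:
    # In an eval-only run nothing has trained `model`, so the checkpoint
    # must be loaded (and bound) explicitly before testing; otherwise the
    # weights are still randomly initialized.
    model = GLUETransformer.load_from_checkpoint(checkpoint_path)
    trainer.test(model)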

Contributor

@sshleifer sshleifer left a comment

Great catch!

@sshleifer sshleifer merged commit e5c393d into huggingface:master Mar 30, 2020
@ethanjperez
Contributor Author

@nateraw @sshleifer Are you guys able to load a trained model successfully with the pytorch-lightning scripts? Even after this patch, I am having issues loading an already trained model, i.e., if I just use --do_eval without also using --do_train

@sshleifer
Contributor

sshleifer commented Apr 16, 2020

Sorry for taking so long. I will try to reproduce this today if there is no update on your end!

Filing an issue with what you ran/expected would help :) @ethanjperez

@ethanjperez
Contributor Author

@sshleifer Just seeing this. Were you able to reproduce the issue? I can't remember the exact command I ran, but it was a standard evaluation command: the same as the training command I used, with a few flags tweaked, e.g. dropping the --do_train flag and adding the --do_eval flag.

@sshleifer
Contributor

This is fixed now.
