Restrict build options to only update model hyperparameters #88
Conversation
Force-pushed from d5c650c to 8b9234e
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@ Coverage Diff @@
##             main      #88   +/-   ##
=======================================
  Coverage   86.82%   86.82%
=======================================
  Files         226      226
  Lines       13629    13629
=======================================
  Hits        11833    11833
  Misses       1796     1796
=======================================
max_steps goes under train_params, which is more appropriate. I agree.
So, let's see if I follow this logic. A person can set max_steps from the root and it will work successfully, but the real place to set it is under train_params?
Previously, johnml1135 (John Lambert) wrote…
I read over the import of the settings and I see what you are doing now. My main concern is the above concern to rename the huggingface section to build_options.
Reviewed 7 of 7 files at r1, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @ddaspit)
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @johnml1135)
machine/jobs/build_nmt_engine.py
line 48 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
So, let's see if I get this. In settings.yml, there is a model_type parameter, here Huggingface. That is taken from settings.yml without being tampered with by any input arguments. Then, any build_options given to Serval end up going under the current model_type, as seen a few lines below. This seems sensible enough, except: why not rename huggingface to build_options in settings.yml? The model_type is already specified, and it will be clearer in the documentation why the build_options map to... build_options.
If we support multiple model types, we can include the parameters for the different model types in the same settings file.
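For concreteness, here is a rough sketch of how settings.yaml might be laid out under this scheme. The layout and values are illustrative assumptions, not the repository's actual file:

```yaml
# Hypothetical settings.yaml layout (sketch only).
model_type: huggingface          # selects which model-type section applies
pretranslation_batch_size: 1024  # job-level setting, not exposed to users
huggingface:                     # section for the current model_type; any
  train_params:                  # build_options sent to Serval are merged here
    max_steps: 20000             # illustrative value
```

Under this layout, renaming huggingface to build_options would make the mapping obvious for a single model type, but keeping per-model-type sections lets one settings file carry parameters for several model types at once.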
machine/jobs/nmt_engine_build_job.py
line 89 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
If pretranslation_batch_size will not be changed in any meaningful way for any meaningful reason, why include it? It isn't the batch size that the GPU uses for doing pretranslations but the number of segments that are pulled from the source and handled at one time. Why not just set a constant to 1024 and be done with it?
It could be a constant. I just made it a setting, so it would be slightly easier to tweak if we need to.
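To illustrate the tradeoff, a minimal sketch (hypothetical names, not the actual job code): the setting form costs barely more than a constant while staying tweakable without a code change:

```python
# Constant form: fixed at import time.
PRETRANSLATION_BATCH_SIZE = 1024

# Setting form: read from the loaded settings with a default fallback.
# "settings" here stands in for whatever object settings.yaml is loaded into.
settings = {"pretranslation_batch_size": 1024}
batch_size = settings.get("pretranslation_batch_size", PRETRANSLATION_BATCH_SIZE)
```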
machine/jobs/settings.yaml
line 4 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
So, batch size is really pretranslation_batch_size, which is the number of corpus segments we try to pretranslate at the same time. You don't want this to be editable by the user, hence not putting it in huggingface.
That is correct. This is the number of source segments to buffer while generating translations. It is different from the training or inference batch size.
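As a rough illustration of that buffering (a minimal sketch with hypothetical names; not the actual job code):

```python
from itertools import islice
from typing import Iterable, Iterator, List


def buffer_segments(segments: Iterable[str], size: int = 1024) -> Iterator[List[str]]:
    """Yield successive buffers of source segments to pretranslate together.

    The buffer size only controls how many segments are pulled from the
    corpus at a time; the model applies its own training/inference batch
    size internally.
    """
    it = iter(segments)
    while True:
        buffer = list(islice(it, size))
        if not buffer:
            return
        yield buffer


# Hypothetical usage:
# for chunk in buffer_segments(source_corpus):
#     translations = engine.translate_batch(chunk)
```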
tests/jobs/test_nmt_engine_build_job.py
line 37 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Why would this need to be 100 here? (See the other comment on pretranslation_batch_size.)
It is mostly an arbitrary number for testing.
Previously, ddaspit (Damien Daspit) wrote…
I could go either way - ok.
Previously, ddaspit (Damien Daspit) wrote…
ok.
Previously, ddaspit (Damien Daspit) wrote…
ok.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @ddaspit)