Conversation

@ddaspit
Contributor

@ddaspit ddaspit commented Dec 13, 2023

@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (5203d87) 86.82% compared to head (8b9234e) 86.82%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #88   +/-   ##
=======================================
  Coverage   86.82%   86.82%           
=======================================
  Files         226      226           
  Lines       13629    13629           
=======================================
  Hits        11833    11833           
  Misses       1796     1796           

☔ View full report in Codecov by Sentry.

@johnml1135
Collaborator

machine/jobs/settings.yaml line 4 at r1 (raw file):

  model_type: huggingface
  data_dir: ~/machine
  pretranslation_batch_size: 1024

So, the batch size is really pretranslation_batch_size, which is the number of corpus segments we try to pretranslate at the same time. You don't want this to be editable by the user, hence not putting it under huggingface.

@johnml1135
Collaborator

machine/jobs/settings.yaml line 18 at r1 (raw file):

      fp16: true
      save_strategy: no
      max_steps: 20000

max_steps goes under train_params, which is more appropriate. I agree.
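
As an aside, a minimal sketch of how a nested train_params group would typically feed HuggingFace training arguments (the wiring and the output_dir value below are assumptions for illustration, not code from this PR):

    from transformers import Seq2SeqTrainingArguments

    # Values copied from the settings snippet above; output_dir is a placeholder.
    train_params = {"fp16": True, "save_strategy": "no", "max_steps": 20000}
    training_args = Seq2SeqTrainingArguments(output_dir="out", **train_params)
    print(training_args.max_steps)  # 20000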

@johnml1135
Collaborator

machine/jobs/huggingface/hugging_face_nmt_model_factory.py line 31 at r1 (raw file):

        # Use "max_steps" from root for backward compatibility
        if "max_steps" in self._config.huggingface:
            args["max_steps"] = self._config.huggingface.max_steps

So, let's see if I follow this logic. A person can set max_steps from the root and it will still work, but the real place to set it is huggingface.train_params.max_steps. The issue is that you took my update but look for max_steps in self._config.huggingface instead of just self._config. Am I missing something?
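
For reference, a dict-based sketch of the lookup order being discussed (the helper name and fallback order are assumptions for illustration, not the PR's actual code):

    from typing import Optional

    # Hypothetical helper showing the two places max_steps can live.
    def resolve_max_steps(config: dict) -> Optional[int]:
        hf = config.get("huggingface", {})
        # Preferred location going forward.
        if "max_steps" in hf.get("train_params", {}):
            return hf["train_params"]["max_steps"]
        # Legacy location at the root of the huggingface group, kept for backward compatibility.
        if "max_steps" in hf:
            return hf["max_steps"]
        return None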

@johnml1135
Collaborator

machine/jobs/build_nmt_engine.py line 48 at r1 (raw file):

        SETTINGS.update(args)
        model_type = cast(str, SETTINGS.model_type).lower()

So, let's see if I get this. In settings.yaml, there is a model_type parameter, here huggingface. That is taken from settings.yaml without being altered by any input arguments. Then, any build_options given to Serval end up under the current model_type, as seen a few lines below. This seems sensible enough, except: why not rename huggingface to build_options in settings.yaml? The model_type is already specified, and it would be clearer in the documentation why the build_options map to ... build_options.
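
For illustration, a rough dict-based sketch of the merge being described (the real SETTINGS object in build_nmt_engine.py is more involved; the names and values below are assumptions):

    # settings.yaml loaded into a plain dict (values abbreviated).
    settings = {
        "model_type": "huggingface",
        "huggingface": {"train_params": {"max_steps": 20000}},
    }
    # build_options passed in from Serval land under the section named after
    # model_type, so user overrides only touch that model's hyperparameters.
    build_options = {"train_params": {"max_steps": 5000}}
    model_type = settings["model_type"].lower()
    settings[model_type].update(build_options)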

@johnml1135
Collaborator

machine/jobs/huggingface/hugging_face_nmt_model_factory.py line 31 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

So, let's see if I follow this logic. A person can set max_steps from the root and it will still work, but the real place to set it is huggingface.train_params.max_steps. The issue is that you took my update but look for max_steps in self._config.huggingface instead of just self._config. Am I missing something?

I read over how the settings are imported and I see what you are doing now. My main remaining concern is the one above about renaming the huggingface group to just build_options.

@johnml1135
Collaborator

machine/jobs/nmt_engine_build_job.py line 89 at r1 (raw file):

            current_inference_step = 0
            phase_progress(ProgressStatus.from_step(current_inference_step, inference_step_count))
            batch_size = self._config["pretranslation_batch_size"]

If pretranslation_batch_size will not be changed in any meaningful way for any meaningful reason, why include it? It isn't the batch size that the GPU uses for pretranslations but the number of segments that are pulled from the source and handled at one time. Why not just use a constant of 1024 and be done with it?
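
For illustration, a small sketch of the kind of buffering this setting controls (the function name and types are assumptions, not the job's actual code):

    from itertools import islice
    from typing import Iterable, Iterator, List

    # Pull source segments in chunks of pretranslation_batch_size and hand each
    # chunk to the engine; this is independent of the GPU batch size used inside the model.
    def batch_segments(segments: Iterable[str], batch_size: int = 1024) -> Iterator[List[str]]:
        it = iter(segments)
        while batch := list(islice(it, batch_size)):
            yield batch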

@johnml1135
Collaborator

tests/jobs/test_nmt_engine_build_job.py line 37 at r1 (raw file):

class _TestEnvironment:
    def __init__(self, decoy: Decoy) -> None:
        config = {"src_lang": "es", "trg_lang": "en", "pretranslation_batch_size": 100}

Why would this need to be 100 here? (See the other comment on pretranslation_batch_size.)

Collaborator

@johnml1135 johnml1135 left a comment

Reviewed 7 of 7 files at r1, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @ddaspit)

Contributor Author

@ddaspit ddaspit left a comment

Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @johnml1135)


machine/jobs/build_nmt_engine.py line 48 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

So, let's see if I get this. In settings.yaml, there is a model_type parameter, here huggingface. That is taken from settings.yaml without being altered by any input arguments. Then, any build_options given to Serval end up under the current model_type, as seen a few lines below. This seems sensible enough, except: why not rename huggingface to build_options in settings.yaml? The model_type is already specified, and it would be clearer in the documentation why the build_options map to ... build_options.

If we support multiple model types, we can include the parameters for the different model types in the same settings file.
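
A hedged sketch of what that could look like, with one section per model type in the same settings file (the thot section and all key names below are made-up examples):

    # One settings dict holding parameters for more than one model type;
    # the job picks the group matching model_type.
    settings = {
        "model_type": "huggingface",
        "huggingface": {"parent_model_name": "facebook/nllb-200-distilled-600M"},
        "thot": {"word_alignment_model_type": "hmm"},
    }
    model_params = settings[settings["model_type"].lower()]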


machine/jobs/nmt_engine_build_job.py line 89 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

If pretranslation_batch_size will not be changed in any meaningful way for any meaningful reason, why include it? It isn't the batch size that the GPU uses for pretranslations but the number of segments that are pulled from the source and handled at one time. Why not just use a constant of 1024 and be done with it?

It could be a constant. I just made it a setting, so it would be slightly easier to tweak if we need to.


machine/jobs/settings.yaml line 4 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

So, the batch size is really pretranslation_batch_size, which is the number of corpus segments we try to pretranslate at the same time. You don't want this to be editable by the user, hence not putting it under huggingface.

That is correct. This is the number of source segments to buffer while generating translations. It is different from the training or inference batch size.


tests/jobs/test_nmt_engine_build_job.py line 37 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Why would this need to be 100 here? (See the other comment on pretranslation_batch_size.)

It is mostly an arbitrary number for testing.

@johnml1135
Collaborator

machine/jobs/build_nmt_engine.py line 48 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

If we support multiple model types, we can include the parameters for the different model types in the same settings file.

I could go either way - ok.

@johnml1135
Collaborator

machine/jobs/nmt_engine_build_job.py line 89 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

It could be a constant. I just made it a setting, so it would be slightly easier to tweak if we need to.

ok.

@johnml1135
Collaborator

tests/jobs/test_nmt_engine_build_job.py line 37 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

It is mostly an arbitrary number for testing.

ok.

Collaborator

@johnml1135 johnml1135 left a comment

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @ddaspit)

@johnml1135 johnml1135 merged commit a105036 into main Dec 14, 2023
@ddaspit ddaspit deleted the build-options branch January 3, 2024 19:16

Development

Successfully merging this pull request may close these issues.

Restrict build options to only update model hyperparameters
