
Conversation

@ddaspit (Contributor) commented Oct 20, 2023


@johnml1135 (Collaborator) left a comment
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit)


machine/jobs/huggingface/hugging_face_nmt_model_factory.py line 74 at r1 (raw file):

            num_beams=self._config.huggingface.generate_params.num_beams,
            batch_size=self._config.huggingface.generate_params.batch_size,
            truncation=TruncationStrategy.LONGEST_FIRST,

Wouldn't this cause memory issues with the odd super-long segment of 2000 tokens? Why the need for the change?

@ddaspit (Contributor, Author) left a comment

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @johnml1135)


machine/jobs/huggingface/hugging_face_nmt_model_factory.py line 74 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Wouldn't this cause memory issues with the odd super-long segment of 2000 tokens? Why the need for the change?

This doesn't change the behavior for Serval at all. It simply changes the default truncation strategy for the HuggingFaceNmtEngine back to not truncate. Serval jobs still use the longest first truncation strategy. We are just now setting it in the HuggingFaceNmtModelFactory, which is the more appropriate place.
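The "longest first" strategy being discussed can be sketched in plain Python. This is a hypothetical illustration of the idea, not the actual `transformers` implementation; the function name and signature are invented for this sketch. Given a pair of token sequences whose combined length exceeds the model's limit, tokens are dropped one at a time from whichever sequence is currently longer:

```python
def truncate_longest_first(tokens_a, tokens_b, max_length):
    """Sketch of the 'longest first' truncation strategy: repeatedly
    drop the last token from the longer of the two sequences until
    the pair fits within max_length."""
    a, b = list(tokens_a), list(tokens_b)
    while len(a) + len(b) > max_length:
        if len(a) >= len(b):
            a.pop()
        else:
            b.pop()
    return a, b


# A 10-token segment paired with a 4-token segment, limited to 8 tokens
# total: only the longer segment loses tokens until both are length 4.
a, b = truncate_longest_first(list(range(10)), list(range(4)), 8)
```

This context makes the memory question above clearer: with truncation enabled, even an unusually long segment is cut down to the model's maximum length before generation, so it cannot blow up memory; with truncation disabled, it would be passed through at full length.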

@johnml1135 (Collaborator) commented

machine/jobs/huggingface/hugging_face_nmt_model_factory.py line 74 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

This doesn't change the behavior for Serval at all. It simply changes the default truncation strategy for the HuggingFaceNmtEngine back to not truncate. Serval jobs still use the longest first truncation strategy. We are just now setting it in the HuggingFaceNmtModelFactory, which is the more appropriate place.

ok.

@johnml1135 (Collaborator) commented

It looks like the tokenizer code is not mixed with this pull request.

@ddaspit ddaspit force-pushed the truncation-strategy branch from 4a4fa4d to 167f3e9 Compare October 20, 2023 18:54
@ddaspit ddaspit force-pushed the truncation-strategy branch from 167f3e9 to 4023940 Compare October 20, 2023 18:55
@codecov-commenter commented Oct 20, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Files                                                   | Coverage | Δ
...translation/huggingface/hugging_face_nmt_engine.py  | 91.19% <100.00%> | (ø)


@johnml1135 (Collaborator) left a comment

Reviewable status: 0 of 10 files reviewed, all discussions resolved

@johnml1135 johnml1135 merged commit 9188ba3 into main Oct 20, 2023
@ddaspit ddaspit deleted the truncation-strategy branch October 20, 2023 20:13
3 participants