Do an Eflomal alignment on NMT build jobs #662

@johnml1135

Description

This is an interim solution, ahead of the longer-term work of actually integrating the Eflomal refactor into machine (#625).
It involves a number of parts:

  • After pretranslations are complete in an NMT build job in machine.py, perform an Eflomal alignment (using the command-line version; just drop the code in).
  • Return the source, the pretranslations, the tokenized versions of both, and the alignment (not weights, just the matched token indices) in the pretranslations JSON file. Have the file match the style of the word alignment jobs.
  • Add a new "phase" to the NMT builds for this alignment, and report progress if feasible (this should be coordinated with "Separate progress for fine-tuning and inferencing", #477).
  • Add an option to generate and return alignments, such as "--align-pretranslations". When enabled, the format of the resulting pretranslation file should be a superset of the existing pretranslation format (that is, the one produced without the flag set).
  • This code should be structured so that when the real Eflomal job is complete, the changes, especially at the higher levels, will be minimal.
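
A rough sketch of the flag and the "superset" record shape could look like the following. It is only an illustration under assumptions: the field names (`translation`, `source_tokens`, `target_tokens`, `alignment`) and the helpers `parse_pharaoh`, `build_record`, and `build_parser` are hypothetical and not taken from machine.py or the word alignment jobs, and it assumes the alignment arrives as a Pharaoh-style line of `i-j` index pairs (e.g. `0-0 1-2`). The real file should follow whatever the word alignment jobs already emit.

```python
import argparse
from typing import Dict, List, Optional, Sequence


def parse_pharaoh(line: str) -> List[Dict[str, int]]:
    """Parse a Pharaoh-style alignment line ("0-0 1-2") into index pairs."""
    pairs = []
    for token in line.split():
        src, trg = token.split("-")
        pairs.append({"source_index": int(src), "target_index": int(trg)})
    return pairs


def build_record(
    base: dict,
    source_tokens: Sequence[str],
    target_tokens: Sequence[str],
    alignment_line: Optional[str],
) -> dict:
    """Return a pretranslation record; with an alignment, add extra keys only.

    Existing keys in `base` are never renamed or removed, so the aligned
    record stays a superset of the unaligned one.
    """
    record = dict(base)
    if alignment_line is not None:
        # Hypothetical field names, chosen for illustration only.
        record["source_tokens"] = list(source_tokens)
        record["target_tokens"] = list(target_tokens)
        record["alignment"] = parse_pharaoh(alignment_line)
    return record


def build_parser() -> argparse.ArgumentParser:
    """Command-line parser exposing only the proposed opt-in flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--align-pretranslations",
        action="store_true",
        help="Also compute and return word alignments for pretranslations.",
    )
    return parser
```

Keeping the alignment step behind a single function that only adds keys means the later, fully integrated Eflomal job could replace how `alignment_line` is produced without touching the callers that write the file.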

Metadata

Assignees

Labels

No labels

Projects

Status

✅ Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests