-
-
Notifications
You must be signed in to change notification settings - Fork 3
Marker placement update block handler #175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #175 +/- ##
==========================================
+ Coverage 88.66% 88.87% +0.20%
==========================================
Files 280 282 +2
Lines 16708 16951 +243
==========================================
+ Hits 14814 15065 +251
+ Misses 1894 1886 -8 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @Enkidu93)
tests/corpora/test_place_markers_usfm_update_block_handler.py
line 10 at r1 (raw file):
parse_usfm, ) from machine.corpora.place_markers_usfm_update_block_handler import PlaceMarkersUsfmUpdateBlockHandler
We should import from the corpora
package.
machine/corpora/place_markers_usfm_update_block_handler.py
line 7 at r1 (raw file):
from ..jobs.eflomal_aligner import to_word_alignment_matrix from ..jobs.translation_file_service import PretranslationInfo from ..translation import WordAlignmentMatrix
The convention for internal imports is to import directly from the appropriate module instead of the package.
machine/corpora/place_markers_usfm_update_block_handler.py
line 14 at r1 (raw file):
class PlaceMarkersUsfmUpdateBlockHandler(UsfmUpdateBlockHandler):
This class should be exposed in the machine.corpora
package exports.
machine/corpora/place_markers_usfm_update_block_handler.py
line 16 at r1 (raw file):
class PlaceMarkersUsfmUpdateBlockHandler(UsfmUpdateBlockHandler): def __init__(self, pt_info: Sequence[PretranslationInfo]) -> None:
Can you create a separate type to hold the alignment info for a verse? I don't want to create a dependency on the DTO that we use to pass pretranslations from a build job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 3 files reviewed, 4 unresolved discussions (waiting on @ddaspit and @Enkidu93)
machine/corpora/place_markers_usfm_update_block_handler.py
line 7 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
The convention for internal imports is to import directly from the appropriate module instead of the package.
Done.
machine/corpora/place_markers_usfm_update_block_handler.py
line 14 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
This class should be exposed in the
machine.corpora
package exports.
Done.
machine/corpora/place_markers_usfm_update_block_handler.py
line 16 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
Can you create a separate type to hold the alignment info for a verse? I don't want to create a dependency on the DTO that we use to pass pretranslations from a build job.
Done. Turns out I had to anyways because importing from jobs
was causing a circular dependency once I exported the handler.
tests/corpora/test_place_markers_usfm_update_block_handler.py
line 10 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
We should import from the
corpora
package.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r2, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @Enkidu93)
machine/corpora/place_markers_usfm_update_block_handler.py
line 12 at r2 (raw file):
class AlignmentInfo(TypedDict):
I would like to rename this to something that is more closely associated with the PlaceMarkersUsfmUpdateBlockHandler
class, so that its purpose is clearer. Maybe PlaceMarkersAlignmentInfo
.
machine/corpora/place_markers_usfm_update_block_handler.py
line 14 at r2 (raw file):
class AlignmentInfo(TypedDict): refs: List[str] source_toks: List[str]
I would rename this to source_tokens
.
machine/corpora/place_markers_usfm_update_block_handler.py
line 16 at r2 (raw file):
source_toks: List[str] translation_toks: List[str] alignment: str
This can be a WordAlignmentMatrix
or a collection of AlignedWordPair
s.
machine/corpora/place_markers_usfm_update_block_handler.py
line 21 at r2 (raw file):
class PlaceMarkersUsfmUpdateBlockHandler(UsfmUpdateBlockHandler): def __init__(self, align_info: Sequence[AlignmentInfo]) -> None:
The type can be an Iterable
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 3 files reviewed, 4 unresolved discussions (waiting on @ddaspit and @Enkidu93)
machine/corpora/place_markers_usfm_update_block_handler.py
line 12 at r2 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
I would like to rename this to something that is more closely associated with the
PlaceMarkersUsfmUpdateBlockHandler
class, so that its purpose is clearer. MaybePlaceMarkersAlignmentInfo
.
Done.
machine/corpora/place_markers_usfm_update_block_handler.py
line 14 at r2 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
I would rename this to
source_tokens
.
Done.
machine/corpora/place_markers_usfm_update_block_handler.py
line 16 at r2 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
This can be a
WordAlignmentMatrix
or a collection ofAlignedWordPair
s.
Done.
machine/corpora/place_markers_usfm_update_block_handler.py
line 21 at r2 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
The type can be an
Iterable
.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! Excited to port this 😆
Reviewed 3 of 3 files at r3, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @ddaspit)
machine/corpora/place_markers_usfm_update_block_handler.py
line 14 at r2 (raw file):
Previously, isaac091 (Isaac Schifferer) wrote…
Done.
I don't know if you'd rather it were addressed in a separate PR, @ddaspit, but along the same lines, I noticed while working on sillsdev/serval#663 that source_toks
is also used in that pipeline of passing the alignments up with the pretranslations. It'd make our side cleaner if we want to use SourceTokens
in our data model if this were just called source_tokens
(and something similar for translation_toks
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93)
machine/corpora/place_markers_usfm_update_block_handler.py
line 14 at r2 (raw file):
Previously, Enkidu93 (Eli C. Lowry) wrote…
I don't know if you'd rather it were addressed in a separate PR, @ddaspit, but along the same lines, I noticed while working on sillsdev/serval#663 that
source_toks
is also used in that pipeline of passing the alignments up with the pretranslations. It'd make our side cleaner if we want to useSourceTokens
in our data model if this were just calledsource_tokens
(and something similar fortranslation_toks
).
Yeah, let's go ahead and make the change now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit)
machine/corpora/place_markers_usfm_update_block_handler.py
line 14 at r2 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
Yeah, let's go ahead and make the change now.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r4, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @isaac091)
Can one of you do a minor release so I can start working on plugging it into silnlp? Thanks! |
Yeah, I can do that: #177 |
* Marker placement update block handler * Refactor marker placement handler, small bug fixes * Extend and clean up tests, more code cleanup * Fix imports, use separate AlignmentInfo type * Adjust (PlaceMarkers)AlignmentInfo type * 'toks' --> 'tokens'
This change is