Skip to content

Conversation

isaac091
Copy link
Collaborator

@isaac091 isaac091 commented May 27, 2025

Also includes some improvements to the marker placement evaluation script.

There are a few smaller things still to come from machine.py, but this implementation is strictly better than the one doing everything in silnlp, so I'm making the PR now.

Closes #704, closes #705, closes #706, closes #708, closes #713, closes #721


This change is Reviewable

@isaac091 isaac091 requested review from benjaminking and ddaspit May 27, 2025 22:21
Copy link
Collaborator

@benjaminking benjaminking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 5 files reviewed, 2 unresolved discussions (waiting on @ddaspit)


silnlp/common/translator.py line 233 at r1 (raw file):

        pb = UpdateUsfmMarkerBehavior.PRESERVE if include_paragraph_markers else UpdateUsfmMarkerBehavior.STRIP
        sb = UpdateUsfmMarkerBehavior.PRESERVE if include_style_markers else UpdateUsfmMarkerBehavior.STRIP
        eb = UpdateUsfmMarkerBehavior.PRESERVE if include_embeds else UpdateUsfmMarkerBehavior.STRIP

It would be more readable if these variables had more descriptive names.


silnlp/common/compare_usfm_structure.py line 127 at r1 (raw file):

    # No verses with markers that should be evaluated
    if len(gold_sent_toks) == 0:
        return -1, -1

I wonder if it might be better to raise an exception in this case instead of returning an invalid value? Or maybe return a Union type with one of the types indicating an invalid case? I could imagine a case where a different caller of this function is averaging the results of several calls, but doesn't know that -1 is a possible result.

Copy link
Collaborator Author

@isaac091 isaac091 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 5 files reviewed, 2 unresolved discussions (waiting on @benjaminking and @ddaspit)


silnlp/common/compare_usfm_structure.py line 127 at r1 (raw file):

Previously, benjaminking (Ben King) wrote…

I wonder if it might be better to raise an exception in this case instead of returning an invalid value? Or maybe return a Union type with one of the types indicating an invalid case? I could imagine a case where a different caller of this function is averaging the results of several calls, but doesn't know that -1 is a possible result.

Done.


silnlp/common/translator.py line 233 at r1 (raw file):

Previously, benjaminking (Ben King) wrote…

It would be more readable if these variables had more descriptive names.

Done.

Copy link
Collaborator

@benjaminking benjaminking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 0 of 5 files reviewed, all discussions resolved (waiting on @ddaspit)

@isaac091 isaac091 merged commit fb9ce60 into master May 29, 2025
1 check was pending
@isaac091 isaac091 deleted the update_block branch May 29, 2025 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment