-
-
Notifications
You must be signed in to change notification settings - Fork 3
Marker behavior metadata #207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… to headers when stripping paragraphs
In general, could you elaborate on this issue: Why is not possible for
At the moment in Serval, we just don't pass the place markers handler when we want the 'end' behavior. Do you see any issues with that? I'm indifferent in general in regard to whether we add an option in the place marks handler or just don't pass that handler in the 'end' case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, the marked_for_removal
property of a marker does align with whether or not we want to place that marker (and in fact always does for style markers, as far as I know). The exception to this is paragraph markers that follow headers, which are always marked_for_removal=False
(because they are essentially the end marker for the header). So, in the case that we are not placing paragraph markers, where all other paragraph markers are marked_for_removal=True
, we need need to look at the actual marker placement setting.
There is no issue with not passing the place markers handler when we are only concerned with one type of marker behavior, as is currently the case because we officially support placing paragraph markers but not yet style markers. However, once we are handling multiple types of markers, it becomes an issue. If, for example, we want to place style markers but not paragraph markers, we need to create a place markers handler, which creates the environment for the bug I described above. Similarly, if we want to place style markers and put paragraph markers at the ends of verses, we would create the handler. But once we're in the handler, all of the paragraph markers will have marked_for_removal=False
, which signals to the current code that they need to be placed, when in fact we don't want them to be.
Reviewable status: 0 of 2 files reviewed, all discussions resolved (waiting on @ddaspit and @Enkidu93)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slight tangent:
An alternative approach to these problems is to create separate place markers handler instances for each type of marker. Since the tokenizations/alignments are generated beforehand and the placement of each marker is determined independent of other markers, it would not change the behavior of the algorithm, and then we would not have to consider the implications of each intersection of marker behaviors. The downside of this would be if we developed an alternate or improved algorithm in the future that does not place markers independently of one another, it would potentially not be implementable with this setup.
For example, if we wanted to place all paragraph markers first and then place style markers with the constraint that they must be placed in the same paragraph they originated in, that would be fine, but if we wanted to place markers left-to-right and say that each marker must come after the one before it (in the linear order), that would not be possible to do.
Reviewable status: 0 of 2 files reviewed, all discussions resolved (waiting on @ddaspit and @Enkidu93)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer this approach rather than creating separate handler classes.
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)
… to headers when stripping paragraphs (#207)
This adds paragraph and style marker behavior to the metadata for marker placement and fixes the bug related to headers when stripping paragraphs. This will also enable us to have the third option for paragraph markers, which is to move them to the end of the verse.
A couple of questions I have about this:
style_behavior
to the metadata for completeness, it's not actually necessary, i.e. the desired behavior for style markers always matches themarked_for_removal
property. Should I remove it? I don't forsee it ever being needed, but I could be wrong.update_usfm
(so they aren't removed).After this is in I would like to do another minor release so I can add the "end of verse" option in silnlp.
Closes #206
This change is