-
Notifications
You must be signed in to change notification settings - Fork 21
support composite decorator #17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👍 thank you, this is looking very solid! I'm just back from leave, not sure I'll need this either but it does look very useful so I'll look at reviewing / merging this soonish (some time over the next 2 weeks). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally got the time to review this :) It looks good to me, apart from one major thing: at the moment the generated HTML is treated as strings and then converted back to nodes with DOM.parse_html
.
I think this will lead to problems and the internals of the exporter shouldn't ever do that.
For user's code, they always have the ability to use DOM.parse_html
if they want to hand-write HTML or integrate with a templating language, but it should be an escape hatch and nothing else.
text = deco.process(text, parent=element) | ||
|
||
text_children = list(DOM.parse_html( | ||
'<textnode>' + text + '</textnode>').body.children) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this part of the code because it relies on the internals of DOM
(removing textnode
on exports, being built with BeautifulSoup and having access to .body.children
) instead of an explicit public API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, if we have to do this, it's better to move the internal detail in DOM
.
But maybe it's even better not to use string parsing at all. I'll try that.
@@ -57,12 +58,19 @@ def get_style_value(self): | |||
return ''.join(sorted(rules)) | |||
|
|||
def add_node(self, element, text): | |||
|
|||
for deco in self.composite_decorators: | |||
text = deco.process(text, parent=element) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, the HTML is just stored as text in a string. I think this would be problematic if there is more than one decorator: each decorator would be matching on the HTML of the previous one instead of just the text content.
For example, say you have a URL decorator and then a naive hashtag one parsing http://example.com/#test
, the 1st decorator will create <a href="http://example.com/#test">http://example.com/#test</a>
and the second one will match on both instances of #test
.
The Draft.js decorators match on contentBlock.getText()
, is there a way we can reproduce this behavior better and only match on the text nodes rather than the HTML?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh this is really a tricky problem! Even if we only replace in the text rather than the HTML, some nested decorators would still be problematic. In your example, if the hashtag decorator replaces the #test
with a link, of course we can avoid replacing the first one because it's not in the text, but the second one should not be replaced either.
I'll look into this, as soon as I got a little time.
class URLDecorator(CompositeDecorator): | ||
""" | ||
replace url in plain text to actual html link | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that big of a deal but when we merge this I think we will want to move this to another file. Right now the distinction between "the class that is meant to be implemented" and "a particular implementation" isn't clear enough.
For comments, they should be sentences starting with a capital and ending with a period.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's better for users to understand.
# Override this to describe the pattern to be replaced | ||
SEARCH_RE = re.compile('$^') | ||
|
||
# Implemente this to generate a html string to replace the matched text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo ("Implemente").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
# append other content | ||
result += text[final:begin] | ||
# append replaced content | ||
result += self.replace(i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is just the right implementation except for one thing – it treats HTML as a string instead of using the power of DOM
/ BeautifulSoup to create proper DOM nodes. Is there a way to change this so it uses nodes instead of string concat?
My naive attempt would be something like:
result = DOM.create_document_fragment()
[...]
DOM.append_child(result, DOM.create_text_node(text[final:begin]))
DOM.append_child(result, self.replace(i))
[...]
DOM.append_child(result, DOM.create_text_node(text[final:]))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, if we can implement the decorator feature by node operation, this should be changed as well.
import re | ||
|
||
|
||
class CompositeDecorator(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the Draft.js implementation the decorator is divided into two separate things, a strategy
function and the component
to render. Here they are both grouped under one class – could you explain why you made that choice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in most cases, the strategies and the components are one-to-one, otherwise it may be too complicated, perhaps one shouldn't use composite decorator to do many-to-one replacing. So why not combine them into one class for better organizing? Plus, this keeps a similar form like Entities.
But I'm not very sure about the one-to-one thing, maybe many-to-one is useful in some cases, if that's true, we shouldn't combine them into one class.
result = '<a href="{href}"{target}>{text}</a>'.format( | ||
href=u_href, | ||
text=text, | ||
target='target="_blank"' if self.new_window else '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to encourage people to write their HTML as strings with interpolation – this should be written using DOM.create_element
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, this should be changed along with the implementation itself.
1 similar comment
I found that in draft.js, when multiple decorators match the same part of text, only the first one get rendered, this restriction prevented the complicated nested decorator problem. (https://github.com/facebook/draft-js/blob/master/src/model/decorators/__tests__/CompositeDraftDecorator-test.js#L159 ) It seems you have moved the Entities totally into example.py and test files, so I have moved the two composite decorators out there too. Since the In the |
ping @thibaudcolas ☕️ |
Sorry for the long wait, I've been fairly busy at work recently! I'll review this this weekend 👓. |
It's ok, please take your time ; ) |
I found I made a mistake :( |
I tried to rebase this onto the latest master a few weeks ago but failed hard, by trying to keep both this and the linebreak replacement. I have removed the linebreak replacement for now, dealing with it via composite decorators instead like you suggested. My progress is available here: https://github.com/springload/draftjs_exporter/tree/feature/composite-decorators. I haven't changed the underlying implementation much, but have refactored where the code gets called. Next step is to add some unit tests, and then I will be happy merging this. The one feature that is missing compared to my previous linebreak implementation is replacing linebreaks in entities. I will see whether there is a way to make that happen. |
Nice! Really looking forward to the final merging. 😄 I've thought about replacing in entities, finally I chose to leave it to the entity itself, because not all the kinds of entities want to be decorated by composite decorators, it might become so complicated. |
It's merged! It's merged! Phew, that took a while :) There are a few things you should be aware of:
|
Oh yeah! Finally! Thanks for the reminding, I'll remove the |
Hehe, happy new year to you too! Here is more info for you:
|
Sounds great! And hope draft.js would release its own 1.0 too. |
#16
I implemented the "composite decorators" myself, not sure if you need this, just post this as a suggestion, in case it's useful for your projects.
As to follow draft.js's terminology, I just use the "composite decorator" name here, think it's not very confusing.
I found it quite complicated to implement it as another kind of commands, obviously commands are not designed for this. Besides, draft.js describes it as a flexible light-weighted "scan with regex, render with component" strategy, so I think simple search and replace would be a nice way.