Closed
Description
beautifulsoup4 4.13 introduces a breaking change that affects the text processing module at /src/commoncode/text.py; see #4129.
Starting with 4.13, as_unicode(s) returns bytes instead of str, which in turn breaks is_markup(location) and is_markup_text(text) in scancode.
From the Changelog:
- UnicodeDammit.markup is now always a bytestring representing the
original markup (sans BOM), and UnicodeDammit.unicode_markup is
always the converted Unicode equivalent of the original
markup. Previously, UnicodeDammit.markup was treated inconsistently
and would often end up containing Unicode. UnicodeDammit.markup was
not a documented attribute, but if you were using it, you probably
want to switch to using .unicode_markup instead.

If UnicodeDammit(s).unicode_markup is used here instead of UnicodeDammit(s).markup, a unicode string is returned.
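A minimal sketch of that switch, not the actual code in text.py: it reads the documented `.unicode_markup` attribute, so the result should be `str` on beautifulsoup4 both before and after 4.13. The type check, the empty-input shortcut, and the plain UTF-8 fallback (used when bs4 is not installed or detection yields nothing) are assumptions added for illustration.

```python
try:
    from bs4.dammit import UnicodeDammit
except ImportError:
    # Assumption: allow the sketch to run without beautifulsoup4 installed.
    UnicodeDammit = None


def as_unicode(s):
    """Return `s` as str, whether it arrives as bytes or str."""
    if isinstance(s, str):
        return s
    if not isinstance(s, bytes):
        raise TypeError(f"expected bytes or str, got {type(s).__name__}")
    if not s:
        return ""
    if UnicodeDammit is not None:
        # .unicode_markup is the converted text; since 4.13, .markup is
        # always the original bytestring.
        text = UnicodeDammit(s).unicode_markup
        if text is not None:
            return text
    # Fallback (assumption, not from the issue): decode as UTF-8 and
    # replace undecodable bytes.
    return s.decode("utf-8", errors="replace")


# Usage: bytes input comes back as str.
result = as_unicode(b"<p>caf\xc3\xa9</p>")
print(type(result).__name__)
```

Because callers such as is_markup_text(text) work on the returned value as text, returning `str` consistently is what keeps them working.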