Literals with xsd:base64Binary datatype serialize incorrectly #1257

@delocalizer

Looking at the rdflib.term source, this appears to happen because the lexical form is taken from the decoded value, since no base64 encoder is defined. This contrasts with xsd:hexBinary, where the lexical form is the encoded value.

Example:

>>> from binascii import hexlify
>>> from rdflib import Literal, XSD
>>> byts = b'foo'
>>> enc = hexlify(byts)
>>> lit = Literal(enc, datatype=XSD.hexBinary)
>>> print(lit.toPython(), lit)
b'foo' 666f6f   # expected

vs

>>> from base64 import b64encode
>>> enc = b64encode(byts)
>>> lit = Literal(enc, datatype=XSD.base64Binary)
>>> print(lit.toPython(), lit)
b'foo' foo  # unexpected

This means, for instance, that a Graph containing this literal in the object position actually serializes as:

@prefix ns1: <http://example.com/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ns1:foo ns1:bar "foo"^^xsd:base64Binary,

when I expect:

@prefix ns1: <http://example.com/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

ns1:foo ns1:bar "Zm9v"^^xsd:base64Binary,
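The difference between the two outputs can be checked with the standard library alone. The sketch below uses no rdflib code; the variable names are illustrative. It shows that "Zm9v" decodes back to the original bytes, while the string "foo" that actually gets serialized is not even a valid base64 lexical form.

```python
from base64 import b64decode, b64encode
from binascii import Error as Base64Error

value = b"foo"

# The lexical form expected in the serialized output.
expected_lexical = b64encode(value).decode("ascii")

# The expected lexical form decodes back to the original bytes.
round_trip = b64decode(expected_lexical, validate=True)

# The lexical form that is actually serialized is the raw value itself.
try:
    b64decode("foo", validate=True)
    actual_is_valid = True
except Base64Error:
    # "foo" has three characters, so it fails the base64 padding check.
    actual_is_valid = False

print(expected_lexical, round_trip, actual_is_valid)
```

Other byte values may happen to look like valid base64, in which case a consumer would silently decode them to the wrong bytes rather than fail.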

I can work around this by using rdflib.term.bind to handle this datatype in a custom fashion, but I believe the fix is easy enough.
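As a rough illustration of why the fix looks small, here is a toy model of the behaviour described above. It is not rdflib's actual code: `DECODERS`, `ENCODERS`, and `normalized_lexical` are hypothetical names, and the assumption is simply that a literal is decoded to a Python value and then re-encoded only if an encoder is registered for its datatype.

```python
from base64 import b64decode, b64encode
from binascii import hexlify, unhexlify

# Hypothetical stand-ins for the lexical -> Python and Python -> lexical tables.
DECODERS = {"hexBinary": unhexlify, "base64Binary": b64decode}
ENCODERS = {"hexBinary": hexlify}  # assumed: no encoder registered for base64Binary


def normalized_lexical(lexical: bytes, datatype: str) -> str:
    """Decode the input, then re-encode it if an encoder is registered."""
    value = DECODERS[datatype](lexical)
    encoder = ENCODERS.get(datatype)
    # Without an encoder, the decoded bytes themselves become the lexical form.
    lexical_out = encoder(value) if encoder else value
    return lexical_out.decode("ascii")


hex_form = normalized_lexical(hexlify(b"foo"), "hexBinary")
before = normalized_lexical(b64encode(b"foo"), "base64Binary")

# Registering the missing encoder is the kind of one-line change suggested here.
ENCODERS["base64Binary"] = b64encode
after = normalized_lexical(b64encode(b"foo"), "base64Binary")

print(hex_form, before, after)
```

In this model the hexBinary path yields "666f6f" throughout, while the base64Binary path yields "foo" before the encoder is registered and "Zm9v" afterwards, matching the observed and expected outputs in the examples above.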
