Speedup ntriples and nquads parsing. #1323

rchateauneu · 2021-05-18T16:16:03Z

This is the same code change as in #1317 but on a brand new branch and #1300

The performance test is made by parsing:

g.parse("orkg.nt", format="nt")

The general idea is to use a single regular expression to split each line into its three or four individual components.
nquads parsing now reuses much more of the ntriples code than before.
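To illustrate the single-regex idea, here is a minimal sketch. It is not the pattern used in this PR: the names `LINE_RE` and `split_line` are hypothetical, and the term patterns are deliberately simplified compared to the full N-Triples/N-Quads grammar (no comment handling, loose IRI and blank node rules).

```python
import re

# Simplified, hypothetical term patterns (assumptions, not the PR's regex).
IRI = r'<[^<>\s]*>'
BNODE = r'_:\S+'
LITERAL = r'"(?:[^"\\]|\\.)*"(?:\^\^<[^<>\s]*>|@[A-Za-z]+(?:-[A-Za-z0-9]+)*)?'

# One expression captures subject, predicate, object and an optional
# fourth term (the graph label of an N-Quads line).
LINE_RE = re.compile(
    rf'\s*(?P<s>{IRI}|{BNODE})'
    rf'\s+(?P<p>{IRI})'
    rf'\s+(?P<o>{IRI}|{BNODE}|{LITERAL})'
    rf'(?:\s+(?P<g>{IRI}|{BNODE}))?'
    r'\s*\.\s*$'
)


def split_line(line):
    """Return the raw (s, p, o, g) strings, with g set to None for a triple.

    Returns None when the line does not match the simplified pattern.
    """
    m = LINE_RE.match(line)
    if m is None:
        return None
    return m.group('s', 'p', 'o', 'g')
```

Because the fourth group is optional, the same expression serves both formats, which is one way the nquads path could share code with the ntriples path.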

Windows: 14.71s => 12.20s: 17% faster
Linux: 13.12s => 12.45s: 5% faster
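The percentages are the relative reduction in wall-clock time. A small sketch of that arithmetic, using a hypothetical helper `percent_faster`:

```python
def percent_faster(before, after):
    """Relative reduction in run time, as a percentage of the original time."""
    return (before - after) / before * 100.0


windows = percent_faster(14.71, 12.20)  # about 17
linux = percent_faster(13.12, 12.45)    # about 5
```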

Please note that about 15% of the time is spent in codecs.readline(), which implies UTF-8 conversions due to opening the file in "rb" mode. Handling only str instead of bytes would easily speed things up.
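The two ways of getting decoded lines can be compared with a sketch like the one below. It assumes the bytes path goes through a `codecs` stream reader, and the function names are hypothetical. It only shows that both paths yield the same lines for this sample; it makes no claim about the size of any speedup.

```python
import codecs
import io


def lines_via_codecs(raw):
    """Decode a binary stream line by line through a codecs StreamReader."""
    reader = codecs.getreader("utf-8")(raw)
    # StreamReader.readline() returns "" at end of stream.
    return list(iter(reader.readline, ""))


def lines_via_text(raw):
    """Decode the same kind of binary stream through io.TextIOWrapper."""
    text = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
    return list(text)


# Two N-Triples lines, one with a non-ASCII character, encoded as UTF-8.
SAMPLE = (
    '<http://a/s> <http://a/p> "caf\u00e9" .\n'
    '<http://a/s> <http://a/p> "x" .\n'
).encode("utf-8")
```

Timing both functions on a large file (for example with `timeit`) would show how much of the cost is attributable to the codecs reader on a given platform.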

Speedup ntriples and nquads parsing.

dcb3998

rchateauneu closed this May 18, 2021
