Description
JabRef 5.2--2020-11-26--f1a2fa7
Windows 10 10.0 amd64
Java 14.0.2
- Mandatory: I have tested the latest development version from http://builds.jabref.org/master/ and the problem persists
Summary
JabRef produces fatal exceptions for files containing non-DOI-related strings such as 10:51 (a timestamp) and 10/B(C)/15 (an arbitrary designation/ID).
The ShortDOI parsing subsystem was improved in #6920 to fix failure cases, but it appears there are additional cases and strings (probably an arbitrarily high number) that produce fatal exceptions.
Given that arbitrary documents can contain arbitrary strings, I suspect fixed pattern matching is unlikely to be sustainable in the long term, particularly if the behaviour for failing cases remains a fatal exception from which the user must manually recover (i.e. identify the document containing the string and exclude it from import).
I propose that the behaviour be changed to fall through rather than fail. If it is desirable to preserve the failure information, the affected files/entries could be marked with a note or status indicating that parsing produced a null result, though I'm not sure that is particularly valuable.
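To illustrate the proposed fall-through, here is a minimal, self-contained sketch. It is not JabRef's actual code: the class `DoiScanSketch`, both regular expressions, and the method names are assumptions made up for this example. A loose pattern picks up candidates (including 10:51 and 10/B(C)/15), a strict check rejects them with an `IllegalArgumentException` much like the one in the logs below, and the scan catches that exception and moves on instead of aborting the import.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not JabRef's implementation.
final class DoiScanSketch {

    // Assumption: a deliberately loose candidate pattern, anything starting with "10".
    private static final Pattern CANDIDATE = Pattern.compile("\\b10\\S+");

    // Assumption: a simplified strict DOI shape ("10.", 4-9 digits, "/", suffix).
    private static final Pattern STRICT = Pattern.compile("10\\.\\d{4,9}/\\S+");

    // Mimics a constructor that rejects invalid input by throwing.
    static String parseStrict(String candidate) {
        if (!STRICT.matcher(candidate).matches()) {
            throw new IllegalArgumentException(candidate + " is not a valid DOI/Short DOI.");
        }
        return candidate;
    }

    // Fall-through scan: an invalid candidate is skipped, never propagated.
    static Optional<String> findInText(String text) {
        if (text == null) {
            return Optional.empty();
        }
        Matcher matcher = CANDIDATE.matcher(text);
        while (matcher.find()) {
            try {
                return Optional.of(parseStrict(matcher.group()));
            } catch (IllegalArgumentException e) {
                // Not a DOI after all (e.g. a timestamp); try the next candidate.
            }
        }
        return Optional.empty(); // nothing usable found; the import can continue
    }

    public static void main(String[] args) {
        System.out.println(findInText("Meeting at 10:51 today"));      // Optional.empty
        System.out.println(findInText("Ref 10/B(C)/15 rev A"));        // Optional.empty
        System.out.println(findInText("at 10:51 see 10.1000/182 ok")); // Optional[10.1000/182]
    }
}
```

With this shape, a file that contains only non-DOI strings simply yields no DOI, and a caller that wants to keep the failure information could log or annotate the entry inside the `catch` block.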
Steps to reproduce the behavior
- Prepare local PDF files whose contents include strings that produce exceptions (see below)
- Create New library
- Run Tools -> Search for unlinked local files
- Browse to folder containing local files -> Scan -> Import
Log Files
Log File
java.lang.IllegalArgumentException: 10/B(C)/15 is not a valid DOI/Short DOI.
at [email protected]/org.jabref.model.entry.identifier.DOI.<init>(Unknown Source)
at [email protected]/org.jabref.model.entry.identifier.DOI.findInText(Unknown Source)
at [email protected]/org.jabref.logic.importer.fileformat.PdfContentImporter.importDatabase(Unknown Source)
Log File
java.lang.IllegalArgumentException: 10:51 is not a valid DOI/Short DOI.
at [email protected]/org.jabref.model.entry.identifier.DOI.<init>(Unknown Source)
at [email protected]/org.jabref.model.entry.identifier.DOI.findInText(Unknown Source)
at [email protected]/org.jabref.logic.importer.fileformat.PdfContentImporter.importDatabase(Unknown Source)