Additional Short DOI fatal exception cases: java.lang.IllegalArgumentException: <string> is not a valid DOI/Short  #7127

@koobs

JabRef 5.2--2020-11-26--f1a2fa7
Windows 10 10.0 amd64 
Java 14.0.2

Summary

JabRef throws fatal exceptions when importing files that contain non-DOI strings such as 10:51 (a timestamp) and 10/B(C)/15 (an arbitrary designation/ID).

The ShortDOI parsing subsystem was improved in #6920 to fix failure cases, but there appear to be additional cases and strings (probably an arbitrarily large number) that produce fatal exceptions.

Given that arbitrary documents can contain arbitrary strings, I suspect fixed pattern matching is unsustainable in the long term, particularly if the behaviour for failing cases remains a fatal exception from which the user must manually recover (i.e. identify the document containing the string and exclude it from import).

I propose the behaviour be changed to fall through (not fail). If it is desirable not to lose the failure semantics, files/entries could be marked with a note or status indicating that parsing produced a null result, though I'm not sure that is particularly valuable.
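To illustrate the proposed fall-through behaviour, here is a minimal, self-contained sketch. It is not JabRef's actual code: the class `DoiFallThroughSketch`, its methods, and both regular expressions are hypothetical and heavily simplified. The idea is only that a candidate string failing validation yields an empty result and the scan continues, instead of an `IllegalArgumentException` aborting the import.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not JabRef's implementation.
class DoiFallThroughSketch {
    // Assumption: a simplified DOI shape, "10." + 4-9 digits + "/" + suffix.
    private static final Pattern STRICT_DOI = Pattern.compile("10\\.\\d{4,9}/\\S+");

    // Assumption: a deliberately loose candidate finder, to mimic a matcher
    // that picks up strings such as "10:51" or "10/B(C)/15".
    private static final Pattern CANDIDATE = Pattern.compile("\\b10[.:/]\\S+");

    // Strict validation that throws, mirroring the failure mode in the logs.
    static String validate(String candidate) {
        if (!STRICT_DOI.matcher(candidate).matches()) {
            throw new IllegalArgumentException(candidate + " is not a valid DOI/Short DOI.");
        }
        return candidate;
    }

    // Fall-through wrapper: an invalid candidate becomes an empty result.
    static Optional<String> parse(String candidate) {
        try {
            return Optional.of(validate(candidate));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    // Scans the text and returns the first candidate that validates, if any.
    static Optional<String> findInText(String text) {
        Matcher matcher = CANDIDATE.matcher(text);
        while (matcher.find()) {
            Optional<String> doi = parse(matcher.group());
            if (doi.isPresent()) {
                return doi;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Neither string from this report validates, so the result is empty.
        System.out.println(findInText("Meeting at 10:51, ref 10/B(C)/15"));
        // A well-formed DOI is still found.
        System.out.println(findInText("see 10.1000/182 for details"));
    }
}
```

With this shape, a PDF containing only strings like the ones above would be imported without a DOI rather than stopping the whole import. Logging the rejected candidate inside `parse` would be one way to keep the failure visible.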

Steps to reproduce the behavior

  1. Prepare local PDF files with contents that contain strings that produce exceptions (see below)
  2. Create New library
  3. Run Tools -> Search for unlinked local files
  4. Browse to folder containing local files -> Scan -> Import

Log Files

Log File
java.lang.IllegalArgumentException: 10/B(C)/15 is not a valid DOI/Short DOI.
	at [email protected]/org.jabref.model.entry.identifier.DOI.<init>(Unknown Source)
	at [email protected]/org.jabref.model.entry.identifier.DOI.findInText(Unknown Source)
	at [email protected]/org.jabref.logic.importer.fileformat.PdfContentImporter.importDatabase(Unknown Source)
Log File
java.lang.IllegalArgumentException: 10:51 is not a valid DOI/Short DOI.
	at [email protected]/org.jabref.model.entry.identifier.DOI.<init>(Unknown Source)
	at [email protected]/org.jabref.model.entry.identifier.DOI.findInText(Unknown Source)
	at [email protected]/org.jabref.logic.importer.fileformat.PdfContentImporter.importDatabase(Unknown Source)
