Closed
Labels
[outdated] type: enhancement, good first issue (An issue intended for project-newcomers. Varies in difficulty.)
Description
JabRef version 5.2--2020-09-06--c0b139a on Windows 10 10.0 amd64, Java 14.0.2
- Mandatory: I have tested the latest development version from http://builds.jabref.org/master/ and the problem persists
Steps to reproduce the behavior:
- Save the following entry as a .bib file:
@Misc{TrustedSlind,
author = {Konrad Slind},
title = {Trusted Extensions of Interactive Theorem Provers: Workshop Summary},
date = {2010-08},
location = {Cambridge, England},
url = {http://www.cs.utexas.edu/users/kaufmann/itp-trusted-extensions-aug-2010/summary/summary.pdf},
}
- Open this file in JabRef
- Click on the one entry to select it
- Click Quality -> Cleanup entries / Alt+F8
- Ensure that only the first item ("Move DOIs from note and URL field to DOI field and remove http prefix") is checked
- Click OK
- Double-click on the entry and click "BibTeX source"
Note that the new source is
@Misc{TrustedSlind,
author = {Konrad Slind},
title = {Trusted Extensions of Interactive Theorem Provers: Workshop Summary},
date = {2010-08},
doi = {10/summary},
location = {Cambridge, England},
}
This URL is not a DOI link, though! Presumably this is because the matcher code at
jabref/src/main/java/org/jabref/model/entry/identifier/DOI.java
Lines 30 to 77 in ba68c09
// Regex
// (see http://www.doi.org/doi_handbook/2_Numbering.html)
private static final String DOI_EXP = ""
        + "(?:urn:)?" // optional urn
        + "(?:doi:)?" // optional doi
        + "(" // begin group \1
        + "10" // directory indicator
        + "(?:\\.[0-9]+)+" // registrant codes
        + "[/:%]" // divider
        + "(?:.+)" // suffix alphanumeric string
        + ")"; // end group \1
private static final String FIND_DOI_EXP = ""
        + "(?:urn:)?" // optional urn
        + "(?:doi:)?" // optional doi
        + "(" // begin group \1
        + "10" // directory indicator
        + "(?:\\.[0-9]+)+" // registrant codes
        + "[/:]" // divider
        + "(?:[^\\s]+)" // suffix alphanumeric without space
        + ")"; // end group \1
// Regex (Short DOI)
private static final String SHORT_DOI_EXP = ""
        + "(?:urn:)?" // optional urn
        + "(?:doi:)?" // optional doi
        + "(" // begin group \1
        + "10" // directory indicator
        + "[/:%]" // divider
        + "[a-zA-Z0-9]+"
        + ")"; // end group \1
private static final String FIND_SHORT_DOI_EXP = ""
        + "(?:urn:)?" // optional urn
        + "(?:doi:)?" // optional doi
        + "(" // begin group \1
        + "10" // directory indicator
        + "[/:]" // divider
        + "[a-zA-Z0-9]+"
        + "(?:[^\\s]+)" // suffix alphanumeric without space
        + ")"; // end group \1
private static final String HTTP_EXP = "https?://[^\\s]+?" + DOI_EXP;
private static final String SHORT_DOI_HTTP_EXP = "https?://[^\\s]+?" + SHORT_DOI_EXP;
// Pattern
private static final Pattern EXACT_DOI_PATT = Pattern.compile("^(?:https?://[^\\s]+?)?" + DOI_EXP + "$", Pattern.CASE_INSENSITIVE);
private static final Pattern DOI_PATT = Pattern.compile("(?:https?://[^\\s]+?)?" + FIND_DOI_EXP, Pattern.CASE_INSENSITIVE);
// Pattern (short DOI)
private static final Pattern EXACT_SHORT_DOI_PATT = Pattern.compile("^(?:https?://[^\\s]+?)?" + SHORT_DOI_EXP, Pattern.CASE_INSENSITIVE);
private static final Pattern SHORT_DOI_PATT = Pattern.compile("(?:https?://[^\\s]+?)?" + FIND_SHORT_DOI_EXP, Pattern.CASE_INSENSITIVE);
considers any non-space text that starts with http:// or https:// and contains 10/ anywhere after that, followed by more non-space text, to be a DOI. This is absurd. The character immediately preceding the 10, doi:, or urn: should at the very least be required to be a URL separator character such as /, :, ?, &, or =.
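The behavior and the suggested restriction can be sketched in isolation. This is a minimal sketch, not JabRef's actual code path: it assumes the cleanup ends up calling `find()` on a pattern built like `EXACT_SHORT_DOI_PATT`, which I have not verified. The class name `ShortDoiBoundaryDemo`, the helper `findShortDoi`, and the `STRICT` pattern are hypothetical names introduced for illustration only.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of JabRef.
final class ShortDoiBoundaryDemo {

    // Copied from the DOI.java excerpt quoted in this issue.
    static final String SHORT_DOI_EXP = ""
            + "(?:urn:)?"      // optional urn
            + "(?:doi:)?"      // optional doi
            + "("              // begin group 1
            + "10"             // directory indicator
            + "[/:%]"          // divider
            + "[a-zA-Z0-9]+"
            + ")";             // end group 1

    // Built the same way as EXACT_SHORT_DOI_PATT in the excerpt.
    static final Pattern CURRENT = Pattern.compile(
            "^(?:https?://[^\\s]+?)?" + SHORT_DOI_EXP, Pattern.CASE_INSENSITIVE);

    // Assumption: one possible stricter variant. The URL prefix must end in
    // a separator character (/ : ? & =) directly before urn:, doi:, or 10.
    static final Pattern STRICT = Pattern.compile(
            "^(?:https?://[^\\s]*?[/:?&=])?" + SHORT_DOI_EXP, Pattern.CASE_INSENSITIVE);

    // The URL from the entry in this issue.
    static final String ISSUE_URL =
            "http://www.cs.utexas.edu/users/kaufmann/itp-trusted-extensions-aug-2010/summary/summary.pdf";

    // Returns capture group 1 of the first match, if there is one.
    static Optional<String> findShortDoi(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // The "10" at the end of "2010" is taken as the directory indicator.
        System.out.println(findShortDoi(CURRENT, ISSUE_URL)); // Optional[10/summary]
        // With the separator requirement the URL is rejected.
        System.out.println(findShortDoi(STRICT, ISSUE_URL));  // Optional.empty
        // A short DOI link and a bare short DOI are still accepted.
        System.out.println(findShortDoi(STRICT, "https://doi.org/10/abcde")); // Optional[10/abcde]
        System.out.println(findShortDoi(STRICT, "doi:10/abcde"));             // Optional[10/abcde]
    }
}
```

The only difference between the two patterns is the `[/:?&=]` requirement at the end of the optional URL prefix, so the `10` in `aug-2010/summary` no longer qualifies because it is preceded by a digit. The same idea would presumably apply to the other patterns in the excerpt.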