Regex of University Institution too broad for citekey generation #6942

@TheDom42

JabRef 5.2--2020-09-22--129c36e
Windows 10 10.0 amd64 
Java 15
  • Mandatory: I have tested the latest development version from http://builds.jabref.org/master/ and the problem persists

  • Steps to reproduce the behavior:

    • Add an entry whose author is an institution (other than a university) with a name that contains a word beginning with "uni" (see below for an example)
    • Enclose the institution in {}
    • Let JabRef generate the citekey for the institution with default settings
  • Expected Behavior
    Generate the citekey as a regular institution abbreviation

  • Observed Behavior
    The citekey is generated as a "University" citekey: it begins with "Uni", followed by a short form of the institution name

  • Alternative Behavior
    Use the shortauthor field if it is present for that entry. biblatex-apa, for example, uses this field as the institution abbreviation, so users who want this can add the shortauthor field to the respective entry types.
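
The alternative behavior could look roughly like the following sketch. It is only an illustration of the idea, not JabRef code: the class `ShortAuthorKeyDemo`, the method `keyPrefix`, and the plain `Map` of fields are all hypothetical names made up for this example.

```java
import java.util.Map;

public class ShortAuthorKeyDemo {

    /**
     * Hypothetical helper: prefer the shortauthor field as the key prefix,
     * otherwise fall back to whatever the existing generator produced.
     */
    static String keyPrefix(Map<String, String> fields, String generatedFallback) {
        String shortAuthor = fields.get("shortauthor");
        if (shortAuthor != null && !shortAuthor.trim().isEmpty()) {
            // Drop protective braces and whitespace, e.g. "{EASA}" -> "EASA"
            return shortAuthor.replaceAll("[{}\\s]", "");
        }
        return generatedFallback;
    }

    public static void main(String[] args) {
        Map<String, String> easa = Map.of(
                "author", "{European Union Aviation Safety Agency}",
                "shortauthor", "EASA");
        // Prints "EASA2019"
        System.out.println(keyPrefix(easa, "UniEuropeanAviationSafetyAgency") + "2019");
    }
}
```

With such a fallback, entries without a shortauthor field would keep their current keys, so the change would only affect entries where the user has supplied an abbreviation explicitly.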

Longer description:
I added the following two entries and let JabRef generate the citekeys automatically.

% Encoding: UTF-8

@Report{ICAO2013,
  author      = {{International Civil Aviation Organization}},
  date        = {2013},
  institution = {{International Civil Aviation Organization}},
  location    = {Montréal, Quebec},
  publisher   = {International Civil Aviation Organization},
  shortauthor = {ICAO},
  title       = {Foo},
  type        = {resreport},
}

@Report{UniEuropeanAviationSafetyAgency2019,
  author       = {{European Union Aviation Safety Agency}},
  date         = {2019-12-18},
  institution  = {{European Union Aviation Safety Agency}},
  title        = {Bar},
  type         = {resreport},
  shortauthor  = {EASA},
  organization = {{European Union Aviation Safety Agency}},
}
@Comment{jabref-meta: databaseType:biblatex;}

I did not understand why the automatic generation worked for the ICAO entry but not for the EASA entry. (I would have been fine with EUASA as the citekey, since I was aware that the automatic abbreviation would include the U from the initials of the name.) Instead, a completely different key was generated.
I tried to pinpoint the issue and stumbled upon this regex in the key generation for institutions enclosed in braces.

private enum Institution {
    SCHOOL,
    DEPARTMENT,
    UNIVERSITY,
    TECHNOLOGY;

    /**
     * Matches "uni" at the start of a string or after a space, case insensitive
     */
    private static final Pattern UNIVERSITIES = Pattern.compile("^uni.*", Pattern.CASE_INSENSITIVE);

To me, the regex seems a bit broad, since it matches any word that starts with "uni" (such as "Union"), but maybe this was on purpose. If so, I would be happy to have a setting that uses the optional shortauthor field if it is present.
Unfortunately, I'm not skilled enough to implement a fix in a PR, but I wanted to point out that this might be an issue.
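
To illustrate the suspected cause, here is a minimal sketch. It assumes the pattern is tested against each whitespace-separated word of the institution name, which I have not verified in the JabRef source. The class `UniPatternDemo`, the method `anyWordMatches`, and the `NARROWER` pattern are hypothetical and only meant to show the effect.

```java
import java.util.regex.Pattern;

public class UniPatternDemo {

    // Pattern copied from the excerpt above.
    static final Pattern UNIVERSITIES =
            Pattern.compile("^uni.*", Pattern.CASE_INSENSITIVE);

    // Hypothetical narrower alternative: require "univ" instead of "uni".
    static final Pattern NARROWER =
            Pattern.compile("^univ.*", Pattern.CASE_INSENSITIVE);

    // Assumption: the generator checks every word of the name individually.
    static boolean anyWordMatches(Pattern pattern, String institution) {
        for (String word : institution.trim().split("\\s+")) {
            if (pattern.matcher(word).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String easa = "European Union Aviation Safety Agency";
        String icao = "International Civil Aviation Organization";
        String uni = "University of Stuttgart";

        System.out.println(anyWordMatches(UNIVERSITIES, easa)); // true ("Union")
        System.out.println(anyWordMatches(UNIVERSITIES, icao)); // false
        System.out.println(anyWordMatches(UNIVERSITIES, uni));  // true

        System.out.println(anyWordMatches(NARROWER, easa));     // false
        System.out.println(anyWordMatches(NARROWER, uni));      // true
    }
}
```

A pattern like `^univ.*` would still cover forms such as "University", "Universität", or "Univ.", but it would no longer treat a bare "Uni" as a university, so it is a trade-off rather than a drop-in fix.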

In case someone asks: I do not use the suggested institution abbreviation mentioned here, as it conflicts with the institution abbreviation in the biblatex-apa package, which uses shortauthor. And even with an abbreviation added after the full name, the citekey is still wrong.
