Improve CFF import/export and craft a round-trip test #10995
Merged
44 commits
f437d58
issue #10993 - feat: added ability to parse preferred-citation field …
jeanprbt b5298df
issue #10993 - feat: added all fields of JabRef/CITATION.cff to CffIm…
jeanprbt a6b62e1
issue #10993 - feat: rewrote CffExporter to parse Software, Dataset t…
jeanprbt cd94d33
issue #10993 - feat: added keywords and unknown fields support
jeanprbt ca0f887
issue #10993 - feat: added round-trip test
jeanprbt 0b1b578
issue #10993 - doc: updated CHANGELOG.md
jeanprbt d32d26f
Merge branch 'JabRef:main' into issue/10993
jeanprbt 56bf7e7
Convert RemoveBracesFormatterTest to @ParameterizedTest (#11033)
koppor c4b2328
Importing of BibDesk Groups and Linked Files (#10968)
Frequinzy 57f8a63
Speed up failure reporting (#11030)
koppor 7a4be6d
Fixes Zotero file handling for absolute paths (#11038)
Siedlerchr 7abf13d
Change copy-paste function to handle string constants (follow up PR) …
Siedlerchr 9587520
Bump gittools/actions from 0.13.4 to 1.1.1 (#11039)
dependabot[bot] 1ec6a6e
Bump com.googlecode.plist:dd-plist from 1.23 to 1.28 (#11040)
dependabot[bot] 930a9b4
Bump org.apache.pdfbox:xmpbox from 3.0.1 to 3.0.2 (#11041)
dependabot[bot] 5858598
Bump com.dlsc.gemsfx:gemsfx from 2.2.0 to 2.4.0 (#11044)
dependabot[bot] 7cb8885
Bump org.apache.pdfbox:fontbox from 3.0.1 to 3.0.2 (#11042)
dependabot[bot] 342cb24
Keep enclosing braces of authors (#11034)
koppor 5ab2a81
Improve citation relations (#11016)
ror3d 7a269d4
issue #10993 - doc: updated CHANGELOG.md
jeanprbt 8a8434a
Merge branch 'main' into issue/10993
jeanprbt 008472b
fix: fixed unit tests not passing due to name changes in Author inter…
jeanprbt 6f925ec
feat: changed CFFExporter to use YAML library snakeyaml instead (#10995)
jeanprbt 5a60aff
feat: added support for references and ALL possible CFF fields in imp…
jeanprbt 8fbdf26
Merge branch 'main' into issue/10993
jeanprbt 5e697a2
fix: added requested changes (#10995)
jeanprbt 88c42b8
fix: task rewriteDryRun fixed to pass by removing test in BibEntryTest
jeanprbt e1b1665
Merge branch 'main' into issue/10993
jeanprbt 9271368
refactor: deleted useless methods in CffImporter (#10995)
jeanprbt 69245be
doc: added decision MADR document for cff export (#10995)
jeanprbt ad2d600
feat: add a cites or related relationship between imported entries in…
jeanprbt 6978078
Merge branch 'main' into issue/10993
jeanprbt ca9c0dc
doc: updated MADR decision document for cff export to pass markdownli…
jeanprbt 359237d
fix: fixed round-trip test to use mock citatioKeyPatternPreferences c…
jeanprbt a8518b7
fix: fixed MADR document for CFF export decision to pass Jekyll CI ch…
jeanprbt 0264c03
fix: fixed requested changes (#10995)
jeanprbt c4bc13c
feat: finished CFFExporter logic and crafted working round-trip test …
jeanprbt de27eef
Merge branch 'main' into issue/10993
jeanprbt 2450c80
fix: fixed typos in MADR decision doc for CFF export and refactore Im…
jeanprbt 8d72c5f
Some code beautification
koppor bf9ff8b
Use existing method getEntryLinkList
koppor 447632b
Use getEntryLinkList
koppor c43d14a
Use JabRef's Date class for parsing
koppor 60904da
Fix indentation in new line
calixtus
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| --- | ||
| nav_order: 28 | ||
| parent: Decision Records | ||
| --- | ||
|
|
||
| <!-- we need to disable MD025, because we use the different heading "ADR Template" in the homepage (see above) than it is foreseen in the template --> | ||
| <!-- markdownlint-disable-next-line MD025 --> | ||
| # Exporting multiple entries to CFF | ||
|
|
||
| ## Context and Problem Statement | ||
|
|
||
| The need for an [exporter](https://github.com/JabRef/jabref/issues/10661) to the [CFF format](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md) raised the following issue: How to export multiple entries at once? The Citation File Format (CFF) is intended to make software and datasets citable. A CFF file should contain one "main" entry of type `software` or `dataset`, a possible preferred citation and/or several references of any type. | ||
|
|
||
| ## Decision Drivers | ||
|
|
||
| * Make exported files compatible with official CFF tools | ||
| * Make exporting process logical for users | ||
|
|
||
| ## Considered Options | ||
|
|
||
| * When exporting: | ||
| * Export non-`software` entries with dummy topmost `software` and entries as `preferred-citation` | ||
| * Export non-`software` entries with dummy topmost `software` and entries as `references` | ||
| * Forbid exporting multiple entries at once | ||
| * Forbid exporting more than one software entry at once | ||
| * Export entries in several files (i.e. one / file) | ||
| * Export several `software` entries with one of them topmost and all others as `references` | ||
| * Export several `software` entries with a dummy topmost `software` element and all others as `references` | ||
| * When importing: | ||
| * Only create one entry / file, even if there is a `preferred-citation` or `references` | ||
| * Add a JabRef `cites` relation from `software` entry to its `preferred-citation` | ||
| * Add a JabRef `cites` relation from `preferred-citation` entry to the main `software` entry | ||
| * Separate `software` entries from their `preferred-citation` or `references` | ||
|
|
||
| ## Decision Outcome | ||
|
|
||
| The decision outcome is the following. | ||
|
|
||
| * When exporting, JabRef behaves differently depending on the types of the selected entries. | ||
| * If multiple non-`software` entries are selected, the exporter uses the `references` field with a dummy topmost `software` element. | ||
| * If several entries including one `software` or `dataset` entry are selected, the exporter uses that entry as the topmost element and the others as `references`, adding a `preferred-citation` if the topmost entry has a `cites` field. | ||
| * If several entries including several `software` ones are selected, the exporter uses a dummy topmost element, and the selected entries are exported as `references`. The `cites` or `related` fields won't be exported in this case. | ||
| * JabRef will not handle `cites` or `related` fields for non-`software` elements. | ||
| * When importing, JabRef will create several entries: one main entry for the `software` and other entries for the potential `preferred-citation` and `references` fields. JabRef will link the main entry to the preferred citation using a `cites` field on the main entry, and will link the main entry to the references using a `related` field on the main entry. | ||
|
|
||
| ### Positive Consequences | ||
|
|
||
| * Exported results comply with CFF format | ||
| * The export process is "logical": a user who exports multiple entries to CFF might find it clear that they are all marked as `references` | ||
| * Importing a CFF file and then exporting the "main" (software) created entry is consistent and will produce the same result | ||
|
|
||
| ### Negative Consequences | ||
|
|
||
| * Importing a CFF file and then exporting one of the entries created from `preferred-citation` or `references` won't result in the same file (i.e., the exported file will contain a dummy topmost `software` instead of the actual `software` that was imported) | ||
| * `cites` and `related` fields of non-`software` entries are not supported |
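
The export rules in the decision outcome can be sketched in plain Java without any JabRef types. The following is a minimal illustration, not the exporter's implementation: `Entry`, `Selection`, and `select` are hypothetical names, entry types are plain strings, and the handling of `cites` and `related` links is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the export decision described in the ADR.
final class CffExportSelectionSketch {

    // Stand-in for a bibliography entry: only a citation key and a type name.
    record Entry(String citationKey, String type) {}

    // Result of the decision: the topmost element, whether it is a dummy, and the references.
    record Selection(Entry main, boolean mainIsDummy, List<Entry> references) {}

    static boolean isSoftwareOrDataset(Entry entry) {
        return entry.type().equals("software") || entry.type().equals("dataset");
    }

    static Selection select(List<Entry> selected) {
        List<Entry> candidates = selected.stream()
                                         .filter(CffExportSelectionSketch::isSoftwareOrDataset)
                                         .toList();
        if (candidates.size() == 1) {
            // Exactly one software/dataset entry: it becomes the topmost element,
            // all other selected entries become references.
            Entry main = candidates.get(0);
            List<Entry> references = new ArrayList<>(selected);
            references.remove(main);
            return new Selection(main, false, references);
        }
        // Zero or several software/dataset entries: use a dummy topmost software
        // element and export every selected entry as a reference.
        return new Selection(new Entry("", "software"), true, List.copyOf(selected));
    }
}
```

Calling `select` with one `article` and one `software` entry yields the software entry as the topmost element and the article as the only reference. Calling it with two `software` entries yields a dummy topmost element and both entries as references.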
263 changes: 263 additions & 0 deletions
src/main/java/org/jabref/logic/exporter/CffExporter.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,263 @@ | ||
| package org.jabref.logic.exporter; | ||
|
|
||
| import java.io.FileWriter; | ||
| import java.io.IOException; | ||
| import java.nio.charset.StandardCharsets; | ||
| import java.nio.file.Path; | ||
| import java.util.ArrayList; | ||
| import java.util.HashMap; | ||
| import java.util.LinkedHashMap; | ||
| import java.util.List; | ||
| import java.util.Map; | ||
| import java.util.Objects; | ||
| import java.util.Optional; | ||
|
|
||
| import org.jabref.logic.util.StandardFileType; | ||
| import org.jabref.model.database.BibDatabaseContext; | ||
| import org.jabref.model.entry.Author; | ||
| import org.jabref.model.entry.AuthorList; | ||
| import org.jabref.model.entry.BibEntry; | ||
| import org.jabref.model.entry.Date; | ||
| import org.jabref.model.entry.field.BiblatexSoftwareField; | ||
| import org.jabref.model.entry.field.Field; | ||
| import org.jabref.model.entry.field.StandardField; | ||
| import org.jabref.model.entry.field.UnknownField; | ||
| import org.jabref.model.entry.types.EntryType; | ||
| import org.jabref.model.entry.types.StandardEntryType; | ||
|
|
||
| import org.yaml.snakeyaml.DumperOptions; | ||
| import org.yaml.snakeyaml.Yaml; | ||
|
|
||
| public class CffExporter extends Exporter { | ||
| // Fields that are taken 1:1 from BibTeX to CFF | ||
| public static final List<String> UNMAPPED_FIELDS = List.of( | ||
| "abbreviation", "collection-doi", "collection-title", "collection-type", "commit", "copyright", | ||
| "data-type", "database", "date-accessed", "date-downloaded", "date-published", "department", "end", | ||
| "entry", "filename", "format", "issue-date", "issue-title", "license-url", "loc-end", "loc-start", | ||
| "medium", "nihmsid", "number-volumes", "patent-states", "pmcid", "repository-artifact", "repository-code", | ||
| "scope", "section", "start", "term", "thesis-type", "volume-title", "year-original" | ||
| ); | ||
|
|
||
| public static final Map<Field, String> FIELDS_MAP = Map.ofEntries( | ||
| Map.entry(StandardField.ABSTRACT, "abstract"), | ||
| Map.entry(StandardField.DATE, "date-released"), | ||
| Map.entry(StandardField.DOI, "doi"), | ||
| Map.entry(StandardField.KEYWORDS, "keywords"), | ||
| Map.entry(BiblatexSoftwareField.LICENSE, "license"), | ||
| Map.entry(StandardField.COMMENT, "message"), | ||
| Map.entry(BiblatexSoftwareField.REPOSITORY, "repository"), | ||
| Map.entry(StandardField.TITLE, "title"), | ||
| Map.entry(StandardField.URL, "url"), | ||
| Map.entry(StandardField.VERSION, "version"), | ||
| Map.entry(StandardField.EDITION, "edition"), | ||
| Map.entry(StandardField.ISBN, "isbn"), | ||
| Map.entry(StandardField.ISSN, "issn"), | ||
| Map.entry(StandardField.ISSUE, "issue"), | ||
| Map.entry(StandardField.JOURNAL, "journal"), | ||
| Map.entry(StandardField.MONTH, "month"), | ||
| Map.entry(StandardField.NOTE, "notes"), | ||
| Map.entry(StandardField.NUMBER, "number"), | ||
| Map.entry(StandardField.PAGES, "pages"), | ||
| Map.entry(StandardField.PUBSTATE, "status"), | ||
| Map.entry(StandardField.VOLUME, "volume"), | ||
| Map.entry(StandardField.YEAR, "year") | ||
| ); | ||
|
|
||
| public static final Map<EntryType, String> TYPES_MAP = Map.ofEntries( | ||
| Map.entry(StandardEntryType.Article, "article"), | ||
| Map.entry(StandardEntryType.Book, "book"), | ||
| Map.entry(StandardEntryType.Booklet, "pamphlet"), | ||
| Map.entry(StandardEntryType.Proceedings, "conference"), | ||
| Map.entry(StandardEntryType.InProceedings, "conference-paper"), | ||
| Map.entry(StandardEntryType.Misc, "misc"), | ||
| Map.entry(StandardEntryType.Manual, "manual"), | ||
| Map.entry(StandardEntryType.Software, "software"), | ||
| Map.entry(StandardEntryType.Dataset, "dataset"), | ||
| Map.entry(StandardEntryType.Report, "report"), | ||
| Map.entry(StandardEntryType.Unpublished, "unpublished") | ||
| ); | ||
|
|
||
| public CffExporter() { | ||
| super("cff", "CFF", StandardFileType.CFF); | ||
| } | ||
|
|
||
| @Override | ||
| public void export(BibDatabaseContext databaseContext, Path file, List<BibEntry> entries) throws Exception { | ||
| Objects.requireNonNull(databaseContext); | ||
| Objects.requireNonNull(file); | ||
| Objects.requireNonNull(entries); | ||
|
|
||
| // Do not export if no entries to export -- avoids exports with only template text | ||
| if (entries.isEmpty()) { | ||
| return; | ||
| } | ||
|
|
||
| // Make a copy of the list to avoid modifying the original list | ||
| final List<BibEntry> entriesToTransform = new ArrayList<>(entries); | ||
|
|
||
| // Set up YAML options | ||
| DumperOptions options = new DumperOptions(); | ||
| options.setWidth(Integer.MAX_VALUE); | ||
| options.setDefaultFlowStyle(DumperOptions.FlowStyle.BLOCK); | ||
| options.setPrettyFlow(true); | ||
| options.setIndentWithIndicator(true); | ||
| options.setIndicatorIndent(2); | ||
| Yaml yaml = new Yaml(options); | ||
|
|
||
| BibEntry main = null; | ||
| boolean mainIsDummy = false; | ||
| int countOfSoftwareAndDataSetEntries = 0; | ||
| for (BibEntry entry : entriesToTransform) { | ||
| if (entry.getType() == StandardEntryType.Software || entry.getType() == StandardEntryType.Dataset) { | ||
| main = entry; | ||
| countOfSoftwareAndDataSetEntries++; | ||
| } | ||
| } | ||
| if (countOfSoftwareAndDataSetEntries == 1) { | ||
| // If there is only one software or dataset entry, use it as the main entry | ||
| entriesToTransform.remove(main); | ||
| } else { | ||
| // If there are no or several software or dataset entries, create a dummy main entry holding the given entries | ||
| main = new BibEntry(StandardEntryType.Software); | ||
| mainIsDummy = true; | ||
| } | ||
|
|
||
| // Transform main entry to CFF format | ||
| Map<String, Object> cffData = transformEntry(main, true, mainIsDummy); | ||
|
|
||
| // Preferred citation | ||
| if (main.hasField(StandardField.CITES)) { | ||
| String citeKey = main.getField(StandardField.CITES).orElse("").split(",")[0]; | ||
| List<BibEntry> citedEntries = databaseContext.getDatabase().getEntriesByCitationKey(citeKey); | ||
| entriesToTransform.removeAll(citedEntries); | ||
| if (!citedEntries.isEmpty()) { | ||
| BibEntry citedEntry = citedEntries.getFirst(); | ||
| cffData.put("preferred-citation", transformEntry(citedEntry, false, false)); | ||
| } | ||
| } | ||
|
|
||
| // References | ||
| List<Map<String, Object>> related = new ArrayList<>(); | ||
| if (main.hasField(StandardField.RELATED)) { | ||
| main.getEntryLinkList(StandardField.RELATED, databaseContext.getDatabase()) | ||
| .stream() | ||
| .map(link -> link.getLinkedEntry()) | ||
| .filter(Optional::isPresent) | ||
| .map(Optional::get) | ||
| .forEach(entry -> { | ||
| related.add(transformEntry(entry, false, false)); | ||
| entriesToTransform.remove(entry); | ||
| }); | ||
| } | ||
|
|
||
| // Add remaining entries as references | ||
| for (BibEntry entry : entriesToTransform) { | ||
| related.add(transformEntry(entry, false, false)); | ||
| } | ||
| if (!related.isEmpty()) { | ||
| cffData.put("references", related); | ||
| } | ||
|
|
||
| try (FileWriter writer = new FileWriter(file.toFile(), StandardCharsets.UTF_8)) { | ||
| yaml.dump(cffData, writer); | ||
| } catch (IOException ex) { | ||
| throw new SaveException(ex); | ||
| } | ||
| } | ||
|
|
||
| private Map<String, Object> transformEntry(BibEntry entry, boolean main, boolean dummy) { | ||
| Map<String, Object> cffData = new LinkedHashMap<>(); | ||
| Map<Field, String> fields = new HashMap<>(entry.getFieldMap()); | ||
|
|
||
| if (main) { | ||
| // Mandatory CFF version field | ||
| cffData.put("cff-version", "1.2.0"); | ||
|
|
||
| // Mandatory message field | ||
| String message = fields.getOrDefault(StandardField.COMMENT, | ||
| "If you use this software, please cite it using the metadata from this file."); | ||
| cffData.put("message", message); | ||
| fields.remove(StandardField.COMMENT); | ||
| } | ||
|
|
||
| // Mandatory title field | ||
| String title = fields.getOrDefault(StandardField.TITLE, "No title specified."); | ||
| cffData.put("title", title); | ||
| fields.remove(StandardField.TITLE); | ||
|
|
||
| // Mandatory authors field | ||
| List<Author> authors = AuthorList.parse(fields.getOrDefault(StandardField.AUTHOR, "")) | ||
| .getAuthors(); | ||
| parseAuthors(cffData, authors); | ||
| fields.remove(StandardField.AUTHOR); | ||
|
|
||
| // Type | ||
| if (!dummy) { | ||
| cffData.put("type", TYPES_MAP.getOrDefault(entry.getType(), "misc")); | ||
| } | ||
|
|
||
| // Keywords | ||
| String keywords = fields.get(StandardField.KEYWORDS); | ||
| if (keywords != null) { | ||
| cffData.put("keywords", keywords.split(",\\s*")); | ||
| } | ||
| fields.remove(StandardField.KEYWORDS); | ||
|
|
||
| // Date | ||
| String date = fields.get(StandardField.DATE); | ||
| if (date != null) { | ||
| parseDate(cffData, date); | ||
| } | ||
| fields.remove(StandardField.DATE); | ||
|
|
||
| // Remaining fields not handled above | ||
| for (Field field : fields.keySet()) { | ||
| if (FIELDS_MAP.containsKey(field)) { | ||
| cffData.put(FIELDS_MAP.get(field), fields.get(field)); | ||
| } else if (field instanceof UnknownField) { | ||
| // Check that field is accepted by CFF format specification | ||
| if (UNMAPPED_FIELDS.contains(field.getName())) { | ||
| cffData.put(field.getName(), fields.get(field)); | ||
| } | ||
| } | ||
| } | ||
| return cffData; | ||
| } | ||
|
|
||
| private void parseAuthors(Map<String, Object> data, List<Author> authors) { | ||
| List<Map<String, String>> authorsList = new ArrayList<>(); | ||
| authors.forEach(author -> { | ||
| Map<String, String> authorMap = new LinkedHashMap<>(); | ||
| if (author.getFamilyName().isPresent()) { | ||
| authorMap.put("family-names", author.getFamilyName().get()); | ||
| } | ||
| if (author.getGivenName().isPresent()) { | ||
| authorMap.put("given-names", author.getGivenName().get()); | ||
| } | ||
| if (author.getNamePrefix().isPresent()) { | ||
| authorMap.put("name-particle", author.getNamePrefix().get()); | ||
| } | ||
| if (author.getNameSuffix().isPresent()) { | ||
| authorMap.put("name-suffix", author.getNameSuffix().get()); | ||
| } | ||
| authorsList.add(authorMap); | ||
| }); | ||
| data.put("authors", authorsList.isEmpty() ? List.of(Map.of("name", "/")) : authorsList); | ||
| } | ||
|
|
||
| private void parseDate(Map<String, Object> data, String date) { | ||
| Optional<Date> parsedDateOpt = Date.parse(date); | ||
| if (parsedDateOpt.isEmpty()) { | ||
| data.put("issue-date", date); | ||
| return; | ||
| } | ||
| Date parsedDate = parsedDateOpt.get(); | ||
| if (parsedDate.getYear().isPresent() && parsedDate.getMonth().isPresent() && parsedDate.getDay().isPresent()) { | ||
| data.put("date-released", parsedDate.getNormalized()); | ||
| return; | ||
| } | ||
| parsedDate.getMonth().ifPresent(month -> data.put("month", month.getNumber())); | ||
| parsedDate.getYear().ifPresent(year -> data.put("year", year)); | ||
| } | ||
| } | ||
|
|
||
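
The date handling in `parseDate` has three outcomes: a complete date goes to `date-released`, a partial date is split into `month` and `year`, and an unparseable string is passed through as `issue-date`. The sketch below reproduces that fallback order with the standard `java.time` classes in place of JabRef's `Date` class, so the set of accepted input formats is an assumption (ISO-8601 only) and narrower than what JabRef accepts. `CffDateSketch` and `toCffDateFields` are hypothetical names.

```java
import java.time.LocalDate;
import java.time.Year;
import java.time.YearMonth;
import java.time.format.DateTimeParseException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the exporter's date mapping, using java.time
// instead of JabRef's Date class. Only ISO-8601 inputs are recognized here.
final class CffDateSketch {

    static Map<String, Object> toCffDateFields(String date) {
        Map<String, Object> data = new LinkedHashMap<>();
        try {
            // Complete date (year, month, day): export as "date-released"
            data.put("date-released", LocalDate.parse(date).toString());
            return data;
        } catch (DateTimeParseException ignored) {
            // Not a complete date, try the partial forms below
        }
        try {
            // Year and month only: export as separate numeric fields
            YearMonth yearMonth = YearMonth.parse(date);
            data.put("month", yearMonth.getMonthValue());
            data.put("year", yearMonth.getYear());
            return data;
        } catch (DateTimeParseException ignored) {
            // Not a year-month value
        }
        try {
            // Year only
            data.put("year", Year.parse(date).getValue());
            return data;
        } catch (DateTimeParseException ignored) {
            // Not a plain year
        }
        // Unparseable input is kept verbatim, mirroring the "issue-date" fallback
        data.put("issue-date", date);
        return data;
    }
}
```

With this sketch, `"2024-03-15"` maps to a single `date-released` entry, `"2024-03"` maps to `month` and `year`, and free text such as `"spring 2024"` ends up unchanged in `issue-date`.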