- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 3k
Rework journal abbreviation caching #6304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| 
 | 
| 
 On the one hand, this feature makes JabRef special. On the other hand, this is a hidden gem. This strongly relates to our improved typed  In case users demand for this feature, they can "just" use the "IEEE abbreviation list", can't they? 
 Please do not drop this feature. I know that currently no one committed updated these abbreviations YET. Nevertheless, I believe in use cases where we really need that. It should not be too hard to offer three different kind of journal name variants. 
 +1 for build process. I plan to keep https://abbrv.jabref.org/ and enhance the abbreviation service offered there. The tooling to handle MVStore format is not as common as tooling for handling CSV files (looking at Python, Java, LibreOffice, GitHub browsing). | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just small comments. Have to open the code in the IDE for a detailed feedback and possible code suggestions.
Added ADR-0010 nevertheless. Maybe, we have other reasons for H2?
        
          
                src/main/java/org/jabref/logic/journals/JournalAbbreviationLoader.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                src/test/java/org/jabref/logic/integrity/IntegrityCheckTest.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | Maybe, I should schedule a mob programming session at https://github.com/remotemobprogramming/upcoming-sessions to a) work on with-dot and dot-less and b) the three-state journal abbreviation? | 
| Thanks @jlaehne and @koppor for the input. I've now updated the code and the PR description accordingly. I ignored the doted version as well as the shortest abbreviation for now. In theory, this can be added but needs more work (need to worry about serialization in MvStore). Thus, this can be done at a later point if users really requests it (which I doubt to be honest). | 
        
          
                buildSrc/build.gradle
              
                Outdated
          
        
      | apply plugin: 'java' | ||
|  | ||
| repositories { | ||
| mavenLocal() | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maven local can be removed (only useful if you have gradle + maven builds in your system), will speed up the build.
| return false; | ||
| } | ||
| AbbreviationViewModel that = (AbbreviationViewModel) o; | ||
| return getName().equals(that.getName()) && | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use Objects.equals(...) to make it more consistent with other implementations
        
          
                src/main/java/org/jabref/gui/journals/ManageJournalAbbreviationsView.java
          
            Show resolved
            Hide resolved
        
      | URL url = Objects.requireNonNull(JournalAbbreviationRepository.class.getResource(Objects.requireNonNull(resourceFileName))); | ||
| readJournalList(new InputStreamReader(url.openStream(), StandardCharsets.UTF_8)); | ||
| InputStream stream = JournalAbbreviationRepository.class.getResourceAsStream(resourceFileName); | ||
| readJournalList(new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8))); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible resource leak? I don't see where the reader is closed.
| return abbreviations.stream().anyMatch(abbreviation -> isMatchedAbbreviated(journalName.trim(), abbreviation)); | ||
| String journal = journalName.trim(); | ||
|  | ||
| boolean isAbbreviated = customAbbreviations.stream().anyMatch(abbreviation -> isMatchedAbbreviated(journal, abbreviation)); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combine both returns with an OR?
| Thanks for the review @Siedlerchr. Feedback is now included. | 
| Codewise looks good, we just have to test the changes now in the jlink/jpackage variant to make sure the classpath resources etc are correctly found | 
| @Siedlerchr That means, the only thing left to test is to download the binary from https://builds.jabref.org/pull/6304/merge/ and to execute it? | 
| Does not work with  
 Is "MEDLINE" now forbidden? | 
| In short: When "unabbreviating", no changes are written. Just load  | 
| Good observation, there was indeed a small bug in the code. Unnabbreviation of names with dots should work now. As I said above, the built-in list does no longer contain names without dots, so medline unabbreviation does indeed no longer work (something for a new PR in my opinion). | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally had some time to walk through this PR.
Got two nitpicks on codestyle, one about injection and one about a comment.
| @FXML private ButtonType saveExporter; | ||
| @Inject private DialogService dialogService; | ||
| @Inject private PreferencesService preferences; | ||
| @Inject private final DialogService dialogService; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Inject does not seem right, as DialogService and PreferencesService are injected by the constructor in this class too...
| CreateModifyExporterDialogView dialog = new CreateModifyExporterDialogView(null, dialogService, preferences, | ||
| loader); | ||
| CreateModifyExporterDialogView dialog = new CreateModifyExporterDialogView(null, dialogService, preferences | ||
| ); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Accidental line break?
| public LayoutFormatterPreferences getLayoutFormatterPreferences(JournalAbbreviationRepository repository) { | ||
| return new LayoutFormatterPreferences(getNameFormatterPreferences(), | ||
| getFileLinkPreferences(), | ||
| repository); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation seems off
| // Abbreviation equality is tested on name only, so we might have to remove an old abbreviation | ||
| abbreviations.remove(abbreviation); | ||
| abbreviations.add(abbreviation); | ||
| // We do not want to keep duplicates, thus remove the old abbreviation | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following two lines are a bit irritating for Java beginners, so I would suggest to keep the old comment on them.
| The files are generated every time on build time? So we probably need to add the -mv file to the gitignore file. Just noticed that | 
| At the moment yes, but I would propose to only add the mv file and remove the cls one: #5749 (comment) | 
* upstream/master: (166 commits) New Crowdin translations (#6382) Update code-howtos.md (#6393) Fix jstyle was invalid with default section at the start (#6386) Correcting file name for groups.uml (#6373) Fix underscore character being omitted from file name in Recent Libraries list (#6389) Rework journal abbreviation caching (#6304) Fix selecting custom export for copy to clipboard with uppercase file ext (#6290) New Crowdin translations (#6375) Squashed 'src/main/resources/csl-styles/' changes from 143464e..906cd6d Fixes #6357: File directory (#6377) Disable the generate button if the ID field is empty (#6371) Fix Preferences style value too long (#6372) Fix various Dark theme issues (#6368) Correct label name in dependabot Bump java-diff-utils from 4.5 to 4.7 (#6365) Try with info.plist.template also (#6366) Fix wrong button order (Apply and Cancel) in ManageProtectedTermsDialog. (#6358) Bump flexmark-ext-gfm-strikethrough from 0.61.6 to 0.61.20 (#6361) Bump checkstyle from 8.31 to 8.32 (#6360) Bump flexmark-ext-gfm-tasklist from 0.61.16 to 0.61.20 (#6364) ...
43566f2 Update molecular-oncology.csl (#6354) 38f2b5f Update san-francisco-estuary-and-watershed-science.csl (#6350) 724cb12 Update australasian-journal-of-philosophy.csl (#6344) df6af86 Fix webpage in-text citation for council-of-science-editors-author-date.csl (#6318) 6900b58 Add month to magazine for Bluebook e0a8148 Update cambridge-university-press-author-date-cambridge-a.csl (#6345) 0823448 Add space & comma before pp (#6343) ff38cd2 Update american-journal-of-respiratory-and-critical-care-medicine.csl (#6341) 8727dfb Update early-music-history.csl (#6323) 1154354 Update boletin-de-pediatria.csl (#6310) f25438e Update pravnik.csl (#6309) db7a4ae Update zoological-journal-of-the-linnean-society.csl (#6304) 6c043a7 Create polygraphia.csl (#6307) 255e00c Create intellect-newgen-books.csl (#6308) 323629f Update historical-materialism.csl (#6300) f62b70d Create european-review-of-international-studies.csl (#6301) dff2698 Update ucl-university-college-harvard.csl (#6298) 0ce09c9 Update sciences-po-ecole-doctorale-note-french.csl (#6290) bdd53ec Update sciences-po-ecole-doctorale-author-date.csl (#6291) efde4d4 Create journal-of-law-medicine-ethics.csl (#6296) 7539b2c Create theses-de-sorbonne-universite.csl (#6295) 905f25a Update biochemical-society-transactions.csl (#6292) a76a3f5 Update smithsonian-institution-scholarly-press author-date and note (#6294) e6b6c6c Create exploration-of-targeted-anti-tumor-therapy.csl (#6276) 024c9c8 Create nys-nydanske-studier.csl (#6331) d9ac8e1 Update vox-sanguinis.csl (#6327) 9a98e92 Remove DOI from Genetics & Molecular Biology git-subtree-dir: buildres/csl/csl-styles git-subtree-split: 43566f2

Rework the caching of the journal abbreviation list in order to reduce the memory footprint.
Instead of the plain in-memory cache, this PR now uses the h2 MVStore which is a light-weight file-based "database". I couldn't notice any performance degradation but a huge improvement in the memory department.
From:



To:
Thus about 30% reduction. Which leads to a very reasonable memory usage: 400mb with a db of 1k entries. For larger databases the auto completion still leads to problems (see below, which was before the changes of this PR) - but this will be a new PR.
(The first two entries belong to the classloader and thus represent all libraries that we use).
Further changes:
Further remarks: