Make the DOI Resolution Fetcher return nothing when the DOI leads to a host for which a tailored fetcher exists #6937
…a host for which a tailored fetcher exists.
As you noticed, the underlying problem is actually that the DOI fetcher has a higher trust value than the publishers. I think it would be a good idea to change the order to "publishers > identifier-based resolution (DOI, arXiv) > general search (Google)". @JabRef/developers @Toromtomtom do you see any problem with this solution?
I also think that this would be a better solution. |
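To make the proposed order concrete, here is a minimal sketch of ranking fetchers by trust. The `Trust` enum, its scores, and the `Fetcher` class are hypothetical stand-ins invented for illustration; the thread only shows that JabRef's real `TrustLevel` has at least the constants `SOURCE` and `META_SEARCH`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TrustRankingSketch {

    // Hypothetical trust levels mirroring the proposed order; the scores are illustrative.
    enum Trust {
        PUBLISHER(3), IDENTIFIER_RESOLUTION(2), GENERAL_SEARCH(1);

        final int score;

        Trust(int score) {
            this.score = score;
        }
    }

    // Hypothetical stand-in for a full-text fetcher: just a name and a trust level.
    static final class Fetcher {
        final String name;
        final Trust trust;

        Fetcher(String name, Trust trust) {
            this.name = name;
            this.trust = trust;
        }
    }

    // Returns a copy of the list with the most trusted fetchers first.
    static List<Fetcher> byDescendingTrust(List<Fetcher> fetchers) {
        List<Fetcher> sorted = new ArrayList<>(fetchers);
        sorted.sort(Comparator.comparingInt((Fetcher f) -> f.trust.score).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Fetcher> fetchers = List.of(
                new Fetcher("GoogleScholar", Trust.GENERAL_SEARCH),
                new Fetcher("DoiResolution", Trust.IDENTIFIER_RESOLUTION),
                new Fetcher("Springer", Trust.PUBLISHER));
        for (Fetcher f : byDescendingTrust(fetchers)) {
            System.out.println(f.name + " (" + f.trust + ")");
        }
    }
}
```

With this order, a result from a publisher-specific fetcher would be preferred over the DOI resolution result whenever both find something.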
I reverted my previous commits and decreased the trust level of the DOI resolution fetcher. This works for me, but maybe someone more involved in the project wants to weigh in on the ranking of the full text fetchers. |
Food for thought:
Proposal: Can we add special handling for Springer? If a DOI directs to Springer, we use the Springer fetcher. In all other cases, the functionality is untouched. This way, we accept that this is a hack. To really judge it, we would need a test that retrieves 1000 papers and checks whether the retrieval rate is higher or lower with this check. Alternatively, can we add telemetry for that?
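For illustration, the host check behind this proposal (and behind the PR title) could look roughly like the sketch below. It assumes the DOI has already been resolved to a landing-page URI; the class name, the method name, and the host list are hypothetical and not taken from JabRef.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

public class DoiHostCheckSketch {

    // Hypothetical list of hosts for which a tailored fetcher exists.
    static final Set<String> HOSTS_WITH_TAILORED_FETCHER = Set.of("link.springer.com");

    // Returns nothing if the resolved location belongs to a host that has its own
    // fetcher, so that the tailored fetcher gets to handle the entry instead.
    static Optional<URI> fullTextCandidate(URI resolvedLocation) {
        String host = resolvedLocation.getHost();
        if (host != null && HOSTS_WITH_TAILORED_FETCHER.contains(host.toLowerCase(Locale.ROOT))) {
            return Optional.empty();
        }
        return Optional.of(resolvedLocation);
    }

    public static void main(String[] args) {
        System.out.println(fullTextCandidate(URI.create("https://link.springer.com/article/example")));
        System.out.println(fullTextCandidate(URI.create("https://example.org/paper/123")));
    }
}
```

The downside mentioned above still applies: every new tailored fetcher needs an entry in the host list, which is why this reads as a hack compared to adjusting trust levels.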
      @Override
      public TrustLevel getTrustLevel() {
-         return TrustLevel.SOURCE;
+         return TrustLevel.META_SEARCH;
This decreases the result quality of the DOI fetcher (always leading to the "right" paper) to the quality of Google Scholar. (From the highest to the lowest)
Can we use the solution from the title? 😇
Make the DOI Resolution Fetcher return nothing when the DOI leads to a host for which a tailored fetcher exists
The problem is that a DOI often does not lead to the full-text version directly, but to the site where the full text can be found. And our DOIResolution fetcher does some magic guessing by looking at the first PDF link in the source code of the website.
The Springer fetcher also only looks at the DOI, but uses the Springer API to find the correct URL for the download.
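A rough sketch of the "first PDF link" guessing described above, to show why it can pick the wrong file. The thread does not show how JabRef actually parses the page, so the regular expression below is a simplified stand-in, and the class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstPdfLinkSketch {

    // Matches href attributes whose value ends in ".pdf" (simplified; not a full HTML parser).
    private static final Pattern PDF_HREF =
            Pattern.compile("href\\s*=\\s*[\"']([^\"']+\\.pdf)[\"']", Pattern.CASE_INSENSITIVE);

    // Returns the first PDF-looking link in the page source, if there is one.
    static Optional<String> firstPdfLink(String html) {
        Matcher matcher = PDF_HREF.matcher(html);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<a href=\"/about\">About</a>"
                + "<a href=\"/supplement.pdf\">Supplement</a>"
                + "<a href=\"/article.pdf\">Full text</a>";
        // The guess picks the supplement because it appears first in the source.
        System.out.println(firstPdfLink(html));
    }
}
```

Because the guess depends only on link order in the page, a publisher-specific fetcher that asks the publisher's API for the download URL is the more reliable source for that host.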
Devcall decision: Use first solution. -- @koppor will do git magic |
All right, thanks for taking care of this! |
@koppor In addition, the SpringerLink fetcher should have a higher trust score than the DoiResolution fetcher, since it's also DOI-based but custom-tailored to Springer. I would also merge this class with the other Springer fetcher.
Is there anything I can do to move this forward? Reset the branch or something? |
Steps:
In parallel, I am discussing this with @stefan-kolb, because he invented the whole thing. My mistake was not enforcing that design decisions are documented (either as ADRs or as other text files).
I think I collected all the documentation and put it in the appropriate place at #6990. So, nothing to do for @Toromtomtom in this PR.
The first two commits are in master now. See ce9f714. |
Fixes #6922