Inflection-63 Integrate ko Wikidata into Unicode Inflection Inflection-62 Integrate ar Wikidata into Unicode Inflection Inflection-61 Integrate he Wikidata into Unicode Inflection Inflection-60 Integrate hi Wikidata into Unicode Inflection Inflection-58 Integrate nb Wikidata into Unicode Inflection Inflection-56 Integrate nl Wikidata into Unicode Inflection Inflection-55 Integrate tr Wikidata into Unicode Inflection Inflection-54 Integrate ru Wikidata into Unicode Inflection Inflection-53 Integrate it Wikidata into Unicode Inflection Inflection-52 Integrate pt Wikidata into Unicode Inflection Inflection-51 Integrate fr Wikidata into Unicode Inflection Inflection-50 Integrate de Wikidata into Unicode Inflection #167
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fixes #63
Fixes #62
Fixes #61
Fixes #60
Fixes #58
Fixes #56
Fixes #55
Fixes #54
Fixes #53
Fixes #52
Fixes #51
Fixes #50
These changes transition the remainder of the languages from stub test data to lexical dictionaries based on Wikidata.
The oldest commit replaces the dictionaries.
The middle commit are the code changes to consume and use the new lexical dictionaries.
The newest commit disables a few tests in Arabic and Hebrew until the data or code can be changed to pass the tests. It's also possible that the tests are bad, but that requires further review.
Here are some other highlights with these changes in addition to the data transition:
std::less<>
for some string based sets and mapsquantify
method toquantifyFormatted
CommonConceptFactory so that it doesn't conflict with the 2 argumentquantify
method../ParseWikidata --all ~/Downloads/wikidata-20250716-lexemes.json
. There were some warnings about the data, but the number of issues is small. Most of the issues involve unknown grammemes. The warnings indicate the following: