Integrate fr Wikidata into Unicode Inflection #51

@grhoten

Description

The revised dictionary-parser can parse Wikidata, but some issues need to be resolved.

The initial issues include:

  • The warnings in the dictionary-parser output need to be addressed.
  • The unit tests need to be fixed.

Tool output that needs to be addressed:

Line 2415: Q1050744 is not a known part of speech grammeme for L19397(duquel)
Line 89172: Q10343770 is not a known grammeme for L738468(IP)
Line 167868: Q82955 is not a known part of speech grammeme for L1373953(Raymond Lemieux)
Line 345406: Q2824480 is not a known part of speech grammeme for L9203(ce)
Line 345414: Q420020 is not a known grammeme for L9288(nous)
Line 345625: Q1050744 is not a known part of speech grammeme for L11158(lequel)
Line 346100: Q3618903 is not a known part of speech grammeme for L15026(aucun)
Line 432895: Q10343770 is not a known grammeme for L738472(ADSL)
Line 522421: Q11655558 is not a known part of speech grammeme for L57947(lors même que)
Line 687814: Q650250 is not a known grammeme for L9094(je)
Line 687940: Q3618903 is not a known part of speech grammeme for L10023(chaque)
Line 859126: Q650250 is not a known grammeme for L9096(tu)
Line 860426: Q1050744 is not a known part of speech grammeme for L19396(auquel)
Line 1030832: Q3618903 is not a known part of speech grammeme for L9275(quelque)
Line 1179165: Q4116295 is not a known part of speech grammeme for L1232738(Canapé)
Line 1201046: Q114092330 is not a known grammeme for L2770(le)
Line 1201584: Q114092330 is not a known grammeme for L7026(beau)
Line 1201947: Q3618903 is not a known part of speech grammeme for L10017(tout)
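
Many of these warnings repeat the same Wikidata item ID, so it may be easier to triage them by ID rather than line by line. The following is only a minimal sketch and not part of the dictionary-parser: the function name `group_unknown_ids` and the regular expression are assumptions based solely on the message format shown above. It groups the affected lexemes by the kind of warning and the unknown Q-ID.

```python
import re
from collections import defaultdict

# Assumed message format, inferred from the tool output above:
#   Line <n>: <Q-ID> is not a known [part of speech ]grammeme for <L-ID>(<lemma>)
LINE_RE = re.compile(
    r"^Line (\d+): (Q\d+) is not a known (part of speech grammeme|grammeme) "
    r"for (L\d+)\((.*)\)$"
)


def group_unknown_ids(log_text):
    """Group (lexeme ID, lemma) pairs by (warning kind, unknown Q-ID)."""
    groups = defaultdict(list)
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # skip lines that are not warnings of this shape
        _line_no, qid, kind, lexeme_id, lemma = match.groups()
        groups[(kind, qid)].append((lexeme_id, lemma))
    return dict(groups)


# Three lines copied from the tool output above.
SAMPLE = """\
Line 2415: Q1050744 is not a known part of speech grammeme for L19397(duquel)
Line 345625: Q1050744 is not a known part of speech grammeme for L11158(lequel)
Line 687814: Q650250 is not a known grammeme for L9094(je)
"""

if __name__ == "__main__":
    for (kind, qid), lexemes in sorted(group_unknown_ids(SAMPLE).items()):
        lemmas = ", ".join(lemma for _, lemma in lexemes)
        print(f"{qid} ({kind}): {lemmas}")
```

Run against the full output, this should reduce the 18 warnings to a shorter list of distinct Q-IDs, each of which needs either a mapping to a known grammeme or a decision to ignore it.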

Here are the currently generated lexical dictionary files, which can be used to debug the test failures:
fr.zip
