Inflection 85: Updating Malayalam grammar.xml and following files #138

Open
wants to merge 28 commits into
base: main
Changes from 26 commits
Commits
e359163
Add Malayalam dictionary support
deonajulary06 May 27, 2025
a109f4e
Add Malayalam language support in LocaleUtils.hpp
deonajulary06 May 27, 2025
9fd9e01
Add Malayalam locale support to LocaleUtils
deonajulary06 May 27, 2025
72e7c28
Add Malayalam language to the tests
deonajulary06 May 27, 2025
95b483d
Add Malayalam locale group ml_IN
deonajulary06 May 27, 2025
681b568
ADD: Malayalam tokenizer configuration file
deonajulary06 May 27, 2025
bfa7ae6
Inflection-85: Add Git LFS config for Malayalam dictionary and XML files
deonajulary06 May 27, 2025
5aa7ec4
Add Malayalam inflection and pronoun tests
deonajulary06 Jun 4, 2025
f20350f
Updated copyright line
deonajulary06 Jun 5, 2025
9cc0a44
Updated copyright message
deonajulary06 Jun 5, 2025
4bb9964
Updated copyright message
deonajulary06 Jun 5, 2025
a46c301
Updated language grammar to include Malayalam
deonajulary06 Jun 5, 2025
df82a8b
Added pronouns for Malayalam
deonajulary06 Jun 5, 2025
8fcfb16
Add ll GrammarSynthesizer files
deonajulary06 Jun 5, 2025
0c50978
Add Malayalam grammar synthesizer
deonajulary06 Jun 10, 2025
9523646
Add Malayalam-specific CommonConceptFactory with lists and quantities
deonajulary06 Jun 10, 2025
a8a7f2d
Update document on how to add a new language, fixed errors
deonajulary06 Jun 10, 2025
e60ea6f
Updated grammar.xml for Malayalam
deonajulary06 Jun 26, 2025
80af5bf
Update pronoun_ml.csv
deonajulary06 Jun 26, 2025
a70ad5e
Updated all grammar synthesizer component for Malayalam
deonajulary06 Jun 26, 2025
6a191b6
Update Common Concept Factory files
deonajulary06 Jun 26, 2025
421d0e4
Updated tests for Malayalam
deonajulary06 Jun 26, 2025
11d35cc
Fix Malayalam grammar synthesis and remove count lookup function
deonajulary06 Jul 13, 2025
de22098
Updated Grammeme Constants files to include sociative case
deonajulary06 Jul 13, 2025
367f745
Temporary fix for GitHub
deonajulary06 Jul 14, 2025
4786b2d
Modified files to fix more test errors
deonajulary06 Jul 21, 2025
f2b59c6
Update files to fix errors
deonajulary06 Jul 25, 2025
b7528a8
Same file as before but with corrected indentations
deonajulary06 Jul 30, 2025
6 changes: 4 additions & 2 deletions documents/how_to_add_new_language.md
Original file line number Diff line number Diff line change
@@ -11,6 +11,8 @@ NOTE: Take a look at [PR #40](https://github.com/unicode-org/inflection/pull/40)
In general, to bootstrap your progress look for grammatically similar language that's already supported, e.g. if you are adding Serbian look for existing Russian implementation.
This will help you find most of the files you need to add/change and will speed up implementation of the rules and lexicons.

Before you add new language support, go to the README.md in the inflection subfolder (inflection/inflection/README.md), build the project, and make sure all the tests run on your computer.

## Mark your language as supported
* UPDATE: inflection/src/inflection/util/LocaleUtils.hpp
* UPDATE: inflection/src/inflection/util/LocaleUtils.cpp
@@ -29,13 +31,13 @@ TODO: We need to expand what each of these do.
* ADD: inflection/src/inflection/grammar/synthesis/*Xx*GrammarSynthesizer.hpp
* ADD: inflection/src/inflection/grammar/synthesis/*Xx*GrammarSynthesizer.cpp
* ADD: inflection/src/inflection/grammar/synthesis/*Xx*GrammarSynthesizer_*Xx*DisplayFunction.hpp
* ADD: inflection/src/inflection/grammar/synthesis/*Xx*GrammarSynthesizer_*Xx*DisplayFunction.hpp
* ADD: inflection/src/inflection/grammar/synthesis/*Xx*GrammarSynthesizer_*Xx*DisplayFunction.cpp
* UPDATE: inflection/src/inflection/grammar/synthesis/GrammarSynthesizerFactory.cpp
* UPDATE: inflection/src/inflection/grammar/synthesis/fwd.hpp

## Add language specific properties for lists, quantities and related topics
* ADD: inflection/src/inflection/dialog/language/*Xx*CommonConceptFactory.hpp
* ADD: inflection/src/inflection/dialog/language/*Xx*CommonConceptFactory.hpp
* ADD: inflection/src/inflection/dialog/language/*Xx*CommonConceptFactory.cpp
* UPDATE: inflection/src/inflection/dialog/language/fwd.hpp

## Define and create lexion
748,739 changes: 748,739 additions & 0 deletions inflection/resources/org/unicode/inflection/dictionary/dictionary_ml.lst

Large diffs are not rendered by default.

7,715 changes: 7,715 additions & 0 deletions inflection/resources/org/unicode/inflection/dictionary/inflectional_ml.xml

Large diffs are not rendered by default.

91 changes: 91 additions & 0 deletions inflection/resources/org/unicode/inflection/features/grammar.xml
@@ -1624,6 +1624,97 @@
</category>
</grammar>
</language>
<language id="ml">
<grammar>
<category name="case">
<grammeme name="nominative"/> <!-- no explicit marker; subject form -->
<grammeme name="accusative"/> <!-- -യെ, -ായെ, marks direct object -->
<grammeme name="genitive"/> <!-- -ന്റെ, -യുടെ (possessive) -->
<grammeme name="dative"/> <!-- -ക്ക്, -ന് (to/for) -->
<grammeme name="instrumental"/> <!-- -ആല് (by means of) -->
<grammeme name="locative"/> <!-- -യില് (in/at) -->
<grammeme name="sociative"/> <!-- -ഓടു് (along with) -->
</category>
<category name="number">
<grammeme name="singular"/>
<grammeme name="plural"/>
</category>
<category name="pos">
<grammeme name="pronoun"/>
<grammeme name="verb"/>
<grammeme name="noun"/>
<grammeme name="adjective"/>
</category>
<category name="person">
<restrictions>
<restriction name="pos" value="pronoun"/>
<restriction name="pos" value="verb"/>
</restrictions>
<grammeme name="first"/>
<grammeme name="second"/>
<grammeme name="third"/>
</category>
<category name="gender">
<restrictions>
<restriction name="pos" value="pronoun"/>
<restriction name="pos" value="verb"/>
<restriction name="pos" value="noun"/>
</restrictions>
<grammeme name="masculine"/>
<grammeme name="feminine"/>
<grammeme name="neuter"/> <!-- e.g. for objects or animals -->
</category>
<category name="tense">
<restrictions>
<restriction name="pos" value="verb"/>
</restrictions>
<grammeme name="past"/>
<grammeme name="present"/>
<grammeme name="future"/>
</category>
<category name="determination">
<restrictions>
<restriction name="pos" value="pronoun"/>
<restriction name="case" value="genitive"/>
</restrictions>
<grammeme name="independent"/> <!-- e.g. mine -->
<grammeme name="dependent"/> <!-- e.g. my {object} -->
</category>
<category name="mood">
<restrictions>
<restriction name="pos" value="verb"/>
</restrictions>
<grammeme name="indicative"/>
<grammeme name="imperative"/>
<grammeme name="subjunctive"/>
</category>
<category name="pronounType">
<restrictions>
<restriction name="pos" value="pronoun"/>
</restrictions>
<grammeme name="personal"/> <!-- regular pronouns like ഞാൻ, നീ -->
<grammeme name="reflexive"/> <!-- e.g. താൻ, തങ്ങൾ -->
<grammeme name="proximal"/> <!-- e.g. ഇവൻ, ഇവൾ, ഇത് -->
<grammeme name="distal"/> <!-- e.g. അവൻ, അവൾ, അത് -->
<grammeme name="interrogative"/> <!-- e.g. എവൻ, എവൾ, ഏത് -->
</category>
<category name="formality">
<restrictions>
<restriction name="pos" value="verb"/>
<restriction name="pos" value="pronoun"/>
</restrictions>
<grammeme name="formal"/>
<grammeme name="informal"/>
</category>
<category name="clusivity">
<restrictions>
<restriction name="pos" value="pronoun"/>
</restrictions>
<grammeme name="inclusive"/>
<grammeme name="exclusive"/>
</category>
</grammar>
</language>
<language id="ms">
<grammar>
<category name="clusivity">
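The `<restrictions>` blocks in the Malayalam grammar above can be read as applicability rules for a category. The sketch below is a minimal, self-contained model of one possible reading, and it is an assumption rather than the library's confirmed behavior: several values listed for the same restricted category are alternatives (`person` applies to pronouns or verbs), while different restricted categories must all match (`determination` applies to pronouns in the genitive). The names `Restrictions`, `Features`, and `categoryApplies` are hypothetical and do not come from the PR.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical model of one <category> element:
// restricted category name -> set of allowed values.
using Restrictions = std::map<std::string, std::set<std::string>>;

// Grammemes already known for a word: category name -> value.
using Features = std::map<std::string, std::string>;

// Assumed semantics (not confirmed by the PR): every restricted category
// must be present on the word, and its value must be one of the listed ones.
bool categoryApplies(const Restrictions& restrictions, const Features& word)
{
    for (const auto& [name, allowed] : restrictions) {
        const auto it = word.find(name);
        if (it == word.end() || allowed.count(it->second) == 0) {
            return false;
        }
    }
    return true;
}
```

Under this reading, a genitive pronoun may carry `determination`, while a dative pronoun or a genitive noun may not.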
@@ -0,0 +1,83 @@
എനിക്ക്,first,singular,dative,personal
ഞാൻ,first,singular,nominative,exclusive,personal
എന്നെ,first,singular,accusative,exclusive,personal
എന്റെ,first,singular,genitive,determination=dependent,exclusive,personal
എന്റേത്,first,singular,genitive,determination=independent,exclusive,personal
Comment on lines +5 to +6
Member
The `determination=` prefix is unnecessary; you can remove it.

നമുക്ക്,first,plural,dative,inclusive,personal
ഞങ്ങൾ,first,plural,nominative,exclusive,personal
ഞങ്ങളെ,first,plural,accusative,exclusive,personal
ഞങ്ങൾക്ക്,first,plural,dative,exclusive,personal
ഞങ്ങളുടെ,first,plural,genitive,exclusive,determination=dependent,personal
ഞങ്ങളുടേത്,first,plural,genitive,exclusive,determination=independent,personal
നമ്മുടെ,first,plural,genitive,inclusive,determination=dependent,personal
നമ്മുടേതു്,first,plural,genitive,inclusive,determination=independent,personal
നിനക്ക്,second,singular,dative,informal,personal
നീ,second,singular,nominative,informal,personal
നിനെ,second,singular,accusative,informal,personal
നിന്റെ,second,singular,genitive,informal,determination=dependent,personal
നിന്റേതു്,second,singular,genitive,informal,determination=independent,personal
താങ്കൾ,second,singular,nominative,formal,personal
താങ്കളെ,second,singular,accusative,formal,personal
താങ്കൾക്ക്,second,singular,dative,formal,personal
താങ്കളുടെ,second,singular,genitive,formal,determination=dependent,personal
താങ്കളുടേതു്,second,singular,genitive,formal,determination=independent,personal
നിങ്ങൾ,second,plural,nominative,formal,personal
നിങ്ങളെ,second,plural,accusative,formal,personal
നിങ്ങൾക്ക്,second,plural,dative,formal,personal
നിങ്ങളുടെ,second,plural,genitive,formal,determination=dependent,personal
നിങ്ങളുടേതു്,second,plural,genitive,formal,determination=independent,personal
അവൻ,third,singular,nominative,masculine,personal,distal
അവനെ,third,singular,accusative,masculine,personal,distal
അവന്റെ,third,singular,genitive,masculine,determination=dependent,personal,distal
അവന്റെത്,third,singular,genitive,masculine,determination=independent,personal,distal
അവൾ,third,singular,nominative,feminine,personal,distal
അവളെ,third,singular,accusative,feminine,personal,distal
അവളുടെ,third,singular,genitive,feminine,determination=dependent,personal,distal
അവളുടേതു്,third,singular,genitive,feminine,determination=independent,personal,distal
അത്,third,singular,nominative,neuter,personal,distal
അതിനെ,third,singular,accusative,neuter,personal,distal
അതിന്റെ,third,singular,genitive,neuter,determination=dependent,personal,distal
അതിന്റേതു്,third,singular,genitive,neuter,determination=independent,personal,distal
അവർ,third,plural,nominative,personal,distal
അവരെ,third,plural,accusative,personal,distal
അവരുടെ,third,plural,genitive,determination=dependent,personal,distal
അവരുടേതു്,third,plural,genitive,determination=independent,personal,distal
എന്നിൽ,first,singular,locative,personal
എന്നാൽ,first,singular,instrumental,personal
എന്നോടു്,first,singular,sociative,personal
ഞങ്ങളിലു്,first,plural,locative,exclusive,personal
ഞങ്ങളാൽ,first,plural,instrumental,exclusive,personal
ഞങ്ങളോടു്,first,plural,sociative,exclusive,personal
നിനിൽ,second,singular,locative,informal,personal
നിനാൽ,second,singular,instrumental,informal,personal
നിനോടു്,second,singular,sociative,informal,personal
താങ്കളിൽ,second,singular,locative,formal,personal
താങ്കളാൽ,second,singular,instrumental,formal,personal
താങ്കളോടു്,second,singular,sociative,formal,personal
നിങ്ങളിൽ,second,plural,locative,formal,personal
നിങ്ങളാൽ,second,plural,instrumental,formal,personal
നിങ്ങളോടു്,second,plural,sociative,formal,personal
അവനിൽ,third,singular,locative,masculine,personal,distal
അവനാൽ,third,singular,instrumental,masculine,personal,distal
അവനോടു്,third,singular,sociative,masculine,personal,distal
അവളിൽ,third,singular,locative,feminine,personal,distal
അവളാൽ,third,singular,instrumental,feminine,personal,distal
അവളോടു്,third,singular,sociative,feminine,personal,distal
അതിൽ,third,singular,locative,neuter,personal,distal
അതാൽ,third,singular,instrumental,neuter,personal,distal
അതോടു്,third,singular,sociative,neuter,personal,distal
അവരിൽ,third,plural,locative,personal,distal
അവരാൽ,third,plural,instrumental,personal,distal
അവരോടു്,third,plural,sociative,personal,distal
താൻ,third,singular,nominative,reflexive,personal
തങ്ങൾ,third,plural,nominative,formal,reflexive,personal
ഇവൻ,third,singular,nominative,masculine,proximal,personal
ഇവൾ,third,singular,nominative,feminine,proximal,personal
ഇത്,third,singular,nominative,neuter,proximal,personal
ഇവർ,third,plural,nominative,proximal,personal
എവൻ,third,singular,nominative,masculine,interrogative
എവൾ,third,singular,nominative,feminine,interrogative
എവർ,third,plural,nominative,interrogative
ഏത്,third,singular,nominative,neuter,interrogative
നമ്മൾ,first,plural,nominative,inclusive,personal
നമ്മെ,first,plural,accusative,inclusive,personal
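Each row above is a surface form followed by a comma-separated list of grammemes, a few of which are written as `key=value`. As a rough illustration of the review comment on lines +5 to +6, the sketch below parses one row and drops any `key=` prefix, so `determination=dependent` and `dependent` end up identical. This is a self-contained sketch only: `PronounRow` and `parsePronounRow` are hypothetical names, and the PR does not show how the library actually reads this file.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory form of one row; not a type from the inflection library.
struct PronounRow {
    std::string surfaceForm;             // first column: the pronoun itself (UTF-8 bytes)
    std::vector<std::string> grammemes;  // remaining columns, with any "key=" prefix dropped
};

PronounRow parsePronounRow(const std::string& line)
{
    PronounRow row;
    std::istringstream columns(line);
    std::string field;
    bool first = true;
    // Splitting on ',' is safe for UTF-8 input because the comma byte
    // never occurs inside a multi-byte sequence.
    while (std::getline(columns, field, ',')) {
        if (first) {
            row.surfaceForm = field;
            first = false;
            continue;
        }
        // Treat "determination=dependent" the same as "dependent".
        const auto eq = field.find('=');
        row.grammemes.push_back(eq == std::string::npos ? field : field.substr(eq + 1));
    }
    return row;
}
```

With this normalization, a row written with the prefix and the same row written without it produce the same grammeme list.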
@@ -15,6 +15,7 @@ locale.group.it=it_IT,it_CH
locale.group.ja=ja_JP
locale.group.ko=ko_KR
locale.group.ms=ms_MY
locale.group.ml=ml_IN
locale.group.nb=nb_NO
locale.group.nl=nl_NL,nl_BE
locale.group.pt=pt_BR,pt_PT
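The `locale.group.*` entries above map a language code to one or more regional locales. A minimal sketch of how such a line could be split is shown below. `LocaleGroup` and `parseLocaleGroup` are hypothetical names used only for illustration, and the library's real properties loader is not part of this diff.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: language code plus its regional locales.
using LocaleGroup = std::pair<std::string, std::vector<std::string>>;

// Parses a line such as "locale.group.nl=nl_NL,nl_BE".
// Returns std::nullopt for lines that are not locale group entries.
std::optional<LocaleGroup> parseLocaleGroup(const std::string& line)
{
    static const std::string prefix = "locale.group.";
    const auto eq = line.find('=');
    if (line.compare(0, prefix.size(), prefix) != 0
        || eq == std::string::npos
        || eq <= prefix.size()) {
        return std::nullopt;
    }
    LocaleGroup group;
    group.first = line.substr(prefix.size(), eq - prefix.size());
    std::istringstream locales(line.substr(eq + 1));
    std::string locale;
    while (std::getline(locales, locale, ',')) {
        group.second.push_back(locale);
    }
    return group;
}
```

For the line added in this PR, `locale.group.ml=ml_IN`, the sketch yields the language `ml` with the single locale `ml_IN`.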
@@ -0,0 +1,5 @@
#
# Copyright 2025 Unicode Incorporated and others. All rights reserved.
#
tokenizer.implementation.class=DefaultTokenizer

@@ -0,0 +1,25 @@
/*
* Copyright 2025 Unicode Incorporated and others. All rights reserved.
*/

#include <inflection/dialog/language/MlCommonConceptFactory.hpp>

#include <inflection/dialog/SemanticFeatureConceptBase.hpp>
#include <inflection/grammar/synthesis/GrammemeConstants.hpp>
#include <inflection/npc.hpp>

using inflection::grammar::synthesis::GrammemeConstants;

namespace inflection::dialog::language {

MlCommonConceptFactory::MlCommonConceptFactory(const ::inflection::util::ULocale& language)
: super(language)
{
}

MlCommonConceptFactory::~MlCommonConceptFactory()
{
}

} // namespace inflection::dialog::language

@@ -0,0 +1,29 @@
/*
* Copyright 2025 Unicode Incorporated and others. All rights reserved.
*/
#pragma once

#include <inflection/dialog/language/fwd.hpp>
#include <inflection/dialog/CommonConceptFactoryImpl.hpp>
#include <inflection/grammar/synthesis/fwd.hpp>
#include <inflection/dialog/Plurality.hpp>

namespace inflection::dialog::language {

class MlCommonConceptFactory : public CommonConceptFactoryImpl {
using super = CommonConceptFactoryImpl;

public:
explicit MlCommonConceptFactory(const ::inflection::util::ULocale& language);
~MlCommonConceptFactory() override;

protected:
::inflection::dialog::SpeakableString quantifyType(
const ::inflection::dialog::SpeakableString& formattedNumber,
const ::inflection::dialog::SemanticFeatureConceptBase& semanticConcept,
bool useDefault,
::inflection::dialog::Plurality::Rule countType) const override;
};

} // namespace inflection::dialog::language

3 changes: 2 additions & 1 deletion inflection/src/inflection/dialog/language/fwd.hpp
@@ -1,5 +1,5 @@
/*
* Copyright 2017-2024 Apple Inc. All rights reserved.
* Copyright 2017-2025 Apple Inc. All rights reserved.
*/
// Forward declarations for inflection.dialog.language
#pragma once
@@ -28,6 +28,7 @@ namespace inflection
class JaCommonConceptFactory;
class KoCommonConceptFactory;
class KoCommonConceptFactory_KoAndList;
class MlCommonConceptFactory;
class MsCommonConceptFactory;
class NbCommonConceptFactory;
class NlCommonConceptFactory;
@@ -1,5 +1,5 @@
/*
* Copyright 2017-2024 Apple Inc. All rights reserved.
* Copyright 2017-2025 Apple Inc. All rights reserved.
*/
#include <inflection/grammar/synthesis/GrammarSynthesizerFactory.hpp>

@@ -13,6 +13,7 @@
#include <inflection/grammar/synthesis/HiGrammarSynthesizer.hpp>
#include <inflection/grammar/synthesis/ItGrammarSynthesizer.hpp>
#include <inflection/grammar/synthesis/KoGrammarSynthesizer.hpp>
#include <inflection/grammar/synthesis/MlGrammarSynthesizer.hpp>
#include <inflection/grammar/synthesis/NbGrammarSynthesizer.hpp>
#include <inflection/grammar/synthesis/NlGrammarSynthesizer.hpp>
#include <inflection/grammar/synthesis/PtGrammarSynthesizer.hpp>
@@ -41,6 +42,7 @@ static const ::std::map<::inflection::util::ULocale, addSemanticFeatures>& GRAMM
{::inflection::util::LocaleUtils::HINDI(), &HiGrammarSynthesizer::addSemanticFeatures},
{::inflection::util::LocaleUtils::ITALIAN(), &ItGrammarSynthesizer::addSemanticFeatures},
{::inflection::util::LocaleUtils::KOREAN(), &KoGrammarSynthesizer::addSemanticFeatures},
{::inflection::util::LocaleUtils::MALAYALAM(), &MlGrammarSynthesizer::addSemanticFeatures},
{::inflection::util::LocaleUtils::NORWEGIAN(), &NbGrammarSynthesizer::addSemanticFeatures},
{::inflection::util::LocaleUtils::DUTCH(), &NlGrammarSynthesizer::addSemanticFeatures},
{::inflection::util::LocaleUtils::PORTUGUESE(), &PtGrammarSynthesizer::addSemanticFeatures},
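The hunk above registers Malayalam by adding one entry to a map from locale to a static `addSemanticFeatures` function. The sketch below illustrates that registration pattern in isolation. It is self-contained and uses hypothetical stand-ins (`FeatureModel`, `MlSynthesizer`, `registry`, and a string key instead of `ULocale`), so it should be read as an illustration of the pattern and not as the factory's actual code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for inflection::dialog::SemanticFeatureModel; it only records feature names.
struct FeatureModel {
    std::vector<std::string> features;
};

using AddSemanticFeatures = void (*)(FeatureModel&);

// Stand-in for a language-specific synthesizer exposing a static hook.
struct MlSynthesizer {
    static void addSemanticFeatures(FeatureModel& model)
    {
        model.features = {"number", "gender", "case"};
    }
};

// One map entry per supported language; adding a language means adding a line here.
const std::map<std::string, AddSemanticFeatures>& registry()
{
    static const std::map<std::string, AddSemanticFeatures> table = {
        {"ml", &MlSynthesizer::addSemanticFeatures},
    };
    return table;
}

// Returns false when the language has no registered synthesizer.
bool addSemanticFeatures(const std::string& language, FeatureModel& model)
{
    const auto entry = registry().find(language);
    if (entry == registry().end()) {
        return false;
    }
    entry->second(model);
    return true;
}
```

Keeping the per-language logic behind a function pointer means the factory itself does not need to know anything about Malayalam beyond the map entry.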
@@ -159,6 +159,12 @@ const ::std::u16string& GrammemeConstants::CASE_PREPOSITIONAL()
return *npc(CASE_PREPOSITIONAL_);
}

const ::std::u16string& GrammemeConstants::CASE_SOCIATIVE()
{
static auto CASE_SOCIATIVE_ = new ::std::u16string(u"sociative");
return *npc(CASE_SOCIATIVE_);
}

const ::std::u16string& GrammemeConstants::CASE_TRANSLATIVE()
{
static auto CASE_TRANSLATIVE_ = new ::std::u16string(u"translative");
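The new `CASE_SOCIATIVE()` accessor follows the same shape as its neighbors: a function-local static pointer that is allocated on first use and returned by reference. A minimal standalone sketch of that idiom follows. `checkedPointer` is a hypothetical stand-in for the library's `npc()` helper, which is assumed here to be a null-pointer check, and the comment about why the string is never deleted describes a common motivation for the idiom, not a statement from the PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for npc(); assumed to verify the pointer is not null.
template <typename T>
T* checkedPointer(T* pointer)
{
    assert(pointer != nullptr);
    return pointer;
}

const std::u16string& caseSociative()
{
    // Initialized once, on the first call (thread-safe since C++11).
    // The string is never deleted, which is a common way to keep the
    // reference valid even while other statics are being destroyed at exit.
    static auto* value = new std::u16string(u"sociative");
    return *checkedPointer(value);
}
```

Every call returns a reference to the same object, so callers can compare or store the reference without copying the string.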
@@ -42,6 +42,7 @@ class inflection::grammar::synthesis::GrammemeConstants final
static const ::std::u16string& CASE_OBLIQUE();
static const ::std::u16string& CASE_PARTITIVE();
static const ::std::u16string& CASE_PREPOSITIONAL();
static const ::std::u16string& CASE_SOCIATIVE();
static const ::std::u16string& CASE_TRANSLATIVE();
static const ::std::u16string& CASE_VOCATIVE();

@@ -0,0 +1,25 @@
/*
* Copyright 2025 Apple Inc. All rights reserved.
*/
#include <inflection/grammar/synthesis/MlGrammarSynthesizer.hpp>

#include <inflection/dialog/SemanticFeatureModel.hpp>
#include <inflection/grammar/synthesis/MlGrammarSynthesizer_NumberLookupFunction.hpp>
#include <inflection/grammar/synthesis/MlGrammarSynthesizer_GenderLookupFunction.hpp>
#include <inflection/grammar/synthesis/MlGrammarSynthesizer_CaseLookupFunction.hpp>
#include <inflection/grammar/synthesis/MlGrammarSynthesizer_MlDisplayFunction.hpp>
#include <inflection/grammar/synthesis/GrammemeConstants.hpp>

namespace inflection::grammar::synthesis {

void MlGrammarSynthesizer::addSemanticFeatures(::inflection::dialog::SemanticFeatureModel& featureModel)
{
featureModel.putDefaultFeatureFunctionByName(GrammemeConstants::NUMBER, new MlGrammarSynthesizer_NumberLookupFunction());
featureModel.putDefaultFeatureFunctionByName(GrammemeConstants::GENDER, new MlGrammarSynthesizer_GenderLookupFunction());
featureModel.putDefaultFeatureFunctionByName(GrammemeConstants::CASE, new MlGrammarSynthesizer_CaseLookupFunction());

featureModel.setDefaultDisplayFunction(new MlGrammarSynthesizer_MlDisplayFunction(featureModel));
}

} // namespace inflection::grammar::synthesis

@@ -0,0 +1,17 @@
/*
* Copyright 2025 Apple Inc. All rights reserved.
*/
#pragma once

#include <inflection/dialog/fwd.hpp>
#include <inflection/grammar/synthesis/fwd.hpp>
#include <string>

class inflection::grammar::synthesis::MlGrammarSynthesizer final
{
public:
static void addSemanticFeatures(::inflection::dialog::SemanticFeatureModel& featureModel);
private:
MlGrammarSynthesizer() = delete;
};
