BhashaMind: A Bengali Language Intelligence Platform

BhashaMind (ভাষাMind) is an open-source prototype for Bengali language processing, focused on text summarization and zero-shot classification with transformer-based models. It is a full-stack application built with FastAPI, Spring Boot, and React, and is designed with real-world use in mind.

🌟 Project Highlights

🔤 Low-Resource Bengali NLP: Tailored for Bengali, a widely spoken language that remains under-resourced in NLP.
🧠 Transformer-powered: Uses multilingual LLMs like xlm-roberta and optionally fine-tuned BanglaT5.
🧩 Microservices Architecture: Spring Boot API gateway + FastAPI backend + React frontend.
🧪 Full-stack Pipeline: Data ingestion → Summarization/Classification → Evaluation → UI interaction.
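
The evaluation step is not specified further in this README. As one possible illustration, a summary could be scored against a reference with a unigram-overlap (ROUGE-1 style) F1. The sketch below assumes whitespace tokenization, which is only a rough approximation for Bengali, and the function name `rouge1_f1` is hypothetical rather than part of the project.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary.

    Assumption: tokens are split on whitespace. The project may use a
    different tokenizer or a different metric entirely.
    """
    ref_counts = Counter(reference.split())
    cand_counts = Counter(candidate.split())

    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

A score of 1.0 means the two texts contain exactly the same tokens with the same counts; 0.0 means they share none.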

🏗️ Architecture

Frontend: ReactJS + TailwindCSS
API Gateway: Java Spring Boot
NLP Backend: Python FastAPI with HuggingFace Transformers
Model Training: PyTorch-based scripts for fine-tuning
Dataset: Placeholder Bengali datasets (summarization/classification)

🧪 Sample Usage

🔹 Summarization API

Request (POST /api/summarize):

{
  "text": "জাতিসংঘের মহাসচিব আন্তোনিও গুতেরেস বলেছেন, জলবায়ু পরিবর্তনের প্রভাব এখন বৈশ্বিক সংকটের রূপ নিয়েছে। আফ্রিকা, এশিয়া ও লাতিন আমেরিকার বহু দেশ ভয়াবহ খরার সম্মুখীন হচ্ছে, যেখানে খাদ্য ও পানির তীব্র সংকট দেখা দিয়েছে। গুতেরেস উন্নত দেশগুলোকে কার্বন নিঃসরণ কমাতে জরুরি পদক্ষেপ নেওয়ার আহ্বান জানান।"
}

Response:

{
  "summary": "জাতিসংঘ মহাসচিব গুতেরেস জানান, জলবায়ু পরিবর্তন বৈশ্বিক সংকটের রূপ নিয়েছে এবং উন্নত দেশগুলোকে কার্বন নিঃসরণ কমাতে হবে।"
}
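
A minimal client sketch for the request above, using only the Python standard library. The base URL `http://localhost:8080` is an assumption (this README does not state a host or port), and the helper names are hypothetical.

```python
import json
import urllib.request

# Hypothetical gateway address; the README does not state a host or port.
BASE_URL = "http://localhost:8080"


def build_summarize_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /api/summarize request with a UTF-8 JSON body."""
    # ensure_ascii=False keeps Bengali characters readable in the payload.
    body = json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/summarize",
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def summarize(text: str, base_url: str = BASE_URL, timeout: float = 30.0) -> str:
    """Send the request and return the "summary" field of the JSON response."""
    request = build_summarize_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["summary"]
```

With the services running, `summarize("...")` would return the summary string. Building the request separately from sending it makes the payload easy to inspect without a live server.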

🔹 Classification API

Request (POST /api/classify):

{
  "text": "বিশ্বব্যাপী অর্থনৈতিক প্রবৃদ্ধি ধীর হয়ে পড়েছে। আন্তর্জাতিক মুদ্রা তহবিল (IMF) জানিয়েছে যে মুদ্রাস্ফীতি, উচ্চ সুদের হার এবং রাশিয়া-ইউক্রেন যুদ্ধের প্রভাব বৈশ্বিক অর্থনীতিতে দীর্ঘমেয়াদি নেতিবাচক প্রভাব ফেলছে। উন্নয়নশীল দেশগুলোতে খাদ্য ও জ্বালানির দাম বেড়ে যাওয়ায় সাধারণ মানুষের উপর চাপ বৃদ্ধি পাচ্ছে।"
}

Response:

{
  "label": "economy"
}
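
For orientation, here is a sketch of how a zero-shot label like the one above could be produced with the Hugging Face `zero-shot-classification` pipeline and the `joeddav/xlm-roberta-large-xnli` model listed in this README. The candidate label set and the function names are assumptions; the backend's actual code may differ.

```python
from typing import Any, Dict, Sequence

MODEL_NAME = "joeddav/xlm-roberta-large-xnli"

# Hypothetical label set; the README only shows "economy" as an output.
CANDIDATE_LABELS = ("economy", "politics", "sports", "environment")


def top_label(result: Dict[str, Any]) -> str:
    """Return the highest-scoring label from a zero-shot pipeline result."""
    labels, scores = result["labels"], result["scores"]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return labels[best_index]


def classify(text: str, candidate_labels: Sequence[str] = CANDIDATE_LABELS) -> Dict[str, str]:
    """Classify text and return a body shaped like the response above."""
    # Imported lazily so top_label works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model=MODEL_NAME)
    result = classifier(text, candidate_labels=list(candidate_labels))
    return {"label": top_label(result)}
```

Loading the pipeline inside `classify` keeps the sketch short, but it reloads the model on every call; a long-running service would normally create the pipeline once at startup and reuse it.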

🤖 Models Used

| Task | Model Name | Reference/Link |
| --- | --- | --- |
| Summarization | `csebuetnlp/banglat5-small` | BanglaT5 - Hugging Face |
| Classification | `joeddav/xlm-roberta-large-xnli` | XLM-RoBERTa - Hugging Face |

Note: Bengali is supported by XLM-R via multilingual training and subword tokenization.

📚 References & Resources

📜 Citing

If you use BhashaMind in your work, please cite it as follows:

Maity, A. (2025) “A Low-Resource Bengali Language Intelligence Platform for Summarization and Zero-Shot Classification”. Zenodo. DOI:10.5281/zenodo.16434434

@misc{maity2025bhashamind,
  author       = {Maity, Abhishek},
  title        = {A Low-Resource Bengali Language Intelligence Platform for Summarization and Zero-Shot Classification},
  year         = {2025},
  publisher    = {Zenodo and {CERN}},
  doi          = {10.5281/zenodo.16434434},
  url          = {https://doi.org/10.5281/zenodo.16434434},
  language     = {en}
}

📄 License

This project is licensed under the MIT License.

Disclaimer

Google Jules was used for fixing errors in the Continuous Integration (CI) workflow of this repository.

Made with ❤️ and code.

