Multi-label Movie Genres Classification From Original Movie Overview

Members

This project is part of the Natural Language Processing course at the University of Information Technology.

| No | Student ID | Full name | Email |
|----|------------|-----------|-------|
| 1 | 23520179 | Phùng Minh Chí | [email protected] |
| 2 | 23520183 | Nguyễn Hữu Minh Chiến | [email protected] |
| 3 | 23521467 | Lê Ngọc Phương Thảo | [email protected] |

Course Information

Course: Natural Language Processing
Course code: CS221
Class code: CS221.P22
Semester: 2 (2024 - 2025)
Instructor: Dr. Nguyễn Trọng Chỉnh

Instructions

Clone the repository:

git clone https://github.com/chisphung/CS221-GenresPrediction-from-Overview

Install dependencies:
```
pip install -r requirements.txt
```

Data preprocessing:

To preprocess the dataset, run the following command:

python tools/preprocess.py

You can also download the preprocessed dataset with the following command:

python tools/download.py
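
In a multi-label task, each movie's genre list is usually converted into a fixed-length 0/1 vector before training. The sketch below only illustrates that idea. The function names `build_label_index` and `to_multi_hot`, the genre names, and the list-of-strings input format are assumptions made for illustration and may differ from what `tools/preprocess.py` actually does.

```python
def build_label_index(genre_lists):
    """Map each distinct genre to a column index, sorted for reproducibility."""
    genres = sorted({g for genres in genre_lists for g in genres})
    return {genre: i for i, genre in enumerate(genres)}


def to_multi_hot(genres, label_index):
    """Return a 0/1 vector with a 1 in every column whose genre is present."""
    vector = [0] * len(label_index)
    for genre in genres:
        vector[label_index[genre]] = 1
    return vector


# Hypothetical example data (not taken from the project's dataset).
samples = [["Action", "Comedy"], ["Drama"], ["Comedy", "Drama"]]
label_index = build_label_index(samples)
encoded = [to_multi_hot(genres, label_index) for genres in samples]
```

Sorting the genre names keeps the column order stable between runs, which matters when the same mapping has to be reused at prediction time.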

Model training:

To train the BERT models, run the following command:

python -m tools.train <pretrained_model_name> <dataset_path>

Replace <pretrained_model_name> with the name of the pretrained model you want to use (e.g., bert-base-uncased) and <dataset_path> with the path to your dataset.
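
The command above takes two positional arguments. As a rough sketch of how a script could read them with the standard `argparse` module: the `parse_args` helper and the example dataset path are hypothetical, and the real parser in `tools/train.py` may look different or accept extra options.

```python
import argparse


def parse_args(argv=None):
    """Parse the two positional arguments shown in the training command."""
    parser = argparse.ArgumentParser(prog="tools.train")
    parser.add_argument(
        "pretrained_model_name",
        help="Name of the pretrained model, e.g. bert-base-uncased",
    )
    parser.add_argument("dataset_path", help="Path to the dataset")
    return parser.parse_args(argv)


# Hypothetical invocation; the dataset path is a placeholder.
args = parse_args(["bert-base-uncased", "datasets/train.csv"])
```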

Evaluation:

To evaluate the model, run the following command:

python -m src.evaluate

Modify the target list path and the weights path to match your setup.
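
This README does not state which metrics the evaluation script reports. Micro-averaged F1 is a common choice for multi-label classification, so the sketch below shows how it can be computed from 0/1 label matrices. The function name `micro_f1` and the example labels are assumptions for illustration only.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (sample, genre) pairs of 0/1 labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1  # genre correctly predicted
            elif t == 0 and p == 1:
                fp += 1  # genre predicted but not present
            elif t == 1 and p == 0:
                fn += 1  # genre present but missed
    denominator = 2 * tp + fp + fn
    # No positives in either labels or predictions: define F1 as 0.0.
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical labels for two movies and three genres.
y_true = [[1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 1]]
score = micro_f1(y_true, y_pred)
```

Micro-averaging pools the counts across all genres, so frequent genres weigh more heavily than rare ones.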

Pretrained model:

To save you time, we currently provide three pretrained models:

bert-base-uncased trained on the preprocessed + undersampled dataset
distilled-bert-base-uncased trained on the preprocessed dataset
bert-base-cased trained on the raw + undersampled dataset

You can download them from the following links:

After downloading, you can place them in the weights folder.

Prediction:

To make a single prediction using the trained model, run the following command:

python -m src.main
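
A multi-label classifier typically outputs one score (logit) per genre, and each score is converted to a probability independently before a cut-off is applied. The sketch below shows that decoding step under stated assumptions: the names `sigmoid` and `decode_genres`, the 0.5 threshold, and the example logits are illustrative and are not taken from `src/main.py`.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_genres(logits, genre_names, threshold=0.5):
    """Return every genre whose sigmoid probability reaches the threshold."""
    return [
        name
        for name, logit in zip(genre_names, logits)
        if sigmoid(logit) >= threshold
    ]


# Hypothetical model output for one movie overview.
genre_names = ["Action", "Comedy", "Drama"]
logits = [2.0, -1.5, 0.3]
predicted = decode_genres(logits, genre_names)
```

Unlike a softmax over classes, the per-genre sigmoid allows zero, one, or several genres to be selected for the same overview.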

Deployment:

To deploy the model using Streamlit, run the following command:

streamlit run src/app.py
