A curated repository of end-to-end machine learning projects demonstrating the application of classification, regression, and natural language processing (NLP) techniques using real-world datasets.
Each notebook covers not only model training and evaluation but also data preprocessing, visualization, and pipeline design, so the examples carry over to practical ML work.
- Supervised ML (Classification & Regression)
- Text preprocessing & sentiment analysis (NLP)
- Feature engineering and selection
- Model evaluation with precision, recall, F1, ROC-AUC, and MSE
- Clean and annotated Jupyter Notebooks
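
As a reference for the evaluation metrics listed above, the following is a minimal pure-Python sketch of how precision, recall, F1, ROC-AUC, and MSE are defined. It is not taken from the notebooks. The function names and sample values are hypothetical, and in practice the equivalents in `sklearn.metrics` (`precision_score`, `recall_score`, `f1_score`, `roc_auc_score`, `mean_squared_error`) would normally be used instead.

```python
def classification_metrics(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mse(y_true, y_pred):
    """Mean squared error for regression targets."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical sample values, for illustration only.
labels = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]
probabilities = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]

precision, recall, f1 = classification_metrics(labels, predicted)
auc = roc_auc(labels, probabilities)
error = mse([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
print(precision, recall, f1, auc, error)
```

Precision, recall, and F1 are computed from hard class predictions, while ROC-AUC needs scores or probabilities, which is why the sketch takes them as separate inputs.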
Tool/Library | Purpose |
---|---|
scikit-learn | ML models, pipelines, metrics |
pandas / numpy | Data preprocessing and manipulation |
matplotlib | Visualization |
nltk / re | Text preprocessing and tokenization |
Jupyter Notebook | Interactive experimentation environment |
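
To illustrate the text preprocessing and tokenization step that the table attributes to `nltk` / `re`, here is a minimal sketch using only the standard-library `re` module. The function names, the regular expressions, and the tiny stopword set are assumptions made for illustration, not the notebooks' actual code. A fuller stopword list is available from `nltk.corpus.stopwords` after downloading the corpus.

```python
import re

# Tiny illustrative stopword set (an assumption, not the notebooks' list).
STOPWORDS = {"a", "an", "and", "the", "is", "it", "this", "was", "to", "of", "in"}


def clean_text(text):
    """Lowercase, strip HTML tags, drop non-letter characters, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)   # remove HTML tags such as <br>
    text = re.sub(r"[^a-z\s]", " ", text)  # keep ASCII letters and whitespace only
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text, stopwords=STOPWORDS):
    """Split cleaned text on whitespace and remove stopwords."""
    return [tok for tok in clean_text(text).split() if tok not in stopwords]


tokens = tokenize("This movie was GREAT!!! <br> Loved it, 10/10.")
print(tokens)
```

Dropping every non-letter character is a blunt choice: it also splits contractions such as "don't" into two tokens, so a sentiment task may want a gentler rule.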

    machine-learning-classification-regression-nlp
    ├── classfication_regression_nlp-Report-Samuel-Pillai.ipynb   # Final report notebook with analysis
    ├── classification_regression_nlp.ipynb                       # Cleaned version for public viewing
    └── README.md

Recommended environment: Python 3.9+ with Jupyter installed.

    python -m venv venv
    source venv/bin/activate   # or venv\Scripts\activate on Windows
    pip install -r requirements.txt
    jupyter notebook

⸻
- Predicting numeric targets using regression (e.g., housing prices)
- Classifying sentiments or labels from text
- Text cleaning, tokenization, stopword removal
- Model comparisons: Logistic Regression, Decision Trees, Random Forests
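
The model comparison named in the last item could look roughly like the sketch below, which scores the three classifiers with 5-fold cross-validation in scikit-learn. The synthetic dataset from `make_classification`, the hyperparameters, and the choice of F1 as the scoring metric are all assumptions for illustration. The notebooks use their own datasets and may compare the models differently.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption); replace with the real features and labels.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Hyperparameters here are illustrative defaults, not tuned values.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    # Mean F1 over 5 cross-validation folds for each model.
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    results[name] = scores.mean()
    print(f"{name}: mean F1 = {results[name]:.3f}")
```

Scoring every model on the same folds and the same metric keeps the comparison fair. For imbalanced data, a metric such as `"roc_auc"` may be the more informative `scoring` choice.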
⸻
This project is released under the MIT License.
⸻
Samuel Pillai Email: [email protected]