Data Visualizer Tool is a Python application designed to assist users in performing exploratory data analysis (EDA). The application provides a user-friendly interface built with Tkinter, allowing users to easily load datasets, visualize data, and apply various transformations.
I created this project during my early years at university. It serves its purpose, but since Tkinter is considered an outdated tool for building GUIs, I plan to rebuild it with a more modern framework such as Flask, Django, or FastAPI and to add machine learning capabilities (if I have the time, that is).
- Load CSV Files: Users can select and load CSV files into the application for analysis.
- Exploratory Data Analysis (EDA): The application provides detailed EDA capabilities, including:
  - Summary statistics of the dataset.
  - Visualization of missing values.
  - Detailed analysis of individual columns, including histograms and common values.
- Data Transformation: Users can perform various data transformations, including:
  - Handling missing values (mean, median, or removal).
  - Encoding categorical columns using label encoding.
  - Renaming and removing columns.
  - Removing duplicates from the dataset.
- Data Visualization: The application includes several visualization options:
  - Histograms for numerical data.
  - Stacked bar charts for categorical data.
  - Scatter plots to visualize interactions between two columns.
- Correlation Analysis: Users can visualize correlations between different columns in the dataset using heatmaps.
- Interaction Analysis: Users can explore interactions between different features in the dataset through scatter plots.
- User-Friendly Interface: The application has a simple, intuitive interface, making it accessible to users with varying levels of data science experience.
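
The EDA and transformation features listed above can be approximated outside the GUI with a few lines of pandas. The following is a minimal sketch, not the application's actual code: the function names (`summarize`, `fill_missing`, `label_encode`) and the sample data are hypothetical. It also uses pandas category codes for label encoding to keep the example short; the application itself may well use scikit-learn's `LabelEncoder`, which is in the dependency list.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column overview: dtype, number of missing values, most common value."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        # mode() is empty for an all-missing column, so guard against that.
        "most_common": df.apply(
            lambda col: col.mode().iloc[0] if col.notna().any() else None
        ),
    })


def fill_missing(df: pd.DataFrame, column: str, strategy: str = "mean") -> pd.DataFrame:
    """Return a copy with missing values in `column` filled (mean/median) or removed."""
    if strategy == "remove":
        return df.dropna(subset=[column])
    if strategy == "mean":
        value = df[column].mean()
    elif strategy == "median":
        value = df[column].median()
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    out = df.copy()
    out[column] = out[column].fillna(value)
    return out


def label_encode(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return a copy with `column` replaced by integer codes (missing values become -1)."""
    out = df.copy()
    out[column] = out[column].astype("category").cat.codes
    return out


# Hypothetical sample data standing in for a loaded CSV file.
df = pd.DataFrame({
    "age": [20.0, None, 40.0, 40.0],
    "city": ["Oslo", "Bergen", "Oslo", "Oslo"],
})

overview = summarize(df)
cleaned = fill_missing(df, "age", strategy="mean")
cleaned = label_encode(cleaned, "city")
cleaned = cleaned.drop_duplicates()

# Pairwise correlations; this matrix is what a heatmap would display.
correlations = cleaned.corr()
```

Each helper returns a new DataFrame rather than modifying its input, which makes it easy to undo a step or compare the data before and after a transformation.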
- Python 3.x
- Required libraries:
  - NumPy
  - Pandas
  - Matplotlib
  - Seaborn
  - Tkinter (part of the Python standard library, so it is not installed through pip)
  - scikit-learn
  - Pillow
- Clone the repository:
  `git clone https://github.com/yourusername/Supervised_ML_Helper.git`
- Navigate to the project directory:
  `cd Supervised_ML_Helper`
- Install the required libraries:
  `pip install -r requirements.txt`
- Run the application using the provided Python file for the GUI:
  `python Supervised_ML_Classifier_python.py`
- Alternatively, there is a Jupyter Notebook available for additional analysis and exploration of the dataset.
- Follow the on-screen instructions to load your dataset and explore the various features of the application.
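
If the application fails to start, it can help to check that every required library is importable. The script below is a small sketch using only the standard library; the function name `find_missing` is hypothetical and not part of this project. Note that Pillow is imported as `PIL` and scikit-learn as `sklearn`, and that on some Linux distributions Tkinter has to be installed through the system package manager rather than pip.

```python
import importlib.util

# Display name -> import name for each library in the requirements list.
REQUIRED_MODULES = {
    "NumPy": "numpy",
    "Pandas": "pandas",
    "Matplotlib": "matplotlib",
    "Seaborn": "seaborn",
    "Tkinter": "tkinter",
    "scikit-learn": "sklearn",
    "Pillow": "PIL",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the display names of libraries whose module cannot be found."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing()
    print("Missing libraries:", ", ".join(missing) if missing else "none")
```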
Here are some screenshots of the application in action: