# ConvolutionLab

ConvolutionLab is a research-driven project that leverages machine learning to develop and test predictive models for financial trading. At its core, the project focuses on Leavitt Convolution, a linear regression-based smoothing technique, to forecast market trends and improve trading decisions.
The project integrates residual error correction models to enhance the accuracy of Leavitt-based projections, providing robust tools for price forecasting and trend-following strategies. By combining these techniques with state-of-the-art machine learning methods, ConvolutionLab aims to deliver actionable insights for traders and quantitative researchers.
- Modular pipelines for data ingestion, transformation, model selection, and prediction.
- REST API and Flask-based web UI for prediction services.
- Logging and error handling with customizable configurations.
- Comprehensive test suite with coverage reports.
- Leavitt Convolution Integration: Implements advanced smoothing and projection techniques for 1-bar ahead forecasting.
- Residual Error Correction: Enhances the accuracy of projections through machine learning-based adjustments.
- Trend Analysis: Identifies turning points and market direction using convolution probability functions.
- Backtesting and Evaluation: Tests the performance of predictive models on historical data to validate trading strategies.
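The 1-bar-ahead projection underlying Leavitt-style smoothing can be illustrated as a rolling least-squares fit extrapolated one step forward. This is a minimal sketch, not the project's implementation: the window length and the use of `numpy.polyfit` are assumptions for illustration.

```python
import numpy as np

def project_one_bar_ahead(prices, window=5):
    """Fit a least-squares line to the last `window` closes and
    extrapolate it one bar into the future (Leavitt-style projection)."""
    y = np.asarray(prices[-window:], dtype=float)
    x = np.arange(window, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # degree-1 (linear) fit
    return slope * window + intercept       # line evaluated at the next bar

# On a perfectly linear series the projection is exact:
closes = [1.0, 2.0, 3.0, 4.0, 5.0]
print(project_one_bar_ahead(closes))  # → 6.0
```

A residual-correction model, as described above, would then learn to predict the error between this projection and the realized next close.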
- Source: [Provide dataset source here, e.g., Hugging Face Dataset Library]
- Preprocessing: Includes normalization, handling missing values, and feature engineering.
- Licensing: [Specify dataset licensing details here]
- Algorithms: Implements Random Forest, Linear Regression, XGBoost, CatBoost, and more.
- Hyperparameters: Configurable via YAML/JSON files.
- Evaluation: Selects the best model based on R² scores across various algorithms.
- Performance Metrics:
- Random Forest: R² = 0.845
- Linear Regression: R² = 0.885
- XGBoost: R² = 0.875
- Artifacts:
  - Best Model: Saved as `model.pkl` in the `artifacts/` directory.
  - Preprocessor: Saved as `preprocessor.pkl`.
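The "select the best model by R²" step can be sketched without any ML framework: fit each candidate, score it, and keep the top scorer. This is a simplified stand-in for the project's model-selection service; the toy candidates and the hand-rolled scoring helper are illustrative only.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def select_best_model(candidates, X, y):
    """Return (name, score) of the candidate with the highest R²."""
    scores = {name: r2_score(y, fit_predict(X, y))
              for name, fit_predict in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Two toy "models": predict the mean vs. fit a line.
candidates = {
    "mean":   lambda X, y: np.full_like(y, y.mean()),
    "linear": lambda X, y: np.polyval(np.polyfit(X, y, 1), X),
}
X = np.arange(10, dtype=float)
y = 2.0 * X + 1.0
best, score = select_best_model(candidates, X, y)
print(best)  # → linear
```

The winning model and its fitted preprocessor would then be pickled into the artifacts directory.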
- Clone the repository:

  ```bash
  git clone [repository-url]
  cd ml_project_template
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- (Optional) Set up a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate   # On Linux/Mac
  venv\Scripts\activate      # On Windows
  ```

- Set the environment file. Copy or rename the `example_env` file to `.env` before running:

  ```bash
  cp example_env .env
  ```
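The `.env` step assumes the application reads its settings from that file at startup. Many projects use `python-dotenv` for this; the minimal loader below is a simplified, hypothetical stand-in showing what such loading does (the `EXAMPLE_KEY` variable is purely illustrative).

```python
import os

def load_env(path=".env"):
    """Read KEY=VALUE lines into os.environ, skipping blanks and comments.
    Existing environment variables are not overwritten."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Example: write a tiny env file and load it.
with open("example.env", "w") as fh:
    fh.write("EXAMPLE_KEY=hello\n# a comment\n")
load_env("example.env")
print(os.environ["EXAMPLE_KEY"])  # → hello
```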
The project supports two primary workflows:
- Data Ingestion (Download & Prepare Datasets)
- Model Training (Train ML Models)
These workflows can be executed via command-line arguments.
To ingest data, use the following command:
```bash
python launch.py ingest --config config/ingestion_config.yaml --debug
```

| Argument | Description | Default |
|---|---|---|
| `--config` | (Optional) Path to ingestion configuration file | None |
| `--debug` | (Optional) Enable debug mode | False |
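The ingestion config referenced above is a YAML file. Its exact schema is project-specific and not documented here, so the keys below are purely hypothetical placeholders to show the general shape:

```yaml
# Hypothetical ingestion settings -- the real schema is defined by the project.
source:
  path: data/raw/prices.csv     # input file (OANDA or DOHLCV format)
output:
  dir: artifacts/ingested       # where prepared datasets are written
```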
To train a model, run:
```bash
python launch.py train --config config/model_config.yaml --debug
```

| Argument | Description | Default |
|---|---|---|
| `--config` | (Optional) Path to training configuration file | `config/model_config.yaml` |
| `--debug` | (Optional) Enable debug mode | False |
| `--model-type` | (Optional) Specify one or more models to train (e.g., `"RandomForest DecisionTree"`) | Runs all models if not provided |
| `--best-of-all` | (Optional) If set, overrides `--model-type` and trains all models to find the best one | False |
| `--save-best` | (Optional) If set, saves the best-performing model after training | False |
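A command-line interface with these subcommands and flags can be wired up with `argparse` roughly as follows. This is a sketch of the documented interface, not the actual `launch.py`:

```python
import argparse

def build_parser():
    """Build a parser matching the documented ingest/train interface."""
    parser = argparse.ArgumentParser(prog="launch.py")
    sub = parser.add_subparsers(dest="command", required=True)

    ingest = sub.add_parser("ingest", help="download and prepare datasets")
    ingest.add_argument("--config", default=None)
    ingest.add_argument("--debug", action="store_true")

    train = sub.add_parser("train", help="train ML models")
    train.add_argument("--config", default="config/model_config.yaml")
    train.add_argument("--debug", action="store_true")
    train.add_argument("--model-type", nargs="+", default=None,
                       help="one or more model names; all models if omitted")
    train.add_argument("--best-of-all", action="store_true")
    train.add_argument("--save-best", action="store_true")
    return parser

args = build_parser().parse_args(["train", "--best-of-all", "--save-best"])
print(args.command, args.best_of_all, args.save_best)  # → train True True
```

Note how `nargs="+"` lets `--model-type` accept several model names in one invocation, matching the multi-model examples below.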
- Run data ingestion with a custom config file:

  ```bash
  python launch.py ingest --config my_custom_ingestion.yaml
  ```

- Run model training in debug mode:

  ```bash
  python launch.py train --debug
  ```

- Train a specific model (e.g., CatBoost and Linear Regression):

  ```bash
  python launch.py train --model-type "CatBoosting Regressor" "Linear Regression"
  ```

- Train all models and pick the best one automatically:

  ```bash
  python launch.py train --best-of-all
  ```

- Train all models and save the best one:

  ```bash
  python launch.py train --best-of-all --save-best
  ```
- If no `--config` is provided, default configurations will be used.
- If `--model-type` is not provided, all models will be trained.
- Using `--best-of-all` will override `--model-type` and automatically determine the best model.
- The `--save-best` flag ensures that the best model is stored after training.
- The script logs all activity for debugging and monitoring.
- Logging: Configurable via environment variables (`LOG_LEVEL`, `LOG_JSON`).
- Hyperparameters: Adjustable in `config/params.yaml`.
- Artifacts Directory: Defined in `config/pipeline.yaml`.
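Reading the logging knobs from the environment can look like the sketch below. The `LOG_LEVEL` and `LOG_JSON` variable names come from this README, but the setup function itself and the JSON format string are illustrative assumptions, not the project's code.

```python
import logging
import os

def configure_logging():
    """Configure the root logger from LOG_LEVEL / LOG_JSON env vars."""
    level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, level_name, logging.INFO)
    if os.environ.get("LOG_JSON", "").lower() in ("1", "true", "yes"):
        fmt = '{"time": "%(asctime)s", "level": "%(levelname)s", "msg": "%(message)s"}'
    else:
        fmt = "%(asctime)s %(levelname)s %(message)s"
    logging.basicConfig(level=level, format=fmt, force=True)
    return logging.getLogger("convolutionlab")

os.environ["LOG_LEVEL"] = "DEBUG"
log = configure_logging()
log.debug("debug logging enabled")
```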
The data ingestion process supports two formats:
- OANDA API format:

  ```
  time,volume,mid_o,mid_h,mid_l,mid_c,bid_o,bid_h,bid_l,bid_c,ask_o,ask_h,ask_l,ask_c
  2018-01-01 22:00:00-05:00,35630,1.20039,1.20812,1.20019,1.2058,1.20009,1.20806,1.19975,1.2055,1.20069,1.20819,1.20051,1.2061
  ```

- DOHLCV format:

  ```
  Date,Volume,Open,High,Low,Close
  2018-01-01 22:00:00+00:00,95988,1.10107,1.10309,1.098,1.10226
  ```
During the ingestion process, OANDA files will be transformed into DOHLCV format for further processing.
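That transformation amounts to keeping the timestamp, volume, and mid prices while renaming them to the DOHLCV columns and dropping the bid/ask fields. The standard-library sketch below shows the idea; the project's actual ingestion service may differ.

```python
import csv
import io

# Map OANDA mid-price columns onto the DOHLCV schema.
COLUMN_MAP = {"time": "Date", "volume": "Volume",
              "mid_o": "Open", "mid_h": "High",
              "mid_l": "Low", "mid_c": "Close"}

def oanda_to_dohlcv(rows):
    """Yield DOHLCV dicts from OANDA-format dict rows, dropping bid/ask."""
    for row in rows:
        yield {new: row[old] for old, new in COLUMN_MAP.items()}

oanda_csv = io.StringIO(
    "time,volume,mid_o,mid_h,mid_l,mid_c,bid_o,bid_h,bid_l,bid_c,"
    "ask_o,ask_h,ask_l,ask_c\n"
    "2018-01-01 22:00:00-05:00,35630,1.20039,1.20812,1.20019,1.2058,"
    "1.20009,1.20806,1.19975,1.2055,1.20069,1.20819,1.20051,1.2061\n"
)
converted = list(oanda_to_dohlcv(csv.DictReader(oanda_csv)))
print(converted[0]["Close"])  # → 1.2058
```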
```
ml_project_template/
├── artifacts/           # Model and preprocessor artifacts
├── config/              # Configuration files
├── src/                 # Core project source code
│   ├── pipeline/        # Pipelines for training and prediction
│   ├── services/        # Modular services for ingestion, transformation, etc.
│   └── utils/           # Utility functions for file handling and ML helpers
├── tests/               # Test cases
├── requirements.txt     # Python dependencies
└── README.md            # Project documentation
```
- Python
- Scikit-learn, XGBoost, CatBoost
- FastAPI, Flask
- Pytest
- Docker (optional)
For a detailed overview of the testing framework, including test categories, execution instructions, and coverage reports, refer to the Test Suite Documentation. This document provides insights into how the system is validated for robustness, correctness, and reliability across different components.
For a comprehensive guide on the available sample applications, including their functionality, usage instructions, and integration details, refer to the Sample Apps Readme. This document provides an in-depth overview of each application, explaining how they interact with the system and facilitate predictions through different interfaces.
- Fork the repository.
- Create a new feature branch:

  ```bash
  git checkout -b feature/your-feature-name
  ```

- Commit your changes:

  ```bash
  git commit -m "Add your message"
  ```

- Push to the branch:

  ```bash
  git push origin feature/your-feature-name
  ```

- Submit a pull request.
[Specify the license here, e.g., MIT License.]
- Scikit-learn documentation for algorithm support.
- Community contributors for feedback and improvements.