HLA-PepClust is a CLI tool designed for clustering peptide sequences based on their HLA binding motifs.
Ensure your system meets the following requirements:
- Python 3.9 or higher
- pip (Python package manager)
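The requirements above can be checked from Python before installing. The following is a minimal sketch and not part of HLA-PepClust; the function name check_requirements is hypothetical, and the check assumes pip is importable as a module whenever it is installed for the current interpreter.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the requirements above


def check_requirements(version_info=sys.version_info):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    found = tuple(version_info[:2])
    if found < MIN_PYTHON:
        problems.append(
            "Python %d.%d or higher is required, found %d.%d" % (MIN_PYTHON + found)
        )
    # pip is importable as a module when it is installed for this interpreter.
    if importlib.util.find_spec("pip") is None:
        problems.append("pip is not available for this interpreter")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("Problem:", problem)
```

Running the script prints one line per unmet requirement and prints nothing when both checks pass.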
git clone https://github.com/Sanpme66/HLA-PepClust.git
cd HLA-PepClust/
- Create a virtual environment
  python3 -m venv hlapepclust-env
- Activate the virtual environment
  - macOS / Linux:
    source hlapepclust-env/bin/activate
  - Windows:
    .\hlapepclust-env\Scripts\activate
- Upgrade pip
  pip install --upgrade pip
- Navigate to the project directory (if not already in it)
  cd HLA-PepClust/
- Install the package and dependencies
  pip install -e .
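After the steps above, it can help to confirm that the virtual environment is active and that the clust-search command is reachable. The sketch below is illustrative only: the helper names in_virtualenv and cli_on_path are hypothetical, and it assumes the editable install places a clust-search executable on PATH.

```python
import shutil
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """True when the interpreter runs inside a venv (prefix differs from base prefix)."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


def cli_on_path(name="clust-search"):
    """True when an executable with the given name can be found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("clust-search on PATH:", cli_on_path())
```

If the first check reports False, the activation step was probably skipped in the current shell; if only the second reports False, the install step may not have completed.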
To see available options and usage details:
clust-search -h
Figure: example of running the clust-search -h help command.
To run the pipeline, use the following general syntax:
clust-search <input_data_path> <reference_data_path> \
--hla_types <hla_types> \
--n_clusters <number_of_clusters> \
--output <output_path> \
--species <Human or Mouse> \
--threshold <similarity_threshold (default: 0.5)> \
--log \
--processes <number_of_processes>
Example:
clust-search data/D90_HLA_3844874 data/ref_data/Gibbs_motifs_human/output_matrices_human \
--hla_types A0201,A0101,B1302,B3503,C0401 \
--n_clusters 6 \
--species human \
--output test_results \
--processes 4 \
--threshold 0.6
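When the pipeline is launched from a script, for instance to repeat the run with several thresholds, the command can be assembled as an argument list instead of a shell string. This is a minimal sketch under stated assumptions: build_command and run_pipeline are hypothetical helpers, not part of HLA-PepClust, and only flags shown in the example above are used.

```python
import subprocess


def build_command(input_path, reference_path, hla_types=None, n_clusters=None,
                  species=None, output=None, processes=None, threshold=None):
    """Assemble a clust-search argument list; options left as None are omitted."""
    cmd = ["clust-search", str(input_path), str(reference_path)]
    if hla_types:
        # The example above passes HLA types as one comma-separated value.
        cmd += ["--hla_types", ",".join(hla_types)]
    options = [
        ("--n_clusters", n_clusters),
        ("--species", species),
        ("--output", output),
        ("--processes", processes),
        ("--threshold", threshold),
    ]
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


def run_pipeline(cmd):
    """Run the assembled command; requires the package to be installed."""
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_command(
        "data/D90_HLA_3844874",
        "data/ref_data/Gibbs_motifs_human/output_matrices_human",
        hla_types=["A0201", "A0101", "B1302", "B3503", "C0401"],
        n_clusters=6,
        species="human",
        output="test_results",
        processes=4,
        threshold=0.6,
    )
    print(" ".join(cmd))  # inspect the command before passing it to run_pipeline
```

Passing a list to subprocess.run avoids shell quoting issues with paths, and printing the command first makes it easy to compare against the example above.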
Argument | Type | Description | Default |
---|---|---|---|
gibbs_folder | str | Path to test folder containing matrices. | Required |
reference_folder | str | Path to reference folder containing matrices. | Required |
-o, --output | str | Path to output folder. | "output" |
-hla, --hla_types | list | List of HLA types to search. | All |
-p, --processes | int | Number of parallel processes to use. | 4 |
-n, --n_clusters | int | Number of clusters to search for. | "all" |
-t, --threshold | float | Motif similarity threshold. | 0.5 |
-s, --species | str | Species to search [Human, Mouse]. | "human" |
-db, --database | str | Generate a motif database from a configuration file. | "data/config.json" |
-k, --best_KL | bool | Find the best KL divergence only. | False |
-l, --log | bool | Enable logging. | False |
-im, --immunolyser | bool | Enable immunolyser output. | False |
-c, --credits | bool | Show credits for the motif database pipeline. | False |
-v, --version | bool | Show the version of the pipeline. | False |
Figure: example of running the clust-search command.
After finishing, deactivate the virtual environment with:
deactivate
You can try out HLA-PepClust directly in Google Colab without installing anything on your local system!
- Click the "Open In Colab" button above.
- Once the notebook opens in Colab, go to Runtime → Run All to execute all cells.
- Modify input parameters if needed and run the pipeline in Colab.
Figure: example of an input folder path.
More detailed instructions coming soon... 🚀