A Snakemake pipeline that identifies APOBEC-style mutations from a multiple sequence alignment in FASTA format, using SNPs inferred from the alignment. Note that all sequences in the alignment must have the same length. The workflow has two steps: snipit is called to generate a CSV file of SNP information from the alignment, and the APOBEC analysis then identifies potential APOBEC-driven mutations.
If you found this pipeline useful, please consider citing our work: Understanding the evolutionary dynamics of Monkeypox virus through less explored pathways (see Citation).
Snakemake is required.
The configuration file (config.yaml) is already set up to run with the provided sequences. Its general structure is shown below.
fasta_aln: # path/to/aligned_sequences.fasta
output_folder: # output folder, e.g. results
snipit:
  ref: # Reference sequence ID in FASTA file
apobec:
  metadata: # path/to/metadata.csv
  phase: # An integer to be set according to the codon phase
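Because the alignment given as fasta_aln must contain sequences of identical length, a quick check before launching the pipeline can save a failed run. The following is a minimal sketch in plain Python, not part of the pipeline itself; the function names `read_fasta` and `unequal_length_ids` are hypothetical and chosen only for illustration.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict of {header: sequence}.

    Hypothetical helper for illustration; the pipeline may read FASTA differently.
    """
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the full header text (without '>') as the record name.
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def unequal_length_ids(records):
    """Return IDs whose sequence length differs from the first record's length."""
    if not records:
        return []
    expected = len(next(iter(records.values())))
    return [name for name, seq in records.items() if len(seq) != expected]


if __name__ == "__main__":
    example = [
        ">ref",
        "ATCGAGGC",
        ">sample1",
        "ATTGAGGC",
        ">sample2",
        "ATCGAGG",  # one base short: not a valid alignment row
    ]
    bad = unequal_length_ids(read_fasta(example))
    if bad:
        print("Sequences with a different length:", ", ".join(bad))
    else:
        print("All sequences have the same length.")
```

To check a real file, the same functions can be applied to an open file handle, for example `read_fasta(open("path/to/aligned_sequences.fasta"))`.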
See the snipit documentation for details: https://github.com/aineniamh/snipit
The apobec.py script combines the FASTA file and SNP data to identify APOBEC mutation signatures (TC>TT, GA>AA, GG>AG).
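To make the three signatures concrete, here is a minimal sketch of how a single SNP could be classified by its dinucleotide context. It is an illustration under stated assumptions, not the actual logic of apobec.py: the names `APOBEC_SIGNATURES`, `dinucleotide_change` and `is_apobec` are hypothetical, positions are assumed to be 0-based, and the context is read from the reference sequence without any special handling of gaps or ambiguous bases.

```python
# Dinucleotide changes listed in this README as APOBEC signatures.
APOBEC_SIGNATURES = {("TC", "TT"), ("GA", "AA"), ("GG", "AG")}


def dinucleotide_change(ref_seq, pos, alt):
    """Return (ref_dinucleotide, alt_dinucleotide) for a SNP, or None.

    Assumptions (illustrative only): `pos` is 0-based, C>T changes are read
    together with the preceding base, and G>A changes together with the
    following base.
    """
    ref_base = ref_seq[pos].upper()
    alt = alt.upper()
    if ref_base == "C" and alt == "T" and pos > 0:
        prev_base = ref_seq[pos - 1].upper()
        return prev_base + "C", prev_base + "T"
    if ref_base == "G" and alt == "A" and pos + 1 < len(ref_seq):
        next_base = ref_seq[pos + 1].upper()
        return "G" + next_base, "A" + next_base
    return None


def is_apobec(ref_seq, pos, alt):
    """True if the SNP matches one of the listed APOBEC signatures."""
    return dinucleotide_change(ref_seq, pos, alt) in APOBEC_SIGNATURES


if __name__ == "__main__":
    reference = "ATCGAGGC"
    for position, alt_base in [(2, "T"), (3, "A"), (5, "A"), (6, "A"), (7, "T")]:
        print(position, alt_base, is_apobec(reference, position, alt_base))
```

The G>A cases are the reverse-strand counterparts of C>T changes, which is why the neighbouring base is taken from the 3' side for G and from the 5' side for C.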
To run the Snakemake pipeline, use this command:
snakemake --use-conda
Results will be in the folder specified as output_folder in config.yaml:
- mutations.txt: List of mutations per sample
- apobec.txt: APOBEC-related mutations
- apobec_count_with_metadata.txt: APOBEC counts merged with metadata
- snps.csv: snipit output
We would like to thank and cite:
- Snipit: Aine O'Toole, snipit (2024) GitHub repository, https://github.com/aineniamh/snipit
- Snakemake: Johannes Köster, Sven Rahmann, Snakemake — a scalable bioinformatics workflow engine, Bioinformatics, Volume 28, Issue 19, October 2012, Pages 2520–2522, https://doi.org/10.1093/bioinformatics/bts480
@article{DePascali2025,
author = {Mistral De Pascali, Alessandra and Ingletto, Ludovica and Brandolini, Martina and Rocchi, Ettore and Tarozzi, Martina and Turba, Maria Elena and Casadio, Rita and Gentilini, Fabio and Gatti, Giulia and Dionisi, Laura and Colosimo, Claudia and Guerra, Massimiliano and Zannoli, Silvia and Dirani, Giorgio and Montanari, Maria Sofia and Marzucco, Anna and Grumiro, Laura and Rossini, Giada and Lazzarotto, Tiziana and Cricca, Monica and Castellani, Gastone and Sambri, Vittorio and Scagliarini, Alessandra},
title = {Understanding the evolutionary dynamics of Monkeypox virus through genomic characterization of Clade IIb strains from Emilia‑Romagna, Italy},
journal = {Scientific Reports},
volume = {15},
number = {25849},
year = {2025},
doi = {10.1038/s41598-025-11855-5},
}