Physics4MedicineLab/APOBECSeeker

APOBECSeeker

APOBECSeeker is a Snakemake pipeline that identifies APOBEC-style mutations from a multiple sequence alignment in FASTA format, using SNPs inferred from the alignment. All sequences in the alignment must have the same length. The workflow has two steps: it first calls snipit to generate a CSV file of SNP information from the alignment, then runs the APOBEC analysis to identify potential APOBEC-driven mutations.
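
Because the pipeline expects every sequence in the alignment to have the same length, it can help to check the input before running it. The following is a minimal sketch of such a check and is not part of the pipeline itself; the function names read_fasta and check_alignment are hypothetical, and the parser assumes a plain, uncompressed FASTA file.

```python
def read_fasta(path):
    """Parse a FASTA file into a dict mapping sequence ID to sequence."""
    chunks = {}
    seq_id = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token as the ID
                header = line[1:].split()
                seq_id = header[0] if header else ""
                chunks[seq_id] = []
            elif seq_id is not None:
                chunks[seq_id].append(line)
    return {name: "".join(parts) for name, parts in chunks.items()}


def check_alignment(records):
    """Return the common alignment length, or raise if lengths differ."""
    lengths = {name: len(seq) for name, seq in records.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Sequences differ in length: {lengths}")
    return next(iter(lengths.values()), 0)
```

Calling check_alignment(read_fasta("path/to/aligned_sequences.fasta")) returns the shared length if the alignment is consistent and raises a ValueError otherwise.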

If you find this pipeline useful, please consider citing our work: Understanding the evolutionary dynamics of Monkeypox virus through less explored pathways (see Citation).

Prerequisites

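Snakemake is required. Because the pipeline is run with --use-conda (see Usage), a working Conda installation is also needed.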

Configuration File (config.yaml)

The configuration file is already set up to run with the provided sequences. Its general structure is shown below.

fasta_aln: # path/to/aligned_sequences.fasta
output_folder: # output folder, e.g. results

snipit:
  ref: # Reference sequence ID in FASTA file

apobec:
  metadata: # path/to/metadata.csv
  phase: # An integer to be set according to the codon phase
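
To illustrate the structure above, here is a hedged sketch of a check that could be run on an already-loaded configuration dictionary. The function validate_config is hypothetical and not part of the pipeline. It treats every listed key as required, which is an assumption based only on the template above.

```python
def validate_config(cfg):
    """Return a list of problems found in a config dict (empty if none)."""
    errors = []

    # Top-level keys from the template
    for key in ("fasta_aln", "output_folder"):
        if not cfg.get(key):
            errors.append(f"missing or empty key: {key}")

    # snipit section: reference sequence ID in the FASTA file
    snipit = cfg.get("snipit") or {}
    if not snipit.get("ref"):
        errors.append("missing or empty key: snipit.ref")

    # apobec section: metadata path and integer codon phase
    apobec = cfg.get("apobec") or {}
    if not apobec.get("metadata"):
        errors.append("missing or empty key: apobec.metadata")
    phase = apobec.get("phase")
    if isinstance(phase, bool) or not isinstance(phase, int):
        errors.append("apobec.phase must be an integer")

    return errors


# Illustrative values only; they are not taken from the repository.
example = {
    "fasta_aln": "path/to/aligned_sequences.fasta",
    "output_folder": "results",
    "snipit": {"ref": "reference_id"},
    "apobec": {"metadata": "path/to/metadata.csv", "phase": 0},
}
problems = validate_config(example)
```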

snipit

Please find the documentation here: snipit.

apobec.py script

The apobec.py script combines the FASTA file and SNP data to identify APOBEC mutation signatures (TC>TT, GA>AA, GG>AG).
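
As an illustration of what matching these signatures involves, the sketch below classifies a single SNP by its dinucleotide context in the reference sequence. It is not the code in apobec.py; the function name apobec_signature, the 0-based position convention, and the dinucleotide-lookup approach are all assumptions made for this example.

```python
# Reference dinucleotide -> mutated dinucleotide, as listed above
SIGNATURES = {
    ("TC", "TT"): "TC>TT",
    ("GA", "AA"): "GA>AA",
    ("GG", "AG"): "GG>AG",
}


def apobec_signature(ref_seq, pos, alt):
    """Return the matching signature for a SNP at 0-based pos, or None."""
    ref_seq = ref_seq.upper()
    alt = alt.upper()

    # Mutated base is the second of the pair (e.g. TC>TT)
    if pos > 0:
        before = ref_seq[pos - 1:pos + 1]
        after = ref_seq[pos - 1] + alt
        if (before, after) in SIGNATURES:
            return SIGNATURES[(before, after)]

    # Mutated base is the first of the pair (e.g. GA>AA, GG>AG)
    if pos + 1 < len(ref_seq):
        before = ref_seq[pos:pos + 2]
        after = alt + ref_seq[pos + 1]
        if (before, after) in SIGNATURES:
            return SIGNATURES[(before, after)]

    return None
```

For example, a C>T change at index 2 of the reference ATCGA sits in a TC context and is reported as TC>TT, whereas the same change at index 1 of ACCA matches none of the three signatures.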

Usage

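To run the Snakemake pipeline, use the command below. Recent Snakemake versions also require a core count, which can be given by appending, for example, --cores 1.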

snakemake --use-conda

Outputs

Results are written to the folder specified as output_folder in config.yaml:

  • mutations.txt: List of mutations per sample
  • apobec.txt: APOBEC-related mutations
  • apobec_count_with_metadata.txt: APOBEC counts merged with metadata
  • snps.csv: snipit output

Acknowledgements

We would like to thank and cite:

Citation

@article{DePascali2025,
  author = {Mistral De Pascali, Alessandra and Ingletto, Ludovica and Brandolini, Martina and Rocchi, Ettore and Tarozzi, Martina and Turba, Maria Elena and Casadio, Rita and Gentilini, Fabio and Gatti, Giulia and Dionisi, Laura and Colosimo, Claudia and Guerra, Massimiliano and Zannoli, Silvia and Dirani, Giorgio and Montanari, Maria Sofia and Marzucco, Anna and Grumiro, Laura and Rossini, Giada and Lazzarotto, Tiziana and Cricca, Monica and Castellani, Gastone and Sambri, Vittorio and Scagliarini, Alessandra},
  title = {Understanding the evolutionary dynamics of Monkeypox virus through genomic characterization of Clade IIb strains from Emilia‑Romagna, Italy},
  journal = {Scientific Reports},
  volume = {15},
  number = {25849},
  year = {2025},
  doi = {10.1038/s41598-025-11855-5},
}

