agrep_for_crossreactivity

Tool for identifying immunopeptides which may lead to cross-reactive T cell responses

This tool is designed to help assess the potential of immunopeptides (e.g. derived from pathogens) to elicit T-cell cross-reactivity due to molecular mimicry, defined as a closely related sequence including a maximum of 2 amino acid mismatches. Essentially, it is a bash script which runs an agrep command to match a list of sequences in a text file against a user-specified proteome, then runs a program called matchfinder to curate and collate the results into a readable spreadsheet. Using the agrep command for this purpose was reported previously in Knierman et al., however, no code was available to facilitate the analysis, so we made this in-house solution to assess the potential cross-reactivity of COVID immunopeptides which had been identified by mass spectrometry. While still rough around the edges, we hope this proves useful for others as well. If you use the script/program, please cite our paper:

Braun, A., Rowntree, L.C., Huang, Z. et al. Mapping the immunopeptidome of seven SARS-CoV-2 antigens across common HLA haplotypes. Nat Commun 15, 7547 (2024). https://doi.org/10.1038/s41467-024-51959-6

Compilation of matchfinder.o

cc matchfinder.c –o matchfinder.o

Instructions for running agrep_for_crossreactivity.sh - requires Linux

Copy agrep_for_crossreactivity.sh and matchfinder.o to somewhere convenient on your computer
Open agrep_for_crossreactivity.sh with a text editor and modify line 42 so that the filepath points to where you just put matchfinder.o
Make a txt file with the list of peptides of interest e.g. peptides.txt. The script has only been tested when this txt file is in the same folder as the agrep_for_crossreactivity.sh script (it probably works with a filepath to a txt file in a different location, but I haven't checked).
Get a proteome fasta file of interest.
Run with a command like the template:

./agrep_for_crossreactivity.sh peptides proteome.fasta

Note, don't write the '.txt' extension on the peptides filename; do put the .fasta extension on the proteome filename. Sorry this is a bit rough-n-ready.

This should run with some messaging until it says 'Finished'.

Output:

A folder with the name x_output where x is the name of the file containing the peptide list. This will contain a txt file of agrep output for each peptide, plus a summary table in a file named output.csv with the precise match and location for all peptides that found matches in the proteome.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
LICENCE		LICENCE
README.md		README.md
agrep_for_crossreactivity.sh		agrep_for_crossreactivity.sh
matchfinder.c		matchfinder.c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

agrep_for_crossreactivity

Tool for identifying immunopeptides which may lead to cross-reactive T cell responses

Compilation of matchfinder.o

Instructions for running agrep_for_crossreactivity.sh - requires Linux

Output:

About

Uh oh!

Releases 1

Packages

Languages

License

PurcellLab/agrep_for_crossreactivity

Folders and files

Latest commit

History

Repository files navigation

agrep_for_crossreactivity

Tool for identifying immunopeptides which may lead to cross-reactive T cell responses

Compilation of matchfinder.o

Instructions for running agrep_for_crossreactivity.sh - requires Linux

Output:

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages