pepXML modifications are offset by one

Hi,
I think there is an issue with how peptidoforms are parsed from pepXML files.

Take this peptide hit as an example:

> <search_hit peptide="AHTMVHDQVSR" massdiff="-6.103515625E-4" calc_neutral_pep_mass="1295.604" peptide_next_aa="F" num_missed_cleavages="0" num_tol_term="2" protein_descr="gene=gltA;locus_tag=19A2747_02138;inference=ab initio prediction:Prodigal:002006,similar to AA sequence:UniProtKB:P14165;product=Citrate synthase" num_tot_proteins="1" tot_num_ions="20" hit_rank="1" num_matched_ions="6" protein="19A2747_02138_gene" peptide_prev_aa="R" is_rejected="0">
> <modification_info modified_peptide="AHTM[147.0354]VHDQVSR">
> <mod_aminoacid_mass mass="147.0354" position="4"/>
> </modification_info>
> <search_score name="hyperscore" value="15.15"/>
> <search_score name="nextscore" value="0.0"/>
> <search_score name="expect" value="3.868121e-04"/>
> </search_hit>
> 

For this hit, `psm_utils.io.read_file` returns:

> AHTMV[+147.0354]HDQVSR/3

The oxidation (M) at position 4 is shifted to position 5, so it lands on V instead of M.

This might be due to the modification parsing in the function `_parse_peptidoform`, specifically the line
`sequence = [(aa, modifications_dict[i] or None) for i, aa in enumerate(peptide)]`
I could be wrong, but I think this should be:
`sequence = [(aa, modifications_dict[i+1] or None) for i, aa in enumerate(peptide)]`
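To illustrate the suspected off-by-one, here is a minimal standalone sketch. It is not the actual psm_utils code: `build_sequence`, `modified_residues`, and the `shift` parameter are hypothetical names, and it assumes that `modifications_dict` is keyed by the 1-based `position` value taken directly from `<mod_aminoacid_mass>`. If the library already converts positions before this line, the cause lies elsewhere.

```python
from collections import defaultdict


def build_sequence(peptide, mods, shift):
    """Pair each residue with its modifications (hypothetical helper).

    `mods` holds (position, mass) pairs with 1-based positions, as written
    in <mod_aminoacid_mass>. `shift` is added to the 0-based enumerate
    index before the dictionary lookup.
    """
    # Assumption: the dict is keyed by the position exactly as read from the file.
    modifications_dict = defaultdict(list)
    for position, mass in mods:
        modifications_dict[position].append(mass)
    return [
        (aa, modifications_dict[i + shift] or None)
        for i, aa in enumerate(peptide)
    ]


def modified_residues(sequence):
    """Return (1-based position, residue) for every modified residue."""
    return [(idx + 1, aa) for idx, (aa, mods) in enumerate(sequence) if mods]


peptide = "AHTMVHDQVSR"
mods = [(4, 147.0354)]  # position and mass from the search hit above

current = build_sequence(peptide, mods, shift=0)   # lookup with i
proposed = build_sequence(peptide, mods, shift=1)  # lookup with i + 1

print(modified_residues(current))   # [(5, 'V')]
print(modified_residues(proposed))  # [(4, 'M')]
```

Under that assumption, looking up `i` attaches the modification to V at position 5, which matches the output I get, while `i + 1` attaches it to M at position 4. The sketch ignores terminal modifications.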

I hope this helps.
Thanks,



