Unsupervised Mining of HLA-I Peptidomes Reveals New Binding Motifs and Potential False Positives in the Community Database

Sricharoensuk, Chatchapon; Boonchalermvichien, Tanupat; Muanwien, Phijitra; Somparn, Poorichaya; Pisitkun, Trairak; Sriswasdi, Sira

doi:10.3389/fimmu.2022.847756

Cited by 7 publications

(3 citation statements)

References 35 publications


“…In addition to determining binding motifs and peptide length distributions for the different alleles expressed in a sample, motif deconvolution is useful to identify potential sources of noise in the data [18, 23, 34]. Noise in HLA-I peptidomics data can consist of peptides from the same sample (e.g., contaminants pulled down together with HLA-I ligands, but not binding to HLA-I molecules), peptides from other samples (e.g., contaminants due to suboptimal cleaning of MS equipment), or wrongly identified peptides occurring during the computational annotation of mass spectra.…”

Section: Results (mentioning)

confidence: 99%

Improved predictions of antigen presentation and TCR recognition with MixMHCpred2.2 and PRIME2.0 reveal potent SARS-CoV-2 CD8+ T-cell epitopes

et al. 2023


“…venomics [90,94,95], and metaproteomics [78,97]. A variety of other studies use de novo sequencing tools to detect various classes of short or unexpected peptide sequences [80,82,83,85,92,93,96,101,102]. We also note that all seven of the tools have been used at least once in a study published independently of the original authors, suggesting that the software can be successfully used by others.…”

Section: Applications Of Deep Learning De Novo Sequencing Methods (mentioning)

confidence: 78%

Deep learning methods for de novo peptide sequencing

Bittremieux, Ananth, Fondrie et al. 2024

Preprint


Protein tandem mass spectrometry data is most often interpreted by matching observed mass spectra to a protein database derived from the reference genome of the sample being analyzed. In many application domains, however, a relevant protein database is unavailable or incomplete, and in such settings de novo sequencing is required. Since the introduction of the DeepNovo algorithm in 2017, the field of de novo sequencing has been dominated by deep learning methods, which use large amounts of labeled mass spectrometry data to train multi-layer neural networks to translate from observed mass spectra to corresponding peptide sequences. Here, we describe these deep learning methods, outline procedures for evaluating their performance, and discuss the challenges in the field, both in terms of methods development and evaluation protocols.


“…The first HLA binding dataset comes from combining several mass spectrometry-based mono-allelic HLA peptidomics studies (Abelin et al 2017, 2019, Marco et al 2017, Solleder et al 2019, Sarkizova et al 2020) with peptide–HLA pairs curated by the Immune Epitope Database (IEDB) (Vita et al 2018). Duplicated peptide–HLA pairs and peptides with modifications were removed.…”

Section: Methods (mentioning)

confidence: 99%

MHCSeqNet2—improved peptide-class I MHC binding prediction for alleles with low data

Wongklaew, Sriswasdi, Chuangsuwanich 2023

Bioinformatics

Self Cite


Motivation: The binding of a peptide antigen to a class I major histocompatibility complex (MHC) protein is part of a key process that lets the immune system recognize an infected cell or a cancer cell. This mechanism enabled the development of peptide-based vaccines that can activate the patient’s immune response to treat cancers. Hence, the ability to accurately predict peptide-MHC binding is essential for prioritizing the best peptides for each patient. However, peptide-MHC binding experimental data for many MHC alleles are still lacking, which limits the accuracy of existing prediction models. Results: In this study, we presented an improved version of MHCSeqNet that utilized sub-word-level peptide features, a 3D structure embedding for MHC alleles, and an expanded training dataset to achieve better generalizability on MHC alleles with small amounts of data. Visualization of MHC allele embeddings confirms that the model was able to group alleles with similar binding specificity, including those with no peptide ligand in the training dataset. Furthermore, an external evaluation suggests that MHCSeqNet2 can improve the prioritization of T cell epitopes for MHC alleles with small amounts of training data. Availability and implementation: The source code and installation instructions for MHCSeqNet2 are available at https://github.com/cmb-chula/MHCSeqNet2. Supplementary information: Supplementary data are available at Bioinformatics online.
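The citation statement above mentions removing duplicated peptide–HLA pairs and peptides with modifications. The following is a minimal sketch of what such a cleaning step might look like, not the MHCSeqNet2 authors' actual code. The function name `clean_pairs` is hypothetical, and the rule that any character outside the 20 standard amino-acid letters marks a modified peptide is an assumption made for illustration.

```python
# Hypothetical illustration only; not taken from the MHCSeqNet2 source code.
# Assumption: a peptide counts as "modified" if it contains any character
# other than the 20 standard one-letter amino-acid codes.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def clean_pairs(pairs):
    """Drop modified peptides and duplicated (peptide, allele) pairs.

    `pairs` is an iterable of (peptide, allele) string tuples.
    The order of first occurrence is preserved.
    """
    seen = set()
    cleaned = []
    for peptide, allele in pairs:
        # Skip empty peptides and peptides with non-standard characters,
        # e.g. modification annotations such as "M(ox)".
        if not peptide or not set(peptide) <= STANDARD_AMINO_ACIDS:
            continue
        key = (peptide, allele)
        # Skip pairs that have already been kept once.
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(key)
    return cleaned
```

The same peptide paired with two different alleles is kept twice, because duplicates are defined on the peptide–HLA pair rather than on the peptide alone.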

