In the past decade, the substantial achievements of therapeutic cancer vaccines have shed new light on cancer immunotherapy. The major challenge in designing potent therapeutic cancer vaccines is to identify neoantigens capable of inducing sufficient immune responses, especially those involving major histocompatibility complex (MHC)-II epitopes. However, most previous studies on T-cell epitopes focused on either ligand binding or antigen presentation by MHC rather than on the immunogenicity of T-cell epitopes. To better facilitate therapeutic vaccine design, in this study we propose FIONA (Flexible Immunogenicity Optimization Neural-network Architecture), a convolutional neural network model trained on IEDB datasets. FIONA predicts both the epitopes presented by given MHC-II subtypes and their immunogenicity. By combining a hierarchical encoding model for human leukocyte antigen alleles with a fused dense embedding encoding of peptides, FIONA (AUC = 0.94) outperforms several other tools in head-to-head comparison at predicting epitopes presented by MHC-II subtypes; moreover, FIONA for the first time incorporates the capacity to predict the immunogenicity of epitopes with MHC-II subtype specificity. We have thus developed a reliable pipeline to effectively predict CD4+ T-cell immune responses against cancer and infectious diseases.
Summary: Genetic modifications that cause inactivation or abnormal activation of pivotal proteins may alter cell signaling pathways or even make them dysfunctional, resulting in cancer and other diseases. In turn, this dysfunction can produce 'novel proteins' that do not exist in the canonical human proteome. Identifying novel proteins is valuable for finding promising drug targets and developing new therapies. In recent years, with the extensive application of nucleotide sequencing technology, several tools have been developed for identifying DNA or RNA variants. However, these tools mainly focus on point mutations and perform poorly both at identifying large-scale variants and at integrating mutations. Here we developed a hybrid Sequencing Analysis bioinformatic pipeline by integrating all relevant detection Kits (SAKit). By comprehensively analyzing long and short reads, this pipeline integrates all variants at the genomic and transcriptomic levels that may lead to the production of novel proteins, defined as proteins whose sequences are novel compared to all reference sequences. The analysis results of SAKit demonstrate that large-scale mutations contribute more to the production of novel proteins than point mutations do, and that long-read sequencing is better suited to detecting large-scale mutations. Availability and implementation: SAKit is freely available as a Docker image (https://hub.docker.com/repository/docker/therarna/sakit) and is mainly implemented in Python within a Snakemake framework.
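The definition of a novel protein used above (a sequence that differs from every reference sequence) can be illustrated with a minimal Python sketch. This is not SAKit's implementation: the function name `find_novel_proteins`, the toy sequences, and the use of exact string matching as the novelty criterion are all assumptions made for illustration, and the abstract does not state how the pipeline actually compares sequences.

```python
from typing import Dict, Iterable, Set


def _normalize(seq: str) -> str:
    """Upper-case a protein sequence and drop a trailing stop symbol ('*')."""
    return seq.upper().rstrip("*")


def find_novel_proteins(
    candidates: Dict[str, str], reference: Iterable[str]
) -> Dict[str, str]:
    """Return the candidates whose sequence matches no reference sequence.

    Assumption: "novel" means "not an exact match to any reference
    sequence"; a real pipeline may apply a different criterion.
    """
    ref_set: Set[str] = {_normalize(seq) for seq in reference}
    return {
        name: seq
        for name, seq in candidates.items()
        if _normalize(seq) not in ref_set
    }


# Hypothetical toy data, not taken from the paper.
reference = ["MKTAYIAKQR", "MSTNPKPQRK"]
candidates = {
    "tx1": "MKTAYIAKQR",            # identical to a reference sequence
    "tx2": "MKTAYIAKQW",            # differs by one residue (point mutation)
    "tx3": "MKTAYIAKQRMSTNPKPQRK",  # concatenation of two references (fusion-like)
}

novel = find_novel_proteins(candidates, reference)
print(sorted(novel))
```

In this toy setting both the point-mutation candidate and the fusion-like candidate count as novel, while the unchanged sequence does not.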