The quantitative characterization of the transcriptional control by histone modifications has been challenged by many computational studies, but most of them only focus on narrow and linear genomic regions around promoters, leaving a room for improvement. We present Chromoformer, a transformer-based, three-dimensional chromatin conformation-aware deep learning architecture that achieves the state-of-the-art performance in the quantitative deciphering of the histone codes in gene regulation. The core essence of Chromoformer architecture lies in the three variants of attention operation, each specialized to model individual hierarchy of transcriptional regulation involving from core promoters to distal elements in contact with promoters through three-dimensional chromatin interactions. In-depth interpretation of Chromoformer reveals that it adaptively utilizes the long-range dependencies between histone modifications associated with transcription initiation and elongation. We also show that the quantitative kinetics of transcription factories and Polycomb group bodies can be captured by Chromoformer. Together, our study highlights the great advantage of attention-based deep modeling of complex interactions in epigenomes.
Motivation Biological pathway is an important curated knowledge of biological processes. Thus, cancer subtype classification based on pathways will be very useful to understand differences in biological mechanisms among cancer subtypes. However, pathways include only a fraction of the entire gene set, only one-third of human genes in KEGG, and pathways are fragmented. For this reason, there are few computational methods to use pathways for cancer subtype classification. Results We present an explainable deep-learning model with attention mechanism and network propagation for cancer subtype classification. Each pathway is modeled by a graph convolutional network. Then, a multi-attention-based ensemble model combines several hundreds of pathways in an explainable manner. Lastly, network propagation on pathway–gene network explains why gene expression profiles in subtypes are different. In experiments with five TCGA cancer datasets, our method achieved very good classification accuracies and, additionally, identified subtype-specific pathways and biological functions. Availability and implementation The source code is available at http://biohealth.snu.ac.kr/software/GCN_MAE. Supplementary information Supplementary data are available at Bioinformatics online.
Some of the recent studies on drug sensitivity prediction have applied graph neural networks to leverage prior knowledge on the drug structure or gene network, and other studies have focused on the interpretability of the model to delineate the mechanism governing the drug response. However, it is crucial to make a prediction model that is both knowledge-guided and interpretable, so that the prediction accuracy is improved and practical use of the model can be enhanced. We propose an interpretable model called DRPreter (drug response predictor and interpreter) that predicts the anticancer drug response. DRPreter learns cell line and drug information with graph neural networks; the cell-line graph is further divided into multiple subgraphs with domain knowledge on biological pathways. A type-aware transformer in DRPreter helps detect relationships between pathways and a drug, highlighting important pathways that are involved in the drug response. Extensive experiments on the GDSC (Genomics of Drug Sensitivity and Cancer) dataset demonstrate that the proposed method outperforms state-of-the-art graph-based models for drug response prediction. In addition, DRPreter detected putative key genes and pathways for specific drug–cell-line pairs with supporting evidence in the literature, implying that our model can help interpret the mechanism of action of the drug.
Cells survive and proliferate through complex interactions among diverse molecules across multi-omics layers. Conventional experimental approaches for identifying these interactions have built a firm foundation for molecular biology, but their scalability is gradually becoming inadequate compared to the rapid accumulation of multi-omics data measured by highthroughput technologies. Therefore, the need for data-driven computational modeling of interactions within cells has been highlighted in recent years. The complexity of multi-omics interactions is primarily due to their nonlinearity. That is, their accurate modeling requires intricate conditional dependencies, synergies, or antagonisms between considered genes or proteins, which retard experimental validations. Artificial intelligence (AI) technologies, including deep learning models, are optimal choices for handling complex nonlinear relationships between features that are scalable and produce large amounts of data. Thus, they have great potential for modeling multi-omics interactions. Although there exist many AIdriven models for computational biology applications, relatively few explicitly incorporate the prior knowledge within model architectures or training procedures. Such guidance of models by domain knowledge will greatly reduce the amount of data needed to train models and constrain their vast expressive powers to focus on the biologically relevant space. Therefore, it can enhance a model's interpretability, reduce spurious interactions, and prove its validity and utility. Thus, to facilitate further development of knowledge-guided AI technologies for the modeling of multi-omics interactions, here we review representative bioinformatics applications of deep learning models for multi-omics interactions developed to date by categorizing them by guidance mode.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.