Purpose: Integrative analysis combining diagnostic imaging and genomic information can uncover biological insights into lesions that are visible on radiologic images. We investigate techniques for interrogating a deep neural network trained to predict quantitative image (radiomic) features and histology from gene expression in non-small cell lung cancer (NSCLC). Approach: Using 262 training and 89 testing cases from two public datasets, deep feedforward neural networks were trained to predict the values of 101 computed tomography (CT) radiomic features and histology. A model interrogation method called gene masking was used to derive the learned associations between subsets of genes and a radiomic feature or histology class [adenocarcinoma (ADC), squamous cell, and other]. Results: Overall, neural networks outperformed other classifiers. In testing, neural networks classified histology with areas under the receiver operating characteristic curve (AUCs) of 0.86 (ADC), 0.91 (squamous cell), and 0.71 (other). Classification performance of radiomic features ranged from 0.42 to 0.89 AUC. Gene masking analysis revealed new and previously reported associations. For example, hypoxia genes predicted histology (>0.90 AUC). Previously published gene signatures for classifying histology were also predictive in our model (>0.80 AUC). Gene sets related to the immune or cardiac systems and cell development processes were predictive (>0.70 AUC) of several different radiomic features. AKT signaling, tumor necrosis factor, and Rho gene sets were each predictive of tumor textures. Conclusions: This work demonstrates neural networks' ability to map gene expression to radiomic features and histology types in NSCLC, and shows that interpreting the models can identify predictive genes associated with each feature or type.
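The core of the gene-masking idea can be sketched in a few lines: keep only a candidate gene set at its observed values, replace every other gene with its population mean (so masked genes carry no sample-specific signal), and re-score the trained model. All data, the model class, and the gene sets below are hypothetical stand-ins; the paper's actual models are deep feedforward networks scored against curated gene sets.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Toy expression matrix: 200 samples x 50 genes; the "histology" label is
# driven by genes 0-4 by construction, so those genes should mask well.
X = rng.normal(size=(200, 50))
y = (X[:, :5].sum(axis=1) > 0).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
baseline_auc = roc_auc_score(y, model.predict_proba(X)[:, 1])

def gene_mask_auc(model, X, y, gene_set):
    """Keep only `gene_set` columns at their observed values; set all other
    genes to their mean across samples, then re-score the fixed model."""
    X_masked = np.tile(X.mean(axis=0), (X.shape[0], 1))
    X_masked[:, gene_set] = X[:, gene_set]
    return roc_auc_score(y, model.predict_proba(X_masked)[:, 1])

informative = gene_mask_auc(model, X, y, list(range(5)))      # signal genes
uninformative = gene_mask_auc(model, X, y, list(range(40, 45)))  # noise genes
print(f"baseline={baseline_auc:.2f} "
      f"informative={informative:.2f} uninformative={uninformative:.2f}")
```

A gene set whose masked AUC stays close to the baseline is one the network relies on for that output, which is how per-set predictiveness (e.g., ">0.90 AUC" for hypoxia genes) is read off.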
Radiogenomic studies have suggested that the biological heterogeneity of tumors is reflected radiographically through visible features on magnetic resonance (MR) images. We apply deep learning techniques to map between tumor gene expression profiles and tumor morphology in pre-operative MR studies of glioblastoma patients. A deep autoencoder was trained on 528 patients, each with 12,042 gene expressions. The autoencoder's weights were then used to initialize a supervised deep neural network, which was trained on a subset of 109 patients with both gene and MR data. For each patient, 20 morphological image features were extracted from contrast-enhancing and peritumoral edema regions. We found that a neural network pre-trained with an autoencoder and regularized with dropout had lower errors than linear regression in predicting tumor morphology features, by an average of 16.98% mean absolute percent error and 0.0114 mean absolute error, with several features differing significantly (adjusted p-value < 0.05). These results indicate that neural networks, which can capture nonlinear, hierarchical relationships among gene expressions, may have the representational power to find more predictive radiogenomic associations than pairwise or linear methods.
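The pre-train-then-fine-tune pipeline described above can be sketched as follows. This is a deliberately minimal stand-in: linear layers, synthetic data, and a least-squares readout instead of the paper's deep autoencoder with dropout; the dimensions and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-ins: 100 "patients" x 30 "gene expressions", and 4 morphology
# features that depend on the same low-dimensional latent structure.
Z = rng.normal(size=(100, 5))            # hidden latent factors
X = Z @ rng.normal(size=(5, 30)) * 0.5   # expressions (unlabeled step uses these)
Y = Z @ rng.normal(size=(5, 4))          # morphology features (labeled step)

def train_autoencoder(X, hidden=8, lr=0.01, epochs=500):
    """Unsupervised pre-training: gradient descent on ||X W_enc W_dec - X||^2.
    (A linear encoder h = X W_enc; the nonlinearity is omitted for brevity.)"""
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, hidden))
    W_dec = rng.normal(scale=0.1, size=(hidden, d))
    for _ in range(epochs):
        H = X @ W_enc
        err = H @ W_dec - X              # reconstruction error
        W_dec -= lr * H.T @ err / n
        W_enc -= lr * X.T @ (err @ W_dec.T) / n
    return W_enc

# 1) Pre-train the encoder on expressions alone (no labels needed).
W_enc = train_autoencoder(X)

# 2) Fine-tune a supervised head on the (smaller) labeled subset.
H = X @ W_enc
W_head, *_ = np.linalg.lstsq(H, Y, rcond=None)  # least-squares readout
mae = np.abs(H @ W_head - Y).mean()
print(f"mean absolute error: {mae:.4f}")
```

The design point is that step 1 uses only gene data (528 patients in the paper), so the scarcer paired gene-and-MR cases (109 patients) are spent only on fitting the supervised head and fine-tuning.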
The growing amount of longitudinal data for a large population of patients has necessitated the application of algorithms that can discover patterns to inform patient management. This study demonstrates how temporal patterns generated from a combination of clinical and imaging measurements improve residual survival prediction in glioblastoma patients. Temporal patterns were identified with sequential pattern mining using data from 304 patients. Along with patient covariates, the patterns were incorporated as features in logistic regression models to predict 2-, 6-, or 9-month residual survival at each visit. The modeling approach that included temporal patterns achieved test performances of 0.820, 0.785, and 0.783 area under the receiver operating characteristic curve for predicting 2-, 6-, and 9-month residual survival, respectively. This approach significantly outperformed models that used tumor volume alone (p < 0.001) or tumor volume combined with patient covariates (p < 0.001) in training. Temporal patterns involving an increase in tumor volume above 122 mm³/day, a decrease in KPS across multiple visits, moderate neurologic symptoms, and worsening overall neurologic function suggested lower residual survival. These patterns are readily interpretable and consistent with known prognostic indicators, suggesting they can provide early indicators to clinicians of changes in patient state and inform management decisions.
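The mining-then-modeling step can be illustrated with a minimal sketch: encode each patient's visit history as an ordered sequence of coded events, count which ordered patterns occur in enough patients (the support threshold), and turn the frequent patterns into binary features for a downstream classifier. The event codes and the tiny cohort below are hypothetical, and only 2-event patterns are mined for brevity.

```python
# Toy visit histories: each patient is an ordered list of coded events
# (hypothetical codes; e.g. "vol_up" = tumor volume increase, "kps_down" = KPS drop).
patients = {
    "p1": ["vol_up", "kps_down", "neuro_worse"],
    "p2": ["stable", "vol_up", "kps_down"],
    "p3": ["stable", "stable", "vol_up"],
    "p4": ["kps_down", "stable", "stable"],
}

def supports(sequence, pattern):
    """True if `pattern` occurs as an ordered (not necessarily contiguous)
    subsequence of `sequence`."""
    it = iter(sequence)
    return all(event in it for event in pattern)

# Mine all ordered 2-event patterns meeting a minimum support of 2 patients.
events = sorted({e for seq in patients.values() for e in seq})
candidates = [(a, b) for a in events for b in events if a != b]
frequent = [p for p in candidates
            if sum(supports(seq, p) for seq in patients.values()) >= 2]

# Encode each patient as a binary feature vector over the frequent patterns,
# ready to combine with covariates in a logistic regression.
features = {pid: [int(supports(seq, p)) for p in frequent]
            for pid, seq in patients.items()}
print(frequent)
print(features)
```

Here the pattern "volume increase followed by a KPS drop" survives the support threshold, and each patient's feature vector records whether their history contains each frequent pattern, which is the form a logistic regression model can consume alongside covariates.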
Motivation: Cancer heterogeneity is observed at multiple biological levels. To improve our understanding of these differences and their relevance in medicine, approaches to link organ- and tissue-level information from diagnostic images with cellular-level information from genomics are needed. However, these "radiogenomic" studies often use linear, shallow models, depend on feature selection, or consider one gene at a time to map images to genes. Moreover, no study has systematically attempted to understand the molecular basis of imaging traits by interpreting what a neural network has learned. Current studies are thus limited in their ability to understand the transcriptomic drivers of imaging traits, which could provide additional context for determining clinical traits, such as prognosis. Results: We present an approach based on neural networks that takes high-dimensional gene expressions as input and performs nonlinear mapping to an imaging trait. To interpret the models, we propose gene masking and gene saliency to extract learned relationships from radiogenomic neural networks. In glioblastoma patients, our models outperform comparable classifiers (>0.10 AUC), and our interpretation methods were validated using a similar model to identify known relationships between genes and molecular subtypes. We found that imaging traits had specific transcription patterns, e.g., edema and genes related to cellular invasion, and 15 radiogenomic associations were predictive of survival. We demonstrate that neural networks can model transcriptomic heterogeneity to reflect differences in imaging and can be used to derive radiogenomic associations with clinical value. Availability and implementation: https://github.com/novasmedley/deepRadiogenomics
Similar to the enhancement model, vasculature, immune system, and EGFR-related processes (albeit through different GO terms) were among the most predictive gene sets (see Supp. Fig. S11b). Growth and metastasis were also found to be predictive of edema in single gene masking. RAI2, ANXA2, and POSTN, all related to cell growth, were the top three most predictive single genes.
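Gene saliency, the second interpretation method named above, scores each input gene by the sensitivity of the model's output to it, i.e., the gradient of the predicted score with respect to that gene's expression. A minimal black-box sketch using central finite differences is shown below; the toy model and gene indices are hypothetical, and with a differentiable framework the same quantity comes from a single backward pass.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical "trained" model: a fixed nonlinear map from 10 gene
# expressions to a class score; genes 0 and 1 dominate by construction.
w = np.array([2.0, -1.5, 0, 0, 0, 0, 0, 0, 0, 0])
predict = lambda x: np.tanh(x @ w)

def gene_saliency(predict, x, eps=1e-4):
    """Central finite-difference estimate of d(score)/d(gene_i) at sample x."""
    grads = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grads[i] = (predict(x + step) - predict(x - step)) / (2 * eps)
    return grads

x = rng.normal(size=10)
saliency = np.abs(gene_saliency(predict, x))
top_genes = np.argsort(saliency)[::-1][:2]
print(top_genes)  # indices of the genes with the largest influence on the score
```

Ranking genes by |gradient| per sample (and aggregating across patients) is how per-gene drivers such as RAI2, ANXA2, and POSTN would surface, complementing the set-level view given by gene masking.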