Recent developments in spatial transcriptomics (ST) technologies have enabled the profiling of transcriptome-wide gene expression while retaining the spatial locations of measured spots within a tissue. Moreover, corresponding high-resolution hematoxylin-and-eosin-stained histology images are readily available for the ST tissue sections. Since histology images are easy to obtain, it is desirable to leverage information learned from ST to predict gene expression for tissue sections where only histology images are available. Here we present HisToGene, a deep learning model for gene expression prediction from histology images. To account for the spatial dependency among measured spots, HisToGene adopts Vision Transformer, a state-of-the-art method for image recognition. A well-trained HisToGene model can also predict super-resolution gene expression. Through evaluations on 32 HER2+ breast cancer samples with 9,612 spots and 785 genes, we show that HisToGene accurately predicts gene expression and outperforms ST-Net both in gene expression prediction and in clustering tissue regions using the predicted expression. We further show that the predicted super-resolution gene expression also leads to higher clustering accuracy than the observed gene expression. Gene expression predicted by HisToGene enables researchers to generate virtual transcriptomics data at scale and can help elucidate the molecular signatures of tissues.
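The abstract's central idea, using self-attention to model spatial dependency among spots when mapping image patches to gene expression, can be sketched as follows. This is a minimal illustrative example, not the published HisToGene architecture: the class name `SpotTransformer`, the feature dimensions, and the use of a linear projection of (x, y) spot coordinates as position embeddings are all assumptions.

```python
import torch
import torch.nn as nn

class SpotTransformer(nn.Module):
    """Minimal sketch: predict per-spot gene expression from image patch
    features, modelling spatial dependency among spots with self-attention."""

    def __init__(self, patch_dim=1024, embed_dim=128, n_genes=785,
                 n_heads=4, n_layers=2):
        super().__init__()
        self.patch_proj = nn.Linear(patch_dim, embed_dim)  # patch features -> token
        self.pos_proj = nn.Linear(2, embed_dim)            # (x, y) coordinate -> embedding
        layer = nn.TransformerEncoderLayer(embed_dim, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(embed_dim, n_genes)          # token -> gene expression

    def forward(self, patches, coords):
        # patches: (batch, n_spots, patch_dim); coords: (batch, n_spots, 2)
        tokens = self.patch_proj(patches) + self.pos_proj(coords)
        return self.head(self.encoder(tokens))             # (batch, n_spots, n_genes)

model = SpotTransformer()
patches = torch.randn(1, 16, 1024)  # 16 spots with flattened patch features
coords = torch.rand(1, 16, 2)       # normalised spot coordinates
pred = model(patches, coords)
print(pred.shape)
```

Because every spot attends to every other spot in the encoder, the prediction for one spot can draw on morphology elsewhere in the section, which is the stated motivation for using a transformer rather than predicting each spot independently.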
The Human BioMolecular Atlas Program (HuBMAP) aims to create a multi-scale spatial atlas of the healthy human body at single-cell resolution by applying advanced technologies and disseminating resources to the community. As HuBMAP moves past its first phase of creating ontologies, protocols and pipelines, this Perspective introduces the production phase: the generation of reference spatial maps of functional tissue units across many organs from diverse populations and the creation of mapping tools and infrastructure to advance biomedical research. HuBMAP was founded with the goal of establishing state-of-the-art frameworks for building spatial multiomic maps of non-diseased human organs at single-cell resolution [1]. During the first phase (2018-2022), the priorities of the project included the validation and development of assay platforms; workflows for data processing, management, exploration and visualization; and the establishment of protocols, quality-control standards and standard operating procedures. Extensive infrastructure was established through a coordinated effort among the various HuBMAP integration, visualization and engagement teams, tissue-mapping centres, technology and tools development teams, rapid technology implementation teams and working groups [1]. Single-cell maps, predominantly consisting of two-dimensional (2D) spatial data as well as data from dissociated cells, were generated for several organs. The HuBMAP Data Portal (https://portal.hubmapconsortium.org) was established for open access to experimental tissue data and reference atlas data. The infrastructure was augmented with software tools for tissue data registration, processing, annotation, visualization, cell segmentation and automated annotation of cell types and cellular neighbourhoods from spatial data. Computational methods were developed for integrating multiple data types across scales and for their interpretation [2].
Standard reference terminology and a common coordinate framework spanning anatomical to biomolecular scales were established to ensure interoperability across organs, research groups and consortia [3]. Guidelines to capture high-quality multiplexed spatial data [4] were established, including validated panels of cell- and structure-specific antibodies [5]. The first phase produced a large number of manuscripts (https://commonfund.nih.gov/publications?pid=43), including spatially resolved single-cell maps [6-11]. The production phase of HuBMAP was launched in the autumn of 2022. The focus is on scaling data production across diverse biological variables (for example, age and ethnicity) and on deploying and enhancing analytical, visualization and navigational tools to generate high-resolution, accessible 3D maps of major functional tissue units from more than 20 organs. This phase involves over 60 institutions and 400 researchers, with opportunities for active intra- and inter-consortia collaborations and for building a foundational resource for new biological insights and precision medicine. Below, ...
Recent progress in machine learning provides competitive methods for bioinformatics in many traditional areas, such as transcriptome sequencing and single-cell analysis. However, discovering biomedically meaningful correlations among cells across large-scale datasets remains challenging. Our attention-based neural network module, with 300 million parameters, is able to capture biological knowledge in a data-driven way. The module supports high-quality embedding, taxonomy analysis and similarity measurement. We tested the model on the Mouse Brain Atlas, which consists of 160,000 cells and 25,000 genes. The module produced findings of interest that were verified by biologists, and it achieved better performance when benchmarked against an autoencoder and principal component analysis.
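The principal component analysis baseline mentioned in the benchmark can be illustrated with a minimal NumPy sketch. The function `pca_embed` and the toy cell-by-gene matrix are assumptions for illustration only, not the authors' evaluation code.

```python
import numpy as np

def pca_embed(X, k=2):
    """PCA baseline via SVD: centre the matrix and project each row
    (cell) onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T  # (n_cells, k) embedding

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))  # toy data: 100 cells x 20 genes
Z = pca_embed(X, k=2)
print(Z.shape)
```

A learned embedding module would be compared against such a baseline on downstream tasks (for example, how well nearest neighbours in the embedding space recover known cell types), which is the kind of similarity measurement the abstract refers to.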
Recent developments in spatially resolved transcriptomics (SRT) technologies have enabled the profiling of transcriptome-wide gene expression while retaining the spatial location of each measured spot within a tissue. Meanwhile, the corresponding histopathology images of tissue sections are readily available and can be aligned to the measured spots. Given that histology images are more convenient and affordable to obtain in practice, we designed HOPE2Net, a multi-layer perceptron architecture that leverages information provided by SRT data to predict gene expression and pathway activities from histology images. Through systematic evaluations of different approaches for extracting deep image features and cellular morphology features, HOPE2Net selects features from a pre-trained Vision Transformer, a state-of-the-art deep learning model for image recognition. After extracting histological image features, HOPE2Net further integrates them with position embeddings to optimize the gene expression and pathway activity prediction tasks. Through analyses of breast cancer and prostate cancer SRT datasets obtained from numerous tissue sections in multiple patients, we demonstrate that HOPE2Net can accurately predict the expression patterns of highly variable genes and the activities of significantly enriched domain-specific pathways. We further show that the predicted gene expression and pathway activities can help detect cancer subtypes and aid in treatment decision-making. Given the growing interest in applying SRT in cancer genomics, we believe HOPE2Net holds potential for identifying biomarkers from direct screening of tissue histology images, which may be implemented in clinical studies for cancer diagnosis and decision-making processes. Citation Format: Kenong Su, Minxing Pang, Mingyao Li.
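The described design, an MLP that consumes pre-extracted ViT image features together with position embeddings and produces both gene-level and pathway-level outputs, can be sketched as below. All names and dimensions (`HistoMLP`, `feat_dim=768`, the two output heads) are illustrative assumptions, not the actual HOPE2Net implementation.

```python
import torch
import torch.nn as nn

class HistoMLP(nn.Module):
    """Sketch: concatenate pre-extracted ViT spot features with a learned
    position embedding, then predict genes and pathway activities with
    two separate output heads."""

    def __init__(self, feat_dim=768, pos_dim=32, hidden=256,
                 n_genes=1000, n_pathways=50):
        super().__init__()
        self.pos_embed = nn.Linear(2, pos_dim)  # (x, y) -> position embedding
        self.backbone = nn.Sequential(
            nn.Linear(feat_dim + pos_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.gene_head = nn.Linear(hidden, n_genes)
        self.pathway_head = nn.Linear(hidden, n_pathways)

    def forward(self, feats, coords):
        h = self.backbone(torch.cat([feats, self.pos_embed(coords)], dim=-1))
        return self.gene_head(h), self.pathway_head(h)

model = HistoMLP()
feats = torch.randn(8, 768)  # ViT features for 8 spots (768 is ViT-Base width)
coords = torch.rand(8, 2)    # normalised spot coordinates
genes, pathways = model(feats, coords)
print(genes.shape, pathways.shape)
```

Sharing the backbone between the two heads is one plausible way to realise the joint gene and pathway prediction the abstract describes, since pathway activity is an aggregate over member genes and can benefit from the same latent representation.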
HOPE2Net: Integrating histological features and position embeddings in spatially resolved transcriptomics to predict gene expression and pathway activities from histology images in tumors [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 1218.