Background: Accurate and robust pathological image analysis for colorectal cancer (CRC) diagnosis is time-consuming and knowledge-intensive, but is essential for CRC patients’ treatment. The current heavy workload of pathologists in clinics/hospitals may easily lead to unconscious misdiagnosis of CRC based on daily image analyses. Methods: Based on a state-of-the-art transfer-learned deep convolutional neural network in artificial intelligence (AI), we proposed a novel patch aggregation strategy for clinical CRC diagnosis using weakly labeled pathological whole-slide image (WSI) patches. This approach was trained and validated using 170,099 patches from > 14,680 WSIs and > 9631 subjects, covering diverse and representative clinical cases from multiple independent sources across China, the USA, and Germany. Results: Our AI tool agreed consistently and nearly perfectly with (average Kappa statistic 0.896), and often performed better than, most of the experienced expert pathologists when tested on diagnosing CRC WSIs from multiple centers. The average area under the receiver operating characteristic curve (AUC) of the AI was greater than that of the pathologists (0.988 vs 0.970) and was the best among AI methods applied to CRC diagnosis. Our AI-generated heatmap highlights the image regions of cancer tissue/cells. Conclusions: This first-ever generalizable AI system can handle large numbers of WSIs consistently and robustly, without the potential bias due to fatigue commonly experienced by clinical pathologists. It will drastically alleviate the heavy clinical burden of daily pathology diagnosis and improve the treatment for CRC patients. This tool is generalizable to other cancer diagnoses based on image recognition.
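To make the agreement figure above concrete, the following is a minimal sketch of a Kappa statistic between one pathologist and the AI. It assumes Cohen's kappa for two raters on categorical slide labels; the abstract does not state which Kappa variant was used, and the function name `cohen_kappa` and the toy labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels.

    Assumption: two raters (e.g. AI vs. one pathologist); the abstract does
    not say which Kappa variant was actually computed.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items both raters labeled identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy example (hypothetical labels: 1 = cancer, 0 = non-cancer).
ai_labels = [1, 1, 0, 0]
pathologist_labels = [1, 1, 0, 1]
kappa = cohen_kappa(ai_labels, pathologist_labels)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually preferred over simple percent agreement when class frequencies are unbalanced.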
Machine-assisted pathological recognition has focused on supervised learning (SL), which suffers from a significant annotation bottleneck. We propose a semi-supervised learning (SSL) method based on the mean teacher architecture using 13,111 whole slide images of colorectal cancer from 8803 subjects from 13 independent centers. SSL (~3150 labeled, ~40,950 unlabeled; ~6300 labeled, ~37,800 unlabeled patches) performs significantly better than SL. No significant difference is found between SSL (~6300 labeled, ~37,800 unlabeled) and SL (~44,100 labeled) at patch-level diagnoses (area under the curve (AUC): 0.980 ± 0.014 vs. 0.987 ± 0.008, P value = 0.134) and patient-level diagnoses (AUC: 0.974 ± 0.013 vs. 0.980 ± 0.010, P value = 0.117), which is close to human pathologists (average AUC: 0.969). The evaluation on 15,000 lung and 294,912 lymph node images also confirms that SSL can achieve performance similar to that of SL with massive annotations. SSL dramatically reduces the annotations required, which has great potential for effectively building expert-level pathological artificial intelligence platforms in practice.
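The mean teacher architecture mentioned above can be sketched in a few lines. This is an illustrative outline of the general technique only, not the authors' implementation: the teacher's weights track an exponential moving average (EMA) of the student's weights, and a consistency term penalizes disagreement between the two models' predictions on unlabeled patches. The function names, the flat weight lists, and the decay value `alpha=0.99` are all assumptions made for illustration.

```python
def ema_update(teacher_weights, student_weights, alpha=0.99):
    """Return new teacher weights as an EMA of the student weights.

    alpha is a hypothetical decay value; the abstract does not report one.
    """
    return [
        alpha * t + (1.0 - alpha) * s
        for t, s in zip(teacher_weights, student_weights)
    ]


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions.

    Computed on unlabeled patches, so no ground-truth labels are needed.
    """
    return sum(
        (s - t) ** 2 for s, t in zip(student_probs, teacher_probs)
    ) / len(student_probs)


# Toy usage with made-up numbers.
teacher = ema_update([1.0, 0.0], [0.0, 1.0], alpha=0.5)
loss = consistency_loss([0.9, 0.1], [0.9, 0.1])
```

In a full training loop, the student would be optimized on a supervised loss for labeled patches plus this consistency term for unlabeled ones, with `ema_update` applied to the teacher after each student step.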
Our findings support that ASB16-AS1 and SYN2 may represent two novel functional genes underlying BMD variation. The findings provide a basis for further functional mechanistic studies.
A major challenge in translating findings from genome-wide association studies (GWAS) to biological mechanisms is pinpointing functional variants because only a very small percentage of variants associated with a given trait actually impact the trait. We used an extensive epigenetics, transcriptomics, and genetics analysis of the TBX15/WARS2 neighbourhood to prioritize this region’s best-candidate causal variants for the genetic risk of osteoporosis (estimated bone density, eBMD) and obesity (waist-hip ratio or waist circumference adjusted for body mass index). TBX15 encodes a transcription factor that is important in bone development and adipose biology. Manual curation of 692 GWAS-derived variants gave eight strong candidates for causal SNPs that modulate TBX15 transcription in subcutaneous adipose tissue (SAT) or osteoblasts, which highly and specifically express this gene. None of these SNPs were prioritized by Bayesian fine-mapping. The eight regulatory causal SNPs were in enhancer or promoter chromatin seen preferentially in SAT or osteoblasts at TBX15 intron-1 or upstream. They overlap strongly predicted, allele-specific transcription factor binding sites. Our analysis suggests that these SNPs act independently of two missense SNPs in TBX15. Remarkably, five of the regulatory SNPs were associated with eBMD and obesity and had the same trait-increasing allele for both. We found that WARS2 obesity-related SNPs can be ascribed to high linkage disequilibrium with TBX15 intron-1 SNPs. Our findings from GWAS index, proxy, and imputed SNPs suggest that a few SNPs, including three in a 0.7-kb cluster, act as causal regulatory variants to fine-tune TBX15 expression and, thereby, affect both obesity and osteoporosis risk.
Multimodal fusion benefits disease diagnosis by providing a more comprehensive perspective. Developing algorithms is challenging due to data heterogeneity and the complex within- and between-modality associations. Deep-network-based data-fusion models have been developed to capture the complex associations, and the performance in diagnosis has improved accordingly. Moving beyond diagnosis prediction, evaluation of disease mechanisms is critically important for biomedical research. Deep-network-based data-fusion models, however, are difficult to interpret, which makes it difficult to study biological mechanisms. In this work, we develop an interpretable multimodal fusion model, namely gCAM-CCL, which can perform automated diagnosis and result interpretation simultaneously. The gCAM-CCL model can generate interpretable activation maps, which quantify pixel-level contributions of the input features. This is achieved by combining intermediate feature maps using gradient-based weights. Moreover, the estimated activation maps are class-specific, and the captured cross-data associations are interest/label related, which further facilitates class-specific analysis and biological mechanism analysis. We validate the gCAM-CCL model on a brain imaging-genetic study, and show that gCAM-CCL performs well for both classification and mechanism analysis. Mechanism analysis suggests that during task-fMRI scans, several object-recognition-related regions of interest (ROIs) are first activated and then several downstream encoding ROIs get involved. Results also suggest that the higher cognition performing group may have stronger neurotransmission signaling while the lower cognition performing group may have problems in brain/neuron development, resulting from genetic variations.
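The idea of combining intermediate feature maps using gradient-based weights can be illustrated with a small sketch. It assumes the standard Grad-CAM weighting (each channel's weight is the spatial average of the class-score gradient for that channel, followed by a ReLU); the abstract does not give the exact gCAM-CCL formula, and the function name `grad_cam` and the toy inputs are hypothetical.

```python
def grad_cam(feature_maps, gradients):
    """Class activation map from K feature maps and their gradients.

    feature_maps, gradients: lists of K channels, each an H x W nested list.
    Assumption: standard Grad-CAM weighting, not necessarily the exact
    gCAM-CCL formulation.
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    # One weight per channel: the spatial mean of that channel's gradient.
    weights = [
        sum(sum(row) for row in grad) / (height * width) for grad in gradients
    ]
    # Weighted sum of the feature maps across channels.
    cam = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(weights, feature_maps):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    # ReLU keeps only locations that contribute positively to the class.
    return [[max(0.0, value) for value in row] for row in cam]


# Toy example: two channels of size 1 x 2 with made-up values.
maps = [[[1.0, 2.0]], [[3.0, 1.0]]]
grads = [[[1.0, 1.0]], [[-1.0, -1.0]]]
activation = grad_cam(maps, grads)
```

Because the gradients are taken with respect to a particular class score, repeating the computation for a different class yields a different map, which is what makes such activation maps class-specific.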