Importance: Preprints have been widely adopted to enhance the timely dissemination of research across many scientific fields. Concerns remain that early, public access to preliminary medical research has the potential to propagate misleading or faulty research that has been conducted or interpreted in error. Objective: To evaluate the concordance among study characteristics, results, and interpretations described in preprints of clinical studies posted to medRxiv that are subsequently published in peer-reviewed journals (preprint-journal article pairs). Design, Setting, and Participants: This cross-sectional study assessed all preprints describing clinical studies that were initially posted to medRxiv in September 2020 and subsequently published in a peer-reviewed journal as of September 15, 2022. Main Outcomes and Measures: For preprint-journal article pairs describing clinical trials, observational studies, and meta-analyses that measured health-related outcomes, the sample size, primary end points, corresponding results, and overarching conclusions were abstracted and compared. Sample size and results from primary end points were considered concordant if they had exact numerical equivalence. Results: Among 1399 preprints first posted on medRxiv in September 2020, a total of 1077 (77.0%) had been published as of September 15, 2022, a median of 6 months (IQR, 3-8 months) after preprint posting. Of the 547 preprint-journal article pairs describing clinical trials, observational studies, or meta-analyses, 293 (53.6%) were related to COVID-19. Of the 535 pairs reporting sample sizes in both sources, 462 (86.4%) were concordant; 43 (58.9%) of the 73 pairs with discordant sample sizes had larger samples in the journal publication. There were 534 pairs (97.6%) with concordant and 13 pairs (2.4%) with discordant primary end points. 
Of the 535 pairs with numerical results for the primary end points, 434 (81.1%) had concordant primary end point results; 66 of the 101 discordant pairs (65.3%) had effect estimates that were in the same direction and were statistically consistent. Overall, 526 pairs (96.2%) had concordant study interpretations, including 82 of the 101 pairs (81.2%) with discordant primary end point results. Conclusions and Relevance: Most clinical studies posted as preprints on medRxiv and subsequently published in peer-reviewed journals had concordant study characteristics, results, and final interpretations. With more than three-fourths of preprints published in journals within 24 months, these results may suggest that many preprints report findings that are consistent with the final peer-reviewed publications.
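The concordance criterion and the reported proportions in the abstract above can be checked with a short script. This is a minimal sketch, not the authors' analysis code: the helper names `is_concordant` and `pct` are hypothetical, and the only data used are the counts and percentages quoted in the abstract.

```python
def is_concordant(preprint_value, journal_value):
    """Exact numerical equivalence, the criterion stated in the abstract."""
    return preprint_value == journal_value


def pct(count, total):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * count / total, 1)


# (count, total, reported percentage) triples copied from the abstract.
REPORTED = [
    (1077, 1399, 77.0),  # preprints published by September 15, 2022
    (293, 547, 53.6),    # pairs related to COVID-19
    (462, 535, 86.4),    # concordant sample sizes
    (43, 73, 58.9),      # discordant pairs with larger journal samples
    (534, 547, 97.6),    # concordant primary end points
    (13, 547, 2.4),      # discordant primary end points
    (434, 535, 81.1),    # concordant primary end point results
    (66, 101, 65.3),     # discordant but statistically consistent results
    (526, 547, 96.2),    # concordant study interpretations
    (82, 101, 81.2),     # concordant interpretations despite discordant results
]

# Collect any triple whose recomputed percentage differs from the reported one.
mismatches = [(c, t, r) for c, t, r in REPORTED if pct(c, t) != r]
print(mismatches)  # an empty list means every reported percentage is reproduced
```

Because the criterion is exact equality, a sample size of 422 in the preprint and 430 in the journal article would count as discordant regardless of how small the difference is.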
Although deep learning analysis of diagnostic imaging has shown increasing effectiveness in modeling non-small cell lung cancer (NSCLC) outcomes, only a minority of proposed deep learning algorithms have been externally validated. Given that a majority of these models are built on single-institution datasets, their generalizability across the entire population remains understudied. Moreover, the effect of biases that exist among institutional training datasets on the overall generalizability of deep learning prognostic models is unclear. We attempted to identify demographic and clinical characteristics which, if over-represented within training data, could affect the generalizability of deep learning models aimed at predicting survival in patients with NSCLC. Using a dataset of pre-treatment CT images of 422 patients diagnosed with NSCLC, we examined deep learning model performance across demographic and tumor-specific factors. Demographic factors of interest included age and gender. Clinical factors of interest included tumor histology, overall stage, T-Stage, and N-Stage. The effect of bias among training data was examined by varying the representation of demographic and clinical populations within the training and validation datasets. Model generalizability was measured by comparing AUC values among validation datasets (biased versus unbiased). AUC was estimated using 1,000 bootstrapped samples of 400 patients from validation cohorts. We found training datasets with biased representation of NSCLC histology to be associated with the greatest decrease in generalizability. Specifically, we found over-representation of adenocarcinoma within training datasets to be associated with an AUC reduction of 0.320 (0.296 - 0.344 CI, p<.001). Similarly, over-representation of squamous cell carcinoma was associated with an AUC reduction of 0.177 (0.156 - 0.201 CI, p<.001). 
Biases in age (AUC 0.103, p<0.001), T stage (0.170, p=0.01), and N stage (0.120, p=0.01) were also associated with reduced generalizability among deep learning models. Gender bias within training data was not associated with decreases in generalizability. Deep learning models of non-small cell lung cancer outcomes fail to generalize if trained on biased datasets. Specifically, overrepresentation of histologic subtypes may decrease the generalizability of deep learning models for NSCLC. Efforts to ensure that training data are representative of population demographics may lead to improved generalizability across more diverse patient populations. Citation Format: Aidan Gilson, Justin Du, Guneet Janda, Sachin Umrao, Marina Joel, Rachel Choi, Roy Herbst, Harlan Krumholz, Sanjay Aneja. The impact of phenotypic bias in the generalizability of deep learning models in non-small cell lung cancer [abstract]. In: Proceedings of the AACR Virtual Special Conference on Artificial Intelligence, Diagnosis, and Imaging; 2021 Jan 13-14. Philadelphia (PA): AACR; Clin Cancer Res 2021;27(5_Suppl):Abstract nr PO-074.
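The bootstrapped AUC estimate described in the abstract above (1,000 resamples of 400 patients) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' pipeline: the function names `auc` and `bootstrap_auc`, the percentile interval, the fixed seed, and the toy labels and scores are all hypothetical, and the abstract does not say how its intervals were computed.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def bootstrap_auc(labels, scores, n_boot=1000, sample_size=400, seed=0):
    """Mean AUC and a 95% percentile interval over bootstrap resamples."""
    if len(set(labels)) < 2 or sample_size < 2:
        raise ValueError("need both classes and a sample size of at least 2")
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < n_boot:
        # Sample patient indices with replacement.
        idx = [rng.randrange(n) for _ in range(sample_size)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # redraw resamples that contain only one class
        estimates.append(auc(ys, [scores[i] for i in idx]))
    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[min(n_boot - 1, int(0.975 * n_boot))]
    return sum(estimates) / n_boot, lower, upper


# Toy data (hypothetical): 1 = event, 0 = no event, with model risk scores.
demo_labels = [0, 0, 0, 1, 1, 1]
demo_scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.90]
mean_auc, lower, upper = bootstrap_auc(
    demo_labels, demo_scores, n_boot=200, sample_size=50
)
print(round(mean_auc, 3), round(lower, 3), round(upper, 3))
```

Running the same routine on a biased and an unbiased validation cohort and subtracting the two mean AUC values would give a reduction of the kind the abstract reports.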
Purpose: Although deep learning (DL) models have shown increasing ability to accurately classify diagnostic images in oncology, large amounts of well-curated data are often needed to match human-level performance. Given the relative paucity of imaging datasets for less prevalent cancer types, there is an increasing need for methods that can improve the performance of deep learning models trained using limited diagnostic images. Deep metric learning (DML) is a potential method that can improve accuracy in deep learning models trained on limited datasets. Leveraging a triplet-loss function, DML exponentially increases training data compared to a traditional DL model. In this study, we investigated the utility of DML to improve the accuracy of DL models trained to classify cancerous lesions found on screening mammograms. Methods: Using a dataset of 2620 lesions found on routine screening mammograms, we trained both a traditional DL model and a DML model to classify suspicious lesions as cancerous or benign. The VGG16 architecture was used as the basis for the DL and DML models. Model performance was compared by calculating model accuracy, sensitivity, and specificity on a blinded test set of 378 lesions. In addition to individual model performance, we also measured agreement accuracy when both the DL and DML models were combined. Sub-analyses were conducted to identify phenotypes which were best suited for each model type. Both models underwent hyperparameter optimization to identify ideal batch size, learning rate, and regularization to prevent overfitting. Results: We found that the combination of the traditional DL model with the DML model resulted in the highest overall accuracy (78.7%), representing a 7.1% improvement compared to the traditional DL model (p<.001). Alone, the traditional DL model had higher accuracy than the DML model (71.4% vs 66.4%). 
The traditional DL model had a higher sensitivity (94.8% vs 73.6%), but lower specificity (34.7% vs 55.1%), compared with the DML model. Sub-analyses suggested the traditional DL model was more accurate on higher density breasts, whereas the DML model was more accurate on lower density breasts. Additionally, the traditional DL model had the highest accuracy on oval shaped lesions, compared to the DML model which was most accurate on irregularly shaped breast lesions. Conclusion: Our study suggests that combining DML models with traditional DL models can improve diagnostic image classification performance in cancer. Our results suggest DML models may provide increased specificity and help with classification of unique populations often misclassified by traditional DL models. Further studies investigating the utility of DML on other cancer imaging tasks are necessary to successfully build more robust DL models in cancer imaging. Citation Format: Justin Du, Sachin Umrao, Enoch Chang, Marina Joel, Aidan Gilson, Guneet Janda, Rachel Choi, Yongfeng Hui, Sanjay Aneja. The utility of deep metric learning for breast cancer identification on mammographic images [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 184.
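The triplet-loss idea behind the DML model in the abstract above can be sketched as follows. This is a generic textbook formulation, not the authors' implementation: the Euclidean distance, the margin of 1.0, and the helper names `euclidean`, `triplet_loss`, and `count_triplets` are assumptions, and the plain lists stand in for embeddings that a network such as the VGG16-based model would produce.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther away than the positive."""
    return max(
        euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0
    )


def count_triplets(class_sizes):
    """Number of ordered (anchor, positive, negative) triplets.

    The anchor and positive are two distinct samples of one class, and the
    negative is any sample of a different class.
    """
    total = sum(class_sizes)
    return sum(n * (n - 1) * (total - n) for n in class_sizes)


# Toy 2-D embeddings (hypothetical values).
well_separated = triplet_loss([0, 0], [0, 1], [3, 4])  # positive near, negative far
poorly_separated = triplet_loss([0, 0], [3, 4], [0, 1])  # positive far, negative near
print(well_separated, poorly_separated)
print(count_triplets([3, 2]))  # triplets available from 5 labelled samples
```

The triplet count grows much faster than the number of labelled lesions, which is the sense in which a triplet-based model sees more training examples than a conventional classifier trained on the same images.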
radiation oncology and diagnostic radiology departments. Given the enormity of this expenditure, it is imperative to better understand the funding distribution of research topics to gain insight into funding appropriateness. Multiple studies have categorized grants (including an ASTRO Grant Funding Portfolio Analysis in 2017), but these approaches have been limited to manual analysis of small corpora. Due to the high annual number of grants, there is a need for automatic and systematic extraction of research topics from grants to enable evaluation of trends, productivity, and geographic distribution. Materials/Methods: We analyzed 4346 R-type grants (excluding R25) awarded to Departments of Radiation Oncology/Diagnostic Radiology and funded by the National Cancer Institute and National Institute of Biomedical Imaging and Bioengineering from FY 2010-2020 using NIH ExPORTER. 'Project Terms' were preprocessed to weight their importance using TF-IDF vectorization and principal component analysis. Spectral clustering was used to cluster the grants. The optimal cluster number was determined using scoring methods of intercluster distance. Manual validation was performed to verify cluster correctness. Results: We found that 12 clusters best represented the separation of the R-type grant research directions. These clusters represent clear topics such as oncogenesis, image reconstruction, and assay development. Notable trends include increased funding of radiation technology and hepatobiliary therapy clusters averaging +$1.2M and +$1.1M annual growth, respectively, over 10 years, and decreased funding of image reconstruction and MRI clusters averaging -$0.71M and -$0.67M annually. Further analysis shows that the DNA damage cluster is most geographically skewed, with 52% of the funding going to institutions in three cities (Dallas, Houston, New Haven). 
Prostate cancer is also heavily geographically skewed, with the most funded city (San Francisco) having triple the funds of the second highest (Baltimore). The number of published articles per grant also shows a bias, with the workshops/conferences and image reconstruction clusters having publication rates of 35.4 and 9.9, respectively. Radiation biology research is the biggest cluster and has a publication rate of 12.7, which is lower than average for NCI (16.3), and may reflect the pace of basic vs. applied research. Conclusion: We propose an unsupervised machine learning framework to categorize grants by area of investigation, which is more efficient and systematic than manual labeling and more holistic than keyword searching. This clustering example suggests a trend from imaging-based towards therapy-based research over the past 10 years. We also identify biases in publication rate and geographic distribution of funds.
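The term-weighting step of the pipeline described in the abstract above can be illustrated with a small TF-IDF routine. This sketch covers only the weighting, not the principal component analysis or spectral clustering, and it uses a textbook variant (term frequency times log of inverse document frequency) because the abstract does not state which variant was used. The function name `tf_idf` and the example project terms are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Each document is a list of terms. Weight = (count / document length)
    * log(number of documents / number of documents containing the term).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for terms in documents:
        doc_freq.update(set(terms))  # count each term once per document
    weighted = []
    for terms in documents:
        counts = Counter(terms)
        total = len(terms)
        weighted.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted


# Hypothetical 'Project Terms' for three grants.
grants = [
    ["radiation", "dna damage", "repair"],
    ["radiation", "mri", "image reconstruction"],
    ["radiation", "mri", "contrast agent"],
]
weights = tf_idf(grants)
for grant_weights in weights:
    print({term: round(w, 3) for term, w in grant_weights.items()})
```

A term that appears in every grant, such as "radiation" here, receives a weight of zero, while a term unique to one grant receives the largest weight. That is why this weighting helps separate grants by topic before clustering.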