Human cancers are heterogeneous in their cell composition and site of origin. Metastasis raises the problem of determining where migrated tumor cells originated. Tracing the tissue of origin and tumor type in primary and metastasized cancers is clinically important. DNA methylation alterations play a crucial role in carcinogenesis and mark cell fate differentiation, and thus can be used to trace tumor tissue of origin. In this study, we employed a novel tumor-type-specific hierarchical model using genome-scale DNA methylation data to develop a multilayer perceptron model, HiTAIC, to trace tissue of origin and tumor type in 27 cancers from 23 tissue sites in data from 7735 tumors with high resolution, accuracy, and specificity. In tracing primary cancer origin, HiTAIC accuracy was 99% in the test set and 93% in the external validation data set. Metastatic cancers were identified with 96% accuracy in the external data set. HiTAIC is available as a user-friendly web-based application at https://sites.dartmouth.edu/salaslabhitaic/. In conclusion, we developed HiTAIC, a DNA methylation-based algorithm, to trace tumor tissue of origin in primary and metastasized cancers. The high accuracy and resolution of tumor tracing using HiTAIC hold promise for clinical assistance in identifying cancer of unknown origin.
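The hierarchical idea described above, in which a sample is first assigned to a broad tissue group and then to a tumor type within that group, can be illustrated with a minimal sketch. This is not the HiTAIC implementation: simple linear scorers stand in for the multilayer perceptrons, and the function name `predict_hierarchical`, the labels, the weights, and the three-value feature vector are all hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(model, features):
    """Score each label with a linear model and return (best_label, probability).

    model maps a label to a list of weights, one per feature. A linear scorer
    is an assumption made for brevity; it stands in for a trained network.
    """
    labels = list(model)
    scores = [sum(w * x for w, x in zip(model[label], features)) for label in labels]
    probs = softmax(scores)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs[best]


def predict_hierarchical(features, top_model, sub_models):
    """Two-level prediction: tissue group first, then tumor type within it."""
    group, p_group = classify(top_model, features)
    if group not in sub_models:
        # No finer-grained model for this group: the group is the final call.
        return group, group, p_group
    subtype, p_sub = classify(sub_models[group], features)
    return group, subtype, p_group * p_sub


# Hypothetical toy models and a made-up methylation-like feature vector.
top_model = {"lung": [1.0, 0.0, 0.0], "kidney": [0.0, 1.0, 0.0]}
sub_models = {"lung": {"LUAD": [0.0, 0.0, 1.0], "LUSC": [0.0, 0.0, -1.0]}}
features = [0.9, 0.1, 0.2]

group, subtype, confidence = predict_hierarchical(features, top_model, sub_models)
print(group, subtype, round(confidence, 3))
```

Multiplying the two probabilities gives a rough joint confidence for the final call; a real system would need calibrated probabilities for that number to be meaningful.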
Current Procedural Terminology (CPT) codes are a numerical coding system used to bill for medical procedures and services and, crucially, represent a major reimbursement pathway. Given that pathology services represent a consequential source of hospital revenue, understanding instances where codes may have been misassigned or underbilled is critical. Several algorithms have been proposed that can identify improperly billed CPT codes in existing datasets of pathology reports. Estimating the fiscal impact of these reports requires a coder (i.e., billing staff) to review the original reports and manually code them again. Because machine learning algorithms can reassign codes quickly, the bottleneck in validating these reassignments is the manual re-coding process, which can prove cumbersome. This work documents the development of a rapidly deployable dashboard for examining reports that the original coder may have misbilled. Our dashboard features the following main components: 1) a bar plot showing the predicted probabilities for each CPT code, 2) an interpretation plot showing how each word in the report contributes to the overall prediction, and 3) a field where the user enters the CPT code they have chosen to assign. The dashboard uses the algorithms developed to accurately identify CPT codes to highlight the codes missed by the original coders. To demonstrate the function of this web application, we recruited pathologists to use it to highlight reports that had codes incorrectly assigned. We expect this application to accelerate the validation of reassigned codes by facilitating rapid review of false positive pathology reports. In the future, we will use this technology to review thousands of past cases in order to estimate the impact underbilling has on departmental revenue.
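The core check behind such a dashboard, surfacing codes the model predicts with high probability but the original coder did not assign, can be sketched as follows. This is an illustrative sketch only, not the authors' code: the function name `flag_missed_codes`, the 0.5 threshold, and the example probabilities are assumptions.

```python
def flag_missed_codes(predicted_probs, assigned_codes, threshold=0.5):
    """Return (code, probability) pairs the model predicts but the coder omitted.

    predicted_probs: dict mapping a CPT code string to a predicted probability.
    assigned_codes: iterable of CPT codes the original coder billed.
    threshold: hypothetical cutoff above which a code counts as predicted.
    """
    assigned = set(assigned_codes)
    missed = [
        (code, prob)
        for code, prob in predicted_probs.items()
        if prob >= threshold and code not in assigned
    ]
    # Highest-probability candidates first, so a reviewer sees them at the top.
    return sorted(missed, key=lambda pair: pair[1], reverse=True)


# Made-up model output for a single pathology report.
predicted = {"88305": 0.97, "88342": 0.81, "88312": 0.12}
originally_billed = ["88305"]

flags = flag_missed_codes(predicted, originally_billed)
print(flags)
```

A reviewer would then confirm or reject each flagged code, which is the manual validation step the dashboard is meant to speed up.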