PURPOSE This study documents the creation of an automated, longitudinal, and prospective data and analytics platform for breast cancer at a regional cancer center. The platform combines principles of data warehousing with natural language processing (NLP) to provide the integrated, timely, meaningful, high-quality, and actionable data required to establish a learning health system.

METHODS Data from six hospital information systems and one external data source were integrated nightly by automated extract/transform/load jobs. Free-text clinical documentation was processed using a commercial NLP engine.

RESULTS The platform contains 141 data elements for 7,019 patients with newly diagnosed breast cancer who received care at our regional cancer center from January 1, 2014, to June 3, 2022. Daily updating of the database takes an average of 56 minutes. Evaluation of the tuned NLP jobs found high overall performance: 19 variables had an F1 of 1.0, and a further 16 variables had an F1 greater than 0.95.

CONCLUSION This study describes how data warehousing combined with NLP can be used to create a prospective data and analytics platform that enables a learning health system. Although the upfront time investment required to create the platform was considerable, now that it has been developed, daily data processing is completed automatically in less than an hour.
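For readers unfamiliar with the F1 metric reported above, the following is a minimal sketch of how an F1 score could be computed for a single NLP-extracted variable by comparing extracted values against a manually reviewed reference. The function names and the sample labels are hypothetical and purely illustrative; the abstract does not describe the study's actual evaluation code or data.

```python
from typing import Sequence, Tuple


def confusion_counts(gold: Sequence[bool], predicted: Sequence[bool]) -> Tuple[int, int, int]:
    """Count true positives, false positives, and false negatives.

    `gold` holds the reference (manually reviewed) labels and `predicted`
    holds the NLP output for the same documents, in the same order.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    return tp, fp, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: five documents, one variable (illustrative labels only).
gold_labels = [True, True, False, True, False]
nlp_labels = [True, True, False, False, False]
tp, fp, fn = confusion_counts(gold_labels, nlp_labels)
score = f1_score(tp, fp, fn)
```

An F1 of 1.0 corresponds to no false positives and no false negatives for that variable in the evaluation sample.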