In the past, the liver tumors were reported manually in an unstructured format. There actually exists much valuable knowledge in these reports for further disease risk assessment, disease recognition and treatment recommendation. Yet, it is not easy to read and mine knowledge from the unstructured reports. Hence, how to extract the knowledge from these biomedical reports effectively and efficiently has been a challenging issue in the past decades. Although a set of Natural Language Processing techniques were proposed for Bio-medical information retrieval, few related works were made on transforming the unstructured CT liver-tumor reports into structured ones. To aim at this issue, in this paper, we propose a two-stage report structuring method by integrating effective Natural Language Processing (NLP) and interpretable machine learning. For the first stage, the candidate keywords in unstructured reports are extracted. Next, the feature keywords are determined by the feature-selection technique. For the second stage, the well-known multi-classifiers are performed, and finally the reports are labeled in a refined structure format. Further, the factor keywords in the classification model are filtered to interpret the performance. In overall, the proposed report structuring method generates a hierarchical data structure, including the common features and refined features in the 1 st and 2 nd levels/stages, respectively. To reveal the performance of proposed method, a set of evaluations were conducted and the results show that, the proposed method is more promising than the fashion neural networks such as Bert (Bidirectional Encoder Representations from Transformers) in terms of effectiveness and efficiency.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.