This paper presents an approach for the automatic association of diagnoses in Bulgarian language to ICD-10 codes. Since this task is currently performed manually by medical professionals, the ability to automate it would save time and allow doctors to focus more on patient care. The presented approach employs a fine-tuned language model (i.e. BERT) as a multi-class classification model. As there are several different types of BERT models, we conduct experiments to assess the applicability of domain and language specific model adaptation. To train our models we use a big corpora of about 350,000 textual descriptions of diagnosis in Bulgarian language annotated with ICD-10 codes. We conduct experiments comparing the accuracy of ICD-10 code prediction using different types of BERT language models. The results show that the MultilingualBERT model (Accuracy Top 1-81%; Macro F1-86%, MRR Top 5-88%) outperforms other models. However, all models seem to suffer from the class imbalance in the training dataset. The achieved accuracy of prediction in the experiments can be evaluated as very high, given the huge amount of classes and noisiness of the data. The result also provides evidence that the collected dataset and the proposed approach can be useful in building an application to help medical practitioners with this task and encourages further research to improve the prediction
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.