With the rapid growth of information technology, the need to process substantial amounts of health data with advanced methods is increasing. A large amount of valuable data exists in natural language text such as diagnosis text, discharge summaries, online health discussions, and eligibility criteria of clinical trials. Health natural language processing, as an interdisciplinary field of natural language processing and health care, plays a substantial role in a wide scope of both methodology development and applications. This editorial shares the most recent methodology innovations of health natural language processing and applications in the medical domain published in this JMIR Medical Informatics special theme issue entitled "Health Natural Language Processing: Methodology Development and Applications".
With the rapid development of artificial intelligence (AI) technologies and the large amount of pharmacovigilance-related data stored electronically, data-driven automatic methods urgently need to be applied to all aspects of pharmacovigilance to assist healthcare professionals. However, the quantity and quality of data directly affect the performance of AI, and there are particular challenges to implementing AI in resource-limited settings. Analyzing challenges and solutions for AI-based pharmacovigilance in resource-limited settings can improve pharmacovigilance frameworks and capabilities in these settings. In this review, we group the challenges into four categories: establishing a database for an AI-based pharmacovigilance system, lack of human resources, weak AI technology, and insufficient government support. This study also discusses possible solutions and future perspectives on AI-based pharmacovigilance in resource-limited settings.
Background: Ulcerative colitis (UC) is a chronic nonspecific inflammatory disease of the colon and rectum with unknown etiology, and its symptoms include bloody diarrhea, abdominal pain, and hematochezia. Traditional Chinese medicine compound preparations have a good multi-target therapeutic effect on UC. Ganjiang decoction (GD), a classic traditional prescription in China, contains Zingiberis Rhizoma, Angelicae Sinensis Radix, Coptidis Rhizoma, Phellodendri Chinensis Cortex, Sanguisorbae Radix, Granati Pericarpium, and Asini Corii Colla and could be used to treat symptoms of UC. As a preliminary step toward a colon-targeted GD preparation, this study explored the relationship between extraction method and efficacy of GD. Methods: High-performance liquid chromatography (HPLC) was used for the fingerprinting of five preparation methods of GD. HPLC and gas chromatography were used to quantitatively analyze the important chemical components of GD and compare their differences. Mice with UC induced by dextran sulphate sodium (DSS) received the extracts from the five preparation methods of GD via gavage. Disease activity index (DAI) score, colonic length, relative weight of spleen, pathological analysis results, inflammatory factors, therapeutic effect of the five preparation methods of GD, and their relationship with extraction process were compared. Results: Cluster analysis revealed that the content of the components extracted by the traditional extraction method was significantly different from that of the other four methods. The third and fifth preparation methods extracted Coptidis Rhizoma and Phellodendri Chinensis Cortex with 50% ethanol to obtain more alkaloids. In the fourth and fifth methods, more volatile oils were detected after adding Zingiberis Rhizoma and Angelicae Sinensis Radix fine powder. According to DAI score, colonic length, relative weight of spleen, pathological analysis results, and inflammatory factors, the third method showed a good therapeutic effect, while the fifth method had the best therapeutic effect. Conclusions: The results showed that differences in efficacy among the five GD extracts against DSS-induced UC in mice were closely related to the extraction method. Our study improved the extraction process of GD and provided a foundation for the development of enteric-soluble preparations and a new idea for traditional Chinese medicine compound preparation.
Background: Eligibility criteria are the primary strategy for screening the target participants of a clinical trial. Automated classification of clinical trial eligibility criteria text using machine learning methods improves recruitment efficiency and reduces the cost of clinical research. However, existing methods suffer from poor classification performance due to the complexity and imbalance of eligibility criteria text data. Methods: An ensemble learning-based model with metric learning is proposed for eligibility criteria classification. The model integrates a set of pre-trained models including Bidirectional Encoder Representations from Transformers (BERT), A Robustly Optimized BERT Pretraining Approach (RoBERTa), XLNet, Pre-training Text Encoders as Discriminators Rather Than Generators (ELECTRA), and Enhanced Representation through Knowledge Integration (ERNIE). Focal Loss is used as the loss function to address the data imbalance problem. Metric learning is employed to train the embedding of each base model so that features of different categories are easier to distinguish. Soft Voting is applied to obtain the final classification of the ensemble model. The dataset is from the standard evaluation task 3 of the 5th China Health Information Processing Conference and contains 38,341 eligibility criteria texts in 44 categories. Results: Our ensemble method had an accuracy of 0.8497, a precision of 0.8229, and a recall of 0.8216 on the dataset. The macro F1-score was 0.8169, outperforming state-of-the-art baseline methods by 0.84% on average. In addition, the performance improvement had a p-value of 2.152e-07 with a standard t-test, indicating that our model achieved a significant improvement. Conclusions: A model for classifying eligibility criteria text of clinical trials based on multi-model ensemble learning and metric learning was proposed. The experiments demonstrated that our ensemble model significantly improved classification performance. In addition, metric learning was able to improve word embedding representation, and the focal loss reduced the impact of data imbalance on model performance.
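The two generic building blocks named in the abstract above, focal loss and soft voting, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the default gamma of 2.0, the optional per-class alpha weights, and the equal model weights are all assumptions made for illustration, and the pre-trained base models and the metric learning step are not reproduced.

```python
import numpy as np


def focal_loss(probs, labels, gamma=2.0, alpha=None):
    """Mean multi-class focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    probs  : (n_samples, n_classes) predicted class probabilities
    labels : (n_samples,) integer class indices
    gamma  : focusing parameter (assumed default; gamma=0 gives cross-entropy)
    alpha  : optional (n_classes,) per-class weights (assumption)
    """
    n = probs.shape[0]
    # Probability each sample assigns to its true class, clipped for log safety.
    p_t = np.clip(probs[np.arange(n), labels], 1e-12, 1.0)
    # Well-classified samples (p_t near 1) are down-weighted.
    weights = (1.0 - p_t) ** gamma
    if alpha is not None:
        weights = weights * np.asarray(alpha)[labels]
    return float(np.mean(-weights * np.log(p_t)))


def soft_vote(prob_list, model_weights=None):
    """Average class probabilities across models, then take the argmax.

    prob_list     : list of (n_samples, n_classes) arrays, one per base model
    model_weights : optional per-model weights; equal weights if None
    """
    stacked = np.stack(prob_list, axis=0)  # (n_models, n_samples, n_classes)
    avg = np.average(stacked, axis=0, weights=model_weights)
    return avg.argmax(axis=-1), avg


if __name__ == "__main__":
    # Toy probabilities standing in for two base models on one sample.
    model_a = np.array([[0.6, 0.4]])
    model_b = np.array([[0.1, 0.9]])
    predictions, averaged = soft_vote([model_a, model_b])
    print(predictions, averaged)
    print(focal_loss(model_b, np.array([1])))
```

In the toy example the two models disagree on the top class, so a hard majority vote would tie; averaging the probabilities lets the more confident model decide, which is the usual motivation for soft voting.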