In this research, we explored various state-of-the-art biomedical-specific pre-trained Bidirectional Encoder Representations from Transformers (BERT) models for the National Library of Medicine - Chemistry (NLM-CHEM) and LitCovid tracks in the BioCreative VII Challenge, and proposed a BERT-based ensemble learning approach that integrates the advantages of the individual models to improve system performance. The experimental results on the NLM-CHEM track show that our method achieves F1-scores of 85% and 91.8% in strict and approximate evaluations, respectively. Moreover, the proposed Medical Subject Headings identifier (MeSH ID) normalization algorithm is effective in entity normalization, achieving an F1-score of about 80% in both strict and approximate evaluations. For the LitCovid track, the proposed method is also effective in detecting topics in the Coronavirus disease 2019 (COVID-19) literature: it outperformed the compared methods and achieved state-of-the-art performance on the LitCovid corpus. Database URL: https://www.ncbi.nlm.nih.gov/research/coronavirus/.
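The abstract above does not say how the ensemble combines its member models. As a minimal sketch of one common option, the snippet below averages per-label probabilities from several classifiers and keeps the labels whose mean reaches a threshold. The function name `average_ensemble`, the 0.5 threshold, and the example scores are illustrative assumptions, not the authors' method or data.

```python
from typing import Dict, List


def average_ensemble(model_probs: List[Dict[str, float]],
                     threshold: float = 0.5) -> List[str]:
    """Average per-label probabilities across models (soft voting).

    Hypothetical helper: a label missing from a model's output is
    treated as probability 0.0 for that model.
    """
    if not model_probs:
        raise ValueError("need predictions from at least one model")
    # Collect every label that any model scored.
    labels = sorted(set().union(*model_probs))
    averaged = {
        label: sum(p.get(label, 0.0) for p in model_probs) / len(model_probs)
        for label in labels
    }
    # Multi-label decision: keep each label whose mean reaches the threshold.
    return [label for label in labels if averaged[label] >= threshold]


# Made-up scores from three hypothetical models for a single article.
example_scores = [
    {"Treatment": 0.9, "Prevention": 0.4},
    {"Treatment": 0.7, "Prevention": 0.6},
    {"Treatment": 0.8, "Prevention": 0.2},
]
predicted_topics = average_ensemble(example_scores)
print(predicted_topics)
```

Averaging probabilities rather than taking a majority vote lets a confident model outweigh two uncertain ones, which is one reason this form of ensembling is often tried first.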
Private entrepreneurs and government organizations widely adopt Facebook fan pages as an online social platform to communicate with the public. Posting on the platform to attract comments and shares is an effective way to increase public engagement. Moreover, the comment function allows users who have read a post to express their thoughts, so analyzing the comments reveals how users feel about that post. The goal of this study is to investigate the public image of organizations by exploring the content on fan pages. To efficiently analyze the enormous amount of public opinion data generated on social media, we propose a Bi-directional Long Short-Term Memory (BiLSTM) model that captures the detailed sentiment information carried by individual words. It first predicts the sentiment of the smallest units in a text in terms of Valence and Arousal (VA) values, and then fuses these values into a deep learning model to analyze the sentiment of the whole text. Experiments show that our model achieves state-of-the-art performance in predicting the VA values of words. Additionally, combining VA with a BiLSTM model improves performance on social media text sentiment analysis. Our method can help governments and other organizations improve the effectiveness of their social media operations by understanding public opinion on related issues.
The BioCreative National Library of Medicine (NLM)-Chem track calls for a community effort to fine-tune automated recognition of chemical names in the biomedical literature. Chemicals are one of the most searched biomedical entities in PubMed, and—as highlighted during the coronavirus disease 2019 pandemic—their identification may significantly advance research in multiple biomedical subfields. While previous community challenges focused on identifying chemical names mentioned in titles and abstracts, the full text contains valuable additional detail. We, therefore, organized the BioCreative NLM-Chem track as a community effort to address automated chemical entity recognition in full-text articles. The track consisted of two tasks: (i) chemical identification and (ii) chemical indexing. The chemical identification task required predicting all chemicals mentioned in recently published full-text articles, both span [i.e. named entity recognition (NER)] and normalization (i.e. entity linking), using Medical Subject Headings (MeSH). The chemical indexing task required identifying which chemicals reflect topics for each article and should therefore appear in the listing of MeSH terms for the document in the MEDLINE article indexing. This manuscript summarizes the BioCreative NLM-Chem track and post-challenge experiments. We received a total of 85 submissions from 17 teams worldwide. The highest performance achieved for the chemical identification task was 0.8672 F-score (0.8759 precision and 0.8587 recall) for strict NER performance and 0.8136 F-score (0.8621 precision and 0.7702 recall) for strict normalization performance. The highest performance achieved for the chemical indexing task was 0.6073 F-score (0.7417 precision and 0.5141 recall). 
This community challenge demonstrated that (i) the current substantial achievements in deep learning technologies can be utilized to further improve automated prediction accuracy and (ii) the chemical indexing task is substantially more challenging. We look forward to further developing biomedical text-mining methods to respond to the rapid growth of biomedical literature. The NLM-Chem track dataset and other challenge materials are publicly available at https://ftp.ncbi.nlm.nih.gov/pub/lu/BC7-NLM-Chem-track/. Database URL: https://ftp.ncbi.nlm.nih.gov/pub/lu/BC7-NLM-Chem-track/
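The F-scores reported for the NLM-Chem track are consistent with the balanced F-measure, the harmonic mean of precision and recall, F = 2PR / (P + R). The short check below recomputes it from the precision and recall figures quoted in the abstract; the helper name `f_score` is an assumption made for illustration.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) as quoted in the abstract.
reported = {
    "strict NER": (0.8759, 0.8587, 0.8672),
    "strict normalization": (0.8621, 0.7702, 0.8136),
    "chemical indexing": (0.7417, 0.5141, 0.6073),
}

for task, (p, r, f) in reported.items():
    print(f"{task}: computed {f_score(p, r):.4f}, reported {f:.4f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, the low recall of 0.5141 on the indexing task holds its F-score well below its precision of 0.7417.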
We aimed to develop and validate a model for predicting mortality in patients with angina across the spectrum of dysglycemia. A total of 1479 patients admitted for coronary angiography due to angina were enrolled. All-cause mortality served as the primary endpoint. The models were validated with five-fold cross-validation to predict long-term mortality. The features selected by least absolute shrinkage and selection operator (LASSO) were age, heart rate, plasma glucose levels at 30 min and 120 min during an oral glucose tolerance test (OGTT), the use of angiotensin II receptor blockers, the use of diuretics, and smoking history. The best-performing model was a random survival forest built with the selected features. It had good discriminative ability (Harrell’s C-index: 0.829) and acceptable calibration (Brier score: 0.08) for predicting long-term mortality. Among patients with obstructive coronary artery disease confirmed by angiography, our model outperformed the Global Registry of Acute Coronary Events discharge score for mortality prediction (Harrell’s C-index: 0.829 vs. 0.739, p < 0.001). In conclusion, we developed a machine learning model to predict long-term mortality among patients with angina. With the integration of OGTT, the model could help identify patients at high risk of mortality across the spectrum of dysglycemia.
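Harrell's C-index, the discrimination measure quoted above, is the fraction of comparable patient pairs in which the patient who died earlier was assigned the higher predicted risk. The following is a minimal pure-Python sketch of that definition for right-censored data; the function name `harrell_c_index` and the toy values are hypothetical and are not taken from the study, and ties in survival time are simply ignored.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float],
                    events: Sequence[bool],
                    risks: Sequence[float]) -> float:
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when subject i had an observed event
    and a strictly shorter follow-up time than subject j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier death, higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (not from the study): follow-up time, death observed, predicted risk.
toy_times = [2, 4, 6, 8]
toy_events = [True, True, False, True]
toy_risks = [0.9, 0.4, 0.6, 0.1]
print(harrell_c_index(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.829 and 0.739 are compared.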