The transcription factor, early growth response 1 (EGR1), has important roles in various cell types in response to different stimuli. EGR1 is thought to be involved in differentiation of bovine skeletal muscle-derived satellite cells (MDSCs); however, the precise effects of EGR1 on differentiation of MDSCs and its mechanism of action remain unknown. In the present study, a
Background Social media analysis tools have been used to monitor public sentiment and communication methods during public health emergencies. To respond to such emergencies, it is necessary to better understand the impact of the crisis on the public and to provide reference material for the prevention of future public health emergencies. We focus on the sentiments surrounding the public health emergency created by COVID-19. Objective This study aims to better understand the impact of public health emergencies on citizens and provide reference material for future public health emergency prevention. Methods The fuzzy c-means method was used to divide 850,083 pieces of Weibo content from January 24, 2020, to March 31, 2020, into seven categories of emotions: fear, happiness, disgust, surprise, sadness, anger, and good. The changes in emotion were tracked over time. Results The results indicated that people showed "surprise" overall (55.89%); however, the "surprise" decreased with time. As knowledge regarding the coronavirus disease 2019 (COVID-19) increased (content about COVID-19 knowledge: from 21.16% to 4.19%), the "surprise" of the citizens decreased (from 59.95% to 46.58%). Citizens' feelings of "fear" and "good" increased as the number of deaths associated with COVID-19 increased ("fear": from 15.42% to 20.95%; "good": from 10.31% to 18.89%). As the infection was suppressed, the feelings of "fear" and "good" diminished ("fear": from 20.95% to 15.79%; "good": from 18.89% to 8.46%). Conclusions In this study, the emotions and changes in emotions of Weibo users were analyzed in chronological order. The results of this study can help prepare for future public health emergencies.
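For readers unfamiliar with the fuzzy c-means method named in the Methods above, the following is a minimal sketch of the standard algorithm in plain Python. It is not the authors' implementation: the abstract does not say how Weibo content was turned into numeric features, so the input here is assumed to be a list of numeric tuples, and the function name, the fuzzifier `m=2.0`, the iteration limit, and the tolerance are all illustrative assumptions.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, memberships); memberships[i][k] is the degree to
    which point i belongs to cluster k, and each row sums to 1.
    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ))

        # Memberships are recomputed from distances to the new centers.
        shift = 0.0
        for i in range(n):
            dist = [
                max(1e-12, sum((points[i][d] - c[d]) ** 2 for d in range(dim)) ** 0.5)
                for c in centers
            ]
            new_row = [
                1.0 / sum((dist[k] / dist[j]) ** (2.0 / (m - 1.0))
                          for j in range(n_clusters))
                for k in range(n_clusters)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changed by more than the tolerance.
        if shift < tol:
            break

    return centers, u
```

Unlike hard clustering, each item keeps a graded membership in every category, which is why the method suits content that may express more than one emotion at once.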
BACKGROUND Modern medicine generates unstructured data containing a large amount of information. Extracting useful knowledge from this data and making scientific decisions for diagnosing and treating diseases have become increasingly necessary. Unstructured data, such as in the Medical Information Mart for Intensive Care III (MIMIC-III) dataset, contain several ambiguous words demonstrating the subjectivity of doctors. These data can be used to further improve the accuracy of medical support system assessments. OBJECTIVE We propose using the fuzzy c-means (FCM) method and Gaussian membership to quantify the subjective words in the clinical medical dataset MIMIC-III. METHODS Using 381,091 radiology reports collected from MIMIC-III, we extracted words representing the subjective degree from the text and converted them into corresponding membership intervals. RESULTS Consequently, the words representing each degree of each disease had a range of corresponding values. Examples of membership medians were atelectasis (2.971), pneumonia (3.121), pneumothorax (2.899), pulmonary edema (3.051), and pulmonary embolus (2.435). These membership intervals can determine the symptoms of each disease. CONCLUSIONS In this study, we used the FCM and Gaussian functions to extract words from MIMIC-III that represent a subjective degree and cannot be processed by a computer, and performed fuzzy processing on them. It was concluded that words representing the degree in an English-interpreted report can be extracted and quantified. The use of these words in medical support systems may improve diagnostic accuracy.
Background Due to the development of medical data, a large amount of clinical data has been generated. These unstructured data contain substantial information. Extracting useful knowledge from this data and making scientific decisions for diagnosing and treating diseases have become increasingly necessary. Unstructured data, such as in the Medical Information Mart for Intensive Care III (MIMIC-III) data set, contain several ambiguous words that demonstrate the subjectivity of doctors, such as descriptions of patient symptoms. These data could be used to further improve the accuracy of medical diagnostic system assessments. To the best of our knowledge, there is currently no method for extracting subjective words that express the extent of these symptoms (hereinafter, "degree words"). Objective Therefore, we propose using the fuzzy c-means (FCM) method and Gaussian membership to quantify the degree words in the clinical medical data set MIMIC-III. Methods First, we preprocessed the 381,091 radiology reports collected in MIMIC-III, and then we used the FCM method to extract degree words from unstructured text. Thereafter, we used the Gaussian membership method to quantify the extracted degree words, which transforms the fuzzy words extracted from the medical text into computer-recognizable numbers. Results The results showed that the digitization of ambiguous words in medical texts is feasible. The words representing each degree of each disease had a range of corresponding values. Examples of membership medians were 2.971 (atelectasis), 3.121 (pneumonia), 2.899 (pneumothorax), 3.051 (pulmonary edema), and 2.435 (pulmonary embolus). Additionally, all extracted words contained the same subjective words (low, high, etc.), which allows for an objective evaluation method. Furthermore, we will verify the specific impact of the quantification results of ambiguous words such as symptom words and degree words on the use of medical texts in subsequent studies. These same ambiguous words may be used as a new set of feature values to represent the disorders. Conclusions This study proposes an innovative method for handling subjective words. We used the FCM method to extract the subjective degree words in the English-interpreted reports of MIMIC-III and then used Gaussian functions to quantify the subjective degree words. In this method, words containing subjectivity in unstructured texts can be automatically processed and transformed into numerical ranges by digital processing. It was concluded that the digitization of ambiguous words in medical texts is feasible.
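As a rough illustration of how a Gaussian membership function can turn a degree word into a numerical range, the sketch below evaluates the standard Gaussian membership and derives the interval where membership is at least a chosen level (an alpha-cut). The abstract does not give the authors' exact procedure, so the alpha-cut step, the function names, and the example center and width are assumptions made only for illustration.

```python
import math


def gaussian_membership(x, center, sigma):
    """Standard Gaussian membership: 1.0 at the center, decaying with distance."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def alpha_cut(center, sigma, alpha=0.5):
    """Interval of x where the Gaussian membership is at least alpha.

    Solving exp(-(x - c)^2 / (2 * sigma^2)) >= alpha for x gives
    |x - c| <= sigma * sqrt(-2 * ln(alpha)), for 0 < alpha <= 1.
    """
    half_width = sigma * math.sqrt(-2.0 * math.log(alpha))
    return (center - half_width, center + half_width)


# Hypothetical degree word: the center and sigma below are invented
# placeholders, not values reported in the study.
low, high = alpha_cut(center=3.0, sigma=0.5, alpha=0.5)
print(f"membership interval: [{low:.3f}, {high:.3f}]")
```

In a pipeline of this kind, the center and width for each degree word would presumably come from the clustering step rather than being set by hand.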
BACKGROUND Biomedical terms extracted using Word2vec, the most popular word embedding model in recent years, serve as the foundation for various natural language processing (NLP) applications, such as biomedical information retrieval, relation extraction, and recommendation systems. OBJECTIVE The objective of this study is to examine how changes in the ratio of biomedical domain to general domain data in the corpus affect the extraction of similar biomedical terms using Word2vec. METHODS We downloaded abstracts of 214,892 articles from PubMed Central (PMC) and the 3.9 GB Billion Word (BW) benchmark corpus from the computer science community. The datasets were preprocessed and grouped into 11 corpora based on the ratio of BW to PMC, ranging from 0:10 to 10:0, and then Word2vec models were trained on these corpora. The cosine similarities between the biomedical terms obtained from the Word2vec models were then compared across the models. RESULTS The results indicated that the models trained with both BW and PMC data outperformed the model trained only with medical data. The similarity between the biomedical terms extracted by the Word2vec model increased when the ratio of biomedical domain to general domain data was 3:7 to 5:5. CONCLUSIONS This study allows NLP researchers to apply Word2vec based on more information and increase the similarity of extracted biomedical terms to improve their effectiveness in NLP applications, such as biomedical information extraction.
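The corpus-mixing and comparison steps described in the Methods above can be sketched as follows. This is only an outline under stated assumptions: the abstract does not say whether total corpus size was held constant across the 11 corpora, so `build_corpus` assumes a fixed total, and all function names are hypothetical. Training the Word2vec models themselves is left out; the cosine similarity function is the standard definition applied to any two term vectors.

```python
import math


def mixing_ratios(steps=10):
    """All BW:PMC ratios from 0:10 to 10:0, giving steps + 1 corpora."""
    return [(bw, steps - bw) for bw in range(steps + 1)]


def build_corpus(bw_sentences, pmc_sentences, bw_parts, pmc_parts, total):
    """Take `total` sentences split bw_parts:pmc_parts between the sources.

    Assumption: the total size is fixed across ratios; the abstract does
    not state how the corpora were sized.
    """
    n_bw = total * bw_parts // (bw_parts + pmc_parts)
    n_pmc = total - n_bw
    return bw_sentences[:n_bw] + pmc_sentences[:n_pmc]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


for bw, pmc in mixing_ratios():
    print(f"BW:PMC = {bw}:{pmc}")
```

Each corpus produced this way would then be used to train one embedding model, and the similarity of a fixed set of biomedical term pairs compared across the resulting models.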