Data mining, a concept that emerged in the mid-1990s, can help researchers gain novel and deep insights into large biomedical datasets and can facilitate an unprecedented understanding of them. It can uncover new biomedical and healthcare knowledge for clinical and administrative decision making and generate scientific hypotheses from large experimental data, clinical databases, and/or biomedical literature. This review first introduces data mining in general (e.g., its background, definition, and process), discusses the major differences between statistics and data mining, and then addresses the uniqueness of data mining in the biomedical and healthcare fields. A brief summary of various data mining algorithms used for classification, clustering, and association, along with their respective advantages and drawbacks, is also presented. Suggested guidelines on how to use data mining algorithms in each area of classification, clustering, and association are offered, along with three examples of how data mining has been used in the healthcare industry. Health-related organizations have successfully applied data mining to predict health insurance fraud and under-diagnosed patients and to identify and classify people at health risk, with the goal of reducing healthcare costs. Given this success, we introduce how data mining technologies (in each area of classification, clustering, and association) have been used for a multitude of purposes, including research in the biomedical and healthcare fields.
We discuss the technologies available for predicting healthcare costs (including length of hospital stay), for disease diagnosis and prognosis, and for discovering hidden biomedical and healthcare patterns from related databases. We also discuss the use of data mining to discover relationships such as those between health conditions and a disease, among diseases, and among drugs. The article concludes with a discussion of the problems that hamper the clinical use of data mining by health professionals.
The plethora of biomedical relations embedded in medical logs (records) demands researchers' attention. Previous theoretical and practical work was restricted to traditional machine learning techniques. However, these methods are susceptible to the “vocabulary gap” and data sparseness, and they cannot automate feature extraction. To address these issues, we propose a multichannel convolutional neural network (MCCNN) for automated biomedical relation extraction. The proposed model makes two contributions: (1) it enables the fusion of multiple (e.g., five) versions of word embeddings; (2) it obviates the need for manual feature engineering through automated feature learning with a convolutional neural network (CNN). We evaluated our model on two biomedical relation extraction tasks: drug-drug interaction (DDI) extraction and protein-protein interaction (PPI) extraction. For the DDI task, our system achieved an overall F-score of 70.2%, compared with 67.0% for the standard linear SVM-based system, on the DDIExtraction 2013 challenge dataset. For the PPI task, we evaluated our system on the AIMed and BioInfer PPI corpora; it exceeded the state-of-the-art ensemble SVM system by 2.7% and 5.6% in F-score, respectively.
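The fusion of several embedding versions can be pictured as stacking them like the colour channels of an image, so that each convolution filter spans all versions at once. The following is a minimal NumPy sketch under that assumption, not the authors' implementation: the function name, array shapes, ReLU activation, and max-over-time pooling are illustrative choices, and the random inputs stand in for real embeddings.

```python
import numpy as np


def multichannel_conv_features(channels, filters, bias):
    """Convolve over a sentence whose tokens have several embedding versions.

    channels: array of shape (C, T, D), C embedding versions ("channels"),
              T tokens, D embedding dimensions (all hypothetical sizes).
    filters:  array of shape (F, C, W, D), F filters that each cover a
              window of W consecutive tokens across all C channels.
    bias:     array of shape (F,).
    Returns an array of shape (F,): one max-pooled feature per filter.
    """
    n_channels, n_tokens, dim = channels.shape
    n_filters, f_channels, width, f_dim = filters.shape
    if (n_channels, dim) != (f_channels, f_dim) or n_tokens < width:
        raise ValueError("filter shape does not match the input")

    n_windows = n_tokens - width + 1
    feature_maps = np.empty((n_filters, n_windows))
    for start in range(n_windows):
        window = channels[:, start:start + width, :]  # shape (C, W, D)
        # Each filter response sums over channels, window positions, and dims.
        response = np.tensordot(filters, window, axes=([1, 2, 3], [0, 1, 2]))
        feature_maps[:, start] = response + bias

    feature_maps = np.maximum(feature_maps, 0.0)  # ReLU non-linearity
    return feature_maps.max(axis=1)               # max-over-time pooling


# Illustrative call: 5 embedding versions, 9 tokens, 4 dimensions, 3 filters.
rng = np.random.default_rng(0)
sentence = rng.normal(size=(5, 9, 4))
weights = rng.normal(size=(3, 5, 3, 4))
features = multichannel_conv_features(sentence, weights, np.zeros(3))
```

In a full model, the pooled feature vector would feed a classifier layer and the filters would be learned from labeled relation instances rather than drawn at random.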
State-of-the-art methods for protein-protein interaction (PPI) extraction are primarily based on kernel methods, and their performance depends strongly on handcrafted features. In this paper, we tackle PPI extraction with convolutional neural networks (CNNs) and propose a shortest dependency path based CNN (sdpCNN) model. The proposed method (1) takes only the shortest dependency path (sdp) and word embeddings as input and (2) can avoid bias from feature selection by using a CNN. We performed experiments on the standard AIMed and BioInfer datasets, and the results demonstrate that our approach outperforms state-of-the-art kernel-based methods. In particular, by tracking the sdpCNN model, we find that it can extract key features automatically, and we verify that pretrained word embeddings are crucial for the PPI task.
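The shortest dependency path between two protein mentions can be found with a breadth-first search over the dependency tree treated as an undirected graph. The sketch below assumes the parse is already available as (head, dependent) pairs; the edge list is hand-written for illustration rather than produced by a parser, and `PROT1`/`PROT2` are placeholder entity tokens.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the token path from source to target, or None if unconnected.

    edges: iterable of (head, dependent) pairs from a dependency parse.
    The edges are treated as undirected, because the path between two
    entities usually climbs to a shared ancestor and descends again.
    """
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, []).append(dependent)
        graph.setdefault(dependent, []).append(head)
    if source not in graph or target not in graph:
        return None

    previous = {source: None}  # back-pointers, also marks visited nodes
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hand-written, illustrative edges for "PROT1 directly interacts with PROT2 in yeast".
example_edges = [
    ("interacts", "PROT1"),
    ("interacts", "directly"),
    ("interacts", "PROT2"),
    ("PROT2", "with"),
    ("interacts", "yeast"),
    ("yeast", "in"),
]
sdp = shortest_dependency_path(example_edges, "PROT1", "PROT2")
# sdp is ["PROT1", "interacts", "PROT2"]
```

Real sentences can repeat a word, so a practical version would key the graph on token indices instead of token strings.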
Background: Respiratory syncytial virus (RSV) is among the most important causes of acute lower respiratory tract infection (ALRI) in young children. We assessed the severity of RSV-ALRI in children less than 5 years old with bronchopulmonary dysplasia (BPD). Methods: We searched for studies using EMBASE, Global Health, and MEDLINE. We assessed hospitalization risk, intensive care unit (ICU) admission, need for oxygen supplementation and mechanical ventilation, and in-hospital case fatality (hCFR) among children with BPD compared with those without (non-BPD). We compared the (1) length of hospital stay (LOS) and (2) duration of oxygen supplementation and mechanical ventilation between the groups. Results: Twenty-nine studies fulfilled our inclusion criteria. The case definition for BPD varied substantially in the included studies. Risks were higher among children with BPD compared with non-BPD: RSV hospitalization (odds ratio [OR], 2.6; 95% confidence interval [CI], 1.7–4.2; P < .001), ICU admission (OR, 2.9; 95% CI, 2.3–3.5; P < .001), need for oxygen supplementation (OR, 4.2; 95% CI, .5–33.7; P = .175) and mechanical ventilation (OR, 8.2; 95% CI, 7.6–8.9; P < .001), and hCFR (OR, 12.8; 95% CI, 9.4–17.3; P < .001). Median LOS (range) was 7.2 days (4–23) (BPD) compared with 2.5 days (1–30) (non-BPD). Median duration of oxygen supplementation (range) was 5.5 days (0–21) (BPD) compared with 2.0 days (0–26) (non-BPD). The duration of mechanical ventilation was more often longer (>6 days) in those with BPD compared with non-BPD (OR, 11.9; 95% CI, 1.4–100; P = .02). Conclusions: The risk of severe RSV disease is considerably higher among children with BPD. There is an urgent need to establish standardized BPD case definitions, review the RSV prophylaxis guidelines, and encourage more specific studies on RSV infection in BPD patients, including vaccine development and RSV-specific treatment.
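For readers less familiar with the reported statistics, an odds ratio and its 95% confidence interval can be computed from a single 2×2 table as sketched below. This uses the standard log-odds (Woolf) approximation and is not the pooling method of the meta-analysis, which combines many studies. The counts are hypothetical and chosen only for illustration; they are not taken from any included study.

```python
import math


def odds_ratio_ci(exposed_events, exposed_non_events,
                  unexposed_events, unexposed_non_events, z=1.96):
    """Odds ratio with a Wald confidence interval on the log-odds scale.

    All four counts must be positive; z=1.96 gives an approximate 95% CI.
    """
    a, b = exposed_events, exposed_non_events
    c, d = unexposed_events, unexposed_non_events
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * standard_error)
    upper = math.exp(math.log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


# Hypothetical counts: 20 of 100 children with BPD admitted to ICU,
# versus 10 of 100 children without BPD.
estimate, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
print(f"OR {estimate:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

An interval that includes 1, as with the oxygen supplementation result above (95% CI, .5–33.7), means the data are compatible with no difference in odds between the groups.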
Background: To learn from errors, electronic patient safety event reporting systems (e-reporting systems) have been widely adopted in US hospitals to collect reports of medical incidents from frontline practitioners. However, underreporting and low-quality reports are pervasive, so the effectiveness of these systems remains in doubt.