Objective. The aim of this study was to provide a definition of big data in healthcare. Methods. A systematic search of PubMed literature published until May 9, 2014, was conducted. We noted the number of statistical individuals (n) and the number of variables (p) for all papers describing a dataset. These papers were classified into fields of study. Characteristics attributed to big data by authors were also considered. Based on this analysis, a definition of big data was proposed. Results. A total of 196 papers were included. Big data can be defined as datasets with Log(n∗p) ≥ 7. Properties of big data are its great variety and high velocity. Big data raises challenges on veracity, on all aspects of the workflow, on extracting meaningful information, and on sharing information. Big data requires new computational methods that optimize data management. Related concepts are data reuse, false knowledge discovery, and privacy issues. Conclusion. Big data is defined by volume. Big data should not be confused with data reuse: data can be big without being reused for another purpose, for example, in omics. Inversely, data can be reused without being necessarily big, for example, secondary use of Electronic Medical Records (EMR) data.
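The volume criterion stated in the abstract above can be illustrated with a minimal sketch. It assumes that "Log" denotes the base-10 logarithm, which the abstract does not state; the function names and example values are hypothetical and are not taken from the reviewed papers.

```python
import math


def log_volume(n: int, p: int) -> float:
    """Return log10(n * p) for n statistical individuals and p variables."""
    if n <= 0 or p <= 0:
        raise ValueError("n and p must be positive integers")
    return math.log10(n * p)


def is_big_data(n: int, p: int, threshold: float = 7.0) -> bool:
    """Apply the volume criterion log(n * p) >= threshold (base 10 is an assumption)."""
    return log_volume(n, p) >= threshold


# Illustrative values only, not drawn from the study.
print(is_big_data(n=2_000_000, p=50))  # 1e8 data points: meets the criterion
print(is_big_data(n=5_000, p=20))      # 1e5 data points: does not
```

Under this reading, the criterion depends only on the product of individuals and variables, so a dataset with few individuals but very many variables (as in omics) can still qualify.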
Adverse drug events (ADEs) are a public health issue. Their detection usually relies on voluntary reporting or medical chart reviews. The objective of this paper is to automatically detect cases of ADEs by data mining. A total of 115,447 complete past hospital stays are extracted from six French, Danish, and Bulgarian hospitals using a common data model including diagnoses, drug administrations, laboratory results, and free-text records. Different kinds of outcomes are traced, and supervised rule induction methods (decision trees and association rules) are used to discover ADE detection rules, with respect to time constraints. The rules are then filtered, validated, and reorganized by a committee of experts. The rules are described in a rule repository, and several statistics are automatically computed in every medical department, such as the confidence, relative risk, and median delay of outcome appearance. A total of 236 validated ADE detection rules are discovered; they enable the detection of 27 different kinds of outcomes. The rules use a varying number of conditions related to laboratory results, diseases, drug administration, and demographics. Some rules involve innovative conditions, such as drug discontinuations.
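The per-rule statistics named in the abstract above (confidence, relative risk, median delay) can be sketched as follows. This is a minimal illustration using the usual textbook definitions, not the authors' implementation; the function name, its parameters, and the example counts are hypothetical.

```python
from statistics import median


def rule_statistics(n_rule_outcome, n_rule, n_other_outcome, n_other, delays_days):
    """Summarize a rule of the form 'conditions -> outcome'.

    n_rule_outcome:  stays matching the conditions in which the outcome occurred
    n_rule:          stays matching the conditions
    n_other_outcome: stays not matching the conditions in which the outcome occurred
    n_other:         stays not matching the conditions
    delays_days:     delays until the outcome appeared, for matching stays
    """
    confidence = n_rule_outcome / n_rule   # P(outcome | conditions)
    baseline = n_other_outcome / n_other   # P(outcome | no conditions)
    relative_risk = confidence / baseline if baseline > 0 else float("inf")
    return {
        "confidence": confidence,
        "relative_risk": relative_risk,
        "median_delay_days": median(delays_days),
    }


# Hypothetical counts for one rule in one medical department.
stats = rule_statistics(25, 100, 50, 800, delays_days=[1, 3, 4, 6, 9])
print(stats)
```

Computing these figures separately for each department, as the abstract describes, would amount to calling such a function once per department with that department's counts.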
Dehydration secondary to gastroenteritis is one of the most common reasons for office visits and hospital admissions. The indicator most commonly used to estimate dehydration status is acute weight loss. Post-illness weight gain is considered the gold standard for determining the true level of dehydration and is widely used to estimate weight loss in research. To determine the value of post-illness weight gain as a gold standard for acute dehydration, we conducted a prospective cohort study in which 293 children, aged 1 month to 2 years, with acute diarrhea were followed for 7 days during a 3-year period. The main outcome measures were an accurate pre-illness weight (if available within 8 days before the diarrhea), post-illness weight, and theoretical weight (predicted from the child’s individual growth chart). Post-illness weight was measured for 231 (79%) and both theoretical and post-illness weights were obtained for 111 (39%). Only 62 (21%) had an accurate pre-illness weight. The correlation between post-illness and theoretical weight was excellent (0.978), but bootstrapped linear regression analysis showed that post-illness weight underestimated theoretical weight by 0.48 kg (95% CI: 0.06–0.79, p<0.02). The mean difference in the fluid deficit calculated was 4.0% of body weight (95% CI: 3.2–4.7, p<0.0001). Theoretical weight overestimated accurate pre-illness weight by 0.21 kg (95% CI: 0.08–0.34, p = 0.002). Post-illness weight underestimated pre-illness weight by 0.19 kg (95% CI: 0.03–0.36, p = 0.02). The prevalence of 5% dehydration according to post-illness weight (21%) was significantly lower than the prevalence estimated by either theoretical weight (60%) or clinical assessment (66%, p<0.0001). These data suggest that post-illness weight is of little value as a gold standard to determine the true level of dehydration. The performance of dehydration signs or scales determined by using post-illness weight as a gold standard has to be reconsidered.
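To show why the choice of reference weight matters in the abstract above, the sketch below computes the fluid deficit as a percentage of a reference weight. The formula is the conventional one and is assumed here, since the abstract does not spell it out; the function name and the weights of the example child are hypothetical.

```python
def percent_dehydration(reference_weight_kg: float, acute_weight_kg: float) -> float:
    """Fluid deficit as a percentage of the reference (well-state) weight."""
    if reference_weight_kg <= 0:
        raise ValueError("reference weight must be positive")
    return (reference_weight_kg - acute_weight_kg) / reference_weight_kg * 100.0


# Hypothetical child weighed at 9.5 kg during the acute episode.
acute = 9.5
post_illness = 9.8   # weight measured after recovery (illustrative)
theoretical = 10.3   # weight predicted from the growth chart (illustrative)

print(round(percent_dehydration(post_illness, acute), 1))  # deficit vs. post-illness weight
print(round(percent_dehydration(theoretical, acute), 1))   # deficit vs. theoretical weight
```

With the same acute weight, a lower reference weight yields a smaller estimated deficit, which is the direction of the discrepancy the abstract reports between post-illness and theoretical weight.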
Background. Common data models (CDMs) enable data to be standardized, and facilitate data exchange, sharing, and storage, particularly when the data have been collected via distinct, heterogeneous systems. Moreover, CDMs provide tools for data quality assessment, integration into models, visualization, and analysis. The Observational Medical Outcomes Partnership (OMOP) provides a CDM for organizing and standardizing databases. Common data models not only facilitate data integration but also (and especially for the OMOP model) extend the range of available statistical analyses. Objective. This study aimed to evaluate the feasibility of implementing French national electronic health records in the OMOP CDM. Methods. The OMOP's specifications were used to audit the source data, specify the transformation into the OMOP CDM, implement an extract–transform–load process to feed data from the French health care system into the OMOP CDM, and evaluate the final database. Results. Seventeen vocabularies corresponding to the French context were added to the OMOP CDM's concepts. Three French terminologies were automatically mapped to standardized vocabularies. We loaded nine tables from the OMOP CDM's “standardized clinical data” section, and three tables from the “standardized health system data” section. Outpatient and inpatient data from 38,730 individuals were integrated. The median (interquartile range) number of outpatient and inpatient stays per patient was 160 (19–364). Conclusion. Our results demonstrated that data from the French national health care system can be integrated into the OMOP CDM. One of the main challenges was the use of international OMOP concepts to annotate data recorded in a French context. The use of local terminologies was an obstacle to conceptual mapping; with the exception of an adaptation of the International Classification of Diseases 10th Revision, the French health care system does not use international terminologies. It would be interesting to extend our present findings to the 65 million people registered in the French health care system.
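The terminology-mapping step described above can be sketched in a few lines. This is an illustrative outline, not the study's extract–transform–load code: the vocabulary names, source codes, and concept identifiers are hypothetical placeholders, and the only OMOP convention relied on is that records without a standard match receive concept ID 0.

```python
NO_MATCHING_CONCEPT = 0  # OMOP convention for records without a standard concept


def map_to_standard(records, source_to_concept):
    """Attach a standard concept ID to each source record, or 0 when unmapped."""
    mapped = []
    for rec in records:
        key = (rec["vocabulary"], rec["code"])
        concept_id = source_to_concept.get(key, NO_MATCHING_CONCEPT)
        mapped.append({**rec, "concept_id": concept_id})
    return mapped


def mapping_coverage(mapped):
    """Share of records that received a standard concept."""
    if not mapped:
        return 0.0
    hits = sum(1 for rec in mapped if rec["concept_id"] != NO_MATCHING_CONCEPT)
    return hits / len(mapped)


# Hypothetical local terminology and placeholder concept IDs.
lookup = {("LOCAL_DRUG", "A1"): 1001, ("LOCAL_PROC", "P7"): 2002}
source = [
    {"vocabulary": "LOCAL_DRUG", "code": "A1"},
    {"vocabulary": "LOCAL_PROC", "code": "P7"},
    {"vocabulary": "LOCAL_PROC", "code": "P9"},  # no mapping available
    {"vocabulary": "LOCAL_DRUG", "code": "B4"},  # no mapping available
]
mapped = map_to_standard(source, lookup)
print(mapping_coverage(mapped))
```

Tracking coverage per source terminology is one way to quantify the obstacle the abstract mentions, namely local terminologies that lack a direct international equivalent.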
Alzheimer’s disease (AD) is a frequent pathology with a poor prognosis, for which no curative treatment is available as of 2018. AD prevention is therefore an important public health issue and research topic. In this manuscript, we synthesize the literature reviews and meta-analyses relating to modifiable risk factors associated with AD. Smoking, diabetes, high blood pressure, obesity, hypercholesterolemia, physical inactivity, depression, head trauma, heart failure, hemorrhagic and ischemic strokes, and sleep apnea syndrome appeared to be associated with an increased risk of AD. In addition to these well-known associations, we highlight less frequently described associated factors: hyperhomocysteinemia, hearing loss, essential tremor, and occupational exposure to magnetic fields. Conversely, some oral antidiabetic drugs, education and intellectual activity, a Mediterranean-type diet or a diet following the Healthy Diet Indicator, and consumption of unsaturated fatty acids seemed to have a protective effect. Better knowledge of risk factors for AD allows for better identification of patients at risk. This may contribute to the emergence of prevention policies to delay or prevent the onset of AD.