Gestational diabetes mellitus (GDM) is conventionally confirmed with an oral glucose tolerance test (OGTT) at 24 to 28 weeks of gestation, but it remains uncertain whether it can be predicted in early pregnancy through secondary use of electronic health records (EHRs). To this end, a cost-sensitive hybrid model (CSHM) and five conventional machine learning methods are used to construct predictive models that capture the future risk of GDM from temporally aggregated EHRs. The experimental data come from a nested case-control study cohort of 33,935 pregnant women at West China Second Hospital. After data cleaning, 4,378 cases with 50 attributes were retained for the data set. After the most feasible method is selected, the cost parameter of CSHM is tuned to handle the class imbalance of the data set. In the experiment, 3,940 samples are used for training and the remaining 438 samples for testing. Although the accuracy on positive samples is barely acceptable (62.16%), the results suggest that the vast majority (98.4%) of the instances predicted as positive are true positives. To our knowledge, this is the first study to apply machine learning models to EHRs to predict GDM, which will facilitate personalized medicine in maternal health management in the future.
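The two percentages reported above measure different things, and a small sketch may help keep them apart. It assumes that "accuracy on positive samples" means sensitivity (recall) and that the 98.4% figure is precision (positive predictive value); the abstract does not name the metrics explicitly. The function names and the confusion-matrix counts below are hypothetical and chosen only for illustration; they are not the study's data.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of truly positive cases that the model flags as positive."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of predicted-positive cases that are truly positive."""
    return tp / (tp + fp)


# Hypothetical counts (NOT taken from the study): a model that misses
# many positives but rarely raises a false alarm.
tp, fn, fp, tn = 60, 40, 1, 300

print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"precision   = {precision(tp, fp):.3f}")
```

With counts of this shape, sensitivity stays modest while precision is close to 1, which is the pattern the abstract describes: few predicted positives are wrong, but a sizeable share of true positives is missed.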
Although the genotype-phenotype correlation for familial medullary thyroid carcinoma (FMTC) is well studied, only a few low-risk susceptibility loci have been identified for familial non-medullary thyroid carcinoma (FNMTC). The aim of this study is to screen for and identify high-penetrance genes for FNMTC. A total of 34 families with more than two first-degree relatives diagnosed with papillary thyroid cancer and without other familial syndromes were recruited. Whole-exome and targeted gene sequencing were performed to identify candidate variants. These variants were screened against ESP6500, ExAC, the 1000 Genomes Project, and The Cancer Genome Atlas (TCGA), and analyzed with SIFT scores and PolyPhen-2 predictions. Finally, we identified the recurrent MAP2K5 variants c.G961A and c.T1100C (p.A321T and p.M367T) as susceptibility loci for FNMTC. The frequencies of MAP2K5 c.G961A and c.T1100C were 0.0385 and 0.0259, respectively, in FNMTC, compared with 0 and 0.00022523 in healthy Chinese controls (n = 2200, P < 0.001). Both variants are located in the protein kinase domain. The functional study showed that MAP2K5 A321T or M367T could consistently phosphorylate the downstream protein ERK5 at Ser731 + Thr733 or Ser496, promoting its nuclear translocation and subsequently altering target gene expression. Our data revealed that the MAP2K5 variants A321T and M367T can activate the MAP2K5-ERK5 pathway, alter downstream gene expression, and subsequently induce malignant transformation of thyroid epithelial cells. While classic MAP2K1/2 (MEK1/2)-ERK1/2 signaling is well known for driving sporadic NMTC, our research indicates that MAP2K5 (MEK5) is a susceptibility gene for FNMTC. These findings highlight the potential application of MAP2K5 in molecular diagnosis as well as early prevention.