2018
DOI: 10.1111/cge.13175

Genetic prediction of type 2 diabetes using deep neural network

Abstract: Type 2 diabetes (T2DM) has strong heritability but genetic models to explain heritability have been challenging. We tested deep neural network (DNN) to predict T2DM using the nested case-control study of Nurses' Health Study (3326 females, 45.6% T2DM) and Health Professionals Follow-up Study (2502 males, 46.5% T2DM). We selected 96, 214, 399, and 678 single-nucleotide polymorphism (SNPs) through Fisher's exact test and L1-penalized logistic regression. We split each dataset randomly in 4:1 to train prediction …
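
The abstract mentions a random 4:1 split and a Fisher's exact test screen for SNPs. The following is a minimal sketch of those two steps using only the Python standard library. It is not the authors' implementation: the function names, the fixed seed, and the 2x2 table layout (carriers vs. non-carriers by case status) are assumptions made for illustration, and the L1-penalized logistic regression step is not shown.

```python
import random
from math import comb


def split_4_to_1(sample_ids, seed=0):
    """Randomly split IDs into train/test at a 4:1 ratio (seed is an assumption)."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    cut = (len(ids) * 4) // 5
    return ids[:cut], ids[cut:]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout: rows = cases / controls, columns = carrier / non-carrier.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as extreme (as improbable) as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


train_ids, test_ids = split_4_to_1(range(10))
p_value = fisher_exact_two_sided(3, 0, 0, 3)
```

In a screen like the one described, the p-value would be computed once per SNP on the training portion only, so the held-out fifth stays untouched by feature selection.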

Cited by 17 publications (19 citation statements)
References 25 publications
“…DNN model is a promising model in machine learning, because it can capture the complex correlation caused by a large number of input parameters, find some structures in the training data, and gradually modify the data representation to obtain excellent accuracy of the training network [39,40]. In this study, the DNN model fully captured the complex nonlinear multi-level interaction between the annual pneumoconiosis DALY and the input characteristic variables, including the number of pneumoconiosis patients, the average age of onset, the average dust exposure time and the gross industrial production through training.…”
Section: Discussion (mentioning)
confidence: 99%
“…However, as metabolites were measured only from 2005 to 2006, we filtered the data to include only 7515 individuals who participated in the KoGES from 2005 to 2006 as the baseline data; the follow-up was conducted until 2015–2016. Participants with self-reported type 2 diabetes, on type 2 diabetes medications, or meeting the American Diabetes Association diagnostic criteria (fasting glucose (GLU0) ≥7.0 mmol/L, 2-h glucose ≥11.1 mmol/L, or glycated hemoglobin (HbA1c) ≥48 mmol/mol (6.5%)) [27] were defined as patients with type 2 diabetes. Among these participants, we selected 1905 participants with information on type 2 diabetes diagnosis, serum glucose, HbA1c, genotype, and metabolites for baseline examinations.…”
Section: Methods (mentioning)
confidence: 99%
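
The case definition quoted above can be written as a single predicate. This is a sketch only: the thresholds come from the quoted statement, while the function name, parameter names, and the choice to treat missing measurements as "criterion not met" are assumptions, not part of the cited study.

```python
def meets_t2d_definition(
    fasting_glucose_mmol_l=None,
    two_hour_glucose_mmol_l=None,
    hba1c_mmol_mol=None,
    self_reported=False,
    on_medication=False,
):
    """Return True if any criterion in the quoted case definition is met.

    Missing measurements (None) are treated as not meeting that criterion,
    which is an assumption of this sketch.
    """
    if self_reported or on_medication:
        return True
    if fasting_glucose_mmol_l is not None and fasting_glucose_mmol_l >= 7.0:
        return True
    if two_hour_glucose_mmol_l is not None and two_hour_glucose_mmol_l >= 11.1:
        return True
    if hba1c_mmol_mol is not None and hba1c_mmol_mol >= 48:
        return True
    return False


borderline = meets_t2d_definition(fasting_glucose_mmol_l=7.0)
below_all = meets_t2d_definition(6.9, 11.0, 47)
```

The comparisons are inclusive (`>=`) because the quoted thresholds are stated with "≥".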
“…To avoid the curse of dimensionality, feature selection and feature extraction are often used [79,80]. Some researchers [81][82][83][84] try using multiple datasets to provide a larger number of samples to balance the number of features and samples. Other researchers used association analysis techniques to reduce the dimensions of the original dataset to a significantly smaller number, mostly using logistic regression and selecting only SNPs that passed a threshold level of p-value [45].…”
Section: A. The Curse of Dimensionality (mentioning)
confidence: 99%
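
The p-value thresholding described in the statement above amounts to a simple filter over per-SNP association results. The sketch below assumes the p-values have already been computed elsewhere; the SNP identifiers, the p-values, and the threshold of 5e-8 are placeholders chosen for illustration and do not come from any cited study.

```python
def select_snps_by_p_value(p_values, threshold):
    """Keep SNP IDs whose association p-value is below the threshold.

    p_values: dict mapping SNP ID -> p-value (hypothetical input format).
    Returns the selected IDs ordered from most to least significant.
    """
    kept = [snp for snp, p in p_values.items() if p < threshold]
    return sorted(kept, key=lambda snp: p_values[snp])


# Placeholder values, not real association results.
example_p_values = {"snp_a": 1e-9, "snp_b": 0.2, "snp_c": 4e-8}
selected = select_snps_by_p_value(example_p_values, threshold=5e-8)
```

A looser threshold keeps more features and a stricter one keeps fewer, which is the lever this approach uses to bring the feature count closer to the sample count.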
“…Another obstacle that limits the ability of machine learning models is when the number of samples is extremely different in each class. As the aim of ML for classification problems, such as classifying healthy vs unhealthy [82] or responding to disease treatment vs unresponsive to the treatment [88], is to obtain an efficient model for such discrimination cases, a satisfactory number of samples per class should be provided.…”
Section: Imbalance Classes (mentioning)
confidence: 99%