2023
DOI: 10.1101/2023.04.28.23289285
Preprint

Unsupervised representation learning improves genomic discovery and risk prediction for respiratory and circulatory functions and diseases

Abstract: Background: High-dimensional clinical data are becoming more accessible in biobank-scale datasets. However, accurately phenotyping high-dimensional clinical data remains a major impediment to genetic discovery. Methods: We introduce a general deep learning framework, REpresentation learning for Genetic discovery on Low-dimensional Embeddings (REGLE), for discovering associations between genetic variants and high-dimensional clinical data. REGLE uses convolutional variational autoencoders to compute a non-linea…

Cited by 4 publications (8 citation statements)
References 73 publications
“…Next, we analyzed ECG lead I plus PPG and observed a modest correlation, indicative of complementary signals (0.43 ± 0.01) (Supplementary Table 2). As a reference point, we observed that the spirogram data–a measure of lung function–in UKB [7, 8] has a lower projected correlation with lead I ECG (0.29 ± 0.03) and PPG (0.37 ± 0.03). It is worth noting that we do expect to see some non-zero correlation as HDCDs capture general health information of individuals such as age, sex, and BMI.…”
Section: Results (mentioning)
confidence: 99%
“…However, we currently lack statistical methods to fully utilize these multimodal HDCD in genetic analyses. REGLE (Yun et al) [8] provides a means to study the genetic underpinnings of high dimensional data, but it is limited to one data modality, and does not leverage the shared and complementary information in multimodal HDCD. Here, we developed an unsupervised representation learning method, M-REGLE, which aims to utilize multimodal HDCD in a joint model to improve genetic analyses.…”
Section: Discussion (mentioning)
confidence: 99%
“…In addition, the neural network parametrization of the Spiro-CLF model encodes nonlinear effects from the raw spirometry efforts. Previous works have shown that using neural networks, especially in conjunction with convolution layers, can improve representations of lung function with respect to prediction of COPD (Bhattacharjee et al, 2022), COPD subtypes (Bodduluri et al, 2020), and gene association (Yun et al, 2023).…”
Section: Assigning Relative Importance To Spirometry Curves (mentioning)
confidence: 99%