2021
DOI: 10.1101/2021.01.06.425527
Preprint

A multi-dimensional integrative scoring framework for predicting functional variants in the human genome

Abstract: Attempts to identify and prioritize functional DNA elements in coding and noncoding regions, particularly through use of in silico functional annotation data, continue to increase in popularity. However, specific functional roles may vary widely from one variant to another, making it challenging to summarize different aspects of variant function. Here we propose Multi-dimensional Annotation Class Integrative Estimation (MACIE), an unsupervised multivariate mixed model framework capable of integrating annotatio…

Cited by 4 publications
(13 citation statements)
References 48 publications
“…A variety of functional annotations have been developed to measure multiple aspects of biological functionality of variants, including protein function (15-17), conservation (18,19), epigenetics (20,21), spatial genomics (22,23), network biology (24), mappability (25), local nucleotide diversity (26) and integrative composite annotations (4,27-29). These annotations have successfully prioritized plausible causal variants of underlying GWAS signals according to their functional impact in experimental studies following GWAS findings (5,30), localizing causal variants in fine-mapping studies (4,8), partitioning heritability in GWAS (6), predicting genetic risk (6,7,9), and improving rare variant (RV) analysis of WGS association studies (12-14,31). For example, large-scale WGS/WES studies (1,3,32,33) assess the associations between complex diseases/traits and coding and non-coding rare variants across the genome.…”
Section: Introduction (mentioning)
confidence: 99%
“…For example, large-scale WGS/WES studies (1,3,32,33) assess the associations between complex diseases/traits and coding and non-coding rare variants across the genome. The recently developed STAAR method incorporates multi-faceted variant functional annotations to boost the power of rare variant association tests in WGS/WES studies (12-14).…”
Section: Introduction (mentioning)
confidence: 99%
“…Therefore, models that employ PN approaches can suffer from high false-negative prediction rates (Kolosov et al, 2021). Recently, unsupervised and semi-supervised techniques have demonstrated an advantage in the analysis of non-coding regions because they are not biased by the lack of available labeled training examples (Li et al, 2022). Particularly, positive-unlabeled (PU) and negative-unlabeled (NU) learning are semi-supervised methods suited to genomic classification problems, as they can train a model with limited high-quality labeled data.…”
Section: Introduction (mentioning)
confidence: 99%