2016
DOI: 10.1101/069229
Preprint

FUN-LDA: A Latent Dirichlet Allocation Model for Predicting Tissue-Specific Functional Effects of Noncoding Variation

Abstract: We describe a new method, FUN-LDA, based on a latent Dirichlet allocation model, for predicting the functional effects of noncoding genetic variants in a cell type- and tissue-specific way. The method integrates diverse epigenetic annotations for specific cell types and tissues from large-scale epigenomics projects such as ENCODE and Roadmap Epigenomics. Using this unsupervised approach, we predict tissue-specific functional effects for every position in the human genome. We demonstrate the usefulness of our predictions…

Cited by 12 publications (14 citation statements)
References 77 publications
“…For example, a two-step analysis procedure was proposed to first identify trait-relevant tissue group and then identify trait-relevant tissue within the tissue group [ 80 ]. In addition, using synthetic annotations generated from Genoskyline [ 28 ] or FUN-LDA [ 81 ] could be particularly useful for identifying fine scale trait-relevant tissues. Our method can be easily adapted to incorporate a two-step analysis procedure and/or accommodate synthetic annotations, and has the potential to yield better trait-tissue relevance resolution in the future.…”
Section: Discussion (mentioning)
confidence: 99%
“…11). Previous studies have indicated that functional variants (predicted by chromatin activity data) in enhancers are less likely to be shared across many tissues compared with those in promoters 26,27, and that cell type-specific eQTLs are more dispersedly distributed around the transcription start site than eQTLs affected expression in multiple cell types 28,29. These results seem to suggest that tissue-specific eQTLs are enriched in distal regulatory elements (i.e.…”
Section: Cis-eQTLs With Tissue-Specific Effects (mentioning)
confidence: 99%
“… are the labels for m variants with MPRA validated labels; are the predicted values for a large number (l) of variants from a prior unsupervised method. We adopt FUN-LDA score in the current GenoNet algorithm because it is one of only a handful of tissue-specific functional scores available genome-wide, recognizing that other unsupervised scores can be readily incorporated into GenoNet in the future 18; X_i are the functional annotations; γ_I is a tuning parameter that controls how the unlabeled data are being used. When γ_I = 0, the method is fully supervised; when γ_I = ∞, the method is fully unsupervised (Supplementary Table 1).…”
Section: Results (mentioning)
confidence: 99%
“…We adopt Elastic-net because of its superior performance when the features are correlated and have sparse non-zero coefficients 21; are the labels for m variants with MPRA validated labels; are the predicted values for a large number (l) of variants from a prior unsupervised method. We choose the FUN-LDA score in the current GenoNet implementation because it is one of only a handful of tissue-specific functional scores available genome-wide, recognizing that other unsupervised scores can be readily incorporated into GenoNet in the future 18. γ_I is a tuning parameter that controls how the unlabeled data are being used 18.…”
Section: Methods (mentioning)
confidence: 99%
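
The GenoNet statements above describe a tuning parameter γ_I that moves the fit between fully supervised (γ_I = 0) and fully unsupervised (γ_I = ∞). The sketch below illustrates that behaviour only; it is not GenoNet's objective or code. It assumes a simple squared-error form in which the unlabeled term pulls predictions toward prior unsupervised scores (standing in for FUN-LDA scores), and it swaps the Elastic-net penalty named in the quote for a ridge penalty so the solution has a closed form. The function name, argument names, and simulated data are all hypothetical.

```python
import numpy as np


def fit_semi_supervised_ridge(X_lab, y_lab, X_unl, f_unl, gamma, lam=1e-3):
    """Illustrative semi-supervised linear fit (assumed form, not GenoNet's).

    Minimizes
        ||y_lab - X_lab b||^2 + gamma * ||f_unl - X_unl b||^2 + lam * ||b||^2
    where y_lab are labels for the labeled variants and f_unl are scores from
    a prior unsupervised method for the unlabeled variants.
    """
    p = X_lab.shape[1]
    # Normal equations of the weighted least-squares problem above.
    A = X_lab.T @ X_lab + gamma * (X_unl.T @ X_unl) + lam * np.eye(p)
    rhs = X_lab.T @ y_lab + gamma * (X_unl.T @ f_unl)
    return np.linalg.solve(A, rhs)


# Simulated data (hypothetical): m labeled variants, l unlabeled variants.
rng = np.random.default_rng(0)
m, l, p = 40, 500, 5
X_lab = rng.normal(size=(m, p))
X_unl = rng.normal(size=(l, p))
y_lab = X_lab @ np.array([1.0, 0.0, -1.0, 0.5, 0.0]) + 0.1 * rng.normal(size=m)
f_unl = X_unl @ np.array([0.8, 0.2, -0.7, 0.0, 0.1])  # stand-in prior scores

beta_supervised = fit_semi_supervised_ridge(X_lab, y_lab, X_unl, f_unl, gamma=0.0)
beta_mixed = fit_semi_supervised_ridge(X_lab, y_lab, X_unl, f_unl, gamma=0.1)
beta_near_unsup = fit_semi_supervised_ridge(X_lab, y_lab, X_unl, f_unl, gamma=1e8)
```

With `gamma=0.0` the unlabeled term vanishes and the fit reduces to ridge regression on the labeled variants alone; as `gamma` grows, the coefficients approach a least-squares fit to the prior scores, which is the finite analogue of the γ_I = ∞ case in the quote.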