2016
DOI: 10.1016/j.biosystems.2016.08.011

Predicting gene expression level by the transcription factor binding signals in human embryonic stem cells

citations
Cited by 13 publications
(8 citation statements)
references
References 36 publications
“…In order to further investigate the effects of TFs and HMs on prediction for genes in independent biological processes, we focus on the Gene Ontology biological processes [32, 33] for the high expression genes in the three cell lines (based on RPKM values, the top fifteen percent of all genes are selected as high expressed genes [3, 23]). Firstly, biological processes containing less than 30 genes are discarded, 1104, 1136 and 1070 sets of genes are remained, respectively, for H1-hESc, Gm12878 and K562 cell line.…”
Section: Results (mentioning)
confidence: 99%
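The gene selection described in the statement above can be sketched as follows. This is a minimal illustration, not the citing authors' code: the function names, the input dictionaries, the rounding of the 15% cutoff, and the choice to apply the 30-gene minimum to the highly expressed members of each process are all assumptions.

```python
def top_fraction_genes(rpkm, fraction=0.15):
    """Return the genes in the top `fraction` of RPKM values.

    `rpkm` is a hypothetical mapping of gene name -> RPKM value.
    Rounding the cutoff to the nearest integer is an assumption.
    """
    ranked = sorted(rpkm, key=rpkm.get, reverse=True)
    keep = round(len(ranked) * fraction)
    return set(ranked[:keep])


def filter_processes(go_sets, high_genes, min_genes=30):
    """Keep Gene Ontology processes with at least `min_genes` highly expressed genes.

    `go_sets` is a hypothetical mapping of process name -> iterable of gene names.
    Counting only highly expressed members is an assumption about the quoted text.
    """
    kept = {}
    for term, genes in go_sets.items():
        members = set(genes) & high_genes
        if len(members) >= min_genes:
            kept[term] = members
    return kept
```

Both thresholds are exposed as parameters so that the 15% and 30-gene values from the quoted passage can be varied.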
“…Based on our previous study [3], signals of TFs binding are normalized by using the following Eq. (1), $N_{ij}^{k} = (n_{i,j}^{k} \times 10^{9}) / (n_{tag}^{k} \times 200)$ (1)…”
Section: Methods (mentioning)
confidence: 99%
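The normalization quoted above has the same shape as an RPKM-style scaling (counts times 10^9, divided by total tags times a region length). A minimal sketch, assuming that the numerator count is the number of tags of one factor in one 200 bp bin of one gene and that the denominator count is the total number of mapped tags for that factor; the quoted excerpt does not define the symbols, so these readings and the function name are assumptions.

```python
def normalize_signal(tag_count, total_tags, bin_size=200):
    """Scale a raw tag count to tags per kilobase of bin per million mapped tags.

    tag_count  : tags observed in one bin (assumed meaning of the numerator count)
    total_tags : total mapped tags for the factor (assumed meaning of the denominator count)
    bin_size   : bin width in base pairs; 200 is the constant in the quoted formula
    """
    if total_tags <= 0 or bin_size <= 0:
        raise ValueError("total_tags and bin_size must be positive")
    return tag_count * 1e9 / (total_tags * bin_size)
```

For example, `normalize_signal(10, 1_000_000)` divides 10 × 10^9 by 10^6 × 200.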
“…From a computational point of view, machine learning methods are attractive in terms of their ability to derive predictive models without a need for strong assumptions about underlying mechanisms; hence they are especially useful to deal with certain biological questions of which our a priori knowledge is frequently unknown or insufficiently defined [14]. As a proof of concept, gene expression levels can be accurately predicted from a broad set of epigenetic features [16–20] or binding profiles of diverse transcription factors (TFs) [21–24] using various machine-learning-based approaches, although our knowledge about how the selected features determine the expression output is largely unknown. Modeling is, therefore, a key ingredient to derive novel biological insights by integrating large-scale data sets.…”
Section: Introduction (mentioning)
confidence: 99%
“…In other words, highly correlated features often contain redundant information [8]. For example, whereas the dozens of pluripotent factors such as Oct4, Sox2, Klf4, and c-Myc, are all useful to predict genes expressed in stem cells [9], [10], [11], combining some pluripotent factors with endothelial lineage factors such as Lmo2 and Erg would add power to also predict genes expressed in endothelial cells; therefore, it can be more powerful using combined information from transcription factors with distinct functions, as opposed to an analysis using the transcription factors with similar effects on a shared set of target genes. More importantly, colocalization of low-correlation chromatin features may still happen in a biologically meaningful manner to implement important functions.…”
Section: Introduction (mentioning)
confidence: 99%