2017
DOI: 10.1101/222927
Preprint

k-mer grammar uncovers maize regulatory architecture

Abstract: Only a small percentage of the genome sequence is involved in regulation of gene expression, but to biochemically identify this portion is expensive and laborious. In species like maize, with diverse intergenic regions and lots of repetitive elements, this is an especially challenging problem. While regulatory regions are rare, they do have characteristic chromatin contexts and sequence organization (the grammar) with which they can be identified. We developed a computational framework to exploit this sequence…

citations
Cited by 3 publications
(1 citation statement)
references
References 52 publications
“…For example, after training on a large corpus of English language documents, given vectors representing words that are countries and capitals, vec(Madrid) - vec(Spain) + vec(France) will result in a vector that is similar to vec(Paris), more than other vectors in the corpus (Mikolov et al., 2013). This type of representation has led to better performance in downstream classification problems, including in biomedical literature classification (Chen et al., 2018; Minarro-Giménez et al., 2014), annotations (Duong et al., 2018; Zwierzyna and Overington, 2017) and genomic sequence classifications (Dutta et al., 2018; Du et al., 2018; Mejia Guerra and Buckler, 2017; Zhang et al., 2018).…”
Section: Methods
Citation type: mentioning
confidence: 99%
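
The vector arithmetic described in the citation statement can be illustrated with a minimal sketch. The three-dimensional vectors below are hand-made toy values, not embeddings learned from any corpus, and the names `EMBEDDINGS`, `cosine`, and `analogy` are hypothetical helpers introduced only for this illustration. The sketch assumes the usual convention of ranking candidates by cosine similarity and excluding the three query words from the candidates.

```python
import math

# Hand-made toy vectors (assumption: NOT learned from a corpus).
# Dimensions are chosen by hand: [country-ness, capital-ness, Spain-vs-France axis].
EMBEDDINGS = {
    "Spain":  [1.0, 0.0, 1.0],
    "Madrid": [0.0, 1.0, 1.0],
    "France": [1.0, 0.0, -1.0],
    "Paris":  [0.0, 1.0, -1.0],
    "Italy":  [1.0, 0.0, 0.2],
    "Rome":   [0.0, 1.0, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, embeddings=EMBEDDINGS):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c).

    The query words a, b and c are excluded from the candidates.
    """
    query = [x - y + z
             for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [w for w in embeddings if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(query, embeddings[w]))


if __name__ == "__main__":
    # vec(Madrid) - vec(Spain) + vec(France) should land nearest vec(Paris).
    print(analogy("Madrid", "Spain", "France"))
```

With real embeddings the match is approximate rather than exact, which is why the nearest neighbor is chosen by similarity instead of equality.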