2008
DOI: 10.1101/gr.076117.108

Transcription factor and microRNA motif discovery: The Amadeus platform and a compendium of metazoan target sets

Abstract: We present a threefold contribution to the computational task of motif discovery, a key component in the effort of delineating the regulatory map of a genome: (1) We constructed a comprehensive large-scale, publicly-available compendium of transcription factor and microRNA target gene sets derived from diverse high-throughput experiments in several metazoans. We used the compendium as a benchmark for motif discovery tools. (2) We developed Amadeus, a highly efficient, user-friendly software platform for genome…

Citations: cited by 168 publications (214 citation statements)
References: 40 publications
“…Promoter sequences for each RP gene were defined as 600 bases upstream of ATG and truncated when neighboring ORFs overlapped with this region. Cis-regulatory motifs were discovered using the Amadeus software package (28), searching for up to 5 motifs of lengths 8-12 that are significantly enriched as compared with the background set of promoters. Motif targets were identified via the TestMOTIF software program (32) using a three-order Markov background model estimated from the entire set of promoters per genome.…”
Section: Methods (mentioning)
confidence: 99%
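
The statement above mentions a third-order Markov background model estimated from a set of promoters. As a rough illustration of what such a model involves, the sketch below estimates the probability of each base given the three preceding bases. It is not the TestMOTIF or Amadeus implementation; the function names, the pseudocount, and the uniform fallback for unseen contexts are assumptions made for this example.

```python
from collections import defaultdict
import math

ALPHABET = "ACGT"


def train_markov_background(sequences, order=3, pseudocount=1.0):
    """Estimate P(next base | previous `order` bases) from DNA sequences.

    Hypothetical helper: the pseudocount smoothing is an assumption,
    not something specified in the quoted text.
    """
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0.0))
    for seq in sequences:
        seq = seq.upper()
        for i in range(order, len(seq)):
            context, base = seq[i - order:i], seq[i]
            # Skip windows that contain ambiguous bases such as N.
            if base in ALPHABET and all(c in ALPHABET for c in context):
                counts[context][base] += 1
    model = {}
    for context, base_counts in counts.items():
        total = sum(base_counts.values()) + pseudocount * len(ALPHABET)
        model[context] = {
            b: (n + pseudocount) / total for b, n in base_counts.items()
        }
    return model


def log_prob(model, seq, order=3):
    """Log-probability of seq[order:] conditioned on its first `order` bases.

    Contexts or bases never seen in training fall back to a uniform 1/4.
    """
    uniform = 1.0 / len(ALPHABET)
    seq = seq.upper()
    total = 0.0
    for i in range(order, len(seq)):
        context, base = seq[i - order:i], seq[i]
        total += math.log(model.get(context, {}).get(base, uniform))
    return total
```

A model trained this way on all promoters of a genome could then be used to score how surprising a candidate motif occurrence is relative to background sequence.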
“…Assuming equal probability for each of the four possible spatial arrangements of two motifs on two strands, the probability for observing k or more occurrences of a specific arrangement out of a total of n occurrences of the motif pair can be computed using the binomial distribution with p = 0.25. The HG and binomial scores are described in detail in Halperin et al (2009) and Linhart et al (2008). SAGE data were obtained from the Genome BC C. elegans Gene Expression Consortium (http://elegans.bcgsc.bc.ca).…”
Section: Computational Analyses (mentioning)
confidence: 99%
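
The binomial score described in the statement above is an upper-tail probability: the chance of seeing k or more occurrences of one arrangement among n occurrences of a motif pair when each of the four arrangements has probability 0.25. A minimal sketch of that calculation follows; the function name `binomial_tail` is an assumption for this example, and the cited papers should be consulted for the exact scoring used.

```python
from math import comb


def binomial_tail(k, n, p=0.25):
    """P(X >= k) for X ~ Binomial(n, p).

    With p = 0.25 this matches the null model in which each of the four
    spatial arrangements of two motifs on two strands is equally likely.
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    return sum(
        comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )
```

For example, `binomial_tail(8, 12)` would give the probability that at least 8 of 12 co-occurrences share a single specified arrangement under the equal-probability assumption.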
“…In Figure 1(b), we compare the performance of the two similarity measures based on the correlation of score profiles using sequences of length 4^10 with three column-based measures using as column-wise measures Pearson correlation, Euclidean distance, and the symmetric Kullback-Leibler divergence (Harbison et al, 2004; Linhart et al, 2008), and with Mosta (Pape et al, 2008). We find that Mosta and the two measures based on the correlation of score profiles yield a substantially increased AUC-PR compared to the column-based measures.…”
Section: Benchmarks (mentioning)
confidence: 99%
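
The three column-wise measures named in the statement above compare two motif columns, each a probability distribution over A, C, G, T. The sketch below shows one common way to compute them. It is illustrative only: the function names, the smoothing constant `eps`, and the choice to average the two KL directions are assumptions, and the cited papers may define the symmetric divergence differently.

```python
import math


def pearson(p, q):
    """Pearson correlation between two columns (0.0 if either is constant)."""
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    cov = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    sd_p = math.sqrt(sum((a - mean_p) ** 2 for a in p))
    sd_q = math.sqrt(sum((b - mean_q) ** 2 for b in q))
    if sd_p == 0 or sd_q == 0:
        return 0.0
    return cov / (sd_p * sd_q)


def euclidean(p, q):
    """Euclidean distance between two columns."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def symmetric_kl(p, q, eps=1e-9):
    """Average of KL(p||q) and KL(q||p), after smoothing zeros with eps."""
    def smooth(col):
        shifted = [x + eps for x in col]
        total = sum(shifted)
        return [x / total for x in shifted]

    p, q = smooth(p), smooth(q)
    kl_pq = sum(a * math.log(a / b) for a, b in zip(p, q))
    kl_qp = sum(b * math.log(b / a) for a, b in zip(p, q))
    return 0.5 * (kl_pq + kl_qp)
```

Pearson correlation is a similarity (higher means more alike), while the other two are distances (lower means more alike), so they need opposite orientation when ranking motif pairs.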