Fast index based algorithms and software for matching position specific scoring matrices

Beckstette, Michael; Homann, Robert; Giegerich, Robert; Kurtz, Stefan

doi:10.1186/1471-2105-7-389

Cited by 131 publications

(119 citation statements)

References 29 publications

Supporting

Mentioning

119

Contrasting


“…Combining these data sets yielded a group of candidate genes that are both conserved in sequenced corynebacteria and most likely under transcriptional control by GlxR and its orthologs, respectively. The workflow is completed by performing electrophoretic mobility shift assays (EMSAs) for in vitro verification of detected binding sites in the model organism C. glutamicum ATCC 13032. raising sensitivity at the cost of diminished specificity and thereby increasing the number of detected motif instances (Beckstette et al, 2006). Subsequently, a bi-directional best blast analysis with an E-value threshold of 10⁻⁵ was employed to identify potentially orthologous genes.…”

Section: Detection Of Highly Conserved GlxR Target Genes In Corynebac (mentioning)

confidence: 99%

“…DNA binding sites were detected using the PoSSuMsearch algorithm (Beckstette et al, 2006) with all upstream regions of the organism, extracted from the genomic sequence as described previously for C. glutamicum (Kohl et al, 2008). The position weight matrix (PWM)-based model of the GlxR binding motif used in the search was derived from all verified binding sites in C. glutamicum ATCC 13032.…”

Section: Detection Of GlxR Binding Sites (mentioning)

confidence: 99%


The GlxR regulon of the amino acid producer Corynebacterium glutamicum: Detection of the corynebacterial core regulon and integration into the transcriptional regulatory network model

Kohl

Tauch

2009

Journal of Biotechnology

101
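The first citation statement above mentions a bi-directional best BLAST analysis with an E-value threshold of 10⁻⁵ for identifying potential orthologs. A minimal sketch of that idea follows, assuming the BLAST results have already been parsed into `(query, subject, evalue)` tuples. The function names, the tuple layout, and the tie-breaking rule (lowest E-value wins, first seen on ties) are illustrative assumptions, not details taken from the cited study.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (query id, subject id, E-value); hypothetical layout


def best_hits(hits: Iterable[Hit], max_evalue: float = 1e-5) -> Dict[str, str]:
    """Map each query to its lowest-E-value subject, ignoring hits above max_evalue."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, evalue in hits:
        if evalue > max_evalue:
            continue  # hit does not pass the E-value threshold
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(
    a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit], max_evalue: float = 1e-5
) -> List[Tuple[str, str]]:
    """Return sorted pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b, max_evalue)
    reverse = best_hits(b_vs_a, max_evalue)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)
```

A pair is reported only when both search directions agree, so a gene whose best hit points back to a different gene is dropped rather than assigned an ortholog.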

“…Putative transcription factor-binding sites were identified using the PSSM search module of Biopython. The significance threshold for binding sites in the context of multiple-hypothesis testing was defined by computing the exact probability distributions for site scores under the PSSM and genomic background models with dynamic programming and controlling the rate of false-positive results by defining the probability of finding at least one false-positive result in a sequence of 350 bp (α₃₅₀ = 0.01) (46,47). Comparative genomics analysis.…”

Section: T T T T a T T C A G T C T T A G A A T T G A T G C A G A T A (mentioning)

confidence: 99%

Identification and Characterization of VpsR and VpsT Binding Sites in Vibrio cholerae

Zamorano-Sánchez

Fong

Kilic

et al.

2015

J. Bacteriol.

120


The ability to form biofilms is critical for environmental survival and transmission of Vibrio cholerae, a facultative human pathogen responsible for the disease cholera. Biofilm formation is controlled by several transcriptional regulators and alternative sigma factors. In this study, we report that the two main positive regulators of biofilm formation, VpsR and VpsT, bind to nonoverlapping target sequences in the regulatory region of vpsL in vitro. VpsR binds to a proximal site (the R1 box) as well as a distal site (the R2 box) with respect to the transcriptional start site identified upstream of vpsL. The VpsT binding site (the T box) is located between the R1 and R2 boxes. While mutations in the T and R boxes resulted in a decrease in vpsL expression, deletion of the T and R2 boxes resulted in an increase in vpsL expression. Analysis of the role of H-NS in vpsL expression revealed that deletion of hns resulted in enhanced vpsL expression. The level of vpsL expression was higher in an hns vpsT double mutant than in the parental strain but lower than that in an hns mutant. In silico analysis of the regulatory regions of the VpsR and VpsT targets resulted in the identification of conserved recognition motifs for VpsR and VpsT and revealed that operons involved in biofilm formation and vpsT are coregulated by VpsR and VpsT. Furthermore, a comparative genomics analysis revealed substantial variability in the promoter region of the vpsT and vpsL genes among extant V. cholerae isolates, suggesting that regulation of biofilm formation is under active selection.

IMPORTANCE Vibrio cholerae causes cholera and is a natural inhabitant of aquatic environments. One critical factor that is important for environmental survival and transmission of V. cholerae is the microbe's ability to form biofilms, which are surface-associated communities encased in a matrix composed of the exopolysaccharide VPS (Vibrio polysaccharide), proteins, and nucleic acids. Two proteins, VpsR and VpsT, positively regulate VPS production and biofilm formation. We characterized the structural features of the promoter of the vpsL gene, determined the target sequences recognized by VpsT and VpsR, and analyzed their distribution and conservation patterns in multiple V. cholerae isolates. This work fills a fundamental gap in our understanding of the regulatory mechanisms employed by the master regulators VpsR and VpsT in controlling biofilm matrix production.

Biofilms are microbial communities composed of aggregated microorganisms and an exopolymeric matrix typically made up of exopolysaccharides, proteins, and nucleic acids. These microbial structures are prevalent in nature and are often found attached to abiotic or biotic surfaces (1). Vibrio cholerae, a human pathogen that can colonize the human intestine and cause the diarrheal disease cholera, is an autochthonous member of estuarine environments (2, 3). In aquatic environments, V. cholerae can form biofilms on various surfaces, including phytoplankton, zooplankton, and sediments (4-8). The a...
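The citation statement for this paper describes deriving a score threshold from the exact score distribution of a PSSM under a background model, computed with dynamic programming, and then bounding the chance of at least one false positive in a 350 bp sequence. The sketch below illustrates that general idea under several assumptions of mine: integer log-odds scores, a toy three-column matrix, a uniform background, and an independence approximation between windows for the sequence-level correction. It does not use Biopython and is not the procedure of the cited references (46,47).

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical toy PSSM with integer log-odds scores, one dict per motif position.
PSSM: List[Dict[str, int]] = [
    {"A": 2, "C": -1, "G": -1, "T": -2},
    {"A": -2, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -2},
]
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # assumed uniform


def score_distribution(pssm, background) -> Dict[int, float]:
    """Exact P(score = s) for a random site, built column by column (DP)."""
    dist: Dict[int, float] = {0: 1.0}
    for column in pssm:
        nxt = defaultdict(float)
        for score, prob in dist.items():
            for base, value in column.items():
                nxt[score + value] += prob * background[base]
        dist = dict(nxt)
    return dist


def threshold_for_pvalue(dist: Dict[int, float], p_max: float) -> Optional[int]:
    """Smallest score t with P(score >= t) <= p_max, or None if no score qualifies."""
    tail, best = 0.0, None
    for score in sorted(dist, reverse=True):
        tail += dist[score]
        if tail > p_max:
            break
        best = score
    return best


def per_site_pvalue(alpha_seq: float, seq_len: int, motif_len: int) -> float:
    """Per-window p-value giving P(at least one false hit) = alpha_seq,
    assuming independent windows on one strand (an approximation)."""
    windows = seq_len - motif_len + 1
    return 1.0 - (1.0 - alpha_seq) ** (1.0 / windows)
```

For example, `threshold_for_pvalue(score_distribution(PSSM, BACKGROUND), per_site_pvalue(0.01, 350, len(PSSM)))` returns `None` for this toy matrix, because a three-column motif cannot reach so small a per-site p-value; realistic motifs are longer and carry more information.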


“…[39,44,51]), use a brute-force sliding window approach, and the profile matching problem is still regarded as a not yet satisfactorily well-solved problem in computational biology [20]. In recent years a bunch of advanced algorithms based on score properties [53], indexing data structures [6,7,13], Fast Fourier Transform [41], data compression [17], matrix partitioning [33], filtering algorithms [7,33,38], pattern matching [38], and superalphabet [38] have been proposed to reduce the expected time of computation. The aim of this paper is to survey these methods to give the reader an overview of the state of the art of the topic and possibly stimulate future research in the field.…”

Section: Introduction (mentioning)

confidence: 99%

Fast profile matching algorithms — A survey

Pizzi

Ukkonen

2008

Theoretical Computer Science


Position-specific scoring matrices are a popular choice for modelling signals or motifs in biological sequences, both in DNA and protein contexts. A lot of effort has been dedicated to the definition of suitable scores and thresholds for increasing the specificity of the model and the sensitivity of the search. It is quite surprising that, until very recently, little attention has been paid to the actual process of finding the matches of the matrices in a set of sequences, once the score and the threshold have been fixed. In fact, most profile matching tools still rely on a simple sliding window approach to scan the input sequences. This can be a very time expensive routine when searching for hits of a large set of scoring matrices in a sequence database. In this paper we will give a survey of proposed approaches to speed up profile matching based on statistical significance, multipattern matching, filtering, indexing data structures, matrix partitioning, Fast Fourier Transform and data compression. These approaches improve the expected searching time of profile matching, thus leading to implementation of faster tools in practice.
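The abstract above refers to the simple sliding window approach that most profile matching tools still use. A minimal sketch of that baseline is shown here for orientation: every window of the sequence is scored against every matrix column, so the cost grows with sequence length times motif length. The toy matrix, the integer scores, and the function name are assumptions for illustration only; none of the faster methods listed in the survey are implemented.

```python
from typing import Dict, List, Tuple

# Hypothetical toy PSSM with integer log-odds scores, one dict per motif position.
PSSM: List[Dict[str, int]] = [
    {"A": 2, "C": -1, "G": -1, "T": -2},
    {"A": -2, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -2},
]


def sliding_window_matches(
    pssm: List[Dict[str, int]], sequence: str, threshold: int
) -> List[Tuple[int, int]]:
    """Return (start, score) for every window whose score reaches the threshold."""
    motif_len = len(pssm)
    hits: List[Tuple[int, int]] = []
    # Slide a window of motif_len characters over the sequence, one position at a time.
    for start in range(len(sequence) - motif_len + 1):
        score = 0
        for offset in range(motif_len):
            score += pssm[offset][sequence[start + offset]]
        if score >= threshold:
            hits.append((start, score))
    return hits
```

With this matrix, `sliding_window_matches(PSSM, "TACGACGT", 6)` reports the two `ACG` windows at positions 1 and 4. The speed-ups surveyed in the paper aim to avoid scoring every window in full.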

