[15] k-tuple frequency analysis: From intron/exon discrimination to T-cell epitope mapping

Claverie, Jean‐Michel; Sauvaget, Isabelle; Bougueleret, Lydie

doi:10.1016/0076-6879(90)83017-4

Cited by 81 publications

(27 citation statements)

References 16 publications


“…It is well known that genomes are characterized by species-specific compositional features, and that coding and non-coding DNA are distinguishable in terms of their pentamer and hexamer distributions (Claverie et al, 1990). In promoter regions except core promoter elements such as TATA boxes, CAAT boxes and transcription initiation sites (INR), there exists a couple of other individual elements or sequence properties that are associated with promoter sequences.…”

Section: Compositional Features (mentioning)

confidence: 99%

A hybrid neural network system for prediction and recognition of promoter regions in human genome

Chen

2005

J Zhejiang Univ Sci B


This paper proposes a high specificity and sensitivity algorithm called PromPredictor for recognizing promoter regions in the human genome. PromPredictor extracts compositional features and CpG islands information from genomic sequence, feeding these features as input for a hybrid neural network system (HNN) and then applies the HNN for prediction. It combines a novel promoter recognition model, coding theory, feature selection and dimensionality reduction with machine learning algorithm. Evaluation on Human chromosome 22 was ~66% in sensitivity and ~48% in specificity. Comparison with two other systems revealed that our method had superior sensitivity and specificity in predicting promoter regions. PromPredictor is written in MATLAB and requires Matlab to run. PromPredictor is freely available at


“…There are several ways to derive a discriminant function (see for example: Claverie et al, 1990;Staden, 1990;Fickett and Tung, 1992). The classical linear discriminant analysis is well-suited to address such a problem.…”

Section: Discriminant Functions (mentioning)

confidence: 99%

A computer filtering method to drive out tiny genes from the yeast genome

Barry

Fichant

Kalogeropoulos

et al. 1996

Yeast


The authors of the first yeast chromosome sequence defined a minimum threshold requirement of 100 codons, above which an open reading frame (ORF) is retained as a putative coding sequence. However, at least 58 yeast genes shorter than 100 codons have an assigned protein function. Therefore, the yeast genome may contain other tiny but functionally important genes that are discarded from analyses by this simple filtering rule. We have established discriminant functions from the in‐phase hexamer frequencies of functional genes and of simulated ORFs derived from a stationary Markov chain model. Fifty‐two out of the 58 genes were recognized as coding ORFs by our discriminating method. The test was also applied to all the small ORFs (36 to 100 codons) found in the intergenic regions of published chromosomes. It retained 140 new potential tiny coding sequences, among which we identified seven new genes by similarity searches. Our method, used conjointly with similarity searches, can also highlight sequencing errors resulting from the disruption of the coding frame of longer ORFs. This method, by its ability to detect potential coding ORFs, can be a very useful tool for functional analysis.
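The in-phase hexamer idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract describes discriminant functions built against ORFs simulated from a stationary Markov chain, whereas this sketch swaps in a plain average log-odds score against any background set of sequences. The function names, the pseudocount of 1.0, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import log

# All 4096 possible hexamers over the DNA alphabet.
HEXAMERS = ["".join(p) for p in product("ACGT", repeat=6)]


def inphase_hexamers(orf):
    """Yield hexamers that start on codon boundaries (offsets 0, 3, 6, ...)."""
    orf = orf.upper()
    for i in range(0, len(orf) - 5, 3):
        yield orf[i:i + 6]


def hexamer_freqs(orfs, pseudocount=1.0):
    """Estimate in-phase hexamer frequencies from a set of ORFs.

    The pseudocount (an assumption of this sketch) keeps unseen
    hexamers from producing zero probabilities.
    """
    counts = Counter()
    for orf in orfs:
        # Skip hexamers containing ambiguous bases such as N.
        counts.update(h for h in inphase_hexamers(orf) if set(h) <= set("ACGT"))
    total = sum(counts.values()) + pseudocount * len(HEXAMERS)
    return {h: (counts[h] + pseudocount) / total for h in HEXAMERS}


def log_odds_score(orf, coding, background):
    """Average log-odds of the ORF's in-phase hexamers, coding vs background.

    Positive values suggest the ORF looks more like the coding set.
    """
    hexes = [h for h in inphase_hexamers(orf) if h in coding]
    if not hexes:
        return 0.0
    return sum(log(coding[h] / background[h]) for h in hexes) / len(hexes)


if __name__ == "__main__":
    # Toy training sets, purely illustrative.
    coding = hexamer_freqs(["ATG" + "GCT" * 5 + "TAA"])
    background = hexamer_freqs(["ATG" + "AAA" * 5 + "TAA"])
    print(log_odds_score("GCTGCTGCTGCT", coding, background))
```

In practice the two frequency tables would be estimated from many known genes and from a suitable null model, and the threshold on the score would be chosen on held-out data.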


“…The biological evidences provide hints to discriminate them from each other, such as the basic differences between exons and introns in terms of the tri-nucleotides can be explained by the circular code theory (Arquèsa and Michel 1996) and the sharp transition at flank regions of splice sites (Zhan 1998). Many of exon/intron discrimination methods are based on analysis of sequence composition, such as the consensus sequences (Weir and Rice 2004), oligo-nucleotide frequencies (Claverie and Bougueleret 1986;Claverie et al 1990;Solovyev et al 1994;Louie et al 2003), base/codon/triplet usage (Zhan 1998), the sequence determinants of splice sites (Mengeritsky and Smith 1989) and a multi-source recognition method recruiting the consensus features and statistical differences of bases usage (Nakata et al 1985). However, none of the above methods is a single sequence-based methodology, and hence the day of fulfilling the vision of simulating the processes of splicing machinery is still awaited.…”

Section: Related Work (mentioning)

confidence: 99%

An exon/intron disparity framework based on the nucleotide profile of single sequence

Liou

Huang

2012

Netw Model Anal Health Inform Bioinforma


The RNA sequences are the major materials accessible for the nuclear splicing machinery, therefore, understanding how they are transformed into a binary decision of intron removal and exon ligation is critical in resolving the mystery of pre-mRNA splicing. This paper proposed an exon/intron discrimination framework (EIDF) to profile the intrinsic differences between exons and their immediate introns based on information of single sequence. The EIDF focuses on the frequencies of specific mono-/di-/ tri-nucleotides in the individual sequence and a simple exon/intron classifier is implemented accordingly. The experimental results showed the proposed EIDF is a valuable profile of splice site sequences and the possibility of simulating the processes of splicing machinery in silico is also revealed.
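The single-sequence mono-/di-/tri-nucleotide frequencies mentioned in this abstract can be computed along the following lines. This is only a sketch of the counting step, assuming overlapping k-tuples and relative frequencies; the EIDF classifier itself is not reproduced, and the function names are hypothetical.

```python
from collections import Counter


def ktuple_freqs(seq, k):
    """Relative frequencies of overlapping k-tuples in one sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        # Sequence shorter than k: no k-tuples to count.
        return {}
    return {kt: n / total for kt, n in counts.items()}


def nucleotide_profile(seq):
    """Mono-, di- and tri-nucleotide frequency tables, keyed by k."""
    return {k: ktuple_freqs(seq, k) for k in (1, 2, 3)}


if __name__ == "__main__":
    profile = nucleotide_profile("ACGTAC")
    for k, table in profile.items():
        print(k, table)
```

Comparing such profiles between an exon and its flanking intron is one simple way to look for the compositional differences the abstract refers to.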
