Finding new protein-coding genes is one of the most important goals of eukaryotic genome sequencing projects. However, the genomic organization of novel eukaryotic genomes is diverse, and ab initio gene finding tools tuned for previously studied species are rarely suitable for effective gene hunting in the DNA sequences of a new genome. Gene identification methods based on mapping cDNA and expressed sequence tags (ESTs) to genomic DNA, or those using alignments to closely related genomes, rely on the existence of abundant cDNA and EST data and/or the availability of reference genomes. Conventional statistical ab initio methods require large training sets of validated genes for estimating gene model parameters. In practice, neither type of data may be available in sufficient amounts until rather late stages of sequencing a novel genome. Nevertheless, we have shown that gene finding in eukaryotic genomes can be carried out in parallel with estimation of the statistical models directly from still anonymous genomic DNA. The suggested method of combining gene prediction with model parameter estimation follows the path of iterative Viterbi training. Rounds of labeling the genomic sequence into coding and non-coding regions alternate with rounds of model parameter estimation. Several dynamically changing restrictions on the possible range of model parameters are added to filter out fluctuations in the initial steps of the algorithm that could redirect the iteration process away from the biologically relevant point in parameter space. Tests on well-studied eukaryotic genomes have shown that the new method performs comparably to or better than conventional methods in which supervised model training precedes the gene prediction step. Several novel genomes have been analyzed, and biologically interesting findings are discussed.
Thus, a self-training algorithm that had been assumed feasible only for prokaryotic genomes has now been developed for ab initio eukaryotic gene identification.
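The alternation of sequence labeling and parameter re-estimation described in this abstract can be illustrated with a minimal sketch of iterative Viterbi training. This is not the authors' implementation: it assumes a hypothetical two-state toy HMM (non-coding "N" and coding "C") with zero-order nucleotide emissions, and it omits the dynamically changing parameter restrictions mentioned above. All function names and the synthetic test sequence are illustrative assumptions.

```python
import math
import random

STATES = ("N", "C")   # hypothetical toy states: non-coding, coding
ALPHABET = "ACGT"


def viterbi(seq, init, trans, emit):
    """Return the most probable state path for seq (log-space Viterbi)."""
    score = [{s: math.log(init[s]) + math.log(emit[s][seq[0]]) for s in STATES}]
    back = []
    for ch in seq[1:]:
        prev = score[-1]
        row, ptr = {}, {}
        for s in STATES:
            # Best predecessor state for landing in state s at this position.
            best = max(STATES, key=lambda p: prev[p] + math.log(trans[p][s]))
            row[s] = prev[best] + math.log(trans[best][s]) + math.log(emit[s][ch])
            ptr[s] = best
        score.append(row)
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def reestimate(seq, path, pseudo=1.0):
    """Re-estimate transitions and emissions from one labeled path.

    Pseudocounts keep every probability strictly positive.
    """
    tcount = {p: {s: pseudo for s in STATES} for p in STATES}
    ecount = {s: {a: pseudo for a in ALPHABET} for s in STATES}
    for i, (ch, s) in enumerate(zip(seq, path)):
        ecount[s][ch] += 1
        if i > 0:
            tcount[path[i - 1]][s] += 1
    trans = {p: {s: tcount[p][s] / sum(tcount[p].values()) for s in STATES}
             for p in STATES}
    emit = {s: {a: ecount[s][a] / sum(ecount[s].values()) for a in ALPHABET}
            for s in STATES}
    return trans, emit


def viterbi_training(seq, init, trans, emit, max_iter=20):
    """Alternate labeling and re-estimation until the labeling stops changing."""
    path = None
    for _ in range(max_iter):
        new_path = viterbi(seq, init, trans, emit)
        if new_path == path:
            break
        path = new_path
        trans, emit = reestimate(seq, path)
    return path, trans, emit


# Synthetic input (an assumption for illustration): AT-rich flanks around a GC-rich block.
random.seed(0)
at_rich = lambda k: "".join(random.choices(ALPHABET, weights=[4, 1, 1, 4], k=k))
gc_rich = lambda k: "".join(random.choices(ALPHABET, weights=[1, 4, 4, 1], k=k))
seq = at_rich(200) + gc_rich(200) + at_rich(200)

# Weakly informative starting parameters; training refines them from the sequence alone.
init0 = {"N": 0.5, "C": 0.5}
trans0 = {"N": {"N": 0.95, "C": 0.05}, "C": {"N": 0.05, "C": 0.95}}
emit0 = {"N": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
         "C": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2}}
path, trans, emit = viterbi_training(seq, init0, trans0, emit0)
```

The sketch uses hard (single best path) assignments in each round, which is what distinguishes Viterbi training from Baum-Welch re-estimation based on posterior expectations. A realistic gene finder would use a far richer state structure than this two-state toy.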
We describe a new ab initio algorithm, GeneMark-ES version 2, that identifies protein-coding genes in fungal genomes. The algorithm does not require a predetermined training set to estimate parameters of the underlying hidden Markov model (HMM). Instead, the anonymous genomic sequence in question is used as an input for iterative unsupervised training. The algorithm extends our previously developed method tested on genomes of Arabidopsis thaliana, Caenorhabditis elegans, and Drosophila melanogaster. To better reflect features of fungal gene organization, we enhanced the intron submodel to accommodate sequences with and without branch point sites. This design enables the algorithm to work equally well for species with the kinds of variations in splicing mechanisms seen in the fungal phyla Ascomycota, Basidiomycota, and Zygomycota. Upon self-training, the intron submodel switches on in several steps to reach its full complexity. We demonstrate that the accuracy of the algorithm, at both the exon and the whole-gene level, compares favorably with the accuracy of gene finders that employ supervised training. Application of the new method to known fungal genomes indicates substantial improvement over existing annotations. By eliminating the effort necessary to build comprehensive training sets, the new algorithm can streamline and accelerate the process of annotation in a large number of fungal genome sequencing projects. [Supplemental material is available online at www.genome.org. The new software program GeneMark-ES version 2 is freely available for download from http://exon.gatech.edu/genemark/gmhmm-es-2008.]
Reliable ab initio gene prediction in eukaryotic genomic sequences remains an open problem in spite of impressive progress made in developing gene prediction algorithms (Burge and Karlin 1997; Krogh 1997; Parra et al. 2000; Reese et al. 2000; Stanke and Waack 2003; Guigo et al. 2006).
Much attention has been given to developing alternative, extrinsic methods that use EST/cDNA to genome mapping, spliced alignments of known protein sequences, or patterns of conservation between related genomes (Gelfand et al. 1996; Mott 1997; Mathe et al. 2002; Birney et al. 2004; Stanke et al. 2008). The extrinsic methods exhibit, as a rule, high specificity (Sp), while ab initio, intrinsic methods show high sensitivity (Sn). These properties make methods of both types indispensable in genome annotation pipelines. For accurate statistical description of protein-coding regions, efficient ab initio algorithms employ fifth-order three-periodic Markov chain models incorporated into hidden Markov models (HMMs) (Kulp et al. 1996; Burge and Karlin 1997). The number of algorithm parameters, several thousand, is high; thus, a training set of ∼1000 experimentally validated genes is necessary for parameter estimation. Compilation of such large training sets represents a bottleneck in genome annotation pipelines and a practical challenge that hampers the use of ab initio gene prediction algorithms. To circumvent this difficulty, we developed a gene...
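To make the fifth-order three-periodic Markov chain mentioned above more concrete, the following sketch counts its parameters and shows one simple way such a model could be estimated and used to score a sequence. It is an illustration under stated assumptions, not code from GeneMark-ES: the function names are hypothetical, training sequences are assumed to be in frame (index 0 is the first codon position) and to contain only A, C, G, and T, and unseen contexts fall back to a uniform distribution.

```python
import math
from collections import defaultdict

BASES = "ACGT"


def n_probabilities(order=5):
    """Transition probabilities in a three-periodic chain: 3 frames x 4^order contexts x 4 bases."""
    return 3 * 4 ** order * 4


def n_free_parameters(order=5):
    """Each context row sums to 1, so only 3 of its 4 probabilities are free."""
    return 3 * 4 ** order * 3


def train_three_periodic(seqs, order=5, pseudo=1.0):
    """Count (context -> next base) separately for each of the three codon positions."""
    counts = [defaultdict(lambda: {b: pseudo for b in BASES}) for _ in range(3)]
    for seq in seqs:
        for i in range(order, len(seq)):
            counts[i % 3][seq[i - order:i]][seq[i]] += 1
    return counts


def log_prob(seq, counts, order=5, pseudo=1.0):
    """Log-probability of seq (read in frame) under the trained three-periodic model."""
    total = 0.0
    for i in range(order, len(seq)):
        # .get() avoids creating entries; unseen contexts use a uniform row.
        row = counts[i % 3].get(seq[i - order:i]) or {b: pseudo for b in BASES}
        total += math.log(row[seq[i]] / sum(row.values()))
    return total


# Toy usage with a made-up repetitive "coding" sequence (an assumption for illustration).
model = train_three_periodic(["ATGGCC" * 50])
score_similar = log_prob("ATGGCC" * 5, model)
score_unrelated = log_prob("TTTTTT" * 5, model)
```

For order 5, `n_probabilities` gives 3 × 4^5 × 4 = 12,288 transition probabilities, of which 9,216 are free. This is in line with the "several thousand" parameters cited above and helps explain why a large training set is needed for conventional supervised estimation.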
The concept of a prion as an infectious self-propagating protein isoform was initially proposed to explain certain mammalian diseases. It is now clear that yeast also has heritable elements transmitted via protein. Indeed, the "protein only" model of prion transmission was first proven using a yeast prion. Typically, known prions are ordered cross-β aggregates (amyloids). Recently, there has been an explosion in the number of recognized prions in yeast. Yeast continues to lead the way in understanding cellular control of prion propagation, prion structure, mechanisms of de novo prion formation, specificity of prion transmission, and the biological roles of prions. This review summarizes what has been learned from yeast prions.
The cause of Huntington's disease is expansion of the polyglutamine (polyQ) domain in huntingtin, which makes this protein both neurotoxic and aggregation prone. Here we developed the first yeast model that establishes a direct link between aggregation of the expanded polyQ domain and its cytotoxicity. Our data indicated that deficiencies in the molecular chaperones Sis1 and Hsp104 inhibited seeding of polyQ aggregates, whereas ssa1, ssa2, and ydj1–151 mutations inhibited expansion of aggregates. The latter three mutants strongly suppressed the polyQ toxicity. Spontaneous mutants with suppressed aggregation appeared with high frequency, and in all of them the toxicity was relieved. Aggregation defects in these mutants and in sis1–85 were not complemented in the cross to the hsp104 mutant, demonstrating an unusual type of inheritance. Since Hsp104 is required for prion maintenance in yeast, this suggested a role for prions in polyQ aggregation and toxicity. We screened a set of deletions of nonessential genes coding for known prions and related proteins and found that deletion of the RNQ1 gene specifically suppressed aggregation and toxicity of polyQ. Curing of the prion form of Rnq1 from wild-type cells dramatically suppressed both aggregation and toxicity of polyQ. We concluded that aggregation of polyQ is critical for its toxicity and that Rnq1 in its prion conformation plays an essential role in polyQ aggregation leading to the toxicity.
In vivo propagation of [PSI+], an aggregation-prone prion isoform of the yeast release factor Sup35 (eRF3), has previously been shown to require intermediate levels of the chaperone protein Hsp104. Here we perform a detailed study on the mechanism of prion loss after Hsp104 inactivation. Complete or partial inactivation of Hsp104 was achieved by the following approaches: deleting the HSP104 gene; modifying the HSP104 promoter that results in low level of its expression; and overexpressing the dominant-negative ATPase-inactive mutant HSP104 allele. In contrast to guanidine-HCl, an agent blocking prion proliferation, Hsp104 inactivation induced relatively rapid loss of [PSI+] and another candidate yeast prion, [PIN+]. Thus, the previously hypothesized mechanism of prion dilution in cell divisions due to the blocking of prion proliferation is not sufficient to explain the effect of Hsp104 inactivation. The [PSI+] response to increased levels of another chaperone, Hsp70-Ssa, depends on whether the Hsp104 activity is increased or decreased. A decrease of Hsp104 levels or activity is accompanied by a decrease in the number of Sup35 PSI+ aggregates and an increase in their size. This eventually leads to accumulation of huge agglomerates, apparently possessing reduced prion forming capability and representing dead ends of the prion replication cycle. Thus, our data confirm that the primary function of Hsp104 in prion propagation is to disassemble prion aggregates and generate the small prion seeds that initiate new rounds of prion propagation (possibly assisted by Hsp70-Ssa).
Prions (37) are protein isoforms that are capable of reproducing themselves by converting normal proteins of the same primary structure into a prion state. In mammals, including humans, the prion protein PrPSc is associated with infectious neurodegenerative diseases, such as mad cow disease (see reference 38 for a review).
In yeast and fungi, prions serve as protein-based genetic elements, inherited via cytoplasm in a non-Mendelian fashion (see references 5, 46, and 52 for reviews). Prions form insoluble proteinase-resistant aggregates in vivo, in contrast to their normal (nonprion) counterparts, which are usually soluble. In vitro, prion proteins form amyloid-like polymers. It has been suggested that in vivo replication of prion conformation occurs by a nucleated polymerization mechanism (25, 26). This relates prion phenomena to other amyloidoses and neural inclusion disorders (see reference 21 for a review). An alternative model explains replication of prion conformation via a monomer-directed or template-assisted conformational switch in the heterodimer, suggesting that aggregate formation occurs as a consequence of the conformational switch (see reference 17 for a review). Recent data indicate that in vitro propagation of the yeast prion amyloids may combine features of both models and therefore could be termed a nucleated conformational conversion (45). Since prion propagation apparently operates at the level of protein folding and assemb...