Sulfolobus acidocaldarius is an aerobic thermoacidophilic crenarchaeon which grows optimally at 80°C and pH 2 in terrestrial solfataric springs. Here, we describe the genome sequence of strain DSM639, which has been used for many seminal studies on archaeal and crenarchaeal biology. The circular genome carries 2,225,959 bp (37% G+C) with 2,292 predicted protein-encoding genes. Many of the smaller genes were identified for the first time on the basis of comparison of three Sulfolobus genome sequences. Of the protein-coding genes, 305 are exclusive to S. acidocaldarius and 866 are specific to the Sulfolobus genus. Moreover, 82 genes for untranslated RNAs were identified and annotated. Owing to the probable absence of active autonomous and nonautonomous mobile elements, the genome stability and organization of S. acidocaldarius differ radically from those of Sulfolobus solfataricus and Sulfolobus tokodaii. The S. acidocaldarius genome contains an integrated, and probably encaptured, pARN-type conjugative plasmid which may facilitate intercellular chromosomal gene exchange in S. acidocaldarius. Moreover, it contains genes for a characteristic restriction modification system, a UV damage excision repair system, thermopsin, and an aromatic ring dioxygenase, all of which are absent from genomes of other Sulfolobus species. However, it lacks genes for some of their sugar transporters, consistent with it growing on a more limited range of carbon sources. These results, together with the many newly identified protein-coding genes for Sulfolobus, are incorporated into a public Sulfolobus database which can be accessed at http://dac.molbio.ku.dk/dbs/Sulfolobus.
Sulfolobus acidocaldarius strain DSM639, the type strain of the archaeal genus Sulfolobus, was the first hyperthermoacidophile to be characterized from terrestrial solfataras by Brock et al. (12).
It grows optimally at 75 to 80°C and pH 2 to 3, under strictly aerobic conditions, on complex organic substrates, including yeast extract, tryptone, and Casamino Acids and a limited number of sugars.
Many of the seminal studies on archaea and crenarchaea were performed on S. acidocaldarius. Thus, S. acidocaldarius was employed to demonstrate the similarity of the archaeal and eukaryal transcription apparatuses (6, 36, 46). Moreover, its sensitivity to a wide range of ribosomal antibiotics (1) and ease of transformation (3) have rendered S. acidocaldarius a focus for in vivo genetic studies. Proteins responsible for chromatin folding (Sac7c) and the highly abundant Sac10b (Alba) protein, implicated in the regulation of chromatin and/or cellular RNAs in Sulfolobus (7, 30), were first characterized for this organism (29).
S. acidocaldarius has also been used for studying genetic fidelity at high temperatures and is the only hyperthermophilic archaeon for which the rate and type of spontaneous mutation have been quantified in vivo (26). Its relatively low mutation rate, despite its high-temperature environment, has stimulated a strong interest in its efficient repair systems. It ...
Human and mouse genome sequences contain roughly 100,000 regions that are unalignable in primary sequence and neighbor corresponding alignable regions between both organisms. These pairs are generally assumed to be nonconserved, although the level of structural conservation between them has never been investigated. Owing to the limitations in computational methods, comparative genomics has been lacking the ability to compare such nonconserved sequence regions for conserved structural RNA elements. We have investigated the presence of structural RNA elements by conducting a local structural alignment, using FOLDALIGN, on a subset of these 100,000 corresponding regions and estimate that 1800 contain common RNA structures. Comparing our results with the recent mapping of transcribed fragments (transfrags) in human, we find that high-scoring candidates are twice as likely to be found in regions overlapped by transfrags as in regions that are not overlapped by transfrags. To verify the coexpression between predicted candidates in human and mouse, we conducted expression studies by RT-PCR and Northern blotting on mouse candidates, which overlap with transfrags on human chromosome 20. RT-PCR results confirmed expression of 32 out of 36 candidates, whereas Northern blots confirmed four out of 12 candidates. Furthermore, many RT-PCR results indicate differential expression in different tissues. Hence, our findings suggest that there are corresponding regions between human and mouse, which contain expressed non-coding RNA sequences not alignable in primary sequence.
Many Archaea, in contrast to bacteria, produce a high proportion of leaderless transcripts, show a wide variation in their consensus Shine-Dalgarno (S-D) sequences and frequently use GUG and UUG start codons. In order to understand the basis for these differences, 18 complete archaeal genomes were examined for sequence signals that are positionally conserved upstream from genes. These functional motifs include box A promoter sequences for leaderless transcripts and S-D sequences for transcripts with leaders. Most of the box A sequences were preceded by a BRE-like motif and followed by a previously undetected A/T peak centred on position -10. Moreover, the sequence of the predominant S-D motifs in an archaeon is shown to depend on the precise number of nucleotides between the conserved anti-S-D CCUCC sequence and the 3'-terminal nucleotide of 16S RNA. Correlations with phylogenetic trees, constructed for the 18 Archaea, reveal that usage of high levels of both S-D motifs, and GUG and UUG start codons occurs exclusively in the shorter branched Archaea. High levels of leaderless transcripts are found in the longer branched Archaea.
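The dependence of the S-D motif on the 16S RNA 3' end can be illustrated with a minimal sketch. This is not the authors' analysis: the function names, the `flank` parameter, and the two example tails are hypothetical, and only Watson-Crick pairing is considered (G-U wobble pairs are ignored). The idea is simply that the mRNA motif able to pair with the anti-S-D region is its reverse complement, so the number of nucleotides available 3' of CCUCC limits how far the motif can extend on its 5' side.

```python
# Watson-Crick RNA complements only (assumption: G-U wobble pairing is ignored).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def nucleotides_after_anti_sd(tail_16s: str, anti_sd: str = "CCUCC") -> int:
    """Count nucleotides between the last anti-S-D core and the 3' terminus."""
    tail = tail_16s.upper()
    start = tail.rfind(anti_sd)
    if start < 0:
        raise ValueError("anti-S-D core not found in the supplied tail")
    return len(tail) - (start + len(anti_sd))


def expected_sd_motif(tail_16s: str, anti_sd: str = "CCUCC", flank: int = 2) -> str:
    """Return the mRNA motif complementary to the anti-S-D core plus up to
    `flank` nucleotides on each side, truncated where the tail ends.

    `flank` is a hypothetical illustration parameter, not a value from the text.
    """
    tail = tail_16s.upper()
    start = tail.rfind(anti_sd)
    if start < 0:
        raise ValueError("anti-S-D core not found in the supplied tail")
    lo = max(0, start - flank)
    hi = min(len(tail), start + len(anti_sd) + flank)
    return reverse_complement(tail[lo:hi])


if __name__ == "__main__":
    # Two hypothetical 16S 3' tails differing by one terminal nucleotide.
    for tail in ("AUCACCUCCU", "AUCACCUCC"):
        print(tail, nucleotides_after_anti_sd(tail), expected_sd_motif(tail))
```

With these two hypothetical tails, removing the single nucleotide after CCUCC shortens the complementary motif at its 5' end, which is the kind of shift in predominant S-D motif described above.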
The program foldalignM is implemented in JAVA and is, along with some accompanying PERL scripts, available at http://foldalign.ku.dk/
Recent computational scans for non-coding RNAs (ncRNAs) in multiple organisms have relied on existing multiple sequence alignments. However, as sequence similarity drops, a key signal of RNA structure (frequent compensating base changes) is increasingly likely to cause sequence-based alignment methods to misalign, or even refuse to align, homologous ncRNAs, consequently obscuring that structural signal. We have used CMfinder, a structure-oriented local alignment tool, to search the ENCODE regions of vertebrate multiple alignments. In agreement with other studies, we find a large number of potential RNA structures in the ENCODE regions. We report 6587 candidate regions with an estimated false-positive rate of 50%. More intriguingly, many of these candidates may be better represented by alignments taking the RNA secondary structure into account than those based on primary sequence alone, often quite dramatically. For example, approximately one-quarter of our predicted motifs show revisions in >50% of their aligned positions. Furthermore, our results are strongly complementary to those discovered by sequence-alignment-based approaches: 84% of our candidates are not covered by Washietl et al., increasing the number of ncRNA candidates in the ENCODE region by 32%. In a group of 11 ncRNA candidates that were tested by RT-PCR, 10 were confirmed to be present as RNA transcripts in human tissue, and most show evidence of significant differential expression across tissues. Our results broadly suggest caution in any analysis relying on multiple sequence alignments in less well-conserved regions, clearly support growing appreciation for the biological significance of ncRNAs, and strongly support the argument for considering RNA structure directly in any searches for these elements.