The complete genome sequence of a hyperthermophilic archaebacterium, Pyrococcus horikoshii OT3, has been determined by assembling the sequences of physical map-based contigs of fosmid clones and of the long polymerase chain reaction (PCR) products used for gap-filling. The entire length of the genome was 1,738,505 bp. The authenticity of the entire genome sequence was supported by restriction analysis of long PCR products, which were directly amplified from the genomic DNA. A total of 2061 open reading frames (ORFs) were assigned as potential protein-coding regions. By similarity search against public databases, 406 (19.7%) were related to genes with putative function and 453 (22.0%) to registered sequences of unknown function. The remaining 1202 ORFs (58.3%) did not show any significant similarity to the sequences in the databases. Sequence comparison among the assigned ORFs in the genome provided evidence that a considerable number of ORFs were generated by sequence duplication. By similarity search, 11 ORFs were assumed to contain intein elements. The RNA genes identified were a single 16S-23S rRNA operon, two 5S rRNA genes and 46 tRNA genes, including two with an intron structure. All the assigned ORFs and RNA coding regions occupied 91.25% of the whole genome. The data presented in this paper are available on the internet at http://www.nite.go.jp.
The complete genomic sequence of an aerobic thermoacidophilic crenarchaeon, Sulfolobus tokodaii strain 7, which grows optimally at 80 °C, at low pH, and under aerobic conditions, has been determined by the whole genome shotgun method with slight modifications. The genome was 2,694,756 bp long and the G + C content was 32.8%. The following RNA-coding genes were identified: a single 16S-23S rRNA cluster, one 5S rRNA gene and 46 tRNA genes (including 24 intron-containing tRNA genes). The repetitive sequences identified were SR-type repetitive sequences, long dispersed-type repetitive sequences and Tn-like repetitive elements. The genome contained 2826 potential protein-coding regions (open reading frames, ORFs). By similarity search against public databases, 911 (32.2%) ORFs were related to functionally assigned genes, 921 (32.6%) were related to conserved ORFs of unknown function, 145 (5.1%) contained some motifs, and the remaining 849 (30.0%) did not show any significant similarity to the registered sequences. The ORFs with functional assignments included candidate genes involved in sulfide metabolism, the TCA cycle and the respiratory chain. Sequence comparison provided evidence suggesting plasmid integration, rearrangement of genomic structure, and duplication of genomic regions, which may be responsible for the larger size of the S. tokodaii strain 7 genome. The genome contained eukaryote-type genes that were not identified in other archaea, and its tRNA genes lacked the CCA sequence. The result suggests that, among the archaeal strains sequenced so far, this strain is closer to eukaryotes. The data presented in this paper are also available on the internet homepage (http://www.bio.nite.go.jp/E-home/genome_list-e.html/).
The complete genome sequence of an aerobic hyperthermophilic crenarchaeon, Aeropyrum pernix K1, which grows optimally at 95 °C, has been determined by the whole genome shotgun method with some modifications. The entire length of the genome was 1,669,695 bp. The authenticity of the entire sequence was supported by restriction analysis of long PCR products, which were directly amplified from the genomic DNA. A total of 2,694 open reading frames (ORFs) were assigned as potential protein-coding regions. By similarity search against public databases, 633 (23.5%) of the ORFs were related to genes with putative function and 523 (19.4%) to registered sequences of unknown function. All the genes of the TCA cycle except that for alpha-ketoglutarate dehydrogenase were present; in place of the alpha-ketoglutarate dehydrogenase gene, the genes coding for the two subunits of 2-oxoacid:ferredoxin oxidoreductase were identified. The remaining 1,538 ORFs (57.1%) did not show any significant similarity to the sequences in the databases. Sequence comparison among the assigned ORFs suggested that a considerable number of ORFs were generated by sequence duplication. The RNA genes identified were a single 16S-23S rRNA operon, two 5S rRNA genes and 47 tRNA genes, including 14 genes with intron structures. All the assigned ORFs and RNA coding regions occupied 89.12% of the whole genome. The data presented in this paper are available on the internet homepage (http://www.mild.nite.go.jp).