Amino acid composition of proteins varies substantially between taxa and, thus, can evolve. For example, proteins from organisms with (G + C)-rich (or (A + T)-rich) genomes contain more (or fewer) amino acids encoded by (G + C)-rich codons. However, no universal trends in ongoing changes of amino acid frequencies have been reported. We compared sets of orthologous proteins encoded by triplets of closely related genomes from 15 taxa representing all three domains of life (Bacteria, Archaea and Eukaryota), and used phylogenies to polarize amino acid substitutions. Cys, Met, His, Ser and Phe accrue in at least 14 taxa, whereas Pro, Ala, Glu and Gly are consistently lost. The same nine amino acids are currently accrued or lost in human proteins, as shown by analysis of non-synonymous single-nucleotide polymorphisms. All amino acids with declining frequencies are thought to be among the first incorporated into the genetic code; conversely, all amino acids with increasing frequencies, except Ser, were probably recruited late. Thus, expansion of initially under-represented amino acids, which began over 3,400 million years ago, apparently continues to this day.
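The polarization step described above can be illustrated with a minimal sketch. It assumes a simple parsimony rule on an aligned triplet (two sister sequences plus an outgroup): where the sisters differ and the outgroup matches one of them, the outgroup state is taken as ancestral. The function names `polarize_substitutions` and `net_flux` and the toy sequences are hypothetical, and the abstract does not state that the authors used exactly this procedure.

```python
from collections import Counter


def polarize_substitutions(sister_a, sister_b, outgroup, gap="-"):
    """Count amino acid gains and losses in an aligned triplet.

    Assumption: the ancestral state at a site is the residue shared by
    the outgroup and one of the two sisters (simple parsimony).
    """
    if not (len(sister_a) == len(sister_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    gains, losses = Counter(), Counter()
    for a, b, o in zip(sister_a, sister_b, outgroup):
        # Skip gapped sites and sites where the sisters agree.
        if gap in (a, b, o) or a == b:
            continue
        if o == a:
            ancestral, derived = a, b
        elif o == b:
            ancestral, derived = b, a
        else:
            # All three residues differ: direction cannot be inferred.
            continue
        losses[ancestral] += 1
        gains[derived] += 1
    return gains, losses


def net_flux(gains, losses):
    """Return gains minus losses for every amino acid observed."""
    return {aa: gains[aa] - losses[aa] for aa in set(gains) | set(losses)}


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    g, l = polarize_substitutions("APGEK", "ASGHK", "APGEK")
    print(net_flux(g, l))
```

A positive net flux for a residue would correspond to accrual and a negative one to loss. A real analysis would also need orthology assignment, alignment and a statistical test, none of which are shown here.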
Intragenomic and intergenomic comparisons of upstream nucleotide sequences of archaeal genes were performed with the goal of predicting transcription regulatory sites (operators) and identifying likely regulons. Learning sets for the detection of regulatory sites were constructed using the available experimental data on archaeal transcription regulation or by analogy with known bacterial regulons, and further analysis was performed using iterative profile searches. The information content of the candidate signals detected by this method is insufficient for reliable predictions to be made. Therefore, this approach has to be complemented by examination of evolutionary conservation in different archaeal genomes. This combined strategy resulted in the prediction of a conserved heat shock regulon in all euryarchaea, a nitrogen fixation regulon in the methanogens Methanococcus jannaschii and Methanobacterium thermoautotrophicum and an aromatic amino acid regulon in M. thermoautotrophicum. Unexpectedly, the heat shock regulatory site was detected not only for genes that encode known chaperone proteins but also for archaeal histone genes. This suggests a possible function for archaeal histones in stress-related changes in DNA condensation. In addition, comparative analysis of the genomes of three Pyrococcus species resulted in the prediction of their purine metabolism and transport regulon. The results demonstrate the feasibility of prediction of at least some transcription regulatory sites by comparing poorly characterized prokaryotic genomes, particularly when several closely related genome sequences are available.
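To make the notion of a signal's information content concrete, here is a minimal sketch of one common way to compute it for a set of aligned candidate sites. It assumes a uniform background over A, C, G and T and sums, over columns, the maximum entropy minus the observed entropy. The abstract does not specify which measure the authors used, so the function name `information_content`, the `pseudocount` parameter and the example sites are illustrative assumptions.

```python
import math
from collections import Counter


def information_content(sites, alphabet="ACGT", pseudocount=0.0):
    """Total information content (bits) of aligned sites.

    Assumptions: uniform background frequencies, and every character
    in `sites` belongs to `alphabet`.
    """
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")

    max_bits = math.log2(len(alphabet))  # 2 bits per column for DNA
    total = 0.0
    for column in zip(*sites):
        counts = Counter(column)
        n = len(column) + pseudocount * len(alphabet)
        entropy = 0.0
        for base in alphabet:
            p = (counts[base] + pseudocount) / n
            if p > 0:
                entropy -= p * math.log2(p)
        total += max_bits - entropy
    return total


if __name__ == "__main__":
    # Hypothetical candidate sites, invented for illustration only.
    print(information_content(["AA", "AC"]))
```

A short or weakly conserved signal yields few bits, so matches to it are expected by chance in a genome-sized search. That is consistent with the abstract's point that the profile search has to be complemented by checking conservation across genomes.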
We showed previously that rad50 and mre11 genes of thermophilic archaea are organized in an operon-like structure with a third gene (nurA) encoding a 5' to 3' exonuclease. Here, we show that the rad50, mre11 and nurA genes from the hyperthermophilic archaeon Sulfolobus acidocaldarius are co-transcribed with a fourth gene encoding a DNA helicase. This enzyme (HerA) is the prototype of a new class of DNA helicases able to utilize either 3' or 5' single-stranded DNA extensions for loading and subsequent DNA duplex unwinding. To our knowledge, DNA helicases capable of translocating along the DNA in both directions have not been identified previously. Sequence analysis of HerA shows that it is a member of the TrwB, FtsK and VirB4/VirD4 families of the PilT class NTPases. HerA homologs are found in all thermophilic archaeal species and, in all cases except one, the rad50, mre11, nurA and herA genes are grouped together. These results suggest that the archaeal Rad50-Mre11 complex might act in association with a 5' to 3' exonuclease (NurA) and a bipolar DNA helicase (HerA), indicating a probable involvement in the initiation step of homologous recombination.