Summary
The phylogenetic distribution of the components comprising the transcriptional machinery in the crenarchaeal and euryarchaeal lineages of the Archaea was analyzed in a systematic manner by genome-wide profiling of transcription complements in fifteen complete archaeal genome sequences. Initially, a reference set of transcription-associated proteins (TAPs) consisting of sequences functioning in all aspects of the transcriptional process, and originating from the three domains of life, was used to query the genomes. TAP-families were detected by sequence clustering of the TAPs and their archaeal homologues, and through extensive database searching, these families were assigned a function. The phylogenetic origins of archaeal genes matching hidden Markov model profiles of protein domains associated with transcription, and those encoding the TAP-homologues, showed there is extensive lineage-specificity of proteins that function as regulators of transcription: most of these sequences are present solely in the Euryarchaeota, with nearly all of them homologous to bacterial DNA-binding proteins. Strikingly, the hidden Markov model profile searches revealed that archaeal chromatin and histone-modifying enzymes also display extensive taxon-restrictedness, both across and within the two phyla.
Keywords: genome profiling, protein families, sequence clustering, transcription-associated proteins.
Introduction
Transcription, a core gene expression process, involves different agents participating in initiation, elongation, termination and regulation. The basic principles of transcriptional regulation in the Bacteria and Eukaryota have been outlined (Ptashne and Gann 1997), whereas in the Archaea such mechanisms are less well understood. The study of the archaeal transcriptional machinery is important for understanding both the molecular mechanisms and the evolutionary history of transcriptional regulation in all three domains of life. Furthermore, these analyses may reveal how archaea respond to environmental challenges, which is of particular interest given their possible association with various aspects of human disease (Eckburg et al. 2003). Several previous studies have shown that the RNA polymerase core enzyme exhibits structural similarity between the Archaea and the Eukaryota (Puhler et al. 1989). Moreover, the minimal set of factors required for in vitro transcription initiation in archaea consists of archaeal homologues of the eukaryotic TATA-box binding protein (TBP), TFIIB and RNA polymerase II (Werner and Weinzierl 2002). In bacteria, however, the process appears to be fundamentally different (Struhl 1999), with regulation accomplished by an entirely different set of proteins (Gralla 1996).
Evidence from sequence similarity studies between RNA polymerases suggests that archaeal transcription shares certain components with that of the Eukarya (Puhler et al. 1989), a conclusion further supported by the discovery of TFIIB in Pyrococcus furiosus (Ouzounis and Sander 1992) and TBP in P. woesei (Rowlands et al. 1994). The strong similarity of archaeal...