Extracytoplasmic function σ factors (ECFs) represent one of the major bacterial signal transduction mechanisms in terms of abundance, diversity and importance, particularly in mediating stress responses. Here, we performed a comprehensive phylogenetic analysis of this protein family by scrutinizing all proteins in the NCBI database. As a result, we identified an average of ∼10 ECFs per bacterial genome and 157 phylogenetic ECF groups that feature a conserved genetic neighborhood and a similar regulatory mechanism. Our analysis expands previous classification efforts ∼50-fold, enriches many original ECF groups with previously unclassified proteins and identifies 22 entirely new ECF groups. The ECF groups are hierarchically related to each other and are further composed of subgroups with closely related sequences. This two-tiered classification allows for the accurate prediction of common promoter motifs and the inference of putative regulatory mechanisms across the subgroups that compose an ECF group. This comprehensive, high-resolution description of the phylogenetic distribution of the ECF family, together with the massive expansion of classified ECF sequences and an openly accessible data repository called ‘ECF Hub’ (https://www.computational.bio.uni-giessen.de/ecfhub), will serve as a powerful hypothesis generator to guide future research in the field.
The activity of extracytoplasmic function σ-factors (ECFs) is typically regulated by anti-σ factors. In a number of highly abundant ECF groups, including ECF41 and ECF42, σ-factors contain fused C-terminal protein domains, which provide the necessary regulatory function instead. Here, we identified the contact interface between the C-terminal extension and the core σ-factor regions required for controlling ECF activity. We applied direct coupling analysis (DCA) to infer evolutionary covariation between contacting amino acid residues for groups ECF41 and ECF42. Mapping the predicted interactions to a recently solved ECF41 structure demonstrated that DCA faithfully identified an important contact interface between the SnoaL-like extension and the linker between the σ2 and σ4 domains. Systematic alanine substitutions of contacting residues support this model and suggest that this interface stabilizes a compact conformation of ECF41 with low transcriptional activity. For group ECF42, DCA supports a structural homology model for their C-terminal tetratricopeptide repeat (TPR) domains and predicts an intimate contact between the first TPR-helix and the σ4 domain. Mutational analyses demonstrate the essentiality of the predicted interactions for ECF42 activity. These results indicate that C-terminal extensions indeed bind and regulate the core ECF regions, illustrating the potential of DCA for discovering regulatory motifs in the ECF subfamily.
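To make the idea of evolutionary covariation between residue positions concrete, the following minimal sketch scores pairs of alignment columns by mutual information. This is a deliberately simpler stand-in and not DCA itself: DCA fits a global statistical model to separate direct couplings from indirect correlations, which plain mutual information cannot do. The function names and the toy alignment are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations
from math import log


def column_mi(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j."""
    n = len(msa)
    fi = Counter(seq[i] for seq in msa)            # residue counts in column i
    fj = Counter(seq[j] for seq in msa)            # residue counts in column j
    fij = Counter((seq[i], seq[j]) for seq in msa)  # joint residue-pair counts
    return sum(
        (c / n) * log((c / n) / ((fi[a] / n) * (fj[b] / n)))
        for (a, b), c in fij.items()
    )


def top_covarying_pairs(msa, k=3):
    """Return the k column pairs with the highest mutual information."""
    length = len(msa[0])
    scores = {
        (i, j): column_mi(msa, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical toy alignment: columns 1 and 3 always change together
# (K pairs with E, R pairs with D), as compensating contact residues might.
toy_msa = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
    "GKLE",
    "GRLD",
]

if __name__ == "__main__":
    for (i, j), score in top_covarying_pairs(toy_msa):
        print(f"columns {i} and {j}: MI = {score:.3f}")
```

In this toy alignment, columns 1 and 3 receive the highest score because their residues always change together. On real alignments, such pairwise scores are confounded by phylogeny and by chains of indirect correlation, which is the motivation for global approaches such as DCA.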