Bacillus subtilis is the best-characterized member of the Gram-positive bacteria. Its genome of 4,214,810 base pairs comprises 4,100 protein-coding genes. Of these protein-coding genes, 53% are represented once, while a quarter of the genome corresponds to several gene families that have been greatly expanded by gene duplication, the largest family containing 77 putative ATP-binding transport proteins. In addition, a large proportion of the genetic capacity is devoted to the utilization of a variety of carbon sources, including many plant-derived molecules. The identification of five signal peptidase genes, as well as several genes for components of the secretion apparatus, is important given the capacity of Bacillus strains to secrete large amounts of industrially important enzymes. Many of the genes are involved in the synthesis of secondary metabolites, including antibiotics, that are more typically associated with Streptomyces species. The genome contains at least ten prophages or remnants of prophages, indicating that bacteriophage infection has played an important evolutionary role in horizontal gene transfer, in particular in the propagation of bacterial pathogenesis.
The plant Arabidopsis thaliana (Arabidopsis) has become an important model species for the study of many aspects of plant biology. The relatively small size of the nuclear genome and the availability of extensive physical maps of the five chromosomes provide a feasible basis for initiating sequencing of the five chromosomes. The YAC (yeast artificial chromosome)-based physical map of chromosome 4 was used to construct a sequence-ready map of cosmid and BAC (bacterial artificial chromosome) clones covering a 1.9-megabase (Mb) contiguous region, and the sequence of this region is reported here. Analysis of the sequence revealed an average gene density of one gene every 4.8 kilobases (kb), and 54% of the predicted genes had significant similarity to known genes. Other interesting features were found, such as the sequence of a disease-resistance gene locus, the distribution of retroelements, the frequent occurrence of clustered gene families, and the sequence of several classes of genes not previously encountered in plants.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.