A long-term goal of Arabidopsis research is to define the minimal gene set needed to produce a viable plant with a normal phenotype under diverse conditions. This will require both forward and reverse genetics along with novel strategies to characterize multigene families and redundant biochemical pathways. Here we describe an initial dataset of 250 EMB genes required for normal embryo development in Arabidopsis. This represents the first large-scale dataset of essential genes in a flowering plant. When compared with 550 genes with other knockout phenotypes, EMB genes are enriched for basal cellular functions, deficient in transcription factors and signaling components, have fewer paralogs, and are more likely to have counterparts among essential genes of yeast (Saccharomyces cerevisiae) and worm (Caenorhabditis elegans). EMB genes also represent a valuable source of plant-specific proteins with unknown functions required for growth and development. Analyzing such unknowns is a central objective of genomics efforts worldwide. We focus here on 34 confirmed EMB genes with unknown functions, demonstrate that expression of these genes is not embryo-specific, validate a strategy for identifying interacting proteins through complementation with epitope-tagged proteins, and discuss the value of EMB genes in identifying novel proteins associated with important plant processes. Based on sequence comparison with essential genes in other model eukaryotes, we identify 244 candidate EMB genes without paralogs that represent promising targets for reverse genetics. These candidates should facilitate the recovery of additional genes required for seed development.
Arabidopsis thaliana, a small annual plant belonging to the mustard family, is the subject of study by an estimated 7000 researchers around the world. In addition to the large body of genetic, physiological and biochemical data gathered for this plant, its genome will be the first higher plant genome to be completely sequenced, with completion expected at the end of the year 2000. The sequencing effort has been coordinated by an international collaboration, the Arabidopsis Genome Initiative (AGI). The rationale for intensive investigation of Arabidopsis is that it is an excellent model for higher plants. To maximize use of the knowledge gained about this plant, there is a need for a comprehensive database and an information retrieval and analysis system that provide user-friendly access to Arabidopsis information. This paper describes the initial steps we have taken toward realizing these goals in a project called The Arabidopsis Information Resource (TAIR) (www.arabidopsis.org).