The Fabaceae, the third largest family of plants and the source of many crops, has been the target of many genomic studies. Currently, only the grasses surpass the legumes in the number of publicly available expressed sequence tags (ESTs). The quantity of sequences from diverse plants enables the use of computational approaches to identify novel genes in specific taxa. We used BLAST algorithms to compare unigene sets from Medicago truncatula, Lotus japonicus, and soybean (Glycine max and Glycine soja) to nonlegume unigene sets, to GenBank's nonredundant and EST databases, and to the genomic sequences of rice (Oryza sativa) and Arabidopsis. As a working definition, putatively legume-specific genes were those lacking sequence homology, below a specified threshold, to publicly available sequences of nonlegumes. Using this approach, 2,525 legume-specific EST contigs were identified, of which fewer than 3% had clear homology to previously characterized legume genes. As a first step toward predicting function, related sequences were clustered to build motifs that could be searched against protein databases. Three families of interest were characterized in more depth: F-box related proteins, Pro-rich proteins, and Cys cluster proteins (CCPs). Of particular interest were the >300 CCPs, primarily from nodules or seeds, with predicted similarity to defensins. Motif searching also identified several previously unknown CCP-like open reading frames in Arabidopsis. Evolutionary analyses of the genomic sequences of several CCPs in M. truncatula suggest that this family has evolved by local duplications and divergent selection.

Legumes constitute a large plant family that offers humans a treasure trove of resources for a variety of uses. Throughout the world, legumes provide important sources of protein, oil, mineral nutrients, and nutritionally important natural products (Graham and Vance, 2003). 
Grain legume species, including pea (Pisum sativum), common bean (Phaseolus vulgaris), and lentil (Lens culinaris), account for over 33% of human dietary protein. Other legumes, including clovers (Trifolium spp.) and medics (Medicago spp.), are widely used as animal fodder. Refined oils, such as soybean (Glycine max) oil, have industrial applications in paint, diesel fuel, electrical insulation, and solvents. Legumes also accumulate phytochemicals, including isoflavonoids, which impact human health through pharmaceutical use and as dietary supplements (Dixon and Sumner, 2003).

An important feature of legumes is their ability to obtain nutrients via symbioses with soil microbes. The formation of nitrogen-fixing nodules via interaction with bacteria collectively known as rhizobia is virtually unique to legumes, although some species in eight families of the eurosid I clade of dicots can form nodules in association with nitrogen-fixing actinomycetes (Soltis et al., 1995; Doyle and Luckow, 2003). An exchange of specific signal molecules between host and microbe triggers many developmental events in the host, including extensive modulation...