DNA barcoding involves sequencing a standard region of DNA as a tool for species identification. However, there has been no agreement on which region(s) should be used for barcoding land plants. To provide a community recommendation on a standard plant barcode, we have compared the performance of 7 leading candidate plastid DNA regions (atpF-atpH spacer, matK gene, rbcL gene, rpoB gene, rpoC1 gene, psbK-psbI spacer, and trnH-psbA spacer). Based on assessments of recoverability, sequence quality, and levels of species discrimination, we recommend the 2-locus combination of rbcL+matK as the plant barcode. This core 2-locus barcode will provide a universal framework for the routine use of DNA sequence data to identify specimens and contribute toward the discovery of overlooked species of land plants.

matK | rbcL | species identification

Large-scale standardized sequencing of the mitochondrial gene CO1 has made DNA barcoding an efficient species identification tool in many animal groups (1). In plants, however, low substitution rates of mitochondrial DNA have led to the search for alternative barcoding regions. From initial investigations of plastid regions (2-4), 7 leading candidates have emerged (5, 6). Four are portions of coding genes (matK, rbcL, rpoB, and rpoC1), and 3 are noncoding spacers (atpF-atpH, trnH-psbA, and psbK-psbI). Different research groups have proposed various combinations of these loci as their preferred plant barcodes, but no consensus has emerged (5-12). This lack of an agreed standard has impeded progress in plant barcoding.

Our aim here is to identify a standard DNA barcode for land plants. To achieve this goal, we have pooled data across laboratories including sequence data from 907 samples, representing 445 angiosperm, 38 gymnosperm, and 67 cryptogam species.
Using various subsets of these data, we evaluated the 7 candidate loci using criteria in the Consortium for the Barcode of Life's (CBOL) data standards and guidelines for locus selection (http://www.barcoding.si.edu/protocols.html). Universality: Which loci can be routinely sequenced across the land plants? Sequence quality and coverage: Which loci are most amenable to the production of bidirectional sequences with few or no ambiguous base calls? Discrimination: Which loci enable most species to be distinguished?

Results

Universality. Direct universality assessments using a single primer pair for each locus in angiosperms resulted in 90%-98% PCR and sequencing success for 6/7 regions. Success for the seventh region, psbK-psbI, was 77% (Fig. 1A). Greater problems were encountered in other land plant groups, with rpoB, matK, atpF-atpH, and psbK-psbI all showing <50% success in gymnosperms and/or cryptogams based on data compiled from several laboratories (Fig. 1A).

Sequence Quality. Evaluation of sequence quality and coverage from the candidate loci demonstrated that high quality bidirectional sequences were routinely obtained from rbcL, rpoC1, and rpoB (Fig. 1B, x axis). The remaining 4 loci required more manual editing and produced f...
DNA barcoding is a technique in which species identification is performed by using DNA sequences from a small fragment of the genome, with the aim of contributing to a wide range of ecological and conservation studies in which traditional taxonomic identification is not practical. DNA barcoding is well established in animals, but there is not yet any universally accepted barcode for plants. Here, we undertook intensive field collections in two biodiversity hotspots (Mesoamerica and southern Africa). Using >1,600 samples, we compared eight potential barcodes. Going beyond previous plant studies, we assessed to what extent a "DNA barcoding gap" is present between intra- and interspecific variations, using multiple accessions per species. Given its adequate rate of variation, easy amplification, and alignment, we identified a portion of the plastid matK gene as a universal DNA barcode for flowering plants. Critically, we further demonstrate the applicability of DNA barcoding for biodiversity inventories. In addition, analyzing >1,000 species of Mesoamerican orchids, DNA barcoding with matK alone reveals cryptic species and proves useful in identifying species listed in Convention on International Trade of Endangered Species (CITES) appendixes.
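The "DNA barcoding gap" mentioned above can be illustrated with a minimal sketch. It assumes pre-aligned sequences of equal length and uses an uncorrected p-distance; the abstract does not say which distance measure or software the authors used, so the function names `p_distance` and `barcoding_gap` and the choice of metric are hypothetical, for illustration only.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped.
    Assumption: an uncorrected p-distance, chosen here for simplicity.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def barcoding_gap(records):
    """Return (max_intraspecific, min_interspecific, gap).

    `records` is a list of (species_label, aligned_sequence) pairs with
    more than one accession for at least one species. A positive gap means
    the closest pair of different species is still further apart than the
    most divergent pair within any single species.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    if not intra or not inter:
        raise ValueError("need both within- and between-species pairs")
    max_intra, min_inter = max(intra), min(inter)
    return max_intra, min_inter, min_inter - max_intra
```

Calling `barcoding_gap` on a list of labelled, aligned sequences returns the largest within-species distance, the smallest between-species distance, and their difference. A candidate locus separates species cleanly in this simple sense only when that difference is positive.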
Sequencing of target-enriched libraries is an efficient and cost-effective method for obtaining DNA sequence data from hundreds of nuclear loci for phylogeny reconstruction. Much of the cost of developing targeted sequencing approaches is associated with the generation of preliminary data needed for the identification of orthologous loci for probe design. In plants, identifying orthologous loci has proven difficult due to a large number of whole-genome duplication events, especially in the angiosperms (flowering plants). We used multiple sequence alignments from over 600 angiosperms for 353 putatively single-copy protein-coding genes identified by the One Thousand Plant Transcriptomes Initiative to design a set of targeted sequencing probes for phylogenetic studies of any angiosperm group. To maximize the phylogenetic potential of the probes, while minimizing the cost of production, we introduce a k-medoids clustering approach to identify the minimum number of sequences necessary to represent each coding sequence in the final probe set. Using this method, 5–15 representative sequences were selected per orthologous locus, representing the sequence diversity of angiosperms more efficiently than if probes were designed using available sequenced genomes alone. To test our approximately 80,000 probes, we hybridized libraries from 42 species spanning all higher-order groups of angiosperms, with a focus on taxa not present in the sequence alignments used to design the probes. Out of a possible 353 coding sequences, we recovered an average of 283 per species and at least 100 in all species. Differences among taxa in sequence recovery could not be explained by relatedness to the representative taxa selected for probe design, suggesting that there is no phylogenetic bias in the probe set. 
Our probe set, which targeted 260 kbp of coding sequence, achieved a median recovery of 137 kbp per taxon in coding regions, a maximum recovery of 250 kbp, and an additional median of 212 kbp per taxon in flanking non-coding regions across all species. These results suggest that the Angiosperms353 probe set described here is effective for any group of flowering plants and would be useful for phylogenetic studies from the species level to higher-order groups, including the entire angiosperm clade itself.
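The idea of choosing a few representative sequences per locus by k-medoids clustering can be sketched as follows. This is not the authors' implementation: it assumes aligned sequences, an uncorrected p-distance, a deterministic farthest-first initialisation, and a simple alternating (assign, then update) k-medoids loop. All function names here are hypothetical.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs):
    """Full pairwise distance matrix as a list of lists."""
    return [[p_distance(a, b) for b in seqs] for a in seqs]


def _assign(dist, medoids):
    """Map each medoid index to the indices of the points nearest to it."""
    clusters = {m: [] for m in medoids}
    for i in range(len(dist)):
        clusters[min(medoids, key=lambda m: dist[i][m])].append(i)
    return clusters


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix.

    Returns (sorted medoid indices, {medoid: member indices}).
    Assumption: deterministic farthest-first initialisation, used here
    so that the sketch gives repeatable results.
    """
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of sequences")
    # Start from the most central point, then add the farthest remaining points.
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        rest = (i for i in range(n) if i not in medoids)
        medoids.append(max(rest, key=lambda i: min(dist[i][m] for m in medoids)))
    for _ in range(max_iter):
        clusters = _assign(dist, medoids)
        # New medoid of each cluster: the member with the smallest total
        # distance to the other members (keep the old one if the cluster is empty).
        updated = [
            min(members, key=lambda c: sum(dist[c][j] for j in members)) if members else m
            for m, members in clusters.items()
        ]
        if set(updated) == set(medoids):
            break
        medoids = updated
    return sorted(medoids), _assign(dist, medoids)
```

Unlike k-means, the cluster centres returned here are actual input sequences, which is what makes the approach suitable for picking real sequences to represent a locus. In practice one would repeat the clustering for increasing `k` and stop at the smallest `k` for which every sequence lies within some chosen distance of its medoid; that threshold is not given in the abstract.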
Summary

The origin of fire-adapted lineages is a long-standing question in ecology. Although phylogeny can provide a significant contribution to the ongoing debate, its use has been precluded by the lack of comprehensive DNA data. Here, we focus on the 'underground trees' (= geoxyles) of southern Africa, one of the most distinctive growth forms characteristic of fire-prone savannas. We placed geoxyles within the most comprehensive dated phylogeny for the regional flora comprising over 1400 woody species. Using this phylogeny, we tested whether African geoxyles evolved concomitantly with those of the South American cerrado and used their phylogenetic position to date the appearance of humid savannas. We found multiple independent origins of the geoxyle life-form mostly from the Pliocene, a period consistent with the origin of cerrado, with the majority of divergences occurring within the last 2 million yr. When contrasted with their tree relatives, geoxyles occur in regions characterized by higher rainfall and greater fire frequency. Our results indicate that the geoxylic growth form may have evolved in response to the interactive effects of frequent fires and high precipitation. As such, geoxyles may be regarded as markers of fire-maintained savannas occurring in climates suitable for forests.
Several major lineages with geographical coherence, as identified in previous studies based on smaller data sets, are supported. Other lineages with either geographical or ecological correspondence are recognized for the first time. Coffea subgenus Baracoffea is shown to be monophyletic, but Coffea subgenus Coffea is paraphyletic. Sequence data do not substantiate the monophyly of either Coffea or Psilanthus. Low levels of sequence divergence do not allow detailed resolution of relationships within Coffea, most notably for species of Coffea subgenus Coffea occurring in Madagascar. The origin of C. arabica by recent hybridization between C. canephora and C. eugenioides is supported. Phylogenetic separation resulting from the presence of the Dahomey Gap is inferred based on sequence data from Coffea.