Phylogenetic analyses based on comparison of a limited number of genes recently suggested that Amborella trichopoda is the most ancient angiosperm. Here we present the complete sequence of the chloroplast genome of this plant. It does not display any of the genes characteristic of chloroplast DNA of the gymnosperm Pinus thunbergii (chlB, chlL, chlN, psaM, and ycf12). The majority of phylogenetic analyses of protein-coding genes of this chloroplast DNA suggests that Amborella is not the basal angiosperm and not even the most basal among dicots.
Determining the phylogenetic relationships among the major lines of angiosperms is a long-standing problem, yet the uncertainty as to the phylogenetic affinity of these lines persists. While a number of studies have suggested that the ANITA (Amborella-Nymphaeales-Illiciales-Trimeniales-Aristolochiales) grade is basal within angiosperms, studies of complete chloroplast genome sequences also suggested an alternative tree, wherein the line leading to the grasses branches first among the angiosperms. To improve taxon sampling in the existing chloroplast genome data, we sequenced the chloroplast genome of the monocot Acorus calamus. We generated a concatenated alignment (89,436 positions for 15 taxa), encompassing almost all sequences usable for phylogeny reconstruction within spermatophytes. The data still contain support for both the ANITA-basal and grasses-basal hypotheses. Using simulations, we show that, were the ANITA-basal hypothesis true, parsimony (and distance-based methods with many models) would be expected to fail to recover it. The self-evident explanation for this failure appears to be long-branch attraction (LBA) between the clade of grasses and the out-group. However, this LBA cannot explain the discrepancies observed between the tree topology recovered using the maximum likelihood (ML) method and the topologies recovered using the parsimony and distance-based methods when grasses are deleted. Furthermore, when the out-group (Pinus) is deleted, neither maximum parsimony nor distance methods consistently recover the ML tree, although according to the simulations they would be expected to; this suggests that either the generating tree is not correct or the best symmetric model is misspecified (or both). We demonstrate that the tree recovered under ML is extremely sensitive to model specification and that the best symmetric model is misspecified. Hence, we remain agnostic regarding phylogenetic relationships among basal angiosperm lineages.
Angiosperms (flowering plants) dominate contemporary terrestrial flora with roughly 250,000 species, but their origin and early evolution are still poorly understood. In recent years, molecular evidence has accumulated suggesting a dicotyledonous origin of monocots. Phylogenetic reconstructions have suggested that several dicotyledonous groups that include taxa such as Amborella, Austrobaileya, and Nymphaea branch off as the most basal among angiosperms. This has led to the concept of monocots, "eudicots," "basal dicots," and "ANITA" groupings. Here, we present the sequence and phylogenetic analyses of the chloroplast DNA of Nymphaea alba. Phylogenetic analyses of our 14-species data set, consisting of 29,991 aligned nucleotide positions per chloroplast genome, revealed consistent support for Nymphaea being a divergent member of a monophyletic dicot assemblage. Three distinct angiosperm lineages were supported in the majority of our phylogenetic analyses: eudicots, Magnoliopsida, and monocots. However, the monocot lineage leading to the grasses was the deepest branching. Although analysis of only one individual gene alignment (out of 61) is consistent with some recently proposed hypotheses for the paraphyly of dicots, we also report observations that nine genes do not support paraphyly of dicots. Instead, they support the basal monocot-dicot split. Consistent with this finding, we also report observations suggesting that the monocot lineage leading to the grasses has the strongest phylogenetic affinity to gymnosperms. Our findings have general implications for studies of substitution model specification and analyses of concatenated genome data.