Phylogenomics relies heavily on well-curated sequence data sets that comprise, for each gene, exclusively 1:1 orthologs. Paralogs are treated as a dangerous nuisance that has to be detected and removed. We show here that this severe restriction of the data sets is not necessary. Building upon recent advances in mathematical phylogenetics, we demonstrate that gene duplications convey meaningful phylogenetic information and allow the inference of plausible phylogenetic trees, provided orthologs and paralogs can be distinguished with a degree of certainty. Starting from tree-free estimates of orthology, cograph editing can sufficiently reduce the noise to find correct event-annotated gene trees. The information of gene trees can then directly be translated into constraints on the species trees. Although the resolution is very poor for individual gene families, we show that genomewide data sets are sufficient to generate fully resolved phylogenetic trees, even in the presence of horizontal gene transfer.

orthology | paralogy | gene tree | species tree | cograph

Molecular phylogenetics is primarily concerned with the reconstruction of evolutionary relationships between species based on sequence information. To this end, alignments of protein or DNA sequences are used, whose evolutionary history is believed to be congruent to that of the respective species. This property can be ensured most easily in the absence of gene duplications and horizontal gene transfer (HGT). Phylogenetic studies judiciously select families of genes that rarely exhibit duplications (such as rRNAs, most ribosomal proteins, and many of the housekeeping enzymes).
In phylogenomics, elaborate automatic pipelines such as HaMStR (1) are used to filter genomewide data sets to at least deplete sequences with detectable paralogs (homologs in the same species).

In the presence of gene duplications, however, it becomes necessary to distinguish between the evolutionary history of genes (gene trees) and the evolutionary history of the species (species trees) in which these genes reside. Leaves of a gene tree represent genes. Its inner nodes represent two kinds of evolutionary events, namely duplications of genes within a genome, which give rise to paralogs, and speciations, in which the ancestral gene complement is transmitted to two daughter lineages. Two genes are (co)orthologous if their last common ancestor in the gene tree represents a speciation event, whereas they are paralogous if their last common ancestor is a duplication event; see refs. 2 and 3 for a more recent discussion on orthology and paralogy relationships. Speciation events, in turn, define the inner vertices of a species tree. However, they depend on both the gene and the species phylogeny, as well as the reconciliation between the two. The latter identifies speciation vertices in the gene tree with a particular speciation event in the species tree and places the gene duplication events on the edges of the species tree. Intriguingly, it is nevertheless possible in practice to distinguis...