Background: To account for genome-wide discordance among gene trees, several widely-used methods seek to find a species tree with the minimum distance to input gene trees. To efficiently explore the large space of species trees, some of these methods, including ASTRAL, use dynamic programming (DP). The DP paradigm can restrict the search space, and thus, ASTRAL and similar methods use heuristic methods to define a restricted search space. However, arbitrary constraints provided by the user on the output tree cannot be trivially incorporated into such restrictions. The ability to infer trees that honor user-defined constraints is needed for many phylogenetic analyses, but no solution currently exists for constraining the output of ASTRAL. Results: We introduce methods that enable the ASTRAL dynamic programming to infer constrained trees in an effective and scalable manner. To do so, we adopt a recently developed tree completion algorithm and extend it to allow multifurcating input and output trees. In simulation studies, we show that the approach for honoring constraints is both effective and fast. On real data, we show that constrained searches can help interrogate branches not recovered in the optimal ASTRAL tree to reveal support for alternative hypotheses.
Conclusions:The new algorithm is added ASTRAL to all user-provided constraints on the species tree.
BackgroundPhylogeny inference requires solving an optimization problem over the space of all trees. The super-exponential growth of the tree topology space makes examining all trees impossible, even for moderately large datasets. As a result, tree inference algorithms have adopted several heuristics strategies, including iterative search (e.g., hill-climbing), used by most maximum parsimony and maximum likelihood methods.An increasingly popular alternative is searching the tree space using dynamic programming (DP). For an