Abstract. A phylogenetic tree is a rooted tree of unbounded degree in which each leaf is uniquely labelled with an integer from 1 to n. A descendent subtree of a phylogenetic tree T is the subtree composed of all nodes and edges of T descending from a vertex. Given a set of phylogenetic trees, we present linear-time algorithms for finding all leaf-agree descendent subtrees as well as all isomorphic descendent subtrees.
The normalized cluster distance d(A, B) of two sets A and B is defined by d(A, B) = ∆(A, B)/(|A| + |B|), where ∆(A, B) denotes the size of the symmetric difference of A and B. We show that the all-pairs normalized cluster distances between descendent subtrees of two phylogenetic trees can be computed in O(n²) time. Since the total size of the output is Θ(n²), the algorithm is computationally optimal. A nearest subtree of a subset of leaves is a descendent subtree that has the smallest normalized cluster distance to these leaves. We show that nearest subtrees for a collection of pairwise disjoint subsets of leaves can be found in O(n) time. Several applications of these algorithms in bioinformatics are considered. Among them, we discuss functional analysis and classification of two-component systems (2CS) in bacterial genomes.
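
To make the definitions above concrete, the following is a minimal Python sketch of the normalized cluster distance and of a brute-force nearest-subtree search. It is an illustration only, not the algorithm of this paper: each set operation can cost O(n), so it does not achieve the O(n²) and O(n) bounds stated above. The nested-tuple tree representation and the names clusters, normalized_cluster_distance, all_pairs_distances and nearest_subtree are assumptions introduced for the example.

```python
from itertools import product


def clusters(tree):
    """Return the leaf set (cluster) of every descendent subtree of `tree`.

    Assumed representation: a leaf is an int label, an internal node is a
    tuple of child subtrees. Clusters are listed in post-order.
    """
    found = []

    def visit(node):
        if isinstance(node, int):
            leaves = frozenset([node])
        else:
            # The cluster of an internal node is the union of its children's clusters.
            leaves = frozenset().union(*(visit(child) for child in node))
        found.append(leaves)
        return leaves

    visit(tree)
    return found


def normalized_cluster_distance(a, b):
    """d(A, B) = |symmetric difference of A and B| / (|A| + |B|)."""
    return len(a ^ b) / (len(a) + len(b))


def all_pairs_distances(t1, t2):
    """Naive all-pairs distances between descendent subtrees of two trees."""
    return {
        (a, b): normalized_cluster_distance(a, b)
        for a, b in product(clusters(t1), clusters(t2))
    }


def nearest_subtree(tree, leaves):
    """Brute-force search for the cluster of a descendent subtree closest to `leaves`."""
    target = frozenset(leaves)
    return min(clusters(tree), key=lambda c: normalized_cluster_distance(c, target))


if __name__ == "__main__":
    t = ((1, 2), (3, 4))
    print(nearest_subtree(t, {1, 2, 3}))
    print(normalized_cluster_distance(frozenset({1, 2}), frozenset({2, 3})))
```

For two identical sets the distance is 0, and for two disjoint sets it is 1, so d always lies in the interval [0, 1].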